NLTK has a Downloader Path Traversal Vulnerability (AFO) - Arbitrary File Overwrite
Description
NLTK (Natural Language Toolkit) is a suite of open-source Python modules, data sets, and tutorials supporting research and development in Natural Language Processing. In versions 3.9.3 and prior, the NLTK downloader does not validate the subdir and id attributes when processing remote XML index files. An attacker who controls a remote XML index server can supply values containing path traversal sequences (such as ../), leading to arbitrary directory creation, arbitrary file creation, and arbitrary file overwrite. Commit 89fe2ec2c6bae6e2e7a46dad65cc34231976ed8a patches the issue.
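To make the traversal concrete, here is a minimal sketch rather than NLTK's actual code. It assumes the destination path is built by joining the download directory with a filename derived from the subdir and id values in the index. The helper name `unsafe_target`, the `.zip` extension, and the sample paths are hypothetical, and `posixpath` is used only so the result is the same on every platform.

```python
import posixpath


def unsafe_target(download_dir: str, subdir: str, pkg_id: str) -> str:
    """Join index-supplied fields onto the download directory without validation.

    Hypothetical helper: it mimics an unvalidated join, not NLTK's real code path.
    """
    filename = posixpath.join(subdir, pkg_id + ".zip")
    # normpath collapses "..", showing where the write would actually land.
    return posixpath.normpath(posixpath.join(download_dir, filename))


# A well-formed index entry stays inside the download directory.
benign = unsafe_target("/home/user/nltk_data", "corpora", "brown")

# A hostile index entry uses "../" in subdir to climb out of it.
hostile = unsafe_target("/home/user/nltk_data", "../outside", "payload")

print(benign)   # /home/user/nltk_data/corpora/brown.zip
print(hostile)  # /home/user/outside/payload.zip
```

Because the second path resolves outside `/home/user/nltk_data`, any directory creation or file write performed on it happens at a location chosen by whoever controls the index.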
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| nltk (PyPI) | <= 3.9.2 | — |
Affected products
Patches
1 file changed · +24 −0
nltk/downloader.py (modified, +24 −0)

    @@ -216,10 +216,22 @@ def __init__(
             self.name = name or id
             """A string name for this package."""
     
    +        # Validate subdir to prevent path traversal from malicious XML index
    +        if os.path.isabs(subdir) or ".." in subdir.replace("\\", "/").split("/"):
    +            raise ValueError(
    +                f"Invalid package subdir {subdir!r}: must be a relative path "
    +                f"without parent directory references"
    +            )
             self.subdir = subdir
             """The subdirectory where this package should be installed.
             E.g., ``'corpora'`` or ``'taggers'``."""
     
    +        # Validate id to prevent path traversal
    +        if os.sep in id or "/" in id or "\\" in id or ".." in id:
    +            raise ValueError(
    +                f"Invalid package id {id!r}: must not contain path separators"
    +            )
    +
             self.url = url
             """A URL that can be used to download this package's file."""
     
    @@ -677,6 +689,18 @@ def _download_package(self, info, download_dir, force):
     
             # Check for (and remove) any old/stale version.
             filepath = os.path.join(download_dir, info.filename)
    +
    +        # Defense-in-depth: verify filepath stays within download_dir
    +        real_download = os.path.realpath(os.path.abspath(download_dir))
    +        real_filepath = os.path.realpath(os.path.abspath(filepath))
    +        if not real_filepath.startswith(real_download + os.sep):
    +            yield ErrorMessage(
    +                info,
    +                f"Path traversal blocked: package '{info.id}' attempted to "
    +                f"write outside download directory (subdir='{info.subdir}')",
    +            )
    +            return
    +
             if os.path.exists(filepath):
                 if status == self.STALE:
                     yield StaleMessage(info)
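The three checks in the patch can be tried in isolation. The sketch below copies their conditions into standalone functions. The names `validate_subdir`, `validate_id`, and `is_within` are hypothetical and do not exist in NLTK, and the error messages are shortened.

```python
import os
import tempfile


def validate_subdir(subdir: str) -> None:
    """Reject absolute paths and any '..' path component (condition from the patch)."""
    parts = subdir.replace("\\", "/").split("/")
    if os.path.isabs(subdir) or ".." in parts:
        raise ValueError(f"Invalid package subdir {subdir!r}")


def validate_id(pkg_id: str) -> None:
    """Reject ids containing a path separator or the substring '..'."""
    if os.sep in pkg_id or "/" in pkg_id or "\\" in pkg_id or ".." in pkg_id:
        raise ValueError(f"Invalid package id {pkg_id!r}")


def is_within(download_dir: str, filepath: str) -> bool:
    """Return True if filepath resolves to a location under download_dir."""
    real_download = os.path.realpath(os.path.abspath(download_dir))
    real_filepath = os.path.realpath(os.path.abspath(filepath))
    return real_filepath.startswith(real_download + os.sep)


# Demonstration against a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    inside = os.path.join(root, "corpora", "brown.zip")
    outside = os.path.join(root, "..", "outside.zip")
    print(is_within(root, inside))   # True
    print(is_within(root, outside))  # False
```

The subdir check compares whole path components, so a name such as `my..corpus` passes, while the id check is a substring test and rejects any id containing `..`. The containment check resolves symlinks with `os.path.realpath` before comparing, which is why the patch labels it defense-in-depth: it catches escapes that the string checks alone would miss.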
References
- github.com/advisories/GHSA-469j-vmhf-r6v7 (ghsa, ADVISORY)
- nvd.nist.gov/vuln/detail/CVE-2026-33236 (ghsa, ADVISORY)
- github.com/nltk/nltk/commit/89fe2ec2c6bae6e2e7a46dad65cc34231976ed8a (ghsa, x_refsource_MISC, WEB)
- github.com/nltk/nltk/security/advisories/GHSA-469j-vmhf-r6v7 (ghsa, x_refsource_CONFIRM, WEB)