Apache PDFBox Examples: Path Traversal in PDFBox ExtractEmbeddedFiles Example Code
Description
This issue affects the ExtractEmbeddedFiles example in Apache PDFBox: from 2.0.24 through 2.0.35, from 3.0.0 through 3.0.6.
The ExtractEmbeddedFiles example contains a path traversal vulnerability (CWE-22) because the filename that is obtained from PDComplexFileSpecification.getFilename() is appended to the extraction path.
Users who have copied this example into their production code should review it to ensure that the extraction path is acceptable. The example has been changed accordingly: the initial path and the extraction paths are now converted into canonical paths, and it is verified that the extraction path contains the initial path. The documentation has also been adjusted.
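The check described above can be sketched roughly as follows. This is not the code from the PDFBox fix; the class name `CanonicalPathCheck` and the helper `resolveInside` are hypothetical, and the sketch assumes that rejecting with an `IOException` is acceptable for the caller.

```java
import java.io.File;
import java.io.IOException;

// Minimal sketch of a canonical-path containment check (hypothetical names).
class CanonicalPathCheck
{
    // Resolves filename against baseDir and rejects results outside baseDir.
    static File resolveInside(File baseDir, String filename) throws IOException
    {
        String canonicalBase = baseDir.getCanonicalPath();
        String canonicalTarget = new File(baseDir, filename).getCanonicalPath();
        // Compare with a trailing separator so that "/out" does not match "/output/x".
        if (!canonicalTarget.startsWith(canonicalBase + File.separator))
        {
            throw new IOException("Refusing to extract outside " + canonicalBase + ": " + filename);
        }
        return new File(canonicalTarget);
    }

    public static void main(String[] args) throws IOException
    {
        File base = new File(System.getProperty("java.io.tmpdir"), "extract");
        // A name that stays below the base directory is accepted.
        System.out.println(resolveInside(base, "docs/readme.txt"));
        try
        {
            // A name that climbs out of the base directory is rejected.
            resolveInside(base, "../escape.txt");
        }
        catch (IOException e)
        {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

`File.getCanonicalPath()` collapses `..` segments and resolves symbolic links where the platform supports it, which is why the comparison is made on canonical strings rather than on the raw concatenation.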
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
The Apache PDFBox ExtractEmbeddedFiles example has a path traversal vulnerability (CWE-22) that allows attackers to write files outside the intended directory.
Vulnerability Overview
The ExtractEmbeddedFiles example in Apache PDFBox versions 2.0.24 through 2.0.35 and 3.0.0 through 3.0.6 contains a path traversal vulnerability (CWE-22). The issue arises because the filename obtained from PDComplexFileSpecification.getFilename() is directly appended to the extraction path without proper validation [1][4]. This allows a malicious PDF to specify a filename containing path traversal sequences (e.g., ../) to write extracted files to arbitrary locations on the filesystem.
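To make the mechanism concrete, the following sketch mimics the `directoryPath + filename` pattern without using PDFBox at all. The class `NaiveConcatDemo` and its helpers are hypothetical and exist only for illustration; the filenames stand in for values a PDF could supply.

```java
import java.io.File;
import java.io.IOException;

// Illustration only: reproduces the string-concatenation pattern, not PDFBox code.
class NaiveConcatDemo
{
    // Builds the target the way the vulnerable pattern does: plain concatenation.
    static File naiveTarget(String directoryPath, String filename)
    {
        return new File(directoryPath + filename);
    }

    // Reports whether the canonical target still lies below the canonical base directory.
    static boolean staysInside(String directoryPath, String filename) throws IOException
    {
        String base = new File(directoryPath).getCanonicalPath() + File.separator;
        String target = naiveTarget(directoryPath, filename).getCanonicalPath();
        return target.startsWith(base);
    }

    public static void main(String[] args) throws IOException
    {
        String dir = System.getProperty("java.io.tmpdir") + File.separator + "extract" + File.separator;
        // Benign name: the target remains below the extraction directory.
        System.out.println(staysInside(dir, "report.txt"));
        // Name with ../ sequences: the target resolves above the extraction directory.
        System.out.println(staysInside(dir, "../../evil.txt"));
    }
}
```

Nothing is written to disk here; the sketch only shows where a write would land once the `..` segments are resolved.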
Exploitation Details
An attacker can exploit this vulnerability by crafting a PDF with an embedded file whose filename includes path traversal characters. When the vulnerable example code extracts the embedded file, it concatenates the attacker-controlled filename with the base extraction directory, resulting in file writes outside the intended directory. No authentication is required beyond the ability to supply a malicious PDF to a service using the affected example code [1][4].
Impact
Successful exploitation allows an attacker to write attacker-controlled file content to any location on the server's filesystem that the extracting process can write to, potentially leading to remote code execution, overwriting of critical files, or other unauthorized modifications. The vulnerability is rated as moderate severity [4].
Mitigation
The Apache PDFBox project has fixed the example [3]: both the initial path and the extraction path are now converted to canonical paths, and the code verifies that the extraction path contains the initial path. Users who have copied this example into production code should review and update their code to include similar path validation. The fix is available in PDFBox 2.0.36 and 3.0.7 [4].
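For code that already works with `java.nio.file`, a similar validation can be sketched with `Path.normalize()` and `Path.startsWith()`. This is a different technique from the canonical-path approach described above, not the PDFBox fix itself; `NioPathCheck` and `resolveInside` are hypothetical names. Note that `normalize()` is purely lexical and does not resolve symbolic links, so this variant assumes the base directory contains no attacker-controlled links.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch of a lexical containment check using java.nio.file (hypothetical names).
class NioPathCheck
{
    static Path resolveInside(Path baseDir, String filename) throws IOException
    {
        Path base = baseDir.toAbsolutePath().normalize();
        // resolve() returns the filename itself if it is absolute, so such names fail the check below.
        Path target = base.resolve(filename).normalize();
        // Path.startsWith compares whole name elements, so "/out" does not match "/output".
        if (!target.startsWith(base) || target.equals(base))
        {
            throw new IOException("Refusing to extract outside " + base + ": " + filename);
        }
        return target;
    }

    public static void main(String[] args)
    {
        Path base = Paths.get(System.getProperty("java.io.tmpdir"), "extract");
        for (String name : new String[] { "a/b.txt", "../../x", "/etc/passwd" })
        {
            try
            {
                System.out.println("accepted: " + resolveInside(base, name));
            }
            catch (IOException e)
            {
                System.out.println("rejected: " + name);
            }
        }
    }
}
```

Which variant fits better depends on the surrounding code; the canonical-path check also covers symbolic links, at the cost of touching the filesystem.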
AI Insight generated on May 18, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| org.apache.pdfbox:pdfbox-examples (Maven) | >= 2.0.24, < 3.0.7 | 3.0.7 |
Affected products
- Apache Software Foundation/Apache PDFBox Examples (v5), Range: 2.0.24
Patches
b028eafdf101 PDFBOX-5660: use more accurate names
1 file changed · +14 −14
examples/src/main/java/org/apache/pdfbox/examples/pdmodel/ExtractEmbeddedFiles.java (+14 −14, modified)

    @@ -62,26 +62,26 @@ public static void main( String[] args ) throws IOException
             }
             File pdfFile = new File(args[0]);
    -        String filePath = pdfFile.getParent() + FileSystems.getDefault().getSeparator();
    +        String directoryPath = pdfFile.getParent() + FileSystems.getDefault().getSeparator();
             try (PDDocument document = Loader.loadPDF(pdfFile))
             {
                 PDDocumentNameDictionary namesDictionary = new PDDocumentNameDictionary(document.getDocumentCatalog());
                 PDEmbeddedFilesNameTreeNode efTree = namesDictionary.getEmbeddedFiles();
                 if (efTree != null)
                 {
    -                extractFilesFromEFTree(efTree, filePath);
    +                extractFilesFromEFTree(efTree, directoryPath);
                 }
                 // extract files from page annotations
                 for (PDPage page : document.getPages())
                 {
    -                extractFilesFromPage(page, filePath);
    +                extractFilesFromPage(page, directoryPath);
                 }
             }
         }

    -    private static void extractFilesFromPage(PDPage page, String filePath) throws IOException
    +    private static void extractFilesFromPage(PDPage page, String directoryPath) throws IOException
         {
             for (PDAnnotation annotation : page.getAnnotations())
             {
    @@ -95,19 +95,19 @@ private static void extractFilesFromPage(PDPage page, String filePath) throws IO
                         PDEmbeddedFile embeddedFile = getEmbeddedFile(complexFileSpec);
                         if (embeddedFile != null)
                         {
    -                        extractFile(filePath, complexFileSpec.getFilename(), embeddedFile);
    +                        extractFile(complexFileSpec.getFilename(), embeddedFile, directoryPath);
                         }
                     }
                 }
             }
         }

    -    private static void extractFilesFromEFTree(PDNameTreeNode<PDComplexFileSpecification> efTree, String filePath) throws IOException
    +    private static void extractFilesFromEFTree(PDNameTreeNode<PDComplexFileSpecification> efTree, String directoryPath) throws IOException
         {
             Map<String, PDComplexFileSpecification> names = efTree.getNames();
             if (names != null)
             {
    -            extractFiles(names, filePath);
    +            extractFiles(names, directoryPath);
             }
             else
             {
    @@ -118,29 +118,29 @@ private static void extractFilesFromEFTree(PDNameTreeNode<PDComplexFileSpecifica
                 }
                 for (PDNameTreeNode<PDComplexFileSpecification> node : kids)
                 {
    -                extractFilesFromEFTree(node, filePath);
    +                extractFilesFromEFTree(node, directoryPath);
                 }
             }
         }

    -    private static void extractFiles(Map<String, PDComplexFileSpecification> names, String filePath)
    +    private static void extractFiles(Map<String, PDComplexFileSpecification> names, String directoryPath)
                 throws IOException
         {
             for (Entry<String, PDComplexFileSpecification> entry : names.entrySet())
             {
    -            PDComplexFileSpecification fileSpec = entry.getValue();
    -            PDEmbeddedFile embeddedFile = getEmbeddedFile(fileSpec);
    +            PDComplexFileSpecification complexFileSpec = entry.getValue();
    +            PDEmbeddedFile embeddedFile = getEmbeddedFile(complexFileSpec);
                 if (embeddedFile != null)
                 {
    -                extractFile(filePath, fileSpec.getFilename(), embeddedFile);
    +                extractFile(complexFileSpec.getFilename(), embeddedFile, directoryPath);
                 }
             }
         }

    -    private static void extractFile(String filePath, String filename, PDEmbeddedFile embeddedFile)
    +    private static void extractFile(String filename, PDEmbeddedFile embeddedFile, String directoryPath)
                 throws IOException
         {
    -        String embeddedFilename = filePath + filename;
    +        String embeddedFilename = directoryPath + filename;
             File file = new File(embeddedFilename);
             File parentDir = file.getParentFile();
             if (!parentDir.exists())
References
- github.com/advisories/GHSA-jjwr-xmw6-gf78 (ghsa, ADVISORY)
- lists.apache.org/thread/gyfq5tcrxfv7rx0z2yyx4hb3h53ndffw (ghsa, vendor-advisory, WEB)
- nvd.nist.gov/vuln/detail/CVE-2026-23907 (ghsa, ADVISORY)
- www.openwall.com/lists/oss-security/2026/03/10/1 (ghsa, WEB)
- github.com/apache/pdfbox/commit/b028eafdf101b58e4ee95430c3be25e3e3aa29d7 (ghsa, WEB)