VYPR
Critical severity · NVD Advisory · Published Aug 20, 2025 · Updated Feb 26, 2026

Apache Tika PDF parser module: XXE vulnerability in PDFParser's handling of XFA

CVE-2025-54988

Description

Critical XXE in Apache Tika (tika-parser-pdf-module) in Apache Tika 1.13 through and including 3.2.1 on all platforms allows an attacker to carry out XML External Entity injection via a crafted XFA file inside of a PDF. An attacker may be able to read sensitive data or trigger malicious requests to internal resources or third-party servers. Note that the tika-parser-pdf-module is used as a dependency in several Tika packages including at least: tika-parsers-standard-modules, tika-parsers-standard-package, tika-app, tika-grpc and tika-server-standard.

Users are recommended to upgrade to version 3.2.2, which fixes this issue.

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

A critical XXE vulnerability in Apache Tika's tika-parser-pdf-module allows an attacker to read sensitive data via a crafted XFA file inside a PDF; upgrade to 3.2.2.

CVE-2025-54988 is a critical XML External Entity (XXE) injection vulnerability in the Apache Tika PDF parser module (tika-parser-pdf-module). The flaw arises from improper handling of XFA (XML Forms Architecture) files embedded within PDFs, allowing an attacker to define malicious external entities that the parser processes without restriction [4].

Exploitation requires only a crafted PDF containing a malicious XFA file; no authentication is needed if the parser processes user-supplied documents. The XXE can be used to read arbitrary files from the server's filesystem (e.g., configuration files, credentials) or to send HTTP requests to internal resources or third-party servers (SSRF) [4]. The vulnerable module is a dependency of several Tika packages, including at least tika-parsers-standard-modules, tika-parsers-standard-package, tika-app, tika-grpc, and tika-server-standard, which widens the potential attack surface.

The impact includes sensitive data exfiltration and potential pivoting to internal networks via SSRF. Apache Tika 3.2.2 (released 2025-08-06) fixes the issue [1]. Users on versions 1.13 through 3.2.1 should upgrade immediately; no workarounds are documented.
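
To make the XXE mechanism concrete, the following is a minimal sketch of generic JAXP hardening: a DOM parser factory that refuses any DOCTYPE declaration, so an external entity can never be defined or resolved. It is an illustration only and is not taken from the Tika 3.2.2 change; the class and method names (`XxeHardeningSketch`, `hardenedFactory`, `acceptedByHardenedParser`) are hypothetical, and the sketch assumes the JDK's built-in Xerces-based parser, which recognises the `disallow-doctype-decl` feature.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical helper; not part of Apache Tika.
final class XxeHardeningSketch {

    /** Builds a DOM parser factory that rejects any document containing a DOCTYPE. */
    static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Assumption: the underlying parser is Xerces-derived and supports this feature.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    /** Returns true if the hardened parser accepts the XML, false if it rejects it. */
    static boolean acceptedByHardenedParser(String xml) {
        try {
            DocumentBuilder builder = hardenedFactory().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            // A DOCTYPE (and therefore any external entity definition) ends up here.
            return false;
        }
    }
}
```

With this configuration, a plain XML fragment parses normally, while a payload that declares `<!ENTITY e SYSTEM "file:///etc/passwd">` is rejected before the entity is ever resolved.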

AI Insight generated on May 19, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.

Affected packages

Versions sourced from the GitHub Security Advisory.

Package                                          Affected versions         Patched versions
org.apache.tika:tika-parser-pdf-module (Maven)   >= 1.13, < 3.2.2          3.2.2
org.apache.tika:tika-parsers (Maven)             >= 1.13, < 2.0.0-ALPHA    2.0.0-ALPHA
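
As a rough aid for auditing dependencies, the range listed for tika-parser-pdf-module (>= 1.13, < 3.2.2) can be checked with a small version comparison. This is a sketch under simplifying assumptions: it handles only dotted numeric versions, ignores any qualifier after a hyphen, and does not implement Maven's full version-ordering rules. The class and method names are hypothetical.

```java
// Hypothetical helper; a simplified comparison, not Maven's version ordering.
final class AffectedRangeSketch {

    /** Parses "3.2.1" into {3, 2, 1}; anything after the first '-' is ignored. */
    static int[] parse(String version) {
        String numeric = version.split("-", 2)[0];
        String[] parts = numeric.split("\\.");
        int[] out = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    /** Compares two versions component by component, padding the shorter with zeros. */
    static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? x[i] : 0;
            int yi = i < y.length ? y[i] : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** True if the version lies in the advisory's range: >= 1.13 and < 3.2.2. */
    static boolean isAffectedPdfModule(String version) {
        return compare(version, "1.13") >= 0 && compare(version, "3.2.2") < 0;
    }
}
```

For example, `isAffectedPdfModule("3.2.1")` is true and `isAffectedPdfModule("3.2.2")` is false, matching the patched version listed above.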

Affected products

2
  • Apache/Tika (llm-fuzzy)
    Range: >=1.13 <=3.2.1
  • Apache Software Foundation/Apache Tika PDF parser module (v5)
    Range: 1.13

Patches

1
2b52257304f4

TIKA-4459 -- force stream to zip file to handle encrypted od* documents correctly (#2291)

https://github.com/apache/tika · Tim Allison · Jul 31, 2025 · via ghsa
2 files changed · +20 -52
  • tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-miscoffice-module/src/main/java/org/apache/tika/parser/odf/OpenDocumentParser.java · +13 -52 · modified
    @@ -20,16 +20,13 @@
     
     import java.io.IOException;
     import java.io.InputStream;
    -import java.util.ArrayList;
     import java.util.Arrays;
     import java.util.Collections;
     import java.util.Enumeration;
     import java.util.HashSet;
    -import java.util.List;
     import java.util.Set;
     import java.util.zip.ZipEntry;
     import java.util.zip.ZipFile;
    -import java.util.zip.ZipInputStream;
     
     import org.apache.commons.io.IOUtils;
     import org.apache.commons.io.input.CloseShieldInputStream;
    @@ -40,7 +37,6 @@
     import org.apache.tika.config.Field;
     import org.apache.tika.exception.EncryptedDocumentException;
     import org.apache.tika.exception.TikaException;
    -import org.apache.tika.exception.WriteLimitReachedException;
     import org.apache.tika.extractor.EmbeddedDocumentUtil;
     import org.apache.tika.io.TikaInputStream;
     import org.apache.tika.metadata.Metadata;
    @@ -134,21 +130,21 @@ public void parse(InputStream stream, ContentHandler baseHandler, Metadata metad
             // Open the Zip stream
             // Use a File if we can, and an already open zip is even better
             ZipFile zipFile = null;
    -        ZipInputStream zipStream = null;
    +        TikaInputStream tmpTis = null;
             if (stream instanceof TikaInputStream) {
                 TikaInputStream tis = (TikaInputStream) stream;
                 Object container = ((TikaInputStream) stream).getOpenContainer();
                 if (container instanceof ZipFile) {
                     zipFile = (ZipFile) container;
    -            } else if (tis.hasFile()) {
    -                zipFile = new ZipFile(tis.getFile());
                 } else {
    -                zipStream = new ZipInputStream(stream);
    +                zipFile = new ZipFile(tis.getFile());
    +                tis.setOpenContainer(zipFile);
                 }
             } else {
    -            zipStream = new ZipInputStream(stream);
    +            tmpTis = TikaInputStream.get(stream);
    +            tmpTis.setOpenContainer(new ZipFile(tmpTis.getFile()));
    +            zipFile = (ZipFile) tmpTis.getOpenContainer();
             }
    -
             // Prepare to handle the content
             XHTMLContentHandler xhtml = new XHTMLContentHandler(baseHandler, metadata);
             xhtml.startDocument();
    @@ -157,19 +153,13 @@ public void parse(InputStream stream, ContentHandler baseHandler, Metadata metad
             EndDocumentShieldingContentHandler handler = new EndDocumentShieldingContentHandler(xhtml);
     
             try {
    -            if (zipFile != null) {
    -                try {
    -                    handleZipFile(zipFile, metadata, context, handler, embeddedDocumentUtil);
    -                } finally {
    -                    //Do we want to close silently == catch an exception here?
    -                    zipFile.close();
    -                }
    -            } else {
    -                try {
    -                    handleZipStream(zipStream, metadata, context, handler, embeddedDocumentUtil);
    -                } finally {
    -                    //Do we want to close silently == catch an exception here?
    -                    zipStream.close();
    +            try {
    +                handleZipFile(zipFile, metadata, context, handler, embeddedDocumentUtil);
    +            } finally {
    +                //Do we want to close silently == catch an exception here?
    +                if (tmpTis != null) {
    +                    //tmpTis handles closing of the open zip container
    +                    tmpTis.close();
                     }
                 }
             } catch (SAXException e) {
    @@ -194,35 +184,6 @@ public boolean isExtractMacros() {
             return extractMacros;
         }
     
    -    private void handleZipStream(ZipInputStream zipStream, Metadata metadata, ParseContext context,
    -                                 EndDocumentShieldingContentHandler handler,
    -                                 EmbeddedDocumentUtil embeddedDocumentUtil)
    -            throws IOException, TikaException, SAXException {
    -        ZipEntry entry = zipStream.getNextEntry();
    -        if (entry == null) {
    -            throw new IOException("No entries found in ZipInputStream");
    -        }
    -        List<SAXException> exceptions = new ArrayList<>();
    -        do {
    -            try {
    -                handleZipEntry(entry, zipStream, metadata, context, handler,
    -                        embeddedDocumentUtil);
    -            } catch (SAXException e) {
    -                WriteLimitReachedException.throwIfWriteLimitReached(e);
    -                if (e.getCause() instanceof EncryptedDocumentException) {
    -                    throw (EncryptedDocumentException)e.getCause();
    -                } else {
    -                    exceptions.add(e);
    -                }
    -            }
    -            entry = zipStream.getNextEntry();
    -        } while (entry != null);
    -
    -        if (exceptions.size() > 0) {
    -            throw exceptions.get(0);
    -        }
    -    }
    -
         private void handleZipFile(ZipFile zipFile, Metadata metadata, ParseContext context,
                                    EndDocumentShieldingContentHandler handler,
                                    EmbeddedDocumentUtil embeddedDocumentUtil)
    
  • tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-miscoffice-module/src/test/java/org/apache/tika/parser/odf/ODFParserTest.java · +7 -0 · modified
    @@ -25,6 +25,7 @@
     import java.io.IOException;
     import java.io.InputStream;
     import java.nio.charset.StandardCharsets;
    +import java.nio.file.Files;
     import java.nio.file.Path;
     import java.nio.file.Paths;
     import java.util.Arrays;
    @@ -415,6 +416,12 @@ public void testEncryptedODTFile() throws Exception {
                 getRecursiveMetadata(p, false);
             });
     
    +        assertThrows(EncryptedDocumentException.class, () -> {
    +            try (InputStream is = Files.newInputStream(p)) {
    +                getRecursiveMetadata(is, false);
    +            }
    +        });
    +
             List<Metadata> metadataList = getRecursiveMetadata(p, true);
             assertEquals("true", metadataList.get(0).get(TikaCoreProperties.IS_ENCRYPTED));
         }
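
The diff above removes the `ZipInputStream` code path and always opens a `ZipFile`, spooling a plain `InputStream` to a file first (via `TikaInputStream.get(stream)` and `getFile()`). The same idea can be sketched with the JDK alone. This is an approximation for illustration, not Tika's implementation: `SpoolToZipFileSketch`, `entryNames`, and `sampleZip` are hypothetical names, and temp-file handling here is deliberately simple.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper; uses only JDK classes, not TikaInputStream.
final class SpoolToZipFileSketch {

    /** Copies the stream to a temp file, opens it as a ZipFile, and lists entry names. */
    static List<String> entryNames(InputStream stream) {
        Path tmp = null;
        try {
            tmp = Files.createTempFile("spooled-", ".zip");
            Files.copy(stream, tmp, StandardCopyOption.REPLACE_EXISTING);
            List<String> names = new ArrayList<>();
            // ZipFile reads the central directory, unlike ZipInputStream,
            // which walks local file headers sequentially.
            try (ZipFile zipFile = new ZipFile(tmp.toFile())) {
                Enumeration<? extends ZipEntry> entries = zipFile.entries();
                while (entries.hasMoreElements()) {
                    names.add(entries.nextElement().getName());
                }
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (tmp != null) {
                try {
                    Files.deleteIfExists(tmp);
                } catch (IOException ignored) {
                    // Best-effort cleanup of the spooled file.
                }
            }
        }
    }

    /** Builds a small in-memory zip with the given entry names, for demonstration. */
    static byte[] sampleZip(String... names) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            for (String name : names) {
                zos.putNextEntry(new ZipEntry(name));
                zos.write(name.getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

The trade-off is extra disk I/O for every stream input in exchange for a single code path; per the commit title, the goal is to handle encrypted od* documents correctly regardless of how the input arrives.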
    

Vulnerability mechanics

Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.

References

11
