VYPR
Advisory · Published Jun 3, 2026 · 1 source

Docling Project: Eight High-Severity Vulnerabilities Disclosed Together


Key findings

  • Eight high-severity vulnerabilities disclosed for Docling Project on June 3, 2026.
  • Vulnerabilities include SSRF, Path Traversal, XXE, and unsafe archive extraction.
  • Impacts core functionalities, HTML, LaTeX, USPTO Patent, METS-GBS, and EasyOCR backends.
  • Multiple CVEs allow arbitrary file read access on the server.
  • Affected versions range from pre-1.5.0 up to versions just below 2.91.0.
  • Patches are available for specific versions, requiring prompt updates.

On June 3, 2026, a cluster of eight high-severity vulnerabilities was disclosed for the Docling Project, affecting multiple components including its core, HTML backend, LaTeX backend, and specific parsers. These vulnerabilities, disclosed within a span of just over an hour, highlight significant security weaknesses in how the project handles remote resources, file paths, and data parsing.

The vulnerabilities span several attack vectors, including Server-Side Request Forgery (SSRF), Path Traversal, XML External Entity (XXE) attacks, and unsafe archive extraction. docling-core itself is affected by CVE-2026-44023, an unsafe remote filename resolution issue that could lead to SSRF attacks targeting local files. Additionally, CVE-2026-44019 in docling-core covers the acceptance of local file:// image references and unconstrained data: URIs, which could expose local files or cause excessive memory use.
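The class of checks that addresses file:// references and oversized data: URIs can be sketched as a scheme allow-list with a size cap. This is an illustration only, not Docling's actual fix: the function name `check_image_reference`, the allowed schemes, and the size limit are all assumptions made for the example.

```python
from urllib.parse import urlsplit

# Assumption: only web URLs are expected for remote images.
ALLOWED_REMOTE_SCHEMES = {"http", "https"}
# Assumption: an illustrative cap on inline data: URIs, in characters.
MAX_DATA_URI_CHARS = 1_000_000


def check_image_reference(ref: str) -> str:
    """Classify an image reference as 'remote' or 'inline', or raise ValueError.

    Hypothetical helper: rejects file:// URIs, bare paths, unknown schemes,
    and data: URIs larger than the cap.
    """
    scheme = urlsplit(ref).scheme.lower()
    if scheme == "data":
        if len(ref) > MAX_DATA_URI_CHARS:
            raise ValueError("data: URI exceeds size cap")
        return "inline"
    if scheme in ALLOWED_REMOTE_SCHEMES:
        return "remote"
    raise ValueError(f"scheme not allowed: {scheme or '(none)'}")
```

A scheme check alone does not stop requests to internal hosts over http or https; a fuller SSRF defense would also validate the resolved destination address.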

The HTML backend is particularly affected by CVE-2026-47214, which permits local file system access via file:// URIs, directory traversal using ../ sequences, and the use of absolute paths. This could allow attackers to access sensitive files outside intended directories. Similarly, the LaTeX backend is vulnerable to path traversal through commands like \includegraphics and \input via CVE-2026-44022, enabling attackers to read arbitrary files on the server.
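A common defense against ../ sequences and absolute paths is to resolve the requested path and confirm it still lies under an intended base directory. The sketch below shows that check with the standard library. The helper name `resolve_inside` is hypothetical and is not taken from Docling; it requires Python 3.9 or later for `Path.is_relative_to`.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path against base_dir and refuse anything that escapes it.

    Hypothetical helper. Absolute paths replace the base when joined, and
    ../ sequences are collapsed by resolve(), so both end up outside base
    and are rejected by the containment check.
    """
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes base directory: {user_path}")
    return candidate
```

This handles plain file system paths only. References carrying a URI scheme such as file:// would need to be rejected or parsed before reaching a check like this one.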

XML parsing vulnerabilities are also prominent in this batch. CVE-2026-44020 affects the USPTO Patent Backend, which uses standard XML parsing without proper safeguards and is therefore exposed to XML External Entity (XXE) attacks, allowing arbitrary file reads and SSRF. The METS-GBS Backend is covered by CVE-2026-44018, which combines XXE exposure with unsafe archive extraction, decompression bombs, and unbounded extraction.
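One conservative way to close off XXE is to refuse any document that carries a DOCTYPE declaration, since external entities are declared there. The following is a minimal standard-library sketch of that idea, not the patch applied to either backend. The function name `parse_xml_no_dtd` is an assumption for illustration; the third-party defusedxml package offers a similar approach.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


def parse_xml_no_dtd(data: bytes) -> ET.Element:
    """Parse XML only if it contains no DOCTYPE declaration.

    Hypothetical helper: a first expat pass raises as soon as a DOCTYPE
    starts, so entity declarations are never processed. Only documents
    that pass this check are handed to ElementTree.
    """

    def _forbid_doctype(name, sysid, pubid, has_internal_subset):
        raise ValueError("DOCTYPE declarations are not allowed")

    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _forbid_doctype
    checker.Parse(data, True)  # raises ValueError on DOCTYPE, ExpatError on bad XML
    return ET.fromstring(data)
```

The input is parsed twice, which trades some speed for a simple check that does not depend on parser internals.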

Two further vulnerabilities relate to rendering and model downloads. CVE-2026-44016 concerns the Playwright-based HTML rendering feature, which could allow JavaScript execution and unrestricted network access when processing untrusted HTML documents, but only if the rendering option is explicitly configured. Finally, CVE-2026-44017 describes unsafe Zip extraction in the EasyOCR model download functionality, enabling Zip Slip attacks that could write arbitrary files to any location accessible by the process.
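Zip Slip is typically prevented by validating every archive member's destination before anything is written. The sketch below illustrates that pattern, together with a cap on total declared size as a basic guard against unbounded extraction. It is not the EasyOCR or Docling code: `safe_extract` and the size limit are assumptions for the example.

```python
import zipfile
from pathlib import Path

# Assumption: an illustrative cap on total declared uncompressed size.
MAX_TOTAL_BYTES = 500 * 1024 * 1024


def safe_extract(zip_source, dest_dir: str, max_total_bytes: int = MAX_TOTAL_BYTES):
    """Extract an archive only if every member stays inside dest_dir.

    Hypothetical helper. zip_source may be a path or a file-like object.
    All members are validated first, so a rejected archive writes nothing.
    """
    dest = Path(dest_dir).resolve()
    with zipfile.ZipFile(zip_source) as zf:
        members = zf.infolist()
        # Sizes come from the archive headers and can be forged; a stricter
        # version would also count bytes while streaming each member out.
        if sum(m.file_size for m in members) > max_total_bytes:
            raise ValueError("archive exceeds size cap")
        for m in members:
            target = (dest / m.filename).resolve()
            if not target.is_relative_to(dest):
                raise ValueError(f"member escapes destination: {m.filename}")
        zf.extractall(dest)
    return [m.filename for m in members]
```

Rejecting the whole archive on the first bad member, rather than skipping it, keeps a partially malicious download from being installed at all.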

Most of these vulnerabilities affect versions prior to specific patch releases. The version ranges given appear to use affected-range notation, in which the upper bound marks the first fixed release. For instance, CVE-2026-44023 and CVE-2026-44019 are listed for versions >= 1.5.0, < 2.74.1 and >= 2.5.0, < 2.74.1 respectively, CVE-2026-44016 for versions >= 2.82.0, < 2.91.0, and CVE-2026-44017 for versions < 2.91.0. Read this way, the fixes would land in 2.74.1 and 2.91.0. The specific patch versions for other CVEs in this batch are not detailed but are implied to be within the same release cycle.
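To check an installed release against one of these ranges, the bounds can be compared numerically rather than as strings. The sketch below assumes the ranges are affected ranges in the usual advisory sense (lower bound inclusive, upper bound exclusive) and that versions are plain dotted integers. The helper names are hypothetical.

```python
from typing import Optional


def parse_version(version: str) -> tuple:
    """Turn '2.74.1' into (2, 74, 1). Assumes plain dotted integers."""
    return tuple(int(part) for part in version.split("."))


def in_range(installed: str, lower: Optional[str], upper: str) -> bool:
    """True if lower <= installed < upper; lower=None means no lower bound."""
    v = parse_version(installed)
    if lower is not None and v < parse_version(lower):
        return False
    return v < parse_version(upper)


# Range quoted for CVE-2026-44023: >= 1.5.0, < 2.74.1
print(in_range("2.74.0", "1.5.0", "2.74.1"))
# Range quoted for CVE-2026-44017: < 2.91.0
print(in_range("2.91.0", None, "2.91.0"))
```

Tuple comparison matters here: as strings, "2.10.0" would sort before "2.5.0", which would misclassify releases. Each range applies to the package named in its own advisory, so the installed version should be taken from that package.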

Users of the Docling Project are strongly advised to review the specific versioning information for each vulnerability and apply the necessary updates promptly. The widespread nature of these issues across different modules underscores the importance of a comprehensive security audit for applications relying on Docling for document processing and rendering.

Synthesized by Vypr AI