Docling: Unsafe Playwright-based HTML Rendering
Description
Impact
In versions >= 2.82.0 and < 2.91.0, the Playwright-based rendering feature of the HTML backend could allow JavaScript execution and unrestricted network access when processing untrusted HTML documents. The issue applies only when the backend is explicitly configured for rendering; the rendering option is disabled by default. An attacker could craft malicious HTML that executes arbitrary JavaScript in the rendering context or makes unauthorized network requests to internal services, potentially leading to SSRF attacks, data exfiltration, or remote code execution in the rendering environment.
Patches
Fixed in version 2.91.0. The rendering context now explicitly disables JavaScript execution (java_script_enabled=False) and implements network isolation controls. When enable_remote_fetch is disabled, the browser operates in offline mode, preventing all network requests.
Workarounds
Refrain from using render_page=True when processing untrusted HTML documents.
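The workaround can be enforced in application code rather than left to convention. The following is a minimal sketch, not Docling's API: `RenderOptions` and `options_for` are hypothetical names standing in for whatever options object the application passes to the HTML backend, and only the `render_page` flag comes from the advisory.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RenderOptions:
    """Hypothetical stand-in for the HTML backend options (not Docling's class)."""

    render_page: bool = False  # rendering is off by default, as the advisory states


def options_for(source_trusted: bool, requested: RenderOptions) -> RenderOptions:
    """Return options that never enable browser rendering for untrusted input."""
    if source_trusted:
        return requested
    # Untrusted HTML: force the non-rendering path regardless of what was requested.
    return replace(requested, render_page=False)
```

Routing every options object through a single function like this keeps the trust decision in one place until the upgrade to 2.91.0 is done.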
References
- Fix release: v2.91.0
Affected products
- Range: >=2.82.0, <2.91.0
Patches
9813190ab412 fix: Fixes to html_backend (#3342)
1 file changed · +39 −0
docling/backend/html_backend.py · +39 −0 · modified

    @@ -529,6 +529,25 @@ def _coerce_base_url(self, value: str) -> str:
                 return value
             return Path(value).resolve().as_uri()
     
    +    def _get_browser_request_block_reason(self, request_url: str) -> Optional[str]:
    +        parsed = urlparse(request_url)
    +        scheme = (parsed.scheme or "").lower()
    +        if scheme in {"file", "data", "about", "blob"}:
    +            return None
    +
    +        if HTMLDocumentBackend._is_remote_url(request_url):
    +            if self.options.enable_remote_fetch:
    +                return None
    +            return (
    +                "remote fetch is disabled "
    +                "(set options.enable_remote_fetch=True to allow)"
    +            )
    +
    +        return f"URL scheme '{scheme or '<empty>'}' is not allowed"
    +
    +    def _is_browser_request_allowed(self, request_url: str) -> bool:
    +        return self._get_browser_request_block_reason(request_url) is None
    +
         def _get_render_html_text(self) -> str:
             if self._raw_html_bytes is None:
                 return ""
    @@ -582,10 +601,30 @@ def _render_with_browser(self) -> None:
             with sync_playwright() as playwright:
                 browser = playwright.chromium.launch(headless=True)
    +            # If remote fetch is disabled, keep Chromium offline.
    +            offline_mode = not options.enable_remote_fetch
                 context = browser.new_context(
                     viewport={"width": width, "height": height},
                     device_scale_factor=options.render_device_scale,
    +                # Disable page JavaScript execution for deterministic static rendering.
    +                java_script_enabled=False,
    +                offline=offline_mode,
    +                service_workers="block",
                 )
    +
    +            def _route_request(route, request) -> None:
    +                block_reason = self._get_browser_request_block_reason(request.url)
    +                if block_reason is None:
    +                    route.continue_()
    +                else:
    +                    warnings.warn(
    +                        "Blocked browser request during HTML rendering: "
    +                        f"{request.method} {request.url} ({block_reason})"
    +                    )
    +                    route.abort("blockedbyclient")
    +
    +            context.route("**/*", _route_request)
    +
                 page = context.new_page()
                 if options.render_print_media:
                     page.emulate_media(media="print")
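The request-filtering decision in the diff can be exercised in isolation. The sketch below restates that logic as a free function so it runs without Docling or Playwright. The diff does not show the body of `_is_remote_url`, so treating `http` and `https` as the remote schemes is an assumption here, and `block_reason` and the two scheme sets are illustrative names.

```python
from typing import Optional
from urllib.parse import urlparse

LOCAL_SCHEMES = {"file", "data", "about", "blob"}  # always allowed in the diff
REMOTE_SCHEMES = {"http", "https"}  # assumption: what _is_remote_url accepts


def block_reason(request_url: str, enable_remote_fetch: bool) -> Optional[str]:
    """Return None if the request may proceed, else a human-readable reason."""
    scheme = (urlparse(request_url).scheme or "").lower()
    if scheme in LOCAL_SCHEMES:
        return None
    if scheme in REMOTE_SCHEMES:
        if enable_remote_fetch:
            return None
        return "remote fetch is disabled"
    # Anything else (ftp, ws, javascript, empty scheme, ...) is rejected outright.
    return f"URL scheme '{scheme or '<empty>'}' is not allowed"
```

Under these assumptions, a request to a cloud metadata address such as `http://169.254.169.254/` is refused whenever `enable_remote_fetch` is false, while `data:` and `about:` URLs pass in either mode.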
Vulnerability mechanics
Root cause
"The Playwright rendering backend did not disable JavaScript execution and lacked network isolation for untrusted HTML."
Attack vector
An attacker can provide untrusted HTML documents to the application when the HTML backend is explicitly configured for rendering (`render_page=True`). This can lead to arbitrary JavaScript execution within the rendering context or unauthorized network requests to internal services, potentially causing SSRF or data exfiltration [CWE-918].
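To make the attack surface concrete, the sketch below lists what a piece of untrusted HTML would ask a browser to fetch or run. It is an illustration only, not a defense and not part of Docling: `FetchTargetScanner`, `scan`, and the attribute list are hypothetical, and the sample addresses are placeholders for internal services.

```python
from html.parser import HTMLParser
from typing import List, Tuple

# Attributes that commonly cause a browser to issue a request (not exhaustive).
URL_ATTRS = {"src", "href", "action", "data", "poster"}


class FetchTargetScanner(HTMLParser):
    """Collect URL-bearing attributes and note whether any <script> is present."""

    def __init__(self) -> None:
        super().__init__()
        self.targets: List[Tuple[str, str]] = []
        self.has_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.has_script = True
        for name, value in attrs:
            if name in URL_ATTRS and value:
                self.targets.append((tag, value))


def scan(html: str) -> FetchTargetScanner:
    scanner = FetchTargetScanner()
    scanner.feed(html)
    scanner.close()
    return scanner


sample = (
    '<img src="http://169.254.169.254/latest/meta-data/">'
    '<script>fetch("http://10.0.0.5/admin")</script>'
)
report = scan(sample)
print(report.targets, report.has_script)
```

A static scan like this sees only the `img` URL; the request made inside the script is invisible until the page executes, which is why the fix disables JavaScript in the browser rather than relying on input inspection.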
Affected code
The vulnerability lies within the `_render_with_browser` method in `docling/backend/html_backend.py`. The patch modifies this method to set `java_script_enabled=False`, to pass `offline=offline_mode` (derived from the `enable_remote_fetch` option), to block service workers, and to route every browser request through a new `_get_browser_request_block_reason` check [patch_id=4714026].
What the fix does
The patch disables JavaScript execution in the Playwright rendering context by unconditionally setting `java_script_enabled=False`. It also adds network isolation controls: when `enable_remote_fetch` is disabled, the browser context runs in offline mode and a request route handler aborts remote requests with a warning. Requests whose URL is neither a `file`, `data`, `about`, or `blob` URL nor recognized as remote are always aborted. Together these changes mitigate the SSRF risk [patch_id=4714026].
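The context settings described above can be sketched outside Docling as follows. The keyword arguments `java_script_enabled`, `offline`, and `service_workers` are Playwright's `new_context` options as used in the diff; the function names `hardened_context_kwargs` and `render_static` are hypothetical, and the sketch omits the request-routing handler and viewport settings from the real patch. Playwright is imported lazily so the settings helper can be used without a browser installed.

```python
from typing import Any, Dict


def hardened_context_kwargs(enable_remote_fetch: bool) -> Dict[str, Any]:
    """Context settings mirroring the patch: no JS, offline unless fetch is allowed."""
    return {
        "java_script_enabled": False,
        "offline": not enable_remote_fetch,
        "service_workers": "block",
    }


def render_static(html: str, enable_remote_fetch: bool = False) -> bytes:
    """Render HTML to a PNG screenshot in a locked-down Chromium context."""
    # Lazy import: requires the playwright package and an installed Chromium.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            context = browser.new_context(
                **hardened_context_kwargs(enable_remote_fetch)
            )
            page = context.new_page()
            page.set_content(html)
            return page.screenshot()
        finally:
            browser.close()
```

Keeping the settings in a small pure function makes it straightforward to assert in unit tests that JavaScript stays disabled, without launching a browser.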
Preconditions
- config: The HTML backend must be explicitly configured for rendering (e.g., `render_page=True`).
- input: The application must process untrusted HTML documents.
Generated on Jun 3, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References
News mentions
- Docling Project: Eight High-Severity Vulnerabilities Disclosed Together (Vypr Intelligence · Jun 3, 2026)