CVE-2026-48156
Description
pypdf is a free and open-source pure-Python PDF library. Prior to 6.12.0, an attacker can craft a PDF that causes long runtimes when it is parsed. Exploitation requires a cross-reference stream with /W [0 0 0] and a large /Size value. This vulnerability is fixed in 6.12.0.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
pypdf prior to 6.12.0 is vulnerable to long runtimes via crafted PDFs with zero-width cross-reference streams and large size values.
Vulnerability
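In pypdf versions prior to 6.12.0, the cross-reference stream parser does not reject a width array /W whose entries are all zero. According to the patch, the earlier code summed the widths and clamped the result to at least 1 to avoid a division by zero, so a stream with zero-width entries was treated as if each entry occupied one byte instead of being rejected. An attacker can craft a PDF containing a cross-reference stream with /W [0 0 0] and a very large /Size value. When pypdf processes such a PDF, it attempts to read a large number of cross-reference entries, leading to excessive processing time. This affects all versions before 6.12.0 [1][2][3].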
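The width handling changed by the patch can be illustrated with a small standalone sketch. The two expressions are copied from the diff for pypdf/_reader.py; the function names and the sample stream length are hypothetical and exist only for this illustration, not in pypdf itself.

```python
def min_entry_bytes_before_fix(entry_sizes):
    # Expression removed by the patch: the summed widths are clamped to at least 1.
    return max(sum(int(entry_sizes[i]) for i in range(min(len(entry_sizes), 3))), 1)


def min_entry_bytes_after_fix(entry_sizes):
    # Expression added by the patch: no clamping, so all-zero widths yield 0.
    return sum(int(entry_sizes[i]) for i in range(min(len(entry_sizes), 3)))


def max_entries(stream_length, entry_bytes):
    # Mirrors the unchanged line in the diff: entries that could physically fit, plus one.
    return stream_length // entry_bytes + 1


if __name__ == "__main__":
    stream_length = 1000  # hypothetical decoded stream length, for illustration only
    for widths in ([0, 0, 0], [1, 0, 0], [1, 2, 1]):
        before = min_entry_bytes_before_fix(widths)
        after = min_entry_bytes_after_fix(widths)
        limit = max_entries(stream_length, after) if after else "rejected"
        print(widths, "before:", max_entries(stream_length, before), "after:", limit)
```

With /W [0 0 0], the pre-fix expression yields 1, so the entry limit is derived as if every entry took one byte. The post-fix expression yields 0, which the patched code treats as "Cross-reference stream encodes no entry data."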
Exploitation
An attacker needs only the ability to supply a malicious PDF to a system or user that uses pypdf to read it. No authentication or special privileges are required. The attacker crafts a PDF with a cross-reference stream where the /W array is set to [0 0 0] and the /Size field is set to a very large number. When pypdf's reader parses this stream, it attempts to iterate over the number of entries specified by /Size, resulting in a long runtime [3].
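Because the trigger is a literal /W [0 0 0] array in a cross-reference stream dictionary, a service that cannot upgrade immediately could pre-screen uploads before handing them to pypdf. The following is a rough heuristic sketch, not part of pypdf: the function name and regular expression are assumptions, it only inspects raw bytes, and it can produce false positives (for example, a font dictionary may also carry a /W array).

```python
import re

# Matches a /W array with exactly three all-zero widths,
# e.g. b"/W [0 0 0]" or b"/W[ 0 0 0 ]". Heuristic only.
ZERO_WIDTH_W = re.compile(rb"/W\s*\[\s*0+\s+0+\s+0+\s*\]")


def looks_like_zero_width_xref(pdf_bytes: bytes) -> bool:
    """Return True if the raw bytes contain a /W array with three zero widths."""
    return ZERO_WIDTH_W.search(pdf_bytes) is not None


if __name__ == "__main__":
    sample = b"3 0 obj\n<< /Type /XRef /Size 50000000 /W [0 0 0] /Root 1 0 R >>"
    print(looks_like_zero_width_xref(sample))
```

A match is a reason to quarantine or reject the file, not proof of malice. Upgrading remains the reliable fix.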
Impact
Successful exploitation causes pypdf to consume excessive CPU time, potentially leading to a denial of service (DoS) condition. The impact is limited to availability; there is no evidence of information disclosure, data corruption, or remote code execution. The severity is considered medium due to the resource exhaustion vector [3].
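Since the impact is CPU time rather than memory corruption, one general way to limit the damage is to parse untrusted PDFs in a child process with a wall-clock limit. The sketch below is an assumption-laden illustration, not guidance from the advisory: the helper name, the child script, and the 10-second limit are hypothetical choices (the limit echoes the timeout used in the patch's tests), and it relies only on the standard library plus pypdf's `PdfReader` in the child.

```python
import subprocess
import sys

# Child script: opens the PDF given as the first argument and prints its page count.
CHILD_CODE = (
    "import sys\n"
    "from pypdf import PdfReader\n"
    "reader = PdfReader(sys.argv[1])\n"
    "print(len(reader.pages))\n"
)


def run_with_time_limit(code, args, timeout_s):
    """Run `code` in a fresh interpreter; return its exit code, or None on timeout."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code, *args],
            capture_output=True,
            timeout=timeout_s,  # the child is killed when this limit is exceeded
            check=False,
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python this_script.py untrusted.pdf
    result = run_with_time_limit(CHILD_CODE, [sys.argv[1]], timeout_s=10)
    print("timed out" if result is None else f"exit code {result}")
```

A timeout bounds how long a single crafted file can occupy a worker. It does not remove the underlying flaw.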
Mitigation
The vulnerability is fixed in pypdf version 6.12.0, released on 2026-05-21 [2]. Users should upgrade to this version or later. If upgrading is not immediately possible, a workaround is to apply the changes from pull request #3791, which disallows cross-reference streams with zero-only width values [1][3].
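To check whether an environment already has the fixed release, the installed version can be compared against 6.12.0. This is a minimal sketch using only the standard library. The helper names are hypothetical, and the parser is deliberately simplified: it reads only the leading digits of the first three version components, so a pre-release such as 6.12.0rc1 would be counted as patched.

```python
from importlib.metadata import PackageNotFoundError, version

FIXED_RELEASE = (6, 12, 0)  # first fixed version according to the advisory


def parse_release(version_string):
    """Turn '6.11.2' into (6, 11, 2); non-numeric suffixes are ignored."""
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_patched(version_string):
    return parse_release(version_string) >= FIXED_RELEASE


def installed_pypdf_is_patched():
    """Return True/False for the installed pypdf, or None if it is not installed."""
    try:
        return is_patched(version("pypdf"))
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_pypdf_is_patched())
```

Where the `packaging` library is available, its version parsing would handle pre-release and post-release tags more faithfully than this hand-rolled parser.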
AI Insight generated on May 28, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected products: 2
Patches: 1
507d7c9aa6ea SEC: Disallow cross-reference streams with zero-only width values (#3791)
2 files changed · +37 −4
pypdf/_reader.py +9 −3 modified
@@ -1089,9 +1089,15 @@ def _sanitize_pdf15_xref_stream_index_pairs(
         self, index_pairs: list[int], entry_sizes: list[int], xref_stream: ContentStream
     ) -> list[int]:
         # `entry_sizes` holds the byte widths for the entries. Summing determines the total number of bytes per entry.
-        # We expect up to 3 values, clamping to at least 1 avoids ZeroDivisionError in next step.
-        # `min_entry_bytes` will be the smallest plausible size of one xref entry.
-        min_entry_bytes = max(sum(int(entry_sizes[i]) for i in range(min(len(entry_sizes), 3))), 1)
+        # We expect up to 3 values. `min_entry_bytes` will be the smallest plausible size of one xref entry.
+        min_entry_bytes = sum(int(entry_sizes[i]) for i in range(min(len(entry_sizes), 3)))
+        if min_entry_bytes == 0:
+            message = "Cross-reference stream encodes no entry data."
+            if self.strict:
+                raise PdfStreamError(message)
+            logger_warning(message, source=__name__)
+            return []
+
         # maximum number of entries that could physically fit
         max_entries = len(xref_stream.get_data()) // min_entry_bytes + 1
tests/test_reader.py +28 −1 modified
@@ -2199,7 +2199,7 @@ def test_xref_table_with_comments_before_trailer():


 @pytest.mark.timeout(10)
-def test_read_pdf15_xref_stream__size_limit(caplog):
+def test_read_pdf15_xref_stream__w_0_0_0(caplog):
     pdf = b"%PDF-1.7\n"
     pdf += b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
     pdf += b"2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n"
@@ -2212,6 +2212,33 @@ def test_read_pdf15_xref_stream__size_limit(caplog):
     pdf += encoded + b"\nendstream\nendobj\n"
     pdf += f"startxref\n{startxref}\n%%EOF\n".encode()

+    with pytest.raises(
+        PdfReadError,
+        match=r"^Trailer cannot be read: Cross\-reference stream encodes no entry data\.$"
+    ):
+        _ = PdfReader(BytesIO(pdf), strict=True)
+    assert caplog.messages == []
+
+    _ = PdfReader(BytesIO(pdf), strict=False)
+    assert caplog.messages == [
+        "Cross-reference stream encodes no entry data.",
+    ]
+
+
+@pytest.mark.timeout(10)
+def test_read_pdf15_xref_stream__size_limit(caplog):
+    pdf = b"%PDF-1.7\n"
+    pdf += b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
+    pdf += b"2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n"
+    startxref = len(pdf)
+    encoded = FlateDecode.encode(b"")
+    pdf += (
+        f"3 0 obj\n<< /Type /XRef /Size 50000000 /W [1 0 0] /Root 1 0 R /Filter /FlateDecode /Length {len(encoded)} >>"
+        f"\nstream\n"
+    ).encode()
+    pdf += encoded + b"\nendstream\nendobj\n"
+    pdf += f"startxref\n{startxref}\n%%EOF\n".encode()
+
     with pytest.raises(
         PdfReadError,
         match=r"^Trailer cannot be read: Total XRef entries 50000000 exceed maximum allowed value 1\.$"
References: 3