Medium severity (6.5) · NVD Advisory · Published Apr 22, 2026 · Updated Apr 27, 2026
CVE-2026-41312
Description
pypdf is a free and open-source pure-Python PDF library. In versions prior to 6.10.2, an attacker can craft a PDF that exhausts the available RAM when it is processed. Triggering the issue requires accessing a stream compressed with /FlateDecode that uses a /Predictor other than 1 together with large predictor parameters. The issue is fixed in pypdf 6.10.2. As a workaround, the changes from the patch can be applied manually.
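To make "large predictor parameters" concrete, here is a rough back-of-the-envelope sketch. The row-length formula is the one visible in the patch diff on this page; the memory estimate is an assumption based on the removed lines there (zero padding up to one row, a previous-row tuple, and a per-row list), with an assumed pointer size of 8 bytes on 64-bit CPython. The function names are hypothetical and not part of pypdf.

```python
import math

POINTER_SIZE = 8  # assumption: 64-bit CPython stores one 8-byte pointer per list/tuple slot


def predictor_row_length(columns: int, colors: int, bits_per_component: int) -> int:
    """Bytes per row, including the leading PNG filter byte (formula from the diff)."""
    return math.ceil(columns * colors * bits_per_component / 8) + 1


def estimate_row_overhead(columns: int, colors: int, bits_per_component: int) -> int:
    """Very rough estimate of bytes allocated for a single row before any limit applies."""
    row_length = predictor_row_length(columns, colors, bits_per_component)
    padding = row_length                      # input padded up to roughly one full row
    previous_row = row_length * POINTER_SIZE  # tuple holding the previous row
    current_row = row_length * POINTER_SIZE   # list holding the current row
    return padding + previous_row + current_row


# A tiny stream that declares one billion columns already implies gigabytes of memory.
estimated_bytes = estimate_row_overhead(columns=10**9, colors=1, bits_per_component=8)
print(f"{estimated_bytes / 1e9:.1f} GB")
```

The point of the estimate is that the cost is driven by the declared parameters, not by the size of the compressed data, which is why the fix bounds the parameters themselves.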
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| pypdf (PyPI) | < 6.10.2 | 6.10.2 |
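A quick way to check an installed version against the table is sketched below. It is a minimal helper that assumes plain dotted release numbers such as `6.10.2`; the names `parse_release` and `is_patched` are hypothetical, and pre-release or local version suffixes are not handled (a dedicated version-parsing library would be more robust).

```python
def parse_release(version: str) -> tuple[int, ...]:
    """Parse a plain dotted release such as '6.10.2' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


PATCHED_RELEASE = parse_release("6.10.2")  # first fixed version per the advisory


def is_patched(version: str) -> bool:
    """Return True if the given release is at or above the patched version."""
    return parse_release(version) >= PATCHED_RELEASE


# With pypdf installed, the version string could be obtained via
# importlib.metadata.version("pypdf") and passed to is_patched().
for candidate in ("6.9.0", "6.10.1", "6.10.2", "7.0.0"):
    print(candidate, "patched" if is_patched(candidate) else "affected")
```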
Affected products (1)
Patches (1)
ac734dab4eef · SEC: Introduce limits for FlateDecode parameters and image decoding (#3734)
5 files changed · +136 −52
docs/user/security.md (+17 −14, modified)
@@ -6,28 +6,31 @@ We strive to provide a library with secure defaults.

 ### Filters

-*pypdf* currently employs output size limits for some filters which are known to possibly have large compression ratios.
+*pypdf* currently employs output size limits for some filters which are known to possibly have large compression ratios
+and other related issues.

 The usual limit is at 75 MB of uncompressed data during decompression. If this is too low for your use case, and you are
-aware of the possible side effects, you can modify the following constants which define the desired maximal output size in bytes:
+aware of the possible side effects, you can modify the following constants:

-* `pypdf.filters.ZLIB_MAX_OUTPUT_LENGTH` for the *FlateDecode* filter (zlib compression)
-* `pypdf.filters.LZW_MAX_OUTPUT_LENGTH` for the *LZWDecode* filter (LZW compression)
-* `pypdf.filters.RUN_LENGTH_MAX_OUTPUT_LENGTH` for the *RunLengthDecode* filter (run-length compression)
+* `pypdf.filters.JBIG2_MAX_OUTPUT_LENGTH` for the *JBIG2Decode* filter (JBIG2 images)
+* `pypdf.filters.LZW_MAX_OUTPUT_LENGTH` for the maximum output length of the *LZWDecode* filter (LZW compression)
+* `pypdf.filters.RUN_LENGTH_MAX_OUTPUT_LENGTH` for the maximum output length of the *RunLengthDecode* filter (run-length compression)
+* `pypdf.filters.ZLIB_MAX_OUTPUT_LENGTH` for the maximum output length of the *FlateDecode* filter (zlib compression)
+* `pypdf.filters.ZLIB_MAX_RECOVERY_INPUT_LENGTH` for the number of bytes to attempt the recovery with for the *FlateDecode* filter.
+  It defaults to 5 MB due to the much more complex recovery approach.

-For JBIG2 images, there is a similar parameter to limit the memory usage during decoding: `pypdf.filters.JBIG2_MAX_OUTPUT_LENGTH`
-It defaults to 75 MB as well.
+The following general stream length limits apply, defaulting to 75 MB as well:

-For all streams, the maximum allowed value for the `/Length` field is limited to `pypdf.filters.MAX_DECLARED_STREAM_LENGTH`, which
-defaults to 75 MB as well.
+* `pypdf.filters.MAX_DECLARED_STREAM_LENGTH` for the `/Length` field of streams.
+* `pypdf.filters.MAX_ARRAY_BASED_STREAM_OUTPUT_LENGTH` for the maximum allowed output length of array-based streams.

-For all array-based streams, the maximum allowed output length is limited to `pypdf.filters.MAX_ARRAY_BASED_STREAM_OUTPUT_LENGTH`,
-which defaults to 75 MB as well.
+For the *JBIG2Decode* filter, calling the external *jbig2dec* tool can be disabled by setting `pypdf.filters.JBIG2DEC_BINARY = None`.

-For the *FlateDecode* filter, the number of bytes to attempt recovery with can be set by `pypdf.filters.ZLIB_MAX_RECOVERY_INPUT_LENGTH`.
-It defaults to 5 MB due to the much more complex recovery approach.
+For the *FlateDecode* filter, the following additional limits apply:

-For the *JBIG2Decode* filter, calling the external *jbig2dec* tool can be disabled by setting `pypdf.filters.JBIG2DEC_BINARY = None`.
+* `pypdf.filters.FLATE_MAX_BUFFER_SIZE` for the maximum buffer size to allocate for images, defaulting to 75 MB
+* `pypdf.filters.FLATE_MAX_COLUMNS` for the maximum number of columns, defaulting to 250 000
+* `pypdf.filters.FLATE_MAX_ROW_LENGTH` for the maximum row length, defaulting to 4 MB

 ### Reading
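The documentation change above says these limits are module-level constants that can be modified. One way to raise a limit only for a trusted file, and restore it afterwards, might look like the following sketch. The `temporary_limit` helper is hypothetical and not part of pypdf; a `SimpleNamespace` stands in for `pypdf.filters` so the example runs without pypdf installed.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_limit(module, name, value):
    """Temporarily set ``module.<name>`` to ``value`` and restore the old value afterwards."""
    original = getattr(module, name)
    setattr(module, name, value)
    try:
        yield
    finally:
        setattr(module, name, original)


# Stand-in for ``pypdf.filters``; with pypdf >= 6.10.2 one would pass the real module instead.
fake_filters = SimpleNamespace(FLATE_MAX_COLUMNS=250_000)

with temporary_limit(fake_filters, "FLATE_MAX_COLUMNS", 1_000_000):
    raised_limit = fake_filters.FLATE_MAX_COLUMNS  # limit is raised inside the block

restored_limit = fake_filters.FLATE_MAX_COLUMNS  # original default is back afterwards
```

Scoping the change this way keeps the secure defaults in place for every other document handled by the same process. Note that module globals are shared across threads, so this approach assumes single-threaded use.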
pypdf/filters.py (+51 −32, modified)
@@ -80,6 +80,9 @@
 RUN_LENGTH_MAX_OUTPUT_LENGTH = 75_000_000
 ZLIB_MAX_OUTPUT_LENGTH = 75_000_000
 ZLIB_MAX_RECOVERY_INPUT_LENGTH = 5_000_000
+FLATE_MAX_COLUMNS = 250_000
+FLATE_MAX_ROW_LENGTH = 4_000_000
+FLATE_MAX_BUFFER_SIZE = 75_000_000

 # Reuse cached 1-byte values in the fallback loop to avoid per-byte allocations.
 _SINGLE_BYTES = tuple(bytes((i,)) for i in range(256))
@@ -201,23 +204,27 @@ def decode(
             columns, colors, bits_per_component = FlateDecode._get_parameters(parameters)

             # PNG predictor can vary by row and so is the lead byte on each row
-            rowlength = (
+            row_length = (
                 math.ceil(columns * colors * bits_per_component / 8) + 1
             )  # number of bytes
+            if row_length > FLATE_MAX_ROW_LENGTH:
+                raise LimitReachedError(
+                    f"Row length of {row_length} exceeds defined limit of {FLATE_MAX_ROW_LENGTH}."
+                )

             # TIFF prediction:
             if predictor == 2:
-                rowlength -= 1  # remove the predictor byte
-                bpp = rowlength // columns
+                row_length -= 1  # remove the predictor byte
+                bpp = row_length // columns
                 str_data = bytearray(str_data)
                 for i in range(len(str_data)):
-                    if i % rowlength >= bpp:
+                    if i % row_length >= bpp:
                         str_data[i] = (str_data[i] + str_data[i - bpp]) % 256
                 str_data = bytes(str_data)
             # PNG prediction:
             elif 10 <= predictor <= 15:
                 str_data = FlateDecode._decode_png_prediction(
-                    str_data, columns, rowlength
+                    str_data, columns, row_length
                 )
             else:
                 raise PdfReadError(f"Unsupported flatedecode predictor {predictor!r}")
@@ -233,52 +240,64 @@ def get(key: str, default: int) -> int:
             return _value

         columns = get(key=LZW.COLUMNS, default=1)
+        if columns > FLATE_MAX_COLUMNS:
+            raise LimitReachedError(f"Number of columns {columns} exceeds defined limit of {FLATE_MAX_COLUMNS}.")
+
         colors = get(key=LZW.COLORS, default=1)
+        if colors > 16:
+            raise LimitReachedError(
+                f"Color value {colors} exceeds limit of 16. "
+                f"Please open an issue if this limits valid use cases."
+            )
+
         bits_per_component = get(key=LZW.BITS_PER_COMPONENT, default=8)
+        if bits_per_component > 16:
+            raise PdfReadError(f"More than 16 bits per component are not allowed: {bits_per_component}")
+
         return columns, colors, bits_per_component

     @staticmethod
-    def _decode_png_prediction(data: bytes, columns: int, rowlength: int) -> bytes:
+    def _decode_png_prediction(data: bytes, columns: int, row_length: int) -> bytes:
         # PNG prediction can vary from row to row
-        if (remainder := len(data) % rowlength) != 0:
+        if (remainder := len(data) % row_length) != 0:
             logger_warning("Image data is not rectangular. Adding padding.", __name__)
-            data += b"\x00" * (rowlength - remainder)
-        assert len(data) % rowlength == 0
+            data += b"\x00" * (row_length - remainder)
+        assert len(data) % row_length == 0
         output = []
-        prev_rowdata = (0,) * rowlength
-        bpp = (rowlength - 1) // columns  # recomputed locally to not change params
-        for row in range(0, len(data), rowlength):
-            rowdata: list[int] = list(data[row : row + rowlength])
-            filter_byte = rowdata[0]
+        previous_row_data = (0,) * row_length
+        bpp = (row_length - 1) // columns  # recomputed locally to not change params
+        for row in range(0, len(data), row_length):
+            row_data: list[int] = list(data[row : row + row_length])
+            filter_byte = row_data[0]

             if filter_byte == 0:
                 # PNG None Predictor
                 pass
             elif filter_byte == 1:
                 # PNG Sub Predictor
-                for i in range(bpp + 1, rowlength):
-                    rowdata[i] = (rowdata[i] + rowdata[i - bpp]) % 256
+                for i in range(bpp + 1, row_length):
+                    row_data[i] = (row_data[i] + row_data[i - bpp]) % 256
             elif filter_byte == 2:
                 # PNG Up Predictor
-                for i in range(1, rowlength):
-                    rowdata[i] = (rowdata[i] + prev_rowdata[i]) % 256
+                for i in range(1, row_length):
+                    row_data[i] = (row_data[i] + previous_row_data[i]) % 256
             elif filter_byte == 3:
                 # PNG Average Predictor
                 for i in range(1, bpp + 1):
-                    floor = prev_rowdata[i] // 2
-                    rowdata[i] = (rowdata[i] + floor) % 256
-                for i in range(bpp + 1, rowlength):
-                    left = rowdata[i - bpp]
-                    floor = (left + prev_rowdata[i]) // 2
-                    rowdata[i] = (rowdata[i] + floor) % 256
+                    floor = previous_row_data[i] // 2
+                    row_data[i] = (row_data[i] + floor) % 256
+                for i in range(bpp + 1, row_length):
+                    left = row_data[i - bpp]
+                    floor = (left + previous_row_data[i]) // 2
+                    row_data[i] = (row_data[i] + floor) % 256
             elif filter_byte == 4:
                 # PNG Paeth Predictor
                 for i in range(1, bpp + 1):
-                    rowdata[i] = (rowdata[i] + prev_rowdata[i]) % 256
-                for i in range(bpp + 1, rowlength):
-                    left = rowdata[i - bpp]
-                    up = prev_rowdata[i]
-                    up_left = prev_rowdata[i - bpp]
+                    row_data[i] = (row_data[i] + previous_row_data[i]) % 256
+                for i in range(bpp + 1, row_length):
+                    left = row_data[i - bpp]
+                    up = previous_row_data[i]
+                    up_left = previous_row_data[i - bpp]

                     p = left + up - up_left
                     dist_left = abs(p - left)
@@ -292,13 +311,13 @@ def _decode_png_prediction(data: bytes, columns: int, rowlength: int) -> bytes:
                     else:
                         paeth = up_left

-                    rowdata[i] = (rowdata[i] + paeth) % 256
+                    row_data[i] = (row_data[i] + paeth) % 256
             else:
                 raise PdfReadError(
                     f"Unsupported PNG filter {filter_byte!r}"
                 )  # pragma: no cover
-            prev_rowdata = tuple(rowdata)
-            output.extend(rowdata[1:])
+            previous_row_data = tuple(row_data)
+            output.extend(row_data[1:])
         return bytes(output)

     @staticmethod
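The parameter checks introduced in this file can be summarized in a standalone sketch. This is not pypdf's code: the function name `validate_predictor_parameters` and the `LimitExceeded` exception are hypothetical stand-ins, and the sketch raises one exception type for all cases, whereas the patch uses `PdfReadError` for the bits-per-component check. The numeric limits are the defaults shown in the diff.

```python
import math

# Default limits as introduced by the patch.
FLATE_MAX_COLUMNS = 250_000
FLATE_MAX_ROW_LENGTH = 4_000_000
MAX_COLORS = 16
MAX_BITS_PER_COMPONENT = 16


class LimitExceeded(ValueError):
    """Hypothetical stand-in for pypdf's LimitReachedError."""


def validate_predictor_parameters(columns: int, colors: int, bits_per_component: int) -> int:
    """Reject oversized predictor parameters and return the resulting row length in bytes."""
    if columns > FLATE_MAX_COLUMNS:
        raise LimitExceeded(f"Number of columns {columns} exceeds limit of {FLATE_MAX_COLUMNS}.")
    if colors > MAX_COLORS:
        raise LimitExceeded(f"Color value {colors} exceeds limit of {MAX_COLORS}.")
    if bits_per_component > MAX_BITS_PER_COMPONENT:
        raise LimitExceeded(f"Bits per component {bits_per_component} exceeds limit of {MAX_BITS_PER_COMPONENT}.")

    # Each individual value can be within bounds while their product is still too large,
    # so the combined row length (plus the PNG filter byte) is checked separately.
    row_length = math.ceil(columns * colors * bits_per_component / 8) + 1
    if row_length > FLATE_MAX_ROW_LENGTH:
        raise LimitExceeded(f"Row length of {row_length} exceeds limit of {FLATE_MAX_ROW_LENGTH}.")
    return row_length


row_length = validate_predictor_parameters(columns=200_000, colors=8, bits_per_component=16)
```

The separate row-length check matters because 130 000 columns, 16 colors and 16 bits per component each pass their own limit yet combine to a row of 4 160 001 bytes, which is the case exercised by the test suite in this patch.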
pypdf/generic/_image_xobject.py (+8 −2, modified)
@@ -8,7 +8,7 @@
 from ..constants import ColorSpaces, StreamAttributes
 from ..constants import FilterTypes as FT
 from ..constants import ImageAttributes as IA
-from ..errors import EmptyImageDataError, PdfReadError
+from ..errors import EmptyImageDataError, LimitReachedError, PdfReadError
 from ..generic import (
     ArrayObject,
     DecodedStreamObject,
@@ -122,8 +122,14 @@ def _get_image_mode(


 def bits2byte(data: bytes, size: tuple[int, int], bits: int) -> bytes:
+    from pypdf.filters import FLATE_MAX_BUFFER_SIZE  # noqa: PLC0415
+
+    buffer_size = size[0] * size[1]
+    if buffer_size > FLATE_MAX_BUFFER_SIZE:
+        raise LimitReachedError(f"Requested buffer size {buffer_size} exceeds limit of {FLATE_MAX_BUFFER_SIZE}.")
+
+    byte_buffer = bytearray(buffer_size)
     mask = (1 << bits) - 1
-    byte_buffer = bytearray(size[0] * size[1])
     data_index = 0
     bit = 8 - bits
     for y in range(size[1]):
tests/generic/test_image_xobject.py (+10 −2, modified)
@@ -7,9 +7,9 @@
 from pypdf import PdfReader
 from pypdf._utils import Version
 from pypdf.constants import FilterTypes, ImageAttributes, StreamAttributes
-from pypdf.errors import EmptyImageDataError, PdfReadError
+from pypdf.errors import EmptyImageDataError, LimitReachedError, PdfReadError
 from pypdf.generic import ArrayObject, DecodedStreamObject, NameObject, NumberObject, StreamObject, TextStringObject
-from pypdf.generic._image_xobject import _extended_image_from_bytes, _handle_flate, _xobj_to_image
+from pypdf.generic._image_xobject import _extended_image_from_bytes, _handle_flate, _xobj_to_image, bits2byte

 from .. import RESOURCE_ROOT, get_data_from_url
 from ..utils import get_image_data
@@ -267,3 +267,11 @@ def test_handle_jpx__explicit_decode() -> None:
         for x in range(16):
             assert result.getpixel((x, y)) == (255 * (x != y), 255, 255, 255), (x, y)
             assert image.getpixel((x, y)) == (255 * (x == y), 0, 0, 0), (x, y)
+
+
+def test_bits2byte__limit() -> None:
+    with pytest.raises(
+        expected_exception=LimitReachedError,
+        match=r"^Requested buffer size 76500000 exceeds limit of 75000000\.$"
+    ):
+        bits2byte(data=b"TEST", size=(9000, 8500), bits=8)
tests/test_filters.py (+50 −2, modified)
@@ -4,6 +4,7 @@
 import subprocess
 import sys
 import zlib
+from copy import deepcopy
 from io import BytesIO
 from itertools import product as cartesian_product
 from pathlib import Path
@@ -1017,15 +1018,14 @@ def test_deprecate_inline_image_filters():


 def test_flatedecode__columns_is_zero():
-    codec = FlateDecode()
     data = b"Hello World!"
     parameters = DictionaryObject({
         NameObject("/Predictor"): NumberObject(13),
         NameObject("/Columns"): NumberObject(0)
     })

     with pytest.raises(expected_exception=PdfReadError, match=r"^Expected positive number for /Columns, got 0!$"):
-        codec.decode(codec.encode(data), parameters)
+        FlateDecode.decode(FlateDecode.encode(data), parameters)


 def test_runlengthdecode__decode_limit():
@@ -1049,3 +1049,51 @@ def test_runlengthdecode__decode_limit():
 def test_asciihexdecode__speed():
     encoded = (b"41" * 1_200_000) + b">"
     ASCIIHexDecode.decode(encoded)
+
+
+def test_flatedecode__upper_limits():
+    data = b"Hello World!"
+    default_parameters = DictionaryObject({
+        NameObject("/Predictor"): NumberObject(13),
+        NameObject("/Columns"): NumberObject(200_000),
+        NameObject("/Colors"): NumberObject(8),
+        NameObject("/BitsPerComponent"): NumberObject(16),
+    })
+    encoded = FlateDecode.encode(data)
+
+    # Colors
+    parameters = deepcopy(default_parameters)
+    parameters[NameObject("/Colors")] = NumberObject(128)
+    with pytest.raises(
+        expected_exception=LimitReachedError,
+        match=r"^Color value 128 exceeds limit of 16\. Please open an issue if this limits valid use cases\.$"
+    ):
+        FlateDecode.decode(data=encoded, decode_parms=parameters)
+
+    # BitsPerComponent
+    parameters = deepcopy(default_parameters)
+    parameters[NameObject("/BitsPerComponent")] = NumberObject(32)
+    with pytest.raises(
+        expected_exception=PdfReadError,
+        match=r"^More than 16 bits per component are not allowed: 32$"
+    ):
+        FlateDecode.decode(data=encoded, decode_parms=parameters)
+
+    # Columns
+    parameters = deepcopy(default_parameters)
+    parameters[NameObject("/Columns")] = NumberObject(300_000)
+    with pytest.raises(
+        expected_exception=LimitReachedError,
+        match=r"^Number of columns 300000 exceeds defined limit of 250000\.$"
+    ):
+        FlateDecode.decode(data=encoded, decode_parms=parameters)
+
+    # Row length
+    parameters = deepcopy(default_parameters)
+    parameters[NameObject("/Columns")] = NumberObject(130_000)
+    parameters[NameObject("/Colors")] = NumberObject(16)
+    with pytest.raises(
+        expected_exception=LimitReachedError,
+        match=r"^Row length of 4160001 exceeds defined limit of 4000000\.$"
+    ):
+        FlateDecode.decode(data=encoded, decode_parms=parameters)
References
- github.com/py-pdf/pypdf/commit/ac734dab4eef92bcce50d503949b4d9887d89f11 (NVD; Patch; WEB)
- github.com/py-pdf/pypdf/pull/3734 (NVD; Issue Tracking, Patch; WEB)
- github.com/py-pdf/pypdf/security/advisories/GHSA-7gw9-cf7v-778f (NVD; Patch, Vendor Advisory; WEB)
- github.com/advisories/GHSA-7gw9-cf7v-778f (GHSA; ADVISORY)
- nvd.nist.gov/vuln/detail/CVE-2026-41312 (GHSA; ADVISORY)
- github.com/py-pdf/pypdf/releases/tag/6.10.2 (NVD; Release Notes; WEB)