VYPR
Medium severity · 6.5 · NVD Advisory · Published Apr 22, 2026 · Updated Apr 27, 2026

CVE-2026-41312

Description

pypdf is a free and open-source pure-Python PDF library. In versions prior to 6.10.2, an attacker can craft a PDF that exhausts RAM when it is processed. Triggering the issue requires accessing a stream compressed with /FlateDecode that has a /Predictor value other than 1 and large predictor parameters. The issue is fixed in pypdf 6.10.2. As a workaround, the changes from the patch can be applied manually.
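
To illustrate why large predictor parameters matter, the following minimal sketch reproduces the row-length formula that appears in the patch further down and estimates the size of a helper tuple holding one slot per row byte, as the PNG prediction code builds. The function names, the attacker-chosen values, and the figure of 8 bytes per tuple slot (pointer size on 64-bit CPython) are assumptions for illustration only; real memory use would be higher, since the row data is also padded and copied into a list.

```python
import math

POINTER_SIZE = 8  # assumption: 64-bit CPython, one pointer per tuple slot


def row_length(columns: int, colors: int, bits_per_component: int) -> int:
    """Row length in bytes, including the leading PNG predictor byte."""
    return math.ceil(columns * colors * bits_per_component / 8) + 1


def estimated_row_tuple_bytes(columns: int, colors: int, bits_per_component: int) -> int:
    """Rough lower bound for a tuple with one slot per row byte."""
    return row_length(columns, colors, bits_per_component) * POINTER_SIZE


# Values taken from the test added by the patch: just above the 4 MB row limit.
print(row_length(130_000, 16, 16))  # 4160001

# Hypothetical attacker-chosen parameters with no upper bound applied.
print(estimated_row_tuple_bytes(2**31 - 1, 255, 16))
```

The second call yields a figure in the terabyte range, which suggests how a few small integers in a PDF dictionary could translate into an allocation far beyond the size of the file itself.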

Affected packages

Versions sourced from the GitHub Security Advisory.

Package | Affected versions | Patched versions
pypdf (PyPI) | < 6.10.2 | 6.10.2
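
A quick way to compare a release string against the range above is sketched below. The helper names `parse_version` and `is_affected` are hypothetical, and the parser only handles plain dotted releases such as `6.10.1`; pre-release or local version suffixes would need a full version parser.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a plain dotted release string such as '6.10.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


PATCHED = parse_version("6.10.2")


def is_affected(version: str) -> bool:
    """True if the given pypdf release predates the patched version."""
    return parse_version(version) < PATCHED


print(is_affected("6.10.1"))  # True
print(is_affected("6.10.2"))  # False
print(is_affected("6.9.0"))   # True
```

In an application, the installed version string could be obtained with `importlib.metadata.version("pypdf")` and passed to `is_affected`.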

Affected products

1

Patches

1
ac734dab4eef

SEC: Introduce limits for FlateDecode parameters and image decoding (#3734)

https://github.com/py-pdf/pypdf · Stefan · Apr 15, 2026 · via ghsa
5 files changed · +136 -52
  • docs/user/security.md +17 -14 modified
    @@ -6,28 +6,31 @@ We strive to provide a library with secure defaults.
     
     ### Filters
     
    -*pypdf* currently employs output size limits for some filters which are known to possibly have large compression ratios.
    +*pypdf* currently employs output size limits for some filters which are known to possibly have large compression ratios
    +and other related issues.
     
     The usual limit is at 75 MB of uncompressed data during decompression. If this is too low for your use case, and you are
    -aware of the possible side effects, you can modify the following constants which define the desired maximal output size in bytes:
    +aware of the possible side effects, you can modify the following constants:
     
    -* `pypdf.filters.ZLIB_MAX_OUTPUT_LENGTH` for the *FlateDecode* filter (zlib compression)
    -* `pypdf.filters.LZW_MAX_OUTPUT_LENGTH` for the *LZWDecode* filter (LZW compression)
    -* `pypdf.filters.RUN_LENGTH_MAX_OUTPUT_LENGTH` for the *RunLengthDecode* filter (run-length compression)
    +* `pypdf.filters.JBIG2_MAX_OUTPUT_LENGTH` for the *JBIG2Decode* filter (JBIG2 images)
    +* `pypdf.filters.LZW_MAX_OUTPUT_LENGTH` for the maximum output length of the *LZWDecode* filter (LZW compression)
    +* `pypdf.filters.RUN_LENGTH_MAX_OUTPUT_LENGTH` for the maximum output length of the *RunLengthDecode* filter (run-length compression)
    +* `pypdf.filters.ZLIB_MAX_OUTPUT_LENGTH` for the maximum output length of the *FlateDecode* filter (zlib compression)
    +* `pypdf.filters.ZLIB_MAX_RECOVERY_INPUT_LENGTH` for the number of bytes to attempt the recovery with for the *FlateDecode* filter.
    +  It defaults to 5 MB due to the much more complex recovery approach.
     
    -For JBIG2 images, there is a similar parameter to limit the memory usage during decoding: `pypdf.filters.JBIG2_MAX_OUTPUT_LENGTH`
    -It defaults to 75 MB as well.
    +The following general stream length limits apply, defaulting to 75 MB as well:
     
    -For all streams, the maximum allowed value for the `/Length` field is limited to `pypdf.filters.MAX_DECLARED_STREAM_LENGTH`, which
    -defaults to 75 MB as well.
    +* `pypdf.filters.MAX_DECLARED_STREAM_LENGTH` for the `/Length` field of streams.
    +* `pypdf.filters.MAX_ARRAY_BASED_STREAM_OUTPUT_LENGTH` for the maximum allowed output length of array-based streams.
     
    -For all array-based streams, the maximum allowed output length is limited to `pypdf.filters.MAX_ARRAY_BASED_STREAM_OUTPUT_LENGTH`,
    -which defaults to 75 MB as well.
    +For the *JBIG2Decode* filter, calling the external *jbig2dec* tool can be disabled by setting `pypdf.filters.JBIG2DEC_BINARY = None`.
     
    -For the *FlateDecode* filter, the number of bytes to attempt recovery with can be set by `pypdf.filters.ZLIB_MAX_RECOVERY_INPUT_LENGTH`.
    -It defaults to 5 MB due to the much more complex recovery approach.
    +For the *FlateDecode* filter, the following additional limits apply:
     
    -For the *JBIG2Decode* filter, calling the external *jbig2dec* tool can be disabled by setting `pypdf.filters.JBIG2DEC_BINARY = None`.
    +* `pypdf.filters.FLATE_MAX_BUFFER_SIZE` for the maximum buffer size to allocate for images, defaulting to 75 MB
    +* `pypdf.filters.FLATE_MAX_COLUMNS` for the maximum number of columns, defaulting to 250 000
    +* `pypdf.filters.FLATE_MAX_ROW_LENGTH` for the maximum row length, defaulting to 4 MB
     
     ### Reading
     
    
  • pypdf/filters.py +51 -32 modified
    @@ -80,6 +80,9 @@
     RUN_LENGTH_MAX_OUTPUT_LENGTH = 75_000_000
     ZLIB_MAX_OUTPUT_LENGTH = 75_000_000
     ZLIB_MAX_RECOVERY_INPUT_LENGTH = 5_000_000
    +FLATE_MAX_COLUMNS = 250_000
    +FLATE_MAX_ROW_LENGTH = 4_000_000
    +FLATE_MAX_BUFFER_SIZE = 75_000_000
     
     # Reuse cached 1-byte values in the fallback loop to avoid per-byte allocations.
     _SINGLE_BYTES = tuple(bytes((i,)) for i in range(256))
    @@ -201,23 +204,27 @@ def decode(
                 columns, colors, bits_per_component = FlateDecode._get_parameters(parameters)
     
                 # PNG predictor can vary by row and so is the lead byte on each row
    -            rowlength = (
    +            row_length = (
                     math.ceil(columns * colors * bits_per_component / 8) + 1
                 )  # number of bytes
    +            if row_length > FLATE_MAX_ROW_LENGTH:
    +                raise LimitReachedError(
    +                    f"Row length of {row_length} exceeds defined limit of {FLATE_MAX_ROW_LENGTH}."
    +                )
     
                 # TIFF prediction:
                 if predictor == 2:
    -                rowlength -= 1  # remove the predictor byte
    -                bpp = rowlength // columns
    +                row_length -= 1  # remove the predictor byte
    +                bpp = row_length // columns
                     str_data = bytearray(str_data)
                     for i in range(len(str_data)):
    -                    if i % rowlength >= bpp:
    +                    if i % row_length >= bpp:
                             str_data[i] = (str_data[i] + str_data[i - bpp]) % 256
                     str_data = bytes(str_data)
                 # PNG prediction:
                 elif 10 <= predictor <= 15:
                     str_data = FlateDecode._decode_png_prediction(
    -                    str_data, columns, rowlength
    +                    str_data, columns, row_length
                     )
                 else:
                     raise PdfReadError(f"Unsupported flatedecode predictor {predictor!r}")
    @@ -233,52 +240,64 @@ def get(key: str, default: int) -> int:
                 return _value
     
             columns = get(key=LZW.COLUMNS, default=1)
    +        if columns > FLATE_MAX_COLUMNS:
    +            raise LimitReachedError(f"Number of columns {columns} exceeds defined limit of {FLATE_MAX_COLUMNS}.")
    +
             colors = get(key=LZW.COLORS, default=1)
    +        if colors > 16:
    +            raise LimitReachedError(
    +                f"Color value {colors} exceeds limit of 16. "
    +                f"Please open an issue if this limits valid use cases."
    +            )
    +
             bits_per_component = get(key=LZW.BITS_PER_COMPONENT, default=8)
    +        if bits_per_component > 16:
    +            raise PdfReadError(f"More than 16 bits per component are not allowed: {bits_per_component}")
    +
             return columns, colors, bits_per_component
     
         @staticmethod
    -    def _decode_png_prediction(data: bytes, columns: int, rowlength: int) -> bytes:
    +    def _decode_png_prediction(data: bytes, columns: int, row_length: int) -> bytes:
             # PNG prediction can vary from row to row
    -        if (remainder := len(data) % rowlength) != 0:
    +        if (remainder := len(data) % row_length) != 0:
                 logger_warning("Image data is not rectangular. Adding padding.", __name__)
    -            data += b"\x00" * (rowlength - remainder)
    -            assert len(data) % rowlength == 0
    +            data += b"\x00" * (row_length - remainder)
    +            assert len(data) % row_length == 0
             output = []
    -        prev_rowdata = (0,) * rowlength
    -        bpp = (rowlength - 1) // columns  # recomputed locally to not change params
    -        for row in range(0, len(data), rowlength):
    -            rowdata: list[int] = list(data[row : row + rowlength])
    -            filter_byte = rowdata[0]
    +        previous_row_data = (0,) * row_length
    +        bpp = (row_length - 1) // columns  # recomputed locally to not change params
    +        for row in range(0, len(data), row_length):
    +            row_data: list[int] = list(data[row : row + row_length])
    +            filter_byte = row_data[0]
     
                 if filter_byte == 0:
                     # PNG None Predictor
                     pass
                 elif filter_byte == 1:
                     # PNG Sub Predictor
    -                for i in range(bpp + 1, rowlength):
    -                    rowdata[i] = (rowdata[i] + rowdata[i - bpp]) % 256
    +                for i in range(bpp + 1, row_length):
    +                    row_data[i] = (row_data[i] + row_data[i - bpp]) % 256
                 elif filter_byte == 2:
                     # PNG Up Predictor
    -                for i in range(1, rowlength):
    -                    rowdata[i] = (rowdata[i] + prev_rowdata[i]) % 256
    +                for i in range(1, row_length):
    +                    row_data[i] = (row_data[i] + previous_row_data[i]) % 256
                 elif filter_byte == 3:
                     # PNG Average Predictor
                     for i in range(1, bpp + 1):
    -                    floor = prev_rowdata[i] // 2
    -                    rowdata[i] = (rowdata[i] + floor) % 256
    -                for i in range(bpp + 1, rowlength):
    -                    left = rowdata[i - bpp]
    -                    floor = (left + prev_rowdata[i]) // 2
    -                    rowdata[i] = (rowdata[i] + floor) % 256
    +                    floor = previous_row_data[i] // 2
    +                    row_data[i] = (row_data[i] + floor) % 256
    +                for i in range(bpp + 1, row_length):
    +                    left = row_data[i - bpp]
    +                    floor = (left + previous_row_data[i]) // 2
    +                    row_data[i] = (row_data[i] + floor) % 256
                 elif filter_byte == 4:
                     # PNG Paeth Predictor
                     for i in range(1, bpp + 1):
    -                    rowdata[i] = (rowdata[i] + prev_rowdata[i]) % 256
    -                for i in range(bpp + 1, rowlength):
    -                    left = rowdata[i - bpp]
    -                    up = prev_rowdata[i]
    -                    up_left = prev_rowdata[i - bpp]
    +                    row_data[i] = (row_data[i] + previous_row_data[i]) % 256
    +                for i in range(bpp + 1, row_length):
    +                    left = row_data[i - bpp]
    +                    up = previous_row_data[i]
    +                    up_left = previous_row_data[i - bpp]
     
                         p = left + up - up_left
                         dist_left = abs(p - left)
    @@ -292,13 +311,13 @@ def _decode_png_prediction(data: bytes, columns: int, rowlength: int) -> bytes:
                         else:
                             paeth = up_left
     
    -                    rowdata[i] = (rowdata[i] + paeth) % 256
    +                    row_data[i] = (row_data[i] + paeth) % 256
                 else:
                     raise PdfReadError(
                         f"Unsupported PNG filter {filter_byte!r}"
                     )  # pragma: no cover
    -            prev_rowdata = tuple(rowdata)
    -            output.extend(rowdata[1:])
    +            previous_row_data = tuple(row_data)
    +            output.extend(row_data[1:])
             return bytes(output)
     
         @staticmethod
    
  • pypdf/generic/_image_xobject.py +8 -2 modified
    @@ -8,7 +8,7 @@
     from ..constants import ColorSpaces, StreamAttributes
     from ..constants import FilterTypes as FT
     from ..constants import ImageAttributes as IA
    -from ..errors import EmptyImageDataError, PdfReadError
    +from ..errors import EmptyImageDataError, LimitReachedError, PdfReadError
     from ..generic import (
         ArrayObject,
         DecodedStreamObject,
    @@ -122,8 +122,14 @@ def _get_image_mode(
     
     
     def bits2byte(data: bytes, size: tuple[int, int], bits: int) -> bytes:
    +    from pypdf.filters import FLATE_MAX_BUFFER_SIZE  # noqa: PLC0415
    +
    +    buffer_size = size[0] * size[1]
    +    if buffer_size > FLATE_MAX_BUFFER_SIZE:
    +        raise LimitReachedError(f"Requested buffer size {buffer_size} exceeds limit of {FLATE_MAX_BUFFER_SIZE}.")
    +
    +    byte_buffer = bytearray(buffer_size)
         mask = (1 << bits) - 1
    -    byte_buffer = bytearray(size[0] * size[1])
         data_index = 0
         bit = 8 - bits
         for y in range(size[1]):
    
  • tests/generic/test_image_xobject.py +10 -2 modified
    @@ -7,9 +7,9 @@
     from pypdf import PdfReader
     from pypdf._utils import Version
     from pypdf.constants import FilterTypes, ImageAttributes, StreamAttributes
    -from pypdf.errors import EmptyImageDataError, PdfReadError
    +from pypdf.errors import EmptyImageDataError, LimitReachedError, PdfReadError
     from pypdf.generic import ArrayObject, DecodedStreamObject, NameObject, NumberObject, StreamObject, TextStringObject
    -from pypdf.generic._image_xobject import _extended_image_from_bytes, _handle_flate, _xobj_to_image
    +from pypdf.generic._image_xobject import _extended_image_from_bytes, _handle_flate, _xobj_to_image, bits2byte
     
     from .. import RESOURCE_ROOT, get_data_from_url
     from ..utils import get_image_data
    @@ -267,3 +267,11 @@ def test_handle_jpx__explicit_decode() -> None:
             for x in range(16):
                 assert result.getpixel((x, y)) == (255 * (x != y), 255, 255, 255), (x, y)
                 assert image.getpixel((x, y)) == (255 * (x == y), 0, 0, 0), (x, y)
    +
    +
    +def test_bits2byte__limit() -> None:
    +    with pytest.raises(
    +            expected_exception=LimitReachedError,
    +            match=r"^Requested buffer size 76500000 exceeds limit of 75000000\.$"
    +    ):
    +        bits2byte(data=b"TEST", size=(9000, 8500), bits=8)
    
  • tests/test_filters.py +50 -2 modified
    @@ -4,6 +4,7 @@
     import subprocess
     import sys
     import zlib
    +from copy import deepcopy
     from io import BytesIO
     from itertools import product as cartesian_product
     from pathlib import Path
    @@ -1017,15 +1018,14 @@ def test_deprecate_inline_image_filters():
     
     
     def test_flatedecode__columns_is_zero():
    -    codec = FlateDecode()
         data = b"Hello World!"
         parameters = DictionaryObject({
             NameObject("/Predictor"): NumberObject(13),
             NameObject("/Columns"): NumberObject(0)
         })
     
         with pytest.raises(expected_exception=PdfReadError, match=r"^Expected positive number for /Columns, got 0!$"):
    -        codec.decode(codec.encode(data), parameters)
    +        FlateDecode.decode(FlateDecode.encode(data), parameters)
     
     
     def test_runlengthdecode__decode_limit():
    @@ -1049,3 +1049,51 @@ def test_runlengthdecode__decode_limit():
     def test_asciihexdecode__speed():
         encoded = (b"41" * 1_200_000) + b">"
         ASCIIHexDecode.decode(encoded)
    +
    +
    +def test_flatedecode__upper_limits():
    +    data = b"Hello World!"
    +    default_parameters = DictionaryObject({
    +        NameObject("/Predictor"): NumberObject(13),
    +        NameObject("/Columns"): NumberObject(200_000),
    +        NameObject("/Colors"): NumberObject(8),
    +        NameObject("/BitsPerComponent"): NumberObject(16),
    +    })
    +    encoded = FlateDecode.encode(data)
    +
    +    # Colors
    +    parameters = deepcopy(default_parameters)
    +    parameters[NameObject("/Colors")] = NumberObject(128)
    +    with pytest.raises(
    +            expected_exception=LimitReachedError,
    +            match=r"^Color value 128 exceeds limit of 16\. Please open an issue if this limits valid use cases\.$"
    +    ):
    +        FlateDecode.decode(data=encoded, decode_parms=parameters)
    +
    +    # BitsPerComponent
    +    parameters = deepcopy(default_parameters)
    +    parameters[NameObject("/BitsPerComponent")] = NumberObject(32)
    +    with pytest.raises(
    +            expected_exception=PdfReadError,
    +            match=r"^More than 16 bits per component are not allowed: 32$"
    +    ):
    +        FlateDecode.decode(data=encoded, decode_parms=parameters)
    +
    +    # Columns
    +    parameters = deepcopy(default_parameters)
    +    parameters[NameObject("/Columns")] = NumberObject(300_000)
    +    with pytest.raises(
    +            expected_exception=LimitReachedError,
    +            match=r"^Number of columns 300000 exceeds defined limit of 250000\.$"
    +    ):
    +        FlateDecode.decode(data=encoded, decode_parms=parameters)
    +
    +    # Row length
    +    parameters = deepcopy(default_parameters)
    +    parameters[NameObject("/Columns")] = NumberObject(130_000)
    +    parameters[NameObject("/Colors")] = NumberObject(16)
    +    with pytest.raises(
    +            expected_exception=LimitReachedError,
    +            match=r"^Row length of 4160001 exceeds defined limit of 4000000\.$"
    +    ):
    +        FlateDecode.decode(data=encoded, decode_parms=parameters)
    

Vulnerability mechanics
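
The checks introduced by the fix commit can be summarized in a standalone sketch that does not import pypdf. The limits and the order of the checks are copied from the diff above; the names `check_flate_parameters`, `check_buffer_size`, and `LimitExceeded` are hypothetical stand-ins (the patch itself raises `LimitReachedError` or `PdfReadError`), and the existing check for non-positive `/Columns` values is not reproduced here.

```python
import math

# Defaults copied from the patch; in pypdf they live in pypdf.filters.
FLATE_MAX_COLUMNS = 250_000
FLATE_MAX_ROW_LENGTH = 4_000_000
FLATE_MAX_BUFFER_SIZE = 75_000_000
MAX_COLORS = 16              # hard-coded in the patch
MAX_BITS_PER_COMPONENT = 16  # hard-coded in the patch


class LimitExceeded(ValueError):
    """Hypothetical stand-in for the exceptions raised by pypdf."""


def check_flate_parameters(columns: int, colors: int, bits_per_component: int) -> int:
    """Validate predictor parameters and return the row length in bytes."""
    if columns > FLATE_MAX_COLUMNS:
        raise LimitExceeded(f"columns {columns} exceeds {FLATE_MAX_COLUMNS}")
    if colors > MAX_COLORS:
        raise LimitExceeded(f"colors {colors} exceeds {MAX_COLORS}")
    if bits_per_component > MAX_BITS_PER_COMPONENT:
        raise LimitExceeded(f"bits per component {bits_per_component} exceeds {MAX_BITS_PER_COMPONENT}")
    # One extra byte per row for the PNG predictor tag.
    row_length = math.ceil(columns * colors * bits_per_component / 8) + 1
    if row_length > FLATE_MAX_ROW_LENGTH:
        raise LimitExceeded(f"row length {row_length} exceeds {FLATE_MAX_ROW_LENGTH}")
    return row_length


def check_buffer_size(width: int, height: int) -> int:
    """Validate the image buffer size (width * height) before allocating it."""
    buffer_size = width * height
    if buffer_size > FLATE_MAX_BUFFER_SIZE:
        raise LimitExceeded(f"buffer size {buffer_size} exceeds {FLATE_MAX_BUFFER_SIZE}")
    return buffer_size


# Baseline parameters from the patch's test pass all four checks.
print(check_flate_parameters(200_000, 8, 16))  # 3200001
```

Each individual parameter can be within its own limit while the combined row length is not, as in the test case with 130 000 columns, 16 colors, and 16 bits per component; this seems to be why the patch checks the row length separately.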
