High severity (7.5) · GHSA Advisory · Published Oct 31, 2025 · Updated Apr 15, 2026

CVE-2025-6176

Description

Scrapy versions up to 2.13.2 are vulnerable to a denial-of-service (DoS) attack due to a flaw in its brotli decompression implementation. The protection mechanism against decompression bombs fails to mitigate the brotli variant, so a remote server can crash any client that has less than 80 GB of available memory. Brotli achieves extremely high compression ratios on zero-filled data, which means a small response can consume an excessive amount of memory during decompression.
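The general defence against a decompression bomb is to cap the output of each decompression call and check the running total before asking for more. The sketch below illustrates that idea with the standard library's zlib rather than brotli, since zlib needs no extra install; `bounded_inflate`, `MaxSizeExceeded`, and the `CHUNK_SIZE` value are hypothetical names and values chosen for illustration, not Scrapy's own.

```python
import zlib

CHUNK_SIZE = 64 * 1024  # illustrative value; an assumption, not Scrapy's setting


class MaxSizeExceeded(ValueError):
    """Hypothetical exception: decompressed output passed the limit."""


def bounded_inflate(data: bytes, max_size: int) -> bytes:
    """Decompress zlib data, stopping soon after max_size bytes of output."""
    decompressor = zlib.decompressobj()
    chunks = []
    total = 0
    pending = data
    while pending:
        # The second argument (max_length) caps the output of this call;
        # input that was not consumed is kept in unconsumed_tail.
        chunk = decompressor.decompress(pending, CHUNK_SIZE)
        total += len(chunk)
        if total > max_size:
            raise MaxSizeExceeded(f"{total} B decompressed, limit is {max_size} B")
        chunks.append(chunk)
        pending = decompressor.unconsumed_tail
    # Collect whatever output is still buffered inside the decompressor.
    tail = decompressor.flush()
    total += len(tail)
    if total > max_size:
        raise MaxSizeExceeded(f"{total} B decompressed, limit is {max_size} B")
    chunks.append(tail)
    return b"".join(chunks)


bomb = zlib.compress(b"\0" * 10_000_000)  # zero-filled data compresses very well
try:
    bounded_inflate(bomb, max_size=1_000_000)
except MaxSizeExceeded as exc:
    print(f"{len(bomb)} B of compressed input rejected: {exc}")
```

Because each call produces at most `CHUNK_SIZE` bytes, the limit check fires before the whole payload is materialised in memory. A decompressor API without such a per-call output cap cannot offer this guarantee, no matter how small the input chunks are.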

Affected packages

Versions sourced from the GitHub Security Advisory.

Package	Affected versions	Patched versions
brotli (PyPI)	< 1.2.0	1.2.0
Scrapy (PyPI)	< 2.13.4	2.13.4
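One way to check whether an environment has a brotli build with output-limited decompression is to probe for the `can_accept_more_data` method on `brotli.Decompressor`, which is the same attribute the Scrapy patch tests for. The helper below is a minimal sketch; the function name is hypothetical, and it reports `False` both when brotli is missing and when an older build is installed.

```python
def brotli_supports_bounded_output() -> bool:
    """Return True if the installed brotli exposes can_accept_more_data."""
    try:
        import brotli
    except ImportError:
        # No package importable under the name "brotli".
        return False
    # Older builds lack this method on the Decompressor class.
    return hasattr(brotli.Decompressor, "can_accept_more_data")


if brotli_supports_bounded_output():
    print("brotli with bounded decompression is available")
else:
    print("brotli is missing or too old; upgrade to brotli >= 1.2.0")
```

Probing for the attribute rather than parsing a version string keeps the check tied to the capability that matters.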

Affected products

Google/Brotli (GHSA)
Range: <= 2.13.3

Patches

14737e91edc5

Mitigate brotli and deflate decompression bombs DoS (#7134)

https://github.com/scrapy/scrapy · Rui Xi · Nov 17, 2025 · via ghsa

commit

9 files changed · +148 −144

conftest.py · +10 −0 · modified

@@ -117,6 +117,16 @@ def requires_boto3(request):
         pytest.skip("boto3 is not installed")
 
 
+@pytest.fixture(autouse=True)
+def requires_mitmproxy(request):
+    if not request.node.get_closest_marker("requires_mitmproxy"):
+        return
+    try:
+        import mitmproxy  # noqa: F401, PLC0415
+    except ImportError:
+        pytest.skip("mitmproxy is not installed")
+
+
 def pytest_configure(config):
     if config.getoption("--reactor") == "asyncio":
         # Needed on Windows to switch from proactor to selector for Twisted reactor compatibility.

.github/workflows/tests-ubuntu.yml · +3 −0 · modified

@@ -62,6 +62,9 @@ jobs:
         - python-version: "3.13"
           env:
             TOXENV: botocore
+        - python-version: "3.13"
+          env:
+            TOXENV: mitmproxy
 
     steps:
     - uses: actions/checkout@v5

pyproject.toml · +1 −0 · modified

@@ -232,6 +232,7 @@ markers = [
     "requires_uvloop: marks tests as only enabled when uvloop is known to be working",
     "requires_botocore: marks tests that need botocore (but not boto3)",
     "requires_boto3: marks tests that need botocore and boto3",
+    "requires_mitmproxy: marks tests that need mitmproxy",
 ]
 filterwarnings = [
     "ignore::DeprecationWarning:twisted.web.static"

scrapy/downloadermiddlewares/httpcompression.py · +17 −11 · modified

@@ -31,14 +31,20 @@
 ACCEPTED_ENCODINGS: list[bytes] = [b"gzip", b"deflate"]
 
 try:
-    try:
-        import brotli  # noqa: F401
-    except ImportError:
-        import brotlicffi  # noqa: F401
+    import brotli
 except ImportError:
     pass
 else:
-    ACCEPTED_ENCODINGS.append(b"br")
+    try:
+        brotli.Decompressor.can_accept_more_data
+    except AttributeError:  # pragma: no cover
+        warnings.warn(
+            "You have brotli installed. But 'br' encoding support now requires "
+            "brotli version >= 1.2.0. Please upgrade brotli version to make Scrapy "
+            "decode 'br' encoded responses.",
+        )
+    else:
+        ACCEPTED_ENCODINGS.append(b"br")
 
 try:
     import zstandard  # noqa: F401
@@ -116,13 +122,13 @@ def process_response(
                     decoded_body, content_encoding = self._handle_encoding(
                         response.body, content_encoding, max_size
                     )
-                except _DecompressionMaxSizeExceeded:
+                except _DecompressionMaxSizeExceeded as e:
                     raise IgnoreRequest(
                         f"Ignored response {response} because its body "
-                        f"({len(response.body)} B compressed) exceeded "
-                        f"DOWNLOAD_MAXSIZE ({max_size} B) during "
-                        f"decompression."
-                    )
+                        f"({len(response.body)} B compressed, "
+                        f"{e.decompressed_size} B decompressed so far) exceeded "
+                        f"DOWNLOAD_MAXSIZE ({max_size} B) during decompression."
+                    ) from e
                 if len(response.body) < warn_size <= len(decoded_body):
                     logger.warning(
                         f"{response} body size after decompression "
@@ -202,7 +208,7 @@ def _warn_unknown_encoding(
             f"from unsupported encoding(s) '{encodings_str}'."
         )
         if b"br" in encodings:
-            msg += " You need to install brotli or brotlicffi to decode 'br'."
+            msg += " You need to install brotli >= 1.2.0 to decode 'br'."
         if b"zstd" in encodings:
             msg += " You need to install zstandard to decode 'zstd'."
         logger.warning(msg)

scrapy/utils/_compression.py · +48 −85 · modified

@@ -1,42 +1,9 @@
 import contextlib
 import zlib
 from io import BytesIO
-from warnings import warn
-
-from scrapy.exceptions import ScrapyDeprecationWarning
-
-try:
-    try:
-        import brotli
-    except ImportError:
-        import brotlicffi as brotli
-except ImportError:
-    pass
-else:
-    try:
-        brotli.Decompressor.process
-    except AttributeError:
-        warn(
-            (
-                "You have brotlipy installed, and Scrapy will use it, but "
-                "Scrapy support for brotlipy is deprecated and will stop "
-                "working in a future version of Scrapy. brotlipy itself is "
-                "deprecated, it has been superseded by brotlicffi. Please, "
-                "uninstall brotlipy and install brotli or brotlicffi instead. "
-                "brotlipy has the same import name as brotli, so keeping both "
-                "installed is strongly discouraged."
-            ),
-            ScrapyDeprecationWarning,
-        )
-
-        def _brotli_decompress(decompressor, data):
-            return decompressor.decompress(data)
-
-    else:
-
-        def _brotli_decompress(decompressor, data):
-            return decompressor.process(data)
 
+with contextlib.suppress(ImportError):
+    import brotli
 
 with contextlib.suppress(ImportError):
     import zstandard
@@ -46,62 +13,64 @@ def _brotli_decompress(decompressor, data):
 
 
 class _DecompressionMaxSizeExceeded(ValueError):
-    pass
+    def __init__(self, decompressed_size: int, max_size: int) -> None:
+        self.decompressed_size = decompressed_size
+        self.max_size = max_size
+
+    def __str__(self) -> str:
+        return (
+            f"The number of bytes decompressed so far "
+            f"({self.decompressed_size} B) exceeded the specified maximum "
+            f"({self.max_size} B)."
+        )
+
+
+def _check_max_size(decompressed_size: int, max_size: int) -> None:
+    if max_size and decompressed_size > max_size:
+        raise _DecompressionMaxSizeExceeded(decompressed_size, max_size)
 
 
 def _inflate(data: bytes, *, max_size: int = 0) -> bytes:
     decompressor = zlib.decompressobj()
-    raw_decompressor = zlib.decompressobj(wbits=-15)
-    input_stream = BytesIO(data)
+    try:
+        first_chunk = decompressor.decompress(data, max_length=_CHUNK_SIZE)
+    except zlib.error:
+        # to work with raw deflate content that may be sent by microsoft servers.
+        decompressor = zlib.decompressobj(wbits=-15)
+        first_chunk = decompressor.decompress(data, max_length=_CHUNK_SIZE)
+    decompressed_size = len(first_chunk)
+    _check_max_size(decompressed_size, max_size)
     output_stream = BytesIO()
-    output_chunk = b"."
-    decompressed_size = 0
-    while output_chunk:
-        input_chunk = input_stream.read(_CHUNK_SIZE)
-        try:
-            output_chunk = decompressor.decompress(input_chunk)
-        except zlib.error:
-            if decompressor != raw_decompressor:
-                # ugly hack to work with raw deflate content that may
-                # be sent by microsoft servers. For more information, see:
-                # http://carsten.codimi.de/gzip.yaws/
-                # http://www.port80software.com/200ok/archive/2005/10/31/868.aspx
-                # http://www.gzip.org/zlib/zlib_faq.html#faq38
-                decompressor = raw_decompressor
-                output_chunk = decompressor.decompress(input_chunk)
-            else:
-                raise
+    output_stream.write(first_chunk)
+    while decompressor.unconsumed_tail:
+        output_chunk = decompressor.decompress(
+            decompressor.unconsumed_tail, max_length=_CHUNK_SIZE
+        )
         decompressed_size += len(output_chunk)
-        if max_size and decompressed_size > max_size:
-            raise _DecompressionMaxSizeExceeded(
-                f"The number of bytes decompressed so far "
-                f"({decompressed_size} B) exceed the specified maximum "
-                f"({max_size} B)."
-            )
+        _check_max_size(decompressed_size, max_size)
         output_stream.write(output_chunk)
-    output_stream.seek(0)
-    return output_stream.read()
+    if tail := decompressor.flush():
+        decompressed_size += len(tail)
+        _check_max_size(decompressed_size, max_size)
+        output_stream.write(tail)
+    return output_stream.getvalue()
 
 
 def _unbrotli(data: bytes, *, max_size: int = 0) -> bytes:
     decompressor = brotli.Decompressor()
-    input_stream = BytesIO(data)
+    first_chunk = decompressor.process(data, output_buffer_limit=_CHUNK_SIZE)
+    decompressed_size = len(first_chunk)
+    _check_max_size(decompressed_size, max_size)
     output_stream = BytesIO()
-    output_chunk = b"."
-    decompressed_size = 0
-    while output_chunk:
-        input_chunk = input_stream.read(_CHUNK_SIZE)
-        output_chunk = _brotli_decompress(decompressor, input_chunk)
+    output_stream.write(first_chunk)
+    while not decompressor.is_finished():
+        output_chunk = decompressor.process(b"", output_buffer_limit=_CHUNK_SIZE)
+        if not output_chunk:
+            break
         decompressed_size += len(output_chunk)
-        if max_size and decompressed_size > max_size:
-            raise _DecompressionMaxSizeExceeded(
-                f"The number of bytes decompressed so far "
-                f"({decompressed_size} B) exceed the specified maximum "
-                f"({max_size} B)."
-            )
+        _check_max_size(decompressed_size, max_size)
         output_stream.write(output_chunk)
-    output_stream.seek(0)
-    return output_stream.read()
+    return output_stream.getvalue()
 
 
 def _unzstd(data: bytes, *, max_size: int = 0) -> bytes:
@@ -113,12 +82,6 @@ def _unzstd(data: bytes, *, max_size: int = 0) -> bytes:
     while output_chunk:
         output_chunk = stream_reader.read(_CHUNK_SIZE)
         decompressed_size += len(output_chunk)
-        if max_size and decompressed_size > max_size:
-            raise _DecompressionMaxSizeExceeded(
-                f"The number of bytes decompressed so far "
-                f"({decompressed_size} B) exceed the specified maximum "
-                f"({max_size} B)."
-            )
+        _check_max_size(decompressed_size, max_size)
         output_stream.write(output_chunk)
-    output_stream.seek(0)
-    return output_stream.read()
+    return output_stream.getvalue()

scrapy/utils/gz.py · +3 −9 · modified

@@ -5,7 +5,7 @@
 from io import BytesIO
 from typing import TYPE_CHECKING
 
-from ._compression import _CHUNK_SIZE, _DecompressionMaxSizeExceeded
+from ._compression import _CHUNK_SIZE, _check_max_size
 
 if TYPE_CHECKING:
     from scrapy.http import Response
@@ -31,15 +31,9 @@ def gunzip(data: bytes, *, max_size: int = 0) -> bytes:
                 break
             raise
         decompressed_size += len(chunk)
-        if max_size and decompressed_size > max_size:
-            raise _DecompressionMaxSizeExceeded(
-                f"The number of bytes decompressed so far "
-                f"({decompressed_size} B) exceed the specified maximum "
-                f"({max_size} B)."
-            )
+        _check_max_size(decompressed_size, max_size)
         output_stream.write(chunk)
-    output_stream.seek(0)
-    return output_stream.read()
+    return output_stream.getvalue()
 
 
 def gzip_magic_number(response: Response) -> bool:

tests/test_downloadermiddleware_httpcompression.py · +48 −20 · modified

@@ -52,11 +52,10 @@
 
 def _skip_if_no_br() -> None:
     try:
-        try:
-            import brotli  # noqa: F401,PLC0415
-        except ImportError:
-            import brotlicffi  # noqa: F401,PLC0415
-    except ImportError:
+        import brotli  # noqa: PLC0415
+
+        brotli.Decompressor.can_accept_more_data
+    except (ImportError, AttributeError):
         pytest.skip("no brotli support")
 
 
@@ -153,14 +152,9 @@ def test_process_response_br(self):
 
     def test_process_response_br_unsupported(self):
         try:
-            try:
-                import brotli  # noqa: F401,PLC0415
-
-                pytest.skip("Requires not having brotli support")
-            except ImportError:
-                import brotlicffi  # noqa: F401,PLC0415
+            import brotli  # noqa: F401,PLC0415
 
-                pytest.skip("Requires not having brotli support")
+            pytest.skip("Requires not having brotli support")
         except ImportError:
             pass
         response = self._getresponse("br")
@@ -179,7 +173,7 @@ def test_process_response_br_unsupported(self):
                 (
                     "HttpCompressionMiddleware cannot decode the response for"
                     " http://scrapytest.org/ from unsupported encoding(s) 'br'."
-                    " You need to install brotli or brotlicffi to decode 'br'."
+                    " You need to install brotli >= 1.2.0 to decode 'br'."
                 ),
             ),
         )
@@ -511,15 +505,16 @@ def test_process_response_head_request_no_decode_required(self):
         self.assertStatsEqual("httpcompression/response_bytes", None)
 
     def _test_compression_bomb_setting(self, compression_id):
-        settings = {"DOWNLOAD_MAXSIZE": 10_000_000}
+        settings = {"DOWNLOAD_MAXSIZE": 1_000_000}
         crawler = get_crawler(Spider, settings_dict=settings)
         spider = crawler._create_spider("scrapytest.org")
         mw = HttpCompressionMiddleware.from_crawler(crawler)
         mw.open_spider(spider)
 
-        response = self._getresponse(f"bomb-{compression_id}")
-        with pytest.raises(IgnoreRequest):
+        response = self._getresponse(f"bomb-{compression_id}")  # 11_511_612 B
+        with pytest.raises(IgnoreRequest) as exc_info:
             mw.process_response(response.request, response)
+        assert exc_info.value.__cause__.decompressed_size < 1_100_000
 
     def test_compression_bomb_setting_br(self):
         _skip_if_no_br()
@@ -539,16 +534,17 @@ def test_compression_bomb_setting_zstd(self):
 
     def _test_compression_bomb_spider_attr(self, compression_id):
         class DownloadMaxSizeSpider(Spider):
-            download_maxsize = 10_000_000
+            download_maxsize = 1_000_000
 
         crawler = get_crawler(DownloadMaxSizeSpider)
         spider = crawler._create_spider("scrapytest.org")
         mw = HttpCompressionMiddleware.from_crawler(crawler)
         mw.open_spider(spider)
 
         response = self._getresponse(f"bomb-{compression_id}")
-        with pytest.raises(IgnoreRequest):
+        with pytest.raises(IgnoreRequest) as exc_info:
             mw.process_response(response.request, response)
+        assert exc_info.value.__cause__.decompressed_size < 1_100_000
 
     @pytest.mark.filterwarnings("ignore::scrapy.exceptions.ScrapyDeprecationWarning")
     def test_compression_bomb_spider_attr_br(self):
@@ -577,9 +573,10 @@ def _test_compression_bomb_request_meta(self, compression_id):
         mw.open_spider(spider)
 
         response = self._getresponse(f"bomb-{compression_id}")
-        response.meta["download_maxsize"] = 10_000_000
-        with pytest.raises(IgnoreRequest):
+        response.meta["download_maxsize"] = 1_000_000
+        with pytest.raises(IgnoreRequest) as exc_info:
             mw.process_response(response.request, response)
+        assert exc_info.value.__cause__.decompressed_size < 1_100_000
 
     def test_compression_bomb_request_meta_br(self):
         _skip_if_no_br()
@@ -728,3 +725,34 @@ def test_download_warnsize_request_meta_zstd(self):
         _skip_if_no_zstd()
 
         self._test_download_warnsize_request_meta("zstd")
+
+    def _get_truncated_response(self, compression_id):
+        crawler = get_crawler(Spider)
+        spider = crawler._create_spider("scrapytest.org")
+        mw = HttpCompressionMiddleware.from_crawler(crawler)
+        mw.open_spider(spider)
+        response = self._getresponse(compression_id)
+        truncated_body = response.body[: len(response.body) // 2]
+        response = response.replace(body=truncated_body)
+        return mw.process_response(response.request, response)
+
+    def test_process_truncated_response_br(self):
+        _skip_if_no_br()
+        resp = self._get_truncated_response("br")
+        assert resp.body.startswith(b"<!DOCTYPE")
+
+    def test_process_truncated_response_zlibdeflate(self):
+        resp = self._get_truncated_response("zlibdeflate")
+        assert resp.body.startswith(b"<!DOCTYPE")
+
+    def test_process_truncated_response_gzip(self):
+        resp = self._get_truncated_response("gzip")
+        assert resp.body.startswith(b"<!DOCTYPE")
+
+    def test_process_truncated_response_zstd(self):
+        _skip_if_no_zstd()
+        for check_key in FORMAT:
+            if not check_key.startswith("zstd-"):
+                continue
+            resp = self._get_truncated_response(check_key)
+            assert len(resp.body) == 0

tests/test_proxy_connect.py · +1 −6 · modified

@@ -61,6 +61,7 @@ def _wrong_credentials(proxy_url):
     return urlunsplit(bad_auth_proxy)
 
 
+@pytest.mark.requires_mitmproxy
 class TestProxyConnect:
     @classmethod
     def setup_class(cls):
@@ -72,13 +73,7 @@ def teardown_class(cls):
         cls.mockserver.__exit__(None, None, None)
 
     def setup_method(self):
-        try:
-            import mitmproxy  # noqa: F401,PLC0415
-        except ImportError:
-            pytest.skip("mitmproxy is not installed")
-
         self._oldenv = os.environ.copy()
-
         self._proxy = MitmProxy()
         proxy_url = self._proxy.start()
         os.environ["https_proxy"] = proxy_url

tox.ini · +17 −13 · modified

@@ -25,9 +25,6 @@ deps =
 deps =
     {[test-requirements]deps}
     pytest >= 8.4.1  # https://github.com/pytest-dev/pytest/pull/13502
-
-    # mitmproxy does not support PyPy
-    mitmproxy; implementation_name != "pypy"
 passenv =
     S3_TEST_FILE_URI
     AWS_ACCESS_KEY_ID
@@ -112,9 +109,6 @@ deps =
     w3lib==1.17.0
     zope.interface==5.1.0
     {[test-requirements]deps}
-
-    # mitmproxy 8.0.0 requires upgrading some of the pinned dependencies
-    # above, hence we do not install it in pinned environments at the moment
 setenv =
     _SCRAPY_PINNED=true
 commands =
@@ -137,8 +131,7 @@ deps =
     Twisted[http2]
     boto3
     bpython  # optional for shell wrapper tests
-    brotli; implementation_name != "pypy"  # optional for HTTP compress downloader middleware tests
-    brotlicffi; implementation_name == "pypy"  # optional for HTTP compress downloader middleware tests
+    brotli >= 1.2.0  # optional for HTTP compress downloader middleware tests
     google-cloud-storage
     ipython
     robotexclusionrulesparser
@@ -152,9 +145,7 @@ deps =
     Pillow==8.3.2
     boto3==1.20.0
     bpython==0.7.1
-    brotli==0.5.2; implementation_name != "pypy"
-    brotlicffi==0.8.0; implementation_name == "pypy"
-    brotlipy
+    brotli==1.2.0
     google-cloud-storage==1.29.0
     ipython==7.1.0
     robotexclusionrulesparser==1.6.2
@@ -254,7 +245,7 @@ deps =
     {[testenv]deps}
     botocore>=1.13.45
 commands =
-    pytest {posargs:--cov-config=pyproject.toml --cov=scrapy --cov-report=xml --cov-report= tests --junitxml=botocore.junit.xml -o junit_family=legacy -m requires_botocore}
+    pytest {posargs:--cov-config=pyproject.toml --cov=scrapy --cov-report=xml --cov-report= tests --junitxml=botocore.junit.xml -o junit_family=legacy} -m requires_botocore
 
 [testenv:botocore-pinned]
 basepython = {[pinned]basepython}
@@ -264,4 +255,17 @@ deps =
 setenv =
     {[pinned]setenv}
 commands =
-    pytest {posargs:--cov-config=pyproject.toml --cov=scrapy --cov-report=xml --cov-report= tests --junitxml=botocore-pinned.junit.xml -o junit_family=legacy -m requires_botocore}
+    pytest {posargs:--cov-config=pyproject.toml --cov=scrapy --cov-report=xml --cov-report= tests --junitxml=botocore-pinned.junit.xml -o junit_family=legacy} -m requires_botocore
+
+
+# Run proxy tests that use mitmproxy in a separate env to avoid installing
+# numerous mitmproxy deps in other envs (even in extra-deps), as they can
+# conflict with other deps we want, or don't want, to have installed there.
+
+[testenv:mitmproxy]
+deps =
+    {[testenv]deps}
+    # mitmproxy does not support PyPy
+    mitmproxy; implementation_name != "pypy"
+commands =
+    pytest {posargs:--cov-config=pyproject.toml --cov=scrapy --cov-report=xml --cov-report= tests --junitxml=botocore.junit.xml -o junit_family=legacy} -m requires_mitmproxy

67d78bc41db1

Merge pull request #1234 from robryk:sizelimit

https://github.com/google/brotli · Copybara-Service · Jan 8, 2025 · via ghsa

commit

4 files changed · +197 −67

python/_brotli.c · +155 −67 · modified

@@ -23,6 +23,7 @@ typedef struct {
     PyObject *list;
     /* Number of whole allocated size. */
     Py_ssize_t allocated;
+    Py_ssize_t size_limit;
 } BlocksOutputBuffer;
 
 static const char unable_allocate_msg[] = "Unable to allocate output buffer.";
@@ -69,11 +70,17 @@ static const Py_ssize_t BUFFER_BLOCK_SIZE[] =
    Return -1 on failure
 */
 static inline int
-BlocksOutputBuffer_InitAndGrow(BlocksOutputBuffer *buffer,
+BlocksOutputBuffer_InitAndGrow(BlocksOutputBuffer *buffer, Py_ssize_t size_limit,
                                size_t *avail_out, uint8_t **next_out)
 {
     PyObject *b;
-    const Py_ssize_t block_size = BUFFER_BLOCK_SIZE[0];
+    Py_ssize_t block_size = BUFFER_BLOCK_SIZE[0];
+
+    assert(size_limit > 0);
+
+    if (size_limit < block_size) {
+      block_size = size_limit;
+    }
 
     // Ensure .list was set to NULL, for BlocksOutputBuffer_OnError().
     assert(buffer->list == NULL);
@@ -94,6 +101,7 @@ BlocksOutputBuffer_InitAndGrow(BlocksOutputBuffer *buffer,
 
     // Set variables
     buffer->allocated = block_size;
+    buffer->size_limit = size_limit;
 
     *avail_out = (size_t) block_size;
     *next_out = (uint8_t*) PyBytes_AS_STRING(b);
@@ -122,10 +130,16 @@ BlocksOutputBuffer_Grow(BlocksOutputBuffer *buffer,
         block_size = BUFFER_BLOCK_SIZE[Py_ARRAY_LENGTH(BUFFER_BLOCK_SIZE) - 1];
     }
 
-    // Check buffer->allocated overflow
-    if (block_size > PY_SSIZE_T_MAX - buffer->allocated) {
-        PyErr_SetString(PyExc_MemoryError, unable_allocate_msg);
-        return -1;
+    if (block_size > buffer->size_limit - buffer->allocated) {
+      block_size = buffer->size_limit - buffer->allocated;
+    }
+
+    if (block_size == 0) {
+      // We are at the size_limit (either the provided one, in which case we
+      // shouldn't have been called, or the implicit PY_SSIZE_T_MAX one, in
+      // which case we wouldn't be able to concatenate the blocks at the end).
+      PyErr_SetString(PyExc_MemoryError, "too long");
+      return -1;
     }
 
     // Create the block
@@ -291,7 +305,7 @@ static PyObject* compress_stream(BrotliEncoderState* enc, BrotliEncoderOperation
   BlocksOutputBuffer buffer = {.list=NULL};
   PyObject *ret;
 
-  if (BlocksOutputBuffer_InitAndGrow(&buffer, &available_out, &next_out) < 0) {
+  if (BlocksOutputBuffer_InitAndGrow(&buffer, PY_SSIZE_T_MAX, &available_out, &next_out) < 0) {
     goto error;
   }
 
@@ -592,57 +606,6 @@ static PyTypeObject brotli_CompressorType = {
   brotli_Compressor_new,                 /* tp_new */
 };
 
-static PyObject* decompress_stream(BrotliDecoderState* dec,
-                                   uint8_t* input, size_t input_length) {
-  BrotliDecoderResult result;
-
-  size_t available_in = input_length;
-  const uint8_t* next_in = input;
-
-  size_t available_out;
-  uint8_t* next_out;
-  BlocksOutputBuffer buffer = {.list=NULL};
-  PyObject *ret;
-
-  if (BlocksOutputBuffer_InitAndGrow(&buffer, &available_out, &next_out) < 0) {
-    goto error;
-  }
-
-  while (1) {
-    Py_BEGIN_ALLOW_THREADS
-    result = BrotliDecoderDecompressStream(dec,
-                                           &available_in, &next_in,
-                                           &available_out, &next_out, NULL);
-    Py_END_ALLOW_THREADS
-
-    if (result == BROTLI_DECODER_RESULT_NEEDS_MORE_OUTPUT) {
-      if (available_out == 0) {
-        if (BlocksOutputBuffer_Grow(&buffer, &available_out, &next_out) < 0) {
-          goto error;
-        }
-      }
-      continue;
-    }
-
-    break;
-  }
-
-  if (result == BROTLI_DECODER_RESULT_ERROR || available_in != 0) {
-    goto error;
-  }
-
-  ret = BlocksOutputBuffer_Finish(&buffer, available_out);
-  if (ret != NULL) {
-    goto finally;
-  }
-
-error:
-  BlocksOutputBuffer_OnError(&buffer);
-  ret = NULL;
-finally:
-  return ret;
-}
-
 PyDoc_STRVAR(brotli_Decompressor_doc,
 "An object to decompress a byte string.\n"
 "\n"
@@ -655,10 +618,14 @@ PyDoc_STRVAR(brotli_Decompressor_doc,
 typedef struct {
   PyObject_HEAD
   BrotliDecoderState* dec;
+  uint8_t* unconsumed_data;
+  size_t unconsumed_data_length;
 } brotli_Decompressor;
 
 static void brotli_Decompressor_dealloc(brotli_Decompressor* self) {
   BrotliDecoderDestroyInstance(self->dec);
+  if (self->unconsumed_data)
+    free(self->unconsumed_data);
   #if PY_MAJOR_VERSION >= 3
   Py_TYPE(self)->tp_free((PyObject*)self);
   #else
@@ -674,6 +641,9 @@ static PyObject* brotli_Decompressor_new(PyTypeObject *type, PyObject *args, PyO
     self->dec = BrotliDecoderCreateInstance(0, 0, 0);
   }
 
+  self->unconsumed_data = NULL;
+  self->unconsumed_data_length = 0;
+
   return (PyObject *)self;
 }
 
@@ -692,35 +662,118 @@ static int brotli_Decompressor_init(brotli_Decompressor *self, PyObject *args, P
   return 0;
 }
 
+static PyObject* decompress_stream(brotli_Decompressor* self,
+                                   uint8_t* input, size_t input_length, Py_ssize_t max_output_length) {
+  BrotliDecoderResult result;
+
+  size_t available_in = input_length;
+  const uint8_t* next_in = input;
+
+  size_t available_out;
+  uint8_t* next_out;
+  uint8_t* new_tail;
+  BlocksOutputBuffer buffer = {.list=NULL};
+  PyObject *ret;
+
+  if (BlocksOutputBuffer_InitAndGrow(&buffer, max_output_length, &available_out, &next_out) < 0) {
+    goto error;
+  }
+
+  while (1) {
+    Py_BEGIN_ALLOW_THREADS
+    result = BrotliDecoderDecompressStream(self->dec,
+                                           &available_in, &next_in,
+                                           &available_out, &next_out, NULL);
+    Py_END_ALLOW_THREADS
+
+    if (result == BROTLI_DECODER_RESULT_NEEDS_MORE_OUTPUT) {
+      if (available_out == 0) {
+        if (buffer.allocated == PY_SSIZE_T_MAX) {
+          PyErr_SetString(PyExc_MemoryError, unable_allocate_msg);
+          goto error;
+        }
+        if (buffer.allocated == max_output_length) {
+          // We've reached the output length limit.
+          break;
+        }
+        if (BlocksOutputBuffer_Grow(&buffer, &available_out, &next_out) < 0) {
+          goto error;
+        }
+      }
+      continue;
+    }
+
+    if (result == BROTLI_DECODER_RESULT_ERROR || available_in != 0) {
+      available_in = 0;
+      goto error;
+    }
+
+    break;
+  }
+
+  ret = BlocksOutputBuffer_Finish(&buffer, available_out);
+  if (ret != NULL) {
+    goto finally;
+  }
+
+error:
+  BlocksOutputBuffer_OnError(&buffer);
+  ret = NULL;
+
+finally:
+  new_tail = available_in > 0 ? malloc(available_in) : NULL;
+  if (available_in > 0) {
+    memcpy(new_tail, next_in, available_in);
+  }
+  if (self->unconsumed_data) {
+    free(self->unconsumed_data);
+  }
+  self->unconsumed_data = new_tail;
+  self->unconsumed_data_length = available_in;
+
+  return ret;
+}
+
+
 PyDoc_STRVAR(brotli_Decompressor_process_doc,
 "Process \"string\" for decompression, returning a string that contains \n"
 "decompressed output data.  This data should be concatenated to the output \n"
 "produced by any preceding calls to the \"process()\" method. \n"
 "Some or all of the input may be kept in internal buffers for later \n"
 "processing, and the decompressed output data may be empty until enough input \n"
 "has been accumulated.\n"
+"If max_output_length is set, no more than max_output_length bytes will be\n"
+"returned. If the limit is reached, further calls to process (potentially with\n"
+"empty input) will continue to yield more data. If, after returning a string of\n"
+"the length equal to limit, can_accept_more_data() returns False, process()\n"
+"must only be called with empty input until can_accept_more_data() once again\n"
+"returns True.\n"
 "\n"
 "Signature:\n"
-"  decompress(string)\n"
+"  decompress(string, max_output_length=int)\n"
 "\n"
 "Args:\n"
 "  string (bytes): The input data\n"
-"\n"
-"Returns:\n"
+"\n""Returns:\n"
 "  The decompressed output data (bytes)\n"
 "\n"
 "Raises:\n"
 "  brotli.error: If decompression fails\n");
 
-static PyObject* brotli_Decompressor_process(brotli_Decompressor *self, PyObject *args) {
+static PyObject* brotli_Decompressor_process(brotli_Decompressor *self, PyObject *args, PyObject* keywds) {
   PyObject* ret;
   Py_buffer input;
   int ok;
+  Py_ssize_t max_output_length = PY_SSIZE_T_MAX;
+  uint8_t* data;
+  size_t data_length;
+
+  static char* kwlist[] = { "", "max_output_length", NULL };
 
 #if PY_MAJOR_VERSION >= 3
-  ok = PyArg_ParseTuple(args, "y*:process", &input);
+  ok = PyArg_ParseTupleAndKeywords(args, keywds, "y*|n:process", kwlist, &input, &max_output_length);
 #else
-  ok = PyArg_ParseTuple(args, "s*:process", &input);
+  ok = PyArg_ParseTupleAndKeywords(args, keywds, "s*|n:process", kwlist, &input, &max_output_length);
 #endif
 
   if (!ok) {
@@ -731,7 +784,20 @@ static PyObject* brotli_Decompressor_process(brotli_Decompressor *self, PyObject
     goto error;
   }
 
-  ret = decompress_stream(self->dec, (uint8_t*) input.buf, input.len);
+  if (self->unconsumed_data_length > 0) {
+    if (input.len > 0) {
+      PyErr_SetString(BrotliError, "process called with data when accept_more_data is False");
+      ret = NULL;
+      goto finally;
+    }
+    data = self->unconsumed_data;
+    data_length = self->unconsumed_data_length;
+  } else {
+    data = (uint8_t*)input.buf;
+    data_length = input.len;
+  }
+
+  ret = decompress_stream(self, data, data_length, max_output_length);
   if (ret != NULL) {
     goto finally;
   }
@@ -773,13 +839,35 @@ static PyObject* brotli_Decompressor_is_finished(brotli_Decompressor *self) {
   }
 }
 
+PyDoc_STRVAR(brotli_Decompressor_can_accept_more_data_doc,
+"Checks if the decoder instance can accept more compressed data. If the decompress()\n"
+"method on this instance of decompressor was never called with max_length,\n"
+"this method will always return True.\n"
+"\n"
+"Signature:"
+"  can_accept_more_data()\n"
+"\n"
+"Returns:\n"
+"  True  if the decoder is ready to accept more compressed data via decompress()\n"
+"  False if the decoder needs to output some data via decompress(b'') before\n"
+"        being provided any more compressed data\n");
+
+static PyObject* brotli_Decompressor_can_accept_more_data(brotli_Decompressor* self) {
+  if (self->unconsumed_data_length > 0) {
+    Py_RETURN_FALSE;
+  } else {
+    Py_RETURN_TRUE;
+  }
+}
+
 static PyMemberDef brotli_Decompressor_members[] = {
   {NULL}  /* Sentinel */
 };
 
 static PyMethodDef brotli_Decompressor_methods[] = {
-  {"process", (PyCFunction)brotli_Decompressor_process, METH_VARARGS, brotli_Decompressor_process_doc},
+  {"process", (PyCFunction)brotli_Decompressor_process, METH_VARARGS | METH_KEYWORDS, brotli_Decompressor_process_doc},
   {"is_finished", (PyCFunction)brotli_Decompressor_is_finished, METH_NOARGS, brotli_Decompressor_is_finished_doc},
+  {"can_accept_more_data", (PyCFunction)brotli_Decompressor_can_accept_more_data, METH_NOARGS, brotli_Decompressor_can_accept_more_data_doc},
   {NULL}  /* Sentinel */
 };
 
@@ -877,7 +965,7 @@ static PyObject* brotli_decompress(PyObject *self, PyObject *args, PyObject *key
   next_in = (uint8_t*) input.buf;
   available_in = input.len;
 
-  if (BlocksOutputBuffer_InitAndGrow(&buffer, &available_out, &next_out) < 0) {
+  if (BlocksOutputBuffer_InitAndGrow(&buffer, PY_SSIZE_T_MAX, &available_out, &next_out) < 0) {
     goto error;
   }

python/tests/decompressor_test.py+42 −0 modified

@@ -4,6 +4,7 @@
 # See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
 
 import functools
+import os
 import unittest
 
 from . import _test_utils
@@ -39,10 +40,51 @@ def _decompress(self, test_data):
                     out_file.write(self.decompressor.process(data))
         self.assertTrue(self.decompressor.is_finished())
 
+    def _decompress_with_limit(self, test_data, max_output_length):
+        temp_uncompressed = _test_utils.get_temp_uncompressed_name(test_data)
+        with open(temp_uncompressed, 'wb') as out_file:
+            with open(test_data, 'rb') as in_file:
+                chunk_iter = iter(functools.partial(in_file.read, 10 * 1024), b'')
+                while not self.decompressor.is_finished():
+                    data = b''
+                    if self.decompressor.can_accept_more_data():
+                        data = next(chunk_iter, b'')
+                    decompressed_data = self.decompressor.process(data, max_output_length=max_output_length)
+                    self.assertTrue(len(decompressed_data) <= max_output_length)
+                    out_file.write(decompressed_data)
+                self.assertTrue(next(chunk_iter, None) == None)
+
     def _test_decompress(self, test_data):
         self._decompress(test_data)
         self._check_decompression(test_data)
 
+    def _test_decompress_with_limit(self, test_data):
+        self._decompress_with_limit(test_data, max_output_length=20)
+        self._check_decompression(test_data)
+
+    def test_too_much_input(self):
+        with open(os.path.join(_test_utils.TESTDATA_DIR, "zerosukkanooa.compressed"), 'rb') as in_file:
+            compressed = in_file.read()
+            self.decompressor.process(compressed[:-1], max_output_length=1)
+            # the following assertion checks whether the test setup is correct
+            self.assertTrue(not self.decompressor.can_accept_more_data())
+            with self.assertRaises(brotli.error):
+                self.decompressor.process(compressed[-1:])
+
+    def test_changing_limit(self):
+        test_data = os.path.join(_test_utils.TESTDATA_DIR, "zerosukkanooa.compressed")
+        temp_uncompressed = _test_utils.get_temp_uncompressed_name(test_data)
+        with open(temp_uncompressed, 'wb') as out_file:
+            with open(test_data, 'rb') as in_file:
+                compressed = in_file.read()
+                uncompressed = self.decompressor.process(compressed[:-1], max_output_length=1)
+                self.assertTrue(len(uncompressed) <= 1)
+                out_file.write(uncompressed)
+                while not self.decompressor.can_accept_more_data():
+                    out_file.write(self.decompressor.process(b''))
+                out_file.write(self.decompressor.process(compressed[-1:]))
+        self._check_decompression(test_data)
+
     def test_garbage_appended(self):
         with self.assertRaises(brotli.error):
             self.decompressor.process(brotli.compress(b'a') + b'a')

tests/testdata/zerosukkanooa+0 −0 added
tests/testdata/zerosukkanooa.compressed+0 −0 added
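
The tests above drive the decompressor in a bounded-output loop: each `process()` call is capped by `max_output_length`, and no new compressed input is supplied while `can_accept_more_data()` returns False. As a rough illustration of that pattern, here is a minimal sketch using the standard library's `zlib` instead of the brotli binding. Its `max_length` argument and `unconsumed_tail` attribute play a similar role. The names `bounded_decompress`, `chunk_limit`, and `total_limit` are hypothetical and appear in neither patch.

```python
import zlib


def bounded_decompress(compressed: bytes, chunk_limit: int, total_limit: int) -> bytes:
    """Inflate a zlib stream in pieces of at most `chunk_limit` bytes.

    Raises ValueError as soon as the output exceeds `total_limit` bytes,
    so a decompression bomb is rejected before it is fully expanded.
    """
    decomp = zlib.decompressobj()
    out = bytearray()
    pending = compressed
    while not decomp.eof:
        # max_length caps the output of this call; input that could not be
        # processed yet is kept in decomp.unconsumed_tail.
        piece = decomp.decompress(pending, chunk_limit)
        out += piece
        if len(out) > total_limit:
            raise ValueError("decompressed size exceeds limit")
        pending = decomp.unconsumed_tail
        if not piece and not pending:
            # No output and no input left, yet the stream has not ended.
            raise ValueError("truncated stream")
    return bytes(out)
```

With this loop, peak memory is bounded by roughly `total_limit + chunk_limit` no matter how small the compressed payload is. That is the property the `max_output_length` parameter is meant to make achievable for brotli streams.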

References

github.com/advisories/GHSA-2qfp-q593-8484 · ghsa · ADVISORY
nvd.nist.gov/vuln/detail/CVE-2025-6176 · ghsa · ADVISORY
github.com/google/brotli/commit/67d78bc41db1a0d03f2e763497748f2f69946627 · ghsa · WEB
github.com/google/brotli/issues/1327 · ghsa · WEB
github.com/google/brotli/issues/1375 · ghsa · WEB
github.com/google/brotli/pull/1234 · ghsa · WEB
github.com/google/brotli/releases/tag/v1.2.0 · ghsa · WEB
github.com/scrapy/scrapy/commit/14737e91edc513967f516fc839cc9c8a4f8d91da · ghsa · WEB
github.com/scrapy/scrapy/pull/7134 · ghsa · WEB
huntr.com/bounties/2c26a886-5984-47ee-a421-0d5fe1344eb0 · nvd · WEB

cvss	0.488
epss	0.000
exploit	0.000
kev	0.000
patch	-0.070
ransomware	0.000