PickleScan Security Bypass Using Misleading File Extension
Description
An Improper Input Validation vulnerability in the scanning logic of mmaitre314 picklescan versions up to and including 0.0.30 allows a remote attacker to bypass pickle file security checks by supplying a standard pickle file with a PyTorch-related file extension. When a pickle file that was incorrectly considered safe is loaded, it can lead to the execution of malicious code.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
PickleScan <=0.0.30 fails to scan malicious pickle files disguised with PyTorch extensions like .bin, allowing remote code execution by bypassing security checks.
What the vulnerability is
CVE-2025-10155 is an improper input validation vulnerability in PickleScan (versions up to and including 0.0.30), a security scanner for Python pickle files. The root cause lies in the scan_bytes function, which first checks the file extension against a list of PyTorch-related extensions (e.g., .bin). If the extension matches, it calls scan_pytorch. If that call raises InvalidMagicError (as it would for a standard pickle file), the scanner logs the error and returns an empty result marked as a scan error, without falling back to scan_pickle_bytes. Malicious pickle detection is therefore skipped entirely [2][3].
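The control flow described above can be sketched with stand-in functions. This is a simplified illustration rather than the project's code: `parse_as_pytorch`, `parse_as_pickle`, both `dispatch_*` functions, the local `InvalidMagicError` class, and the extension set are hypothetical names and values chosen for the example.

```python
import io
import pickle

# Assumed extension list, for illustration only.
PYTORCH_EXTENSIONS = {".bin", ".pt", ".pth", ".ckpt"}


class InvalidMagicError(Exception):
    """Stand-in for the scanner's 'not a PyTorch file' error."""


def parse_as_pytorch(data: io.BytesIO) -> str:
    # Stand-in for scan_pytorch: a plain pickle does not pass the
    # PyTorch container checks, so this toy version always raises.
    raise InvalidMagicError("not a PyTorch archive")


def parse_as_pickle(data: io.BytesIO) -> str:
    # Stand-in for scan_pickle_bytes.
    return "scanned as pickle"


def dispatch_before_fix(data: io.BytesIO, ext: str) -> str:
    if ext in PYTORCH_EXTENSIONS:
        try:
            return parse_as_pytorch(data)
        except InvalidMagicError:
            # Early return: the pickle analysis below is never reached.
            return "scan error, no findings"
    return parse_as_pickle(data)


def dispatch_after_fix(data: io.BytesIO, ext: str) -> str:
    if ext in PYTORCH_EXTENSIONS:
        try:
            return parse_as_pytorch(data)
        except InvalidMagicError:
            # Rewind and fall through to the generic path.
            data.seek(0)
    return parse_as_pickle(data)


payload = io.BytesIO(pickle.dumps({"weights": [1, 2, 3]}))
print(dispatch_before_fix(payload, ".bin"))
print(dispatch_after_fix(payload, ".bin"))
```

The only difference between the two variants is whether the error handler returns or falls through, which is the behavioural change the description attributes to the fix.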
How it is exploited
An attacker can disguise a malicious pickle file, which would normally be detected under a .pkl extension, by simply renaming it to a PyTorch-related extension such as .bin. When PickleScan processes the renamed file, it attempts to parse it as a PyTorch format, which fails, and then exits with an error instead of scanning the file as a standard pickle. This means the malicious content (e.g., dangerous imports like builtins.exec) is never flagged [1][2]. The attack requires no authentication and can be delivered remotely, for example by hosting the disguised file on a model repository like Hugging Face.
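Renaming a file does not change its bytes, so an inspection based on content sees the same imports under either extension. The following sketch illustrates that with a benign payload and the standard-library `pickletools` module; it never unpickles anything. `Payload` and `list_imports` are hypothetical names, and the helper only understands the `GLOBAL` opcode (newer protocols use `STACK_GLOBAL`, which needs more bookkeeping), so it is not a substitute for a real scanner.

```python
import os
import pickle
import pickletools
import tempfile


class Payload:
    """Benign stand-in: unpickling would call print(), nothing harmful."""

    def __reduce__(self):
        return (print, ("benign stand-in for a dangerous call",))


def list_imports(blob: bytes) -> set:
    """Return (module, name) pairs named by GLOBAL opcodes, without unpickling."""
    found = set()
    for opcode, arg, _pos in pickletools.genops(blob):
        if opcode.name == "GLOBAL":
            module, name = arg.split(" ", 1)
            found.add((module, name))
    return found


# Protocol 2 with fix_imports=False keeps the import as one GLOBAL opcode.
blob = pickle.dumps(Payload(), protocol=2, fix_imports=False)

findings = {}
with tempfile.TemporaryDirectory() as tmp:
    for ext in (".pkl", ".bin"):
        path = os.path.join(tmp, "model" + ext)
        with open(path, "wb") as f:
            f.write(blob)
        with open(path, "rb") as f:
            findings[ext] = list_imports(f.read())

print(findings)
```

Both entries in `findings` are identical, which is why a scanner that gives up based on the extension misses content it would otherwise report.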
Impact
If a user or automated system loads a file that was incorrectly deemed safe by PickleScan, the malicious pickle can execute arbitrary code, leading to full system compromise [3]. The vulnerability undermines the scanner's core purpose, allowing an attacker to evade security checks and potentially achieve remote code execution when the pickle is loaded in a context where PickleScan was relied upon for protection.
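The reason loading is dangerous is that a pickle can name any importable callable and have it invoked during deserialization. A harmless sketch of that mechanism, using `sorted` where an attacker would name something destructive (`LooksInnocent` is a hypothetical class for the example):

```python
import pickle


class LooksInnocent:
    def __reduce__(self):
        # (callable, args): pickle.loads() will call sorted([3, 1, 2]).
        # An attacker would name a dangerous callable here instead.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(LooksInnocent())

# Loading runs the named callable and returns its result,
# not an instance of LooksInnocent.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

The call happens inside `pickle.loads` itself, so by the time the caller sees a return value the callable has already run.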
Mitigation
The vulnerability has been patched in commit 28a7b4ef753466572bda3313737116eeb9b4e5c5, which changes the logic so that when PyTorch parsing fails, the scanner falls back to standard pickle analysis [1][2]. Users should update PickleScan to version 0.0.31 or later. No workaround is available, and neither downgrading nor relying on file extension renaming is recommended.
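A deployment can check whether the installed package meets the patched version before trusting scan results. The sketch below uses the standard-library `importlib.metadata`; `is_patched` and `installed_picklescan_is_patched` are hypothetical helper names, and the parser assumes a purely numeric `X.Y.Z` version string (a library such as `packaging` would be more robust for pre-release tags).

```python
from importlib import metadata

FIRST_PATCHED = (0, 0, 31)  # first patched release, per the advisory


def is_patched(version_text: str) -> bool:
    """Compare a numeric dotted version string against the first patched release."""
    parts = tuple(int(part) for part in version_text.split("."))
    return parts >= FIRST_PATCHED


def installed_picklescan_is_patched():
    """Return True/False for the installed package, or None if it is absent."""
    try:
        return is_patched(metadata.version("picklescan"))
    except metadata.PackageNotFoundError:
        return None


print(installed_picklescan_is_patched())
```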
AI Insight generated on May 19, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| picklescan (PyPI) | < 0.0.31 | 0.0.31 |
Affected products
- Range: <=0.0.30
- mmaitre314/picklescan (v5), Range: 0
Patches
28a7b4ef7534 Fix various vulnerabilities (#50)
6 files changed · +74 −17

setup.cfg · +1 −1 · modified

    @@ -1,6 +1,6 @@
     [metadata]
     name = picklescan
    -version = 0.0.30
    +version = 0.0.31
     author = Matthieu Maitre
     author_email = mmaitre314@users.noreply.github.com
     description = Security scanner detecting Python Pickle files performing suspicious actions

src/picklescan/relaxed_zipfile.py · +7 −2 · modified

    @@ -1,7 +1,7 @@
     # A more forgiving implementation of zipfile.ZipFile
     # Modified from Python code at
     # https://github.com/python/cpython/blob/edb69578ed74ff04ab78ab953355faa343a7e0ee/Lib/zipfile/__init__.py#L1606
    -# Changes: removed flag/password/filename checks to align better with PyTorch's zip decoding
    +# Changes: removed flag/password/filename/CRC checks to align better with PyTorch's zip decoding

     import struct
     import zipfile
    @@ -85,7 +85,12 @@ def open(self, name, mode="r", pwd=None, *, force_zip64=False):
                 if fheader[_FH_EXTRA_FIELD_LENGTH]:
                     zef_file.read(fheader[_FH_EXTRA_FIELD_LENGTH])

    -            return zipfile.ZipExtFile(zef_file, mode, zinfo, pwd, True)
    +            zef = zipfile.ZipExtFile(zef_file, mode, zinfo, pwd, True)
    +
    +            # Disable CRC validation as PyTorch may not use it
    +            zef._expected_crc = None
    +
    +            return zef
             except BaseException:
                 zef_file.close()
                 raise

src/picklescan/scanner.py · +42 −14 · modified

    @@ -116,7 +116,7 @@ def __str__(self) -> str:
             "open",
             "breakpoint",
         },  # Pickle versions 3, 4 have those function under 'builtins'
    -    "aiohttp.client": "*",
    +    "aiohttp": "*",
         "asyncio": "*",
         "bdb": "*",
         "commands": "*",  # Python 2 precursor to subprocess
    @@ -134,7 +134,6 @@ def __str__(self) -> str:
         "ssl": "*",  # DNS exfiltration via ssl.get_server_certificate()
         "subprocess": "*",
         "sys": "*",
    -    "asyncio.unix_events": {"_UnixSubprocessTransport._start"},
         "code": {"InteractiveInterpreter.runcode"},
         "cProfile": {"runctx", "run"},
         "doctest": {"debug_script"},
    @@ -257,6 +256,7 @@ def _list_globals(data: IO[bytes], multiple_pickles=True) -> Set[Tuple[str, str]
                 for op in pickletools.genops(data):
                     ops.append(op)
             except Exception as e:
    +            _log.debug(f"Error parsing pickle: {e}", exc_info=True)
                 parsing_pkl_error = str(e)
             last_byte = data.read(1)
             data.seek(-1, 1)
    @@ -329,6 +329,11 @@ def _build_scan_result_from_raw_globals(
             g = Global(rg[0], rg[1], SafetyLevel.Dangerous)
             safe_filter = _safe_globals.get(g.module)
             unsafe_filter = _unsafe_globals.get(g.module)
    +
    +        # If the module as a whole is marked as dangerous, submodules are also dangerous
    +        if unsafe_filter is None and "." in g.module and _unsafe_globals.get(g.module.split(".")[0]) == "*":
    +            unsafe_filter = "*"
    +
             if "unknown" in g.module or "unknown" in g.name:
                 g.safety = SafetyLevel.Dangerous
                 _log.warning("%s: %s import '%s %s' FOUND", file_id, g.safety.value, g.module, g.name)
    @@ -348,11 +353,12 @@ def _build_scan_result_from_raw_globals(

     def scan_pickle_bytes(data: IO[bytes], file_id, multiple_pickles=True) -> ScanResult:
         """Disassemble a Pickle stream and report issues"""
    +    _log.debug(f"scan_pickle_bytes({file_id})")

         try:
             raw_globals = _list_globals(data, multiple_pickles)
         except GenOpsError as e:
    -        _log.error(f"ERROR: parsing pickle in {file_id}: {e}")
    +        _log.error(f"ERROR: parsing pickle in {file_id}: {e}", exc_info=_log.isEnabledFor(logging.DEBUG))
             if e.globals is not None:
                 return _build_scan_result_from_raw_globals(e.globals, file_id, scan_err=True)
             else:
    @@ -365,6 +371,8 @@ def scan_pickle_bytes(data: IO[bytes], file_id, multiple_pickles=True) -> ScanRe

     # XXX: it appears there is not way to get the byte stream for a given file within the 7z archive and thus forcing us to unzip to disk before scanning
     def scan_7z_bytes(data: IO[bytes], file_id) -> ScanResult:
    +    _log.debug(f"scan_7z_bytes({file_id})")
    +
         try:
             import py7zr
         except ImportError:
    @@ -387,6 +395,8 @@ def scan_7z_bytes(data: IO[bytes], file_id) -> ScanResult:


     def scan_zip_bytes(data: IO[bytes], file_id) -> ScanResult:
    +    _log.debug(f"scan_zip_bytes({file_id})")
    +
         result = ScanResult([])

         with RelaxedZipFile(data, "r") as zip:
    @@ -415,6 +425,8 @@ def scan_zip_bytes(data: IO[bytes], file_id) -> ScanResult:


     def scan_numpy(data: IO[bytes], file_id) -> ScanResult:
    +    _log.debug(f"scan_numpy({file_id})")
    +
         # Delay import to avoid dependency on NumPy
         import numpy as np

    @@ -445,6 +457,8 @@ def scan_numpy(data: IO[bytes], file_id) -> ScanResult:


     def scan_pytorch(data: IO[bytes], file_id) -> ScanResult:
    +    _log.debug(f"scan_pytorch({file_id})")
    +
         # new pytorch format
         if _is_zipfile(data):
             return scan_zip_bytes(data, file_id)
    @@ -473,26 +487,34 @@ def scan_pytorch(data: IO[bytes], file_id) -> ScanResult:


     def scan_bytes(data: IO[bytes], file_id, file_ext: Optional[str] = None) -> ScanResult:
    +    _log.debug(f"scan_bytes({file_id})")
    +
         if file_ext is not None and file_ext in _pytorch_file_extensions:
             try:
                 return scan_pytorch(data, file_id)
             except InvalidMagicError as e:
    -            _log.error(f"ERROR: Invalid magic number for file {e}")
    -            return ScanResult([], scan_err=True)
    -    elif file_ext is not None and file_ext in _numpy_file_extensions:
    +            _log.warning(
    +                f"WARNING: Invalid PyTorch magic number for file {e}. Trying to scan as non-PyTorch file.",
    +                exc_info=_log.isEnabledFor(logging.DEBUG),
    +            )
    +            data.seek(0)
    +
    +    if file_ext is not None and file_ext in _numpy_file_extensions:
             return scan_numpy(data, file_id)
    +
    +    is_zip = zipfile.is_zipfile(data)
    +    data.seek(0)
    +    if is_zip:
    +        return scan_zip_bytes(data, file_id)
    +    elif _is_7z_file(data):
    +        return scan_7z_bytes(data, file_id)
         else:
    -        is_zip = zipfile.is_zipfile(data)
    -        data.seek(0)
    -        if is_zip:
    -            return scan_zip_bytes(data, file_id)
    -        elif _is_7z_file(data):
    -            return scan_7z_bytes(data, file_id)
    -        else:
    -            return scan_pickle_bytes(data, file_id)
    +        return scan_pickle_bytes(data, file_id)


     def scan_huggingface_model(repo_id):
    +    _log.debug(f"scan_huggingface_model({repo_id})")
    +
         # List model files
         model = json.loads(_http_get(f"https://huggingface.co/api/models/{repo_id}").decode("utf-8"))
         file_names = [file_name for file_name in (sibling.get("rfilename") for sibling in model["siblings"]) if file_name is not None]
    @@ -512,6 +534,8 @@ def scan_huggingface_model(repo_id):


     def scan_directory_path(path) -> ScanResult:
    +    _log.debug(f"scan_directory_path({path})")
    +
         scan_result = ScanResult([])

         for base_path, _, file_names in os.walk(path):
    @@ -532,10 +556,14 @@ def scan_directory_path(path) -> ScanResult:


     def scan_file_path(path) -> ScanResult:
    +    _log.debug(f"scan_file_path({path})")
    +
         file_ext = os.path.splitext(path)[1]
         with open(path, "rb") as file:
             return scan_bytes(file, path, file_ext)


     def scan_url(url) -> ScanResult:
    +    _log.debug(f"scan_url({url})")
    +
         return scan_bytes(io.BytesIO(_http_get(url)), url)

tests/data2/GHSA-jgw4-cr84-mqxg.bin · +0 −0 · added

tests/data2/malicious1_crc.zip · +0 −0 · added

tests/test_scanner.py · +24 −0 · modified

    @@ -438,6 +438,21 @@ def initialize_corrupt_zip_file_central_directory(path: str, file_name: str, dat
                 f.write(modified_data)


    +def initialize_corrupt_zip_file_crc(path: str, file_name: str, data: bytes):
    +    if not os.path.exists(path):
    +        with io.BytesIO() as buffer:
    +            with zipfile.ZipFile(buffer, "w") as zip:
    +                zip.writestr(file_name, data)
    +            data = buffer.getbuffer().tobytes()
    +
    +        # Corrupt the data, leading to a CRC mismatch
    +        modified_data = data.replace(b"print('456')", b"print('123')", 1)
    +
    +        # Write the corrupted content
    +        with open(path, "wb") as f:
    +            f.write(modified_data)
    +
    +
     def initialize_numpy_files():
         import numpy as np

    @@ -687,6 +702,12 @@ def initialize_pickle_files():
             pickle.dumps(Malicious1(), protocol=4),
         )

    +    initialize_corrupt_zip_file_crc(
    +        f"{_root_path}/data2/malicious1_crc.zip",
    +        "data.pkl",
    +        pickle.dumps(Malicious1(), protocol=4),
    +    )
    +
         initialize_zip_file(
             f"{_root_path}/data/malicious1_wrong_ext.zip",
             "data.txt",  # Pickle file with a non-standard extension
    @@ -744,6 +765,7 @@ def initialize_pickle_files():
         initialize_pickle_file_from_reduce("GHSA-9w88-8rmg-7g2p.pkl", reduce_GHSA_9w88_8rmg_7g2p)
         initialize_pickle_file_from_reduce("GHSA-49gj-c84q-6qm9.pkl", reduce_GHSA_49gj_c84q_6qm9)
         initialize_pickle_file_from_reduce("GHSA-q77w-mwjj-7mqx.pkl", reduce_GHSA_q77w_mwjj_7mqx)
    +    initialize_pickle_file_from_reduce("GHSA-jgw4-cr84-mqxg.bin", reduce_GHSA_q77w_mwjj_7mqx)


     initialize_pickle_files()
    @@ -1022,6 +1044,8 @@ def test_scan_file_path():
         assert_scan("GHSA-9w88-8rmg-7g2p.pkl", [Global("cProfile", "runctx", SafetyLevel.Dangerous)])
         assert_scan("GHSA-49gj-c84q-6qm9.pkl", [Global("cProfile", "run", SafetyLevel.Dangerous)])
         assert_scan("GHSA-q77w-mwjj-7mqx.pkl", [Global("asyncio.unix_events", "_UnixSubprocessTransport._start", SafetyLevel.Dangerous)])
    +    assert_scan("GHSA-jgw4-cr84-mqxg.bin", [Global("asyncio.unix_events", "_UnixSubprocessTransport._start", SafetyLevel.Dangerous)])
    +    assert_scan("malicious1_crc.zip", [Global("builtins", name="eval", safety=SafetyLevel.Dangerous)])


     def test_scan_file_path_npz():
Vulnerability mechanics
Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
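One change in the fix commit is the rule that a wildcard entry for a top-level module also covers its submodules, which would explain why the explicit `asyncio.unix_events` entry could be dropped while the test still expects that module to be flagged. A minimal sketch of that lookup follows. `UNSAFE`, `unsafe_filter_for`, and `is_flagged` are hypothetical names, the table is a tiny excerpt, and `is_flagged` is a simplification of how the scanner combines its filters.

```python
# Tiny excerpt of a deny-list: "*" means every name in the module is unsafe,
# a set means only the listed names are unsafe.
UNSAFE = {
    "aiohttp": "*",
    "asyncio": "*",
    "code": {"InteractiveInterpreter.runcode"},
}


def unsafe_filter_for(module: str):
    """Look up the rule for a module, inheriting "*" from its top-level package."""
    rule = UNSAFE.get(module)
    if rule is None and "." in module and UNSAFE.get(module.split(".")[0]) == "*":
        rule = "*"
    return rule


def is_flagged(module: str, name: str) -> bool:
    """Simplified decision: flagged if the rule is "*" or lists the name."""
    rule = unsafe_filter_for(module)
    if rule is None:
        return False
    return rule == "*" or name in rule


print(is_flagged("asyncio.unix_events", "_UnixSubprocessTransport._start"))
print(is_flagged("code.helpers", "anything"))
```

Only a wildcard is inherited: a submodule of a package whose rule is a set of names (such as `code` here) does not pick up that set.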
References
- github.com/mmaitre314/picklescan/security/advisories/GHSA-jgw4-cr84-mqxg (ghsa, exploit, vendor-advisory; WEB)
- github.com/advisories/GHSA-jgw4-cr84-mqxg (ghsa; ADVISORY)
- nvd.nist.gov/vuln/detail/CVE-2025-10155 (ghsa; ADVISORY)
- github.com/mmaitre314/picklescan/blob/58983e1c20973ac42f2df7ff15d7c8cd32f9b688/src/picklescan/scanner.py (ghsa; WEB)
- github.com/mmaitre314/picklescan/commit/28a7b4ef753466572bda3313737116eeb9b4e5c5 (ghsa; WEB)