CVE-2026-10801
Description
modelscope ms-swift uses a weak hash for PIL image caching, allowing different images to share cache keys and potentially leading to incorrect image loading.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
Vulnerability
A vulnerability exists in modelscope ms-swift up to version 4.2.0 in the Template._save_pil_image function in swift/template/base.py. The function derives the PIL image cache key as SHA256(image.tobytes()), which hashes only the raw pixel bytes and ignores the image dimensions and mode. The weakness lies in the hash input rather than in SHA-256 itself. Images that have identical raw pixel bytes but different dimensions, and so render differently, map to the same cache path [2].
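A minimal sketch of the collision, assuming only that the pre-fix key is the SHA-256 of the raw pixel buffer as described above. `FakeImage` and `old_cache_key` are hypothetical names used for illustration, not ms-swift or Pillow APIs; the image is modeled as a small dataclass so the sketch runs without Pillow installed.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeImage:
    """Hypothetical stand-in for a PIL image: mode, size, and raw pixel buffer."""
    mode: str
    width: int
    height: int
    data: bytes

    def tobytes(self) -> bytes:
        return self.data


def old_cache_key(image: FakeImage) -> str:
    """Assumed pre-fix behavior: hash the raw pixel bytes and nothing else."""
    return hashlib.sha256(image.tobytes()).hexdigest()


# One RGB buffer holding 120 * 80 = 9600 pixels, 3 bytes per pixel.
raw = bytes((255, 60, 60)) * (120 * 80)

# The same buffer is a valid 120x80 image and a valid 80x120 image.
landscape = FakeImage('RGB', 120, 80, raw)
portrait = FakeImage('RGB', 80, 120, raw)

print(landscape != portrait)                                # True: different images
print(old_cache_key(landscape) == old_cache_key(portrait))  # True: same cache key
```

Any pair of shapes with the same pixel count and mode produces the same result, since the width and height never reach the hash.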
Exploitation
An attacker with local access can exploit this by supplying two images with the same raw pixel bytes but different dimensions (e.g., 120x80 and 80x120). When Template._save_pil_image is called for each image, both calls resolve to the same cache path. Once the first image is cached, the second reuses the first image's cached file instead of saving its own content [2]. Exploitation is rated as high complexity and difficult to carry out [1].
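The reuse step can be sketched with a write-if-absent cache in a temporary directory. This is an illustration under stated assumptions, not the ms-swift implementation: `save_once`, the `.txt` extension, and storing the dimensions as file content (in place of an encoded image) are all hypothetical.

```python
import hashlib
import os
import tempfile


def save_once(cache_dir: str, size: tuple, raw: bytes) -> str:
    """Hypothetical write-if-absent cache keyed on raw bytes only (pre-fix style)."""
    key = hashlib.sha256(raw).hexdigest()
    path = os.path.join(cache_dir, key + '.txt')
    if not os.path.exists(path):
        # Record the dimensions; this stands in for writing the image file.
        with open(path, 'w') as f:
            f.write(f'{size[0]}x{size[1]}')
    return path


raw = bytes((255, 60, 60)) * (120 * 80)

with tempfile.TemporaryDirectory() as cache_dir:
    path_a = save_once(cache_dir, (120, 80), raw)   # first image is cached
    path_b = save_once(cache_dir, (80, 120), raw)   # second image, same bytes
    with open(path_b) as f:
        loaded = f.read()

print(path_a == path_b, loaded)  # True 120x80
```

The second caller asked for an 80x120 image but gets back the file written for the 120x80 one, which is the substitution the advisory describes.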
Impact
Successful exploitation can cause multimodal inference or training to receive the wrong image: a later sample that submits one image may load a different, previously cached image because the two share a cache key. This can corrupt training data or produce incorrect model outputs during inference, undermining the integrity of the system's operations [2].
Mitigation
A pull request has been submitted that fixes the issue by prepending the image mode and dimensions to the raw pixel bytes in the hash input [3]. No fixed version or release date has been disclosed. The available references provide no workaround, and the vulnerability is not listed as actively exploited in the wild [1, 3].
- [1] GitHub - modelscope/ms-swift: Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, Phi4, ...) (AAAI 2025).
- [2] Image Cache Hash Collision via Missing Dimension Metadata
- [3] Fix PIL image cache key collisions across dimensions by 3em0 · Pull Request #9359 · modelscope/ms-swift
AI Insight generated on Jun 4, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected products
- (no CPE)
- (no CPE), range: <=4.2.0
Patches
1 patch
46ec65acb6c0 Merge 0a921cc95651fd983ebe63f7357b93b37faa42f4 into 01b3864ed56359aea70d0935ee58736044b7581c
2 files changed · +28 −2
swift/template/base.py +2 −2 modified
@@ -792,8 +792,8 @@ def prepare_generate_kwargs(self, generate_kwargs: Dict[str, Any], *, model=None
     @staticmethod
     def _save_pil_image(image: Image.Image) -> str:
-        img_bytes = image.tobytes()
-        img_hash = hashlib.sha256(img_bytes).hexdigest()
+        img_meta = f'{image.mode}:{image.width}:{image.height}:'.encode()
+        img_hash = hashlib.sha256(img_meta + image.tobytes()).hexdigest()
         tmp_dir = os.path.join(get_cache_dir(), 'tmp', 'images')
         logger.info_once(f'create tmp_dir: {tmp_dir}')
         os.makedirs(tmp_dir, exist_ok=True)

tests/llm/test_template.py +26 −0 modified
@@ -1,10 +1,14 @@
 import os
+import tempfile
 import torch
 import unittest
+from unittest.mock import patch
+from PIL import Image
 
 from swift.infer_engine import RequestConfig, TransformersEngine
 from swift.model import get_processor
 from swift.template import get_template
+from swift.template.base import Template
 from swift.utils import get_logger, seed_everything
 
 # os.environ['CUDA_VISIBLE_DEVICES'] = '0'
@@ -103,6 +107,28 @@ def test_tool_message_join(self):
                         f'{observation}tool2\n{observation}tool3\n')
         assert res == ground_truth
 
+    def test_save_pil_image_uses_dimensions_in_cache_key(self):
+        width_a, height_a = 120, 80
+        width_b, height_b = 80, 120
+        self.assertEqual(width_a * height_a, width_b * height_b)
+
+        pixels = bytearray()
+        for i in range(width_a * height_a):
+            row = i // width_a
+            pixels.extend((255, 60, 60) if row % 10 < 5 else (60, 60, 255))
+        img_bytes = bytes(pixels)
+
+        image_a = Image.frombytes('RGB', (width_a, height_a), img_bytes)
+        image_b = Image.frombytes('RGB', (width_b, height_b), img_bytes)
+
+        with tempfile.TemporaryDirectory() as cache_dir, patch('swift.template.base.get_cache_dir', return_value=cache_dir):
+            path_a = Template._save_pil_image(image_a)
+            path_b = Template._save_pil_image(image_b)
+
+        self.assertNotEqual(path_a, path_b)
+        self.assertEqual(Image.open(path_a).size, (width_a, height_a))
+        self.assertEqual(Image.open(path_b).size, (width_b, height_b))
+
 
 if __name__ == '__main__':
     unittest.main()
Vulnerability mechanics
Root cause
"The image cache hash generation did not include image dimensions, leading to collisions."
Attack vector
An attacker must have local access to the affected system. The vulnerability is triggered when `Template._save_pil_image` is called. By providing images with identical raw bytes but different dimensions, an attacker can force cache key collisions, so that a later image resolves to the cached file path of an earlier one.
Affected code
The vulnerability resides in the `Template._save_pil_image` function in `swift/template/base.py`. Before the fix, this function generated the PIL image cache key by hashing only `image.tobytes()`.
What the fix does
The pull request changes the hash input in `Template._save_pil_image` to include the image mode and dimensions ahead of the raw pixel bytes. Images with the same byte content but different dimensions then produce distinct cache keys. A regression test was added to verify that two images with identical raw bytes but different sizes receive distinct cache paths and keep their correct dimensions when read back from the cache.
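A short sketch of the patched key derivation, following the `'{mode}:{width}:{height}:'` prefix shown in the diff. The function name `fixed_cache_key` and its plain-argument signature are hypothetical; the real method takes a PIL image and reads these fields from it.

```python
import hashlib


def fixed_cache_key(mode: str, width: int, height: int, raw: bytes) -> str:
    """Mirror of the patched hash input: metadata prefix, then raw pixel bytes."""
    meta = f'{mode}:{width}:{height}:'.encode()
    return hashlib.sha256(meta + raw).hexdigest()


# 28800 bytes: valid as RGB 120x80, RGB 80x120, or 8-bit grayscale ('L') 240x120.
raw = bytes((255, 60, 60)) * (120 * 80)

key_a = fixed_cache_key('RGB', 120, 80, raw)
key_b = fixed_cache_key('RGB', 80, 120, raw)
key_c = fixed_cache_key('L', 240, 120, raw)

print(len({key_a, key_b, key_c}))  # 3
```

Because the mode is part of the prefix, the same buffer interpreted under a different pixel format also gets its own key, not only buffers that differ in width and height.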
Preconditions
- input: Local access to the system.
- config: High degree of complexity required for the attack.
Generated on Jun 4, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References: 7
News mentions: 0 (no linked articles in our index yet)