VYPR
Low severity 3.6 · NVD Advisory · Published Jun 4, 2026 · Updated Jun 4, 2026

CVE-2026-10813

Description

LMCache's KV Cache Handler uses a weak 16-bit hash for image identifiers, allowing cache poisoning and incorrect model output.

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

Vulnerability

A flaw exists in LMCache up to version 0.4.6 within the hex_hash_to_int16 function in lmcache/integration/vllm/utils.py. This function truncates multimodal image identifiers to a 16-bit value, creating a limited hash space of 65,536 possible values. This affects the KV Cache Handler component.

Exploitation

An attacker can exploit this vulnerability by generating a few hundred distinct multimodal image identifiers. Because the 16-bit hash space holds only 65,536 values, the birthday paradox makes a collision likely at that scale. When two different images produce the same 16-bit hash, the token IDs written into their image placeholder regions become identical, so prompts that otherwise match yield the same cache key. Exploitation requires local access and high attack complexity, which makes it difficult in practice [2].
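The "few hundred identifiers" figure can be checked with the standard birthday approximation. The sketch below assumes the derived values are uniformly distributed over the hash space, which is reasonable for the SHA-256 fallback but only an assumption for arbitrary hex identifiers; `collision_probability` is a hypothetical helper, not part of LMCache.

```python
import math


def collision_probability(n: int, bits: int) -> float:
    """Approximate chance that n uniform values in a 2**bits space collide.

    Uses the birthday approximation 1 - exp(-n * (n - 1) / (2 * N)).
    """
    space = 2 ** bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n * (n - 1) / (2 * space))


if __name__ == "__main__":
    for n in (100, 300, 500, 1000):
        p16 = collision_probability(n, 16)
        p63 = collision_probability(n, 63)
        print(f"n={n:5d}  16-bit: {p16:.3f}  63-bit: {p63:.3e}")
```

Under this approximation the 16-bit collision probability reaches roughly one half at about 300 identifiers, while the same working set is many orders of magnitude away from a collision in a 63-bit space.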

Impact

Successful exploitation allows an attacker to poison the KV cache. This means LMCache can incorrectly return KV cache data generated for a different image, leading to incorrect model output. It may also expose cross-user visual context through this cache poisoning mechanism [2].

Mitigation

A pull request has been submitted to address this issue by widening the derived hash from 16 bits to a signed-int64-safe value, significantly increasing the collision threshold. The proposed fix adds hex_hash_to_int64, which masks the hex path with 0x7FFFFFFFFFFFFFFF (63 bits, so the result fits in a torch.long tensor) and reads 8 bytes from the SHA-256 fallback instead of 2 before applying the same mask. hex_hash_to_int16 is kept as a deprecated wrapper that forwards to the new function. As of the available references, a patched version is not yet released, and the pull request awaits acceptance [3].

AI Insight generated on Jun 4, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.

Affected products

2
  • Lmcache/Lmcache (2 versions)
    • (no CPE)
    • (no CPE), range: <=0.4.6

Patches

1
8401bc587fec

Merge fe7b3e432d492fae6288e1859d7ea39acdf65e86 into 038f27d6c7d8ea7992f4f41c5f5a261b657edf3c

https://github.com/lmcache/lmcache · 3em0 · Jun 4, 2026 · via nvd-ref
2 files changed · +75 -33
  • lmcache/integration/vllm/tests/test_mm_hash_utils.py +47 -28 modified
    @@ -3,82 +3,101 @@
     import dataclasses
     
     # Third Party
    +import pytest
     import torch
     
     # First Party
     from lmcache.integration.vllm.utils import (
         apply_mm_hashes_to_token_ids,
         hex_hash_to_int16,
    +    hex_hash_to_int64,
     )
     
    +INT64_MAX = 0x7FFFFFFFFFFFFFFF
    +
     
     @dataclasses.dataclass(frozen=True)
     class DummyPlaceholderRange:
         offset: int
         length: int
     
     
    -def test_hex_hash_to_int16_accepts_hex_and_non_hex() -> None:
    +def test_hex_hash_to_int64_accepts_hex_and_non_hex() -> None:
         # Hex behavior preserved (with and without 0x prefix).
    -    assert hex_hash_to_int16("0000") == 0
    -    assert hex_hash_to_int16("ffff") == 0xFFFF
    -    assert hex_hash_to_int16("0xFFFF") == 0xFFFF
    -    assert hex_hash_to_int16("0x0001") == 1
    +    assert hex_hash_to_int64("0000") == 0
    +    assert hex_hash_to_int64("ffff") == 0xFFFF
    +    assert hex_hash_to_int64("0xFFFF") == 0xFFFF
    +    assert hex_hash_to_int64("0x0001") == 1
     
         # Non-hex identifiers must not raise and must be deterministic.
         s = "chatcmpl-a2a48871c4aad192-image-0"
    -    v1 = hex_hash_to_int16(s)
    -    v2 = hex_hash_to_int16(s)
    +    v1 = hex_hash_to_int64(s)
    +    v2 = hex_hash_to_int64(s)
         assert isinstance(v1, int)
    -    assert 0 <= v1 <= 0xFFFF
    +    assert 0 <= v1 <= INT64_MAX
         assert v1 == v2
     
     
    -def test_hex_hash_to_int16_hex_variants_whitespace_and_truncation() -> None:
    +def test_hex_hash_to_int64_hex_variants_whitespace_and_truncation() -> None:
         # Whitespace should be ignored and case should not matter.
    -    assert hex_hash_to_int16(" FfFf ") == 0xFFFF
    -    assert hex_hash_to_int16("\n0x00aB\t") == 0x00AB
    +    assert hex_hash_to_int64(" FfFf ") == 0xFFFF
    +    assert hex_hash_to_int64("\n0x00aB\t") == 0x00AB
     
    -    # Long hex should be truncated to 16 bits via masking.
    -    assert hex_hash_to_int16("123456") == 0x3456
    -    assert hex_hash_to_int16("0x123456") == 0x3456
    +    # Long hex should be masked to signed int64 range.
    +    assert hex_hash_to_int64("123456") == 0x123456
    +    assert hex_hash_to_int64("0xFFFFFFFFFFFFFFFF") == INT64_MAX
     
     
    -def test_hex_hash_to_int16_empty_and_invalid_hex_are_safe_and_deterministic() -> None:
    +def test_hex_hash_to_int64_empty_and_invalid_hex_are_safe_and_deterministic() -> None:
         # Empty (or effectively empty) values should not raise.
         for s in ("", "   ", "0x"):
    -        v1 = hex_hash_to_int16(s)
    -        v2 = hex_hash_to_int16(s)
    +        v1 = hex_hash_to_int64(s)
    +        v2 = hex_hash_to_int64(s)
             assert isinstance(v1, int)
    -        assert 0 <= v1 <= 0xFFFF
    +        assert 0 <= v1 <= INT64_MAX
             assert v1 == v2
     
         # Invalid "hex-looking" strings must fall back to hashing.
         for s in ("0xGG", "deadbeeg", "0x12xz"):
    -        v1 = hex_hash_to_int16(s)
    -        v2 = hex_hash_to_int16(s)
    +        v1 = hex_hash_to_int64(s)
    +        v2 = hex_hash_to_int64(s)
             assert isinstance(v1, int)
    -        assert 0 <= v1 <= 0xFFFF
    +        assert 0 <= v1 <= INT64_MAX
             assert v1 == v2
     
     
    -def test_hex_hash_to_int16_non_string_inputs_are_safe() -> None:
    +def test_hex_hash_to_int64_non_string_inputs_are_safe() -> None:
         # Be defensive: callers may pass None or other non-string types.
         for val in (None, 0, 12345, 3.14, b"deadbeef"):
    -        v1 = hex_hash_to_int16(val)  # type: ignore[arg-type]
    -        v2 = hex_hash_to_int16(val)  # type: ignore[arg-type]
    +        v1 = hex_hash_to_int64(val)  # type: ignore[arg-type]
    +        v2 = hex_hash_to_int64(val)  # type: ignore[arg-type]
             assert isinstance(v1, int)
    -        assert 0 <= v1 <= 0xFFFF
    +        assert 0 <= v1 <= INT64_MAX
             assert v1 == v2
     
     
    +def test_hex_hash_to_int16_deprecated_alias_matches_int64() -> None:
    +    s = "chatcmpl-a2a48871c4aad192-image-0"
    +    with pytest.deprecated_call(match="hex_hash_to_int16 is deprecated"):
    +        legacy_value = hex_hash_to_int16(s)
    +    assert legacy_value == hex_hash_to_int64(s)
    +
    +
    +def test_hex_hash_to_int64_different_inputs_do_not_collide_in_working_set() -> None:
    +    seen: set[int] = set()
    +    for i in range(10_000):
    +        h = hex_hash_to_int64(f"chatcmpl-test-{i:05d}-image-0")
    +        assert h not in seen
    +        seen.add(h)
    +
    +
     def test_apply_mm_hashes_to_token_ids_handles_non_hex_mm_hash() -> None:
         token_ids = torch.arange(0, 10, dtype=torch.long)
         mm_hashes = ["chatcmpl-a2a48871c4aad192-image-0"]
         mm_positions = [DummyPlaceholderRange(offset=2, length=4)]
     
         out = apply_mm_hashes_to_token_ids(token_ids.clone(), mm_hashes, mm_positions)
    -    expected_val = hex_hash_to_int16(mm_hashes[0])
    +    expected_val = hex_hash_to_int64(mm_hashes[0])
         assert out[2:6].tolist() == [expected_val] * 4
     
     
    @@ -102,8 +121,8 @@ def test_apply_mm_hashes_to_token_ids_multiple_placeholders_and_length_mismatch(
         ]
     
         out = apply_mm_hashes_to_token_ids(token_ids.clone(), mm_hashes, mm_positions)
    -    v0 = hex_hash_to_int16(mm_hashes[0])
    -    v1 = hex_hash_to_int16(mm_hashes[1])
    +    v0 = hex_hash_to_int64(mm_hashes[0])
    +    v1 = hex_hash_to_int64(mm_hashes[1])
     
         assert out[0:3].tolist() == [v0] * 3
         assert out[5:9].tolist() == [v1] * 4
    
  • lmcache/integration/vllm/utils.py +28 -5 modified
    @@ -5,6 +5,7 @@
     import os
     import string
     import threading
    +import warnings
     
     if TYPE_CHECKING:
         from vllm.config import ModelConfig, VllmConfig
    @@ -141,14 +142,33 @@ def create_lmcache_ec_config() -> LMCacheEngineConfig:
     
     
     def hex_hash_to_int16(s: str) -> int:
    +    """Deprecated: use ``hex_hash_to_int64`` instead.
    +
    +    This wrapper exists for backward compatibility. It now returns the
    +    signed-int64-safe value from ``hex_hash_to_int64`` rather than truncating
    +    multimodal identifiers to 16 bits.
         """
    -    Convert a hash identifier into a 16-bit integer.
    +    warnings.warn(
    +        "hex_hash_to_int16 is deprecated and now returns a signed-int64-safe "
    +        "value. Use hex_hash_to_int64 instead.",
    +        DeprecationWarning,
    +        stacklevel=2,
    +    )
    +    return hex_hash_to_int64(s)
    +
    +
    +def hex_hash_to_int64(s: str) -> int:
    +    """Convert a hash identifier into a signed-int64-safe integer.
     
         Historically, LMCache expected multimodal identifiers to be hex strings.
         In practice (e.g., OpenAI-style multimodal requests), identifiers may be
         arbitrary strings like `chatcmpl-...-image-0`. This function therefore:
           - Parses hex strings (optionally prefixed with `0x`) as before, or
           - Falls back to a stable string hash (SHA-256) when the input is not hex.
    +
    +    The result is masked to 63 bits so it fits in a torch.long (signed int64)
    +    tensor without overflow. Previous versions truncated to 16 bits, which made
    +    birthday-paradox collisions likely after only a few hundred distinct images.
         """
         # Be defensive: vLLM may pass non-string identifiers.
         s = "" if s is None else str(s)
    @@ -158,14 +178,17 @@ def hex_hash_to_int16(s: str) -> int:
         hex_part = s_stripped[2:] if s_stripped.lower().startswith("0x") else s_stripped
         if hex_part and all(c in string.hexdigits for c in hex_part):
             try:
    -            return int(hex_part, 16) & 0xFFFF
    +            return int(hex_part, 16) & 0x7FFFFFFFFFFFFFFF
             except ValueError:
                 # Extremely unlikely (e.g., oversized/odd formatting); fall back to hashing.
                 pass
     
    -    # Fallback: stable 16-bit value derived from the full identifier string.
    +    # Fallback: stable 63-bit value derived from the full identifier string.
         digest = hashlib.sha256(s_stripped.encode("utf-8")).digest()
    -    return int.from_bytes(digest[:2], byteorder="big", signed=False)
    +    return (
    +        int.from_bytes(digest[:8], byteorder="big", signed=False)
    +        & 0x7FFFFFFFFFFFFFFF
    +    )
     
     
     def apply_mm_hashes_to_token_ids(
    @@ -183,7 +206,7 @@ def apply_mm_hashes_to_token_ids(
             if start >= n:
                 continue
             end = min(start + length, n)
    -        token_ids[start:end] = hex_hash_to_int16(hash_str)
    +        token_ids[start:end] = hex_hash_to_int64(hash_str)
         return token_ids
     
     
    

Vulnerability mechanics

Root cause

"The function hex_hash_to_int16 reduces multimodal identifiers to a 16-bit value, causing collisions and KV cache poisoning."

Attack vector

An attacker can trigger this vulnerability by submitting multiple distinct multimodal image identifiers to LMCache. These identifiers are processed by the `hex_hash_to_int16` function, which collapses them into a 16-bit value. When two different images produce the same 16-bit hash, LMCache can incorrectly associate KV cache data from one image with a request for another. This can lead to incorrect model outputs and potential exposure of cross-user visual context through cache poisoning [ref_id=1]. The attack requires local access and a high level of complexity, making exploitability difficult [ref_id=1].

Affected code

The vulnerability resides in the `hex_hash_to_int16` function located in the file `lmcache/integration/vllm/utils.py`. This function is responsible for reducing multimodal identifiers, such as image hashes, to a 16-bit value for use in cache key computation [ref_id=1].
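To illustrate how the 16-bit reduction feeds into cache keys, the sketch below re-implements the pre-patch logic from the diff in plain Python. It makes several assumptions: `legacy_hash16` mirrors the removed lines (the `strip()` step is inferred from the tests, since that part of the function is not shown in the diff), a Python list stands in for the `torch.long` tensor, and `apply_hash` is a simplified, hypothetical stand-in for `apply_mm_hashes_to_token_ids` handling a single placeholder.

```python
import hashlib
import string


def legacy_hash16(s) -> int:
    """Approximation of the pre-patch 16-bit reduction (not the LMCache source)."""
    s = "" if s is None else str(s)
    s_stripped = s.strip()  # assumed; this line is outside the diff hunks
    hex_part = s_stripped[2:] if s_stripped.lower().startswith("0x") else s_stripped
    if hex_part and all(c in string.hexdigits for c in hex_part):
        # Hex path: only the low 16 bits survive.
        return int(hex_part, 16) & 0xFFFF
    # Fallback path: only the first 2 bytes of the SHA-256 digest survive.
    digest = hashlib.sha256(s_stripped.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], byteorder="big", signed=False)


def apply_hash(token_ids, hash_str, offset, length):
    """Overwrite one placeholder range with the derived hash value."""
    out = list(token_ids)
    end = min(offset + length, len(out))
    for i in range(offset, end):
        out[i] = legacy_hash16(hash_str)
    return out


if __name__ == "__main__":
    prompt = list(range(10))
    # Two different hex identifiers that share their low 16 bits (0x3456).
    a = apply_hash(prompt, "123456", offset=2, length=4)
    b = apply_hash(prompt, "ab3456", offset=2, length=4)
    print(a)
    print(a == b)  # identical token IDs, hence identical cache-key input
```

Because the two token sequences are equal, any cache key computed from them is equal as well, which is the condition the advisory describes.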

What the fix does

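The advisory suggests expanding the derived multimodal hash value from 16 bits to at least 64 bits to prevent collisions, aligning it with the 64-bit token IDs used in cache key computation, and renaming the function to reflect the change, for example `hex_hash_to_int64` [ref_id=1]. The fix commit adds `hex_hash_to_int64`, which masks both the hex path and the first 8 bytes of the SHA-256 fallback with `0x7FFFFFFFFFFFFFFF`, giving a 63-bit value that fits in a signed `torch.long` tensor. `hex_hash_to_int16` is kept as a deprecated wrapper that emits a `DeprecationWarning` and returns the `hex_hash_to_int64` result, and `apply_mm_hashes_to_token_ids` now calls the new function. The pull request is still awaiting acceptance [ref_id=1].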
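The patch diff on this page masks with `0x7FFFFFFFFFFFFFFF` rather than a full 64-bit mask, and its docstring gives the reason: the value must fit in a signed int64 tensor. A minimal sketch of that constraint, using only the standard library (`fits_signed_int64` is a hypothetical helper for illustration):

```python
INT64_MAX = 0x7FFFFFFFFFFFFFFF  # same constant the patched tests use


def fits_signed_int64(value: int) -> bool:
    """Return True if value is representable as a signed 64-bit integer."""
    try:
        value.to_bytes(8, byteorder="big", signed=True)
        return True
    except OverflowError:
        return False


if __name__ == "__main__":
    raw = 0xFFFFFFFFFFFFFFFF       # a full unsigned 64-bit value
    masked = raw & INT64_MAX       # what the patched hex path returns
    print(fits_signed_int64(raw))     # False: exceeds the signed range
    print(fits_signed_int64(masked))  # True: 63 bits always fit
```

Dropping the top bit costs one bit of entropy but keeps every derived value non-negative and storable alongside ordinary token IDs.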

Preconditions

  • input: The attacker must submit multiple distinct multimodal image identifiers.
  • network: The deployment must accept multimodal requests.

Reproduction

Use LMCache with vLLM multimodal requests where image identifiers are passed through hex_hash_to_int16() and apply_mm_hashes_to_token_ids(). Generate many distinct multimodal image identifiers, for example OpenAI-style identifiers such as chatcmpl-<uuid>-image-<n>. Convert each identifier with the same 16-bit logic used by hex_hash_to_int16(). Observe that collisions typically occur after only a few hundred inputs due to the birthday paradox. Submit two requests whose distinct image identifiers collide in the 16-bit multimodal hash value and share the relevant prompt/token structure. Observe that the modified token IDs can produce the same cache key for different image contexts, allowing a cache hit for the wrong KV cache [ref_id=1].
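The identifier-conversion step above can be sketched offline, without LMCache or vLLM. The code below is an illustration under stated assumptions: `fallback16` and `fallback63` mirror the pre-patch and patched SHA-256 fallback paths shown in the diff (non-hex identifiers only), the identifier pattern is borrowed from the patched test suite, and `first_collision` is a hypothetical helper.

```python
import hashlib


def fallback16(identifier: str) -> int:
    """Pre-patch fallback: first 2 bytes of SHA-256."""
    digest = hashlib.sha256(identifier.strip().encode("utf-8")).digest()
    return int.from_bytes(digest[:2], byteorder="big", signed=False)


def fallback63(identifier: str) -> int:
    """Patched fallback: first 8 bytes of SHA-256, masked to 63 bits."""
    digest = hashlib.sha256(identifier.strip().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], byteorder="big", signed=False) & 0x7FFFFFFFFFFFFFFF


def first_collision(hash_fn, limit: int):
    """Return (earlier_id, later_id, inputs_tried) for the first collision, or None."""
    seen = {}
    for i in range(limit):
        identifier = f"chatcmpl-test-{i:05d}-image-0"
        value = hash_fn(identifier)
        if value in seen:
            return seen[value], identifier, i + 1
        seen[value] = identifier
    return None


if __name__ == "__main__":
    # 70,000 inputs exceed the 65,536 possible values, so a hit is guaranteed.
    hit = first_collision(fallback16, 70_000)
    print("16-bit:", hit)
    # Same working-set size as the patched test suite's collision check.
    print("63-bit:", first_collision(fallback63, 10_000))
```

The third element of the 16-bit result is the number of identifiers generated before the first collision; the exact value depends on the identifier pattern, so it is not stated here.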

Generated on Jun 4, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.

References

6
