VYPR
Low severity · NVD Advisory · Published Feb 7, 2025 · Updated Feb 12, 2025

vLLM using built-in hash() from Python 3.12 leads to predictable hash collisions in vLLM prefix cache

CVE-2025-25183

Description

vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Maliciously constructed statements can lead to hash collisions, resulting in cache reuse, which can interfere with subsequent responses and cause unintended behavior. Prefix caching uses Python's built-in hash() function. As of Python 3.12, hash(None) returns a predictable constant value, which makes it more feasible for someone to try to exploit hash collisions. The impact of a collision is that a request would reuse cache entries generated from different content. Given knowledge of the prompts in use and the predictable hashing behavior, someone could intentionally populate the cache using a prompt known to collide with another prompt in use. This issue has been addressed in version 0.7.2, and all users are advised to upgrade. There are no known workarounds for this vulnerability.
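
To make the predictability concrete, the following sketch compares the hash of a first-block cache key across two fresh interpreter processes. The key layouts mirror the tuple `(is_first_block, prev_block_hash, *token_ids, extra_hash)` that appears in the fix commit. The helper name `hash_in_fresh_process`, the token values, and forcing `PYTHONHASHSEED=random` are assumptions made for illustration and are not part of vLLM.

```python
import os
import subprocess
import sys


def hash_in_fresh_process(expr: str) -> int:
    """Return hash(<expr>) as computed by a brand-new Python interpreter."""
    # Force hash randomization on, in case the caller's environment pins a seed.
    env = {**os.environ, "PYTHONHASHSEED": "random"}
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({expr}))"],
        capture_output=True, text=True, check=True, env=env,
    )
    return int(result.stdout)


# Pre-fix layout: the first block has no parent, so None sits in the key.
OLD_KEY = "(True, None, 101, 102, 103, None)"
# Post-fix layout: the missing parent is replaced by a string hash,
# which is salted differently in each process.
NEW_KEY = "(True, hash('None'), 101, 102, 103, None)"

old_a, old_b = hash_in_fresh_process(OLD_KEY), hash_in_fresh_process(OLD_KEY)
new_a, new_b = hash_in_fresh_process(NEW_KEY), hash_in_fresh_process(NEW_KEY)

print("old key stable across processes:", old_a == old_b)
print("new key stable across processes:", new_a == new_b)
```

On Python 3.12 or later, the old key should hash identically in both processes, while the new key should differ. On earlier versions hash(None) is derived from the object's address, so the first result depends on the build and platform.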

Affected packages

Versions sourced from the GitHub Security Advisory.

Package | Affected versions | Patched versions
vllm (PyPI) | < 0.7.2 | 0.7.2

Affected products

1

Patches

2
73b35cca7f37

[Core] Improve hash collision avoidance in prefix caching (#12621)

https://github.com/vllm-project/vllm · Russell Bryant · Feb 4, 2025 · via ghsa
3 files changed · +45 -10
  • tests/core/block/test_prefix_caching_block.py +2 -2 modified
    @@ -65,8 +65,8 @@ def test_nth_block_has_correct_content_hash(seed: int, block_size: int,
     
             previous_block = MagicMock(spec=PrefixCachingBlock)
             prev_block_hash = random.randint(0, 1000)
    -        previous_block.content_hash = (prev_block_hash
    -                                       if prev_block_has_hash else None)
    +        previous_block.content_hash = (prev_block_hash if prev_block_has_hash
    +                                       else hash('None'))
     
             num_to_fill = block_size if is_curr_block_full else random.randint(
                 0, block_size - 1)
    
  • vllm/core/block/prefix_caching_block.py +34 -8 modified
    @@ -65,6 +65,15 @@ class PrefixCachingBlockAllocator(BlockAllocator):
                 from 0 to num_blocks - 1.
         """
     
    +    # Note that we use 'None' as a string here instead of None because
    +    # as of Python 3.12, hash(None) returns a constant predictable value.
    +    # This could possibly make it easier to find and exploit hash
    +    # collisions. 'None' as a string will be hashed differently per process,
    +    # but consistently within the same process. This is the same as the
    +    # behavior of None prior to Python 3.12.
    +    _none_hash: int = hash('None')
    +
    +    # Implements Block.Factory.
         def __init__(
             self,
             num_blocks: int,
    @@ -122,7 +131,6 @@ def __init__(
     
             self.metric_data = CacheMetricData()
     
    -    # Implements Block.Factory.
         def _create_block(
             self,
             prev_block: Optional[Block],
    @@ -737,6 +745,14 @@ class PrefixCachingBlock(Block):
                 such as adapters that influence the block, apart from the token_ids.
         """
     
    +    # Note that we use 'None' as a string here instead of None because
    +    # as of Python 3.12, hash(None) returns a constant predictable value.
    +    # This could possibly make it easier to find and exploit hash
    +    # collisions. 'None' as a string will be hashed differently per process,
    +    # but consistently within the same process. This is the same as the
    +    # behavior of None prior to Python 3.12.
    +    _none_hash: int = hash('None')
    +
         def __init__(
             self,
             prev_block: Optional[Block],
    @@ -891,13 +907,13 @@ def content_hash(self) -> Optional[int]:
     
             is_first_block = self._prev_block is None
             prev_block_hash = (
    -            None if is_first_block else
    +            self._none_hash if is_first_block else
                 self._prev_block.content_hash  # type: ignore
             )
     
             # Previous block exists but does not yet have a hash.
             # Return no hash in this case.
    -        if prev_block_hash is None and not is_first_block:
    +        if prev_block_hash == self._none_hash and not is_first_block:
                 return None
     
             self._cached_content_hash = PrefixCachingBlock.hash_block_tokens(
    @@ -907,8 +923,9 @@ def content_hash(self) -> Optional[int]:
                 extra_hash=self._extra_hash)
             return self._cached_content_hash
     
    -    @staticmethod
    -    def hash_block_tokens(is_first_block: bool,
    +    @classmethod
    +    def hash_block_tokens(cls,
    +                          is_first_block: bool,
                               prev_block_hash: Optional[int],
                               cur_block_token_ids: List[int],
                               extra_hash: Optional[int] = None) -> int:
    @@ -929,7 +946,8 @@ def hash_block_tokens(is_first_block: bool,
             Returns:
             - int: The computed hash value for the block.
             """
    -        assert (prev_block_hash is None) == is_first_block
    +        if is_first_block and prev_block_hash is None:
    +            prev_block_hash = cls._none_hash
             return hash((is_first_block, prev_block_hash, *cur_block_token_ids,
                          extra_hash))
     
    @@ -949,6 +967,14 @@ class ComputedBlocksTracker:
         cached block hashes in the allocator.
         """
     
    +    # Note that we use 'None' as a string here instead of None because
    +    # as of Python 3.12, hash(None) returns a constant predictable value.
    +    # This could possibly make it easier to find and exploit hash
    +    # collisions. 'None' as a string will be hashed differently per process,
    +    # but consistently within the same process. This is the same as the
    +    # behavior of None prior to Python 3.12.
    +    _none_hash: int = hash('None')
    +
         def __init__(
             self,
             allocator: DeviceAwareBlockAllocator,
    @@ -994,7 +1020,7 @@ def _update_seq_hashes(self, seq: Sequence) -> None:
             # We need to know the hash of the previous block to compute the hash of
             # the current block so that blocks could be uniquely identified across
             # sequences of prefixes.
    -        prev_block_hash = (None if cur_num_blocks_recorded == 0 else
    +        prev_block_hash = (self._none_hash if cur_num_blocks_recorded == 0 else
                                block_hashes_recorded[-1])
             # Only update the computed block hashes for the new blocks
             for i in range(cur_num_blocks_recorded, num_computed_blocks):
    @@ -1009,7 +1035,7 @@ def _update_seq_hashes(self, seq: Sequence) -> None:
                 # This has to be kept in sync with the allocator's hash
                 # calculation.
                 block_hash = PrefixCachingBlock.hash_block_tokens(
    -                is_first_block=prev_block_hash is None,
    +                is_first_block=prev_block_hash == self._none_hash,
                     prev_block_hash=prev_block_hash,
                     cur_block_token_ids=block_token_ids,
                     extra_hash=extra_hash,
    
  • vllm/v1/core/kv_cache_utils.py +9 -0 modified
    @@ -263,6 +263,15 @@ def hash_block_tokens(
             The hash value of the block and the token ids in the block.
             The entire tuple is used as the hash key of the block.
         """
    +    if not parent_block_hash:
    +        # Note that we use 'None' as a string here instead of None because
    +        # as of Python 3.12, hash(None) returns a constant predictable value.
    +        # This could possibly make it easier to find and exploit hash
    +        # collisions. 'None' as a string will be hashed differently per process,
    +        # but consistently within the same process. This is the same as the
    +        # behavior of None prior to Python 3.12.
    +        parent_block_hash = hash('None')
    +
         curr_block_token_ids_tuple = tuple(curr_block_token_ids)
         return BlockHashType(
             hash((parent_block_hash, curr_block_token_ids_tuple, extra_keys)),
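
The chained hashing that this patch protects can be sketched in a few lines. This is a simplified illustration of the idea, not vLLM's implementation: the names `NONE_HASH`, `hash_block`, and `chain_hashes`, the block size, and the token values are all hypothetical. Each block's key depends on its parent's key, and the first block uses a per-process sentinel in place of None.

```python
from typing import List, Optional, Sequence

# Per-process sentinel standing in for "no parent block", as in the patch.
NONE_HASH: int = hash("None")


def hash_block(prev_hash: int, token_ids: Sequence[int],
               extra_hash: Optional[int] = None) -> int:
    """Hash one block from its parent's hash and its own token ids."""
    is_first_block = prev_hash == NONE_HASH
    return hash((is_first_block, prev_hash, *token_ids, extra_hash))


def chain_hashes(token_ids: Sequence[int], block_size: int) -> List[int]:
    """Hash every full block of a prompt, chaining each to its predecessor."""
    hashes: List[int] = []
    prev_hash = NONE_HASH
    # Only full blocks are hashed; a trailing partial block is skipped.
    full_len = len(token_ids) - len(token_ids) % block_size
    for start in range(0, full_len, block_size):
        prev_hash = hash_block(prev_hash, token_ids[start:start + block_size])
        hashes.append(prev_hash)
    return hashes


prompt_a = [1, 2, 3, 4, 5, 6, 7, 8]
prompt_b = [1, 2, 3, 4, 9, 9, 9, 9]  # shares only the first block with prompt_a
hashes_a = chain_hashes(prompt_a, block_size=4)
hashes_b = chain_hashes(prompt_b, block_size=4)

print("first block shared:", hashes_a[0] == hashes_b[0])
print("second block shared:", hashes_a[1] == hashes_b[1])
```

Because the sentinel is salted per process, an attacker who knows the prompt tokens still cannot precompute the first block's key offline, and every later key inherits that unpredictability through the chain.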
    
432117cd1f59

gh-99540: Constant hash for _PyNone_Type to aid reproducibility (GH-99541)

https://github.com/python/cpython · yonillasky · Dec 16, 2022 · via ghsa
2 files changed · +7 -1
  • Misc/NEWS.d/next/Core and Builtins/2022-12-10-20-00-13.gh-issue-99540.ZZZHeP.rst +1 -0 added
    @@ -0,0 +1 @@
    +``None`` now hashes to a constant value. This is not a requirements change.
    
  • Objects/object.c +6 -1 modified
    @@ -1641,6 +1641,11 @@ none_bool(PyObject *v)
         return 0;
     }
     
    +static Py_hash_t none_hash(PyObject *v)
    +{
    +    return 0xFCA86420;
    +}
    +
     static PyNumberMethods none_as_number = {
         0,                          /* nb_add */
         0,                          /* nb_subtract */
    @@ -1692,7 +1697,7 @@ PyTypeObject _PyNone_Type = {
         &none_as_number,    /*tp_as_number*/
         0,                  /*tp_as_sequence*/
         0,                  /*tp_as_mapping*/
    -    0,                  /*tp_hash */
    +    (hashfunc)none_hash,/*tp_hash */
         0,                  /*tp_call */
         0,                  /*tp_str */
         0,                  /*tp_getattro */
    

