VYPR
Medium severity 5.6 · NVD Advisory · Published Apr 27, 2026 · Updated May 1, 2026

CVE-2026-7141

Description

A vulnerability was found in vllm up to and including 0.19.0. It affects the function has_mamba_layers in the file vllm/v1/kv_cache_interface.py, part of the KV Block Handler component. Manipulation leads to use of an uninitialized resource. The attack can be launched remotely, but its complexity is rated high and exploitation is described as difficult. An exploit has been made public and could be used. The fix is commit 1ad67864c0c20f167929e64c875f5c28e1aad9fd; applying this patch is recommended.

Affected packages

Versions sourced from the GitHub Security Advisory.

Package        Affected versions    Patched versions
vllm (PyPI)    < 0.19.1             0.19.1

Affected products

1
  • cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*
    Range: <=0.19.0
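
The affected range above can be checked against an installed version with a small sketch like the following. The helper names `parse_release` and `is_affected` are hypothetical and not part of vllm. The parser only looks at the leading numeric release segment, so a pre-release of the patched version (for example `0.19.1rc1`) would be reported as not affected; treat that as a known simplification.

```python
import re

# First patched release according to the advisory table above.
PATCHED = (0, 19, 1)


def parse_release(version: str) -> tuple[int, ...]:
    """Return the leading numeric release segment, padded to three parts.

    Suffixes such as 'rc1' or '+cu118' are ignored (a simplification).
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))
    return parts + (0,) * (3 - len(parts))


def is_affected(version: str) -> bool:
    """True if the version falls in the advisory's range (< 0.19.1)."""
    return parse_release(version) < PATCHED


if __name__ == "__main__":
    for candidate in ("0.19.0", "0.19.1", "0.20.0"):
        print(candidate, "affected" if is_affected(candidate) else "not affected")
```

To check a live environment, the version string could be obtained with `importlib.metadata.version("vllm")` from the standard library and passed to `is_affected`.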

Patches

1
1ad67864c0c2

Zero recycled KV blocks for FullAttention models (#39146)

https://github.com/AjAnubolu/vllm · AjAnubolu · Apr 8, 2026 · via ghsa
2 files changed · +25 -1
  • tests/v1/core/test_kv_cache_utils.py · +20 -0 · modified
    @@ -2094,3 +2094,23 @@ def test_unify_hybrid_kv_cache_specs():
     
         with pytest.raises(ValueError):
             kv_cache_utils.unify_hybrid_kv_cache_specs(kv_cache_spec)
    +
    +
    +def test_needs_kv_cache_zeroing():
    +    # Regression test for #39146: FullAttention models must zero recycled
    +    # blocks to avoid stale K/V leaking through partial-block tail slots.
    +    full_attention = KVCacheConfig(
    +        num_blocks=16,
    +        kv_cache_tensors=[],
    +        kv_cache_groups=[KVCacheGroupSpec(["layer_0"], new_kv_cache_spec())],
    +    )
    +    assert full_attention.needs_kv_cache_zeroing
    +
    +    sliding_only = KVCacheConfig(
    +        num_blocks=16,
    +        kv_cache_tensors=[],
    +        kv_cache_groups=[
    +            KVCacheGroupSpec(["layer_0"], new_sliding_window_spec(sliding_window=64))
    +        ],
    +    )
    +    assert not sliding_only.needs_kv_cache_zeroing
    
  • vllm/v1/kv_cache_interface.py · +5 -1 · modified
    @@ -496,4 +496,8 @@ def has_mamba_layers(self) -> bool:
     
         @property
         def needs_kv_cache_zeroing(self) -> bool:
    -        return self.has_mamba_layers
    +        # Recycled blocks may hold stale K/V from prior requests; partial-block
    +        # tail slots can leak NaN/Inf into masked softmax (see #39146).
    +        return self.has_mamba_layers or any(
    +            type(g.kv_cache_spec) is FullAttentionSpec for g in self.kv_cache_groups
    +        )
    

Vulnerability mechanics

Generated by null/stub on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
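
The fix commit's comments describe stale K/V in partial-block tail slots leaking NaN/Inf into masked softmax. One way such a leak can arise is sketched below in plain Python: under IEEE 754 arithmetic, a masked weight of exactly 0.0 multiplied by NaN or Inf still yields NaN. The functions `masked_softmax` and `attend`, the block size, and all values are hypothetical; whether vllm's attention kernels follow this exact path is an assumption, not something the advisory states.

```python
import math


def masked_softmax(scores: list[float], valid_len: int) -> list[float]:
    """Softmax over the first valid_len scores; masked slots get weight 0.0."""
    valid = scores[:valid_len]
    peak = max(valid)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in valid]
    total = sum(exps)
    return [e / total for e in exps] + [0.0] * (len(scores) - valid_len)


def attend(weights: list[float], values: list[float]) -> float:
    """Weighted sum over every slot in the block, including masked ones."""
    return sum(w * v for w, v in zip(weights, values))


# A block of 4 slots where only the first 2 hold tokens of the current request.
valid_len = 2
scores = [0.1, 0.2, 0.0, 0.0]

# Tail slots of a recycled block still hold values from a prior request.
stale_block = [1.0, 2.0, float("nan"), float("inf")]
# The same block after the tail slots have been zeroed.
zeroed_block = [1.0, 2.0, 0.0, 0.0]

weights = masked_softmax(scores, valid_len)
leaky = attend(weights, stale_block)
clean = attend(weights, zeroed_block)

print("stale block output is NaN:", math.isnan(leaky))
print("zeroed block output is finite:", math.isfinite(clean))
```

In this sketch, zeroing the recycled block is enough to keep the output finite, which is consistent with the patch's choice to enable zeroing for FullAttention groups.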

References

9
