VYPR
Medium severity 6.9 · NVD Advisory · Published Jun 17, 2026 · Updated Jun 17, 2026

vLLM: temperature=NaN and temperature=Infinity bypass validation and propagate to GPU kernels

CVE-2026-54235

Description

Summary

All temperature validation gates rely on comparison operators (<, >). Under Python's IEEE 754 float semantics, every ordered comparison involving NaN evaluates to False, and the specific checks used here also evaluate to False for positive Infinity. Both values therefore pass every guard without raising and propagate to GPU sampling kernels, where they produce undefined behavior or CUDA errors that can crash the inference worker. Note: -Infinity is correctly caught.

Root Cause

sampling_params.py:384:

```python
if 0 < self.temperature < _MAX_TEMP:  # NaN → False; +Inf → False
```

sampling_params.py:462:

```python
if self.temperature < 0.0:  # NaN → False; +Inf → False
    raise VLLMValidationError(...)
```

No math.isnan() or math.isinf() check exists anywhere in sampling_params.py.

Python semantics (verified): float('nan') < 0.0 → False, float('inf') < 0.0 → False.
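The comparison behavior is easy to reproduce in a plain Python session. The sketch below mirrors only the shape of the line-462 guard: `check_temperature` is a hypothetical stand-in rather than vLLM code, and `ValueError` stands in for `VLLMValidationError`.

```python
import math


def check_temperature(temperature: float) -> float:
    """Hypothetical stand-in for a comparison-only guard (not vLLM code)."""
    if temperature < 0.0:  # False for NaN and +Inf, so neither is rejected
        raise ValueError(f"temperature must be non-negative, got {temperature}")
    return temperature


nan = float("nan")

# Every ordered comparison involving NaN is False, and NaN != NaN.
print(nan < 0.0, nan > 0.0, nan == nan)  # False False False

# NaN and +Inf both pass the guard unchanged.
print(math.isnan(check_temperature(nan)))           # True
print(math.isinf(check_temperature(float("inf"))))  # True

# -Inf is below zero, so the guard rejects it.
try:
    check_temperature(float("-inf"))
except ValueError as exc:
    print("rejected:", exc)
```

Because the guard is written as "raise if the value is out of range", a comparison that returns False for any reason is treated the same as a valid value.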

Impact

The inference worker can crash when a GPU kernel executes with NaN or Inf as softmax input, degrading service for all concurrent users.

Remediation

Add a math.isfinite(self.temperature) check in _verify_args(). Reject non-finite float values with a 400 error.
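A minimal sketch of this remediation, assuming the check runs before any range comparison. `ValidationError400` and `verify_temperature` are hypothetical names for illustration and are not taken from the merged patch; the mapping from the exception to an HTTP 400 response is also assumed.

```python
import math


class ValidationError400(ValueError):
    """Hypothetical error type, assumed to be mapped to an HTTP 400 response."""


def verify_temperature(temperature: float) -> float:
    """Illustrative finite-first check (not the merged patch)."""
    # isfinite() is False for NaN, +Inf and -Inf, so it has to run before
    # any comparison those values could slip past.
    if not math.isfinite(temperature):
        raise ValidationError400(
            f"temperature must be a finite number, got {temperature}"
        )
    if temperature < 0.0:
        raise ValidationError400(
            f"temperature must be non-negative, got {temperature}"
        )
    return temperature
```

A single `math.isfinite()` call covers NaN and both infinities, which avoids pairing separate `math.isnan()` and `math.isinf()` checks.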

Fix

A fix for this vulnerability was merged here: https://github.com/vllm-project/vllm/pull/45116

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

Vulnerability mechanics

Root cause

"Missing `math.isfinite()` validation allows NaN and Infinity float values to bypass comparison-based temperature and repetition_penalty guards in `sampling_params.py`."

Attack vector

An attacker submits an inference request with `temperature=NaN` or `temperature=Inf` (or `repetition_penalty=NaN`/`Inf`) via the API. Because Python's comparison operators (`<`, `>`) always return `False` when either operand is `NaN`, and `float('inf') < 0.0` is also `False`, all validation gates are bypassed. The non-finite value propagates to GPU sampling kernels, where it causes undefined behavior or CUDA errors, crashing the inference worker and degrading service for all concurrent users [ref_id=1].
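One way such values can arrive in a request body is through non-standard JSON literals. The sketch below uses only Python's standard `json` module, which accepts `NaN`, `Infinity` and `-Infinity` by default. Whether the vLLM HTTP layer parses request bodies this way is an assumption here, not something the advisory states.

```python
import json
import math

# Python's json module accepts these non-standard literals by default.
body = '{"temperature": NaN, "repetition_penalty": Infinity}'
params = json.loads(body)

print(math.isnan(params["temperature"]))         # True
print(math.isinf(params["repetition_penalty"]))  # True

# Both values are ordinary Python floats after parsing, so they reach any
# downstream comparison-based guard looking like valid numbers.
print(type(params["temperature"]).__name__)      # float
```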

Affected code

The vulnerability resides in `vllm/sampling_params.py` within the `_verify_args()` method. The temperature validation at line 384 (`if 0 < self.temperature < _MAX_TEMP`) and line 462 (`if self.temperature < 0.0`) uses comparison operators that silently evaluate to `False` for `NaN` and positive `Infinity` due to Python's IEEE 754 float semantics. No `math.isnan()` or `math.isinf()` check existed anywhere in `sampling_params.py` before the patch [patch_id=6351925].

What the fix does

The patch adds `math.isfinite()` checks for both `temperature` and `repetition_penalty` in `_verify_args()` before any comparison-based validation [patch_id=6351925]. If the value is not finite (i.e., `NaN` or `Infinity`), a `VLLMValidationError` (for temperature) or `ValueError` (for repetition_penalty) is raised immediately, returning a 400 error to the client. This prevents non-finite floats from ever reaching GPU kernels. The accompanying test suite (`tests/samplers/test_non_finite_params.py`) verifies that `NaN`, `+Inf`, and `-Inf` are all rejected while finite values are accepted [ref_id=1].
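The behavior described above can be sketched as a small rejection matrix. `SketchSamplingParams` and `SketchValidationError` are hypothetical stand-ins for `SamplingParams` and `VLLMValidationError`. This is not the content of `tests/samplers/test_non_finite_params.py`, only an illustration of what such a test would assert.

```python
import math
from dataclasses import dataclass


class SketchValidationError(ValueError):
    """Hypothetical stand-in for VLLMValidationError."""


@dataclass
class SketchSamplingParams:
    """Hypothetical two-field stand-in for SamplingParams (not vLLM code)."""

    temperature: float = 1.0
    repetition_penalty: float = 1.0

    def __post_init__(self) -> None:
        self._verify_args()

    def _verify_args(self) -> None:
        # Finite checks run first, before any comparison-based validation.
        if not math.isfinite(self.temperature):
            raise SketchValidationError(
                f"temperature must be finite, got {self.temperature}"
            )
        if not math.isfinite(self.repetition_penalty):
            raise ValueError(
                f"repetition_penalty must be finite, got {self.repetition_penalty}"
            )


def is_rejected(**kwargs: float) -> bool:
    """Return True if constructing the params raises a validation error."""
    try:
        SketchSamplingParams(**kwargs)
    except ValueError:
        return True
    return False


# NaN, +Inf and -Inf must be rejected for both parameters.
for bad in (float("nan"), float("inf"), float("-inf")):
    assert is_rejected(temperature=bad)
    assert is_rejected(repetition_penalty=bad)

# Finite values are accepted.
assert not is_rejected(temperature=0.7, repetition_penalty=1.1)
```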

Preconditions

  • network: The attacker must be able to send HTTP requests to the vLLM inference API endpoint that accepts SamplingParams.
  • input: The attacker must set the `temperature` or `repetition_penalty` parameter to a non-finite float value (NaN, Inf).

Generated on Jun 17, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
