vLLM

pypi: vllm

Source repositories

https://github.com/vllm-project/vllm

CVEs (53)

CVE	Vendor / Product	CVSS	EPSS	Published	Description
CVE-2025-71379	Vllm Vllm	—	0.00	Jun 20, 2026	vLLM versions >= 0.6.3 and < 0.9.0 contain multiple regular expression denial of service (ReDoS) vulnerabilities. Several regex patterns — in vllm/lora/utils.py, the phi4mini tool parser, and the OpenAI-compatible serving chat endpoint — are susceptible to catastrophic…
CVE-2026-54233	Vllm Vllm	—	0.00	Jun 17, 2026	Summary: vLLM's `/v1/audio/transcriptions` endpoint limits compressed upload size but not decoded PCM output. A 25 MB OPUS file expands to ~14.9 GB of float32 PCM at decode time. Tested on vLLM v0.19.0. Details: `SpeechToTextProcessor` rejects uploads over…
CVE-2026-54236	Vllm Vllm	—	0.01	Jun 17, 2026	vLLM: incomplete CVE-2026-22778 fix leaks PIL repr addresses via the Anthropic API router. Researcher: Kai Aizen, SnailSploit (@SnailSploit), Adversarial & Offensive Security Research. Severity: CVSS 3.1 5.3 (Medium) `AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N`. Target:…
CVE-2026-53923	Vllm Vllm	—	0.00	Jun 17, 2026	Summary: Integer truncation of tensor dimensions in vLLM's GGUF dequantize kernels (`csrc/quantization/gguf/gguf_kernel.cu`) causes partial tensor processing. The output tensor is allocated at full size via `torch::empty` (uninitialized memory), but the dequantize CUDA kernel…
CVE-2026-54235	Vllm Vllm	—	0.00	Jun 17, 2026	Summary: All temperature validation gates use comparison operators (`<`, `>`), which silently evaluate to `False` for `NaN` and for positive `Infinity` in Python's IEEE 754 float semantics. Both values pass every guard and propagate to GPU sampling kernels, where they produce…
CVE-2026-47155	Vllm Vllm	—	0.00	Jun 10, 2026	Summary: vLLM's revision pinning controls do not consistently apply to all artifacts loaded for a model. A deployment that supplies `--revision` or `--code-revision` can still load dynamic code, GGUF files, image processors, retrieval side weights, or same-repository…
CVE-2026-27893	Vllm Vllm	—	0.01	Mar 26, 2026	vLLM is an inference and serving engine for large language models (LLMs). Starting in version 0.10.1 and prior to version 0.18.0, two model implementation files hardcode `trust_remote_code=True` when loading sub-components, bypassing the user's explicit…
CVE-2026-25960	Vllm Vllm	—	0.00	Mar 9, 2026	vLLM is an inference and serving engine for large language models (LLMs). The SSRF protection fix for CVE-2026-24779, added in 0.15.1, can be bypassed in the load_from_url_async method due to inconsistent URL parsing behavior between the validation layer and the actual HTTP client.…
CVE-2026-22778	Vllm Vllm	—	0.04	Feb 2, 2026	vLLM is an inference and serving engine for large language models (LLMs). From 0.8.3 to before 0.14.1, when an invalid image is sent to vLLM's multimodal endpoint, PIL throws an error. vLLM returns this error to the client, leaking a heap address. With this leak, we reduce ASLR…
CVE-2026-24779	Vllm Vllm	—	0.01	Jan 27, 2026	vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.14.1, a Server-Side Request Forgery (SSRF) vulnerability exists in the `MediaConnector` class within the vLLM project's multimodal feature set. The load_from_url and load_from_url_async…
CVE-2026-22807	Vllm Vllm	—	0.01	Jan 21, 2026	vLLM is an inference and serving engine for large language models (LLMs). Starting in version 0.10.1 and prior to version 0.14.0, vLLM loads Hugging Face `auto_map` dynamic modules during model resolution without gating on `trust_remote_code`, allowing attacker-controlled Python…
CVE-2026-22773	Vllm Vllm	—	0.00	Jan 10, 2026	vLLM is an inference and serving engine for large language models (LLMs). In versions from 0.6.4 to before 0.12.0, users can crash the vLLM engine serving multimodal models that use the Idefics3 vision model implementation by sending a specially crafted 1x1 pixel image. This…
CVE-2025-66448	Vllm Vllm	—	0.01	Dec 1, 2025	vLLM is an inference and serving engine for large language models (LLMs). Prior to 0.11.1, vLLM has a critical remote code execution vector in a config class named Nemotron_Nano_VL_Config. When vLLM loads a model config that contains an auto_map entry, the config class resolves…
CVE-2025-62372	Vllm Vllm	—	0.00	Nov 21, 2025	vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before 0.11.1, users can crash the vLLM engine serving multimodal models by passing multimodal embedding inputs with correct ndim but incorrect shape (e.g. hidden dimension is wrong),…
CVE-2025-62426	Vllm Vllm	—	0.00	Nov 21, 2025	vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before 0.11.1, the /v1/chat/completions and /tokenize endpoints allow a chat_template_kwargs request parameter that is used in the code before it is properly validated against the…
CVE-2025-62164	Vllm Vllm	—	0.01	Nov 21, 2025	vLLM is an inference and serving engine for large language models (LLMs). From version 0.10.2 to before 0.11.1, a memory corruption vulnerability that could lead to a crash (denial of service) and potentially remote code execution (RCE) exists in the Completions API endpoint. When…
CVE-2025-59425	Vllm Vllm	—	0.01	Oct 7, 2025	vLLM is an inference and serving engine for large language models (LLMs). Before version 0.11.0rc2, the API key support in vLLM performs validation using a method that was vulnerable to a timing attack. API key validation uses a string comparison that takes longer the more…
CVE-2025-48956	Vllm Vllm	—	0.01	Aug 21, 2025	vLLM is an inference and serving engine for large language models (LLMs). From 0.1.0 to before 0.10.1.1, a Denial of Service (DoS) vulnerability can be triggered by sending a single HTTP GET request with an extremely large header to an HTTP endpoint. This results in server…
CVE-2025-48944	Vllm Vllm	—	0.00	May 30, 2025	vLLM is an inference and serving engine for large language models (LLMs). In version 0.8.0 up to but excluding 0.9.0, the vLLM backend used with the /v1/chat/completions OpenAPI endpoint fails to validate unexpected or malformed input in the "pattern" and "type" fields when the…
CVE-2025-48943	Vllm Vllm	—	0.00	May 30, 2025	vLLM is an inference and serving engine for large language models (LLMs). Versions 0.8.0 up to but excluding 0.9.0 have a regular expression denial of service (ReDoS) vulnerability that causes the vLLM server to crash if an invalid regex is provided while using structured output. This vulnerability is similar…
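
The CVE-2026-54235 entry above turns on a general property of IEEE 754 floats: any ordered comparison involving `NaN` is false, and positive `Infinity` is never less than a lower bound. The following is a minimal sketch of that behavior, not vLLM's actual validation code. The function names and the lower bound of 0.0 are assumptions made for illustration.

```python
import math


def naive_temperature_ok(temperature: float) -> bool:
    """Hypothetical guard that relies only on a comparison operator."""
    # For NaN this comparison is False, and for +inf it is also False,
    # so neither value is rejected here.
    if temperature < 0.0:
        return False
    return True


def strict_temperature_ok(temperature: float) -> bool:
    """Hypothetical guard that rejects non-finite values explicitly."""
    # math.isfinite returns False for NaN, +inf and -inf.
    return math.isfinite(temperature) and temperature >= 0.0
```

Checking finiteness before any range comparison keeps `NaN` and `Infinity` from reaching downstream numeric code, whatever the accepted range is.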

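
The CVE-2025-59425 entry describes API key validation through a string comparison whose duration depends on the input. As a rough illustration of the usual mitigation, and not the fix that vLLM shipped, the sketch below uses the standard library's `hmac.compare_digest`. The function name `api_key_matches` is an assumption made for this example.

```python
import hmac


def api_key_matches(presented: str, expected: str) -> bool:
    """Hypothetical helper: compare two API keys in a timing-resistant way."""
    # hmac.compare_digest is designed so that its running time does not
    # depend on where the first mismatching byte occurs, unlike `==`,
    # which can return as soon as a difference is found.
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

Encoding both values to bytes first lets the helper accept keys that contain non-ASCII characters.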
