CVE-2026-34756
Description
vLLM is an inference and serving engine for large language models (LLMs). In versions 0.1.0 up to, but not including, 0.19.0, a Denial of Service vulnerability exists in the vLLM OpenAI-compatible API server. The n parameter in the ChatCompletionRequest and CompletionRequest Pydantic models has no upper-bound validation, so an unauthenticated attacker can send a single HTTP request with an astronomically large n value. The server then allocates millions of request object copies on the heap before the request even reaches the scheduling queue, which completely blocks the Python asyncio event loop and causes immediate Out-Of-Memory crashes. This vulnerability is fixed in 0.19.0.
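The missing check can be illustrated with a minimal, stdlib-only sketch. This is not vLLM's code and not the actual patch: `MAX_N`, `CompletionParams`, `fan_out`, and `try_parse` are hypothetical names, and the cap of 128 is an arbitrary illustrative value. The sketch only shows the general pattern of rejecting an out-of-range n at parse time, before any per-choice copies are created.

```python
from dataclasses import dataclass

# Hypothetical cap chosen for illustration; not the limit vLLM adopted.
MAX_N = 128


@dataclass(frozen=True)
class CompletionParams:
    """Hypothetical stand-in for a request model with an `n` field."""

    prompt: str
    n: int = 1

    def __post_init__(self) -> None:
        # Reject out-of-range values before any per-choice work is done.
        if isinstance(self.n, bool) or not isinstance(self.n, int):
            raise TypeError("n must be an integer")
        if not 1 <= self.n <= MAX_N:
            raise ValueError(f"n must be between 1 and {MAX_N}, got {self.n}")


def fan_out(params: CompletionParams) -> list[CompletionParams]:
    """Create one sub-request per requested choice.

    Without a bound on n, this is the kind of step that becomes
    prohibitively expensive for a huge n.
    """
    return [CompletionParams(prompt=params.prompt, n=1) for _ in range(params.n)]


def try_parse(prompt: str, n: int) -> str:
    """Return "accepted" or "rejected"; an out-of-range n never reaches fan_out."""
    try:
        fan_out(CompletionParams(prompt=prompt, n=n))
    except ValueError:
        return "rejected"
    return "accepted"
```

With this sketch, `try_parse("hi", 4)` is accepted, while `try_parse("hi", 10**9)` is rejected by the constructor without allocating any copies.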
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| vllm (PyPI) | >= 0.1.0, < 0.19.0 | 0.19.0 |
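As a rough aid for checking a deployment against the PyPI range in the table, the following sketch tests whether a version string falls in `>= 0.1.0, < 0.19.0`. The helper names `parse_release` and `is_affected` are hypothetical, and the parser assumes plain dotted numeric releases such as `0.18.1`. Pre-release or local suffixes (for example `0.19.0rc1`) raise `ValueError` here, and a full version parser such as `packaging.version` would be better suited to them.

```python
def parse_release(version: str) -> tuple[int, ...]:
    """Parse a plain dotted release such as "0.18.1" into a tuple of ints.

    Assumes purely numeric components; pads to three parts so that
    "0.19" compares equal to "0.19.0".
    """
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_affected(version: str) -> bool:
    """Return True if the version lies in the advisory range >= 0.1.0, < 0.19.0."""
    return (0, 1, 0) <= parse_release(version) < (0, 19, 0)
```

The installed version string can be obtained with `importlib.metadata.version("vllm")` from the standard library and passed to `is_affected`.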
Affected products
- osv-coords (5 versions)
  - pkg:apk/chainguard/py3.10-vllm-cuda-12.4
  - pkg:apk/chainguard/py3.12-vllm-cuda-12.4
  - pkg:apk/chainguard/tritonserver-backend-vllm-cuda-13.0
  - pkg:apk/chainguard/vllm-openai-cuda-12.9
  - pkg:pypi/vllm
- (no CPE) range: < 0.18.1-r2
- (no CPE) range: < 0.18.1-r2
- (no CPE) range: < 25.11-r7
- (no CPE) range: < 0.19.0-r0
- (no CPE) range: >= 0.1.0, < 0.19.0
References
- github.com/vllm-project/vllm/commit/b111f8a61f100fdca08706f41f29ef3548de7380 (nvd: Patch, WEB)
- github.com/vllm-project/vllm/pull/37952 (nvd: Issue Tracking, Patch, WEB)
- github.com/vllm-project/vllm/security/advisories/GHSA-3mwp-wvh9-7528 (nvd: Patch, Vendor Advisory, WEB)
- github.com/advisories/GHSA-3mwp-wvh9-7528 (ghsa: ADVISORY)
- nvd.nist.gov/vuln/detail/CVE-2026-34756 (ghsa: ADVISORY)