vLLM: OOM Denial of Service via Audio Decompression Bomb
Description
Summary
vLLM's /v1/audio/transcriptions endpoint limits the compressed upload size but not the size of the decoded PCM output. A 25MB OPUS file expands to ~5.7GB of float32 PCM at decode time, with peak RSS reaching ~14.9GB. Tested on vLLM v0.19.0.
Details
SpeechToTextProcessor rejects uploads over VLLM_MAX_AUDIO_CLIP_FILESIZE_MB (default 25MB) based on compressed byte length, but the audio decoder in audio.py accumulates all decoded frames into memory with no size limit before returning:
# speech_to_text.py L184-189
if len(audio_data) / 1024 ** 2 > self.max_audio_filesize_mb:
    raise VLLMValidationError(...)
y, sr = load_audio(buf, sr=self.asr_config.sample_rate)  # decoded size unchecked

# audio.py L77-107
chunks: list[npt.NDArray] = []
for frame in container.decode(stream):
    chunks.append(frame.to_ndarray())
audio = np.concatenate(chunks, axis=-1).astype(np.float32)  # single contiguous allocation
A 25MB OPUS file at 6kbps encodes ~8.7 hours of audio. Decoding produces ~5.7GB of float32 PCM (232x amplification), and np.concatenate then allocates a second contiguous array, bringing peak RSS to ~14.9GB from a single request. SpeechToTextConfig.max_audio_clip_s (default 30s) applies only after the full decode and does not prevent the allocation.
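The amplification figure can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is illustrative only: the function name `estimate_decode_amplification` is hypothetical, and it assumes a constant bitrate, a 48 kHz mono decode, and no container or framing overhead, none of which the advisory states explicitly.

```python
def estimate_decode_amplification(
    compressed_bytes: int,
    bitrate_bps: float,
    sample_rate_hz: int = 48_000,  # assumption: Opus decoded at 48 kHz
    channels: int = 1,             # assumption: mono stream
    bytes_per_sample: int = 4,     # float32 PCM
) -> dict[str, float]:
    """Rough upper bound on decoded PCM size for a constant-bitrate file."""
    # Every compressed bit buys 1 / bitrate seconds of audio.
    duration_s = compressed_bytes * 8 / bitrate_bps
    # Decoded size depends only on duration, sample rate, and sample width.
    pcm_bytes = duration_s * sample_rate_hz * channels * bytes_per_sample
    return {
        "duration_h": duration_s / 3600,
        "pcm_gib": pcm_bytes / 1024**3,
        "amplification": pcm_bytes / compressed_bytes,
    }


estimate = estimate_decode_amplification(25 * 1024**2, bitrate_bps=6_000)
print(estimate)
```

Under these assumptions the estimate comes out at roughly 9.7 hours of audio and a 256x ratio, somewhat above the reported ~8.7 hours and 232x. That gap is plausibly explained by the container and framing overhead the sketch ignores. The point is that the ratio depends only on bitrate and output format, so a compressed-size check cannot bound it.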
Impact
An unauthenticated attacker can exhaust server memory with a small number of concurrent requests, each a valid upload within the documented size limit. Severity was assessed with reference to prior OOM vulnerability reports in vLLM.
Fix
A fix for this vulnerability was merged here: https://github.com/vllm-project/vllm/pull/44970
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
Vulnerability mechanics
Root cause
"Missing decoded-size limit in audio decoder allows decompression bomb amplification from compressed upload to PCM output."
Attack vector
An unauthenticated attacker sends a POST request to the `/v1/audio/transcriptions` endpoint with a small (≤25MB) OPUS audio file encoded at a very low bitrate (e.g., 6kbps). The compressed size check passes, but the decoder in `audio.py` accumulates all decoded frames into memory with no size limit, expanding the payload by a factor of ~232. A single 25MB OPUS file can drive peak memory use to ~14.9GB while decoding to float32 PCM, so only a few concurrent requests are needed to exhaust server memory [ref_id=1].
Affected code
The vulnerability resides in `vllm/multimodal/media/audio.py` (the `load_audio_pyav` and `load_audio_soundfile` functions) and `vllm/entrypoints/speech_to_text/base/serving.py` (the `_preprocess_speech_to_text` method). The `SpeechToTextProcessor` in `speech_to_text.py` checks the compressed file size against `VLLM_MAX_AUDIO_CLIP_FILESIZE_MB` but does not limit the decoded PCM output, allowing a decompression bomb attack [patch_id=6351922].
What the fix does
The patch adds a `max_duration_s` parameter (default 600s via `VLLM_MAX_AUDIO_DECODE_DURATION_S`) to both `load_audio_pyav` and `load_audio_soundfile`. Before decoding, it checks container/stream metadata for duration and rejects files exceeding the limit. During decoding, it tracks accumulated sample count and raises `ValueError` once the limit is exceeded, preventing the large contiguous allocation. The `_preprocess_speech_to_text` method passes this limit from the environment variable to `load_audio` [patch_id=6351922].
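The two checks described above (a metadata pre-check and a running sample count) can be sketched in a library-agnostic way. This is not the patch's code: `decode_bounded`, `declared_duration_s`, and `fake_frames` are hypothetical names, and the standard-library `array` module stands in for the PyAV frames and NumPy arrays used by vLLM.

```python
from array import array
from typing import Iterable, Iterator, Optional, Sequence


def decode_bounded(
    frames: Iterable[Sequence[float]],
    sample_rate_hz: int,
    max_duration_s: float = 600.0,
    declared_duration_s: Optional[float] = None,
) -> array:
    """Accumulate decoded frames, failing once the duration cap is passed."""
    # Metadata is only used to reject early; it is never trusted to accept.
    if declared_duration_s is not None and declared_duration_s > max_duration_s:
        raise ValueError("declared audio duration exceeds limit")

    max_samples = int(max_duration_s * sample_rate_hz)
    samples = array("f")  # 32-bit float storage
    for frame in frames:
        # Check before extending so memory never grows past the cap.
        if len(samples) + len(frame) > max_samples:
            raise ValueError("decoded audio exceeds duration limit")
        samples.extend(frame)
    return samples


def fake_frames(n_frames: int, frame_len: int = 960) -> Iterator[list]:
    """Stand-in for a decoder: yields silent frames of frame_len samples."""
    for _ in range(n_frames):
        yield [0.0] * frame_len


# 10 frames of 960 samples fit within a 1-second cap at 48 kHz.
ok = decode_bounded(fake_frames(10), sample_rate_hz=48_000, max_duration_s=1.0)
```

The in-loop check matters because duration metadata may be absent or wrong. With a lazy frame source, the error is raised before the oversized buffer is ever built, which is what avoids the large contiguous allocation.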
Preconditions
- network: The attacker must be able to reach the /v1/audio/transcriptions endpoint (no authentication required).
- input: The attacker must upload a compressed audio file (e.g., OPUS) within the 25MB size limit that decodes to many hours of PCM.
Generated on Jun 17, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References
- github.com/advisories/GHSA-6pr9-rp53-2pmc (GHSA, advisory)
- github.com/vllm-project/vllm/commit/1b1359c33269446f13c05da9a90c25174cbea590 (GHSA)
- github.com/vllm-project/vllm/pull/44970 (GHSA)
- github.com/vllm-project/vllm/releases/tag/v0.23.1rc0 (GHSA)
- github.com/vllm-project/vllm/security/advisories/GHSA-6pr9-rp53-2pmc (GHSA)