CVE-2026-5497
Description
vLLM versions 0.8.0+ are vulnerable to DoS via unbounded frame count in video/jpeg data URLs, overwhelming memory.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
Vulnerability
vLLM versions 0.8.0 and later are vulnerable to an Out-of-Memory (OOM) Denial of Service (DoS) attack due to unbounded frame count processing in the VideoMediaIO.load_base64() method. When processing video/jpeg data URLs, the method splits the base64 data string on commas to extract individual JPEG frames without enforcing a frame count limit [1][2]. This allows an attacker to craft a single API request containing thousands of comma-separated base64-encoded JPEG frames.
Exploitation
An attacker can exploit this vulnerability by sending a crafted request to the OpenAI-compatible chat completions API without requiring authentication. The request includes a video/jpeg data URL with many comma-separated base64-encoded JPEG frames. The server then decodes all frames into memory, leading to excessive memory consumption and potential crash [1].
Impact
Successful exploitation results in an Out-of-Memory (OOM) condition, causing a denial of service. The server may crash due to memory exhaustion, leading to service unavailability for legitimate users [2]. No authentication is required to trigger the vulnerability.
Mitigation
The fix is included in commit 58ee614, which enforces a frame count limit in VideoMediaIO [1]. Users should upgrade to a version that contains this commit. The limit applies only when `num_frames` is positive; a value of `-1` keeps the unbounded behavior. The available references do not mention a workaround.
AI Insight generated on Jun 11, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected products (2)
Patches (1)
58ee61422169 (security) Enforce frame limit in VideoMediaIO (#38636)
2 files changed · +69 −10
tests/multimodal/media/test_video.py (+61 −9, modified)
@@ -239,6 +239,17 @@ def test_video_media_io_backend_env_var_fallback(monkeypatch: pytest.MonkeyPatch
     assert metadata_missing["video_backend"] == "test_video_backend_override_2"
 
 
+def _make_jpeg_b64_frames(n: int, width: int = 8, height: int = 8) -> list[str]:
+    """Return *n* tiny base64-encoded JPEG frames."""
+    frames: list[str] = []
+    for i in range(n):
+        img = Image.new("RGB", (width, height), color=(i % 256, 0, 0))
+        buf = io.BytesIO()
+        img.save(buf, format="JPEG")
+        frames.append(pybase64.b64encode(buf.getvalue()).decode("ascii"))
+    return frames
+
+
 def test_load_base64_jpeg_returns_metadata():
     """Regression test: load_base64 with video/jpeg must return metadata.
 
@@ -248,16 +259,8 @@ def test_load_base64_jpeg_returns_metadata():
     """
 
     num_test_frames = 3
-    frame_width, frame_height = 8, 8
-
-    # Build a few tiny JPEG frames and base64-encode them
-    b64_frames = []
-    for i in range(num_test_frames):
-        img = Image.new("RGB", (frame_width, frame_height), color=(i * 80, 0, 0))
-        buf = io.BytesIO()
-        img.save(buf, format="JPEG")
-        b64_frames.append(pybase64.b64encode(buf.getvalue()).decode("ascii"))
 
+    b64_frames = _make_jpeg_b64_frames(num_test_frames)
     data = ",".join(b64_frames)
 
     imageio = ImageMediaIO()
@@ -287,3 +290,52 @@ def test_load_base64_jpeg_returns_metadata():
     # Default fps=1 → duration == num_frames
     assert metadata["fps"] == 1.0
     assert metadata["duration"] == float(num_test_frames)
+
+
+def test_load_base64_jpeg_enforces_num_frames_limit():
+    """Frames beyond num_frames must be truncated in the video/jpeg path.
+
+    Without the limit an attacker can send thousands of base64 JPEG frames
+    in a single request and exhaust server memory (OOM).
+    """
+    num_frames_limit = 4
+    sent_frames = 20
+
+    b64_frames = _make_jpeg_b64_frames(sent_frames)
+    data = ",".join(b64_frames)
+
+    imageio = ImageMediaIO()
+    videoio = VideoMediaIO(imageio, num_frames=num_frames_limit)
+    frames, metadata = videoio.load_base64("video/jpeg", data)
+
+    assert frames.shape[0] == num_frames_limit
+    assert metadata["total_num_frames"] == num_frames_limit
+    assert metadata["frames_indices"] == list(range(num_frames_limit))
+
+
+def test_load_base64_jpeg_no_limit_when_num_frames_negative():
+    """When num_frames is -1, all frames should be loaded without truncation."""
+    sent_frames = 10
+
+    b64_frames = _make_jpeg_b64_frames(sent_frames)
+    data = ",".join(b64_frames)
+
+    imageio = ImageMediaIO()
+    videoio = VideoMediaIO(imageio, num_frames=-1)
+    frames, metadata = videoio.load_base64("video/jpeg", data)
+
+    assert frames.shape[0] == sent_frames
+    assert metadata["total_num_frames"] == sent_frames
+    assert metadata["frames_indices"] == list(range(sent_frames))
+
+
+def test_load_base64_jpeg_raises_on_zero_num_frames():
+    """num_frames=0 is invalid and should raise ValueError."""
+    b64_frames = _make_jpeg_b64_frames(3)
+    data = ",".join(b64_frames)
+
+    imageio = ImageMediaIO()
+    videoio = VideoMediaIO(imageio, num_frames=0)
+
+    with pytest.raises(ValueError, match="num_frames must be greater than 0 or -1"):
+        videoio.load_base64("video/jpeg", data)

vllm/multimodal/media/video.py (+8 −1, modified)
@@ -80,8 +80,15 @@ def load_base64(
                 "image/jpeg",
             )
 
+            if self.num_frames > 0:
+                frame_parts = data.split(",", self.num_frames)[: self.num_frames]
+            elif self.num_frames == 0:
+                raise ValueError("num_frames must be greater than 0 or -1")
+            else:
+                frame_parts = data.split(",")
+
             frames = np.stack(
-                [np.asarray(load_frame(frame_data)) for frame_data in data.split(",")]
+                [np.asarray(load_frame(frame_data)) for frame_data in frame_parts]
             )
             total = int(frames.shape[0])
             fps = float(self.kwargs.get("fps", 1))
Vulnerability mechanics
Root cause
"Missing frame count limit in VideoMediaIO.load_base64() allows unbounded memory allocation from attacker-supplied base64 JPEG frames."
Attack vector
An unauthenticated attacker sends a single request to the OpenAI-compatible chat completions API with a `video/jpeg` data URL containing thousands of comma-separated base64-encoded JPEG frames. The server decodes every frame into memory without enforcing a frame count limit, causing excessive memory consumption and an Out-of-Memory (OOM) crash. This is a classic resource-exhaustion denial-of-service attack reachable over the network with no authentication required.
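To see why frame count alone is enough to exhaust memory, it helps to estimate the size of the decoded array. The following is a rough sketch, not a measurement of vLLM: it assumes every frame decodes to an 8-bit RGB array of the same resolution (the patch's test helper builds RGB frames, and `np.stack` requires uniform shapes). The function name `decoded_video_bytes` and the 1920x1080 resolution are illustrative assumptions.

```python
def decoded_video_bytes(
    num_frames: int, width: int, height: int, channels: int = 3
) -> int:
    """Estimate the bytes held by a stacked uint8 frame array.

    Assumption: one byte per channel per pixel (8-bit RGB), identical
    frame dimensions. Temporary per-frame arrays are not counted.
    """
    return num_frames * height * width * channels


# Hypothetical request: 10,000 frames at 1920x1080.
total = decoded_video_bytes(10_000, 1920, 1080)
print(total)  # 62208000000
print(f"{total / 1e9:.1f} GB")  # 62.2 GB
```

Because JPEG compresses uniform images very well, the request that produces such an array can be far smaller than the decoded result, so request-size limits alone may not bound the allocation.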
Affected code
The vulnerability resides in `vllm/multimodal/media/video.py` in the `VideoMediaIO.load_base64()` method. When processing `video/jpeg` data URLs, the method previously called `data.split(",")` without any limit, allowing an attacker to supply an unbounded number of comma-separated base64-encoded JPEG frames. The patch introduces a frame count limit via `data.split(",", self.num_frames)[:self.num_frames]` when `num_frames > 0`, and raises `ValueError` for `num_frames == 0`.
What the fix does
The patch modifies `VideoMediaIO.load_base64()` in `vllm/multimodal/media/video.py` to limit the number of frames decoded. When `self.num_frames` is positive, the code now calls `data.split(",", self.num_frames)[:self.num_frames]` to truncate the input to at most `num_frames` frames before decoding. A `ValueError` is raised if `num_frames` is zero, and the original unbounded behavior is preserved only when `num_frames` is `-1`. This prevents an attacker from exhausting server memory by sending an arbitrarily large number of frames.
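The truncation relies on standard `str.split` behavior: with `maxsplit=n` the result has at most `n + 1` parts, and the last part holds the unsplit remainder, which the slice then discards. The sketch below reproduces only the branch structure shown in the diff as a standalone function; `select_frame_parts` is a hypothetical name for illustration and is not part of vLLM.

```python
def select_frame_parts(data: str, num_frames: int) -> list[str]:
    """Standalone copy of the branch logic from the patched load_base64()."""
    if num_frames > 0:
        # At most num_frames + 1 parts are created; the final part is the
        # unsplit tail of the input and is dropped by the slice.
        return data.split(",", num_frames)[:num_frames]
    if num_frames == 0:
        raise ValueError("num_frames must be greater than 0 or -1")
    # Negative values (documented as -1) keep the original unbounded split.
    return data.split(",")


data = ",".join(f"frame{i}" for i in range(20))
print(select_frame_parts(data, 4))  # ['frame0', 'frame1', 'frame2', 'frame3']
print(len(select_frame_parts(data, -1)))  # 20
print(select_frame_parts("a,b", 4))  # ['a', 'b']
```

Bounding the split itself, rather than splitting everything and slicing afterwards, means the list of frame strings never grows beyond `num_frames + 1` entries regardless of how many commas the attacker supplies.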
Preconditions
- config: The server must expose the OpenAI-compatible chat completions API endpoint that accepts video/jpeg data URLs.
- auth: No authentication or prior access is required.
- network: The attacker must be able to send HTTP requests to the vLLM server over the network.
- input: The attacker crafts a data URL with thousands of comma-separated base64-encoded JPEG frames.
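The input precondition can be checked cheaply before a request reaches the model server, since counting commas does not require decoding any frame. The sketch below is illustrative only: it is not taken from vLLM or from the cited references, the function names are hypothetical, and it assumes the data URL has the shape `data:video/jpeg;base64,<frame>,<frame>,...`.

```python
def count_video_jpeg_frames(url: str) -> int:
    """Count comma-separated frames in a video/jpeg data URL.

    Returns 0 for anything that is not a video/jpeg data URL.
    Assumption: the payload after the first comma is the frame list.
    """
    if not url.startswith("data:"):
        return 0
    header, sep, payload = url[len("data:"):].partition(",")
    if not sep:
        return 0
    media_type = header.split(";", 1)[0].strip().lower()
    if media_type != "video/jpeg":
        return 0
    # str.count scans the string without building a list of frames.
    return payload.count(",") + 1


def exceeds_frame_limit(url: str, max_frames: int) -> bool:
    """Return True when the URL carries more frames than max_frames."""
    return count_video_jpeg_frames(url) > max_frames


print(count_video_jpeg_frames("data:video/jpeg;base64,AAA,BBB,CCC"))  # 3
print(exceeds_frame_limit("data:video/jpeg;base64,AAA,BBB,CCC", 2))  # True
```

A check like this does not replace upgrading to a build that contains the fix; it only illustrates how the crafted input differs from an ordinary request.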
Generated on Jun 11, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References
News mentions (0)
No linked articles in our index yet.