apk package
chainguard/tritonserver-backend-vllm-meta-cuda-12.9
pkg:apk/chainguard/tritonserver-backend-vllm-meta-cuda-12.9
Vulnerabilities (15)
| CVE | Sev | CVSS | KEV | Affected versions | Fixed in | Published | Description |
|---|---|---|---|---|---|---|---|
| CVE-2025-68131 | | — | | < 25.9.0_git20251112-r4 | 25.9.0_git20251112-r4 | Dec 31, 2025 | cbor2 provides encoding and decoding for the Concise Binary Object Representation (CBOR) serialization format. Starting in version 3.0.0 and prior to version 5.8.0, when a CBORDecoder instance is reused across multiple decode operations, values marked with the shareable tag (28) |
| CVE-2025-68146 | | — | | < 25.9.0_git20251112-r3 | 25.9.0_git20251112-r3 | Dec 16, 2025 | filelock is a platform-independent file lock for Python. In versions prior to 3.20.1, a Time-of-Check-Time-of-Use (TOCTOU) race condition allows local attackers to corrupt or truncate arbitrary user files through symlink attacks. The vulnerability exists in both Unix and Windows |
| CVE-2025-66471 | | — | | < 25.9.0_git20251112-r2 | 25.9.0_git20251112-r2 | Dec 5, 2025 | urllib3 is a user-friendly HTTP client library for Python. Starting in version 1.0 and prior to 2.6.0, the Streaming API improperly handles highly compressed data. urllib3's streaming API is designed for the efficient handling of large HTTP responses by reading the content in chu |
| CVE-2025-66418 | | — | | < 25.9.0_git20251112-r2 | 25.9.0_git20251112-r2 | Dec 5, 2025 | urllib3 is a user-friendly HTTP client library for Python. Starting in version 1.24 and prior to 2.6.0, the number of links in the decompression chain was unbounded allowing a malicious server to insert a virtually unlimited number of compression steps leading to high CPU usage a |
| CVE-2025-62593 | Critical | — | | < 25.9.0_git20251112-r1 | 25.9.0_git20251112-r1 | Nov 26, 2025 | Ray is an AI compute engine. Prior to version 2.52.0, developers working with Ray as a development tool can be exploited via a critical RCE vulnerability exploitable via Firefox and Safari. This vulnerability is due to an insufficient guard against browser-based attacks, as the c |
| CVE-2025-62372 | | — | | < 25.9.0_git20251112-r1 | 25.9.0_git20251112-r1 | Nov 21, 2025 | vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before 0.11.1, users can crash the vLLM engine serving multimodal models by passing multimodal embedding inputs with correct ndim but incorrect shape (e.g. hidden dimension is wrong), |
| CVE-2025-62426 | | — | | < 25.9.0_git20251112-r1 | 25.9.0_git20251112-r1 | Nov 21, 2025 | vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before 0.11.1, the /v1/chat/completions and /tokenize endpoints allow a chat_template_kwargs request parameter that is used in the code before it is properly validated against the chat |
| CVE-2025-62164 | | — | | < 25.9.0_git20251112-r1 | 25.9.0_git20251112-r1 | Nov 21, 2025 | vLLM is an inference and serving engine for large language models (LLMs). From versions 0.10.2 to before 0.11.1, a memory corruption vulnerability that could lead to a crash (denial-of-service) and potentially remote code execution (RCE) exists in the Completions API endpoint. When p |
| CVE-2025-61620 | Medium | — | | < 25.9.0_git20251016-r0 | 25.9.0_git20251016-r0 | Oct 7, 2025 | Summary: A resource-exhaustion (denial-of-service) vulnerability exists in multiple endpoints of the OpenAI-Compatible Server due to the ability to specify Jinja templates via the `chat_template` and `chat_template_kwargs` parameters. If an attacker can supply these parameter |
| CVE-2025-6242 | High | 7.1 | | < 25.9.0_git20251016-r0 | 25.9.0_git20251016-r0 | Oct 7, 2025 | A Server-Side Request Forgery (SSRF) vulnerability exists in the MediaConnector class within the vLLM project's multimodal feature set. The load_from_url and load_from_url_async methods fetch and process media from user-provided URLs without adequate restrictions on the target ho |
| CVE-2025-59425 | | — | | < 25.9.0_git20251016-r0 | 25.9.0_git20251016-r0 | Oct 7, 2025 | vLLM is an inference and serving engine for large language models (LLMs). Before version 0.11.0rc2, the API key support in vLLM performs validation using a method that was vulnerable to a timing attack. API key validation uses a string comparison that takes longer the more charac |
| CVE-2025-58446 | | — | | < 25.7.1_git20251001-r1 | 25.7.1_git20251001-r1 | Sep 6, 2025 | xgrammar is an open-source library for efficient, flexible, and portable structured generation. A grammar optimizer introduced in 0.1.23 processes large grammars (>100k characters) at very low rates, and can be used for DoS of model providers. This issue is fixed in version 0.1.2 |
| CVE-2025-9141 | High | — | | < 25.7.1_git20250821-r1 | 25.7.1_git20250821-r1 | Aug 21, 2025 | Summary: An unsafe deserialization vulnerability allows any authenticated user to execute arbitrary code on the server if they are able to get the model to pass the code as an argument to a tool call. Details: vLLM's [Qwen3 Coder tool parser](https://github.com/vllm-proje |
| CVE-2025-48956 | | — | | < 25.7.1_git20250821-r1 | 25.7.1_git20250821-r1 | Aug 21, 2025 | vLLM is an inference and serving engine for large language models (LLMs). From 0.1.0 to before 0.10.1.1, a Denial of Service (DoS) vulnerability can be triggered by sending a single HTTP GET request with an extremely large header to an HTTP endpoint. This results in server memory |
| CVE-2023-48022 | | — | | < 0 | 0 | Nov 28, 2023 | Anyscale Ray 2.6.3 and 2.8.0 allows a remote attacker to execute arbitrary code via the job submission API. NOTE: the vendor's position is that this report is irrelevant because Ray, as stated in its documentation, is not intended for use outside of a strictly controlled network |
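
Each row pairs an "Affected versions" bound with a "Fixed in" version, so an installed build is affected when its version sorts below the fixed one. The following is a minimal sketch of that check, assuming every version string has the shape seen in the table: a dotted release, an optional `_gitYYYYMMDD` snapshot, and an optional `-rN` revision. The names `ApkVersion`, `parse_version`, and `is_affected` are hypothetical, and the comparison is a simplification for this one shape, not the full apk version-ordering algorithm.

```python
import re
from typing import NamedTuple


class ApkVersion(NamedTuple):
    """Simplified, illustrative view of a version string from the table."""
    release: tuple[int, ...]  # e.g. (25, 9, 0)
    git_date: int             # e.g. 20251112, or 0 when no _git suffix
    revision: int             # e.g. 4 for "-r4", or 0 when absent


# Assumed shape: <dotted release>[_git<YYYYMMDD>][-r<N>]
_PATTERN = re.compile(r"^(\d+(?:\.\d+)*)(?:_git(\d{8}))?(?:-r(\d+))?$")


def parse_version(text: str) -> ApkVersion:
    """Parse a version string; reject anything outside the assumed shape."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported version string: {text!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    git_date = int(match.group(2) or 0)
    revision = int(match.group(3) or 0)
    return ApkVersion(release, git_date, revision)


def is_affected(installed: str, fixed_in: str) -> bool:
    """Return True when the installed version sorts below the fixed version."""
    return parse_version(installed) < parse_version(fixed_in)


if __name__ == "__main__":
    installed = "25.9.0_git20251112-r3"
    for cve, fixed in [
        ("CVE-2025-68131", "25.9.0_git20251112-r4"),
        ("CVE-2025-68146", "25.9.0_git20251112-r3"),
        ("CVE-2023-48022", "0"),
    ]:
        print(cve, "affected" if is_affected(installed, fixed) else "not affected")
```

With this comparison, no installed version sorts below a "Fixed in" value of `0`, so the check returns `False` for the CVE-2023-48022 row regardless of the installed build.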
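
The CVE-2025-59425 entry describes an API key check whose string comparison takes longer as more leading characters match. As a general illustration of the usual mitigation, and not code taken from vLLM, the sketch below compares keys with Python's standard `hmac.compare_digest`, which is designed so that timing does not reveal where the inputs differ. The function name `api_key_matches` is hypothetical.

```python
import hmac


def api_key_matches(presented: str, expected: str) -> bool:
    """Compare two API keys without short-circuiting on the first mismatch."""
    # Encode to bytes so non-ASCII input is handled; compare_digest accepts
    # str arguments only when they are ASCII.
    return hmac.compare_digest(
        presented.encode("utf-8"),
        expected.encode("utf-8"),
    )


if __name__ == "__main__":
    print(api_key_matches("s3cret-key", "s3cret-key"))
    print(api_key_matches("s3cret-kez", "s3cret-key"))
```

A plain `presented == expected` returns the same boolean; the difference lies only in how much the comparison time depends on the position of the first mismatch.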