apk package
chainguard/tritonserver-backend-vllm-24.04
pkg:apk/chainguard/tritonserver-backend-vllm-24.04
Vulnerabilities (25)
| CVE | Sev | CVSS | KEV | Affected versions | Fixed in | Published | Description |
|---|---|---|---|---|---|---|---|
| CVE-2025-29770 | — | | | < 24.04-r6 | 24.04-r6 | Mar 19, 2025 | vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. The outlines library is one of the backends used by vLLM to support structured output (a.k.a. guided decoding). Outlines provides an optional cache for its compiled grammars on the local filesys |
| CVE-2025-25183 | — | | | < 24.04-r9 | 24.04-r9 | Feb 7, 2025 | vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Maliciously constructed statements can lead to hash collisions, resulting in cache reuse, which can interfere with subsequent responses and cause unintended behavior. Prefix caching makes use of |
| CVE-2025-24357 | — | | | < 24.04-r9 | 24.04-r9 | Jan 27, 2025 | vLLM is a library for LLM inference and serving. vllm/model_executor/weight_utils.py implements hf_model_weights_iterator to load the model checkpoint, which is downloaded from huggingface. It uses the torch.load function and the weights_only parameter defaults to False. When tor |
| CVE-2024-8939 | Medium | 6.2 | | < 24.04-r9 | 24.04-r9 | Sep 17, 2024 | A vulnerability was found in the ilab model serve component, where improper handling of the best_of parameter in the vllm JSON web API can lead to a Denial of Service (DoS). The API used for LLM-based sentence or chat completion accepts a best_of parameter to return the best comp |
| CVE-2024-8768 | High | 7.5 | | < 24.04-r9 | 24.04-r9 | Sep 17, 2024 | A flaw was found in the vLLM library. A completions API request with an empty prompt will crash the vLLM API server, resulting in a denial of service. |
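
The "Affected versions" and "Fixed in" columns can be checked mechanically against an installed package version. The following is a minimal sketch, not an official tool: the helper names `parse_version` and `unfixed_cves` are hypothetical, and the comparison assumes versions always have the simple `<dotted upstream>-r<revision>` shape seen in the table rather than implementing the full apk version ordering.

```python
import re

# "Fixed in" versions copied from the table above.
FIXED_IN = {
    "CVE-2025-29770": "24.04-r6",
    "CVE-2025-25183": "24.04-r9",
    "CVE-2025-24357": "24.04-r9",
    "CVE-2024-8939": "24.04-r9",
    "CVE-2024-8768": "24.04-r9",
}

# Assumption: versions look like "24.04-r6" (dotted numbers, then "-r<N>").
_VERSION_RE = re.compile(r"^(\d+(?:\.\d+)*)-r(\d+)$")


def parse_version(version: str) -> tuple[tuple[int, ...], int]:
    """Split a version such as '24.04-r6' into ((24, 4), 6) for comparison."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unsupported version format: {version!r}")
    upstream = tuple(int(part) for part in match.group(1).split("."))
    return upstream, int(match.group(2))


def unfixed_cves(installed: str) -> list[str]:
    """Return the CVEs whose fixed version is newer than the installed one."""
    current = parse_version(installed)
    return sorted(
        cve for cve, fixed in FIXED_IN.items() if current < parse_version(fixed)
    )


if __name__ == "__main__":
    for version in ("24.04-r5", "24.04-r6", "24.04-r9"):
        print(version, unfixed_cves(version))
```

Under these assumptions, an installed `24.04-r9` reports no unfixed CVEs from this page, while `24.04-r6` still reports the four entries fixed in `24.04-r9`.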