VYPR
Critical severity · NVD Advisory · Published Apr 30, 2025 · Updated Apr 30, 2025

vLLM Vulnerable to Remote Code Execution via Mooncake Integration

CVE-2025-32444

Description

vLLM is a high-throughput, memory-efficient inference and serving engine for LLMs. In versions from 0.6.5 up to, but not including, 0.8.5, deployments that use the Mooncake integration are vulnerable to remote code execution because the integration exchanges pickle-serialized data over unsecured ZeroMQ sockets. The vulnerable sockets were bound to all network interfaces, which makes it more likely that an attacker can reach them and carry out an attack. vLLM instances that do not use the Mooncake integration are not vulnerable. The issue is patched in version 0.8.5.
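
To illustrate why pickle-based serialization is unsafe on a socket that untrusted peers can reach, the following minimal sketch shows that unpickling invokes whatever callable the sender names. The `Payload` class and the use of `sorted` are illustrative assumptions chosen to be harmless; they are not taken from vLLM or from any real exploit.

```python
import pickle


class Payload:
    """Hypothetical object whose pickled form names a callable to run."""

    def __reduce__(self):
        # pickle stores "call sorted([3, 1, 2])" instead of object state.
        # An attacker would name a far more dangerous callable here.
        return (sorted, ([3, 1, 2],))


# Bytes a remote peer could place on the wire.
wire_bytes = pickle.dumps(Payload())

# pyzmq's recv_pyobj() applies pickle.loads() to the received bytes,
# so the sender-chosen callable runs inside the receiving process.
result = pickle.loads(wire_bytes)

# The receiver gets the return value of the call, not a Payload instance.
print(type(result).__name__, result)
```

The receiver never opts in to the call: deserializing the bytes is enough to trigger it, which is why the fix stops passing socket data to pickle at all.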

Affected packages

Versions sourced from the GitHub Security Advisory.

Package        Affected versions    Patched versions
vllm (PyPI)    >= 0.6.5, < 0.8.5    0.8.5

Patches

a5450f11c958

[Security] Use safe serialization and fix zmq setup for mooncake pipe (#17192)

https://github.com/vllm-project/vllm · Russell Bryant · Apr 25, 2025 · via ghsa
1 file changed · +13 -8
  • vllm/distributed/kv_transfer/kv_pipe/mooncake_pipe.py (+13 -8, modified)
    @@ -2,6 +2,7 @@
     
     import json
     import os
    +import struct
     from concurrent.futures import ThreadPoolExecutor
     from dataclasses import dataclass
     from typing import Optional, Union
    @@ -115,14 +116,14 @@ def _setup_metadata_sockets(self, kv_rank: int, p_host: str, p_port: str,
             p_rank_offset = int(p_port) + 8 + self.local_rank * 2
             d_rank_offset = int(d_port) + 8 + self.local_rank * 2
             if kv_rank == 0:
    -            self.sender_socket.bind(f"tcp://*:{p_rank_offset + 1}")
    +            self.sender_socket.bind(f"tcp://{p_host}:{p_rank_offset + 1}")
                 self.receiver_socket.connect(f"tcp://{d_host}:{d_rank_offset + 1}")
                 self.sender_ack.connect(f"tcp://{d_host}:{d_rank_offset + 2}")
    -            self.receiver_ack.bind(f"tcp://*:{p_rank_offset + 2}")
    +            self.receiver_ack.bind(f"tcp://{p_host}:{p_rank_offset + 2}")
             else:
                 self.receiver_socket.connect(f"tcp://{p_host}:{p_rank_offset + 1}")
    -            self.sender_socket.bind(f"tcp://*:{d_rank_offset + 1}")
    -            self.receiver_ack.bind(f"tcp://*:{d_rank_offset + 2}")
    +            self.sender_socket.bind(f"tcp://{d_host}:{d_rank_offset + 1}")
    +            self.receiver_ack.bind(f"tcp://{d_host}:{d_rank_offset + 2}")
                 self.sender_ack.connect(f"tcp://{p_host}:{p_rank_offset + 2}")
     
         def initialize(self, local_hostname: str, metadata_server: str,
    @@ -176,7 +177,7 @@ def read_bytes_from_buffer(self, buffer: int, length: int) -> bytes:
     
         def wait_for_ack(self, src_ptr: int, length: int) -> None:
             """Asynchronously wait for ACK from the receiver."""
    -        ack = self.sender_ack.recv_pyobj()
    +        ack = self.sender_ack.recv()
             if ack != b'ACK':
                 logger.error("Failed to receive ACK from the receiver")
     
    @@ -187,18 +188,22 @@ def send_bytes(self, user_data: bytes) -> None:
             length = len(user_data)
             src_ptr = self.allocate_managed_buffer(length)
             self.write_bytes_to_buffer(src_ptr, user_data, length)
    -        self.sender_socket.send_pyobj((src_ptr, length))
    +        self.sender_socket.send_multipart(
    +            [struct.pack("!Q", src_ptr),
    +             struct.pack("!Q", length)])
             self.buffer_cleaner.submit(self.wait_for_ack, src_ptr, length)
     
         def recv_bytes(self) -> bytes:
             """Receive bytes from the remote process."""
    -        src_ptr, length = self.receiver_socket.recv_pyobj()
    +        data = self.receiver_socket.recv_multipart()
    +        src_ptr = struct.unpack("!Q", data[0])[0]
    +        length = struct.unpack("!Q", data[1])[0]
             dst_ptr = self.allocate_managed_buffer(length)
             self.transfer_sync(dst_ptr, src_ptr, length)
             ret = self.read_bytes_from_buffer(dst_ptr, length)
     
             # Buffer cleanup
    -        self.receiver_ack.send_pyobj(b'ACK')
    +        self.receiver_ack.send(b'ACK')
             self.free_managed_buffer(dst_ptr, length)
     
             return ret
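
The framing the patch adopts can be exercised without a socket. The sketch below mirrors the `struct.pack("!Q", ...)` calls from the diff; the helper names `encode_transfer` and `decode_transfer`, and the explicit frame-count and frame-size checks, are assumptions added for illustration and are not part of the patch.

```python
import struct

# "!Q": network byte order, unsigned 64-bit integer (8 bytes per frame).
_U64 = struct.Struct("!Q")


def encode_transfer(src_ptr: int, length: int) -> list[bytes]:
    """Build the two frames that send_multipart() would transmit."""
    return [_U64.pack(src_ptr), _U64.pack(length)]


def decode_transfer(frames: list[bytes]) -> tuple[int, int]:
    """Parse frames from recv_multipart(); reject anything malformed."""
    # Hypothetical extra validation: exactly two 8-byte frames.
    if len(frames) != 2 or any(len(f) != _U64.size for f in frames):
        raise ValueError("malformed transfer message")
    src_ptr = _U64.unpack(frames[0])[0]
    length = _U64.unpack(frames[1])[0]
    return src_ptr, length


frames = encode_transfer(0x1000, 4096)
decoded = decode_transfer(frames)

# Arbitrary attacker-supplied bytes are rejected or parsed as plain
# integers; no code from the sender is ever executed.
try:
    decode_transfer([b"not two fixed-size frames"])
    rejected = False
except ValueError:
    rejected = True
```

Unlike `recv_pyobj`, this decoding path can only ever yield two integers, so a hostile peer is limited to supplying wrong numbers rather than running code.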
    

