VYPR
Critical severity · NVD Advisory · Published Nov 18, 2025 · Updated Nov 19, 2025

CVE-2025-60455

Description

An unsafe deserialization vulnerability in Modular Max Serve before 25.6 allows attackers to execute arbitrary code when the "--experimental-enable-kvcache-agent" feature is enabled.

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

Unsafe deserialization in Modular Max Serve before 25.6 with experimental KV cache agent allows remote code execution.

Vulnerability

CVE-2025-60455 is an unsafe deserialization vulnerability in Modular Max Serve prior to version 25.6. The flaw occurs when the experimental feature --experimental-enable-kvcache-agent is enabled. The root cause is the use of Python's pickle.loads for deserializing data received over ZMQ sockets [2][4]. Pickle is inherently unsafe because it can execute arbitrary code during deserialization [2].
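
As a harmless illustration of why calling pickle.loads on untrusted bytes is dangerous, consider the minimal sketch below. It is not taken from the Modular codebase, and the Payload class is hypothetical. Its __reduce__ method tells the unpickler which callable to invoke, so the receiving side runs code chosen by the sender instead of rebuilding a plain data object.

```python
import pickle


class Payload:
    """Hypothetical sender-controlled object (illustration only)."""

    def __reduce__(self):
        # The unpickler will call eval("6 * 7") on the receiving side.
        # A real attacker would substitute a harmful callable here.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())  # bytes a sender could put on the wire
result = pickle.loads(blob)     # receiver "deserializes" and runs eval
print(type(result).__name__, result)
```

The value returned by pickle.loads is whatever the chosen callable returns, not a Payload instance, which shows that the byte stream, not the receiver, decides what gets executed.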

Attack Vector

An attacker can exploit this by sending a maliciously crafted pickle payload to an affected ZMQ socket. The vulnerability does not require authentication, as the ZMQ sockets are unauthenticated [2]. The attack surface is limited to instances where the experimental KV cache agent feature is active, but any network-accessible instance meeting that condition is potentially vulnerable. The bug was introduced through code reuse patterns across multiple AI ecosystem projects, a pattern researchers call "ShadowMQ" [2].

Impact

Successful exploitation allows an attacker to execute arbitrary code on the server running Modular Max Serve. This could lead to full compromise of the affected system, including data exfiltration, service disruption, or lateral movement within the network.

Mitigation

The vulnerability is fixed in Modular Max Serve version 25.6. The patch removes pickle as the default serializer and deserializer for ZMQ sockets and requires callers to pass one explicitly [4]. The first fix commit (ee9c4ab02345) still retained pickle.loads for one internal message type, KVCacheChangeMessage, which its code comment says cannot be serialized by msgspec/msgpack [4]. A follow-up commit (10620059fb5c) replaced that last use with msgpack_numpy_decoder. Users should upgrade to version 25.6 or later and avoid enabling the --experimental-enable-kvcache-agent flag on unpatched versions.
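
The general idea behind the fix, decoding into a declared data type instead of letting the byte stream name arbitrary callables, can be sketched with the standard library alone. The example below is an assumption-laden stand-in: it uses JSON rather than msgpack, and CacheEvent and decode_cache_event are hypothetical names that do not appear in the Modular codebase. The implementation of msgpack_numpy_decoder is not shown in this advisory.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheEvent:
    """Hypothetical message type used only for this illustration."""

    cache_id: str
    num_blocks: int


def decode_cache_event(raw: bytes) -> CacheEvent:
    """Decode bytes into a CacheEvent, rejecting anything else."""
    obj = json.loads(raw)  # data-only format: no callables are invoked
    if not isinstance(obj, dict) or set(obj) != {"cache_id", "num_blocks"}:
        raise ValueError("unexpected message shape")
    cache_id = obj["cache_id"]
    num_blocks = obj["num_blocks"]
    if not isinstance(cache_id, str):
        raise ValueError("cache_id must be a string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(num_blocks, bool) or not isinstance(num_blocks, int):
        raise ValueError("num_blocks must be an integer")
    return CacheEvent(cache_id=cache_id, num_blocks=num_blocks)


event = decode_cache_event(b'{"cache_id": "gpu0", "num_blocks": 8}')
print(event)
```

A payload that is not valid JSON, or that has the wrong fields or types, raises ValueError instead of being executed or silently accepted.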

AI Insight generated on May 19, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.

Affected packages

Versions sourced from the GitHub Security Advisory.

Package          Affected versions   Patched versions
modular (PyPI)   < 25.6.0            25.6.0

Patches

3
b20e749fa892

[KVCacheAgent] Deserialize KV Cache Events to int for Now

https://github.com/modular/modular · kcaverly · Aug 13, 2025 · via ghsa
1 file changed · +1 -4
  • max/serve/kvcache_agent/kvcache_agent.py +1 -4 modified
    @@ -214,13 +214,10 @@ def __init__(
                 config: Configuration for the server.
             """
             self.config = config
    -        # This remains the only use of pickle in the current codebase.
    -        # As the KVCacheChangeMessage contains Protobuf Enum's it cannot be
    -        # serialized by msgspec/msgpack.
             self._kv_cache_events_pull_socket = ZmqPullSocket[KVCacheChangeMessage](
                 zmq_endpoint=kv_cache_events_zmq_endpoint,
                 # GENAI-233: This is currently non-functional.
    -            deserialize=msgpack_numpy_decoder(KVCacheChangeMessage),
    +            deserialize=msgpack_numpy_decoder(int),
             )
             self.server = grpc.server(
                 concurrent.futures.ThreadPoolExecutor(
    
10620059fb5c

[KVCacheAgent] Remove pickle deserialization in KV Cache Agent

https://github.com/modular/modular · kcaverly · Aug 13, 2025 · via ghsa
1 file changed · +3 -2
  • max/serve/kvcache_agent/kvcache_agent.py +3 -2 modified
    @@ -13,14 +13,14 @@
     
     import concurrent.futures
     import logging
    -import pickle
     import queue
     import threading
     from collections.abc import Iterator
     from dataclasses import dataclass
     from typing import Any
     
     import grpc
    +from max.interfaces import msgpack_numpy_decoder
     from max.serve.kvcache_agent.kvcache_agent_service_v1_pb2 import (  # type: ignore
         KVCacheStateUpdate,
         MemoryTier,
    @@ -219,7 +219,8 @@ def __init__(
             # serialized by msgspec/msgpack.
             self._kv_cache_events_pull_socket = ZmqPullSocket[KVCacheChangeMessage](
                 zmq_endpoint=kv_cache_events_zmq_endpoint,
    -            deserialize=pickle.loads,
    +            # GENAI-233: This is currently non-functional.
    +            deserialize=msgpack_numpy_decoder(KVCacheChangeMessage),
             )
             self.server = grpc.server(
                 concurrent.futures.ThreadPoolExecutor(
    
ee9c4ab02345

[Serialization] Remove pickle default serialization within Zmq Sockets

https://github.com/modular/modular · kcaverly · Jul 30, 2025 · via ghsa
6 files changed · +30 -23
  • max/serve/kvcache_agent/dispatcher_client.py +2 -2 modified
    @@ -54,10 +54,10 @@ def __init__(
             """Initialize dispatcher client with ZMQ sockets for communication."""
             self.pull_socket = ZmqPullSocket[
                 DispatcherMessage[DispatcherMessagePayload]
    -        ](zmq_ctx, recv_endpoint, deserialize=deserialize)
    +        ](zmq_ctx, zmq_endpoint=recv_endpoint, deserialize=deserialize)
             self.push_socket = ZmqPushSocket[
                 DispatcherMessage[DispatcherMessagePayload]
    -        ](zmq_ctx, send_endpoint, serialize=serialize)
    +        ](zmq_ctx, zmq_endpoint=send_endpoint, serialize=serialize)
     
             # Request handlers
             self._request_handlers: dict[
    
  • max/serve/kvcache_agent/dispatcher_service.py +2 -2 modified
    @@ -76,10 +76,10 @@ def __init__(
     
             self.local_pull_socket = ZmqPullSocket[
                 DispatcherMessage[DispatcherMessagePayload]
    -        ](zmq_ctx, recv_endpoint, deserialize=deserialize)
    +        ](zmq_ctx, zmq_endpoint=recv_endpoint, deserialize=deserialize)
             self.local_push_socket = ZmqPushSocket[
                 DispatcherMessage[DispatcherMessagePayload]
    -        ](zmq_ctx, send_endpoint, serialize=serialize)
    +        ](zmq_ctx, zmq_endpoint=send_endpoint, serialize=serialize)
     
             self._running = False
             self._tasks: list[asyncio.Task] = []
    
  • max/serve/kvcache_agent/kvcache_agent.py +7 -1 modified
    @@ -13,6 +13,7 @@
     
     import concurrent.futures
     import logging
    +import pickle
     import queue
     import threading
     from collections.abc import Iterator
    @@ -215,8 +216,13 @@ def __init__(
                 config: Configuration for the server.
             """
             self.config = config
    +        # This remains the only use of pickle in the current codebase.
    +        # As the KVCacheChangeMessage contains Protobuf Enum's it cannot be
    +        # serialized by msgspec/msgpack.
             self._kv_cache_events_pull_socket = ZmqPullSocket[KVCacheChangeMessage](
    -            zmq_ctx, kv_cache_events_zmq_endpoint
    +            zmq_ctx,
    +            zmq_endpoint=kv_cache_events_zmq_endpoint,
    +            deserialize=pickle.loads,
             )
             self.server = grpc.server(
                 concurrent.futures.ThreadPoolExecutor(
    
  • max/serve/queue/zmq_queue.py +4 -2 modified
    @@ -142,8 +142,9 @@ class ZmqPushSocket(Generic[T]):
         def __init__(
             self,
             zmq_ctx: zmq.Context,
    +        *,
    +        serialize: Callable[[Any], bytes],
             zmq_endpoint: Optional[str] = None,
    -        serialize: Callable[[Any], bytes] = pickle.dumps,
         ) -> None:
             self.zmq_endpoint = (
                 zmq_endpoint
    @@ -216,8 +217,9 @@ class ZmqPullSocket(Generic[T]):
         def __init__(
             self,
             zmq_ctx: zmq.Context,
    +        *,
    +        deserialize: Callable[[Any], Any],
             zmq_endpoint: Optional[str] = None,
    -        deserialize=pickle.loads,  # noqa: ANN001
         ) -> None:
             self.zmq_endpoint = (
                 zmq_endpoint
    
  • max/serve/scheduler/audio_generation_scheduler.py +6 -2 modified
    @@ -31,6 +31,7 @@
         AudioGeneratorOutput,
         SchedulerResult,
         msgpack_numpy_decoder,
    +    msgpack_numpy_encoder,
     )
     from max.nn.kv_cache import PagedKVCacheManager
     from max.pipelines.core import TTSContext
    @@ -220,8 +221,11 @@ def __init__(
             )
             self.response_q = ZmqPushSocket[
                 dict[str, SchedulerResult[AudioGeneratorOutput]]
    -        ](zmq_ctx=zmq_ctx, zmq_endpoint=response_zmq_endpoint)
    -
    +        ](
    +            zmq_ctx=zmq_ctx,
    +            zmq_endpoint=response_zmq_endpoint,
    +            serialize=msgpack_numpy_encoder(),
    +        )
             self.cancel_q = ZmqPullSocket[list[str]](
                 zmq_ctx=zmq_ctx,
                 zmq_endpoint=cancel_zmq_endpoint,
    
  • max/serve/scheduler/queues.py +9 -14 modified
    @@ -66,25 +66,20 @@ def __init__(
             # Create Queues
             self.request_push_socket = ZmqPushSocket[tuple[ReqId, ReqOutput]](
                 zmq_ctx,
    -            request_zmq_endpoint,
    +            zmq_endpoint=request_zmq_endpoint,
                 serialize=msgpack_numpy_encoder(use_shared_memory=True),
             )
     
    -        # TODO: Fix Pickle Deserialization for AUDIO_GENERATION
    -        if pipeline_task == PipelineTask.AUDIO_GENERATION:
    -            self.response_pull_socket = ZmqPullSocket[dict[ReqId, ReqOutput]](
    -                zmq_ctx,
    -                response_zmq_endpoint,
    -            )
    -        else:
    -            self.response_pull_socket = ZmqPullSocket[dict[ReqId, ReqOutput]](
    -                zmq_ctx,
    -                response_zmq_endpoint,
    -                deserialize=msgpack_numpy_decoder(pipeline_task.output_type),
    -            )
    +        self.response_pull_socket = ZmqPullSocket[dict[ReqId, ReqOutput]](
    +            zmq_ctx,
    +            zmq_endpoint=response_zmq_endpoint,
    +            deserialize=msgpack_numpy_decoder(pipeline_task.output_type),
    +        )
     
             self.cancel_push_socket = ZmqPushSocket[list[str]](
    -            zmq_ctx, cancel_zmq_endpoint, serialize=msgpack_numpy_encoder()
    +            zmq_ctx,
    +            zmq_endpoint=cancel_zmq_endpoint,
    +            serialize=msgpack_numpy_encoder(),
             )
     
             self.pending_out_queues: dict[ReqId, asyncio.Queue] = {}
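
The zmq_queue.py hunks in this commit turn serialize and deserialize into required keyword-only parameters, so a socket can no longer fall back to pickle silently. The pattern can be sketched in isolation as follows. PullSocketSketch and its handle method are hypothetical and involve no real ZMQ socket; only the signature shape mirrors the patch.

```python
from typing import Any, Callable, Optional


class PullSocketSketch:
    """Hypothetical stand-in for a socket wrapper; no real ZMQ involved."""

    def __init__(
        self,
        *,
        deserialize: Callable[[bytes], Any],  # required, no default
        endpoint: Optional[str] = None,
    ) -> None:
        self.endpoint = endpoint
        self._deserialize = deserialize

    def handle(self, raw: bytes) -> Any:
        # Every received frame goes through the caller-chosen decoder.
        return self._deserialize(raw)


# Callers must now state how bytes are decoded.
sock = PullSocketSketch(deserialize=lambda raw: int(raw.decode("ascii")))
value = sock.handle(b"7")

try:
    PullSocketSketch()  # omitting the decoder is an error, not a pickle fallback
    omitted_ok = True
except TypeError:
    omitted_ok = False
print(value, omitted_ok)
```

Removing the default forces each call site to make an explicit decision about its wire format, which is why the same commit touches every constructor call.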
    

Vulnerability mechanics

Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.

References

7
