CVE-2025-60455
Description
An unsafe deserialization vulnerability in Modular Max Serve before 25.6, specifically when the "--experimental-enable-kvcache-agent" feature is used, allows attackers to execute arbitrary code.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
Unsafe deserialization in Modular Max Serve before 25.6, when the experimental KV cache agent is enabled, allows remote code execution.
Vulnerability
CVE-2025-60455 is an unsafe deserialization vulnerability in Modular Max Serve prior to version 25.6. The flaw occurs when the experimental feature --experimental-enable-kvcache-agent is enabled. The root cause is the use of Python's pickle.loads for deserializing data received over ZMQ sockets [2][4]. Pickle is inherently unsafe because it can execute arbitrary code during deserialization [2].
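To illustrate why unpickling untrusted bytes is dangerous, the following minimal sketch uses a deliberately harmless payload. The `Payload` class is a hypothetical name for illustration and is not taken from Max Serve; the expression `6 * 7` stands in for whatever code an attacker would choose.

```python
import pickle


class Payload:
    """Hypothetical attacker-controlled object (illustration only)."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call eval('6 * 7')".
        # Any importable callable and arguments could be placed here.
        return (eval, ("6 * 7",))


# Bytes like these could arrive over any channel the receiver unpickles from.
data = pickle.dumps(Payload())

# The callable runs inside pickle.loads, before the receiver can inspect
# or type-check the result.
result = pickle.loads(data)
print(result)  # 42, not a Payload instance
```

The receiver never gets a chance to validate the message: the call happens as part of decoding, which is why type checks applied after `pickle.loads` do not help.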
Attack Vector
An attacker can exploit this by sending a maliciously crafted pickle payload to an affected ZMQ socket. The vulnerability does not require authentication, as the ZMQ sockets are unauthenticated [2]. The attack surface is limited to instances where the experimental KV cache agent feature is active, but any network-accessible instance meeting that condition is potentially vulnerable. The bug was introduced through code reuse patterns across multiple AI ecosystem projects, a pattern researchers call "ShadowMQ" [2].
Impact
Successful exploitation allows an attacker to execute arbitrary code on the server running Modular Max Serve. This could lead to full compromise of the affected system, including data exfiltration, service disruption, or lateral movement within the network.
Mitigation
The vulnerability is fixed in Modular Max Serve version 25.6. The patch removes pickle as the default serializer and deserializer for ZMQ sockets and replaces it with a safer alternative [4]. In the first fix commit (ee9c4ab02345), pickle is retained only for a specific internal message type (KVCacheChangeMessage) that, according to the code comment, cannot be serialized by msgspec/msgpack [4]. A follow-up commit (10620059fb5c) replaces that remaining pickle.loads call with a msgpack-based decoder. Users should upgrade to version 25.6 or later and should not enable the --experimental-enable-kvcache-agent flag on unpatched versions.
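The idea behind replacing pickle with a typed decoder can be sketched as follows. This is not Modular's code: the patch uses the project's own `msgpack_numpy_decoder` helper, while this sketch substitutes the standard-library `json` module so that it runs without extra dependencies. The `CacheEvent` schema and `decode_cache_event` function are hypothetical names chosen for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheEvent:
    """Hypothetical message schema (illustration only)."""

    block_id: int
    tier: str


def decode_cache_event(raw: bytes) -> CacheEvent:
    """Parse bytes as plain data, then validate them against a fixed schema."""
    # json.loads can only produce dicts, lists, strings, numbers, bools and
    # None; it never calls sender-chosen functions.
    obj = json.loads(raw)
    if not isinstance(obj, dict) or set(obj) != {"block_id", "tier"}:
        raise ValueError("unexpected message shape")
    block_id, tier = obj["block_id"], obj["tier"]
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(block_id, bool) or not isinstance(block_id, int):
        raise ValueError("block_id must be an integer")
    if not isinstance(tier, str):
        raise ValueError("tier must be a string")
    return CacheEvent(block_id=block_id, tier=tier)


event = decode_cache_event(b'{"block_id": 7, "tier": "host"}')
print(event)
```

With a decoder of this kind, a pickle byte stream sent to the socket is rejected with a `ValueError` instead of being executed.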
AI Insight generated on May 19, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected packages
Versions sourced from the GitHub Security Advisory.
| Package | Affected versions | Patched versions |
|---|---|---|
| modular (PyPI) | < 25.6.0 | 25.6.0 |
Affected products
- Modular/Max Serve (source: description)
- Range: <25.6
Patches
b20e749fa892 [KVCacheAgent] Deserialize KV Cache Events to int for Now
1 file changed · +1 −4
max/serve/kvcache_agent/kvcache_agent.py +1 −4 modified

    @@ -214,13 +214,10 @@ def __init__(
                 config: Configuration for the server.
             """
             self.config = config
    -        # This remains the only use of pickle in the current codebase.
    -        # As the KVCacheChangeMessage contains Protobuf Enum's it cannot be
    -        # serialized by msgspec/msgpack.
             self._kv_cache_events_pull_socket = ZmqPullSocket[KVCacheChangeMessage](
                 zmq_endpoint=kv_cache_events_zmq_endpoint,
                 # GENAI-233: This is currently non-functional.
    -            deserialize=msgpack_numpy_decoder(KVCacheChangeMessage),
    +            deserialize=msgpack_numpy_decoder(int),
             )
             self.server = grpc.server(
                 concurrent.futures.ThreadPoolExecutor(
10620059fb5c [KVCacheAgent] Remove pickle deserialization in KV Cache Agent
1 file changed · +3 −2
max/serve/kvcache_agent/kvcache_agent.py +3 −2 modified

    @@ -13,14 +13,14 @@

     import concurrent.futures
     import logging
    -import pickle
     import queue
     import threading
     from collections.abc import Iterator
     from dataclasses import dataclass
     from typing import Any

     import grpc
    +from max.interfaces import msgpack_numpy_decoder
     from max.serve.kvcache_agent.kvcache_agent_service_v1_pb2 import (  # type: ignore
         KVCacheStateUpdate,
         MemoryTier,
    @@ -219,7 +219,8 @@ def __init__(
             # serialized by msgspec/msgpack.
             self._kv_cache_events_pull_socket = ZmqPullSocket[KVCacheChangeMessage](
                 zmq_endpoint=kv_cache_events_zmq_endpoint,
    -            deserialize=pickle.loads,
    +            # GENAI-233: This is currently non-functional.
    +            deserialize=msgpack_numpy_decoder(KVCacheChangeMessage),
             )
             self.server = grpc.server(
                 concurrent.futures.ThreadPoolExecutor(
ee9c4ab02345 [Serialization] Remove pickle default serialization within Zmq Sockets
6 files changed · +30 −23
max/serve/kvcache_agent/dispatcher_client.py +2 −2 modified

    @@ -54,10 +54,10 @@ def __init__(
             """Initialize dispatcher client with ZMQ sockets for communication."""
             self.pull_socket = ZmqPullSocket[
                 DispatcherMessage[DispatcherMessagePayload]
    -        ](zmq_ctx, recv_endpoint, deserialize=deserialize)
    +        ](zmq_ctx, zmq_endpoint=recv_endpoint, deserialize=deserialize)
             self.push_socket = ZmqPushSocket[
                 DispatcherMessage[DispatcherMessagePayload]
    -        ](zmq_ctx, send_endpoint, serialize=serialize)
    +        ](zmq_ctx, zmq_endpoint=send_endpoint, serialize=serialize)

             # Request handlers
             self._request_handlers: dict[

max/serve/kvcache_agent/dispatcher_service.py +2 −2 modified

    @@ -76,10 +76,10 @@ def __init__(

             self.local_pull_socket = ZmqPullSocket[
                 DispatcherMessage[DispatcherMessagePayload]
    -        ](zmq_ctx, recv_endpoint, deserialize=deserialize)
    +        ](zmq_ctx, zmq_endpoint=recv_endpoint, deserialize=deserialize)
             self.local_push_socket = ZmqPushSocket[
                 DispatcherMessage[DispatcherMessagePayload]
    -        ](zmq_ctx, send_endpoint, serialize=serialize)
    +        ](zmq_ctx, zmq_endpoint=send_endpoint, serialize=serialize)

             self._running = False
             self._tasks: list[asyncio.Task] = []

max/serve/kvcache_agent/kvcache_agent.py +7 −1 modified

    @@ -13,6 +13,7 @@

     import concurrent.futures
     import logging
    +import pickle
     import queue
     import threading
     from collections.abc import Iterator
    @@ -215,8 +216,13 @@ def __init__(
                 config: Configuration for the server.
             """
             self.config = config
    +        # This remains the only use of pickle in the current codebase.
    +        # As the KVCacheChangeMessage contains Protobuf Enum's it cannot be
    +        # serialized by msgspec/msgpack.
             self._kv_cache_events_pull_socket = ZmqPullSocket[KVCacheChangeMessage](
    -            zmq_ctx, kv_cache_events_zmq_endpoint
    +            zmq_ctx,
    +            zmq_endpoint=kv_cache_events_zmq_endpoint,
    +            deserialize=pickle.loads,
             )
             self.server = grpc.server(
                 concurrent.futures.ThreadPoolExecutor(

max/serve/queue/zmq_queue.py +4 −2 modified

    @@ -142,8 +142,9 @@ class ZmqPushSocket(Generic[T]):
         def __init__(
             self,
             zmq_ctx: zmq.Context,
    +        *,
    +        serialize: Callable[[Any], bytes],
             zmq_endpoint: Optional[str] = None,
    -        serialize: Callable[[Any], bytes] = pickle.dumps,
         ) -> None:
             self.zmq_endpoint = (
                 zmq_endpoint
    @@ -216,8 +217,9 @@ class ZmqPullSocket(Generic[T]):
         def __init__(
             self,
             zmq_ctx: zmq.Context,
    +        *,
    +        deserialize: Callable[[Any], Any],
             zmq_endpoint: Optional[str] = None,
    -        deserialize=pickle.loads,  # noqa: ANN001
         ) -> None:
             self.zmq_endpoint = (
                 zmq_endpoint

max/serve/scheduler/audio_generation_scheduler.py +6 −2 modified

    @@ -31,6 +31,7 @@
         AudioGeneratorOutput,
         SchedulerResult,
         msgpack_numpy_decoder,
    +    msgpack_numpy_encoder,
     )
     from max.nn.kv_cache import PagedKVCacheManager
     from max.pipelines.core import TTSContext
    @@ -220,8 +221,11 @@ def __init__(
             )
             self.response_q = ZmqPushSocket[
                 dict[str, SchedulerResult[AudioGeneratorOutput]]
    -        ](zmq_ctx=zmq_ctx, zmq_endpoint=response_zmq_endpoint)
    -
    +        ](
    +            zmq_ctx=zmq_ctx,
    +            zmq_endpoint=response_zmq_endpoint,
    +            serialize=msgpack_numpy_encoder(),
    +        )
             self.cancel_q = ZmqPullSocket[list[str]](
                 zmq_ctx=zmq_ctx,
                 zmq_endpoint=cancel_zmq_endpoint,

max/serve/scheduler/queues.py +9 −14 modified

    @@ -66,25 +66,20 @@ def __init__(
             # Create Queues
             self.request_push_socket = ZmqPushSocket[tuple[ReqId, ReqOutput]](
                 zmq_ctx,
    -            request_zmq_endpoint,
    +            zmq_endpoint=request_zmq_endpoint,
                 serialize=msgpack_numpy_encoder(use_shared_memory=True),
             )

    -        # TODO: Fix Pickle Deserialization for AUDIO_GENERATION
    -        if pipeline_task == PipelineTask.AUDIO_GENERATION:
    -            self.response_pull_socket = ZmqPullSocket[dict[ReqId, ReqOutput]](
    -                zmq_ctx,
    -                response_zmq_endpoint,
    -            )
    -        else:
    -            self.response_pull_socket = ZmqPullSocket[dict[ReqId, ReqOutput]](
    -                zmq_ctx,
    -                response_zmq_endpoint,
    -                deserialize=msgpack_numpy_decoder(pipeline_task.output_type),
    -            )
    +        self.response_pull_socket = ZmqPullSocket[dict[ReqId, ReqOutput]](
    +            zmq_ctx,
    +            zmq_endpoint=response_zmq_endpoint,
    +            deserialize=msgpack_numpy_decoder(pipeline_task.output_type),
    +        )

             self.cancel_push_socket = ZmqPushSocket[list[str]](
    -            zmq_ctx, cancel_zmq_endpoint, serialize=msgpack_numpy_encoder()
    +            zmq_ctx,
    +            zmq_endpoint=cancel_zmq_endpoint,
    +            serialize=msgpack_numpy_encoder(),
             )

             self.pending_out_queues: dict[ReqId, asyncio.Queue] = {}
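The `zmq_queue.py` change in commit ee9c4ab02345 turns the serializer and deserializer into required keyword-only arguments, so a socket can no longer fall back to pickle silently. A minimal sketch of that design, with no networking involved, might look like this. `PullChannel` and `decode` are hypothetical names for illustration and are not part of Max Serve.

```python
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class PullChannel(Generic[T]):
    """Hypothetical stand-in for a pull socket (illustration only)."""

    def __init__(self, *, deserialize: Callable[[bytes], T]) -> None:
        # No default value: every caller must choose a decoder explicitly.
        self._deserialize = deserialize

    def decode(self, raw: bytes) -> T:
        """Decode one received frame with the caller-supplied decoder."""
        return self._deserialize(raw)


# The caller states exactly how incoming bytes are interpreted.
channel = PullChannel[int](deserialize=lambda raw: int(raw.decode("ascii")))
value = channel.decode(b"42")
print(value)
```

Omitting `deserialize` raises a `TypeError` at construction time, which surfaces an unsafe or missing choice at the call site rather than at the first received message.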
Vulnerability mechanics
Generated on May 9, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References
- github.com/advisories/GHSA-7xcv-9j6c-2fmc (ghsa, ADVISORY)
- nvd.nist.gov/vuln/detail/CVE-2025-60455 (ghsa, ADVISORY)
- github.com/modular/modular/commit/10620059fb5c47fb0c30e5d21a8ff3b8d622fba4 (ghsa, WEB)
- github.com/modular/modular/commit/b20e749fa892dbe772e890a268002f732164d9f5 (ghsa, WEB)
- github.com/modular/modular/commit/ee9c4ab02345dd30bed8b79771b6909ff1b930a1 (ghsa, WEB)
- github.com/modular/modular/issues/4795 (ghsa, WEB)
- www.oligo.security/blog/shadowmq-how-code-reuse-spread-critical-vulnerabilities-across-the-ai-ecosystem (ghsa, WEB)