vLLM: Security Check Bypass via assert Statement in Activation Function Loading Allows Arbitrary Code Execution
Description
vLLM's assert-based security check in activation loading allows unauthenticated RCE via a malicious HuggingFace model when Python runs in optimized mode.
AI Insight
LLM-synthesized narrative grounded in this CVE's description and references.
Vulnerability
An assert statement in vLLM at vllm/model_executor/layers/pooler/activations.py:48 is the sole security control restricting which activation functions can be loaded from a HuggingFace model's config.json [2]. The assert checks that function_name starts with "torch.nn.modules.". However, Python's assert statements are stripped when running in optimized mode (python -O or PYTHONOPTIMIZE=1) [1]. When stripped, attacker-controlled function_name is passed directly to resolve_obj_by_qualname(), an unrestricted import gadget that imports any module and retrieves any attribute [1][3]. This affects vLLM versions prior to the commit fixing this issue (b3c7ffcab82c2439726f8cb213800f6f38c023d3) [1].
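The stripping behaviour described above can be reproduced without vLLM. The following is a minimal sketch, assuming a hypothetical `check` function that mirrors the prefix test from the patch. It uses the built-in `compile()` with its `optimize` argument to emulate `python -O` inside a normal interpreter. None of the names below are vLLM APIs.

```python
# Hypothetical guard that mirrors the assert-based check; not vLLM's actual code.
GUARD_SRC = '''
def check(function_name):
    assert function_name.startswith("torch.nn.modules."), "restricted"
    return function_name
'''


def load_check(optimize: int):
    """Compile the guard at the given optimization level and return the function."""
    namespace = {}
    code = compile(GUARD_SRC, "<guard-demo>", "exec", optimize=optimize)
    exec(code, namespace)
    return namespace["check"]


def is_rejected(check, name: str) -> bool:
    """Return True if the guard raises AssertionError for this name."""
    try:
        check(name)
    except AssertionError:
        return True
    return False


normal = load_check(optimize=0)     # asserts kept, like plain `python`
optimized = load_check(optimize=1)  # asserts stripped, like `python -O`

print("normal mode rejects os.system:   ", is_rejected(normal, "os.system"))
print("optimized mode rejects os.system:", is_rejected(optimized, "os.system"))
```

At optimization level 1 the compiled function contains no trace of the check, so the untrusted name is returned unchanged.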
Exploitation
An unauthenticated attacker can publish a malicious HuggingFace model with a crafted config.json that sets sentence_transformers.activation_fn or sbert_ce_default_activation_function to an arbitrary Python module and attribute (e.g., "os.system") [2][3]. When a vLLM server loads the model while running in Python optimized mode, the assert is removed, resolve_obj_by_qualname() imports the attacker-specified module and returns the named attribute, and get_act_fn() then calls that object with no arguments, resulting in arbitrary code execution [1][3]. The attacker needs no authentication and no network position beyond the ability to publish a model.
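Operators who want to inspect a model before loading it can look at the two configuration fields named above. This is a minimal sketch, assuming the config has already been parsed into a dictionary. The helper name `suspicious_activation_names` is hypothetical, and the check only mirrors the `torch.nn.modules.` prefix rule from the patch. It is not a complete model-vetting procedure.

```python
import json

ALLOWED_PREFIX = "torch.nn.modules."


def suspicious_activation_names(config: dict) -> list[str]:
    """Return activation names in the config that fall outside the allowed prefix.

    Only string values are inspected; this is an illustrative audit helper,
    not part of vLLM.
    """
    candidates = []
    st_section = config.get("sentence_transformers")
    if isinstance(st_section, dict):
        candidates.append(st_section.get("activation_fn"))
    candidates.append(config.get("sbert_ce_default_activation_function"))
    return [
        name
        for name in candidates
        if isinstance(name, str) and not name.startswith(ALLOWED_PREFIX)
    ]


# Example config.json content using the value cited in the advisory.
raw = '{"sentence_transformers": {"activation_fn": "os.system"}}'
print(suspicious_activation_names(json.loads(raw)))
```

A non-empty result means the model requests an activation function outside `torch.nn.modules` and should not be loaded on an unpatched installation.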
Impact
Successful exploitation yields arbitrary code execution on the vLLM server with the privileges of the vLLM process [2][3]. This leads to full compromise of confidentiality, integrity, and availability. The attacker can execute arbitrary commands, steal data, or disrupt service. The vulnerability is comparable to CVE-2017-1000433 [1][2].
Mitigation
The fix is implemented in commit b3c7ffcab82c2439726f8cb213800f6f38c023d3, which replaces the assert with an explicit check that raises ValueError [1]. vLLM users should upgrade to a version containing this commit or apply the patch [1][2][3]. As a workaround, do not run vLLM in Python optimized mode (python -O or PYTHONOPTIMIZE=1), because that mode strips assert statements.
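For deployments that cannot upgrade immediately, the workaround can be enforced at startup. The sketch below assumes a hypothetical wrapper script around the server. It reads the standard `sys.flags.optimize` value, which is non-zero under `python -O` or `PYTHONOPTIMIZE`. The function names are illustrative and not part of vLLM.

```python
import sys
from types import SimpleNamespace


def asserts_are_stripped(flags=None) -> bool:
    """Return True when the interpreter was started with -O, -OO or PYTHONOPTIMIZE."""
    flags = sys.flags if flags is None else flags
    return flags.optimize > 0


def refuse_optimized_mode(flags=None) -> None:
    """Abort startup if assert statements would be removed."""
    if asserts_are_stripped(flags):
        raise RuntimeError(
            "Python optimized mode is active; assert-based checks are disabled. "
            "Restart without -O / PYTHONOPTIMIZE or upgrade to a patched release."
        )


# Simulated flag objects so the behaviour can be shown in any interpreter.
print(asserts_are_stripped(SimpleNamespace(optimize=0)))
print(asserts_are_stripped(SimpleNamespace(optimize=1)))
```

Calling `refuse_optimized_mode()` with no arguments before the model is loaded checks the flags of the running interpreter.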
AI Insight generated on Jun 16, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.
Affected products: 1
Patches: 1
b3c7ffcab82c [Misc] Replace assert with proper exceptions for security and validation in pooling (#43286)
4 files changed · +37 −17
tests/model_executor/layers/test_pooler_activations.py (+1 −1, modified)
@@ -212,7 +212,7 @@ def test_rejects_non_torch_activation(self):
             problem_type="",
             sentence_transformers={"activation_fn": "os.system"},
         )
-        with pytest.raises(AssertionError, match="restricted"):
+        with pytest.raises(ValueError, match="restricted"):
             get_act_fn(cfg)
vllm/model_executor/layers/pooler/activations.py (+7 −5, modified)
@@ -49,10 +49,11 @@ def get_act_fn(
         function_name = config.sbert_ce_default_activation_function
     if function_name is not None:
-        assert function_name.startswith("torch.nn.modules."), (
-            "Loading of activation functions is restricted to "
-            "torch.nn.modules for security reasons"
-        )
+        if not function_name.startswith("torch.nn.modules."):
+            raise ValueError(
+                "Loading of activation functions is restricted to "
+                "torch.nn.modules for security reasons"
+            )
         fn = resolve_obj_by_qualname(function_name)()
         return PoolerActivation.wraps(fn)
@@ -67,7 +68,8 @@ def resolve_classifier_act_fn(
     if act_fn is None:
         return get_act_fn(model_config.hf_config, static_num_labels)
-    assert callable(act_fn)
+    if not callable(act_fn):
+        raise TypeError(f"Expected a callable activation function, got {type(act_fn)}")
     return act_fn
vllm/pooling_params.py (+9 −5, modified)
@@ -110,7 +110,8 @@ def _merge_default_parameters(self, model_config: ModelConfig) -> None:
         if pooler_config is None:
             return
-        assert self.task is not None, "task must be set"
+        if self.task is None:
+            raise ValueError("task must be set before merging parameters")
         valid_parameters = self.valid_parameters[self.task]
         for k in valid_parameters:
@@ -189,7 +190,8 @@ def _set_default_parameters(self, model_config: ModelConfig):
             raise ValueError(f"Unknown pooling task: {self.task!r}")
     def _verify_valid_parameters(self):
-        assert self.task is not None, "task must be set"
+        if self.task is None:
+            raise ValueError("task must be set before verifying parameters")
         valid_parameters = self.valid_parameters[self.task]
         invalid_parameters = []
         for k in self.all_parameters:
@@ -221,6 +223,8 @@ def __repr__(self) -> str:
         )
     def __post_init__(self) -> None:
-        assert self.output_kind == RequestOutputKind.FINAL_ONLY, (
-            "For pooling output_kind has to be FINAL_ONLY"
-        )
+        if self.output_kind != RequestOutputKind.FINAL_ONLY:
+            raise ValueError(
+                "For pooling output_kind has to be FINAL_ONLY, "
+                f"got {self.output_kind!r}"
+            )
vllm/v1/pool/metadata.py (+20 −6, modified)
@@ -64,7 +64,11 @@ def __post_init__(self) -> None:
             for pooling_param in pooling_params
             if (task := pooling_param.task) is not None
         ]
-        assert len(pooling_params) == len(tasks)
+        if len(pooling_params) != len(tasks):
+            raise ValueError(
+                "Every pooling param must have a task set, but got "
+                f"{len(tasks)} tasks for {len(pooling_params)} pooling params"
+            )
         self.tasks = tasks
@@ -88,9 +92,11 @@ def _get_prompt_token_ids(
         self,
         prompt_token_ids: torch.Tensor | None,
     ) -> list[torch.Tensor]:
-        assert prompt_token_ids is not None, (
-            "Please set `requires_token_ids=True` in `get_pooling_updates`"
-        )
+        if prompt_token_ids is None:
+            raise ValueError(
+                "prompt_token_ids is required but was not set. "
+                "Please set `requires_token_ids=True` in `get_pooling_updates`"
+            )
         return [prompt_token_ids[i, :num] for i, num in enumerate(self.prompt_lens)]
     def get_prompt_token_ids(self) -> list[torch.Tensor]:
@@ -101,7 +107,11 @@ def get_prompt_token_ids_cpu(self) -> list[torch.Tensor]:
     def get_pooling_cursor(self) -> PoolingCursor:
         pooling_cursor = self.pooling_cursor
-        assert pooling_cursor is not None, "Should call `build_pooling_cursor` first"
+        if pooling_cursor is None:
+            raise RuntimeError(
+                "pooling_cursor has not been initialized. "
+                "Call `build_pooling_cursor` before accessing it"
+            )
         return pooling_cursor
@@ -115,7 +125,11 @@ def build_pooling_cursor(
         n_seq = len(num_scheduled_tokens_np)
         prompt_lens = self.prompt_lens
-        assert len(prompt_lens) == n_seq
+        if len(prompt_lens) != n_seq:
+            raise ValueError(
+                f"prompt_lens length ({len(prompt_lens)}) does not match "
+                f"the number of sequences ({n_seq})"
+            )
         num_scheduled_tokens_cpu = torch.from_numpy(num_scheduled_tokens_np)
         if query_start_loc_gpu is None:
Vulnerability mechanics
Root cause
"Python assert statement used as the sole security control is stripped at compile time under optimized mode, allowing attacker-controlled input to reach an unrestricted import gadget."
Attack vector
An attacker publishes a HuggingFace model with a crafted `config.json` containing a malicious `activation_fn` value (e.g. `"os.system"`) under `sentence_transformers` or `sbert_ce_default_activation_function`. When a victim loads this model with vLLM running under `python -O` or `PYTHONOPTIMIZE=1`, Python strips the `assert` statement at `vllm/model_executor/layers/pooler/activations.py:48` that was the sole security check. The attacker-controlled string is then passed directly to `resolve_obj_by_qualname()`, an unrestricted import gadget that calls `importlib.import_module()` and `getattr()` on the attacker-supplied module and object name, achieving arbitrary code execution during model initialization. [ref_id=1] [ref_id=2]
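To illustrate why an unrestricted qualified-name resolver is dangerous, here is a minimal stand-in built from `importlib.import_module()` and `getattr()`, the two calls the description attributes to `resolve_obj_by_qualname()`. The function `resolve_by_qualname_sketch` is a hypothetical approximation, not vLLM's implementation, and the demonstration resolves a harmless target.

```python
import importlib


def resolve_by_qualname_sketch(qualname: str):
    """Split 'package.module.attr' and return the named attribute.

    Illustrative stand-in only: any importable module and any attribute
    can be reached, which is what makes an unguarded caller exploitable.
    """
    module_name, _, attr_name = qualname.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Harmless demonstration: a string alone is enough to obtain a callable.
resolved = resolve_by_qualname_sketch("os.getcwd")
print(resolved.__name__)
```

Because the resolver itself performs no validation, any allowlist has to be enforced by the caller with a check that cannot be compiled away.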
What the fix does
The patch replaces the `assert` statement in `get_act_fn()` with an explicit `if not function_name.startswith("torch.nn.modules."): raise ValueError(...)` conditional, which is never stripped by the Python interpreter regardless of optimization flags. [patch_id=6191259] The same pattern is applied to several other `assert` statements across `vllm/pooling_params.py`, `vllm/v1/pool/metadata.py`, and `resolve_classifier_act_fn()` to harden validation. [patch_id=6191259] The test expectation is also updated from `AssertionError` to `ValueError` to match the new exception type. [patch_id=6191259]
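The effect of the replacement can be checked in isolation. The sketch below copies the conditional from the patch into a hypothetical standalone `check` function and compiles it at optimization levels 0, 1 and 2 with the built-in `compile()`. The surrounding harness is illustrative and not part of vLLM.

```python
# Standalone copy of the patched conditional; the wrapper function is hypothetical.
FIXED_SRC = '''
def check(function_name):
    if not function_name.startswith("torch.nn.modules."):
        raise ValueError(
            "Loading of activation functions is restricted to "
            "torch.nn.modules for security reasons"
        )
    return function_name
'''


def load_check(optimize: int):
    """Compile the patched guard at the given optimization level."""
    namespace = {}
    code = compile(FIXED_SRC, "<fixed-guard-demo>", "exec", optimize=optimize)
    exec(code, namespace)
    return namespace["check"]


def rejects(check, name: str) -> bool:
    """Return True if the guard raises ValueError for this name."""
    try:
        check(name)
    except ValueError:
        return True
    return False


# The explicit raise survives every optimization level.
results = {level: rejects(load_check(level), "os.system") for level in (0, 1, 2)}
print(results)
```

Unlike the assert-based version, the guard rejects the untrusted name at every level, while names under `torch.nn.modules.` still pass through.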
Preconditions
- config: vLLM must be running under python -O or PYTHONOPTIMIZE=1 so that assert statements are stripped at compile time
- input: Victim must load a malicious HuggingFace model with a crafted config.json
- config: Model must use a cross-encoder architecture (e.g. BERT or RoBERTa with sequence classification) that triggers the pooling activation loading path
Generated on Jun 16, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.
References: 4
News mentions: 0