VYPR
High severity 7.5 · GHSA Advisory · Published Jun 16, 2026 · Updated Jun 16, 2026

vLLM: Security Check Bypass via assert Statement in Activation Function Loading Allows Arbitrary Code Execution

CVE-2026-41523

Description

vLLM's assert-based security check in activation loading allows unauthenticated RCE via malicious HuggingFace model when Python runs in optimized mode.

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

Vulnerability

An assert statement in vLLM at vllm/model_executor/layers/pooler/activations.py:48 is the sole security control restricting which activation functions can be loaded from a HuggingFace model's config.json [2]. The assert checks that function_name starts with "torch.nn.modules.". However, Python strips assert statements when running in optimized mode (python -O or PYTHONOPTIMIZE=1) [1]. Once the assert is stripped, the attacker-controlled function_name is passed directly to resolve_obj_by_qualname(), an unrestricted import gadget that imports any module and retrieves any attribute [1][3]. All vLLM versions that predate the fixing commit (b3c7ffcab82c2439726f8cb213800f6f38c023d3) are affected [1].
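The stripping behaviour can be reproduced without vLLM. The following is a minimal sketch using only the standard library. `CHECK_SRC` and `run_check` are hypothetical names, and the guard is a stand-in rather than vLLM's code. The built-in `compile()` is called with `optimize=0` and `optimize=1` to mimic a normal interpreter and `python -O` respectively.

```python
# Illustrative only: a stand-in for an assert-based guard, not vLLM's code.
CHECK_SRC = '''
def check(function_name):
    assert function_name.startswith("torch.nn.modules."), "restricted"
    return "accepted"
'''


def run_check(function_name: str, optimize: int) -> str:
    """Compile the guard at the given optimization level and run it."""
    namespace: dict = {}
    # optimize=0 keeps asserts; optimize=1 removes them, like `python -O`.
    code = compile(CHECK_SRC, "<guard>", "exec", optimize=optimize)
    exec(code, namespace)
    try:
        return namespace["check"](function_name)
    except AssertionError:
        return "rejected"


if __name__ == "__main__":
    print(run_check("os.system", optimize=0))  # rejected
    print(run_check("os.system", optimize=1))  # accepted
```

At optimization level 1 the guard function no longer contains the check at all, so the disallowed name is accepted.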

Exploitation

An unauthenticated attacker can publish a malicious HuggingFace model with a crafted config.json that sets sentence_transformers.activation_fn or sbert_ce_default_activation_function to an arbitrary Python module and attribute (e.g., "os.system") [2][3]. When a vLLM server loads the model while running in Python optimized mode, the assert is removed. resolve_obj_by_qualname() then imports the attacker-specified module and returns the named attribute, and get_act_fn() immediately calls the retrieved object, resulting in arbitrary code execution [1][3]. The attacker needs no authentication and no privileged network position; publishing a model that the victim loads is sufficient.
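To illustrate why an unguarded qualified-name resolver is dangerous, here is a minimal sketch of the pattern. `resolve_by_qualname` and `load_activation_unchecked` are hypothetical stand-ins written for this example (vLLM's `resolve_obj_by_qualname()` may differ in detail), and a harmless standard-library class takes the place of an attacker-chosen target.

```python
import importlib
from typing import Any


def resolve_by_qualname(qualname: str) -> Any:
    """Hypothetical stand-in for an unrestricted qualified-name resolver."""
    module_name, _, attr_name = qualname.rpartition(".")
    module = importlib.import_module(module_name)  # imports any module
    return getattr(module, attr_name)              # returns any attribute


def load_activation_unchecked(config: dict) -> Any:
    """Mimics the unguarded path: resolve the configured name, then call it."""
    function_name = config["sentence_transformers"]["activation_fn"]
    return resolve_by_qualname(function_name)()


if __name__ == "__main__":
    # A harmless class stands in for an attacker-chosen attribute.
    crafted = {"sentence_transformers": {"activation_fn": "collections.OrderedDict"}}
    print(type(load_activation_unchecked(crafted)).__name__)
```

Nothing in this path limits which module is imported, so the prefix check is the only thing separating configuration data from code execution.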

Impact

Successful exploitation yields arbitrary code execution on the vLLM server with the privileges of the vLLM process [2][3]. This leads to full compromise of confidentiality, integrity, and availability. The attacker can execute arbitrary commands, steal data, or disrupt service. The vulnerability is comparable to CVE-2017-1000433 [1][2].

Mitigation

The fix is implemented in commit b3c7ffcab82c2439726f8cb213800f6f38c023d3, which replaces the assert with an explicit check that raises ValueError [1]. vLLM users should upgrade to a version containing this commit or apply the patch [1][2][3]. As a workaround, do not run vLLM in Python optimized mode (python -O or PYTHONOPTIMIZE=1), because that mode disables assert statements.
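Where upgrading is not immediately possible, a deployment wrapper could refuse to start when asserts are disabled. This is a minimal sketch under that assumption. `asserts_are_stripped` and `refuse_optimized_mode` are hypothetical helpers, not part of vLLM, and they complement the patch rather than replace it.

```python
import sys
from typing import Optional


def asserts_are_stripped(optimize_level: Optional[int] = None) -> bool:
    """Return True when the interpreter runs with -O/-OO or PYTHONOPTIMIZE set."""
    # sys.flags.optimize is 0 normally, 1 under -O, 2 under -OO.
    level = sys.flags.optimize if optimize_level is None else optimize_level
    return level > 0


def refuse_optimized_mode(optimize_level: Optional[int] = None) -> None:
    """Abort startup if assert statements would be removed."""
    if asserts_are_stripped(optimize_level):
        raise RuntimeError(
            "Python optimized mode is enabled; assert-based checks are "
            "stripped. Restart without -O / PYTHONOPTIMIZE."
        )


if __name__ == "__main__":
    refuse_optimized_mode()
    print("assert statements are active")
```

The optional `optimize_level` argument exists only so the check can be exercised without restarting the interpreter.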

AI Insight generated on Jun 16, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.

Affected products

1

Patches

1
b3c7ffcab82c

[Misc] Replace assert with proper exceptions for security and validation in pooling (#43286)

https://github.com/vllm-project/vllm · Taneem Ibrahim · May 22, 2026 · via ghsa-ref
4 files changed · +37 -17
  • tests/model_executor/layers/test_pooler_activations.py +1 -1 modified
    @@ -212,7 +212,7 @@ def test_rejects_non_torch_activation(self):
                 problem_type="",
                 sentence_transformers={"activation_fn": "os.system"},
             )
    -        with pytest.raises(AssertionError, match="restricted"):
    +        with pytest.raises(ValueError, match="restricted"):
                 get_act_fn(cfg)
     
     
    
  • vllm/model_executor/layers/pooler/activations.py +7 -5 modified
    @@ -49,10 +49,11 @@ def get_act_fn(
             function_name = config.sbert_ce_default_activation_function
     
         if function_name is not None:
    -        assert function_name.startswith("torch.nn.modules."), (
    -            "Loading of activation functions is restricted to "
    -            "torch.nn.modules for security reasons"
    -        )
    +        if not function_name.startswith("torch.nn.modules."):
    +            raise ValueError(
    +                "Loading of activation functions is restricted to "
    +                "torch.nn.modules for security reasons"
    +            )
             fn = resolve_obj_by_qualname(function_name)()
             return PoolerActivation.wraps(fn)
     
    @@ -67,7 +68,8 @@ def resolve_classifier_act_fn(
         if act_fn is None:
             return get_act_fn(model_config.hf_config, static_num_labels)
     
    -    assert callable(act_fn)
    +    if not callable(act_fn):
    +        raise TypeError(f"Expected a callable activation function, got {type(act_fn)}")
         return act_fn
     
     
    
  • vllm/pooling_params.py +9 -5 modified
    @@ -110,7 +110,8 @@ def _merge_default_parameters(self, model_config: ModelConfig) -> None:
             if pooler_config is None:
                 return
     
    -        assert self.task is not None, "task must be set"
    +        if self.task is None:
    +            raise ValueError("task must be set before merging parameters")
             valid_parameters = self.valid_parameters[self.task]
     
             for k in valid_parameters:
    @@ -189,7 +190,8 @@ def _set_default_parameters(self, model_config: ModelConfig):
                 raise ValueError(f"Unknown pooling task: {self.task!r}")
     
         def _verify_valid_parameters(self):
    -        assert self.task is not None, "task must be set"
    +        if self.task is None:
    +            raise ValueError("task must be set before verifying parameters")
             valid_parameters = self.valid_parameters[self.task]
             invalid_parameters = []
             for k in self.all_parameters:
    @@ -221,6 +223,8 @@ def __repr__(self) -> str:
             )
     
         def __post_init__(self) -> None:
    -        assert self.output_kind == RequestOutputKind.FINAL_ONLY, (
    -            "For pooling output_kind has to be FINAL_ONLY"
    -        )
    +        if self.output_kind != RequestOutputKind.FINAL_ONLY:
    +            raise ValueError(
    +                "For pooling output_kind has to be FINAL_ONLY, "
    +                f"got {self.output_kind!r}"
    +            )
    
  • vllm/v1/pool/metadata.py +20 -6 modified
    @@ -64,7 +64,11 @@ def __post_init__(self) -> None:
                 for pooling_param in pooling_params
                 if (task := pooling_param.task) is not None
             ]
    -        assert len(pooling_params) == len(tasks)
    +        if len(pooling_params) != len(tasks):
    +            raise ValueError(
    +                "Every pooling param must have a task set, but got "
    +                f"{len(tasks)} tasks for {len(pooling_params)} pooling params"
    +            )
     
             self.tasks = tasks
     
    @@ -88,9 +92,11 @@ def _get_prompt_token_ids(
             self,
             prompt_token_ids: torch.Tensor | None,
         ) -> list[torch.Tensor]:
    -        assert prompt_token_ids is not None, (
    -            "Please set `requires_token_ids=True` in `get_pooling_updates`"
    -        )
    +        if prompt_token_ids is None:
    +            raise ValueError(
    +                "prompt_token_ids is required but was not set. "
    +                "Please set `requires_token_ids=True` in `get_pooling_updates`"
    +            )
             return [prompt_token_ids[i, :num] for i, num in enumerate(self.prompt_lens)]
     
         def get_prompt_token_ids(self) -> list[torch.Tensor]:
    @@ -101,7 +107,11 @@ def get_prompt_token_ids_cpu(self) -> list[torch.Tensor]:
     
         def get_pooling_cursor(self) -> PoolingCursor:
             pooling_cursor = self.pooling_cursor
    -        assert pooling_cursor is not None, "Should call `build_pooling_cursor` first"
    +        if pooling_cursor is None:
    +            raise RuntimeError(
    +                "pooling_cursor has not been initialized. "
    +                "Call `build_pooling_cursor` before accessing it"
    +            )
     
             return pooling_cursor
     
    @@ -115,7 +125,11 @@ def build_pooling_cursor(
             n_seq = len(num_scheduled_tokens_np)
             prompt_lens = self.prompt_lens
     
    -        assert len(prompt_lens) == n_seq
    +        if len(prompt_lens) != n_seq:
    +            raise ValueError(
    +                f"prompt_lens length ({len(prompt_lens)}) does not match "
    +                f"the number of sequences ({n_seq})"
    +            )
     
             num_scheduled_tokens_cpu = torch.from_numpy(num_scheduled_tokens_np)
             if query_start_loc_gpu is None:
    

Vulnerability mechanics

Root cause

"Python assert statement used as the sole security control is stripped at compile time under optimized mode, allowing attacker-controlled input to reach an unrestricted import gadget."

Attack vector

An attacker publishes a HuggingFace model with a crafted `config.json` containing a malicious `activation_fn` value (e.g. `"os.system"`) under `sentence_transformers` or `sbert_ce_default_activation_function`. When a victim loads this model with vLLM running under `python -O` or `PYTHONOPTIMIZE=1`, Python strips the `assert` statement at `vllm/model_executor/layers/pooler/activations.py:48` that was the sole security check. The attacker-controlled string is then passed directly to `resolve_obj_by_qualname()`, an unrestricted import gadget that calls `importlib.import_module()` and `getattr()` on the attacker-supplied module and object name, achieving arbitrary code execution during model initialization. [ref_id=1] [ref_id=2]
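Because the malicious value lives in `config.json`, a model can be inspected before it is loaded. The sketch below is an assumption-laden illustration: `find_activation_names` and `suspicious_activations` are hypothetical helpers, and only the two fields named in this advisory are examined.

```python
import json
from typing import Any

# Prefix enforced by the check described in this advisory.
ALLOWED_PREFIX = "torch.nn.modules."


def find_activation_names(config: dict) -> list:
    """Collect activation-function names from the two fields named above."""
    names = []
    st_section = config.get("sentence_transformers")
    if isinstance(st_section, dict) and st_section.get("activation_fn") is not None:
        names.append(st_section["activation_fn"])
    legacy = config.get("sbert_ce_default_activation_function")
    if legacy is not None:
        names.append(legacy)
    return names


def suspicious_activations(config_text: str) -> list:
    """Return configured activation names that fall outside the allowed prefix."""
    config: Any = json.loads(config_text)
    if not isinstance(config, dict):
        return []
    return [
        name
        for name in find_activation_names(config)
        if not (isinstance(name, str) and name.startswith(ALLOWED_PREFIX))
    ]


if __name__ == "__main__":
    sample = '{"sentence_transformers": {"activation_fn": "os.system"}}'
    print(suspicious_activations(sample))
```

Running this against a downloaded `config.json` before serving the model flags any value that the patched check would reject.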

What the fix does

The patch replaces the `assert` statement in `get_act_fn()` with an explicit `if not function_name.startswith("torch.nn.modules."): raise ValueError(...)` conditional, which is never stripped by the Python interpreter regardless of optimization flags. [patch_id=6191259] The same pattern is applied to several other `assert` statements across `vllm/pooling_params.py`, `vllm/v1/pool/metadata.py`, and `resolve_classifier_act_fn()` to harden validation. [patch_id=6191259] The test expectation is also updated from `AssertionError` to `ValueError` to match the new exception type. [patch_id=6191259]
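The difference an explicit conditional makes can be checked in isolation. In this minimal sketch, `GUARD_SRC` and `guard_outcome` are hypothetical names. The guard copies the shape of the patched check from the diff and is compiled at each optimization level with the built-in `compile()`.

```python
# Illustrative only: mirrors the shape of the patched check, outside vLLM.
GUARD_SRC = '''
def check(function_name):
    if not function_name.startswith("torch.nn.modules."):
        raise ValueError(
            "Loading of activation functions is restricted to "
            "torch.nn.modules for security reasons"
        )
    return "accepted"
'''


def guard_outcome(function_name: str, optimize: int) -> str:
    """Compile the explicit guard at the given level and report the result."""
    namespace: dict = {}
    exec(compile(GUARD_SRC, "<guard>", "exec", optimize=optimize), namespace)
    try:
        return namespace["check"](function_name)
    except ValueError:
        return "rejected"


if __name__ == "__main__":
    # Levels 0, 1 and 2 correspond to no flag, -O and -OO.
    for level in (0, 1, 2):
        print(level, guard_outcome("os.system", level))
```

The disallowed name is rejected at every level, because an `if`/`raise` pair is ordinary control flow that optimization flags do not remove.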

Preconditions

  • config: vLLM must be running under python -O or PYTHONOPTIMIZE=1 so that assert statements are stripped at compile time
  • input: Victim must load a malicious HuggingFace model with a crafted config.json
  • config: Model must use a cross-encoder architecture (e.g. BERT or RoBERTa with sequence classification) that triggers the pooling activation loading path

Generated on Jun 16, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.

References

4
