VYPR
High severity7.3NVD Advisory· Published Jun 1, 2026

CVE-2026-10220

CVE-2026-10220

Description

A prompt injection filter bypass in NousResearch hermes-agent's skills_tool.py allows remote attackers to inject malicious instructions into the LLM agent.

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

A prompt injection filter bypass in NousResearch hermes-agent's skills_tool.py allows remote attackers to inject malicious instructions into the LLM agent.

Vulnerability

The vulnerability resides in the _serve_plugin_skill and skill_view functions of tools/skills_tool.py in NousResearch hermes-agent up to version 2026.4.30. The module uses a static blocklist of exact literal strings (e.g., "ignore previous instructions", "forget your instructions") and performs naive substring matching via Python's in operator [1]. This approach only detects the precise contiguous character sequences, making it trivial to bypass with synonyms, extra words, or whitespace insertion. Notably, the codebase already contains a hardened regex-based implementation in tools/skills_guard.py that correctly handles these variants, but it was never applied to the vulnerable functions [1].

Exploitation

An attacker requires remote network access to deliver a crafted skill to the agent (e.g., via plugin installation or skill upload). The attacker constructs a prompt injection payload that deviates from the blocked literals, such as substituting "ignore ALL prior instructions" for "ignore previous instructions" or inserting extra whitespace (e.g., "disregard your original guidelines") [1]. When the skill is loaded by the vulnerable functions, the payload bypasses the simplistic filter and enters the LLM context, enabling the attacker to manipulate the agent's behavior.

Impact

Successful exploitation allows an attacker to inject arbitrary instructions into the LLM agent's prompt context. This can override previous system directives, potentially leading to unauthorized actions, information disclosure, or complete control over the agent's responses and operations. Since the attacker can execute this remotely and without authentication (skill installation assumed to be possible), the impact is high with significant consequences for confidentiality, integrity, and availability.

Mitigation

The vendor (NousResearch) was contacted but did not respond, and no official patch has been released [1]. The advisory recommends replacing the naive substring matching in tools/skills_tool.py with the regex-based detection already implemented in tools/skills_guard.py [1]. Until a fix is available, users should avoid installing skills from untrusted sources or manually patch the file to apply the more robust guard logic.

AI Insight generated on Jun 1, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.

Affected products

1

Patches

0

No patches discovered yet.

Vulnerability mechanics

Root cause

"The prompt injection scanner in tools/skills_tool.py uses naive exact-string substring matching instead of compiled regex patterns, allowing trivial bypass via multi-word payload variants."

Attack vector

An attacker creates a malicious skill containing a prompt injection payload with extra words, synonym substitutions, or added whitespace between keywords, then distributes it via a shared skill repository, git clone, or plugin [ref_id=1]. When a victim loads the skill through `skill_view()`, the naive exact-string substring filter fails to flag the payload, and the malicious content is returned directly into the LLM agent's context window [ref_id=1]. This allows the attacker to override the agent's instructions and potentially exfiltrate sensitive data, execute arbitrary commands, or manipulate the end user [ref_id=1]. The attack is performed remotely with no authentication required [CWE-20].

Affected code

The vulnerability is in `tools/skills_tool.py` at lines 130–143 where `_INJECTION_PATTERNS` is defined as a list of exact literal strings, and at lines 787 and 997 where Python's `in` operator performs naive substring matching. The same codebase already contains a hardened regex-based implementation in `tools/skills_guard.py` line 164 that was never applied to `skills_tool.py` [ref_id=1].

What the fix does

The patch does not appear in the bundle; the advisory states no patched version exists [ref_id=1]. The recommended fix is to replace the naive exact-string substring matching in `tools/skills_tool.py` with the compiled regex approach already present in `tools/skills_guard.py`, which uses `r'ignore\s+(?:\w+\s+)*(previous|all|above|prior)\s+instructions'` to allow any number of intervening words between keywords [ref_id=1]. Until such a patch is applied, the injection filter remains trivially bypassable.

Preconditions

  • inputThe victim must load a skill containing a crafted prompt injection payload through skill_view()
  • configThe skills system is enabled by default — no special configuration needed
  • networkThe attacker distributes the malicious skill via shared repos, git clone, or plugin

Reproduction

The advisory provides a public PoC script (`poc_exploit.py`) and a control experiment script (`control-regex_detection.py`) [ref_id=1]. Steps: set up a working hermes-agent installation, set `HERMES_ROOT`, run `python3 poc_exploit.py` to create temporary skill files with multi-word variant injection payloads and load them through `skill_view()`, then run `python3 control-regex_detection.py` to compare exact-string matching against the hardened regex approach [ref_id=1].

Generated on Jun 1, 2026. Inputs: CWE entries + fix-commit diffs from this CVE's patches. Citations validated against bundle.

References

5

News mentions

1