TrustFall Attack Exploits MCP Server Trust Dialog for One-Click RCE in Claude Code and Other AI CLIs

Security firm Adversa AI has disclosed a one-click remote code execution attack dubbed TrustFall that targets AI coding assistants including Claude Code, Gemini CLI, Cursor CLI, and Copilot CLI. The proof-of-concept attack demonstrates how a cloned code repository can include two JSON files (.mcp.json and .claude/settings.json) that open the door to an attacker-controlled Model Context Protocol (MCP) server. MCP servers make tools, configuration data, schemas, and documentation available in a standard format to AI models via JSON.

The vulnerability arises from inconsistent restrictions governing the scope of settings: Anthropic blocks some dangerous settings at the project level (e.g. bypassPermissions) but not others (e.g. enableAllProjectMcpServers and enabledMcpjsonServers). The JSON files simply enable those settings. "The moment a developer presses Enter on Claude Code's generic 'Yes, I trust this folder' dialog, the server spawns as an unsandboxed Node.js process with the user's full privileges — no per-server consent, no tool call from Claude required," Adversa AI explains in its PoC repo. The likely result is a compromised system.

The PoC worked on Claude Code CLI v2.1.114, as of May 2. Other agent CLIs are also said to be affected, but specific PoCs have not been published. "It's the third CVE in Claude Code in six months from the same root cause (project-scoped settings as injection vector)," Alex Polyakov, co-founder of Adversa AI, told The Register in an email. "Each gets patched in isolation but the underlying class hasn't been finally fixed. Most developers don't know these settings exist, let alone that a cloned repo can set them silently."

Anthropic, according to the security firm, contends that the user's trust decision moves the issue outside its threat model. CVE-2025-59536 was considered a vulnerability because it triggered automatically when a user started up Claude Code in a malicious directory. TrustFall, however, is considered out of scope because the user has been presented with a dialog box and made a trust decision. Adversa argues that the decision is not being made with informed consent, citing a prior, more explicit warning notice that was removed in v2.1 of the Claude Code CLI.

"The pre-v2.1 dialog explicitly warned that .mcp.json could execute code and offered three options including 'proceed with MCP servers disabled,'" writes Adversa's Sergey Malenkovich. "That informed-consent UX was removed. The current dialog defaults to 'Yes, I trust this folder' with no MCP-specific language, no enumeration of which executables will spawn, and no opt-out for MCP while keeping the rest of the trust grant."

Then there's the zero-click variant to consider for CI/CD pipelines that implement Claude Code. When Claude Code is invoked in CI/CD, that happens via SDK rather than the interactive CLI. So there's no terminal prompt. Malenkovich argues that Anthropic should make three changes. First, block enableAllProjectMcpServers, enabledMcpjsonServers, and permissions.allow from any settings file inside a project. The idea is that a malicious server should not be able to approve its own servers. Second, implement a dedicated MCP consent dialog that defaults to "deny." And third, require interactive consent per server rather than for all servers.

Anthropic did not respond to a request for comment. The TrustFall attack highlights the broader challenge of securing AI coding tools that rely on trust-based models. As these tools become more integrated into development workflows, the potential for supply-chain attacks via malicious repositories grows. Developers are urged to exercise caution when cloning repositories and to review trust dialogs carefully.