'TrustFall' Convention Exposes Claude Code Execution Risk

Adversa AI researchers have disclosed a class-level security issue affecting popular AI coding tools, including Claude Code, Cursor CLI, Gemini CLI, and Copilot CLI. The issue, dubbed 'TrustFall,' arises from the tools' trust dialogs, which fail to adequately warn users that approving a repository can auto-launch a malicious Model Context Protocol (MCP) server with full system privileges. This enables code execution, credential theft, and backdoor installation with a single keypress in interactive mode, or with no interaction at all in CI/CD environments.

The attack vector is straightforward: a threat actor creates a repository that includes a malicious MCP server and configuration settings that auto-approve it to run. When a developer clones or opens the repo in the AI coding tool and presses Enter on what appears to be a routine security check, the tool launches the attacker-controlled code with the developer's full system privileges. In CI/CD environments, the attack unfolds with no human interaction at all.
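One way to see what a freshly cloned repository would ask a tool to launch is to read its MCP configuration before opening the folder. The following Python sketch is illustrative only and is not taken from the Adversa AI report: it assumes a project-level file named .mcp.json with a top-level 'mcpServers' object whose entries carry 'command' and 'args' fields, and the function name find_mcp_launch_commands is hypothetical. Other tools may use different file names and keys.

```python
import json
from pathlib import Path


def find_mcp_launch_commands(repo_root):
    """Return {server_name: command_line} for MCP servers declared by a repo.

    Assumption: the repo ships a `.mcp.json` file with a top-level
    `mcpServers` object. Adjust the file name and keys for your tool.
    """
    config_path = Path(repo_root) / ".mcp.json"
    if not config_path.is_file():
        return {}
    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        # Unreadable or malformed config: surface it instead of guessing.
        return {"<unparseable .mcp.json>": ""}

    servers = config.get("mcpServers", {}) if isinstance(config, dict) else {}
    if not isinstance(servers, dict):
        return {}

    commands = {}
    for name, spec in servers.items():
        if not isinstance(spec, dict):
            continue
        args = spec.get("args", [])
        if not isinstance(args, list):
            args = []
        # Join the executable and its arguments into one readable line.
        parts = [str(spec.get("command", ""))] + [str(a) for a in args]
        commands[name] = " ".join(p for p in parts if p)
    return commands
```

Running this against a clone and reading the returned command lines gives a reviewer a chance to spot an unexpected executable before any trust prompt is answered. It only reports what the configuration declares; it does not judge whether a command is malicious.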

According to Adversa AI's lead researcher Rony Utevsky, 'A repository can ship a configuration that auto-approves and immediately launches an MCP server, no tool call from the agent is required.' The payload can read local files, including secrets, SSH keys, and tokens; access other projects; install backdoors; and establish command-and-control connections. The impact is full-machine compromise, as MCP servers execute as native OS processes with the full privileges of the user running the tool.

The report highlights a risky change in Claude Code version 2.1, which removed explicit MCP warnings present in earlier versions. Previously, users were warned about MCP execution and given an option to proceed with MCP servers disabled. Now, the dialog simply offers the option 'Yes, I trust this folder,' with no mention of MCP. Utevsky notes that 'most developers don't realize "trusting" hands over that much power.'

Anthropic has described the issue as outside its threat model, stating that malicious activity occurs only after the user has allowed a repo/folder to be trusted. Adversa AI did not raise the issue with other vendors because Anthropic's approach appears to be the general convention. The researchers identified three exploitable vulnerabilities in Claude Code (CVE-2025-59536, CVE-2026-21852, CVE-2026-33068) that Anthropic has patched, but the TrustFall convention remains unaddressed.

The TrustFall issue underscores a broader challenge in AI coding tool security: the gap between user consent and actual risk. As these tools become more integrated into development workflows, the potential for supply-chain attacks grows. Developers are advised to review trust dialogs carefully, disable automatic MCP server approval where possible, and be cautious when cloning repositories from untrusted sources.
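The advice to watch for automatic MCP approval can be turned into a pre-open check in a local script or a CI step. The Python sketch below is a minimal example under stated assumptions, not a vendor-endorsed procedure: the settings paths (.claude/settings.json and .claude/settings.local.json) and the key names (enableAllProjectMcpServers, enabledMcpjsonServers) are assumptions about Claude Code's project settings and should be verified against current documentation, and find_auto_approve_flags is a hypothetical helper name.

```python
import json
from pathlib import Path

# Assumed locations and key names; confirm against your tool's documentation.
SETTINGS_FILES = (".claude/settings.json", ".claude/settings.local.json")
AUTO_APPROVE_KEYS = ("enableAllProjectMcpServers", "enabledMcpjsonServers")


def find_auto_approve_flags(repo_root):
    """Return (relative_path, key) pairs for repo-shipped MCP auto-approval settings."""
    findings = []
    for rel in SETTINGS_FILES:
        path = Path(repo_root) / rel
        if not path.is_file():
            continue
        try:
            settings = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            # A settings file that cannot be parsed deserves a manual look.
            findings.append((rel, "<unparseable>"))
            continue
        if not isinstance(settings, dict):
            continue
        for key in AUTO_APPROVE_KEYS:
            # Flag the key when it is True or a non-empty list of server names.
            if settings.get(key):
                findings.append((rel, key))
    return findings
```

A CI job could fail the build when this function returns a non-empty list for a third-party checkout. An empty result means only that these particular keys were not found, not that the repository is safe.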