Independent Study Ranks 100 AI Agents, Finding Few Are Both Capable and Secure

A comprehensive evaluation of 100 AI agents has revealed a significant security gap: the vast majority fail to pair capability with robust defenses. The "AI Risk Quadrant" report, published by Adversa AI, assessed these agents across ten categories, positioning each one by its vulnerability to compromise, the potential impact of a breach, and the strength of its security measures. The findings point to a concerning trend in which the most powerful AI agents often present the largest attack surfaces.

The core issue identified is the "AI agent lethal trifecta," defined as "private data access + exposure to untrusted content + ability for outbound actions." This translates into a familiar cybersecurity challenge: excessive power combined with undue trust and insufficient control. According to Adversa's analysis, 98% of the tested agents exhibit this trifecta, which makes it difficult for them to be both useful and secure. The pattern is structural across the AI market: vendors offering the most capable agents also tend to expose the widest attack surface.
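The trifecta definition quoted above reduces to a conjunction of three properties, which can be sketched in a few lines of Python. This is only an illustration of the definition, not Adversa's scoring method; the `AgentProfile` class, its field names, and the helper functions are hypothetical and do not come from the report.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical description of an agent; field names are assumptions."""
    name: str
    private_data_access: bool         # can read private user or company data
    untrusted_content_exposure: bool  # processes content an attacker could control
    outbound_actions: bool            # can send data out or act on external systems


def has_lethal_trifecta(agent: AgentProfile) -> bool:
    # All three properties must hold at once; removing any one breaks the trifecta.
    return (
        agent.private_data_access
        and agent.untrusted_content_exposure
        and agent.outbound_actions
    )


def trifecta_share(agents: Sequence[AgentProfile]) -> float:
    # Fraction of agents exhibiting the trifecta (0.0 for an empty list).
    if not agents:
        return 0.0
    return sum(has_lethal_trifecta(a) for a in agents) / len(agents)
```

Under this reading, an agent escapes the trifecta by dropping any single property, for example by losing the ability to take outbound actions while it handles untrusted content.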

Among the categories examined, "computer agents" and "coding agents" showed the most pronounced "power-protection inversion." Computer agents, designed to perform specific tasks for users, are often granted extensive access rights, effectively mirroring the operating system's permissions. A compromise of such an agent could grant an attacker full control over the user's machine. Furthermore, users often lack visibility into the agent's internal processes, making it difficult to ascertain the exact actions taken between receiving an input and generating an output.

The report highlights a critical flaw in the user confirmation steps for these agents. Users may approve an action based on its superficial appearance rather than an understanding of the underlying operations. This "confirmation mismatch" occurs because the interface does not adequately surface the difference between what the human perceives and what the AI agent will actually do, especially when complex system interactions are involved.
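One way to picture the confirmation mismatch is a prompt that shows only the agent's own summary of an action. The Python sketch below shows a possible mitigation under stated assumptions: the confirmation text always includes the exact command that would run, next to the agent's description. The `ProposedAction` class and `render_confirmation` function are hypothetical and are not taken from the report or from any agent product.

```python
import shlex
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ProposedAction:
    """Hypothetical action an agent asks the user to approve."""
    summary: str           # agent-written description shown to the user
    argv: Tuple[str, ...]  # the command that would actually be executed


def render_confirmation(action: ProposedAction) -> str:
    # Show the exact, shell-quoted command next to the agent's summary,
    # so the user approves what will run rather than how it is described.
    exact = shlex.join(action.argv)
    return f"Agent says: {action.summary}\nWill execute: {exact}"


action = ProposedAction(
    summary="Clean up temporary files",
    argv=("rm", "-rf", "/home/user/My Docs"),
)
print(render_confirmation(action))
```

Here the summary sounds harmless while the command deletes a documents folder. Showing both side by side does not remove the risk, but it narrows the gap between what the user sees and what the agent does.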

Coding agents present another significant risk, particularly as AI-assisted software development becomes more prevalent. These agents, ranging from coding copilots to autonomous app builders, can interact with sensitive system components long before code is reviewed. Adversa notes that coding agents "don't just write code – they touch shell, dependencies, and tokens long before a diff lands in review." This means a compromise could directly lead to a production environment breach, bypassing traditional code review defenses.

The danger with coding agents lies not just in generating flawed code, but in their high-trust operations within the software supply chain. Their ability to access secrets, signing keys, and deployment pipelines, coupled with non-deterministic behavior, means that a full action trail can be compromised even if the final code output appears benign. The primary defense, code review, often fails to account for the agent's extensive attack surface and blast radius.

Across the ten agent categories, which include general assistants, work copilots, browsers, and data engineering tools, few agents emerged without significant security concerns. Only two avoided the lethal trifecta: one agent in the general assistant category and one in data engineering. Adversa's broader findings indicate that default agent configurations often prioritize speed over safety, and that a substantial portion of claimed AI agent defenses are not publicly verifiable.

The report concludes that AI agents often function as black boxes, forcing users into a "take it or leave it" scenario due to business pressures. This lack of transparency and control, combined with inherent security vulnerabilities, underscores the urgent need for improved security practices and auditing within the rapidly evolving AI landscape.