Rapid7 Formalizes Red Teaming with Multi-Agent AI Architecture

As threat actors grow more sophisticated in using AI for offensive operations, Rapid7's Red Team has developed an AI-driven approach of its own. Over the past year, the team has formalized its penetration testing methodology into a structured multi-agent system designed to handle engagements from initial scoping through final report generation. The system aims to automate repetitive tasks such as reconnaissance and documentation, freeing human analysts to focus on complex decision-making, vulnerability assessment, and business impact.

The architecture is built around an orchestrator that manages specialist AI agents, mirroring the collaborative nature of human red teams. This design choice separates the decision-making process from the execution of tasks, enhancing predictability, auditability, and control, which are crucial for operating in sensitive environments. Specialist agents are assigned specific roles, such as enumeration, code review, dynamic testing, and reporting, each with clearly defined inputs, outputs, and operational constraints.
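The separation between deciding and executing can be illustrated with a minimal Python sketch. It is not Rapid7's implementation: the names `AgentSpec`, `Orchestrator`, and `dispatch`, the stub agents, and the host `app.example.test` are all hypothetical, chosen only to show how declared inputs, outputs, and constraints might be enforced before any agent runs.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical contract for one specialist agent."""
    role: str
    requires: FrozenSet[str]          # inputs that must already exist in shared state
    produces: str                     # key under which the agent's output is stored
    allowed_targets: FrozenSet[str]   # operational constraint: in-scope targets only
    run: Callable[[Dict[str, Any], str], Any]


class Orchestrator:
    """Decides which agent runs and checks its contract; does no testing itself."""

    def __init__(self, specs: List[AgentSpec]) -> None:
        self.specs = {spec.role: spec for spec in specs}
        self.audit_log: List[Tuple[str, str, str]] = []

    def dispatch(self, role: str, target: str, state: Dict[str, Any]) -> Dict[str, Any]:
        spec = self.specs[role]
        if target not in spec.allowed_targets:
            self.audit_log.append((role, target, "refused: out of scope"))
            raise PermissionError(f"{role} may not touch {target}")
        missing = spec.requires - set(state)
        if missing:
            self.audit_log.append((role, target, "refused: missing input"))
            raise ValueError(f"{role} is missing inputs: {sorted(missing)}")
        state[spec.produces] = spec.run(state, target)
        self.audit_log.append((role, target, "ok"))
        return state


# Stub agents stand in for model-backed specialists (assumption for illustration).
enumeration = AgentSpec(
    role="enumeration",
    requires=frozenset(),
    produces="endpoints",
    allowed_targets=frozenset({"app.example.test"}),
    run=lambda state, target: [f"https://{target}/login"],
)
reporting = AgentSpec(
    role="reporting",
    requires=frozenset({"endpoints"}),
    produces="report",
    allowed_targets=frozenset({"app.example.test"}),
    run=lambda state, target: f"{len(state['endpoints'])} endpoint(s) found on {target}",
)

orchestrator = Orchestrator([enumeration, reporting])
state: Dict[str, Any] = {}
orchestrator.dispatch("enumeration", "app.example.test", state)
orchestrator.dispatch("reporting", "app.example.test", state)
print(state["report"])
```

Because every dispatch, including every refusal, is written to `audit_log`, a reviewer could reconstruct what ran against which target, which is one way the auditability described above might be achieved.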

A key design principle was to derive the architecture directly from the team's established daily task lists and engagement workflows. This approach keeps the AI system aligned with proven offensive strategies, incorporating branching logic and phase transitions that reflect real-world penetration testing scenarios. The system's effectiveness was further validated through its integration with Anthropic's Project Glasswing, where it was combined with the Claude Mythos model.
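Branching logic and phase transitions of this kind are often modeled as a small state machine. The Python sketch below is an assumption-laden illustration, not the team's actual workflow: the phase names reuse the agent roles mentioned earlier, while the conditions `source_available` and `new_surface_found` are hypothetical.

```python
from enum import Enum
from typing import Dict, List


class Phase(Enum):
    SCOPING = "scoping"
    ENUMERATION = "enumeration"
    CODE_REVIEW = "code_review"
    DYNAMIC_TESTING = "dynamic_testing"
    REPORTING = "reporting"
    DONE = "done"


def next_phase(phase: Phase, facts: Dict[str, bool]) -> Phase:
    """Pick the following phase from the current one and what is known so far."""
    if phase is Phase.SCOPING:
        return Phase.ENUMERATION
    if phase is Phase.ENUMERATION:
        # Branch: review source first when it is available (hypothetical rule).
        if facts.get("source_available"):
            return Phase.CODE_REVIEW
        return Phase.DYNAMIC_TESTING
    if phase is Phase.CODE_REVIEW:
        return Phase.DYNAMIC_TESTING
    if phase is Phase.DYNAMIC_TESTING:
        # Branch: newly discovered attack surface sends the workflow back.
        if facts.get("new_surface_found"):
            return Phase.ENUMERATION
        return Phase.REPORTING
    return Phase.DONE  # REPORTING is the last working phase


def plan(facts: Dict[str, bool]) -> List[str]:
    """Walk the phases from scoping to completion and record the path taken."""
    facts = dict(facts)  # work on a copy so the caller's dict is untouched
    phase, visited = Phase.SCOPING, []
    while phase is not Phase.DONE:
        visited.append(phase.value)
        following = next_phase(phase, facts)
        if phase is Phase.DYNAMIC_TESTING:
            facts["new_surface_found"] = False  # loop back at most once in this sketch
        phase = following
    return visited


print(plan({"source_available": True}))
print(plan({"source_available": False, "new_surface_found": True}))
```

Keeping the transitions in one pure function means the workflow can be reviewed and tested on its own, separately from whatever model executes each phase.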

This collaboration with Anthropic allowed Rapid7 to apply its AI architecture to vulnerability research and red team operations, yielding significant results in vulnerability analysis and the development of exploit chains. The project demonstrated the potential of combining a formalized multi-agent system with frontier AI models and underscored the importance of giving defenders early access to such capabilities.

One of the primary lessons learned during development was the necessity of deliberate scope decomposition. Instead of feeding an entire engagement scope to a single AI agent, which can overwhelm its context window and lead to shallow analysis, the scope is broken down into smaller, manageable chunks. Each component or functional area is processed independently through the full architecture, ensuring depth of analysis and enabling parallelization.
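A rough Python sketch of scope decomposition follows, under stated assumptions: the scope is represented as a mapping from functional area to assets, `assess` is a stand-in for a full pass through the agent pipeline, and all area names and paths are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List


def decompose(scope: Dict[str, List[str]]) -> List[Dict[str, Any]]:
    """Split one engagement scope into one work unit per functional area."""
    return [{"area": area, "assets": assets} for area, assets in sorted(scope.items())]


def assess(unit: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for running a single unit through the full agent pipeline."""
    return {"area": unit["area"], "assets_reviewed": len(unit["assets"])}


# Hypothetical scope; a real engagement would define its own areas.
scope = {
    "authentication": ["/login", "/reset-password"],
    "billing": ["/invoices"],
    "admin": ["/users", "/roles", "/audit"],
}

units = decompose(scope)

# Each unit is independent, so the units can be processed in parallel.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(assess, units))  # map preserves input order

for result in results:
    print(result["area"], result["assets_reviewed"])
```

Each call to `assess` sees only its own unit, which is the point of the technique: no single agent has to hold the entire engagement in its context at once.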

This decomposition strategy ensures that each part of the target receives the agent's full attention, preventing important details from being lost. It also provides clear progress tracking, allowing human operators to see precisely which areas have been thoroughly assessed. This mirrors the strategy of experienced human pentesters who break down complex targets into logical units for deep dives before synthesizing findings.
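Progress tracking over decomposed units might look something like the Python sketch below. The `CoverageTracker` class, its method names, and the sample finding are hypothetical and exist only to illustrate the idea of per-area status followed by synthesis.

```python
from typing import Dict, List, Tuple


class CoverageTracker:
    """Records which scope areas have been assessed and what each one produced."""

    def __init__(self, areas: List[str]) -> None:
        self.status: Dict[str, str] = {area: "pending" for area in areas}
        self.findings: Dict[str, List[str]] = {area: [] for area in areas}

    def complete(self, area: str, findings: List[str]) -> None:
        """Mark one area as assessed and store its findings."""
        if area not in self.status:
            raise KeyError(f"unknown area: {area}")
        self.status[area] = "assessed"
        self.findings[area] = list(findings)

    def pending(self) -> List[str]:
        """Areas that still need a deep dive, in alphabetical order."""
        return sorted(a for a, s in self.status.items() if s == "pending")

    def summary(self) -> str:
        done = sum(1 for s in self.status.values() if s == "assessed")
        return f"{done}/{len(self.status)} areas assessed"

    def synthesize(self) -> List[Tuple[str, str]]:
        """Flatten per-area findings into one list for the final write-up."""
        return [(area, f) for area in sorted(self.findings) for f in self.findings[area]]


tracker = CoverageTracker(["admin", "authentication", "billing"])
tracker.complete("authentication", ["illustrative finding"])  # made-up example data
print(tracker.summary())
print(tracker.pending())
```

A human operator reading `summary()` and `pending()` can see at a glance which areas have been covered, while `synthesize()` corresponds to the final step of merging per-area results.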

The system's development also provided Rapid7 with direct architectural insights into how AI agents behave in adversarial contexts. Understanding the capabilities, limitations, and failure modes of these agents is crucial for assessing and securing Rapid7's own AI-powered products and for anticipating how adversaries might employ similar technologies.

By formalizing its offensive methodology into a multi-agent AI architecture, Rapid7 aims not only to enhance its own red teaming capabilities but also to contribute to the broader cybersecurity community's understanding of AI-driven offensive and defensive strategies. The focus remains on keeping human judgment at critical decision points, so that AI augments human expertise in cybersecurity rather than replacing it.