Microsoft's MDASH Agentic AI System Discovers 16 Windows Vulnerabilities, Including Four Critical RCE Flaws
Microsoft's new agentic security system, MDASH, has uncovered 16 vulnerabilities in the Windows networking and authentication stack, including four critical remote code execution flaws, marking a shift from AI research to production-grade vulnerability discovery.

Microsoft has announced that its new agentic security system, codenamed MDASH, discovered 16 vulnerabilities in the Windows networking and authentication stack, including four critical remote code execution (RCE) flaws. Microsoft flagged two of them, CVE-2026-40361 and CVE-2026-40364, as more likely to be exploited in the wild. The findings were disclosed as part of the May 2026 Patch Tuesday release.
MDASH, built by Microsoft's Autonomous Code Security team, uses over 100 specialized AI agents and an ensemble of frontier and distilled models to discover, debate, and validate exploitable vulnerabilities end-to-end. According to Taesoo Kim, VP of Agentic Security at Microsoft, "AI vulnerability discovery has crossed from research curiosity into production-grade defense at enterprise scale, and the durable advantage lies in the agentic system around the model rather than any single model itself."
To validate MDASH's capabilities, Microsoft tested the system against a private Windows driver named StorageDrive that contained 21 intentionally injected vulnerabilities, including kernel use-after-frees, integer handling issues, IOCTL validation gaps, and locking errors. Because StorageDrive had never been publicly released, the benchmark minimized the risk that AI models had previously seen the code during training. MDASH identified all 21 vulnerabilities without generating any false positives, a result Kim described as showing that the system "can approximate professional offensive researchers."
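For readers less familiar with detection metrics, the StorageDrive result can be restated as precision and recall. The following is a minimal sketch that assumes the standard definitions of those metrics; the class and field names are illustrative and do not come from Microsoft's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionResult:
    """Counts from running a bug finder against a seeded benchmark (hypothetical type)."""
    true_positives: int   # injected bugs the tool reported
    false_positives: int  # reports that matched no injected bug
    false_negatives: int  # injected bugs the tool missed

    @property
    def precision(self) -> float:
        # Share of reported findings that were real bugs.
        reported = self.true_positives + self.false_positives
        return self.true_positives / reported if reported else 0.0

    @property
    def recall(self) -> float:
        # Share of real bugs that were reported.
        actual = self.true_positives + self.false_negatives
        return self.true_positives / actual if actual else 0.0


# Figures reported for StorageDrive: 21 injected bugs, all found, no false positives.
storagedrive = DetectionResult(true_positives=21, false_positives=0, false_negatives=0)
print(f"precision={storagedrive.precision:.2f} recall={storagedrive.recall:.2f}")
```

Both values come out to 1.00 for the reported figures. Recall drops when injected bugs are missed, and precision drops when the tool reports findings that match no injected bug.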
Microsoft also highlighted MDASH's performance on internal and public benchmarks. The system achieved a 96% recall rate against five years of confirmed Microsoft Security Response Center (MSRC) vulnerabilities in clfs.sys and a 100% recall rate in tcpip.sys. On the CyberGym public benchmark, which contains 1,507 vulnerabilities from OSS-Fuzz projects, MDASH scored 88.45%, placing it at the top of the leaderboard, roughly five percentage points ahead of the next highest-ranked system.
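To put the CyberGym percentage in concrete terms, the short calculation below converts it into an approximate instance count. It assumes the score is a simple fraction of the 1,507 benchmark instances, which the announcement does not state explicitly, so the count is an inference and not a reported figure.

```python
TOTAL_INSTANCES = 1507   # CyberGym benchmark size, as reported
REPORTED_SCORE = 0.8845  # MDASH's reported score (88.45%)

# Assumption: the score is solved instances divided by total instances.
implied_solved = round(TOTAL_INSTANCES * REPORTED_SCORE)
implied_score = implied_solved / TOTAL_INSTANCES

print(f"implied solved instances: {implied_solved} of {TOTAL_INSTANCES}")
print(f"score recomputed from that count: {implied_score:.2%}")
```

Under that assumption, the reported score corresponds to 1,333 of the 1,507 instances.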
Microsoft has not yet disclosed full technical details for all four critical RCE vulnerabilities. The company reiterated that CVE-2026-40361 and CVE-2026-40364 are more likely to be exploited and urged customers to prioritize patching them. The remaining 12 vulnerabilities range from elevation of privilege to information disclosure, with varying severity levels.
MDASH is currently in a limited private preview with select customers, signaling Microsoft's intent to commercialize the system as a service. Kim noted that "AI-powered vulnerability discovery stops being speculative and starts being an engineering problem," adding that the findings demonstrate that "AI vulnerability findings can scale." The system's ability to find real-world flaws in production code marks a significant milestone in the application of AI to offensive security.
The announcement comes amid growing competition in AI-driven security, with other vendors and open-source projects also developing autonomous vulnerability discovery tools. Microsoft's approach — using a multi-agent system rather than a single model — may set a new standard for how enterprises approach proactive vulnerability hunting. As MDASH moves toward broader availability, it could fundamentally change the economics of finding and fixing security flaws before attackers exploit them.
Microsoft's blog post goes into more technical detail on MDASH's architecture, describing how the system orchestrates its agents across the model ensemble to discover, debate, and prove exploitable bugs, and notes that the limited private preview is aimed at security engineering teams. The post also includes deep dives into two critical CVEs: CVE-2026-33827 (a remote, unauthenticated use-after-free in tcpip.sys) and CVE-2026-33824 (an unauthenticated IKEv2 double-free leading to RCE as LocalSystem).