MDASH: Microsoft’s AI Beats Claude Mythos at Finding Security Flaws

2026-05-27 Florian Burnel

MDASH is the name of Microsoft’s new AI system specialized in vulnerability discovery. This multi-model agentic security system orchestrates more than 100 AI agents to detect, triage, and prove the existence of security flaws at scale.

Table of Contents

More Than 100 Specialized AI Agents to Audit Code
Is MDASH More Effective Than Claude Mythos Preview?
Vulnerabilities Discovered in Windows by MDASH

More Than 100 Specialized AI Agents to Audit Code

MDASH, short for Multi-moDel Agentic Scanning Harness, was designed by Microsoft’s Autonomous Code Security (ACS) team. By orchestrating AI agents, the system can discover and validate security flaws in source code, then prove that they are exploitable. It has already helped uncover vulnerabilities in Windows through codebase analysis.

"Unlike single-model approaches, this platform coordinates more than 100 specialized AI agents across a set of state-of-the-art and distilled models to detect, analyze, and end-to-end validate exploitable flaws," Microsoft explains.

MDASH works through a structured pipeline made up of five main stages:

Preparation: ingesting the target source code, building language-aware indexes, and mapping the attack surface and threat models by analyzing commit history.
Analysis: deploying specialized auditing agents to identify code paths leading to potential vulnerabilities. These agents also gather evidence associated with each discovered vulnerability.
Validation: a second set of agents, acting as debaters, argues for and against the feasibility and real-world exploitability of each detected flaw.
Deduplication: merging and grouping semantically equivalent results.
Proof: building and dynamically executing trigger scenarios to prove the vulnerability exists.
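
To make the five stages more concrete, here is a minimal Python sketch of how such a pipeline could be chained together. It is not Microsoft's implementation: every name in it (Finding, prepare, analyze, validate, deduplicate, prove, run_pipeline, strcpy_auditor) is hypothetical, the "agents" are plain functions, the debate is reduced to a majority vote, and deduplication uses an exact key rather than semantic equivalence.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Finding:
    path: str
    line_no: int
    bug_class: str
    evidence: str


Index = Dict[str, List[str]]
Auditor = Callable[[Index], List[Finding]]
Debater = Callable[[Finding], bool]  # True means "I argue this is exploitable"


def prepare(source: Dict[str, str]) -> Index:
    # Stage 1 (toy): ingest the target and build a per-file line index.
    return {path: text.splitlines() for path, text in source.items()}


def analyze(index: Index, auditors: List[Auditor]) -> List[Finding]:
    # Stage 2: each auditor reports candidate findings with their evidence.
    return [f for auditor in auditors for f in auditor(index)]


def validate(findings: List[Finding], debaters: List[Debater]) -> List[Finding]:
    # Stage 3: keep a finding only if a strict majority of debaters argue for it.
    kept = []
    for f in findings:
        votes_for = sum(1 for debater in debaters if debater(f))
        if votes_for * 2 > len(debaters):
            kept.append(f)
    return kept


def deduplicate(findings: List[Finding]) -> List[Finding]:
    # Stage 4 (toy): merge findings with the same location and bug class.
    unique: Dict[Tuple[str, int, str], Finding] = {}
    for f in findings:
        unique.setdefault((f.path, f.line_no, f.bug_class), f)
    return list(unique.values())


def prove(findings: List[Finding], trigger: Callable[[Finding], bool]) -> List[Finding]:
    # Stage 5: keep only findings whose trigger scenario reproduces the bug.
    return [f for f in findings if trigger(f)]


def run_pipeline(source, auditors, debaters, trigger) -> List[Finding]:
    index = prepare(source)
    candidates = analyze(index, auditors)
    plausible = validate(candidates, debaters)
    merged = deduplicate(plausible)
    return prove(merged, trigger)


def strcpy_auditor(index: Index) -> List[Finding]:
    # Hypothetical auditor: flags every call to strcpy as an unbounded copy.
    return [
        Finding(path, n, "unbounded copy", line.strip())
        for path, lines in index.items()
        for n, line in enumerate(lines, start=1)
        if "strcpy(" in line
    ]


source = {"driver.c": "void f(char *s) {\n  char b[8];\n  strcpy(b, s);\n}\n"}
debaters = [lambda f: True, lambda f: "strcpy" in f.evidence, lambda f: False]
# Two identical auditors report the same flaw twice; deduplication merges them.
proven = run_pipeline(source, [strcpy_auditor, strcpy_auditor], debaters, lambda f: True)
```

In this toy run, the two auditors produce duplicate findings, two of the three debaters vote in favor, and the stand-in trigger always succeeds, so a single finding on line 3 of driver.c survives. A real system would replace the trigger with an actual dynamic execution of the target.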

Is MDASH More Effective Than Claude Mythos Preview?

To assess MDASH’s discovery capabilities accurately, Microsoft tasked it with analyzing StorageDrive, a private device driver used internally in interviews for offensive security researchers. Because this code has never been released, no AI model could have been trained on it.

The MDASH system identified all 21 injected vulnerabilities (locking errors, use-after-free memory corruptions, and more), with zero false positives. It also achieved an 88.45% success rate on the public CyberGym benchmark (which includes 1,507 real-world vulnerability reproduction tasks), taking the top spot overall, just over five points ahead of second place (83.1%).
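
The figures above can be checked with a few lines of Python. This is only a back-of-the-envelope calculation based on the numbers quoted in this article: the number of solved CyberGym tasks is an estimate derived from the percentage, not a value reported by Microsoft, and the variable names are illustrative.

```python
# CyberGym figures quoted in the article
total_tasks = 1507
mdash_rate = 88.45      # percent
second_rate = 83.1      # percent

# Estimated number of tasks solved (derived, not an official figure)
solved_estimate = round(total_tasks * mdash_rate / 100)

# Lead over second place, in percentage points
gap_points = round(mdash_rate - second_rate, 2)

# StorageDrive test: 21 injected flaws, all found, no false positives
true_positives, false_positives, false_negatives = 21, 0, 0
precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)

print(solved_estimate, gap_points, precision, recall)
```

An 88.45% success rate corresponds to roughly 1,333 of the 1,507 tasks, and the lead over second place works out to 5.35 percentage points. On StorageDrive, finding 21 of 21 flaws with no false positives means both precision and recall are 1.0.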

Based on these tests, Microsoft reports that its system outperforms Anthropic’s Claude Mythos Preview.

Vulnerabilities Discovered in Windows by MDASH

MDASH’s impact is already visible in Microsoft’s security updates. The May 2026 Patch Tuesday references 16 new security vulnerabilities discovered by the MDASH system. They affect Windows networking and authentication components.

Among them are four critical vulnerabilities that enable remote code execution (RCE), directly affecting the kernel-mode TCP/IP driver (tcpip.sys) and the IKEv2 service (ikeext.dll). Another notable finding is the critical flaw CVE-2026-41089 in netlogon.dll.

Microsoft also shared details about two vulnerabilities, including CVE-2026-33824. Rated critical (CVSS 9.8), it affects the network library ikeext.dll. This double-free flaw allows a remote, unauthenticated attacker to send specially crafted packets to a machine where the IKEv2 protocol is enabled. With no user interaction required, the attacker can execute arbitrary code and take full control of the Windows system.
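
For readers unfamiliar with this bug class, here is a toy Python model of why a double free is dangerous. It does not reproduce CVE-2026-33824 or the actual code of ikeext.dll, and real allocators are far more complex. The ToyHeap class and its methods are hypothetical and exist only to illustrate the general mechanism: a block released twice can be handed out twice.

```python
class ToyHeap:
    """Minimal free-list allocator with no double-free check (unsafe on purpose)."""

    def __init__(self, blocks: int) -> None:
        self.free_list = list(range(blocks))  # block ids available for reuse
        self.memory = {}                      # block id -> stored value

    def alloc(self) -> int:
        # Hand out the most recently freed block.
        return self.free_list.pop()

    def free(self, block: int) -> None:
        # Deliberate bug: nothing prevents the same block from being freed twice.
        self.free_list.append(block)


heap = ToyHeap(blocks=4)

session = heap.alloc()
heap.free(session)
heap.free(session)      # double free: the block is now on the free list twice

b = heap.alloc()        # the program stores a trusted object here
heap.memory[b] = "trusted handler"

a = heap.alloc()        # the same block is handed out a second time
heap.memory[a] = "attacker bytes"

print(a == b, heap.memory[b])
```

Both allocations return the same block, so data written through one reference silently overwrites the object held through the other. In native code, this kind of aliasing is what can let an attacker corrupt memory and, in the worst case, redirect execution.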

To learn more, I invite you to read the full report published by Microsoft.

Florian Burnel Co-founder of IT-Connect

Systems and network engineer, co-founder of IT-Connect and Microsoft MVP "Cloud and Datacenter Management". I'd like to share my experience and discoveries through my articles. I'm a generalist with a particular interest in Microsoft solutions and scripting. Enjoy your reading.
