Microsoft has deployed MDASH, a multi-agent artificial intelligence system, to detect software vulnerabilities. The system identified 16 new flaws in Windows components, including four critical remote code execution vulnerabilities. The system is currently in private preview, used by Microsoft's security teams and a limited number of customers.
System Architecture and Function
MDASH, internally referred to as the Microsoft Security multi-model agentic scanning harness, employs over 100 specialized AI agents. The system operates in a staged process:
- Prepares target code for analysis
- Scans for weaknesses
- Validates findings through agent debate, where disagreement between models signals suspicious findings
- Removes duplicate entries
- Attempts to prove flaws by generating triggering inputs
The system uses a combination of cutting-edge and smaller AI models, with agents specialized in identifying specific software bug types.
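The staged process above can be pictured as a small pipeline. The following is a minimal sketch under stated assumptions, not Microsoft's implementation: every name in it (Finding, Verdict, prepare, scan, debate, dedupe, prove) is hypothetical, the "agents" are plain Python functions rather than AI models, and the debate rule (drop unanimously rejected findings, flag split votes as contested) is one possible reading of "disagreement between models signals suspicious findings".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    bug_class: str


@dataclass
class Verdict:
    finding: Finding
    contested: bool                   # True when reviewers disagreed
    trigger: Optional[bytes] = None   # triggering input, if one was produced


def prepare(sources):
    """Stage 1 (assumed): normalise the target code before analysis."""
    return {name: text.expandtabs(4) for name, text in sources.items()}


def scan(sources, scanners):
    """Stage 2: every specialised scanner examines every file."""
    return [f for name, text in sources.items()
            for scanner in scanners
            for f in scanner(name, text)]


def debate(findings, reviewers):
    """Stage 3 (assumed rule): drop unanimous rejections, flag split votes."""
    verdicts = []
    for finding in findings:
        votes = [reviewer(finding) for reviewer in reviewers]
        if not any(votes):
            continue
        verdicts.append(Verdict(finding, contested=not all(votes)))
    return verdicts


def dedupe(verdicts):
    """Stage 4: keep one verdict per distinct finding."""
    unique = {}
    for verdict in verdicts:
        unique.setdefault(verdict.finding, verdict)
    return list(unique.values())


def prove(verdicts, prover):
    """Stage 5: try to generate an input that triggers each finding."""
    for verdict in verdicts:
        verdict.trigger = prover(verdict.finding)
    return verdicts


def run_pipeline(sources, scanners, reviewers, prover):
    prepared = prepare(sources)
    return prove(dedupe(debate(scan(prepared, scanners), reviewers)), prover)


# Toy stand-ins for agents: a pattern matcher, two reviewers, and a prover.
def copy_scanner(name, text):
    return [Finding(name, number, "unchecked-copy")
            for number, line in enumerate(text.splitlines(), 1)
            if "memcpy(" in line]


sources = {"demo.c": "void f(char *d, char *s, int n) {\n\tmemcpy(d, s, n);\n}\n"}
results = run_pipeline(
    sources,
    scanners=[copy_scanner, copy_scanner],          # two agents report the same bug
    reviewers=[lambda f: True, lambda f: False],    # one accepts, one rejects
    prover=lambda f: b"A" * 64,
)
for verdict in results:
    print(verdict)
```

In this toy run the two scanners report the same line, so deduplication leaves a single finding, and the split vote marks it as contested rather than discarding it.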
Vulnerabilities Discovered
MDASH identified 16 vulnerabilities in Windows networking and authentication components, included in a Patch Tuesday security release. The breakdown includes:
- 10 vulnerabilities in kernel-mode software
- 6 vulnerabilities in user-mode software
- Most were reachable over the network without credentials
Critical Remote Code Execution Flaws
Two of the four critical flaws were documented in detail:
CVE-2026-33827 (tcpip.sys): A remote, unauthenticated use-after-free vulnerability in the Windows IPv4 receive path, related to Strict Source and Record Route processing. The flaw involved improper lifetime management of a reference-counted object.
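To illustrate the general bug class only, the sketch below simulates a reference-counted object in Python. It is not the tcpip.sys code, and it says nothing about how Strict Source and Record Route processing reaches the flaw; the RefCounted class and both code paths are hypothetical. Python manages memory automatically, so the "free" is modelled with a flag.

```python
class UseAfterFree(Exception):
    """Raised when the simulated object is touched after being freed."""


class RefCounted:
    """Toy stand-in for an object whose lifetime is governed by a reference count."""

    def __init__(self, payload):
        self.payload = payload
        self.refs = 1        # the creator holds the first reference
        self.freed = False

    def _check(self):
        if self.freed:
            raise UseAfterFree("object accessed after its last reference was dropped")

    def acquire(self):
        self._check()
        self.refs += 1
        return self

    def release(self):
        self._check()
        self.refs -= 1
        if self.refs == 0:
            self.freed = True    # real code would return the memory here
            self.payload = None

    def read(self):
        self._check()
        return self.payload


def buggy_path(obj):
    borrowed = obj           # BUG: keeps a pointer without taking a reference
    obj.release()            # the owner drops the last reference; object is freed
    return borrowed.read()   # dangling access


def fixed_path(obj):
    borrowed = obj.acquire() # take our own reference before the owner lets go
    obj.release()            # the count stays above zero, so nothing is freed yet
    try:
        return borrowed.read()
    finally:
        borrowed.release()   # drop our reference once we are done


try:
    buggy_path(RefCounted("packet"))
except UseAfterFree as exc:
    print("buggy path:", exc)

print("fixed path:", fixed_path(RefCounted("packet")))
```

The fix in this model is simply to hold a reference for as long as the pointer is in use, which is the usual discipline for reference-counted lifetimes.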
CVE-2026-33824 (IKEEXT): A remote, unauthenticated double-free vulnerability over UDP/500 on hosts configured as IKEv2 responders. The flaw spanned six files and involved a shallow memory copy error.
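How a shallow copy can lead to a double free is easiest to see in a toy model. The sketch below is an illustration of the bug class under assumed names (ToyHeap, Message, shallow_clone, deep_clone are all hypothetical) and does not reproduce the IKEEXT code. A shallow copy duplicates the structure but not the buffer it owns, so two owners end up freeing the same allocation.

```python
import copy


class DoubleFree(Exception):
    """Raised when the simulated heap sees the same handle freed twice."""


class ToyHeap:
    """Tracks live allocations by integer handle."""

    def __init__(self):
        self._live = set()
        self._next = 1

    def alloc(self):
        handle = self._next
        self._next += 1
        self._live.add(handle)
        return handle

    def free(self, handle):
        if handle not in self._live:
            raise DoubleFree(f"handle {handle} freed twice")
        self._live.remove(handle)

    def live_count(self):
        return len(self._live)


class Message:
    """Hypothetical structure that owns exactly one heap buffer."""

    def __init__(self, heap, buffer):
        self.heap = heap
        self.buffer = buffer

    def destroy(self):
        self.heap.free(self.buffer)


def shallow_clone(msg):
    # Copies the fields as-is: both messages now point at the same buffer.
    return copy.copy(msg)


def deep_clone(msg):
    # Gives the clone its own allocation, so each owner frees a distinct buffer.
    return Message(msg.heap, msg.heap.alloc())


heap = ToyHeap()

original = Message(heap, heap.alloc())
alias = shallow_clone(original)
original.destroy()
try:
    alias.destroy()              # second free of the shared buffer
except DoubleFree as exc:
    print("shallow copy:", exc)

original = Message(heap, heap.alloc())
clone = deep_clone(original)
original.destroy()
clone.destroy()                  # distinct buffers, so both frees succeed
print("live allocations after deep copy path:", heap.live_count())
```

In real C code the second free would corrupt allocator state rather than raise an exception, which is what makes this class of bug exploitable.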
Performance Benchmarks
MDASH achieved the following results in testing:
- Private sample device driver (StorageDrive) with 21 planted vulnerabilities: 100% detection rate, zero false positives
- Historical MSRC cases (clfs.sys, 28 confirmed bugs): 96% recall rate
- Historical MSRC cases (tcpip.sys, 7 confirmed bugs): 100% recall rate
- Public CyberGym benchmark: 88.45% score (highest published at time of report)
The CyberGym score exceeded those of other AI models, including Anthropic's Claude Mythos and OpenAI's GPT 5.5.
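Recall here means the share of confirmed bugs that the system rediscovered. As a worked check of the arithmetic, the sketch below computes recall and lists which whole-number bug counts are consistent with a reported rounded percentage. The function names are hypothetical, and the per-component counts it derives are an inference from the published percentages, not figures stated in the report.

```python
def recall(found, confirmed):
    """Fraction of confirmed bugs that were found."""
    if confirmed <= 0:
        raise ValueError("confirmed must be positive")
    return found / confirmed


def consistent_counts(confirmed, reported_percent):
    """Whole-number 'found' counts that round to the reported percentage."""
    return [found for found in range(confirmed + 1)
            if round(100 * recall(found, confirmed)) == reported_percent]


# clfs.sys: 28 confirmed bugs, 96% recall reported.
print(consistent_counts(28, 96))
# tcpip.sys: 7 confirmed bugs, 100% recall reported.
print(consistent_counts(7, 100))
```

Under that assumption, the only count consistent with 96% of 28 is 27 (27/28 is about 96.4%), which would mean one historical clfs.sys bug was missed.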
Development and Deployment
The work was conducted by Microsoft's Autonomous Code Security team in collaboration with Windows Attack Research and Protection. The system is designed to approximate the work of professional offensive security researchers.
Current access:
- Microsoft's security engineering teams
- A limited set of customers in private preview
Microsoft plans to expand access to select enterprise customers, who will need to apply.
The system allows for plugins that inject specialist knowledge into the scanning process.
Context
MDASH is part of a broader trend in cybersecurity where defensive and offensive actors are incorporating AI tools. Microsoft has implemented access controls on the system to prevent misuse.