Microsoft has deployed MDASH, a multi-agent artificial intelligence system, to detect software vulnerabilities. The system identified 16 new flaws in Windows components, including four critical remote code execution vulnerabilities. The system is currently in private preview, used by Microsoft's security teams and a limited number of customers.

System Architecture and Function

MDASH, internally referred to as the Microsoft Security multi-model agentic scanning harness, employs over 100 specialized AI agents. The system operates in a staged process:

Prepares target code for analysis
Scans for weaknesses
Validates findings through agent debate, where disagreement between models signals suspicious findings
Removes duplicate entries
Attempts to prove flaws by generating triggering inputs
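
The five stages above can be sketched as a small pipeline. This is a minimal illustration, not Microsoft's implementation: the names Finding, Report, debate and run_pipeline are hypothetical, the "prepare" step is a placeholder, and the choice to keep disputed findings (flagged rather than dropped) is an assumption, since the source only says that disagreement between models marks a finding as suspicious.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Finding:
    location: str  # hypothetical "file:line" label
    bug_type: str  # e.g. "use-after-free"


@dataclass(frozen=True)
class Report:
    finding: Finding
    verdict: str              # "confirmed" or "disputed"
    trigger: Optional[bytes]  # triggering input, if the prover produced one


Scanner = Callable[[str], Iterable[Finding]]
Reviewer = Callable[[Finding], bool]
Prover = Callable[[str, Finding], Optional[bytes]]


def debate(finding: Finding, reviewers: List[Reviewer]) -> str:
    """Unanimous yes -> confirmed, unanimous no -> rejected, otherwise disputed."""
    votes = [review(finding) for review in reviewers]
    if all(votes):
        return "confirmed"
    if not any(votes):
        return "rejected"
    return "disputed"


def run_pipeline(source: str, scanners: List[Scanner],
                 reviewers: List[Reviewer], prove: Prover) -> List[Report]:
    prepared = source.strip()                                # 1. prepare (placeholder)
    raw = [f for scan in scanners for f in scan(prepared)]   # 2. scan
    reports: List[Report] = []
    seen = set()
    for finding in raw:
        verdict = debate(finding, reviewers)                 # 3. validate by debate
        if verdict == "rejected" or finding in seen:         # 4. drop duplicates
            continue
        seen.add(finding)
        trigger = prove(prepared, finding)                   # 5. try to prove
        reports.append(Report(finding, verdict, trigger))
    return reports


if __name__ == "__main__":
    uaf = Finding("demo.c:10", "use-after-free")
    leak = Finding("demo.c:42", "memory-leak")
    scanners = [lambda code: [uaf, leak], lambda code: [uaf]]
    reviewers = [lambda f: True, lambda f: f.bug_type == "use-after-free"]
    prove = lambda code, f: b"\x00" if f == uaf else None
    for report in run_pipeline("int main(void) { return 0; }", scanners, reviewers, prove):
        print(report.finding.location, report.verdict, report.trigger)
```

In this toy run the two scanners report the same use-after-free, which is kept once, while the reviewers split on the memory leak, so it is carried forward as disputed with no triggering input.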

The system uses a combination of cutting-edge and smaller AI models, with agents specialized in identifying specific software bug types.

Vulnerabilities Discovered

MDASH identified 16 vulnerabilities in Windows networking and authentication components, all of which were included in a Patch Tuesday security release. They break down as follows:

10 vulnerabilities in kernel-mode software
6 vulnerabilities in user-mode software
Most vulnerabilities were reachable from a network without credentials

Critical Remote Code Execution Flaws

Two of the four critical flaws were documented in detail:

CVE-2026-33827 (tcpip.sys): A remote, unauthenticated use-after-free vulnerability in the Windows IPv4 receive path, related to Strict Source and Record Route processing. The flaw involved improper lifetime management of a reference-counted object.
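
The general bug class can be shown with a toy model. The sketch below is not the tcpip.sys code and makes no claim about the actual flaw beyond what is stated above; RefCounted, buggy_handler and fixed_handler are hypothetical names, and a "freed" flag stands in for released memory so the misuse can be detected instead of corrupting anything.

```python
class RefCounted:
    """Toy reference-counted object; `freed` stands in for released memory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.refs = 1        # the creator holds the first reference
        self.freed = False

    def acquire(self) -> "RefCounted":
        if self.freed:
            raise RuntimeError(f"use-after-free: {self.name}")
        self.refs += 1
        return self

    def release(self) -> None:
        if self.freed:
            raise RuntimeError(f"release after free: {self.name}")
        self.refs -= 1
        if self.refs == 0:
            self.freed = True  # last reference gone: object is destroyed

    def read(self) -> str:
        if self.freed:
            raise RuntimeError(f"use-after-free: {self.name}")
        return self.name


def buggy_handler(obj: RefCounted) -> str:
    borrowed = obj           # BUG: keeps a pointer without taking a reference
    obj.release()            # drops the only reference, so the object is freed
    return borrowed.read()   # touches the freed object


def fixed_handler(obj: RefCounted) -> str:
    borrowed = obj.acquire() # take our own reference before the owner lets go
    obj.release()
    try:
        return borrowed.read()
    finally:
        borrowed.release()   # the object is freed only here


if __name__ == "__main__":
    print(fixed_handler(RefCounted("route-entry")))
    try:
        buggy_handler(RefCounted("route-entry"))
    except RuntimeError as err:
        print(err)
```

In real kernel code the freed memory may already have been reused by the time it is read, which is why this class of bug can lead to remote code execution.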

CVE-2026-33824 (IKEEXT): A remote, unauthenticated double-free vulnerability over UDP/500 on hosts configured as IKEv2 responders. The flaw spanned six files and involved a shallow memory copy error.
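
How a shallow copy can end in a double free is easiest to see in a toy model. The following sketch is illustrative only and does not reproduce the IKEEXT code; Heap and Message are hypothetical names, and the allocator simply tracks which block handles are live so that a second free is reported.

```python
import copy


class Heap:
    """Toy allocator that tracks live blocks and reports double frees."""

    def __init__(self) -> None:
        self.live = set()
        self.next_id = 0

    def alloc(self) -> int:
        self.next_id += 1
        self.live.add(self.next_id)
        return self.next_id

    def free(self, block: int) -> None:
        if block not in self.live:
            raise RuntimeError(f"double free of block {block}")
        self.live.remove(block)


class Message:
    """Owns one heap block and frees it on destroy()."""

    def __init__(self, heap: Heap) -> None:
        self.heap = heap
        self.payload = heap.alloc()

    def shallow_copy(self) -> "Message":
        # Copies the handle only: both objects now believe they own the block.
        return copy.copy(self)

    def deep_copy(self) -> "Message":
        # Allocates a separate block for the copy (content copying omitted).
        return Message(self.heap)

    def destroy(self) -> None:
        self.heap.free(self.payload)


if __name__ == "__main__":
    heap = Heap()
    original = Message(heap)
    alias = original.shallow_copy()
    original.destroy()
    try:
        alias.destroy()      # second free of the same block
    except RuntimeError as err:
        print(err)
```

With deep_copy each object frees its own block, so both destroy() calls succeed.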

Performance Benchmarks

MDASH achieved the following results in testing:

Test: Result
Private sample device driver (StorageDrive) with 21 planted vulnerabilities: 100% detection rate, zero false positives
Historical MSRC cases (clfs.sys, 28 confirmed bugs): 96% recall rate
Historical MSRC cases (tcpip.sys, 7 confirmed bugs): 100% recall rate
Public CyberGym benchmark: 88.45% score (highest published at time of report)

MDASH's CyberGym score exceeded those of other AI models, including Anthropic's Claude Mythos and OpenAI's GPT 5.5.
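
Recall is the share of known bugs that were found. The source gives percentages but not raw counts for the historical cases, so the short sketch below works backwards, assuming each figure was rounded to the nearest whole percent. The function names are hypothetical.

```python
from typing import List


def recall(found: int, confirmed: int) -> float:
    """Fraction of confirmed bugs that the scanner rediscovered."""
    return found / confirmed


def consistent_counts(confirmed: int, reported_percent: int) -> List[int]:
    """Bug counts whose recall rounds to the reported whole percentage."""
    return [
        found
        for found in range(confirmed + 1)
        if round(100 * recall(found, confirmed)) == reported_percent
    ]


if __name__ == "__main__":
    print(consistent_counts(28, 96))   # clfs.sys: 28 confirmed bugs, 96% recall
    print(consistent_counts(7, 100))   # tcpip.sys: 7 confirmed bugs, 100% recall
```

Under that rounding assumption, 27 of 28 is the only count that matches the 96% figure for clfs.sys, which would mean one confirmed bug was missed.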

Development and Deployment

The work was conducted by Microsoft's Autonomous Code Security team in collaboration with Windows Attack Research and Protection. The system is designed to approximate the work of professional offensive security researchers.

Current access:

Microsoft's security engineering teams
A limited set of customers in private preview

Microsoft plans to expand access to select enterprise customers upon application.

The system allows for plugins that inject specialist knowledge into the scanning process.

Context

MDASH is part of a broader trend in cybersecurity where defensive and offensive actors are incorporating AI tools. Microsoft has implemented access controls on the system to prevent misuse.