
The Symmetry Problem: Why AI Attackers and AI Defenders Share the Same Blind Spots

by Steven van der Baan

15 June 2026

This is the fourth piece in a series. The first, The Cartographer's Advantage, made the philosophical argument for why human judgment is structurally irreplaceable in security testing. The second, The Expedition Debrief, asked what that looks like in practice. The third, Teaching the Map-Makers, examined what AI's training data actually contains, and what it systematically leaves out. This piece asks a question the first three left standing: when both the attacker and the defender are navigating by AI-generated maps, what does it mean that they learned cartography from the same teachers?

Last year, XBOW became the first non-human entity to top HackerOne's US leaderboard. A few months later, Burp Suite released Burp AI, an autonomous capability that investigates scanner findings, attempts exploitation, and identifies additional attack vectors without human direction. Most practitioners read both events as data points in the same narrative: AI is transforming what attackers can do, AI is transforming what defenders can do, and the race between them will determine the structural advantage for the next decade. That narrative is mostly right. The part worth examining carefully is where the race is actually happening, because both sides drew their maps from the same cartographic school, and their maps run out at the same place.

 

The Cartographers Had the Same Teacher

The previous piece in this series spent some time on the training corpus that AI security tools learn from: CVE databases, bug bounty reports, security research papers, OWASP lists, exploit archives, conference presentations. The sources share a structural characteristic. They are records of vulnerabilities that were found, named, documented, and published. That is the slice of the vulnerability landscape that intersects with the incentive structures and publication norms of the security industry, not a representative sample of what's actually exploitable in real applications.

That analysis focused on the offensive side. The defensive side has the same provenance.

The AI capabilities underpinning modern DAST platforms, SAST tools, AI-augmented WAFs, and products like Burp AI were built on the same substrate. The patterns they match against are the patterns that made it into security research. The detection signatures they generate are derived from the same publicly documented vulnerability corpus as the attack patterns the offensive tools search for. This isn't a criticism of the vendors building these tools. It's a structural observation about where security knowledge gets recorded and how it becomes a training signal: findings that were triage-legible, demonstrable, and documented are over-represented; findings that were application-specific, context-dependent, or too complex to generalise are not.

The observers closest to XBOW's work are frank about what this means. Michiel Prins, HackerOne's co-founder, put it directly: business logic vulnerabilities are "very hard for an AI to find, because the AI needs to really understand the intent of the application." Of XBOW's 1,060 submitted findings, he noted the tool "excels in volume" but "does not yet excel in business impact." Amélie Koran, reviewing the findings, described them as "the more basic things you can find with automation: data leaks, XML exposure, cross-site scripting, command injection and access control." The tool that topped the HackerOne leaderboard was winning on the documented terrain: the exact classes the corpus could teach it. The category it can't reach is precisely the one that doesn't survive translation into CVE format.

At Black Hat 2025, the consensus finding pointed in the same direction from the other side of the engagement. Analysts observed that AI tools are "particularly adept at spotting SQL vulnerabilities, likely due to training data since SQL flaws are commonplace." The attribution is direct: the tool is strong on SQL not because of architectural superiority but because SQL injection is common in the training corpus. Swap the framing from offense to defense and the same sentence applies. For the same reason, AI detection tools are strongest on the vulnerability classes that are well documented.

 

What the Race Is Actually Over

None of this diminishes what AI security tools are achieving on their home terrain. Shannon, an open-source autonomous pentester, runs comprehensive assessments for roughly sixty dollars and finds real vulnerabilities at rates competitive with skilled human testers on the classes it knows (what its documentation calls a "hit list" derived from OWASP and public vulnerability research). Research benchmarking AI pentesters against sanitized CVE datasets puts success rates at around 87%. These are genuine capabilities delivering genuine value on well-documented attack surface.

The cliff appears when you step off that terrain, and it's steeper than most discussions of AI security acknowledge. The same benchmark analysis puts success under realistic conditions (novel application behaviour, target-specific logic, custom implementations that aren't in the training data) at around 13%. The gap between 87% and 13% is the corpus talking. The tools perform well on what they've seen. They struggle on what they haven't.

The arms race is happening on the terrain where both sides' success rates are high. Attacker AI is improving at finding documented vulnerability classes faster, cheaper, and more consistently than human attackers. Defensive AI is improving at detecting and responding to those same classes faster than human defenders. Both trajectories are real and the delta between them matters. CERT-EU documents that mean time to exploit newly disclosed vulnerabilities has dropped to approximately negative seven days (exploitation now routinely precedes patching), and AI-generated exploits are a significant part of that acceleration. On documented terrain, the race is genuine and the stakes are high.

But documented terrain is not the whole territory.

 

The Terrain Both Maps Skip

Consider IDOR (insecure direct object reference). It's the example I keep returning to because it exposes the gap so cleanly. A DAST scanner looking for insecure direct object references sees a request that returns a valid HTTP 200. The scanner marks it as clean and moves on. IDOR isn't a malformed-input problem; it's a logic problem. The request is syntactically valid. The flaw is that the ownership check the application was supposed to perform didn't happen, or happened incorrectly, or happened correctly in one endpoint and not in a downstream one. The scanner has no model for "this is correct syntax but violates the intended authorization boundary," because that model requires understanding the intended authorization boundary, which isn't in the corpus.
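To make the gap concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the in-memory INVOICES store, the two handlers, and both check functions. It is not how any particular scanner works; it only contrasts a status-code view of a response with a check that knows who is supposed to own the object.

```python
from dataclasses import dataclass

# Hypothetical in-memory "application": invoice IDs mapped to their owners.
INVOICES = {
    101: {"owner": "alice", "total": 120.00},
    102: {"owner": "bob", "total": 75.50},
}


@dataclass
class Response:
    status: int
    body: dict


def get_invoice_vulnerable(session_user: str, invoice_id: int) -> Response:
    """Returns any invoice that exists; the ownership check is missing."""
    invoice = INVOICES.get(invoice_id)
    if invoice is None:
        return Response(404, {})
    return Response(200, invoice)


def get_invoice_checked(session_user: str, invoice_id: int) -> Response:
    """Same endpoint with the intended ownership check in place."""
    invoice = INVOICES.get(invoice_id)
    if invoice is None or invoice["owner"] != session_user:
        return Response(404, {})
    return Response(200, invoice)


def looks_clean_by_status(handler, user: str, invoice_id: int) -> bool:
    """A status-code-only view: a valid 200 is treated as 'nothing wrong'."""
    return handler(user, invoice_id).status == 200


def leaks_foreign_object(handler, user: str, invoice_id: int) -> bool:
    """True if the handler returns an object the requesting user does not own.

    This check only works because it encodes the intended authorization
    boundary (invoices belong to exactly one owner), which is assumed here.
    """
    resp = handler(user, invoice_id)
    return resp.status == 200 and resp.body.get("owner") != user
```

Calling looks_clean_by_status(get_invoice_vulnerable, "alice", 102) reports a clean response, while leaks_foreign_object on the same call flags it. The second function is trivial to write once someone has stated the ownership rule; the hard part is knowing the rule exists.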

The defensive gap mirrors the offensive gap precisely, and for the same reason. DAST platforms miss business logic vulnerabilities: incorrect discount calculations, broken billing rules, permission assumptions that different teams implemented differently. They miss multi-step workflow failures that require maintaining session state across a sequence of requests. They miss authorization boundary issues where the authorization model itself is flawed rather than incorrectly implemented. Google's Big Sleep, one of the most sophisticated defensive AI vulnerability discovery systems deployed in production, had its confirmed finds in 2024 concentrated entirely in memory safety classes: buffer overflows, integer overflows, use-after-free. No business logic finds. No state-dependent authorization finds. The finding set is a precise reflection of the training distribution.
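An incorrect discount calculation shows the same pattern. The sketch below is an invented example, not drawn from any of the tools or reports cited here: the 30% cap, the function names, and the coupon rates are all assumptions. Each coupon is individually valid, so nothing looks malformed; the flaw only appears when the result is compared against a business rule that lives outside the code.

```python
from decimal import Decimal

# Hypothetical business rule: total discounts on an order may not exceed 30%.
MAX_DISCOUNT = Decimal("0.30")


def apply_coupons_flawed(subtotal: Decimal, coupons: list[Decimal]) -> Decimal:
    """Applies each coupon rate in turn; nothing enforces the cap across coupons."""
    total = subtotal
    for rate in coupons:
        total -= subtotal * rate
    return total


def violates_discount_cap(subtotal: Decimal, total: Decimal) -> bool:
    """Invariant a tester can write only after learning the intended rule."""
    floor = subtotal * (Decimal("1") - MAX_DISCOUNT)
    return total < floor


if __name__ == "__main__":
    subtotal = Decimal("100")
    one_coupon = apply_coupons_flawed(subtotal, [Decimal("0.20")])
    two_coupons = apply_coupons_flawed(subtotal, [Decimal("0.20"), Decimal("0.20")])
    print(violates_discount_cap(subtotal, one_coupon))
    print(violates_discount_cap(subtotal, two_coupons))
```

With one 20% coupon the invariant holds; with two stacked 20% coupons the order total drops below the capped floor and the invariant fails. No request in that sequence is syntactically wrong, which is why pattern-based detection has nothing to match against.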

This is the symmetry the arms race narrative doesn't account for. On the terrain both tools know well (the documented vulnerability landscape that fills CVE databases and bug bounty corpora), the competition is real, the improvement trajectories are genuine, and the outcome matters for defenders. On the terrain neither tool knows well (business logic flaws, state-dependent vulnerabilities, multi-actor authorization failures, vulnerability classes that haven't been named yet), neither side is operating effectively. The attacker AI isn't probing for these classes. The defensive AI isn't flagging them when they're exploited. A human adversary who understands business logic, who approaches an application as a system of interacting assumptions rather than a collection of endpoints, is targeting attack surface that the arms race, as currently described, leaves entirely uncontested.

Both Maps Run Out at the Same Place

The danger isn't that AI security tools are bad at what they do. They're extraordinarily capable on their home terrain, and anyone arguing otherwise isn't paying attention. The danger is the conclusion a buyer can draw from the arms race framing: that investing in defensive AI in response to attacker AI represents comprehensive security improvement, that AI versus AI constitutes a full accounting of the risk.

It doesn't. It constitutes an accounting of the risk on the terrain both sides have mapped.

Korzybski's observation has run as a thread through this series: the map is not the territory. Every representation of reality is a simplification, useful precisely because it reduces complexity and limited for the same reason. The new expression of that observation isn't that AI draws worse maps than human testers; on documented terrain, AI draws faster, broader, more consistent maps than any human could. It's that the race being waged with such intensity and capability is happening over the well-mapped terrain. The terrain neither map covers isn't a contested battlefield. It's invisible to both combatants.

The question I keep returning to, for any organisation that has built or is building an AI-first security posture, isn't whether the defensive AI can keep pace with the attacker AI. On documented terrain, it probably can, and it is getting better. The question is what happens on the terrain where both maps agree to end.

 

References

1. XBOW, "We Ran 1,060 Autonomous Attacks", https://xbow.com/blog/we-ran-1060-autonomous-attacks

2. CyberScoop, "Is XBOW's success the beginning of the end of human-led bug hunting? Not yet." (Michiel Prins and Amélie Koran quotes), https://cyberscoop.com/is-xbows-success-the-beginning-of-the-end-of-human-led-bug-hunting-not-yet/

3. The Register, "Black Hat/DEF CON: AI more useful for defense than hacking" (SQL/training data attribution), https://www.theregister.com/2025/08/11/ai_security_offense_defense

4. Shannon, open-source autonomous pentester, https://github.com/KeygraphHQ/shannon

5. AppSec Santa, AI pentesting agents benchmark (87%/13% figures), https://appsecsanta.com/research/ai-pentesting-agents-2026

6. Google Project Zero, Big Sleep (memory safety finding set), https://projectzero.google/2024/10/from-naptime-to-big-sleep.html

7. CERT-EU, mean time to exploit analysis, https://www.cert.europa.eu/blog/ai-vulnerability-discovery-defenders-must-adapt