AI-powered security audits: faster discovery, same human accountability.

In May 2026, Anthropic announced that it had worked with Mozilla to apply Claude to the Firefox codebase. Over a two-week period, the model helped identify 22 vulnerabilities. The result is a useful signal for security teams considering how AI fits into vulnerability discovery: the throughput gains are real, but the accountability model does not change.

What the exercise demonstrated

Firefox is a large, mature codebase with a long history of security review. Finding 22 vulnerabilities in two weeks is a meaningful output, especially when many security teams operate under permanent backlog pressure. The exercise suggests that AI can help scale the initial stages of code review: scanning broad surfaces, flagging suspicious patterns and surfacing issues that might otherwise wait months for human attention.

What it does not demonstrate is that AI can replace security researchers. Vulnerability discovery is only the first step. Each finding still has to be validated, prioritised, reproduced, remediated and disclosed. Those steps require judgement, context and accountability that models do not possess.

Where AI helps most in security review

The most productive place for AI in security auditing is triage, not the final verdict. Codebases are larger than any team can read comprehensively. AI can narrow the search space, identify common vulnerability classes and suggest locations that deserve closer inspection. This is particularly valuable for legacy code, third-party dependencies and large refactors where risk surfaces shift.

AI is also useful for consistency. A model can apply a checklist of security patterns across thousands of files without fatigue. That does not mean every match is a vulnerability, but it does mean the obvious candidates are less likely to be missed.

Where human accountability remains essential

Security findings are consequential. A false positive can waste engineering hours. A false negative can expose users to exploitation. Reporting a vulnerability triggers a chain of decisions about severity, disclosure timing and patch priority. Those decisions affect real people and organisations, and they require accountable human judgement.

AI can accelerate input into those decisions, but it cannot own them. Firms that use AI for security review should make the human role explicit: who validates findings, who decides severity, who signs off on disclosure and who is responsible if something goes wrong.
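One way to make those roles explicit is to record them against each finding and refuse to treat the finding as ready until every role has a named person. The sketch below is illustrative only: the class name, field names and role labels are assumptions chosen to mirror the four questions above, not part of any tool mentioned in this article.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FindingSignOff:
    """Hypothetical record of who is accountable for one finding."""
    finding_id: str
    validated_by: Optional[str] = None            # who confirmed the issue is real
    severity_decided_by: Optional[str] = None     # who assigned severity
    disclosure_approved_by: Optional[str] = None  # who signed off on disclosure
    accountable_owner: Optional[str] = None       # who answers if something goes wrong

    def missing_roles(self) -> List[str]:
        """Return the role names that do not yet have a named person."""
        roles: Dict[str, Optional[str]] = {
            "validated_by": self.validated_by,
            "severity_decided_by": self.severity_decided_by,
            "disclosure_approved_by": self.disclosure_approved_by,
            "accountable_owner": self.accountable_owner,
        }
        return [name for name, person in roles.items() if not person]

    def ready_for_disclosure(self) -> bool:
        """A finding is ready only when every role is filled by a person."""
        return not self.missing_roles()


# Example: an AI-flagged finding that only has a validator so far.
record = FindingSignOff(finding_id="example-001", validated_by="alice")
print(record.ready_for_disclosure())
print(record.missing_roles())
```

The point of the design is that the model never appears in any of these fields: it can supply the finding, but each role has to be filled by a person.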

Implications for in-house security teams

Teams looking to replicate this kind of work should start with bounded scope. Pick a single codebase or vulnerability class, run an AI-assisted review alongside a human review and compare results. Measure false-positive rates, time to triage and whether the model surfaces issues the human review missed.
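The comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `Finding` fields, the `"ai"` and `"human"` source labels and the use of a shared `finding_id` to match the same issue across both reviews are all hypothetical, and a real pilot would need its own way of deduplicating findings.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List


@dataclass(frozen=True)
class Finding:
    finding_id: str        # hypothetical key used to match the same issue across reviews
    source: str            # "ai" or "human" (assumed labels)
    confirmed: bool        # True once a human reviewer has validated the issue
    triage_minutes: float  # reviewer time spent reaching that verdict


def compare_reviews(findings: List[Finding]) -> Dict[str, object]:
    """Summarise an AI-assisted review against a parallel human review."""
    ai = [f for f in findings if f.source == "ai"]
    human = [f for f in findings if f.source == "human"]

    ai_confirmed = {f.finding_id for f in ai if f.confirmed}
    human_confirmed = {f.finding_id for f in human if f.confirmed}

    return {
        # Share of AI-reported findings that a human rejected on validation.
        "ai_false_positive_rate": (
            sum(1 for f in ai if not f.confirmed) / len(ai) if ai else None
        ),
        # Median reviewer time needed to triage an AI-reported finding.
        "ai_median_triage_minutes": (
            median([f.triage_minutes for f in ai]) if ai else None
        ),
        # Confirmed issues that only one of the two reviews surfaced.
        "ai_only_confirmed": sorted(ai_confirmed - human_confirmed),
        "human_only_confirmed": sorted(human_confirmed - ai_confirmed),
    }


# Made-up sample data for illustration only.
sample = [
    Finding("parser:overflow", "ai", True, 30),
    Finding("parser:overflow", "human", True, 45),
    Finding("net:use-after-free", "ai", True, 20),
    Finding("ui:injection", "ai", False, 10),
    Finding("auth:bypass", "human", True, 60),
]
print(compare_reviews(sample))
```

Reporting the issues found by only one side in both directions matters: it shows what the model added and also what it missed that the human review caught.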

Data handling is also important. Feeding proprietary code into a third-party model raises questions about retention, training and residency. Organisations should understand the data flow before running production code through an external service, especially in regulated sectors.

The bottom line

AI-assisted security auditing is moving from experiment to operational tool. The Firefox case shows that large-scale, AI-augmented review can deliver results quickly. It also confirms that the final accountability for security remains with people. The firms that benefit will be the ones that use AI to amplify expert attention, not to replace it.
