AI Auditing and Assurance — The Discipline That's Emerging
AI auditing has moved from a theoretical concept to a real enterprise discipline through 2024 and 2025. Internal audit functions are building AI capability. External auditors are publishing methodologies. Specialised AI audit firms are emerging. The frameworks are codifying. The practice is becoming professional.
This piece is a practitioner view of where AI auditing sits in 2025 — what auditors are actually looking for, what evidence enterprises need to provide, and what the discipline is converging toward.
What AI auditing actually covers
In current usage, AI audit covers several distinct areas:
Model audit
Examining the AI models themselves:
- Provenance — where did the model come from
- Training data characterisation
- Bias and fairness testing
- Performance characterisation
- Robustness testing
- Documentation completeness
System audit
Examining the integrated AI system:
- Architecture and design
- Integration with enterprise systems
- Identity and access controls
- Audit logging adequacy
- Operational procedures
Operational audit
Examining how the system runs:
- Compliance with documented policies
- Incident handling
- Change management
- Performance monitoring
- Cost controls
Governance audit
Examining the institutional context:
- AI inventory completeness
- Risk classification
- Approval processes
- Periodic review discipline
- Workforce capability
Compliance audit
Examining specific regulatory compliance:
- EU AI Act compliance for applicable systems
- Sector-specific regulatory compliance
- Cross-jurisdictional considerations
- Data protection compliance
What auditors are actually looking for
Across the engagements where we have supported audit responses:
Documentation that reflects reality
Documentation that describes the system as it currently is. Out-of-date documentation is one of the most common findings.
Evidence trails
For each policy claim, evidence that the policy is being followed. Logs, screenshots, exports, sign-offs.
Risk classification with rationale
Each AI system classified by risk with the rationale. Auditors test whether the classification is appropriate.
Decision traces
For systems making consequential decisions, the ability to reconstruct a specific decision. Why was this case handled this way? What inputs, what outputs, what reasoning?
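A decision trace can be sketched as a small record captured at decision time. This is a minimal illustration, not a standard schema: the `DecisionTrace` fields, the `find_trace` helper, and the sample values are all hypothetical, chosen to mirror the questions above (what inputs, what outputs, what reasoning).

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json


@dataclass(frozen=True)
class DecisionTrace:
    """One consequential decision, captured when it was made (hypothetical schema)."""
    decision_id: str
    timestamp: str       # ISO 8601, UTC
    actor: str           # user or service identity that triggered the decision
    model_version: str   # exact model version that produced the output
    inputs: dict         # what the system was given
    output: str          # what the system decided or returned
    rationale: str       # recorded reasoning or explanation, if available


def find_trace(traces, decision_id):
    """Return the trace for a decision, or None if it cannot be reconstructed."""
    for trace in traces:
        if trace.decision_id == decision_id:
            return trace
    return None


# Illustrative data only.
traces = [
    DecisionTrace(
        decision_id="case-001",
        timestamp=datetime(2025, 3, 1, 9, 30, tzinfo=timezone.utc).isoformat(),
        actor="analyst@example.com",
        model_version="model-v1.2",
        inputs={"claim_amount": 1200, "region": "north"},
        output="refer_to_human",
        rationale="Amount above the auto-approval threshold.",
    )
]

trace = find_trace(traces, "case-001")
print(json.dumps(asdict(trace), indent=2))
```

The useful test is the negative one: pick a real past decision and see whether a lookup like this returns a complete record or nothing at all.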
Periodic review evidence
Evidence that the periodic reviews actually happened. Meeting notes, sign-offs, action items, follow-up.
Vendor evidence
For systems with vendor AI components, evidence of the vendor's claims and the enterprise's verification.
Workforce evidence
Evidence that the people operating the system have appropriate skills and authority.
Incident handling evidence
For past incidents, evidence of the response, root cause, and remediation.
What gets flagged
Common audit findings:
Stale documentation
The most common single finding. Systems change; documentation doesn't keep up.
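One way to catch this before the auditor does is to compare each system's last change date against its documentation date. The sketch below assumes an inventory that records both dates; the field names `last_changed` and `doc_updated` and the sample systems are hypothetical.

```python
from datetime import date


def stale_documents(systems):
    """Names of systems whose documentation predates their most recent change."""
    return sorted(
        system["name"]
        for system in systems
        if system["doc_updated"] < system["last_changed"]
    )


# Illustrative inventory entries only.
systems = [
    {"name": "claims-triage", "last_changed": date(2025, 4, 10), "doc_updated": date(2025, 1, 15)},
    {"name": "support-bot", "last_changed": date(2025, 2, 1), "doc_updated": date(2025, 2, 3)},
]

print(stale_documents(systems))
```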
Insufficient logging
The traces don't reconstruct what happened. The identity behind each action isn't recorded. Retention is too short.
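These gaps can be checked mechanically on a sample of log entries. The sketch below is illustrative: the required field list and the 365-day retention figure are assumed policy values, not drawn from any standard, and `log_gaps` is a hypothetical helper.

```python
from datetime import timedelta

REQUIRED_FIELDS = ("timestamp", "actor", "action", "model_version")  # assumed minimum
REQUIRED_RETENTION = timedelta(days=365)  # assumed policy value


def log_gaps(entries, configured_retention):
    """Return human-readable gaps found in a sample of log entries."""
    gaps = []
    for index, entry in enumerate(entries):
        # A field counts as missing if it is absent or empty.
        missing = [name for name in REQUIRED_FIELDS if not entry.get(name)]
        if missing:
            gaps.append(f"entry {index}: missing {', '.join(missing)}")
    if configured_retention < REQUIRED_RETENTION:
        gaps.append(
            f"retention {configured_retention.days}d is below "
            f"required {REQUIRED_RETENTION.days}d"
        )
    return gaps


# Illustrative sample: the second entry has no actor recorded.
sample = [
    {"timestamp": "2025-03-01T09:30:00Z", "actor": "analyst@example.com",
     "action": "approve", "model_version": "model-v1.2"},
    {"timestamp": "2025-03-01T09:31:00Z", "actor": "",
     "action": "reject", "model_version": "model-v1.2"},
]

for gap in log_gaps(sample, timedelta(days=90)):
    print(gap)
```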
Permission propagation gaps
The user's permissions don't flow through every layer. Audit shows broader access than intended.
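The gap can be expressed as a set difference: whatever the system read on the user's behalf, minus what the user is entitled to see. The sketch below assumes permissions can be modelled as sets of resource names; the function names and resources are hypothetical.

```python
def effective_access(user_permissions, layer_permissions):
    """Access the user should get: only what the user and every layer allow."""
    allowed = set(user_permissions)
    for layer in layer_permissions:
        allowed &= set(layer)
    return allowed


def propagation_gap(user_permissions, observed_access):
    """Resources reached on the user's behalf that the user is not entitled to."""
    return set(observed_access) - set(user_permissions)


user = {"hr-policies", "public-wiki"}
# Hypothetical retrieval layer running under a broader service account.
retrieval_layer = {"hr-policies", "public-wiki", "salary-data"}
# What the assistant actually read while answering this user.
observed = {"hr-policies", "salary-data"}

print(sorted(effective_access(user, [retrieval_layer])))
print(sorted(propagation_gap(user, observed)))
```

A non-empty gap means some layer acted with its own credentials rather than the user's.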
Periodic review deferrals
Reviews scheduled but not held. Action items from prior reviews not followed up.
Vendor evidence gaps
The vendor's claims aren't validated. The contract doesn't include the controls the audit assumed.
Model version drift
The system is using a different model version than documented. The version drift wasn't approved.
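A drift check compares what the documentation says against what is deployed, then asks whether each difference was approved. The sketch below assumes both views are available as simple mappings and that approvals are recorded as (system, version) pairs; all names and versions are hypothetical.

```python
def version_drift(documented, deployed, approved_changes):
    """Compare documented and deployed model versions per system.

    Returns (system, documented_version, deployed_version, approved)
    for every system where the two differ.
    """
    findings = []
    for system, deployed_version in sorted(deployed.items()):
        documented_version = documented.get(system)  # None if undocumented
        if documented_version != deployed_version:
            approved = (system, deployed_version) in approved_changes
            findings.append((system, documented_version, deployed_version, approved))
    return findings


# Illustrative data only.
documented = {"claims-triage": "model-v1.2", "support-bot": "model-v3.0"}
deployed = {
    "claims-triage": "model-v1.3",
    "support-bot": "model-v3.0",
    "doc-search": "model-v2.1",
}
approved_changes = set()  # no change approvals on record

for finding in version_drift(documented, deployed, approved_changes):
    print(finding)
```

A system that is deployed but missing from the documented set shows up with a documented version of `None`, which is also an inventory gap.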
Bias and fairness gaps
The system hasn't been tested for bias and fairness against the populations it serves.
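As a starting point only, a basic test compares favourable-outcome rates across groups. The sketch below computes per-group selection rates and the ratio of the lowest to the highest; the group labels, the data, and any threshold applied to the ratio are assumptions, and a single metric like this does not establish that a system is fair.

```python
from collections import Counter


def selection_rates(records):
    """Favourable-outcome rate per group, from (group, favourable) pairs."""
    totals = Counter()
    favourable = Counter()
    for group, outcome in records:
        totals[group] += 1
        if outcome:
            favourable[group] += 1
    return {group: favourable[group] / totals[group] for group in totals}


def rate_ratio(rates):
    """Lowest group rate divided by the highest; None if no group has favourable outcomes."""
    highest = max(rates.values())
    if highest == 0:
        return None
    return min(rates.values()) / highest


# Illustrative data: group A has 8 of 10 favourable outcomes, group B has 4 of 10.
records = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 4 + [("B", False)] * 6
)

rates = selection_rates(records)
print(rates)
print(rate_ratio(rates))
```

What counts as an acceptable ratio, and which groups to compare, are policy decisions that the audit should find documented.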
Cost control absences
No budgets, no circuit breakers, no anomaly detection. Cost exposure isn't bounded.
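A circuit breaker in this sense is a hard stop that refuses further calls once a budget is reached. The sketch below is a minimal in-memory version: the `CostCircuitBreaker` class, the per-call cost estimates, and the budget figure are hypothetical, and a production control would also need persistence and a reset window.

```python
class BudgetExceeded(Exception):
    """Raised when a call would push spend past the budget."""


class CostCircuitBreaker:
    """Tracks spend in integer cents and refuses calls that would exceed the budget."""

    def __init__(self, budget_cents):
        self.budget_cents = budget_cents
        self.spent_cents = 0

    def charge(self, estimated_cents):
        """Record a call's estimated cost, or raise if it would exceed the budget."""
        if self.spent_cents + estimated_cents > self.budget_cents:
            raise BudgetExceeded(
                f"spent {self.spent_cents} + {estimated_cents} "
                f"exceeds budget {self.budget_cents}"
            )
        self.spent_cents += estimated_cents


breaker = CostCircuitBreaker(budget_cents=1000)
blocked = False
for cost in (400, 400, 400):
    try:
        breaker.charge(cost)
    except BudgetExceeded as error:
        # The third call would take spend to 1200, so it is refused.
        blocked = True
        print(f"call blocked: {error}")
```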
The audit lifecycle
A working AI audit cycle:
Pre-audit preparation
The team prepares evidence packages in advance. The documentation is current. The traces are exportable. The team rehearses the responses.
Field work
The auditor reviews evidence, interviews team members, performs specific tests. The auditor's findings are documented as they emerge.
Findings discussion
Findings are reviewed with the team. Some are factual disagreements; some are about interpretation. The discussion shapes the final report.
Remediation planning
Confirmed findings get remediation plans with timelines. The audit closes once remediation is committed.
Follow-up
The next audit checks remediation. Persistent findings escalate; resolved findings close.
This is the same cycle as conventional audit. The substance is AI-specific.
What's settling in the methodology
Through 2024 and 2025 the methodology has been converging:
Risk-tiered approach
Different risk tiers get different audit depth. High-risk systems get extensive examination; lower-risk systems are sampled.
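Tiered depth can be made explicit as a sampling rule. In the sketch below every high-risk system is examined and lower tiers are sampled; the percentages, tier names, and system names are assumptions for illustration, not figures from any published methodology.

```python
import random

# Assumed sampling percentages per risk tier.
SAMPLE_PERCENT = {"high": 100, "medium": 25, "low": 10}


def select_for_audit(systems_by_tier, rng):
    """Pick systems to examine per tier, rounding the sample size up."""
    selected = {}
    for tier, systems in systems_by_tier.items():
        # Integer ceiling of len(systems) * percent / 100.
        count = (len(systems) * SAMPLE_PERCENT[tier] + 99) // 100
        selected[tier] = sorted(rng.sample(systems, count))
    return selected


# Illustrative inventory grouped by tier.
systems_by_tier = {
    "high": ["claims-triage", "credit-scoring"],
    "medium": ["support-bot", "doc-search", "hr-assistant", "sales-forecast"],
    "low": [f"internal-tool-{n}" for n in range(10)],
}

selection = select_for_audit(systems_by_tier, random.Random(2025))
for tier, names in selection.items():
    print(tier, names)
```

Passing in a seeded `random.Random` keeps the selection reproducible, which helps when the sample itself has to be evidenced later.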
Standardised evidence
The categories of evidence are standardising — inventory documents, risk classifications, architecture diagrams, audit traces, model cards, vendor attestations.
Tooling for audit
Specialised tools for AI audit are emerging — model documentation generators, audit trace exporters, fairness testing platforms.
Audit committee involvement
Boards and audit committees are increasingly engaged on AI risk. The cadence of AI updates to the board is rising.
External assurance services
External firms are offering AI audit services: Big Four firms, specialist boutiques, and vendor-specific auditors. The market is forming.
What's hard about AI audit
Genuine challenges:
Audit cost-benefit
Auditing complex AI systems comprehensively is expensive. Calibrating audit depth against risk is judgment work.
Auditor capability
The auditor needs both audit experience and AI understanding. Few professionals have both. The capability is being built.
Vendor opacity
Some AI vendors don't provide the documentation needed for audit. Negotiating audit access becomes part of vendor management.
Evolving systems
AI systems evolve rapidly. Periodic audits may always be slightly out of date. Continuous audit is emerging but immature.
Bias and fairness measurement
Measuring bias and fairness against meaningful populations is methodologically harder than the metrics suggest. The discipline is developing.
What we keep seeing
Patterns in enterprise AI audit engagements in 2025:
Documentation is the most common finding. Stale, incomplete, or missing documentation drives many of the findings.
Logs that don't reconstruct what happened are the second most common. Audit trails look adequate on paper but fail when someone tries to reconstruct a specific decision.
Vendor governance gaps surface. Enterprises hadn't validated vendor claims; the audit forces it.
Risk classification gets refined through audit. What seemed like a low-risk system has implications the audit surfaces.
The audit produces improvement. The discipline of preparing for and responding to audit makes the systems better.
External audit is still being figured out. Methodologies vary across firms; standards are emerging.
What we recommend
For enterprises building AI audit readiness in 2025:
- Build the AI inventory. Without it, audit is impossible.
- Document each system properly. Currency matters; out-of-date docs are findings.
- Capture audit-grade logs from day one. Retrofitting is painful.
- Run periodic internal reviews. Practice the audit cycle.
- Build vendor governance with audit access. Negotiate at procurement.
- Track audit findings and remediation. Persistent findings escalate.
- Engage external assurance where it adds value. The capability builds internal discipline.
AI audit in 2025 is a real discipline being built. The enterprises that engage with it deliberately — building inventory, documentation, logs, governance — operate with audit readiness as part of their normal operating model. The enterprises that respond to audit reactively spend the audit cycle scrambling. The discipline pays back; the discipline is the asset.