Three Years of Enterprise AI — What We Got Right and Wrong
A practitioner reflection on three years of enterprise AI work — the patterns I called correctly, the calls I got wrong, and what to take from each into 2026 and beyond.
Three years of enterprise AI work have given me enough data to reflect on what we (practitioners, advisors, the field collectively) called correctly and where we missed. The reflection is useful not for nostalgia but for what it suggests about the calls we are making now.
This is a practitioner's honest assessment of what aged well and what aged badly from the patterns I've been part of advising on since the start of 2023.
What we got right
The integration discipline mattered more than the model
The most consistent thing we got right was that the discipline around the model — identity, observability, governance, audit — would determine whether AI workloads shipped, not the model itself. The teams that focused on integration discipline shipped reliably; the teams that focused on model selection without the integration work didn't.
This held across years. The model frontier kept moving; the integration discipline stayed constant. The teams with the discipline kept shipping with each new generation of models.
Retrieval would beat fine-tuning for knowledge
A position we took early — that retrieval-augmented generation would beat fine-tuning for adding knowledge to systems — held up. Fine-tuning is the right answer for narrow high-volume tasks where behaviour shaping matters; retrieval is the right answer for knowledge tasks. The decision frameworks we used in 2023 still apply.
Human-in-the-loop would remain the production posture
The autonomous-agent enthusiasm has come and gone several times. The production reality stayed supervised. We bet on this and it has held; the autonomy aspirations remain aspirations, not production realities.
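As a rough illustration of what "supervised" tends to mean in practice, here is a minimal sketch of an approval gate in Python. The names (ProposedAction, execute_with_oversight) and the single consequential flag are assumptions made for the example, not a description of any particular deployment: low-stakes actions run directly, and consequential ones wait on a human decision.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ProposedAction:
    """An action the AI system wants to take (hypothetical structure)."""
    name: str
    consequential: bool  # e.g. moves money or changes customer-facing records


def execute_with_oversight(
    action: ProposedAction,
    approve: Callable[[ProposedAction], bool],
    run: Callable[[ProposedAction], None],
) -> str:
    """Run low-stakes actions directly; consequential ones need human approval."""
    if action.consequential and not approve(action):
        return "rejected"
    run(action)
    return "executed"


# Usage: a reviewer who declines everything still lets low-stakes work through.
executed_log: list[str] = []


def record(action: ProposedAction) -> None:
    executed_log.append(action.name)


def reviewer_declines(action: ProposedAction) -> bool:
    return False


draft_result = execute_with_oversight(
    ProposedAction("draft_reply", consequential=False), reviewer_declines, record
)
refund_result = execute_with_oversight(
    ProposedAction("issue_refund", consequential=True), reviewer_declines, record
)
```

In a real system the approve callback would be a review queue or ticketing step rather than a function returning a constant; the point of the sketch is only that the boundary between supervised and unsupervised actions is explicit in code.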
Evaluation would be the differentiator
We argued early that the teams investing in evaluation infrastructure would ship more reliably than the teams running on intuition. This has proven correct. The mature deployments are the ones with strong evaluation; the stalled ones lack it.
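At its smallest, evaluation infrastructure can be a fixed set of cases, a scoring function, and a gate that blocks a release when the score regresses. The sketch below assumes that shape. Every name in it (EvalCase, run_eval, release_gate, toy_system) is hypothetical, and substring matching stands in for whatever scoring a real deployment would use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    must_contain: str  # substring the answer has to include to pass


def run_eval(system: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose output contains the expected text."""
    if not cases:
        raise ValueError("evaluation set is empty")
    passed = sum(
        1
        for case in cases
        if case.must_contain.lower() in system(case.prompt).lower()
    )
    return passed / len(cases)


def release_gate(score: float, baseline: float, tolerance: float = 0.02) -> bool:
    """Allow a release only if the score has not dropped below baseline - tolerance."""
    return score >= baseline - tolerance


def toy_system(prompt: str) -> str:
    """Stand-in for a real model call (assumption for the example)."""
    answers = {
        "refund window": "Refunds are accepted within 30 days.",
        "support hours": "Support is available 9am to 5pm.",
    }
    for key, answer in answers.items():
        if key in prompt.lower():
            return answer
    return "I don't know."


CASES = [
    EvalCase("What is the refund window?", "30 days"),
    EvalCase("What are the support hours?", "9am"),
    EvalCase("Who is the CEO?", "Jane"),  # the toy system cannot answer this
]

score = run_eval(toy_system, CASES)          # two of three cases pass
ok_to_ship = release_gate(score, baseline=0.66)
blocked = not release_gate(score, baseline=0.90)
```

The design choice worth noting is the gate rather than the metric: the same case set is rerun on every model or prompt change, so a regression shows up as a failed check instead of a hunch.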
Open models would close the gap
Open-weight models in 2023 lagged commercial frontier models substantially. We argued the gap would close enough to make open models viable for enterprise. By 2025 this had happened. The current open ecosystem is competitive for most enterprise workloads.
Cost discipline would be necessary
We were vocal early about cost monitoring, model routing, and caching. Some teams adopted the discipline; some didn't. The ones that adopted it had better operational economics; the ones that didn't paid for the lesson.
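The three practices named above can be sketched together in a few lines. This is an illustration under stated assumptions, not a recommended implementation: CostAwareRouter, the per-call costs, and the word-count routing rule are all hypothetical, and a real router would use a better signal than prompt length.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CostAwareRouter:
    cheap_model: Callable[[str], str]
    strong_model: Callable[[str], str]
    cheap_cost: float = 0.001    # hypothetical cost per call
    strong_cost: float = 0.02    # hypothetical cost per call
    max_cheap_words: int = 50    # crude routing rule: short prompts go cheap
    spend: float = 0.0           # running total, the "monitoring" part
    _cache: dict[str, str] = field(default_factory=dict)

    def complete(self, prompt: str) -> str:
        # Caching: identical prompts are answered once and reused for free.
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self._cache:
            return self._cache[key]

        # Routing: pick the cheaper model when the prompt looks simple.
        if len(prompt.split()) <= self.max_cheap_words:
            answer = self.cheap_model(prompt)
            self.spend += self.cheap_cost
        else:
            answer = self.strong_model(prompt)
            self.spend += self.strong_cost

        self._cache[key] = answer
        return answer


# Usage with stand-in models (assumptions for the example).
router = CostAwareRouter(
    cheap_model=lambda p: "cheap answer",
    strong_model=lambda p: "strong answer",
)
short_answer = router.complete("Summarise this ticket.")
repeat_answer = router.complete("Summarise this ticket.")  # cache hit, no new spend
long_answer = router.complete("detail " * 60)              # 60 words, routed to strong
```

After these three calls the router has paid for one cheap call and one strong call; the repeated prompt cost nothing.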
Governance would crystallise around existing frameworks
We argued AI governance would be applied through existing model risk management, conventional risk frameworks, conventional audit — not as a separate discipline. This proved correct. The EU AI Act, sector regulator publications, and internal frameworks all built on existing structures.
What we got wrong
We underestimated how fast costs would drop
We were correct that costs would drop. We were wrong about the rate. The cost-per-token reductions through 2024 and 2025 exceeded what we projected. Architectures we designed in 2023 with cost optimisation as a primary concern were partly over-engineered by 2025.
We were overly cautious on agent infrastructure timelines
In late 2023 and through 2024 I argued that production agent systems were further out than the marketing suggested. That was largely right. But bounded supervisor-worker patterns moved into production faster than I expected; agent infrastructure matured in 2025 more than my 2023 projections suggested.
We underestimated MCP-style standardisation
I expected the AI integration ecosystem to remain fragmented for longer. The Model Context Protocol (MCP) standardisation effort got more traction than I projected. The ecosystem is still uneven, but the trajectory is more positive than my early bet.
We were over-optimistic about specific industry transformations
Categories like healthcare and education absorbed AI more cautiously than I expected. The marketing about industry transformation continues to outrun the actual production deployment. I should have anticipated the regulatory and operational caution more.
We underestimated voice AI
I called voice AI "always almost there" in late 2023 and through 2024. By 2025 the real-time voice capability had crossed the threshold and produced production deployments I didn't see coming. The capability moved faster than I projected.
We overestimated the role of fine-tuning
I expected enterprise fine-tuning to become more central than it did. It is real and useful for narrow tasks; it is not the differentiator I projected. Retrieval and prompting cover more workloads better than I expected.
We were over-cautious on multimodal adoption timelines
The integration of multimodal capability into mainstream enterprise workflows moved faster than I projected. By late 2025 multimodal was a default expectation in many contexts; I expected this in 2027 or beyond.
What stayed important
A few things have remained true throughout:
Workload-specific decisions
The answer to "what's the right model / architecture / pattern" remained workload-specific. Generic answers continued to be wrong. The discipline of matching technique to use case stayed central.
Foundations compound
Investments in the data layer, evaluation, observability, and governance compounded across years. Teams that built foundations in 2023 are reaping returns in 2026. Teams that chased capability without foundations keep producing impressive activity and modest delivery.
People and process matter more than tools
The technology accelerated; the change management, the skill development, the organisational capability moved at human speed. Throughout the three years, the social side has been the rate-limiting step.
Discipline beats brilliance
Teams with disciplined operations outperformed teams with brilliant individuals operating without discipline. The pattern has been remarkably consistent.
The framing matters
How enterprises framed AI work shaped what they got. "Transformation" framings produced theatre. "Productivity tool" framings produced productivity. The conceptual framing has been more decisive than I initially appreciated.
What I'm watching for 2026 and beyond
The calls I'm making now, knowing my track record on the previous ones:
The capability frontier keeps advancing
Models will keep getting better. The rate may slow; the direction is consistent. Plan for ongoing capability evolution, not for a stable platform.
Specialisation will deepen
Specialist models, specialist agents, specialist tooling for specific domains. The general capability is good enough; specialisation is where differentiation lives.
The integration with conventional enterprise systems continues to deepen
AI as a feature of every enterprise platform, not as a separate stack. Major ERP and CRM and ITSM platforms continue to integrate AI more deeply.
Governance will become more specific
Regulators will publish more specific expectations. Sovereign requirements will continue to evolve. The institutions building governance capability will navigate this more easily than the institutions that haven't.
The autonomous-agent vision continues to be ahead of production reality
I expect this to remain true through 2026 and 2027. Bounded patterns will broaden; full autonomy on consequential actions will remain rare.
Workforce evolution continues
Roles evolve; net headcount changes modestly; the skill profile of enterprise teams shifts. The change management of this is ongoing.
What I'd tell my 2023 self
If I could brief myself at the start of this wave:
- Bet harder on the integration discipline. The pattern was visible early; I could have argued it more strongly.
- Don't underestimate how fast costs will drop. Plan architectures with flexibility, not with cost as the primary constraint.
- Build the evaluation infrastructure earlier. The teams that did won; the teams that didn't lost.
- Take the autonomous-agent demos with appropriate scepticism. They've been demos for three years.
- Engage governance earlier in initiatives. The work has its own timeline.
- Take voice AI more seriously by 2024. It was closer than I projected.
What I'm telling teams now
For the work in 2026 and beyond:
- Build foundations. They compound.
- Match technique to workload. There is no universal answer.
- Maintain the disciplines — evaluation, observability, governance, cost.
- Plan for ongoing capability evolution. The frontier doesn't stop.
- Engage business sponsors as partners. The work succeeds when business and technology align.
- Take a multi-year view. AI value compounds; quarterly thinking misses the picture.
Three years in, the work has become more substantial and less dramatic. The patterns that work are clearer; the patterns that don't are also clearer. The teams I am working with now are doing better work with less hype than the teams I was working with in 2023. The maturation of the field is showing in the maturation of the work. The lessons that aged well were mostly the ones that were boring at the time. The lessons that aged badly were mostly the ones that were exciting at the time. Take from that what you will.