From AI Pilot to Production — The Playbook That Bridges the Gap
Every enterprise has AI pilots. Far fewer have AI in production. The bridge between the two is more about organisational discipline than technical capability. A practitioner playbook.
A pattern from the last eighteen months of enterprise AI engagements: every organisation has AI pilots. Far fewer have AI in actual production. The pilots produce promising results, then stall. The bridge from pilot to production is less about technical capability and more about organisational discipline. This is a playbook for that bridge.
Why pilots stall
The recurring failure modes:
- No production path identified. The pilot succeeded against a goal that was never specified as a production goal. The team has a working system but no clear next step.
- No production owner. The pilot was run by an innovation team, a strategy team, or an external partner. When the time comes to operate the system, nobody owns it.
- Governance never engaged. The pilot proceeded under a "this is experimental" exemption. Production deployment requires the security, compliance, and risk reviews that were deferred.
- Architecture not production-ready. The pilot was built fast; the system is not engineered for the reliability, observability, or scale that production requires.
- Business case underdeveloped. The pilot demonstrated capability; the case for the operating cost, the change-management cost, and the risk has not been built.
- Change management absent. Production deployment requires user adoption, workflow change, and possibly role change, yet there is no plan for any of these.
Most stalled pilots fail on two or three of these at once. Closing the gaps one at a time after the pilot costs more than designing for them from the start.
What the playbook covers
A working pilot-to-production playbook addresses six dimensions in sequence:
- Strategic intent — what is this AI workload supposed to accomplish, and how do we know if it succeeded
- Production path — what does the path from pilot to operation look like
- Architecture — what is the production-grade architecture
- Governance — what reviews and approvals are needed
- Operating model — who owns it, how is it run
- Change management — how do users actually adopt it
The playbook isn't a checklist run once; it's a working document that evolves through the engagement.
Step 1 — Strategic intent
Before any pilot, write the success criteria:
- What is the workload doing?
- What is the value when it works?
- What does success look like, quantified?
- What is the threshold below which it isn't worth proceeding?
- Who is the executive sponsor?
A pilot without these criteria is exploration, which is fine. An exploration that drifts toward production without them is how pilots stall.
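The criteria listed in this step can be written down as a small record, so that the go/no-go decision is mechanical rather than negotiated after the fact. The following is a minimal sketch, not a prescribed format: the class name, the field names, the "higher is better" metric convention, and the middle "iterate" outcome are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuccessCriteria:
    """Hypothetical record of the Step 1 answers for one workload."""

    workload: str             # what the workload is doing
    value_statement: str      # the value when it works
    metric_name: str          # the metric that quantifies success
    target: float             # what success looks like, quantified
    proceed_threshold: float  # below this, not worth proceeding
    executive_sponsor: str    # the named sponsor

    def __post_init__(self) -> None:
        if not self.executive_sponsor.strip():
            raise ValueError("an executive sponsor must be named")
        if self.proceed_threshold > self.target:
            raise ValueError("proceed_threshold cannot exceed target")

    def decision(self, measured: float) -> str:
        """Assumes a higher measured value is better."""
        if measured >= self.target:
            return "proceed"
        if measured >= self.proceed_threshold:
            return "iterate"  # assumption: between threshold and target, keep refining
        return "stop"
```

Writing the threshold before the pilot starts is the point of the exercise; the code only makes it harder to move the goalposts later.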
Step 2 — Production path
At the start of the pilot, sketch the production path:
- If the pilot succeeds, what is the deployment plan?
- Who owns the production system?
- What are the rough dates?
- What is the rough operating cost?
- What is the rough change management effort?
This is not a binding plan; it is a discipline. Pilots that have a sketched production path are more likely to actually reach production.
Step 3 — Architecture
The pilot may run on a notebook or a small service. Production cannot. The architecture work that has to happen before production:
- Resilience — error handling, retries, fallbacks, graceful degradation
- Observability — traces, metrics, logs at the level needed for operations
- Cost discipline — token monitoring, budgets, circuit breakers
- Security — authentication, authorisation, secret management, encryption
- Audit — for regulated workloads, the audit trail at compliance-grade depth
- Scale — capacity, throughput, latency under realistic load
- Maintenance — how is the system updated, what is the deployment process
The pilot architecture is the starting point. The production architecture is a deliberate rebuild, often substantially different. Pilots that try to incrementally evolve into production architectures tend to ship later and operate less reliably.
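To make the resilience and cost-discipline items concrete, here is a minimal sketch of a model call wrapped in a token budget check, retries with exponential backoff, and a fallback. Everything named here (`TokenBudget`, `call_with_fallback`, the `primary` and `fallback` callables, the token estimate) is hypothetical and stands in for whatever client and cost accounting a real system uses.

```python
import time
from typing import Callable


class BudgetExceeded(RuntimeError):
    """Raised when a call would push spend past the configured limit."""


class TokenBudget:
    """Hypothetical per-period token budget acting as a simple circuit breaker."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def charge(self, tokens: int) -> None:
        if self.used + tokens > self.limit:
            raise BudgetExceeded(f"{self.used + tokens} tokens exceeds limit {self.limit}")
        self.used += tokens


def call_with_fallback(
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
    prompt: str,
    budget: TokenBudget,
    estimated_tokens: int,
    retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    # Refuse the call up front if it would exceed the budget.
    budget.charge(estimated_tokens)
    for attempt in range(retries):
        try:
            return primary(prompt)
        except Exception:
            # Sketch only: a real system would catch transient errors specifically.
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)  # exponential backoff between attempts
    # Graceful degradation once the primary path has failed every attempt.
    return fallback(prompt)
```

The `sleep` parameter is injectable so the backoff can be tested without waiting. A pilot notebook rarely has any of this, which is part of why the production architecture tends to be a rebuild.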
Step 4 — Governance
The reviews that have to happen:
- Security review — covering data flows, attack surfaces, secret handling
- Privacy review — what personal data is touched, what the lawful basis is, what the retention is
- Compliance review — for regulated industries, the specific regulatory framework's requirements
- Risk review — what can go wrong, what is the residual risk, what is the mitigation
- IT operations review — can this be run in the existing operations model
- Architectural review — does this fit the enterprise architecture
These are not box-ticking exercises. They surface real issues that may require rework. The earlier they happen — ideally in parallel with development — the cheaper the resulting changes are.
Step 5 — Operating model
Who owns the production system, and how is it run:
- Product ownership — who decides what the system should do, what changes to prioritise
- Engineering ownership — who maintains and evolves the system
- Operational ownership — who responds to incidents, handles tickets
- Cost ownership — whose budget pays for it, who is accountable for the spend
- Vendor relationships — who manages the model provider relationship
Without explicit assignment, ownership defaults to "the team that built it", which usually wasn't expecting to operate it. Resentment and underperformance follow.
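One way to keep ownership from defaulting silently is to treat the five assignments above as a required record and block go-live while any of them is empty. This is a small sketch under that assumption; the role keys and the `unassigned_owners` helper are illustrative names, not part of any established tool.

```python
# The five ownership roles from this step, as hypothetical keys.
REQUIRED_OWNERS = ("product", "engineering", "operations", "cost", "vendor")


def unassigned_owners(assignments: dict[str, str]) -> list[str]:
    """Return the roles that have no named owner, in a fixed order."""
    return [
        role
        for role in REQUIRED_OWNERS
        if not assignments.get(role, "").strip()
    ]


# Example: two roles were never assigned before deployment.
gaps = unassigned_owners({
    "product": "Head of Claims",
    "engineering": "Platform team",
    "operations": "",
})
```

Here `gaps` lists `cost` and `vendor` alongside the blank `operations` entry, which is the conversation to have before deployment rather than after.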
Step 6 — Change management
Production AI changes how people work. The change management that has to happen:
- User training — how do affected users learn to use the system
- Workflow change — what processes are different now
- Role evolution — what does the work of the affected team look like
- Performance metric change — how is success measured now
- Communication — how is the change explained to the broader organisation
A well-engineered system that nobody uses is a failure. Change management is what determines whether the system actually generates value.
The cadence
A working cadence for a pilot-to-production engagement:
Weeks 1-2 — strategic intent, production path sketched, governance partners identified
Weeks 3-8 — pilot development, with parallel architectural design for production
Weeks 9-12 — pilot evaluation against success criteria; if successful, governance reviews begin
Weeks 13-20 — production engineering, security/privacy/compliance reviews proceed, change management planned
Weeks 21-24 — production deployment, user training, monitored ramp
Weeks 25+ — sustained operation, ongoing evaluation, iteration
This is a six-month engagement for most workloads. The pilots that stall are the ones that compress the early steps and discover the missed work at week 24 instead of week 4.
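The cadence above can be encoded as data so a programme tracker can report which phase a given week belongs to. This is an illustrative sketch: the `PHASES` table and `phase_for_week` function are hypothetical names, and the focus labels are shortened from the list above.

```python
from typing import Optional

# (first week, last week or None for open-ended, focus), following the cadence above.
PHASES: list[tuple[int, Optional[int], str]] = [
    (1, 2, "strategic intent and production path"),
    (3, 8, "pilot development"),
    (9, 12, "pilot evaluation"),
    (13, 20, "production engineering and reviews"),
    (21, 24, "deployment and monitored ramp"),
    (25, None, "sustained operation"),
]


def phase_for_week(week: int) -> str:
    """Return the focus of the phase that contains the given week."""
    if week < 1:
        raise ValueError("weeks are numbered from 1")
    for first, last, focus in PHASES:
        if week >= first and (last is None or week <= last):
            return focus
    raise ValueError(f"no phase covers week {week}")
```

For example, `phase_for_week(4)` falls in pilot development, the point at which missed early work is still cheap to recover.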
What we keep seeing
Recurring patterns in pilot-to-production engagements:
The governance work consumes more time than expected. First-time AI deployments often spend more weeks in security/privacy review than in development. Plan accordingly.
The architecture rebuild is real. The pilot architecture is rarely the production architecture. Budget for the rebuild explicitly.
Operating ownership is the most-deferred decision. It is the conversation nobody wants to have until the system is about to go live. Have it earlier.
Change management is the most-neglected dimension. Engineering attention dominates; change management is treated as the easy part. It isn't.
Executive sponsorship is the difference between stalled and shipped. Workloads with a committed sponsor navigate the obstacles; workloads without one die at the first hurdle.
What we recommend
For enterprise teams managing AI pilots in 2024:
- Sketch the production path at pilot kickoff, not at pilot conclusion.
- Engage governance partners early. Surprise reviews delay everything.
- Plan the production architecture as a deliberate rebuild, not an incremental evolution.
- Assign operating ownership before deployment, not after.
- Treat change management as a primary work stream, not an afterthought.
- Confirm executive sponsorship is real, not nominal.
- Budget six months for first-time deployments. Compression produces stalled pilots.
The bridge from pilot to production is the work that turns AI capability into enterprise value. The playbook is not complicated, but it is consistent — the same dimensions in the same order, in workload after workload. The organisations that follow it ship; the organisations that improvise produce impressive pilots and few shipped systems.