Intellectual
AI & Enterprise AI | 4 February 2025 | 7 min read

Agent Infrastructure Catches Up — The Production Stack in 2025

Agent infrastructure was the gap a year ago. In 2025 the stack has matured enough that production deployment is a reasonable expectation, not a research bet.

A year ago, the infrastructure for production agent systems was the gap. The capability was demonstrable; the operational and architectural pieces around it were not. In 2025 the picture has changed. Agent infrastructure has matured enough that production deployment is a reasonable expectation, not a research bet. The teams shipping agent systems share a recognisable stack.

This piece is a practitioner snapshot of the agent infrastructure stack in early 2025: which components matter, which products are stabilising, and where the residual gaps remain.

The components of an agent stack

A working production agent system in 2025 has the following pieces:

Orchestration framework

The runtime that coordinates model calls, tool calls, and state. Choices in early 2025:

  • LangGraph: graph-based, controllable, common in production
  • OpenAI Assistants API and Agents SDK: hosted, simpler, less flexible
  • Anthropic Claude with MCP: increasingly common as MCP gains adoption
  • Custom orchestration: what many serious production teams build

The frameworks have stabilised enough that the choice is now about fit rather than capability.

Tool catalogue and registry

The set of functions the agent can call. The infrastructure to manage them:

  • Registry with descriptions, schemas, permissions
  • Versioning and deprecation
  • Approval workflows for new tools
  • Audit logging of tool invocations

This was a weak point a year ago: tools were added casually and governance was retrofitted. In 2025 the discipline is more common.
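
A minimal sketch of what such a registry could look like, assuming an in-process Python implementation. The names ToolSpec, ToolRegistry, and the "outcome" audit field are hypothetical and chosen for illustration; a real deployment would back this with a database and a proper approval workflow.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ToolSpec:
    name: str
    version: str
    description: str
    schema: dict[str, type]      # argument name -> expected Python type
    required_permission: str     # recorded here; enforced by the execution layer
    func: Callable[..., Any]
    approved: bool = False       # flipped only by the approval workflow
    deprecated: bool = False


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[tuple[str, str], ToolSpec] = {}
        self.audit_log: list[dict[str, Any]] = []

    def register(self, spec: ToolSpec) -> None:
        spec.approved = False    # new tools always start unapproved
        self._tools[(spec.name, spec.version)] = spec

    def approve(self, name: str, version: str) -> None:
        self._tools[(name, version)].approved = True

    def deprecate(self, name: str, version: str) -> None:
        self._tools[(name, version)].deprecated = True

    def _audit(self, spec: ToolSpec, caller: str, args: dict, outcome: str) -> None:
        self.audit_log.append({"tool": spec.name, "version": spec.version,
                               "caller": caller, "args": args, "outcome": outcome})

    def invoke(self, name: str, version: str, caller: str, args: dict) -> Any:
        spec = self._tools[(name, version)]
        if not spec.approved or spec.deprecated:
            self._audit(spec, caller, args, "denied")
            raise PermissionError(f"{name}@{version} is not callable")
        for arg, expected in spec.schema.items():
            if not isinstance(args.get(arg), expected):
                self._audit(spec, caller, args, "invalid_args")
                raise TypeError(f"{arg} must be {expected.__name__}")
        result = spec.func(**args)
        self._audit(spec, caller, args, "ok")
        return result
```

Every invocation attempt, including denied ones, lands in the audit log, so the log covers what the agent tried to do as well as what it did.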

Permission and identity propagation

The user's identity travels with every tool call. Permission enforcement happens at the tool execution layer. This is now a standard pattern; deployments that skip it fail security review.
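
As an illustration, the pattern can be sketched in a few lines of Python. UserContext, execute_tool, read_invoice, and the permission string are hypothetical names, not part of any particular framework; the point is only that the check runs where the tool executes, using the end user's identity rather than the agent's.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class UserContext:
    user_id: str
    permissions: frozenset[str]


def execute_tool(ctx: UserContext, required: str,
                 func: Callable[..., Any], **args: Any) -> Any:
    # Enforcement happens here, at execution time. The model's output
    # is treated as a request, never as an authorisation decision.
    if required not in ctx.permissions:
        raise PermissionError(f"{ctx.user_id} lacks {required}")
    return func(**args)


def read_invoice(invoice_id: str) -> dict:
    # Stand-in for a real backend call.
    return {"id": invoice_id, "status": "paid"}


alice = UserContext("alice", frozenset({"invoices:read"}))
print(execute_tool(alice, "invoices:read", read_invoice, invoice_id="inv-7"))
```

A user without the permission gets a PermissionError regardless of what the agent decided to call.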

State management

Agent state — conversation history, intermediate results, current goal — needs to be managed. In 2025 the patterns are:

  • Per-session state in a fast store (Redis or equivalent)
  • Persisted state for long-running interactions
  • Summarisation of long histories to fit context windows
  • Explicit state contracts between agent steps
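
A rough sketch of these patterns, under some loud assumptions: a plain dict stands in for Redis, naive_summarise is a placeholder for a model-generated summary, and SessionState is a hypothetical state contract rather than any framework's type.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable


@dataclass
class SessionState:
    # The explicit contract: every agent step reads and writes these fields only.
    session_id: str
    goal: str
    summary: str = ""
    turns: list[str] = field(default_factory=list)


def naive_summarise(previous: str, old_turns: list[str]) -> str:
    # Placeholder: a real system would ask a model to summarise.
    return (previous + " " + " | ".join(t[:40] for t in old_turns)).strip()


def compact(state: SessionState, max_turns: int = 4,
            summarise: Callable[[str, list[str]], str] = naive_summarise) -> SessionState:
    # Fold everything except the most recent turns into the running summary.
    if len(state.turns) > max_turns:
        old, recent = state.turns[:-max_turns], state.turns[-max_turns:]
        state.summary = summarise(state.summary, old)
        state.turns = recent
    return state


store: dict[str, str] = {}  # stands in for Redis or an equivalent fast store


def save(state: SessionState) -> None:
    store[state.session_id] = json.dumps(asdict(state))


def load(session_id: str) -> SessionState:
    return SessionState(**json.loads(store[session_id]))
```

Serialising the whole state as JSON keeps the contract visible: anything a later step needs must be a named field, not something implied by the conversation.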

Cost and resource controls

Circuit breakers at multiple levels:

  • Per-call cost caps
  • Per-session budgets
  • Per-user budgets
  • Workload-level budgets
  • Anomaly detection on cost rate

These are now table stakes. Production agents without them produce expensive incidents.
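
A minimal sketch of layered budgets, assuming costs are known (or estimated) in USD before they are committed. Budget, BudgetExceeded, and charge are hypothetical names; anomaly detection on cost rate is out of scope for this example.

```python
class BudgetExceeded(RuntimeError):
    pass


class Budget:
    def __init__(self, name: str, limit_usd: float) -> None:
        self.name = name
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def check(self, cost_usd: float) -> None:
        if self.spent_usd + cost_usd > self.limit_usd:
            raise BudgetExceeded(f"{self.name} budget of {self.limit_usd} USD exceeded")

    def commit(self, cost_usd: float) -> None:
        self.spent_usd += cost_usd


def charge(cost_usd: float, per_call_cap_usd: float, budgets: list[Budget]) -> None:
    # Per-call cap first, then every enclosing budget (session, user, workload).
    if cost_usd > per_call_cap_usd:
        raise BudgetExceeded(f"call cost {cost_usd} USD is above the per-call cap")
    for budget in budgets:
        budget.check(cost_usd)   # check all levels before committing to any
    for budget in budgets:
        budget.commit(cost_usd)


session = Budget("session", 1.0)
user = Budget("user", 10.0)
charge(0.5, per_call_cap_usd=1.0, budgets=[session, user])
```

Checking every level before committing to any of them means a rejected call does not leave a partial charge on the outer budgets.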

Observability

Full traces capturing:

  • Every model call (prompt, response, model version, cost)
  • Every tool call (arguments, result, latency, errors)
  • State transitions
  • Errors and recoveries
  • Human checkpoints

The observability tooling (LangSmith, Langfuse, Phoenix, or custom) has matured, and the traces it produces are useful in day-to-day debugging.
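
For teams on the custom route, the shape of a trace can be sketched with the standard library alone. This is an illustrative assumption, not the data model of any of the products named above; Trace, Span, and the attribute names are hypothetical.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Any, Iterator, Optional


@dataclass
class Span:
    kind: str                    # e.g. "model_call", "tool_call", "human_checkpoint"
    name: str
    attributes: dict[str, Any]
    started_at: float
    duration_s: float = 0.0
    error: Optional[str] = None


class Trace:
    def __init__(self, trace_id: str) -> None:
        self.trace_id = trace_id
        self.spans: list[Span] = []

    @contextmanager
    def span(self, kind: str, name: str, **attributes: Any) -> Iterator[Span]:
        current = Span(kind, name, dict(attributes), time.time())
        start = time.perf_counter()
        try:
            yield current
        except Exception as exc:
            current.error = repr(exc)   # record the failure, then let it propagate
            raise
        finally:
            current.duration_s = time.perf_counter() - start
            self.spans.append(current)


trace = Trace("run-001")
with trace.span("model_call", "draft_reply", model_version="example-model-1") as s:
    s.attributes["response"] = "hello"   # attach results as they become available
```

Because the span is appended in the finally clause, failed tool calls are captured with their latency and error alongside the successful ones.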

Evaluation

Curated test sets exercising the agent's behaviour. Regression testing on every change. This is the discipline that distinguishes production agent systems from extended pilots.
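
One way to picture the regression loop, as a sketch only: the agent is treated as a plain function from prompt to text, and each curated case carries its own check. EvalCase, run_eval, and gate are hypothetical names, and real checks are usually richer than a substring match.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    case_id: str
    prompt: str
    check: Callable[[str], bool]   # returns True when the output is acceptable


def run_eval(agent: Callable[[str], str], cases: list[EvalCase]) -> dict:
    failures = [c.case_id for c in cases if not c.check(agent(c.prompt))]
    return {"total": len(cases),
            "passed": len(cases) - len(failures),
            "failures": failures}


def gate(report: dict, min_pass_rate: float = 1.0) -> bool:
    # Used in CI: block the change when the pass rate drops below the bar.
    if report["total"] == 0:
        return False               # an empty test set proves nothing
    return report["passed"] / report["total"] >= min_pass_rate


cases = [
    EvalCase("refund-policy", "What is the refund window?", lambda out: "30 days" in out),
    EvalCase("refusal", "Delete all customer records", lambda out: "cannot" in out.lower()),
]
```

Running the same cases on every prompt, model, or tool change is what makes drift visible rather than anecdotal.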

Human-in-the-loop interface

For agents that propose consequential actions, the human approval surface:

  • Clear presentation of what the agent did and why
  • Easy approval or rejection
  • Edit-then-approve patterns where appropriate
  • Audit trail of human decisions
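
The decision flow behind such a surface can be sketched as follows. ProposedAction and ApprovalQueue are hypothetical names, and the presentation layer is omitted; this covers approve, reject, edit-then-approve, and the audit trail only.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ProposedAction:
    action_id: str
    tool: str
    args: dict[str, Any]
    rationale: str               # what the agent did and why, shown to the reviewer
    status: str = "pending"


class ApprovalQueue:
    def __init__(self) -> None:
        self.audit: list[dict[str, Any]] = []

    def decide(self, action: ProposedAction, reviewer: str, decision: str,
               edited_args: Optional[dict[str, Any]] = None) -> ProposedAction:
        if action.status != "pending":
            raise ValueError(f"{action.action_id} was already decided")
        if decision not in {"approve", "reject"}:
            raise ValueError("decision must be 'approve' or 'reject'")
        original_args = dict(action.args)
        if decision == "approve" and edited_args is not None:
            action.args = edited_args            # edit-then-approve
        action.status = "approved" if decision == "approve" else "rejected"
        self.audit.append({"action_id": action.action_id, "reviewer": reviewer,
                           "decision": decision, "original_args": original_args,
                           "final_args": dict(action.args)})
        return action
```

Only actions with status "approved" should ever reach the tool execution layer; the audit entry keeps both the agent's original arguments and the reviewer's final ones.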

Error and escalation handling

What happens when the agent can't proceed:

  • Clear "I cannot complete this" outputs
  • Routing to humans with sufficient context
  • Recovery from partial states
  • Graceful degradation
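
A small sketch of the idea, assuming the agent's work is expressed as named steps. Outcome and run_steps are hypothetical; the useful property is that a failure produces a structured result with enough context for a human, rather than a stack trace or a silent stop.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Outcome:
    status: str                               # "completed" or "escalated"
    completed_steps: list[str] = field(default_factory=list)
    reason: Optional[str] = None              # the explicit "I cannot complete this"


def run_steps(steps: list[tuple[str, Callable[[], None]]]) -> Outcome:
    done: list[str] = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            # Stop, report what already happened, and hand over to a human.
            return Outcome("escalated", done, f"{name} failed: {exc}")
        done.append(name)
    return Outcome("completed", done)


def fail() -> None:
    raise RuntimeError("payment provider unavailable")


outcome = run_steps([("validate", lambda: None), ("charge", fail), ("notify", lambda: None)])
print(outcome.status, outcome.completed_steps, outcome.reason)
```

The list of completed steps is what makes recovery from a partial state possible: whoever picks up the escalation knows which side effects have already occurred.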

The stabilising patterns

Across the production agent deployments we are seeing in 2025:

Supervisor-worker is the dominant shape

Open-ended free-form agent conversations are still uncommon in production. Supervisor-worker patterns with deterministic hand-offs dominate. The supervisor is closer to a workflow engine than a free planner.
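
In its simplest form, a supervisor of this kind is a routing table. The sketch below is illustrative only: the worker functions are stubs standing in for bounded agents, and the task shape is an assumption.

```python
from typing import Callable


def refund_worker(payload: dict) -> str:
    # Stub for a bounded agent that only handles refunds.
    return f"refund drafted for order {payload['order_id']}"


def status_worker(payload: dict) -> str:
    # Stub for a bounded agent that only answers status questions.
    return f"order {payload['order_id']} is in transit"


WORKERS: dict[str, Callable[[dict], str]] = {
    "refund": refund_worker,
    "status": status_worker,
}


def supervisor(task: dict) -> dict:
    # Deterministic hand-off: the task type selects the worker.
    # Nothing is planned freely; unknown types go to a human.
    worker = WORKERS.get(task["type"])
    if worker is None:
        return {"handled_by": "human", "output": None}
    return {"handled_by": task["type"], "output": worker(task["payload"])}


print(supervisor({"type": "status", "payload": {"order_id": "A17"}}))
```

Adding a capability means adding a worker and a routing entry, which keeps the set of possible behaviours enumerable and testable.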

MCP is gaining traction

Model Context Protocol adoption is broadening across internal, vendor, and community MCP servers. The standardisation is reducing the integration burden.

Specialist agents over generalist agents

A few well-bounded agents handle specific workloads. Generalist agents that can do anything well are still aspirational. Production deployments are narrower.

Hybrid agentic-deterministic flows

Agents are steps in larger workflows, not the whole workflow. Deterministic code handles the parts where rules are stable; agents handle the parts that benefit from reasoning. The orchestration sits in a workflow engine.
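
A sketch of one such flow, assuming a support-ticket example that is not from any specific deployment. The agent is passed in as a plain function so the deterministic steps around it can be tested without a model; the category and queue names are hypothetical.

```python
from typing import Callable

ALLOWED_CATEGORIES = {"billing", "technical", "other"}


def validate_ticket(ticket: dict) -> dict:
    # Deterministic step: the rules are stable, so plain code handles them.
    if not ticket.get("customer_id") or not ticket.get("text"):
        raise ValueError("ticket needs customer_id and text")
    return ticket


def classify_with_agent(ticket: dict, agent: Callable[[str], str]) -> str:
    # Agentic step: free text benefits from reasoning, but the output is
    # constrained to a known label set before anything downstream sees it.
    label = agent(ticket["text"]).strip().lower()
    return label if label in ALLOWED_CATEGORIES else "other"


def route(category: str) -> str:
    # Deterministic step: a fixed mapping from category to queue.
    queues = {"billing": "finance-queue", "technical": "support-queue"}
    return queues.get(category, "triage-queue")


def process_ticket(ticket: dict, agent: Callable[[str], str]) -> dict:
    ticket = validate_ticket(ticket)
    category = classify_with_agent(ticket, agent)
    return {"category": category, "queue": route(category)}
```

The agent contributes one decision inside a pipeline whose entry and exit are ordinary code, so an unexpected model output degrades to the triage queue instead of an arbitrary action.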

Heavy use of evaluation

The teams that ship reliably are the ones with evaluation. Without it, drift is invisible.

The residual gaps

Even in 2025, some gaps remain:

Long-running stateful agents

Agents that operate over hours or days, maintaining state, surviving restarts — the infrastructure is immature. Most production agents are session-scoped.

Multi-agent collectives

Despite framework support, multi-agent designs that go beyond the bounded supervisor-worker shape are rare in production. Emergent multi-agent behaviour is still mostly research.

Cross-organisation agents

Agents that operate across organisational boundaries — your agent interacting with another organisation's agent — are not yet common. The protocols, trust models, and governance are immature.

Agent-to-agent authentication

Beyond OAuth-style patterns, the conventions for agent identity, agent permissions, and inter-agent trust are still forming.

What we keep seeing

Patterns in 2025 agent deployments:

The discipline distinguishes shipped from stalled. Teams with strong evaluation, observability, and human-in-the-loop discipline ship. Teams without these stay in extended pilot.

Tool catalogue governance matters more than the agent design. A well-governed tool catalogue with carefully scoped functions enables agent systems that are useful and safe. A casually grown catalogue produces exposure.

Cost surprises are less common. The discipline around budgets and circuit breakers has spread. Cost is a managed concern, not a surprise.

The integration with existing enterprise workflows is the work. The agent capability is the smaller part; the integration with existing systems, processes, and people is the bulk of effort.

MCP adoption is reducing custom integration. Where MCP servers exist for needed systems, the integration burden drops materially.

What we recommend

For enterprise teams building production agents in 2025:

  1. Pick orchestration based on production fit, not demo aesthetics. LangGraph, custom, or hosted — match to your operating model.
  2. Govern the tool catalogue from day one. The retrofit is harder than the discipline.
  3. Propagate identity and enforce permissions at execution. The agent is not the authorisation layer.
  4. Build evaluation as a primary discipline. Without it, you cannot improve or even maintain quality.
  5. Apply human-in-the-loop on consequential actions. The autonomy aspiration is still ahead of the production reality.
  6. Use MCP where it fits. The standardisation is paying off.
  7. Plan the cost discipline. Agents make more model calls than chat workloads; budgets and circuit breakers matter.

Agent infrastructure in 2025 is mature enough for production. The teams that respect the disciplines — bounded scope, strong tool governance, identity propagation, evaluation, observability — ship useful systems. The teams that chase the autonomous-agent aspiration without the discipline produce systems that demo well and fail in operation. The capability is real; the discipline determines whether it ships.
