Intellectual
AI & Enterprise AI | 13 February 2024 | 9 min read

AI Governance and Guardrails for Production Systems

Most enterprises talk about AI governance after the first incident. The teams that do it from day one ship faster, not slower — the discipline matters as much as the model.

In every serious enterprise AI engagement we have worked, the governance conversation has happened in one of two orders. Either it happened at the start — as part of the architecture and design work — or it happened after the first production incident. The teams that did it at the start did not ship slower than the teams that did it after. They shipped with confidence, with audit trails, and with the institutional support to keep shipping.

This is a practitioner view of what enterprise AI governance actually consists of, what guardrails belong in the technical stack, and how to set the discipline up so it accelerates rather than obstructs.

What governance actually is

In an enterprise context, AI governance is the set of policies, processes, technical controls, and audit mechanisms that ensure AI workloads are:

  • Compliant with regulations applicable to the data, the use case, and the jurisdiction
  • Auditable with enough detail that decisions and outputs can be reconstructed
  • Bounded in terms of cost, scope of action, and content produced
  • Aligned with the organisation's risk appetite and stakeholder expectations
  • Reviewable by humans where decisions cross a materiality threshold
  • Improving over time as new risks surface and old ones are mitigated

Governance is not a separate layer that sits on top of an AI system. It is a property of the system, encoded in the architecture and the operating practice.

The technical guardrails

The guardrails worth investing in, in roughly the order most enterprises adopt them:

Input filtering

The simplest layer. Before the request reaches the model, it goes through filters:

  • Identity verification — who is making the request, do they have permission for this workload
  • Content classification — is this request appropriate for this AI workload (work topic, not personal; in-scope, not adversarial)
  • PII detection — is the request including sensitive personal data that should be redacted or blocked
  • Prompt injection screening — does the request contain patterns associated with prompt injection attempts

Input filtering catches many problems before they happen. It is also the first place to log for audit.
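
As a rough illustration of how these checks can be chained, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the `filter_input` function, the `FilterResult` type, the permissions mapping, and the two regex lists are hypothetical, and the patterns are far too narrow for real use. Content classification is left out because it usually needs a classifier rather than a pattern.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# Hypothetical patterns for illustration only; production coverage must be much broader.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your )?system prompt", re.IGNORECASE),
]


@dataclass
class FilterResult:
    allowed: bool
    text: str                      # text to forward to the model (redacted if needed)
    reasons: list[str] = field(default_factory=list)  # what to write to the audit log


def filter_input(user_id: str, workload: str, text: str,
                 permissions: dict[str, set[str]]) -> FilterResult:
    # 1. Identity and permission check: is this user allowed on this workload?
    if workload not in permissions.get(user_id, set()):
        return FilterResult(False, "", ["user not permitted for workload"])
    # 2. Prompt injection screening: block on known suspicious phrasing.
    for pattern in INJECTION_PATTERNS:
        if pattern.search(text):
            return FilterResult(False, "", ["possible prompt injection"])
    # 3. PII detection: redact email addresses rather than blocking the request.
    redacted, count = EMAIL_RE.subn("[REDACTED_EMAIL]", text)
    reasons = [f"redacted {count} email address(es)"] if count else []
    return FilterResult(True, redacted, reasons)
```

The `reasons` list is there so that every decision, including the ones that let a request through, leaves something to log.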

Output filtering

After the model produces a response, before it returns to the user:

  • PII screening — does the output contain personally identifiable information that should be redacted
  • Content policy compliance — does the output violate organisational policy (no legal advice, no medical advice, no financial recommendations)
  • Toxicity and bias screening — does the output contain content that is offensive, discriminatory, or otherwise problematic
  • Factuality checks — for some workloads, the output is checked against source material for hallucinations
  • Schema validation — for structured outputs, does the response conform to the expected schema

Output filtering is harder than input filtering because the space of possible outputs is larger and the criteria are more nuanced. It is also where the most expensive incidents tend to be prevented.
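
Two of the checks above, schema validation and content policy compliance, are mechanical enough to sketch. The following Python is an illustration under stated assumptions: the `validate_output` function, the two-field schema, and the blocked phrases are all hypothetical, and a phrase list is only a crude stand-in for a real policy classifier.

```python
from __future__ import annotations

import json

# Hypothetical schema and policy list, invented for this example.
REQUIRED_FIELDS = {"summary": str, "confidence": float}
BLOCKED_PHRASES = ("you should invest", "this constitutes legal advice")


def validate_output(raw: str) -> tuple[bool, list[str]]:
    """Return (ok, problems) for a model response expected to be a JSON object."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False, ["output is not valid JSON"]
    if not isinstance(data, dict):
        return False, ["output is not a JSON object"]

    problems: list[str] = []
    # Schema validation: every required field present with the expected type.
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in data:
            problems.append(f"missing field: {name}")
        elif not isinstance(data[name], expected_type):
            problems.append(f"wrong type for field: {name}")

    # Content policy check: a simple phrase match on the free-text field.
    summary = data.get("summary")
    if isinstance(summary, str):
        lowered = summary.lower()
        problems += [f"blocked phrase: {p}" for p in BLOCKED_PHRASES if p in lowered]

    return not problems, problems
```

Returning the full list of problems, rather than failing on the first one, makes the audit record more useful when an output is rejected.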

Function and tool guardrails

When the model can call functions or use tools, each function is a potential action surface:

  • Permission gating — each function is permitted only for users who are authorised
  • Argument validation — function arguments are validated against allowlists or schemas before execution
  • Idempotency — functions are idempotent where possible, so retries don't double-act
  • Audit logging — every function call recorded with the LLM input, the model's chosen call, the arguments, the result
  • Reversibility checkpointing — irreversible actions (sending, deleting, transacting) require human confirmation

The function catalogue is governed as carefully as the API catalogue. Adding a function is a change with security implications.
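
One way to picture these guardrails is a single gate that every tool call passes through. The sketch below is hypothetical throughout: `ToolSpec`, `call_tool`, the role names, and the in-memory `AUDIT_LOG` are assumptions for illustration, not a reference to any particular agent framework. It covers permission gating, argument allowlisting, confirmation for irreversible actions, and audit logging; idempotency is a property of the handlers themselves and is not shown.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ToolSpec:
    handler: Callable[..., Any]
    allowed_roles: frozenset[str]
    allowed_args: frozenset[str]
    irreversible: bool = False  # sending, deleting, transacting


AUDIT_LOG: list[dict[str, Any]] = []  # stands in for a durable audit store


def call_tool(registry: dict[str, ToolSpec], name: str, args: dict[str, Any],
              role: str, llm_input: str = "", confirmed: bool = False) -> tuple[str, Any]:
    spec = registry.get(name)
    if spec is None:
        decision = "denied: unknown tool"
    elif role not in spec.allowed_roles:
        decision = "denied: role not authorised"
    elif set(args) - spec.allowed_args:
        decision = "denied: unexpected arguments"
    elif spec.irreversible and not confirmed:
        decision = "pending: human confirmation required"
    else:
        decision = "executed"

    result = spec.handler(**args) if decision == "executed" else None
    # Every attempt is logged, including the ones that were refused.
    AUDIT_LOG.append({"llm_input": llm_input, "tool": name, "args": args,
                      "role": role, "decision": decision, "result": result})
    return decision, result
```

Because the gate owns the registry lookup, adding a function means adding a `ToolSpec`, which is a natural place to attach the review the catalogue deserves.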

Cost and rate controls

The economic version of safety. Every workload has:

  • Per-user budget — how many tokens or dollars per user per period
  • Per-workload budget — how much the workload consumes overall
  • Rate limits — to prevent both abuse and runaway loops
  • Circuit breakers — automatic shutoff if anomalous usage patterns appear
  • Cost monitoring — visibility per workload, per user, per tenant

Without these, an agent loop or an enthusiastic user can produce a six-figure bill in a weekend.
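
A minimal sketch of how a per-user budget, a per-workload budget, and a circuit breaker can fit together is shown below. The `BudgetGuard` class and its token-based limits are hypothetical, and the sketch assumes a single process and a single budget period; a real deployment would need shared state, period resets, and alerting.

```python
from collections import defaultdict


class BudgetGuard:
    """Tracks token spend for one workload over one budget period."""

    def __init__(self, per_user_limit: int, workload_limit: int) -> None:
        self.per_user_limit = per_user_limit
        self.workload_limit = workload_limit
        self.user_spend: dict[str, int] = defaultdict(int)
        self.total_spend = 0
        self.tripped = False  # circuit breaker state

    def charge(self, user_id: str, tokens: int) -> bool:
        """Return True and record the spend if the request is within budget."""
        if self.tripped:
            return False
        if self.total_spend + tokens > self.workload_limit:
            # Workload budget exhausted: trip the breaker so nothing else runs
            # until a human resets it.
            self.tripped = True
            return False
        if self.user_spend[user_id] + tokens > self.per_user_limit:
            return False  # this user is over budget; others are unaffected
        self.user_spend[user_id] += tokens
        self.total_spend += tokens
        return True
```

For example, with `BudgetGuard(per_user_limit=100, workload_limit=150)`, a user who has spent 80 tokens is refused a further 30, while other users can continue until the workload total would pass 150, at which point every subsequent request is refused.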

Model governance

Which models can be used, where:

  • Approved model registry — only approved models can be invoked from production workloads
  • Version pinning — model versions are pinned in code; upgrades go through review
  • Provider routing — workloads that touch certain data classes are restricted to certain providers (e.g. only providers with appropriate data-handling commitments)
  • Specialist model approval — fine-tuned models, customer-trained models go through additional review

Without model governance, the estate accumulates ungoverned model usage that becomes a compliance problem at the worst possible time.
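
The registry, version pinning, and provider routing rules can be enforced at a single choke point in code. The sketch below assumes a hypothetical `resolve_model` helper that every production call must go through; the provider names, model names, versions, and data classes are placeholders invented for the example.

```python
# Hypothetical registry: (provider, model, pinned version) triples approved for production.
APPROVED_MODELS = {
    ("provider-a", "model-x", "2024-01-15"),
    ("provider-b", "model-y", "1.2.0"),
}

# Hypothetical routing policy: which providers may see which data classes.
PROVIDERS_BY_DATA_CLASS = {
    "public": {"provider-a", "provider-b"},
    "confidential": {"provider-b"},
}


def resolve_model(provider: str, model: str, version: str, data_class: str) -> str:
    """Return a model identifier, or raise if the combination is not approved."""
    if (provider, model, version) not in APPROVED_MODELS:
        raise PermissionError(f"{provider}/{model}@{version} is not in the approved registry")
    if provider not in PROVIDERS_BY_DATA_CLASS.get(data_class, set()):
        raise PermissionError(f"{provider} is not approved for {data_class} data")
    return f"{provider}/{model}@{version}"
```

Because the version is part of the registry key, an unreviewed upgrade fails in the same way as an unapproved model does.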

The process discipline

Technical guardrails enable but do not replace process discipline. The processes that work:

Use-case review

A new AI use case goes through review before development starts. The review covers:

  • What is the workload doing? What decision or action results from it?
  • What is the materiality of the worst-case error? Inconvenience, regulatory exposure, financial loss, harm?
  • What data does it touch? What classification, what jurisdiction, what consent?
  • What models is it using? What's the residency story?
  • What is the human-in-the-loop posture? Where do humans verify, where do they act on outputs?
  • What is the fallback when the AI fails?
  • How will success and failure be measured?

A short review at the start prevents a year of awkward conversations later.

Risk classification

Every workload gets classified by risk profile. A common scheme:

  • Low risk — informational outputs, fully reversible, low materiality
  • Medium risk — outputs that influence decisions, partially automated, moderate materiality
  • High risk — outputs that drive decisions or actions with significant materiality
  • Critical — automated decisions in regulated areas, irreversible actions

The risk class determines the guardrails required, the review cadence, the human-in-the-loop posture, and the audit retention. This is the same kind of risk tiering most enterprises already apply to other operational systems.
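
To make the link between risk class and required controls concrete, here is a hedged sketch. The four classes follow the scheme above, but the `classify` rules are a deliberate simplification of it, and every value in `CONTROLS` (review cadence, retention periods, review posture) is a placeholder that each organisation would set for itself.

```python
from dataclasses import dataclass
from enum import Enum


class RiskClass(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


@dataclass(frozen=True)
class RequiredControls:
    human_in_the_loop: str
    review_cadence_days: int
    audit_retention_days: int


# Placeholder values for illustration; not a recommendation.
CONTROLS = {
    RiskClass.LOW: RequiredControls("spot checks", 365, 90),
    RiskClass.MEDIUM: RequiredControls("sampled review", 180, 365),
    RiskClass.HIGH: RequiredControls("review before action", 90, 730),
    RiskClass.CRITICAL: RequiredControls("mandatory approval", 30, 2555),
}


def classify(*, influences_decisions: bool, drives_actions: bool,
             regulated: bool, irreversible: bool) -> RiskClass:
    """Simplified reading of the four-tier scheme; real criteria need materiality too."""
    if drives_actions and (regulated or irreversible):
        return RiskClass.CRITICAL
    if drives_actions:
        return RiskClass.HIGH
    if influences_decisions:
        return RiskClass.MEDIUM
    return RiskClass.LOW
```

Keeping the mapping in one table means a change to a tier's requirements is a single reviewed change rather than a hunt through every workload.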

Incident response

When something goes wrong — an inappropriate output, an unintended action, a privacy incident — there is a defined response process:

  • Who is notified, in what order
  • How is the incident logged and tracked
  • How is root cause analysed
  • How are corrective actions tracked
  • How is the incident reported externally if required

A team that has practised the response handles incidents calmly. A team that hasn't tends to make the incident worse.

Periodic review

Every workload is reviewed periodically. Between reviews, new risks emerge, models are updated, the user base changes, and the legal landscape shifts. Review catches that drift before it becomes an incident.

What we keep seeing

Recurring patterns in enterprise governance maturity:

Governance committee, no controls. A monthly steering committee that issues policy memos but no technical implementation. Policy without enforcement is decoration.

Controls, no policy. Engineers build guardrails based on their personal judgment because the organisation hasn't articulated what is acceptable. The team carries a policy-making burden it shouldn't.

Big-bang policy launch. A 50-page policy document released as a fait accompli. Adoption is poor because nobody was involved in writing it. Iterate the policy with the practitioners.

Audit trails that don't reconstruct. Logs are collected but they don't actually let you reconstruct what happened. Test the audit trail by trying to reconstruct an incident; if you can't, fix the logging.

Guardrails treated as the goal. Adding more guardrails because more is safer. Each guardrail has a cost: latency, friction, false positives. The right number is the minimum that addresses the risks the workload actually has.

What we recommend

For an enterprise standing up AI governance:

  1. Start with risk classification. You cannot govern uniformly; you have to differentiate by risk class.
  2. Build technical guardrails first for input filtering, output filtering, cost control, and audit logging. These cover most incident classes.
  3. Define the use-case review process and run it for every new workload. Short, structured, decision-oriented.
  4. Set up model governance early. The estate will accumulate ungoverned usage if you don't.
  5. Practise incident response. Tabletop exercises catch gaps before real incidents do.
  6. Treat policy as living. Update it with the practitioners. A long policy that nobody reads is worse than a shorter policy that everyone follows.
  7. Audit-trail by default. Every input, every output, every action, every human decision recorded.
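
The last recommendation, together with the earlier advice to test the audit trail by reconstructing an incident, can be sketched as follows. The `AuditTrail` class, the event kinds, and the `is_reconstructable` check are hypothetical; the in-memory list stands in for whatever append-only store the organisation actually uses.

```python
from __future__ import annotations

import json
import time
from typing import Any

REQUIRED_KINDS = {"input", "output"}  # assumed minimum needed to replay a request


class AuditTrail:
    def __init__(self) -> None:
        self._lines: list[str] = []  # stand-in for an append-only log store

    def record(self, request_id: str, kind: str, payload: dict[str, Any]) -> None:
        """Append one event (input, output, action, human decision) as a JSON line."""
        event = {"request_id": request_id, "ts": time.time(),
                 "kind": kind, "payload": payload}
        self._lines.append(json.dumps(event, sort_keys=True))

    def reconstruct(self, request_id: str) -> list[dict[str, Any]]:
        """Return every recorded event for one request, in the order written."""
        events = (json.loads(line) for line in self._lines)
        return [e for e in events if e["request_id"] == request_id]


def is_reconstructable(trail: AuditTrail, request_id: str) -> bool:
    """True if the trail holds at least the input and the output for the request."""
    kinds = {e["kind"] for e in trail.reconstruct(request_id)}
    return REQUIRED_KINDS <= kinds
```

Running a check like `is_reconstructable` against a sample of real request IDs is a cheap way to find logging gaps before an incident does.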

Governance is the discipline that lets you ship AI workloads in regulated environments with confidence. The teams that build it from day one ship faster over the two-year horizon. The teams that bolt it on after the first incident spend the following quarter rebuilding what should have been there from the start.
