AI Vendor Selection and Procurement for Enterprise
AI vendors are pitching every enterprise. The procurement process for AI tools needs to evaluate things conventional software procurement doesn't — model lineage, data handling, evaluation methodology, exit strategy.
Every enterprise we work with is being pitched AI tools at a cadence none of them have seen before. AI features in existing SaaS products, AI-native products, AI platforms, AI services. Standard procurement processes weren't designed for evaluating these. Teams ask the same questions they ask of conventional software vendors and miss the things that matter for AI specifically.
This piece is a practitioner view of the questions that should be in an enterprise AI vendor evaluation, the answers that should make you cautious, and how to structure procurement so the organisation doesn't accumulate AI exposure it didn't intend.
What conventional procurement evaluates
Standard enterprise software procurement looks at:
- Functional fit
- Cost
- Security posture (SOC 2, ISO 27001)
- Data handling and privacy
- Reliability and SLAs
- Support model
- Integration capabilities
- Exit options
These are still necessary for AI vendors. They are not sufficient.
What AI procurement should add
The AI-specific questions:
Model lineage and provenance
- Which models does the product use under the hood?
- Are those models the vendor's, a foundation model provider's, or both?
- Are model versions pinned, or does the product silently upgrade?
- Can the customer be notified before model upgrades?
- For sensitive workloads, is the model option configurable?
A vendor that can't answer these clearly is using models without a clear understanding of them. A vendor that uses third-party models opaquely is exposing you to upstream changes you can't control.
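One way to make the pinning question concrete is to keep a record of the model versions agreed with the vendor and compare it with what the product reports at runtime. The sketch below assumes the vendor exposes a model identifier per feature in some form; the feature names and version strings are hypothetical placeholders, not any real vendor's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PinCheck:
    feature: str
    pinned: str      # version agreed in the contract or change log
    reported: str    # version the vendor reports at runtime

    @property
    def drifted(self) -> bool:
        return self.pinned != self.reported


def check_model_pins(pinned: dict[str, str], reported: dict[str, str]) -> list[PinCheck]:
    """Compare agreed model versions with the versions the vendor reports.

    A feature the vendor does not report on is recorded as "unknown",
    which counts as drift.
    """
    return [
        PinCheck(feature, version, reported.get(feature, "unknown"))
        for feature, version in sorted(pinned.items())
    ]


# Hypothetical feature names and version strings, for illustration only.
pinned = {"summarise": "model-a-2024-05", "classify": "model-b-v3"}
reported = {"summarise": "model-a-2024-08", "classify": "model-b-v3"}

drifted = [c.feature for c in check_model_pins(pinned, reported) if c.drifted]
print(drifted)
```

Run on a schedule, a check like this turns a silent upgrade into a visible event that can trigger re-evaluation.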
Data handling for AI inference
- What data is sent to AI models?
- Is the data sent to the vendor's models, or to upstream model providers, or both?
- Is the data retained for training, retained for operational logs, or not retained?
- For models hosted by third parties, what is the vendor's contract with those providers about data?
- Where does the data sit geographically when it is processed?
The answers determine whether the vendor's offering is acceptable for your data classification, residency requirements, and confidentiality posture.
Evaluation methodology
- How does the vendor evaluate the quality of their AI features?
- What is the accuracy on real customer workloads, not benchmarks?
- Can the customer evaluate against their own data before commitment?
- What happens to quality over time as models update?
A vendor with no evaluation methodology is shipping AI features on hope. A vendor whose evaluation relies only on internal benchmarks isn't measuring what matters for production deployment.
Human-in-the-loop posture
- Does the product take consequential actions autonomously?
- For each consequential action, is there a human checkpoint?
- Can the customer configure where human checkpoints sit?
- What is the audit trail for autonomous actions?
A product that takes actions you can't audit, can't intercept, and didn't authorise is exposure. The vendor's defaults should match your risk appetite.
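The checkpoint and audit questions can be illustrated with a small gate that sits in front of every action. This is a minimal sketch under stated assumptions: `ActionGate`, the `approve` hook, and the action names are hypothetical, and a real product would persist the audit log rather than hold it in memory.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ActionGate:
    needs_approval: set[str]                  # action types that require a human
    approve: Callable[[str, dict], bool]      # hypothetical human-approval hook
    audit_log: list[dict] = field(default_factory=list)

    def run(self, action: str, payload: dict, execute: Callable[[dict], None]) -> bool:
        """Record the decision in the audit log, then execute only if allowed."""
        human_required = action in self.needs_approval
        allowed = self.approve(action, payload) if human_required else True
        self.audit_log.append(
            {"action": action, "human_required": human_required, "allowed": allowed}
        )
        if allowed:
            execute(payload)
        return allowed


# Illustration: refunds need approval, and the stand-in approver rejects large ones.
executed: list[dict] = []
gate = ActionGate(
    needs_approval={"refund"},
    approve=lambda action, payload: payload["amount"] <= 100,
)
gate.run("draft_reply", {"text": "hi"}, executed.append)
gate.run("refund", {"amount": 500}, executed.append)
print(executed)
print(gate.audit_log)
```

The point of the design is that the set of gated actions is configuration the customer controls, and that blocked actions are logged as well as executed ones.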
Cost predictability
- How does the cost scale with usage?
- Are there caps, or does cost grow unbounded with usage?
- For high-volume workloads, is there a path to predictable pricing?
- What controls does the customer have over cost (per-user limits, workload budgets)?
AI costs can surprise. Procurement should require visibility and control, not just per-call pricing.
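As a rough illustration of what "visibility and control" can mean in practice, the sketch below tracks spend against a monthly cap and raises an alert when a threshold is crossed. The class name, the 80% alert threshold, and the workload label are assumptions made for the example; a vendor's actual controls will look different.

```python
from dataclasses import dataclass, field


@dataclass
class BudgetGuard:
    monthly_cap: float            # hard stop, in your billing currency
    alert_fraction: float = 0.8   # alert once this share of the cap is spent
    spent: float = 0.0
    alerts: list[str] = field(default_factory=list)

    def record(self, workload: str, cost: float) -> bool:
        """Record a call's cost. Returns False if it would exceed the cap."""
        if self.spent + cost > self.monthly_cap:
            self.alerts.append(f"BLOCKED {workload}: cap of {self.monthly_cap:.2f} would be exceeded")
            return False
        before = self.spent
        self.spent += cost
        threshold = self.monthly_cap * self.alert_fraction
        # Alert only on the call that crosses the threshold, not on every later call.
        if before < threshold <= self.spent:
            self.alerts.append(f"ALERT {workload}: {self.spent:.2f} of {self.monthly_cap:.2f} spent")
        return True


guard = BudgetGuard(monthly_cap=100.0)
results = [guard.record("search", 30.0) for _ in range(4)]
print(results)
print(guard.alerts)
```

If the vendor cannot offer the equivalent of a cap and an alert, the customer ends up building something like this on its side of the integration.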
Customer data for training
- Is customer data used to train the vendor's models?
- Is customer data used to train third-party models the vendor uses?
- What is the opt-out posture?
- Are the contractual commitments unambiguous?
Default-on training is a real risk. The contract has to be specific.
Exit options
- If you stop using the product, what happens to the data the vendor has accumulated?
- Are the prompts, fine-tuned models, embeddings, or other artifacts portable?
- What is the data return process?
- Can you migrate to alternatives without losing capability?
AI vendor lock-in is more subtle than conventional SaaS lock-in. The artifacts that accumulate (prompts, evaluations, integrations) can make switching expensive even when the data is portable.
Adversarial robustness
- How does the product handle prompt injection?
- What input filtering and output filtering does the product apply?
- What is the disclosure process if a security issue is identified?
- Has the product been red-teamed?
A vendor that hasn't thought about this is selling a product that will produce incidents.
Liability and indemnity
- What is the vendor's liability if the AI produces incorrect output?
- What is the indemnity for IP claims related to AI-generated content?
- Are there exclusions specific to AI behaviour?
Standard software contracts often exclude AI-related liabilities. Procurement should address this explicitly.
Patterns of vendor risk
A few patterns that should prompt caution:
"AI-powered" without details
A vendor's product is described as AI-powered with no specifics about how. Pressed for details, the vendor is evasive or the answers are inconsistent. The AI may be marketing varnish on a conventional product, or it may be capability the vendor doesn't actually understand.
Demos that don't match production behaviour
The demo is impressive; the trial shows different results. The vendor may have tuned the demo on a curated dataset that doesn't reflect customer data. Always trial on real data.
No model version control
The vendor's product silently updates models without notification. Behaviour changes after deployment. The customer has no way to lock in a specific model version. This is a real problem for production stability and audit posture.
Training data ambiguity
The vendor's documentation is ambiguous about whether customer data is used for training. Asked directly, the answers are non-committal. This is the path to data ending up in training corpora you didn't consent to.
Vendor's own AI maturity unclear
The vendor is using AI features they bought from elsewhere. They don't have AI engineering depth themselves. When something goes wrong, they don't have the expertise to diagnose and fix.
Pricing that doesn't predict at scale
The trial runs on a small workload, and extrapolating its pricing suggests a reasonable production cost. Actual production costs come in much higher because hidden multipliers (extra calls per user request, retries on internal failures) inflate the cost of each user request.
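The effect of hidden multipliers is simple arithmetic, and it is worth running before signing. The sketch below is illustrative only: the price, call count, retry rate, and request volume are made-up numbers, and the formula assumes each retry is billed like a normal call.

```python
def cost_per_request(price_per_call: float, calls_per_request: float, retry_rate: float) -> float:
    """Effective cost of one user request.

    retry_rate is the average number of extra billed attempts per call,
    so 0.25 means one call in four is retried once.
    """
    return price_per_call * calls_per_request * (1.0 + retry_rate)


def monthly_cost(requests_per_month: int, price_per_call: float,
                 calls_per_request: float, retry_rate: float) -> float:
    return requests_per_month * cost_per_request(price_per_call, calls_per_request, retry_rate)


# Made-up figures: the list-price view assumes one call per request and no retries.
naive = monthly_cost(100_000, price_per_call=0.01, calls_per_request=1, retry_rate=0.0)
# Observed behaviour: four model calls per user request, a quarter of them retried.
actual = monthly_cost(100_000, price_per_call=0.01, calls_per_request=4, retry_rate=0.25)

print(f"naive:  {naive:,.2f}")
print(f"actual: {actual:,.2f}")
print(f"multiplier: {actual / naive:.1f}x")
```

With these assumed figures the multiplier is 4 × 1.25 = 5, so the production bill is about five times the extrapolated one. Asking the vendor for calls per request and the retry rate lets the same calculation be done with real numbers.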
Procurement structure that works
A working pattern for enterprise AI procurement:
Trial period with evaluation criteria
The procurement includes a trial period — typically 30-90 days — with explicit evaluation criteria. The criteria are written before the trial; the trial succeeds or fails based on them, not on impressions.
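Writing the criteria down before the trial can be as simple as a list of named thresholds that the trial results are checked against afterwards. The sketch below assumes three hypothetical criteria and invented measurements; the names and thresholds are placeholders for whatever the organisation decides matters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    threshold: float
    higher_is_better: bool = True

    def passes(self, measured: float) -> bool:
        if self.higher_is_better:
            return measured >= self.threshold
        return measured <= self.threshold


def evaluate_trial(criteria: list[Criterion], measured: dict[str, float]) -> dict[str, bool]:
    """Pass/fail per criterion. A criterion with no measurement fails."""
    return {
        c.name: (c.name in measured and c.passes(measured[c.name]))
        for c in criteria
    }


# Hypothetical criteria, agreed before the trial starts.
criteria = [
    Criterion("accuracy_on_own_data", 0.90),
    Criterion("p95_latency_seconds", 2.0, higher_is_better=False),
    Criterion("cost_per_request", 0.05, higher_is_better=False),
]
# Invented trial results; cost was never measured, so that criterion fails.
measured = {"accuracy_on_own_data": 0.93, "p95_latency_seconds": 2.4}

results = evaluate_trial(criteria, measured)
trial_passed = all(results.values())
print(results)
print(trial_passed)
```

Treating an unmeasured criterion as a failure is a deliberate choice here: it keeps a trial from passing on impressions when the evidence was never collected.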
Production-data evaluation
The trial uses real production data (with appropriate privacy controls) for a representative period. Synthetic or curated demo data doesn't predict production behaviour.
Contract clauses specific to AI
Contracts include:
- Model version control or notification
- Data-for-training opt-out (or an explicit commitment that customer data is never used for training)
- Cost caps or alerting thresholds
- AI-specific liability handling
- Adversarial-robustness commitments
- Audit access for AI behaviour
Joint security review
Security partners review the AI behaviour specifically — prompt injection posture, output filtering, audit trails — alongside the conventional security review.
Exit planning at procurement, not at exit
The procurement includes a sketch of how to leave. Data export formats, artifact portability, migration considerations. The vendor's answer informs the long-term commitment.
Review cadence
The relationship is reviewed periodically — not just at renewal. Model upgrades, behaviour drift, cost trajectory all change the value calculation. Treat AI vendor relationships as ongoing, not set-and-forget.
What we keep seeing
Recurring patterns in enterprise AI procurement engagements:
The questions about AI specifics aren't asked. Standard procurement runs; the AI gets the same treatment as conventional software; the AI-specific risks aren't surfaced.
Trial periods are too short. A two-week trial doesn't reveal the patterns that matter — drift over time, cost at scale, behaviour on edge cases.
Contract negotiation misses the data clauses. The legal team focuses on conventional clauses; the data-for-training and model-version clauses don't make it into the contract.
The internal expertise gap is real. Procurement teams aren't AI engineers; the technical evaluation needs AI engineering involvement.
Vendor lock-in accumulates silently. The artifacts that accumulate (prompts, evaluations, custom integrations) make switching expensive even when the data is portable. The lock-in is visible at year three, not at year one.
What we recommend
For enterprise teams procuring AI vendors in 2024:
- Add AI-specific questions to the standard evaluation. Generic procurement misses them.
- Trial on production data with written evaluation criteria.
- Negotiate AI-specific contract clauses. Model version, training data, cost control, audit access.
- Involve AI engineering in technical evaluation. Procurement alone isn't enough.
- Plan exit at procurement, not at exit. Knowing how to leave informs how to commit.
- Review relationships periodically as model behaviour evolves.
- Document the answers you received. The evaluation is the institutional record.
AI vendor procurement is a discipline the enterprise needs to build deliberately. The vendors are moving fast; the procurement process needs to match. The organisations that build the discipline will capture the value while managing the risk. The organisations that treat AI procurement as conventional software procurement will accumulate exposure they didn't intend, and notice only when something goes wrong.