AI & Enterprise AI · 30 January 2024 · 8 min read

Vector Databases for Enterprise Search

Vector databases are the easy part to demo and the hard part to run at enterprise scale. A practitioner view of the choices that actually matter when picking and operating one in a regulated estate.

By Intellectual AI Engineering Practice · Collective byline

A vector database is the easiest piece of a RAG system to demo and one of the harder pieces to operate. The demo runs in a notebook against a few thousand documents; the production system holds tens of millions of chunks, serves real query load, integrates with enterprise identity, and has to survive operational realities the demo never touches.

This piece is a practitioner view of vector database selection and operation in enterprise environments — what the real choices are, where the trade-offs sit, and what gets underestimated.

What a vector database actually is

A vector database stores numerical embeddings of content and answers approximate nearest-neighbour queries — given an input vector, return the most similar stored vectors. The implementation underneath uses indexing structures like HNSW (Hierarchical Navigable Small World) or IVF (Inverted File) that trade exactness for speed.
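
To make the query concrete, here is a minimal sketch of the exact, exhaustive version of nearest-neighbour search: score every stored vector against the input and keep the best k. This is the scan that HNSW and IVF indexes approximate in order to avoid touching every vector. The function names, the toy three-dimensional vectors, and the choice of cosine similarity are illustrative assumptions, not any product's API.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_top_k(query, stored, k):
    """Score every stored vector against the query and return the k best ids.

    This is an O(n) scan per query; approximate indexes exist to avoid it.
    """
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in stored.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Toy 3-dimensional "embeddings" (hypothetical data); real embeddings
# have hundreds or thousands of dimensions.
stored = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
nearest = exact_top_k([1.0, 0.05, 0.0], stored, k=2)
print(nearest)
```

The exhaustive scan is fine for a few thousand vectors, which is part of why the notebook demo feels easy. At tens of millions of chunks the per-query cost is what forces the move to an approximate index.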

For enterprise use, the database also has to handle:

Metadata filtering — find similar vectors that also satisfy filter conditions (date range, document type, access tags)
Hybrid search — combine vector similarity with lexical match (BM25 or equivalent)
Multi-tenancy — separate logical namespaces with their own data and access controls
CRUD operations — vectors aren't only read; they're inserted, updated, deleted as documents change
Backup and recovery — operational basics that the demos quietly skip

Most products in the space handle some subset of these well. Few handle all of them well.

The landscape

The market has consolidated into a few categories of provider:

Pure-play hosted vector databases — Pinecone, Zilliz Cloud (managed Milvus), Weaviate Cloud. Fastest to start, simplest to operate. You pay for the convenience.

Pure-play self-hosted vector databases — Milvus, Weaviate, Qdrant. Operational responsibility moves to you; total cost can be lower at scale.

Vector extensions to existing databases — pgvector for Postgres, MongoDB Atlas Vector Search, Elastic vector search, Redis vector search. The advantage is staying inside your existing operational footprint and access-control model. The disadvantage is that vector workload tuning isn't the database's primary concern.

Cloud-native vector services — Azure AI Search vector, AWS OpenSearch vector, GCP Vertex Vector Search. Aligned with your cloud's identity, networking, and operational tooling. Quality and cost vary.

The right choice depends on operational profile more than on benchmark numbers.

Selection criteria

The criteria that matter in real selection — roughly in priority order for most enterprise contexts:

Operational fit

Does this run inside our compliance boundary? Some hosted services are not viable for data residency reasons.
Does this use our existing identity model? A new auth surface is operational overhead.
Does this integrate with our observability stack? Vector search latency, recall, error rate need to be visible alongside everything else.
What is the backup and recovery story? Can we restore the index to a point in time?

A vector database that benchmarks well but fails on these tends to be replaced within a year of production.

Metadata filtering quality

Most enterprise queries are filtered queries — "find me documents about X published in the last twelve months from this business unit." The filter has to be efficient.

Some implementations apply filters post-retrieval, which means a filtered query that needs ten results may have to retrieve thousands of vectors first. Others integrate filters into the index, which is much more efficient but requires the index to know about the metadata.

Test with realistic filters on realistic data volumes. Performance differences here can be order-of-magnitude.
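
The difference between the two filtering strategies can be sketched in a few lines. This is a toy model under stated assumptions: both functions rank by brute force rather than using a real index, and the record layout, the `unit` field, and the `fetch` parameter are hypothetical names chosen for illustration.

```python
def dot(a, b):
    """Dot product, used here as the similarity score."""
    return sum(x * y for x, y in zip(a, b))


def post_filter_search(query, records, predicate, k, fetch):
    """Rank everything, keep the top `fetch`, then drop records that fail the filter."""
    ranked = sorted(records, key=lambda r: dot(query, r["vector"]), reverse=True)
    candidates = ranked[:fetch]
    return [r["id"] for r in candidates if predicate(r)][:k]


def pre_filter_search(query, records, predicate, k):
    """Restrict to records that pass the filter first, then rank only those."""
    allowed = [r for r in records if predicate(r)]
    ranked = sorted(allowed, key=lambda r: dot(query, r["vector"]), reverse=True)
    return [r["id"] for r in ranked[:k]]


def is_legal(record):
    return record["unit"] == "legal"


# 100 hypothetical records; only every 25th belongs to the "legal" unit.
records = [
    {"id": i, "vector": [1.0 - i / 100, i / 100], "unit": "legal" if i % 25 == 0 else "other"}
    for i in range(100)
]
query = [1.0, 0.0]

post = post_filter_search(query, records, is_legal, k=3, fetch=10)
pre = pre_filter_search(query, records, is_legal, k=3)
print(post)  # fewer than k results: the filter discarded most of the fetched candidates
print(pre)   # a full k results, all satisfying the filter
```

With a selective filter, the post-filter path returns a short result list unless `fetch` is raised far above k, which is the "retrieve thousands to return ten" cost described above. The more selective the filter, the larger the over-fetch has to be.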

Hybrid search

In the work we have done, pure vector search has consistently lost to hybrid search on enterprise corpora. The implementations differ:

Some vector databases offer hybrid search natively, blending vector and BM25 scoring.
Some require you to run a separate lexical index and merge results yourself.
Some have hybrid in the product roadmap but not yet in production.

If hybrid search is a near-term requirement, narrow the selection to products that have it shipped and stable.
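
If you do end up merging results yourself, one widely used technique is reciprocal rank fusion (RRF), which combines ranked lists using only rank positions and so sidesteps the problem that vector and BM25 scores are on different scales. The sketch below is one possible approach, not a description of how any particular product blends scores; the document ids are made up, and the constant `k=60` is a conventional default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked id lists into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Documents ranked well in several lists rise.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical result lists from a vector index and a separate lexical index.
vector_hits = ["policy-7", "memo-2", "faq-9"]
lexical_hits = ["memo-2", "contract-4", "policy-7"]

fused = reciprocal_rank_fusion([vector_hits, lexical_hits])
print(fused)
```

Documents that appear in both lists ("memo-2", "policy-7") outrank those found by only one retriever, which is usually the behaviour you want from a hybrid merge.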

Scale and cost profile

The cost profile of vector databases is shaped by:

Index size — embeddings are storage. A 1536-dimensional embedding stored as 32-bit floats is 6KB per vector. Ten million vectors is 60GB before any index overhead.
Memory residency — most products keep the working index in memory for query latency. RAM cost dominates.
Query throughput — qps requirements drive provisioning.
Update load — high-update workloads are operationally harder than mostly-read workloads. Some products handle this better than others.

Run a realistic scale test before committing. The convenient pricing of small-scale tiers often becomes uncomfortable at production volumes.
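
The index-size arithmetic above is worth keeping as a small calculator so it can be rerun as the corpus or the embedding dimension changes. This sketch covers raw vector storage only; index structures, metadata, and replicas add to it, and the function name is a hypothetical one.

```python
def raw_vector_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Raw storage for the vectors alone, before any index overhead.

    bytes_per_value=4 corresponds to 32-bit floats.
    """
    return num_vectors * dimensions * bytes_per_value


per_vector = raw_vector_bytes(1, 1536)            # bytes for one 1536-dimensional vector
ten_million = raw_vector_bytes(10_000_000, 1536)  # bytes for ten million of them

print(f"{per_vector} bytes per vector (~{per_vector / 1024:.0f} KiB)")
print(f"{ten_million / 1e9:.2f} GB for ten million vectors")
```

Because most products keep the working index in memory, this number is a floor on RAM as much as on disk.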

Embedding model independence

A vector database stores opaque vectors. Strictly speaking, it doesn't care what embedding model produced them. In practice, some products bundle embedding model choice into the SDK, which can lock you in. Prefer products where the embedding model is your choice and you can change it without leaving the database.

Operational maturity

Look for:

Replication and high availability
Backup and restore
Point-in-time recovery (rare but valuable)
Rolling upgrades without downtime
Multi-region for disaster recovery
Audit logs of administrative actions

These are unglamorous but determine whether the system runs reliably for years.

What we keep seeing

Recurring patterns in enterprise vector database deployments:

Pinecone as the default starter. It is the fastest path from zero to a working RAG demo. Some teams then migrate as scale grows; others stay. The decision depends mostly on operational and compliance profile.

pgvector for teams that already operate Postgres. Staying inside the existing database means inheriting the existing operational maturity, backup story, access control, and observability. The performance ceiling is lower than dedicated products, but for many enterprise workloads it is more than sufficient.

Azure AI Search and OpenSearch for cloud-native shops. When the rest of the platform is on Azure or AWS, the vector search service that comes with the cloud is usually the right starting point.

Re-embedding pain. Switching embedding models means re-embedding the entire corpus. Teams that didn't think about this early end up locked in. Build the re-embedding pipeline as part of initial deployment, even if you don't use it immediately.
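
One way to keep re-embedding tractable is to record which model produced each stored vector, so that a migration can be planned and resumed in batches. The sketch below assumes a hypothetical `embedding_model` metadata field and made-up model names; the actual embedding call and the write back to the database are deliberately left out.

```python
def stale_ids(records, target_model):
    """Ids whose stored vector was produced by a model other than the target."""
    return [r["id"] for r in records if r["embedding_model"] != target_model]


def batches(items, size):
    """Yield successive fixed-size slices so the job can run and resume in chunks."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical metadata as it might be read back from the vector database.
records = [
    {"id": "c1", "embedding_model": "model-v1"},
    {"id": "c2", "embedding_model": "model-v2"},
    {"id": "c3", "embedding_model": "model-v1"},
    {"id": "c4", "embedding_model": "model-v1"},
]

todo = stale_ids(records, "model-v2")
work = list(batches(todo, 2))
print(work)
```

Tracking the model per vector also makes it possible to see how far a migration has progressed, and to avoid comparing query vectors from one model against stored vectors from another.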

Filter performance surprises. A product can handle unfiltered top-k well and still degrade badly on filtered queries. The filter is the common case in enterprise search; benchmark on filtered queries.

Index sprawl. One project starts one index. The next project starts another. Six months later you have a dozen indices that should have been one. Plan for consolidation from the start.

Operational practice

Once a vector database is in production:

Monitor recall, not just latency. Latency is easy to monitor; recall — whether the right documents are coming back — needs an evaluation harness. Build it.
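
The core of such a harness can be small. A minimal sketch, assuming you maintain a labelled set of queries with known-relevant document ids: compute recall@k per query and average it. The query and document ids here are invented, and a real harness would also need to version the labelled set and track the score over time.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the known-relevant ids that appear in the top k results."""
    if not relevant:
        raise ValueError("relevant set must not be empty")
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)


def mean_recall_at_k(results_by_query, labels_by_query, k):
    """Average recall@k over every labelled query."""
    scores = [
        recall_at_k(results_by_query[q], labels_by_query[q], k)
        for q in labels_by_query
    ]
    return sum(scores) / len(scores)


# Hypothetical labelled queries and the ranked results the system returned.
labels = {"q1": {"d1", "d2"}, "q2": {"d9"}}
results = {"q1": ["d1", "d5", "d2", "d7"], "q2": ["d3", "d4", "d9"]}

score = mean_recall_at_k(results, labels, k=2)
print(score)
```

Running the same labelled queries after every index change, reindex, or embedding upgrade turns recall into a number that can sit on a dashboard next to latency.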

Watch index drift. As the corpus grows and changes, the index needs maintenance. Some products auto-maintain; some require explicit reindex operations. Schedule and monitor.

Plan re-embeddings. Every major embedding model upgrade triggers a re-embedding. This is an expensive batch operation that needs planning, not a surprise.

Capacity-plan for growth. Enterprise corpora grow. Plan the next two years of growth, not the next quarter.

Backup before you need it. A vector index lost to corruption or accident takes weeks to rebuild from scratch. Treat backups as an actual operational requirement.

What we recommend

For an enterprise selecting a vector database in 2024:

Start from operational fit, not benchmark numbers. The product that fits your compliance, identity, and operational model is the right starting point.
If you already run Postgres, try pgvector first. It is good enough for many workloads and avoids adding a new operational surface.
If you need maximum convenience and your data is allowed in the cloud, Pinecone or a cloud-native vector service is the fastest path.
If you need self-hosted, Milvus or Weaviate are the most mature options.
Benchmark on realistic filtered queries with realistic data volumes. Don't trust marketing benchmarks.
Plan for re-embedding from day one. Build the pipeline; you will need it.
Build evaluation alongside the database. Recall measurement is your early-warning system.

The vector database is part of the substrate of the AI system. It is not the differentiator, but a bad choice here causes friction for years. Make it deliberately.
