Knowledge Graphs and RAG — Two Patterns That Belong Together
Pure vector retrieval has a ceiling on enterprise knowledge. Combining it with a structured knowledge graph layer breaks past that ceiling for many real workloads.
Pure vector retrieval has a ceiling on enterprise knowledge tasks. The system retrieves chunks whose embeddings are similar to the query embedding. For questions that look like the indexed content, this works well. For questions that require reasoning over relationships, aggregating across entities, or following indirect connections, it falls short.
A pattern emerging through the first half of 2024: combine retrieval over a vector index with retrieval over a knowledge graph. The graph captures the relationships; the vector index captures the textual content. Queries that need both are answered better by the combination than by either source alone.
This is a practitioner view of the integration pattern, where it earns its keep, and what the implementation cost actually is.
What a knowledge graph adds
A knowledge graph represents the entities in your domain and the relationships between them as nodes and edges. Entities have properties; relationships have types and attributes.
For an enterprise that has been doing master data management or has a CMDB, the graph already exists in some form. For an enterprise that has not, the graph has to be built — which is non-trivial work.
The value the graph adds to RAG:
- Entity resolution. "Customer X" mentioned in a document maps to a specific entity in the graph. The graph knows that "Customer X", "Cust X Ltd", and "X Limited" are the same entity.
- Relationship traversal. "Which contracts does Customer X have with subsidiaries of Vendor Y?" requires following relationships across the graph. Vector retrieval cannot do this.
- Aggregation. "How many open incidents does the platform team own across all services?" requires aggregating over typed relationships. The graph supports it; the vector index does not.
- Authoritative grounding. The graph is the source of truth for relationships. Outputs grounded in the graph are defensible in a way that pure text retrieval is not.
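A minimal sketch of the first three capabilities, using an in-memory edge list rather than a real graph database. The alias table, the entity ids, and the relationship names (`OWNS`, `HAS_OPEN_INCIDENT`) are illustrative assumptions, not a schema this article prescribes.

```python
from typing import Optional

# Hypothetical alias table: every surface form maps to one canonical entity id.
ALIASES = {
    "customer x": "cust:x",
    "cust x ltd": "cust:x",
    "x limited": "cust:x",
}


def resolve(mention: str) -> Optional[str]:
    """Entity resolution: map a mention in text to a canonical entity id."""
    return ALIASES.get(mention.strip().lower())


# Typed edges stored as (source, relationship type, target). Illustrative data.
EDGES = [
    ("team:platform", "OWNS", "svc:billing"),
    ("team:platform", "OWNS", "svc:auth"),
    ("svc:billing", "HAS_OPEN_INCIDENT", "inc:101"),
    ("svc:billing", "HAS_OPEN_INCIDENT", "inc:102"),
    ("svc:auth", "HAS_OPEN_INCIDENT", "inc:200"),
]


def neighbours(node: str, rel: str) -> list:
    """Relationship traversal: follow one typed edge out of a node."""
    return [dst for src, r, dst in EDGES if src == node and r == rel]


def open_incidents_owned_by(team: str) -> int:
    """Aggregation: count open incidents across every service the team owns."""
    return sum(
        len(neighbours(service, "HAS_OPEN_INCIDENT"))
        for service in neighbours(team, "OWNS")
    )
```

The count comes from walking typed edges, which is why it is exact; a vector index can only return chunks that happen to mention incidents.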
The combined architecture
A working pattern:
- Query analysis. The incoming query is analysed for intent and entities. Named entity recognition surfaces the entities; intent classification surfaces what kind of answer is needed.
- Graph query. For queries that touch relationships or aggregations, a graph query (Cypher, SPARQL, or a custom DSL) is constructed and executed. The result is a structured subgraph or aggregate.
- Vector retrieval. For queries that need textual content — explanations, descriptions, details — vector retrieval surfaces relevant chunks.
- Combined context. The graph result and the vector retrievals are assembled into the prompt context.
- Generation. The LLM produces an answer grounded in both sources.
- Citation. The answer cites both the graph (for structured claims) and the source documents (for textual claims).
Each step is a component to build and test in its own right, and the orchestration that connects them is non-trivial.
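The steps above can be sketched as a small orchestration function. This is a sketch under stated assumptions: the entity lexicon and cue words stand in for real NER and intent classification, and `graph_lookup` and `vector_search` are hypothetical callables that a real system would back with a graph database and a vector index. Generation is left out; the function stops at the assembled, citable context.

```python
from dataclasses import dataclass

# Hypothetical lexicon and cue words standing in for NER and intent models.
KNOWN_ENTITIES = {"customer x": "cust:x", "vendor y": "vendor:y"}
RELATIONAL_CUES = ("which", "how many", "across", "affected")


@dataclass
class QueryPlan:
    entities: list
    needs_graph: bool


def analyse(query: str) -> QueryPlan:
    """Query analysis: find known entities and decide whether the graph is needed."""
    q = query.lower()
    entities = [eid for name, eid in KNOWN_ENTITIES.items() if name in q]
    needs_graph = bool(entities) and any(cue in q for cue in RELATIONAL_CUES)
    return QueryPlan(entities, needs_graph)


def build_context(query: str, graph_lookup, vector_search) -> str:
    """Run both retrievals and assemble a context with citation tags.

    [G#] lines come from the graph, [D#] lines from documents, so the
    generated answer can cite each claim back to its source.
    """
    plan = analyse(query)
    facts = graph_lookup(plan.entities) if plan.needs_graph else []
    chunks = vector_search(query)
    lines = [f"[G{i}] {fact}" for i, fact in enumerate(facts, 1)]
    lines += [
        f"[D{i}] {text} (source: {source})"
        for i, (text, source) in enumerate(chunks, 1)
    ]
    return "\n".join(lines)


# Stub retrievers so the sketch runs without a graph database or vector index.
def stub_graph(entities):
    return ["cust:x -HAS_CONTRACT-> contract:42"]


def stub_vectors(query):
    return [("Contract 42 covers support services.", "contracts/42.pdf")]


context = build_context(
    "Which contracts does Customer X have?", stub_graph, stub_vectors
)
```

Keeping the routing decision in `analyse` makes it explicit application logic that can be tested on its own, separately from either retriever.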
Where the combination works
The workloads where this pattern produces step-change improvements:
Compliance and audit queries
"Show me all transactions over X involving counterparties classified as high-risk." Pure vector retrieval over a document store will surface documents that mention high-risk counterparties; it cannot enumerate the transactions. A knowledge graph with classified entities and linked transactions can.
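As a rough illustration of why this is an enumeration problem rather than a similarity problem, the sketch below joins transactions to a risk classification held on the counterparty entity. The data, the ids, and the `high` label are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    tx_id: str
    counterparty: str
    amount: float


# Illustrative data: a classification on each counterparty node, and
# transactions linked to counterparties by id.
RISK_CLASS = {"cp:alpha": "high", "cp:beta": "low", "cp:gamma": "high"}
TRANSACTIONS = [
    Transaction("tx:1", "cp:alpha", 250_000.0),
    Transaction("tx:2", "cp:beta", 900_000.0),
    Transaction("tx:3", "cp:gamma", 40_000.0),
    Transaction("tx:4", "cp:gamma", 500_000.0),
]


def high_risk_over(threshold: float) -> list:
    """Enumerate every transaction above the threshold with a high-risk counterparty."""
    return [
        t.tx_id
        for t in TRANSACTIONS
        if t.amount > threshold and RISK_CLASS.get(t.counterparty) == "high"
    ]
```

The result is complete by construction: every linked transaction is checked, not just the ones that rank highly against the query text.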
Customer 360 questions
"What is the relationship history with Customer X across our subsidiaries?" Pure vector retrieval surfaces individual documents about Customer X. A graph that captures the corporate structure, the products purchased, the support history, and the relationship-manager assignments answers the question more completely.
Regulatory mapping
"Which products are subject to regulation Y in jurisdiction Z?" Pure vector retrieval finds documents about Y and Z. A graph that links products, jurisdictions, and regulations enumerates the answer precisely.
Impact analysis
"If we change system X, what other systems are affected?" Pure vector retrieval finds documentation. A graph that captures system dependencies answers the question directly.
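Impact analysis is a reachability question over the dependency graph. A minimal sketch, assuming dependencies are available as a plain mapping from each system to the systems it depends on (the system names are made up):

```python
from collections import deque

# Illustrative dependency data: each system maps to the systems it depends on.
DEPENDS_ON = {
    "web": ["api"],
    "api": ["db", "auth"],
    "reports": ["db"],
    "auth": [],
    "db": [],
}


def impacted_by(changed: str, depends_on: dict) -> set:
    """Return every system that depends on `changed`, directly or transitively."""
    # Invert the edges: for each system, who depends on it?
    dependents = {}
    for system, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(system)

    # Breadth-first walk over the inverted edges; `seen` guards against cycles.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    seen.discard(changed)  # a cycle could lead back to the changed system itself
    return seen
```

Here `impacted_by("db", DEPENDS_ON)` reaches `api` and `reports` directly and `web` through `api`, a two-hop connection that no single document has to state.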
Investigative queries
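"Find all transactions in the last six months involving these three counterparties." A graph answers this by starting from the three counterparty entities, following their links to transactions, and filtering by date; vector retrieval returns only documents that happen to mention them.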
In each of these, the graph adds something vector retrieval fundamentally cannot.
Where pure vector retrieval is enough
For many workloads, the graph would be overkill:
- Conversational Q&A on documentation. The documentation is in the index; the answers are in the text. The graph adds little.
- Single-document interaction. Summarisation, extraction, translation of a single document. No relationships to traverse.
- Open-ended exploratory queries. Where the user doesn't know what they are looking for, vector retrieval helps them browse. Graph queries require knowing the shape of what you want.
The right architecture depends on the workload mix. A team that has a graph-friendly workload mix invests in the graph; a team that doesn't gets less value from the investment.
Building the graph
The graph is the most expensive part of the architecture. Approaches:
Use the graph you already have
Many enterprises have:
- A CRM that captures customer relationships
- A CMDB that captures IT system dependencies
- An identity system that captures organisational structure
- An ERP that captures supplier and product relationships
These are partial knowledge graphs. They can be exposed as graph queries with appropriate adapters. Starting from existing master data is much cheaper than building from scratch.
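One way such an adapter might look, sketched against an in-memory SQLite table standing in for a CMDB. The table name `cmdb_dependency`, its columns, and the `DEPENDS_ON` relationship type are assumptions for illustration; a real CMDB would have its own schema and access path.

```python
import sqlite3


def edges_from_cmdb(conn):
    """Yield (source, relationship type, target) triples from a dependency table."""
    rows = conn.execute(
        "SELECT system, depends_on FROM cmdb_dependency "
        "ORDER BY system, depends_on"
    )
    for system, dependency in rows:
        yield (system, "DEPENDS_ON", dependency)


# Stand-in for an existing system of record.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cmdb_dependency (system TEXT, depends_on TEXT)")
conn.executemany(
    "INSERT INTO cmdb_dependency VALUES (?, ?)",
    [("web", "api"), ("api", "db")],
)

edges = list(edges_from_cmdb(conn))
```

The adapter reads the authoritative data in place and presents it as typed edges, so the system of record stays the source of truth.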
Extract from documents
LLMs are reasonably good at extracting entities and relationships from unstructured text. A pipeline that runs over the document corpus, extracts entities and relationships, and populates a graph can produce a usable graph at a reasonable cost.
The challenges:
- Entity resolution. "Acme Corp" mentioned in different documents has to be resolved to a single graph entity.
- Relationship typing. The extracted relationship "Acme acquired Beta" needs to be typed correctly.
- Confidence calibration. Some extractions are confident; some are uncertain. The graph has to represent uncertainty.
- Update discipline. The graph evolves as documents update. The pipeline has to re-extract.
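A rough sketch of the first and third challenges: collapsing name variants to one key and carrying confidence and provenance on each extracted edge. The suffix list is a deliberately naive heuristic and the confidence scores are invented; production entity resolution needs considerably more than string normalisation.

```python
import re
from dataclasses import dataclass

# Naive heuristic: legal suffixes stripped when comparing company names.
LEGAL_SUFFIXES = {"corp", "corporation", "inc", "ltd", "limited", "llc", "plc"}


def canonical_key(name: str) -> str:
    """Lowercase, drop punctuation, and strip trailing legal suffixes."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


@dataclass(frozen=True)
class ExtractedEdge:
    source: str
    relation: str
    target: str
    confidence: float  # extractor's score, assumed to lie in [0, 1]
    doc_id: str


def merge_extractions(edges) -> dict:
    """Group edges by resolved endpoints, keeping max confidence and all sources."""
    merged = {}
    for edge in edges:
        key = (canonical_key(edge.source), edge.relation, canonical_key(edge.target))
        entry = merged.setdefault(key, {"confidence": 0.0, "docs": []})
        entry["confidence"] = max(entry["confidence"], edge.confidence)
        entry["docs"].append(edge.doc_id)
    return merged


merged = merge_extractions([
    ExtractedEdge("Acme Corp", "ACQUIRED", "Beta Ltd", 0.9, "doc1"),
    ExtractedEdge("ACME Corporation", "ACQUIRED", "Beta Limited", 0.6, "doc2"),
])
```

Keeping the list of source documents on each edge also helps with update discipline: when a document changes, the edges it contributed can be found and re-extracted.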
Hybrid
Most production graphs are hybrid — structured authoritative data from existing systems plus extracted information from documents to fill the gaps. The authoritative parts are high-confidence; the extracted parts are confidence-tagged and may require human curation for high-stakes entries.
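A minimal sketch of how a hybrid graph might separate edges that can ground an answer from edges that wait for curation. The provenance labels and the 0.8 threshold are assumptions chosen for the example, not recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str
    target: str
    provenance: str          # "authoritative" or "extracted"
    confidence: float = 1.0  # only meaningful for extracted edges


def partition(edges, min_confidence: float = 0.8):
    """Split edges into those usable for answers and those queued for review."""
    usable, review = [], []
    for edge in edges:
        if edge.provenance == "authoritative" or edge.confidence >= min_confidence:
            usable.append(edge)
        else:
            review.append(edge)
    return usable, review


usable, review = partition([
    Edge("cust:x", "HAS_CONTRACT", "contract:42", "authoritative"),
    Edge("acme", "ACQUIRED", "beta", "extracted", 0.92),
    Edge("acme", "PARTNER_OF", "gamma", "extracted", 0.55),
])
```

For high-stakes entity types the threshold could be set so that every extracted edge goes to review, which matches the curation approach described above.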
Operational considerations
The operational profile of a graph + RAG system is more demanding than RAG alone:
- Two systems to maintain. The vector index and the graph database. Both need backup, replication, scaling.
- Two query languages in the codebase. Graph queries (Cypher, SPARQL, or a query builder) sit alongside the application's standard query patterns, and the team has to be fluent in both.
- More complex retrieval logic. Deciding when to use the graph, when to use vector retrieval, when both — this is application logic.
- Audit complexity. Answers grounded in two sources have to cite both.
Teams with strong data engineering background take this on more easily. Teams without should treat it as a real operational commitment.
Tooling in 2024
The tooling landscape for graph + RAG:
- Graph databases: Neo4j, Amazon Neptune, Azure Cosmos DB Gremlin, ArangoDB, TigerGraph. Each has its own query language and operational profile.
- Triplestores for SPARQL: Apache Jena, Stardog, GraphDB. For ontology-heavy domains.
- Property graph extensions to existing databases: Apache AGE for Postgres, MariaDB OQGRAPH. Lower operational overhead, but a narrower feature set than a dedicated graph database.
- GraphRAG frameworks: Microsoft's GraphRAG, LlamaIndex's PropertyGraphIndex, custom implementations. All early; expect iteration.
- Extraction tools: LlamaIndex extraction, KGen, Mistral's structured output models. Capabilities improving rapidly.
Picking products is a function of existing operational capability, the data model the workload needs, and the team's familiarity with each. None is dominantly the right answer.
What we keep seeing
Recurring patterns in graph + RAG engagements:
The graph quality dominates. A clean, well-modelled graph with good entity resolution produces step-change improvements. A messy graph produces marginal improvements over pure vector retrieval.
Entity resolution is the hard part. Building the graph is one step; keeping it consistent over time, especially as new documents arrive, is the ongoing work.
Query construction is the unsung skill. Translating a natural-language question into the right graph query, possibly combined with vector retrieval, is where the application logic lives. It is non-trivial.
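One conservative way to approach query construction is to map a classified intent to a vetted, parameterised template instead of generating query text freely. The sketch below only builds the query string and parameter dictionary; it does not call any driver. The intent names, node labels, relationship types, and property names are hypothetical and would come from your own graph schema.

```python
import re

# Hypothetical intent-to-template registry written in Cypher. Values are
# passed as $parameters and never interpolated into the query text.
TEMPLATES = {
    "contracts_with_vendor_subsidiaries": (
        "MATCH (c:Customer {id: $customer_id})-[:HAS_CONTRACT]->(k:Contract)"
        "<-[:PARTY_TO]-(s:Company)-[:SUBSIDIARY_OF*1..]->(v:Company {id: $vendor_id}) "
        "RETURN k.id AS contract_id, s.name AS subsidiary"
    ),
    "open_incidents_for_team": (
        "MATCH (t:Team {id: $team_id})-[:OWNS]->(:Service)"
        "-[:HAS_INCIDENT]->(i:Incident {status: 'open'}) "
        "RETURN count(i) AS open_incidents"
    ),
}


def build_graph_query(intent: str, **params):
    """Return (query, parameters) for a known intent, or raise if it cannot."""
    if intent not in TEMPLATES:
        raise KeyError(f"no graph template for intent {intent!r}")
    template = TEMPLATES[intent]
    required = set(re.findall(r"\$(\w+)", template))
    missing = required - set(params)
    if missing:
        raise ValueError(f"missing parameters: {sorted(missing)}")
    # Pass through only the parameters the template actually uses.
    return template, {name: params[name] for name in required}


query, parameters = build_graph_query(
    "open_incidents_for_team", team_id="team:platform"
)
```

Templates trade flexibility for predictability: every query that reaches the database has been reviewed, and resolved entity ids travel as parameters, not as text spliced into the query. Letting an LLM write the query directly covers more question shapes but needs validation before execution.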
The cost case is workload-dependent. For relationship-heavy workloads, the graph earns its cost easily. For text-heavy workloads, the cost case is weaker.
Hybrid retrieval almost always beats either alone. When the graph is good enough to use, querying it alongside the vector index produces better answers than relying on one source.
What we recommend
For enterprise teams considering graph + RAG architectures in 2024:
- Profile the workload. Relationship-heavy queries justify the investment; text-heavy queries may not.
- Start with the graph you have. Existing master data, CMDB, identity — these are graph-shaped already.
- Add extracted information cautiously. Confidence-tag it; curate the high-stakes entries.
- Build the query orchestration layer deliberately. Deciding when to use which retrieval is application logic, not framework magic.
- Cite both sources. Answers grounded in the graph need to show the graph path; answers from documents need to show the source.
- Plan for ongoing graph maintenance. The graph evolves; the pipeline has to keep up.
- Treat the graph + RAG pattern as a real operational commitment, not as a bolt-on extension to an existing RAG stack.
Knowledge graphs and vector retrieval are complementary. The teams that combine them well unlock workloads that neither pattern could serve alone. The teams that combine them poorly add operational complexity without proportional value. The difference is in the graph quality and the orchestration discipline.