Overview
Production Readiness
0.4
Novelty Score
0.6
Cost Impact Score
0.6
Citation Count
22
Why It Matters For Business
GraphRAG supplies retrieved relational facts to the LLM as context, which can reduce hallucination and shorten input prompts; this improves accuracy for QA, search, and domain workflows while leveraging existing graph databases.
Summary TLDR
This paper is the first systematic survey of Graph Retrieval-Augmented Generation (GraphRAG). GraphRAG extends text-based RAG by indexing and retrieving graph elements (nodes, triples, paths, subgraphs) and converting them into formats LMs can consume. The authors organize research into three stages—G-Indexing, G-Retrieval, G-Generation—cover core methods (graph/text/vector/hybrid indexing; non-parametric/LM/GNN retrievers; graph languages and embeddings; hybrid GNN+LM generators), benchmarks, industry systems, and open challenges like scalable retrieval, dynamic graphs, multimodality, and context compression. The repo link and many literature pointers are provided for quick follow-up.
Problem Statement
Text-only RAG misses structured relations, produces redundant long context, and struggles to capture global relational context. GraphRAG aims to retrieve structured graph elements to supply relational knowledge, reduce verbosity, and improve faithful, context-aware generation.
Main Contribution
First comprehensive survey of GraphRAG methods and applications.
Formalizes the GraphRAG pipeline into three stages: G-Indexing, G-Retrieval, G-Generation.
Categorizes core techniques, training strategies, benchmarks, and industrial deployments, and lists open problems and future directions.
Key Findings
GraphRAG workflow decomposes into three repeatable stages: Graph-Based Indexing, Graph-Guided Retrieval, and Graph-Enhanced Generation.
Retrieval granularity strongly affects trade-offs: nodes/triples are fast but narrow; paths/subgraphs capture richer context but explode combinatorially.
The candidate subgraph space grows exponentially, making efficient search and pruning essential.
Graph data can be fed to LMs either as graph languages (text/code sequences) or as graph embeddings; each has trade-offs.
Benchmarks and cross-domain suites exist but are fragmented; GRBENCH contains 1,740 questions for graph-augmented QA.
Industry prototypes (Microsoft, NebulaGraph, AntGroup, Neo4j) show GraphRAG practicality for QFS, search, and enterprise knowledge access.
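The combinatorial growth of path-level retrieval can be seen on a toy graph. The following is a minimal sketch, not taken from the survey: it counts simple paths between two nodes of a complete graph and prints that count next to the edge count. The helper names `count_simple_paths` and `complete_graph` are illustrative assumptions.

```python
from typing import Dict, List, Set


def count_simple_paths(adj: Dict[str, List[str]], src: str, dst: str) -> int:
    """Count simple paths (no repeated node) from src to dst with a DFS."""

    def dfs(node: str, visited: Set[str]) -> int:
        if node == dst:
            return 1
        total = 0
        for nxt in adj[node]:
            if nxt not in visited:
                total += dfs(nxt, visited | {nxt})
        return total

    return dfs(src, {src})


def complete_graph(n: int) -> Dict[str, List[str]]:
    """Toy worst case: every node is connected to every other node."""
    nodes = [f"v{i}" for i in range(n)]
    return {u: [v for v in nodes if v != u] for u in nodes}


for n in (4, 5, 6):
    graph = complete_graph(n)
    n_edges = n * (n - 1) // 2  # candidate count for triple-level retrieval
    n_paths = count_simple_paths(graph, "v0", f"v{n - 1}")
    print(n, n_edges, n_paths)
# prints:
# 4 6 5
# 5 10 16
# 6 15 65
```

Even on six nodes, the paths between a single node pair outnumber all edges in the graph, which is why path and subgraph retrievers typically need hop limits or pruning.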
Who Should Care
What To Try In 7 Days
Prototype a small text-attributed graph from internal docs and index it in Neo4j.
Implement a two-stage retriever: fast non-parametric seed (BFS/PCST) + LM or cross-encoder reranker.
Feed concise graph language summaries (edge table or node sequences) to an LLM prompt and compare accuracy vs. text-only RAG.
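A rough sketch of the two-stage retriever might look like the following. It is an illustration under stated assumptions, not the survey's method: stage one is a plain hop-limited BFS (PCST is not implemented here), and stage two uses a token-overlap score as a stand-in for an LM or cross-encoder reranker. The toy graph `KG` and all function names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def bfs_seed(triples: List[Triple], seeds: List[str], max_hops: int) -> List[Triple]:
    """Stage 1: collect triples reachable within max_hops of the seed entities."""
    adj: Dict[str, List[Triple]] = {}
    for h, r, t in triples:
        adj.setdefault(h, []).append((h, r, t))
        adj.setdefault(t, []).append((h, r, t))  # traverse edges in both directions
    seen_nodes = set(seeds)
    found: List[Triple] = []
    queue = deque((s, 0) for s in seeds)
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # hop budget exhausted for this branch
        for triple in adj.get(node, []):
            if triple not in found:
                found.append(triple)
            for nxt in (triple[0], triple[2]):
                if nxt not in seen_nodes:
                    seen_nodes.add(nxt)
                    queue.append((nxt, depth + 1))
    return found


def overlap_score(question: str, triple: Triple) -> float:
    """Stage 2 stand-in: fraction of triple tokens that appear in the question."""
    q_tokens = set(question.lower().split())
    t_tokens = set(" ".join(triple).lower().replace("_", " ").split())
    return len(q_tokens & t_tokens) / len(t_tokens)


def retrieve(question: str, triples: List[Triple], seeds: List[str],
             max_hops: int = 2, top_k: int = 2) -> List[Triple]:
    candidates = bfs_seed(triples, seeds, max_hops)
    ranked = sorted(candidates, key=lambda tr: overlap_score(question, tr), reverse=True)
    return ranked[:top_k]


KG: List[Triple] = [
    ("Ada", "works_at", "Acme"),
    ("Acme", "located_in", "Berlin"),
    ("Berlin", "capital_of", "Germany"),
    ("Bob", "works_at", "Initech"),
]

print(retrieve("where is acme located", KG, seeds=["Ada"], max_hops=2, top_k=1))
# prints: [('Acme', 'located_in', 'Berlin')]
```

In a real prototype the seed entities would come from entity linking against the graph database, and `overlap_score` would be replaced by a learned reranker.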
Agent Features
Memory
- Graph database as long-term structured memory
- Community summaries as compressed global context
Planning
- LLM-generated reasoning paths
- Agentic hop prediction to stop retrieval
Tool Use
- Graph DB traversal (BFS/DFS)
- SPARQL/Cypher queries
- LLM function-calling for retrieval
Frameworks
- LlamaIndex
- LangChain
- Neo4j + NaLLM
- NebulaGraph GraphRAG
Architectures
- GNN + LM (hybrid)
- GNN cascaded into LM
- Parallel GNN + LM fusion
Collaboration
- GNN encoders working with LMs
- Multi-stage retriever + reranker pipelines
Optimization Features
Token Efficiency
- Summaries/community reports to shorten prompts
- Graph languages (adjacency/edge tables) to compress structure
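As an illustration of graph languages, the sketch below serializes the same triples as an edge table and as an adjacency list. The exact layouts (pipe-separated columns, `relation -> target` entries) are assumptions chosen for readability, not formats prescribed by the survey, and whether one is shorter in tokens depends on the tokenizer.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def to_edge_table(triples: List[Triple]) -> str:
    """One row per edge, with a header row naming the columns."""
    lines = ["source | relation | target"]
    lines += [f"{h} | {r} | {t}" for h, r, t in triples]
    return "\n".join(lines)


def to_adjacency_list(triples: List[Triple]) -> str:
    """One row per head entity, so each head name is written only once."""
    adj: Dict[str, List[str]] = {}
    for h, r, t in triples:
        adj.setdefault(h, []).append(f"{r} -> {t}")
    return "\n".join(f"{h}: " + "; ".join(out) for h, out in adj.items())


triples = [
    ("Ada", "works_at", "Acme"),
    ("Ada", "lives_in", "Berlin"),
    ("Acme", "located_in", "Berlin"),
]
print(to_edge_table(triples))
print(to_adjacency_list(triples))
```

The adjacency form tends to help when a few entities have many outgoing edges; the edge table is easier to truncate row by row when a prompt budget is tight.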
Infra Optimization
- Use vector indices (LSH) for fast nearest-neighbor lookup
- Leverage graph DB traversal for structural queries
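To make the LSH idea concrete, here is a minimal random-hyperplane LSH sketch in pure Python. It is a teaching example under simplifying assumptions (a single hash table, no reranking of bucket members by true distance); the class name `HyperplaneLSH` and its parameters are hypothetical, and a production system would more likely use an existing vector index.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


class HyperplaneLSH:
    """Buckets vectors by the sign pattern of their dot products with random planes."""

    def __init__(self, dim: int, n_bits: int = 8, seed: int = 0) -> None:
        rng = random.Random(seed)  # fixed seed keeps the hash function reproducible
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
        self.buckets: Dict[Tuple[int, ...], List[str]] = defaultdict(list)

    def signature(self, vec: Sequence[float]) -> Tuple[int, ...]:
        return tuple(
            1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
            for plane in self.planes
        )

    def add(self, key: str, vec: Sequence[float]) -> None:
        self.buckets[self.signature(vec)].append(key)

    def query(self, vec: Sequence[float]) -> List[str]:
        """Return the keys that share the query's bucket (approximate neighbors)."""
        return list(self.buckets.get(self.signature(vec), []))


index = HyperplaneLSH(dim=2, n_bits=8, seed=0)
index.add("node_a", [1.0, 0.0])
print(index.query([2.0, 0.0]))   # same direction, so the same bucket as node_a
print(index.query([-1.0, 0.0]))  # opposite direction, so a different bucket
```

Vectors with a small angle between them agree on most sign bits, so they collide with high probability; using several tables with different seeds is the usual way to trade memory for recall.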
Model Optimization
- Prompt/prefix tuning for LMs
- Lightweight GNN variants
System Optimization
- Hybrid indexing (graph + vector + text) for latency/recall trade-offs
Training Optimization
- Distant supervision for retriever paths
- Contrastive pretraining for passage/subgraph embeddings
Inference Optimization
- Multi-stage retrieval to reduce LM calls
- Constrained decoding for valid KB queries
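Constrained decoding can be sketched with a prefix trie over the token sequences that form valid queries: at each step the decoder may only pick a token the trie allows. The example below is a simplified greedy version under stated assumptions; the `score` callable stands in for a language model's next-token scores, and the relation names are invented for illustration.

```python
from typing import Callable, Dict, List, Sequence

END = "<end>"  # marks the end of a complete valid sequence


def build_trie(sequences: Sequence[Sequence[str]]) -> Dict:
    if not sequences:
        raise ValueError("need at least one valid sequence")
    root: Dict = {}
    for seq in sequences:
        node = root
        for tok in list(seq) + [END]:
            node = node.setdefault(tok, {})
    return root


def constrained_greedy_decode(trie: Dict, score: Callable[[List[str], str], float]) -> List[str]:
    """Greedily pick the best-scoring token among those the trie permits."""
    out: List[str] = []
    node = trie
    while True:
        allowed = list(node.keys())
        best = max(allowed, key=lambda tok: score(out, tok))
        if best == END:
            return out
        out.append(best)
        node = node[best]


valid_paths = [
    ["film", "directed_by"],
    ["film", "starring"],
    ["person", "born_in"],
]
# Toy scores: the model "prefers" written_by, which is not a valid relation here.
toy_scores = {"film": 0.9, "person": 0.5, "written_by": 0.95,
              "directed_by": 0.6, "starring": 0.3, "born_in": 0.2, END: 0.1}

result = constrained_greedy_decode(
    build_trie(valid_paths), lambda prefix, tok: toy_scores.get(tok, 0.0)
)
print(result)  # prints: ['film', 'directed_by']
```

Because the invalid token is never offered as a candidate, the output is always one of the sequences in the trie, whatever the underlying scores are.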
Reproducibility
Code Urls
Code Available
Open Source Status
- partial
Risks & Boundaries
Limitations
- Most methods tested on small graphs; large-scale industrial graphs remain challenging.
- Converting graphs into LM-input formats can produce long contexts that LMs mishandle.
- Graph embeddings may lose exact entity names and precise facts.
When Not To Use
- When your problem is pure text retrieval with no relational reasoning needs.
- When you cannot afford graph construction and maintenance costs.
- When low-latency (<100ms) real-time responses are mandatory and multi-hop retrieval is required.
Failure Modes
- Over-retrieval: returning huge subgraphs that exceed token limits and confuse the LM.
- Poor retrieval precision when entity linking is noisy, causing wrong evidence to drive answers.
- Embedding-based fusion losing exact entity identifiers, producing semantically close but incorrect answers.
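One simple guard against over-retrieval is to cap the retrieved evidence at a token budget. The sketch below keeps the highest-scoring triples that fit; it assumes relevance scores already exist and uses whitespace word counts as a crude proxy for model tokens. The function name `prune_to_budget` is hypothetical.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]


def prune_to_budget(
    scored_triples: List[Tuple[float, Triple]],
    max_tokens: int,
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),
) -> List[Triple]:
    """Keep triples in descending score order, skipping any that would exceed the budget."""
    kept: List[Triple] = []
    used = 0
    for _score, triple in sorted(scored_triples, key=lambda item: item[0], reverse=True):
        cost = count_tokens(" ".join(triple))
        if used + cost > max_tokens:
            continue  # this triple does not fit; a cheaper one might
        kept.append(triple)
        used += cost
    return kept


scored = [
    (0.9, ("Acme", "located_in", "Berlin")),
    (0.2, ("Bob", "works_at", "Initech")),
    (0.7, ("Ada", "works_at", "Acme")),
]
print(prune_to_budget(scored, max_tokens=6))
# prints: [('Acme', 'located_in', 'Berlin'), ('Ada', 'works_at', 'Acme')]
```

For a real deployment, `count_tokens` would be swapped for the target model's tokenizer so the budget matches the actual context limit.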
Core Entities
Models
- GPT-4
- GPT-3
- LLaMA
- LLaMA2
- Qwen2
- RoBERTa
- BERT
- SentenceBERT
- GCN
- GAT
- GraphSAGE
- Graph Transformer
- GreaseLM
- ENGINE
- G-Retriever
- GNN-RAG
- GRAG
- KG-GPT
Metrics
- Exact Match (EM)
- F1
- Accuracy
- Recall
- MRR
- Hits@K
- BERTScore
- GPT4Score
- BLEU
- ROUGE-L
- NDCG@K
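For reference, a few of these metrics can be computed per example as in the sketch below. The normalization (lowercasing and whitespace collapsing) is an assumption; individual benchmarks often apply their own rules, such as stripping punctuation or articles. Averaging `reciprocal_rank` over all queries gives MRR.

```python
from collections import Counter
from typing import List, Set


def normalize(text: str) -> str:
    return " ".join(text.lower().split())


def exact_match(pred: str, gold: str) -> float:
    return float(normalize(pred) == normalize(gold))


def token_f1(pred: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    p_tokens = normalize(pred).split()
    g_tokens = normalize(gold).split()
    overlap = sum((Counter(p_tokens) & Counter(g_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(p_tokens)
    recall = overlap / len(g_tokens)
    return 2 * precision * recall / (precision + recall)


def hits_at_k(ranked: List[str], gold: Set[str], k: int) -> float:
    """1.0 if any gold item appears in the top k, else 0.0."""
    return float(any(item in gold for item in ranked[:k]))


def reciprocal_rank(ranked: List[str], gold: Set[str]) -> float:
    """1 / rank of the first gold item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in gold:
            return 1.0 / rank
    return 0.0


print(exact_match("Berlin ", "berlin"))             # prints: 1.0
print(round(token_f1("barack obama", "obama"), 3))  # prints: 0.667
print(hits_at_k(["a", "b", "c"], {"c"}, k=2))       # prints: 0.0
print(reciprocal_rank(["a", "b", "c"], {"b"}))      # prints: 0.5
```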
Datasets
- GRBENCH
- GraphQA
- WebQSP
- WebQuestions
- CWQ
- GrailQA
- CSQA
- ConceptNet
- Wikidata
- Freebase
- DBpedia
- CMeKG
- CPubMed-KG
- HotpotQA
- STaRK
- CRAG
Benchmarks
- GRBENCH
- STaRK
- CRAG

