Reducing Hallucinations in Complex Question Answering using Simple Graph-based Retrieval-Augmented Generation (long version)
Summary
A study by Wedge et al. introduces an agentic system that reduces hallucinations and improves factual correctness in complex question answering (QA) by augmenting Retrieval-Augmented Generation (RAG) with a lightweight graph-based knowledge base. On 510 questions from the MoNaCo Wikipedia QA benchmark, the "vector+graph RAG" system, which uses a curated August 2025 Wikipedia snapshot, reached a fine-grained truthfulness score of approximately 63, an 80% increase over the baseline vector RAG score of 35. It also more than doubled factual correctness precision and recall relative to vector RAG, and halved hallucinated answers relative to a zero-shot approach (coarse truthfulness improved from -127 to -49). The system runs on GPT-5.4 and Microsoft's Harrier 0.6B embedding model and uses handwritten Cypher queries over a Neo4j graph database of 5.7 million nodes and 22 million relationships. These gains came with only a modest increase in token usage.
Key takeaway
For Machine Learning Engineers building complex question answering systems, integrating a lightweight knowledge graph with agentic RAG can substantially improve answer accuracy and reduce hallucinations. Prioritize explicit, efficient graph query tools over relying solely on vector search, even if LLMs show a bias towards simpler tools. The reported payoff is more trustworthy and factually correct responses, especially for multi-hop and multi-entity queries, at the cost of a slight increase in token usage.
Key insights
Integrating lightweight knowledge graphs into RAG systems significantly reduces LLM hallucination and improves complex QA accuracy.
Principles
- Agentic RAG with structured query tools outperforms pure vector search for complex QA.
- Handwritten graph queries enhance LLM focus and avoid prompt injection risks.
- Evaluating RAG systems requires metrics that penalize hallucination more severely than refusal.
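The last principle can be illustrated with a toy scoring rule. This is a minimal sketch, not the metric used in the study: the outcome labels and the weights in `WEIGHTS` are hypothetical, chosen only so that a hallucinated answer costs more than a refusal and the total can go negative.

```python
from collections import Counter

# Hypothetical labels and weights (assumptions, not the paper's metric):
# a hallucinated answer is penalized, a refusal is neutral.
WEIGHTS = {"correct": 1.0, "refused": 0.0, "hallucinated": -2.0}


def truthfulness_score(outcomes, weights=WEIGHTS):
    """Sum per-question weights so a wrong answer costs more than a refusal."""
    counts = Counter(outcomes)
    unknown = set(counts) - set(weights)
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    return sum(weights[label] * n for label, n in counts.items())


# Two systems with the same number of correct answers.
cautious = ["correct", "refused", "refused", "correct"]
reckless = ["correct", "hallucinated", "hallucinated", "correct"]

print(truthfulness_score(cautious))
print(truthfulness_score(reckless))
```

Under plain accuracy both systems would score 2 out of 4. With an asymmetric penalty, the system that refuses when unsure ranks above the one that guesses.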
Method
Construct a lightweight knowledge graph from semi-structured documents. Equip an LLM agent with vector search, structural navigation, and relational query tools (e.g., Cypher) to retrieve information efficiently.
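The construction step might look roughly like the sketch below, which turns one semi-structured page into nodes and relationships held in memory. The input layout (`title`, `sections`, `links`), the node kinds, and the relationship names `HAS_SECTION` and `LINKS_TO` are illustrative assumptions, not the schema used in the study.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)  # node id -> properties
    edges: set = field(default_factory=set)    # (source id, relationship, target id)

    def add_node(self, node_id, **props):
        self.nodes.setdefault(node_id, {}).update(props)

    def add_edge(self, src, rel, dst):
        self.edges.add((src, rel, dst))


def ingest_page(graph, page):
    """Add one page, its sections, and its outgoing links to the graph."""
    page_id = f"page:{page['title']}"
    graph.add_node(page_id, kind="Page", title=page["title"])
    for position, section in enumerate(page["sections"]):
        section_id = f"{page_id}#section:{position}"
        graph.add_node(section_id, kind="Section", title=section["title"],
                       text=section["text"], position=position)
        graph.add_edge(page_id, "HAS_SECTION", section_id)
        for target in section.get("links", []):
            # Linked pages become nodes even before they are ingested themselves.
            target_id = f"page:{target}"
            graph.add_node(target_id, kind="Page", title=target)
            graph.add_edge(section_id, "LINKS_TO", target_id)
    return page_id


# Hypothetical input document.
example_page = {
    "title": "Neo4j",
    "sections": [
        {"title": "Overview", "text": "Graph database.", "links": ["Cypher"]},
        {"title": "History", "text": "Release timeline.", "links": []},
    ],
}

graph = Graph()
ingest_page(graph, example_page)
```

The graph stays lightweight because it records only the document structure and explicit links. No entity or relation extraction is involved in this sketch.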
In practice
- Use Neo4j and Cypher for graph-based RAG implementation.
- Employ smaller, efficient embedding models like Harrier 0.6B for large datasets.
- Explicitly prompt LLM agents for efficient tool use (e.g., "Title search" -> "Section titles" -> "Get sections").
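One way to picture the "Title search" -> "Section titles" -> "Get sections" chain is as a small registry of handwritten, parameterized Cypher queries. The sketch below is an assumption-laden illustration: the node labels, property names, tool names, and the `build_call` and `call_tool` helpers are all hypothetical, and the queries are not the ones used in the study.

```python
# Hypothetical schema: (:Page {title})-[:HAS_SECTION]->(:Section {title, text, position}).
# Each tool maps to (handwritten Cypher query, required parameter names).
TOOLS = {
    "title_search": (
        "MATCH (p:Page) WHERE toLower(p.title) CONTAINS toLower($text) "
        "RETURN p.title AS title LIMIT $limit",
        {"text", "limit"},
    ),
    "section_titles": (
        "MATCH (:Page {title: $title})-[:HAS_SECTION]->(s:Section) "
        "RETURN s.title AS section ORDER BY s.position",
        {"title"},
    ),
    "get_sections": (
        "MATCH (:Page {title: $title})-[:HAS_SECTION]->(s:Section) "
        "WHERE s.title IN $sections "
        "RETURN s.title AS section, s.text AS text ORDER BY s.position",
        {"title", "sections"},
    ),
}


def build_call(tool, **params):
    """Return (query, params) for a named tool; model output only fills parameters."""
    if tool not in TOOLS:
        raise KeyError(f"unknown tool: {tool}")
    query, required = TOOLS[tool]
    if set(params) != required:
        raise ValueError(f"{tool} expects parameters {sorted(required)}")
    return query, params


def call_tool(run, tool, **params):
    """Execute a tool through `run`, any callable that accepts (query, params)."""
    query, checked = build_call(tool, **params)
    return run(query, checked)
```

Here `run` could be a thin wrapper around a Neo4j driver session. Because the query text is fixed and agent-supplied values travel only as parameters, the agent never writes Cypher itself, which is one plausible reading of why handwritten queries limit injection risk.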
Topics
- Retrieval-Augmented Generation
- Knowledge Graphs
- LLM Hallucination
- Complex Question Answering
- Neo4j Cypher
- Agentic AI Systems
Best for: Research Scientist, AI Architect, AI Engineer, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.