Vector RAG vs LLM-Compiled Wiki: A Preregistered Comparison on a Small Multi-Domain Research
Summary
A preregistered study compared a single-round Vector RAG system against an LLM-compiled markdown wiki as the knowledge source for an LLM answering 13 questions over 24 research papers. With the same answer-generating model and blinded LLM judges, the wiki significantly outperformed Vector RAG at connecting findings across papers, though its advantage in answer organization was less pronounced after judge adjustments. Vector RAG, however, met the preregistered test for single-fact lookup questions. Contrary to expectations, the wiki incurred substantially higher query-side token costs than RAG, so the upfront cost of building it was not recovered at query time. Exploratory analyses found that the wiki's citations more accurately supported the specific claims they were attached to, even though RAG scored better on overall groundedness. A decomposition-based RAG variant achieved most of the wiki's cross-paper synthesis benefit at lower token cost, but not its claim-by-claim citation support.
Key takeaway
AI architects and research scientists designing knowledge retrieval systems should treat "grounded research synthesis" as several distinct capabilities rather than a single one. If the primary need is complex cross-document synthesis with high citation fidelity, an LLM-compiled wiki may be more effective despite its higher query-side token cost. A decomposition-based RAG variant recovers most of the synthesis benefit at lower cost, but not the claim-by-claim citation support. For single-fact retrieval, standard Vector RAG remains efficient. Evaluate your specific synthesis and citation requirements against system costs.
Key insights
Grounded research synthesis involves multiple capabilities, with no single architecture excelling across all metrics.
Principles
- Cross-paper synthesis benefits from structured knowledge.
- Citation accuracy differs from overall groundedness.
- Query-side token cost can outweigh the upfront cost of building a knowledge base.
Method
The study compared Vector RAG and an LLM-compiled wiki using blinded LLM judges to score answers to 13 questions over 24 papers, analyzing organization, synthesis, cost, and citation support.
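The blinding step described above can be sketched as follows. This is a minimal illustration, not the study's actual harness: the function names `blind_pair` and `unblind`, the neutral labels "A" and "B", and the score format are all assumptions made for the example. The idea is that a judge sees only neutral labels in a randomized order, and scores are mapped back to systems afterwards.

```python
import random

def blind_pair(answer_rag: str, answer_wiki: str, rng: random.Random):
    """Hide which system wrote which answer (hypothetical helper).

    Returns the answers keyed by neutral labels, plus a hidden key
    that maps each label back to the system that produced it.
    """
    systems = [("rag", answer_rag), ("wiki", answer_wiki)]
    rng.shuffle(systems)  # randomize presentation order per question
    labels = ["A", "B"]
    blinded = {label: text for label, (_, text) in zip(labels, systems)}
    key = {label: name for label, (name, _) in zip(labels, systems)}
    return blinded, key

def unblind(scores: dict, key: dict) -> dict:
    """Map judge scores given per neutral label back to system names."""
    return {key[label]: score for label, score in scores.items()}

# Usage: the judge only ever receives `blinded`; `key` stays with the harness.
rng = random.Random(0)
blinded, key = blind_pair("answer from RAG", "answer from wiki", rng)
judge_scores = {"A": 3, "B": 5}  # placeholder scores, not study data
by_system = unblind(judge_scores, key)
```

Shuffling per question, rather than fixing one system to label "A", is one common way to keep position bias from lining up with a system.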
In practice
- Consider an LLM-compiled wiki for complex cross-paper synthesis.
- Use Vector RAG for efficient single-fact lookups.
- Weigh query-side token costs against build costs when choosing a knowledge base.
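The cost comparison can be made concrete with a simple break-even calculation. The sketch below is an illustration under stated assumptions, not the study's cost model: the function name `break_even_queries` and its parameters are hypothetical, and costs are assumed to be measured in tokens. An upfront build cost is only recovered if each query is cheaper than with the alternative. When per-query cost is higher, as the study reports for the wiki, there is no break-even point.

```python
from typing import Optional

def break_even_queries(
    build_tokens: float,
    wiki_query_tokens: float,
    rag_query_tokens: float,
) -> Optional[float]:
    """Number of queries needed for the wiki's build cost to pay off.

    Returns None when the wiki saves nothing per query, in which case
    the upfront build cost is never recovered.
    """
    saving_per_query = rag_query_tokens - wiki_query_tokens
    if saving_per_query <= 0:
        return None
    return build_tokens / saving_per_query

# Placeholder numbers for illustration only, not figures from the study.
print(break_even_queries(1000, 30, 50))  # 50.0: wiki is cheaper per query
print(break_even_queries(1000, 50, 30))  # None: wiki costs more per query
```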
Topics
- Vector RAG
- LLM-compiled Wiki
- Research Synthesis
- Question Answering
- LLM Judges
Best for: Research Scientist, AI Architect, AI Engineer, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.