Extrinsic Hallucinations in LLMs
Summary
Large language model (LLM) hallucination, defined as generating fabricated content that is not grounded in context or world knowledge, falls into two types: in-context and extrinsic. This analysis focuses on extrinsic hallucination, which occurs when model output is not grounded in the pre-training data or in external facts. Hallucinations stem from two sources. Pre-training data can be outdated or incorrect. Fine-tuning is the second source: Gekhman et al. (2024) show that models learn examples containing new knowledge more slowly than examples consistent with what they already know, and that they become more prone to hallucination once those examples are learned. Detection approaches include retrieval-augmented evaluation methods and benchmarks such as FactualityPrompt, FActScore, SAFE, and FacTool, which check generated claims against external knowledge sources. Sampling-based methods like SelfCheckGPT instead assess consistency across multiple samples from the same model. Calibration-oriented benchmarks such as TruthfulQA and SelfAware measure whether a model acknowledges what it does not know and expresses appropriate confidence. Anti-hallucination methods include Retrieval-Augmented Generation (RAG) frameworks like RARR, FAVA, Rethinking with Retrieval (RR), and Self-RAG, which integrate external knowledge and self-reflection to improve factuality.
Key takeaway
For AI Architects and Research Scientists developing or deploying LLMs, understanding the two causes of hallucination (flawed pre-training data and fine-tuning on new knowledge) is critical. Prioritize retrieval-augmented generation (RAG) frameworks like RARR or Self-RAG to ground model outputs in verifiable external knowledge. Also run factuality evaluations such as SAFE or FacTool on a regular basis to quantify factuality, especially when fine-tuning on novel datasets, where the risk of fabrication rises.
Key insights
LLM hallucinations arise from flaws in pre-training data and from fine-tuning on new knowledge, so both detection and mitigation strategies are needed.
Principles
- Factuality requires grounding in context or world knowledge.
- Fine-tuning on new knowledge increases hallucination risk.
- Retrieval-augmented methods enhance factual grounding.
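To make the retrieval-grounding principle concrete, here is a minimal sketch of the idea, not the RARR or Self-RAG method. It assumes a toy word-overlap retriever; the names `tokenize`, `retrieve`, and `build_grounded_prompt` are hypothetical, and a real system would use a proper search index or dense retriever and pass the prompt to an LLM.

```python
import re


def tokenize(text):
    """Lowercased word set; a crude stand-in for a real retriever's features."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, top_k=2):
    """Rank documents by word overlap with the query (toy retriever, an assumption)."""
    query_tokens = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_grounded_prompt(query, documents, top_k=2):
    """Build a prompt that asks the model to answer only from retrieved passages."""
    passages = retrieve(query, documents, top_k)
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the passages below. "
        "If they do not contain the answer, say you do not know.\n\n"
        f"{context}\n\nQuestion: {query}\nAnswer:"
    )


if __name__ == "__main__":
    docs = [
        "The Eiffel Tower is in Paris.",
        "Bananas are yellow.",
        "Mount Fuji is in Japan.",
    ]
    print(build_grounded_prompt("Where is the Eiffel Tower?", docs, top_k=1))
```

The instruction to answer "I do not know" when the passages are silent ties grounding to the calibration concern: the model is given a sanctioned way to abstain rather than fabricate.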
Method
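Hallucination detection takes three forms: retrieval-augmented evaluation, which checks generated claims against external sources (FactualityPrompt, FActScore); sampling-based consistency checks, which compare multiple samples from the same model (SelfCheckGPT); and calibration tests, which measure whether the model admits when it does not know an answer.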
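The sampling-based idea can be sketched in a few lines. This is an illustration under stated assumptions rather than SelfCheckGPT itself: it replaces a learned similarity or entailment model with plain word overlap, and the function names are hypothetical. The intuition it keeps is that a statement the model reproduces across independent samples is more likely to be grounded than one that appears only once.

```python
import re


def tokens(text):
    """Lowercased word set; a crude stand-in for a real similarity model."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support(sentence, sample):
    """Fraction of the sentence's words that also appear in one sampled response."""
    sentence_tokens = tokens(sentence)
    if not sentence_tokens:
        return 0.0
    return len(sentence_tokens & tokens(sample)) / len(sentence_tokens)


def inconsistency_score(sentence, samples):
    """1 minus mean support across samples; higher suggests a less consistent claim."""
    if not samples:
        raise ValueError("need at least one sampled response")
    mean_support = sum(support(sentence, s) for s in samples) / len(samples)
    return 1.0 - mean_support


if __name__ == "__main__":
    samples = [
        "Marie Curie won two Nobel Prizes.",
        "Marie Curie received two Nobel Prizes in her career.",
    ]
    for claim in ["Marie Curie won two Nobel Prizes.", "Marie Curie was born in Vienna."]:
        print(claim, round(inconsistency_score(claim, samples), 2))
```

Word overlap cannot detect paraphrase or negation, so in practice the `support` function is the piece to swap for a stronger comparison model.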
In practice
- Use RAG to ground LLM outputs with external documents.
- Monitor fine-tuning for increased hallucination rates.
- Employ SAFE for long-form factuality evaluation.
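For long-form factuality evaluation, the final aggregation step can be sketched as follows. This assumes the F1@K metric from the SAFE paper, where a response is split into individual facts, each fact is labeled supported or not supported against search results, and K is a chosen target number of supported facts. The function name and arguments are hypothetical, and the fact extraction and verification steps are not shown.

```python
def f1_at_k(supported, not_supported, k):
    """Combine factual precision with recall up to K supported facts.

    supported:     number of facts judged supported by evidence
    not_supported: number of facts judged not supported
    k:             target number of supported facts (a user-chosen hyperparameter)
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if supported == 0:
        return 0.0
    precision = supported / (supported + not_supported)
    recall = min(supported / k, 1.0)  # recall saturates once K facts are supported
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Compare a short, accurate answer with a longer, noisier one at K = 10.
    print(f1_at_k(supported=5, not_supported=0, k=10))
    print(f1_at_k(supported=10, not_supported=10, k=10))
```

Tracking a score of this kind before and after fine-tuning gives a single number for spotting a rise in fabrication, with K controlling how much response length is rewarded.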
Topics
- LLM Hallucination
- Retrieval-Augmented Generation
- Factuality Evaluation
- LLM Fine-tuning
- Knowledge Calibration
Code references
- princeton-nlp/EntityQuestions
- google-deepmind/long-form-factuality
- sylinrl/TruthfulQA
- sylinrl/CalibratedMath
- kttian/llm_factuality_tuning
Best for: Research Scientist, AI Architect, AI Engineer, AI Researcher, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Lil'Log.