Extrinsic Hallucinations in LLMs

· Source: Lil'Log · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Advanced, extended

Summary

Hallucination in large language models (LLMs), defined as generating fabricated content that is not grounded in the provided context or in world knowledge, falls into two types: in-context and extrinsic. This analysis focuses on extrinsic hallucination, which occurs when model output is not grounded in the pre-training data or in external facts. Hallucinations stem from two sources. Pre-training data can be outdated or incorrect. Fine-tuning on new knowledge is the second source: Gekhman et al. (2024) show that models learn examples containing new knowledge more slowly than examples consistent with what they already know, and that once learned, those examples increase the tendency to hallucinate. Detection methods include retrieval-augmented evaluation approaches such as FactualityPrompt, FActScore, SAFE, and FacTool, which check generated claims against external knowledge bases. Sampling-based methods such as SelfCheckGPT assess consistency across multiple samples from the same model. Calibration benchmarks such as TruthfulQA and SelfAware measure a model's ability to acknowledge what it does not know and to express appropriate confidence. Anti-hallucination methods include retrieval-based approaches such as RARR, FAVA, Rethinking with Retrieval (RR), and Self-RAG, which integrate external knowledge and self-reflection to improve factuality.

Key takeaway

For AI Architects and Research Scientists developing or deploying LLMs, understanding the two causes of hallucination (flawed pre-training data and fine-tuning on new knowledge) is critical. Prioritize retrieval-based frameworks such as RARR or Self-RAG to ground model outputs in verifiable external knowledge. Additionally, adopt evaluation methods such as SAFE or FacTool to monitor and quantify factuality continuously, especially when fine-tuning on novel datasets, where the risk of fabrication increases.

Key insights

LLM hallucinations arise from flawed pre-training data and from fine-tuning on new knowledge, so both detection and mitigation strategies are needed.

Method

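Hallucination detection relies on retrieval-augmented evaluation (for example FactualityPrompt and FActScore), which checks generated claims against an external knowledge source, or on sampling-based consistency checks (SelfCheckGPT), which compare a response against other samples from the same model. Both are complemented by benchmarks that test whether a model is calibrated about what it does not know.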
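The sampling-based idea can be illustrated with a minimal sketch. It assumes that a response sentence which is poorly supported by other samples drawn from the same model is more likely to be hallucinated. The function names (`tokenize`, `support`, `inconsistency_scores`) and the token-overlap measure are hypothetical stand-ins chosen for brevity. They are not the scoring functions used by SelfCheckGPT, and a real setup would substitute a stronger sentence-level comparison.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support(sentence: str, sample: str) -> float:
    """Toy proxy for agreement: share of sentence tokens found in the sample."""
    sentence_tokens = tokenize(sentence)
    if not sentence_tokens:
        return 1.0  # an empty sentence makes no claim to contradict
    return len(sentence_tokens & tokenize(sample)) / len(sentence_tokens)


def inconsistency_scores(sentences: list[str], samples: list[str]) -> list[float]:
    """Return one score in [0, 1] per sentence; higher means less support.

    `sentences` are the sentences of the response under test and `samples`
    are additional responses sampled from the same model for the same prompt.
    """
    if not samples:
        raise ValueError("at least one sampled response is required")
    scores = []
    for sentence in sentences:
        mean_support = sum(support(sentence, s) for s in samples) / len(samples)
        scores.append(1.0 - mean_support)
    return scores


if __name__ == "__main__":
    response = [
        "Paris is the capital of France.",
        "The Eiffel Tower was built in 1740.",
    ]
    sampled = [
        "Paris is the capital of France. The Eiffel Tower opened in 1889.",
        "France's capital is Paris. The tower was completed in 1889.",
    ]
    for sentence, score in zip(response, inconsistency_scores(response, sampled)):
        print(f"{score:.2f}  {sentence}")
```

Token overlap is a deliberately crude choice: it keeps the sketch dependency-free, but it cannot tell a paraphrase from a contradiction, so the scores are only meaningful as a relative ranking within one response.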

Best for: Research Scientist, AI Architect, AI Engineer, AI Researcher, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Lil'Log.