Hallucinations as Orthogonal Noise: Inference-Time Manifold Alignment via Dynamic Contextual Orthogonalization
Summary
Dynamic Contextual Orthogonalization (DCO) is an inference-time intervention method designed to mitigate hallucinations in Large Language Models (LLMs). The approach posits that hallucinations arise from orthogonal noise within the residual stream's semantic manifold, where specific attention heads introduce components that diverge from the context subspace. DCO uses the input residual stream as a dynamic context anchor and performs an orthogonal decomposition of each attention head's output. It then applies a layer-wise Z-score suppression mechanism to selectively attenuate outlier orthogonal components. Evaluations on Llama-3-8B and Llama-3-70B across benchmarks such as XSum, NQ-Swap, and IFEval report higher contextual faithfulness than existing baselines. DCO is also reported to maintain high performance on knowledge-intensive tasks such as TriviaQA and TruthfulQA, addressing the common trade-off between hallucination suppression and parametric knowledge retention.
Key takeaway
For machine learning engineers deploying Large Language Models who are struggling with persistent hallucinations, Dynamic Contextual Orthogonalization (DCO) is worth evaluating. Because it is applied at inference time, it does not require retraining the model. It is reported to improve contextual faithfulness while preserving performance on knowledge-intensive tasks, easing the usual trade-off between hallucination suppression and knowledge retention.
Key insights
Hallucinations in LLMs are modeled as orthogonal noise, which Dynamic Contextual Orthogonalization (DCO) suppresses by attenuating the components of attention head outputs that are orthogonal to the context.
Principles
- Hallucinations manifest as orthogonal noise relative to the semantic manifold of the residual stream.
- Attention heads can introduce components orthogonal to the context subspace, disrupting latent representation coherence.
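The two principles above can be written as a standard orthogonal decomposition. The notation here is an assumption made for illustration and is not taken from the original article: $h$ is one attention head's output, $\mathcal{C}$ is the context subspace, and $P_{\mathcal{C}}$ is the orthogonal projector onto it.

```latex
h = h_{\parallel} + h_{\perp}, \qquad
h_{\parallel} = P_{\mathcal{C}}\, h, \qquad
h_{\perp} = (I - P_{\mathcal{C}})\, h, \qquad
\langle h_{\parallel}, h_{\perp} \rangle = 0 .
```

In the simplest case, where the context is represented by a single anchor vector $x$ (for example the input residual stream at that position), the projector reduces to $P_{\mathcal{C}} = x x^{\top} / \lVert x \rVert^{2}$. Under this reading, $h_{\perp}$ is the part of the head output treated as potential orthogonal noise.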
Method
DCO uses the input residual stream as a dynamic context anchor to perform orthogonal decomposition on attention head outputs, then applies layer-wise Z-score suppression to outlier orthogonal components.
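The procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the function names `decompose` and `suppress_outliers`, the use of a single anchor vector per position, the choice of the orthogonal component's norm as the Z-scored statistic, and the `z_threshold` and `scale` parameters are all assumptions made for this example.

```python
import numpy as np

def decompose(head_out, anchor, eps=1e-8):
    """Split each head output into parts parallel and orthogonal to the anchor.

    head_out: (n_heads, d_model) outputs of one layer's heads at one position.
    anchor:   (d_model,) input residual stream at that position (assumed anchor).
    """
    unit = anchor / (np.linalg.norm(anchor) + eps)
    coeff = head_out @ unit                 # (n_heads,) projection coefficients
    parallel = np.outer(coeff, unit)        # component along the anchor
    orthogonal = head_out - parallel        # remainder, orthogonal to the anchor
    return parallel, orthogonal

def suppress_outliers(head_out, anchor, z_threshold=2.0, scale=0.0, eps=1e-8):
    """Attenuate orthogonal components whose norm is a layer-wise Z-score outlier.

    z_threshold and scale are hypothetical hyperparameters for this sketch.
    """
    parallel, orthogonal = decompose(head_out, anchor, eps)
    norms = np.linalg.norm(orthogonal, axis=1)          # one norm per head
    z = (norms - norms.mean()) / (norms.std() + eps)    # Z-score within the layer
    gate = np.where(z > z_threshold, scale, 1.0)        # shrink outlier heads only
    return parallel + gate[:, None] * orthogonal, z

# Toy usage: eight heads, one of which has a large orthogonal component.
anchor = np.array([1.0, 0.0, 0.0, 0.0])
heads = np.tile(np.array([1.0, 0.1, 0.0, 0.0]), (8, 1))
heads[7] = [1.0, 10.0, 0.0, 0.0]
adjusted, z_scores = suppress_outliers(heads, anchor)
```

In this toy case only the last head exceeds the threshold, so its orthogonal component is scaled by `scale` while the other heads pass through unchanged. In a real model the same logic would have to be attached to each attention layer's per-head outputs, and how the original work does that is not described here.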
In practice
- Apply DCO to Llama-3-8B and 70B models to enhance contextual faithfulness.
- Utilize DCO to suppress hallucinations without sacrificing parametric knowledge retention in LLMs.
Topics
- Large Language Models
- Hallucination Mitigation
- Inference-Time Intervention
- Manifold Alignment
- Orthogonal Decomposition
- Llama-3
Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.