Hallucinations as Orthogonal Noise: Inference-Time Manifold Alignment via Dynamic Contextual Orthogonalization

· Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

Dynamic Contextual Orthogonalization (DCO) is an inference-time intervention method designed to mitigate hallucinations in Large Language Models (LLMs). The approach posits that hallucinations arise from orthogonal noise within the residual stream's semantic manifold: specific attention heads introduce components that diverge from the context subspace. DCO uses the input residual stream as a dynamic context anchor and performs an orthogonal decomposition of each attention head's output against it. It then applies a layer-wise Z-score suppression mechanism to selectively attenuate outlier orthogonal components. Evaluations on Llama-3-8B and Llama-3-70B across benchmarks such as XSum, NQ-Swap, and IFEval are reported to show better contextual faithfulness than existing baselines. DCO also maintains high performance on knowledge-intensive tasks such as TriviaQA and TruthfulQA, addressing the common trade-off between hallucination suppression and parametric knowledge retention.

Key takeaway

If you deploy Large Language Models and struggle with persistent hallucinations, consider Dynamic Contextual Orthogonalization (DCO). Because it intervenes at inference time, it requires no retraining, and it is reported to improve contextual faithfulness without the usual loss of parametric knowledge. It is most relevant to context-grounded tasks such as summarization and context-dependent question answering, where it may improve reliability while preserving performance on knowledge-intensive tasks.

Key insights

Hallucinations in LLMs are modeled as noise orthogonal to the context subspace of the residual stream. Dynamic Contextual Orthogonalization (DCO) suppresses that noise at inference time, aligning latent representations with the context.

Method

DCO uses the input residual stream as a dynamic context anchor to perform orthogonal decomposition on attention head outputs, then applies layer-wise Z-score suppression to outlier orthogonal components.
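
The two steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions made for illustration: the anchor is treated as a single residual-stream vector for one token position, the Z-score is computed over the norms of the per-head orthogonal components within one layer, and the names `decompose`, `dco_layer`, `z_threshold`, and `scale` are hypothetical.

```python
import numpy as np


def decompose(head_out, anchor, eps=1e-8):
    """Split head outputs into parts parallel and orthogonal to the anchor.

    head_out: (num_heads, d_model) attention head outputs for one token.
    anchor:   (d_model,) input residual stream, used as the context anchor.
    """
    unit = anchor / (np.linalg.norm(anchor) + eps)
    # Projection of every head output onto the anchor direction.
    parallel = (head_out @ unit)[:, None] * unit
    orthogonal = head_out - parallel
    return parallel, orthogonal


def dco_layer(head_out, anchor, z_threshold=2.0, scale=0.0, eps=1e-8):
    """Attenuate heads whose orthogonal component is a layer-wise outlier.

    z_threshold and scale are assumed hyperparameters, not values from the
    source: heads with Z-score above z_threshold have their orthogonal
    component multiplied by scale (0.0 removes it entirely).
    """
    parallel, orthogonal = decompose(head_out, anchor, eps)
    norms = np.linalg.norm(orthogonal, axis=-1)
    # Z-score of each head's orthogonal norm relative to the other heads
    # in the same layer.
    z = (norms - norms.mean()) / (norms.std() + eps)
    gate = np.where(z > z_threshold, scale, 1.0)
    return parallel + gate[:, None] * orthogonal, z
```

In a real model this logic would run inside a forward hook on each attention block, before the head outputs are summed back into the residual stream. How the anchor is formed, and whether suppression is hard or soft, should be taken from the original article rather than from this sketch.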

Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.