The Geometry of Truth: Detecting LLM Hallucinations with Geometric Algebra
Summary
A Type II hallucination, or "contradiction," occurs when a Large Language Model (LLM) generates a claim that directly conflicts with the provided context while remaining highly relevant to the topic. This differs from a Type I hallucination, which is a pure fabrication. Examples include reversing financial growth percentages, altering product launch timelines, and flipping medical test results. These contradictions are particularly dangerous because they sound plausible, score high on standard similarity metrics, reverse critical facts, and can lead to compliance violations in regulated fields. Traditional sentence embeddings fail to detect them and often assign high similarity scores because the claim and the context share the same entities, so detection calls for a more sophisticated approach such as Natural Language Inference (NLI) cross-encoders.
Key takeaway
AI engineers building RAG systems or fact-checking pipelines should integrate a Type II contradiction detector to catch subtle but critical errors. Standard similarity metrics are insufficient; instead, use NLI cross-encoders with orthogonalized discriminant directions to identify claims that semantically oppose the source context, especially in regulated domains where a reversed fact is a high-risk compliance issue.
Key insights
Type II hallucinations contradict the provided context. They are dangerous because they sound plausible and because standard similarity metrics are blind to them.
Principles
- Standard embeddings fail at contradiction detection.
- NLI cross-encoders capture semantic opposition.
Method
The Discriminant Direction Approach uses NLI cross-encoders to classify (premise, hypothesis) pairs. It extracts a pure contradiction direction by orthogonalizing the raw contradiction direction against the entailment direction using Gram-Schmidt, then projects normalized hidden states onto this pure direction to score claims.
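The orthogonalization and projection steps can be sketched in a few lines of NumPy. This is a minimal illustration, not the original article's implementation: the summary does not say how the raw entailment and contradiction directions are obtained, so the sketch takes them as given vectors and uses random placeholders for them and for the hidden states. The function names (`unit`, `pure_contradiction_direction`, `contradiction_scores`) are hypothetical.

```python
import numpy as np

def unit(v, eps=1e-12):
    """Scale a vector (or each row of a matrix) to unit L2 norm."""
    v = np.asarray(v, dtype=float)
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + eps)

def pure_contradiction_direction(d_contra, d_entail):
    """One Gram-Schmidt step: subtract from the raw contradiction
    direction its component along the entailment direction."""
    e = unit(d_entail)
    d = np.asarray(d_contra, dtype=float)
    residual = d - np.dot(d, e) * e
    return unit(residual)

def contradiction_scores(hidden_states, d_pure):
    """Project L2-normalized hidden states onto the pure direction.
    Each score is a cosine, so it lies in [-1, 1]."""
    return unit(hidden_states) @ d_pure

# Placeholder inputs (assumption): in a real pipeline these would come
# from an NLI cross-encoder; random values are used here for shape only.
rng = np.random.default_rng(0)
dim = 8
d_entail = rng.normal(size=dim)
d_contra = rng.normal(size=dim)
hidden = rng.normal(size=(4, dim))  # one row per (premise, hypothesis) pair

d_pure = pure_contradiction_direction(d_contra, d_entail)
scores = contradiction_scores(hidden, d_pure)
```

By construction `d_pure` has zero dot product with the entailment direction, so a hidden state that points purely along entailment scores close to zero. The sketch assumes the two raw directions are not parallel; if they were, the residual would vanish and no pure direction would exist.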
In practice
- Use NLI cross-encoders for contradiction detection.
- Orthogonalize discriminant directions for pure contradiction.
- Implement a threshold for flagging contradictions.
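The thresholding step in the list above might look like the following sketch. The threshold value, the example claims, and the scores are all made-up placeholders for illustration, not results from the original article; `flag_contradictions` and `FlaggedClaim` are hypothetical names. In practice the threshold would presumably be tuned on labeled examples from the target domain.

```python
from dataclasses import dataclass

@dataclass
class FlaggedClaim:
    claim: str
    score: float

def flag_contradictions(claims, scores, threshold=0.5):
    """Return the claims whose contradiction score meets the threshold.
    The default of 0.5 is an arbitrary placeholder, not a recommended value."""
    if len(claims) != len(scores):
        raise ValueError("claims and scores must have the same length")
    return [
        FlaggedClaim(claim, float(score))
        for claim, score in zip(claims, scores)
        if score >= threshold
    ]

# Illustrative inputs only: the scores are invented for this example.
claims = [
    "Revenue grew 12% year over year.",
    "Revenue fell 12% year over year.",
]
scores = [0.08, 0.71]

flagged = flag_contradictions(claims, scores, threshold=0.5)
```

With these placeholder values only the second claim is flagged. Lowering the threshold catches more contradictions at the cost of more false alarms, which is the trade-off to weigh in regulated domains.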
Topics
- LLM Hallucinations
- Contradiction Detection
- Natural Language Inference
- Cross-Encoders
- Sentence Embeddings
Best for: AI Engineer, Machine Learning Engineer, AI Researcher
Editorial summary, takeaway, and curation by AIssential. Original article published by Agus’s Substack.