The Geometry of Truth: Detecting LLM Hallucinations with Geometric Algebra

2026-01-11 · Source: Agus’s Substack · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Advanced, medium

Summary

Type II hallucination, or "contradiction," occurs when a Large Language Model (LLM) generates a claim that is topically relevant to the provided context but directly conflicts with it. Type I hallucinations, by contrast, are pure fabrications. Examples of Type II include reversing financial growth percentages, altering product launch timelines, and flipping medical test results. These contradictions are particularly dangerous because they sound plausible, score high on standard similarity metrics, reverse critical facts, and can lead to compliance violations in regulated fields. Traditional sentence embeddings fail to detect them, often assigning high similarity scores because the claim and the context share entities. Detection therefore calls for a more sophisticated approach, such as Natural Language Inference (NLI) cross-encoders.

Key takeaway

AI Engineers building RAG systems or fact-checking pipelines should integrate a Type II contradiction detector to catch subtle yet critical errors. Standard similarity metrics are insufficient. Instead, use NLI cross-encoders with orthogonalized discriminant directions to identify claims that semantically oppose the source context, especially in regulated domains where a reversed fact is a high-risk compliance issue.

Key insights

Type II hallucinations contradict the provided context. They are dangerous because they sound plausible and standard similarity metrics are blind to them.

Principles

Standard embeddings fail at contradiction detection.
NLI cross-encoders capture semantic opposition.
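
The first principle can be illustrated with a toy example. The sketch below uses a bag-of-words cosine similarity as a crude stand-in for a sentence-embedding model. This is an assumption made for illustration only: the example sentences and the function name `bow_cosine` are hypothetical and do not come from the original article, and real embedding models will give different numbers. It shows how a claim that reverses a fact can still score high simply because it shares most of its tokens with the context.

```python
import math
from collections import Counter


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity between lowercase whitespace-token count vectors.

    Hypothetical helper: a lexical stand-in for an embedding model.
    """
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Illustrative sentences (not from the source): the claim reverses the fact.
context = "Revenue grew 15% in Q3 2024"
claim = "Revenue declined 15% in Q3 2024"

similarity = bow_cosine(context, claim)
print(f"{similarity:.3f}")  # 5 of 6 tokens are shared, so this prints 0.833
```

A single changed word flips the meaning, yet the score stays high. A similarity threshold alone would likely let this claim through.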

Method

The Discriminant Direction Approach uses NLI cross-encoders to classify (premise, hypothesis) pairs. It extracts a pure contradiction direction by orthogonalizing the raw contradiction direction against the entailment direction using Gram-Schmidt, then projects normalized hidden states onto this pure direction to score claims.
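
The orthogonalization and projection steps described above can be sketched with NumPy. This is a minimal sketch under stated assumptions, not the article's implementation. The summary does not say how the raw contradiction and entailment directions are obtained from the cross-encoder, so they are passed in as plain vectors. The function names and the 3-dimensional toy vectors are hypothetical.

```python
import numpy as np


def unit(v: np.ndarray) -> np.ndarray:
    """Return v scaled to unit length."""
    n = np.linalg.norm(v)
    if n == 0:
        raise ValueError("cannot normalize a zero vector")
    return v / n


def pure_contradiction_direction(d_contra: np.ndarray, d_entail: np.ndarray) -> np.ndarray:
    """One Gram-Schmidt step: remove the entailment component from the
    raw contradiction direction, then normalize what remains."""
    e = unit(d_entail)
    residual = d_contra - np.dot(d_contra, e) * e
    return unit(residual)


def contradiction_score(hidden_state: np.ndarray, pure_dir: np.ndarray) -> float:
    """Project the normalized hidden state onto the pure direction."""
    return float(np.dot(unit(hidden_state), pure_dir))


# Toy vectors (assumed for illustration; real hidden states are much larger).
d_entail = np.array([1.0, 0.0, 0.0])
d_contra = np.array([0.6, 0.8, 0.0])   # overlaps partly with entailment
hidden = np.array([3.0, 4.0, 0.0])     # stand-in for a (premise, hypothesis) state

pure = pure_contradiction_direction(d_contra, d_entail)
score = contradiction_score(hidden, pure)
print(pure, round(score, 3))
```

Because the pure direction is orthogonal to the entailment direction, a hidden state that points purely along entailment scores zero. Only the component specific to contradiction contributes to the score.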

In practice

Use NLI cross-encoders for contradiction detection.
Orthogonalize the discriminant directions to isolate a pure contradiction direction.
Set a score threshold for flagging contradictions.
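
The thresholding step might look like the sketch below. It assumes each claim already has a contradiction score from an upstream scorer. The names `ClaimResult` and `flag_contradictions`, the example claims and scores, and the default threshold of 0.5 are all hypothetical. The source does not give a threshold value, so in practice it would need tuning on labeled examples from the target domain.

```python
from dataclasses import dataclass


@dataclass
class ClaimResult:
    claim: str
    score: float
    flagged: bool


def flag_contradictions(scored_claims, threshold: float = 0.5):
    """Flag every (claim, score) pair whose score meets the threshold.

    The 0.5 default is a placeholder, not a value from the source.
    """
    return [ClaimResult(c, s, s >= threshold) for c, s in scored_claims]


# Illustrative inputs with made-up scores.
results = flag_contradictions(
    [("Revenue declined 15% in Q3", 0.82), ("Revenue grew 15% in Q3", 0.07)],
    threshold=0.5,
)
for r in results:
    print(f"{'FLAG' if r.flagged else 'ok  '} {r.score:.2f} {r.claim}")
```

A lower threshold flags more claims for review at the cost of more false alarms, which may be the safer trade-off in regulated domains.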

Topics

LLM Hallucinations
Contradiction Detection
Natural Language Inference
Cross-Encoders
Sentence Embeddings

Best for: AI Engineer, Machine Learning Engineer, AI Researcher

Editorial summary, takeaway, and curation by AIssential. Original article published by Agus’s Substack.