Type VI Hallucination: When LLMs Apply the Wrong Domain’s Rules

2026-01-11 · Source: Agus’s Substack · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Advanced, long

Summary

Type VI hallucination detection is a proposed method that addresses a gap in evaluating Large Language Model (LLM) outputs: identifying claims that are factually correct in one domain but wrong in the context where they are used. This "domain mismatch" arises when vocabulary shared across fields carries a different meaning in each, such as general Machine Learning (ML) engineering versus financial services model risk management (MRM) under the SR 11-7 supervisory guidance. The geometric approach builds an orthonormal basis for each domain's subspace from exemplar sentences, then measures how well the embedding of an LLM claim aligns with the expected domain compared with the other domains. A positive Type VI score indicates a domain mismatch, flagging outputs that may pass traditional factual checks (Types I-V) but are semantically inappropriate for the intended context.

Key takeaway

AI Engineers and Data Scientists building LLM applications in regulated or specialized fields should consider adding Type VI hallucination detection to catch contextually incorrect outputs. It matters most where shared vocabulary carries different meanings across domains (e.g., ML vs. regulatory compliance). The geometric check helps confirm that LLM outputs are not only factually accurate but also semantically appropriate for the target domain, reducing the risk of misinterpretation and non-compliance.

Key insights

Type VI hallucination detection identifies domain-specific semantic mismatches in LLM outputs using geometric embedding analysis.

Principles

Each domain occupies a distinctive region of embedding space.
Domain mismatch is quantifiable via projection onto orthonormal bases.
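
One way to read the second principle, as a sketch rather than the article's own formula: write Q_d for a matrix whose orthonormal columns span domain d's subspace, and v for a claim embedding (both symbols are assumptions introduced here). Projection splits v into an in-subspace part and a residual, so the normalized projection length is a bounded number that can be compared across domains.

```latex
\[
\|v\|^2 \;=\; \|Q_d^{\top} v\|^2 \;+\; \|v - Q_d Q_d^{\top} v\|^2,
\qquad\text{hence}\qquad
0 \;\le\; \frac{\|Q_d^{\top} v\|}{\|v\|} \;\le\; 1 .
\]
```

A value near 1 means the claim lies almost entirely inside the domain's subspace; a value near 0 means it is nearly orthogonal to it.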

Method

Build an orthonormal basis for each domain's subspace from exemplar sentence embeddings using QR decomposition. Measure a claim's "support" in each domain's subspace. The Type VI score is the claim's support in the best-matching wrong domain minus its support in the expected domain, so a positive score signals a domain mismatch.
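
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the article's implementation: the function names (`domain_basis`, `support`, `type_vi_score`) are hypothetical, "support" is assumed to be the length of the unit-normalized claim's projection onto the subspace, exemplars are assumed to be linearly independent and fewer than the embedding dimension, and the 4-dimensional toy vectors stand in for real sentence embeddings.

```python
import numpy as np

def domain_basis(exemplars: np.ndarray) -> np.ndarray:
    """Orthonormal basis (dim x k) for the span of k exemplar embeddings (k x dim).

    Assumes the exemplars are linearly independent and k <= dim.
    """
    # Reduced QR of the transpose: the columns of q span the exemplar vectors.
    q, _ = np.linalg.qr(exemplars.T)
    return q

def support(claim: np.ndarray, basis: np.ndarray) -> float:
    """Assumed definition: projection length of the unit-normalized claim, in [0, 1]."""
    v = claim / np.linalg.norm(claim)
    return float(np.linalg.norm(basis.T @ v))

def type_vi_score(claim: np.ndarray, bases: dict, expected: str) -> float:
    """Support in the best-matching wrong domain minus support in the expected one."""
    expected_support = support(claim, bases[expected])
    best_wrong = max(
        support(claim, basis) for name, basis in bases.items() if name != expected
    )
    return best_wrong - expected_support

# Toy stand-ins for sentence embeddings (hypothetical data, 4 dimensions).
bases = {
    "ml_engineering": domain_basis(np.array([[1.0, 0.0, 0.0, 0.0],
                                             [1.0, 1.0, 0.0, 0.0]])),
    "mrm_sr_11_7": domain_basis(np.array([[0.0, 0.0, 1.0, 0.0],
                                          [0.0, 0.0, 1.0, 1.0]])),
}
claim = np.array([0.1, 0.0, 0.9, 0.3])  # lies mostly in the second subspace

# Positive when the claim sits closer to a wrong domain than to the expected one.
print(type_vi_score(claim, bases, expected="ml_engineering"))
print(type_vi_score(claim, bases, expected="mrm_sr_11_7"))
```

In a real pipeline the exemplar and claim vectors would come from a sentence embedding model, and the number of exemplars per domain would determine the subspace dimension.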

In practice

Define domain exemplars from expert-written content.
Calibrate score thresholds on known good/bad examples.
Use for post-hoc auditing of LLM-assisted documentation.
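
The calibration step could look something like the sketch below. The source does not say how thresholds are chosen, so the selection rule here (maximize balanced accuracy over candidate cutoffs) is an assumption, as are the function name `calibrate_threshold` and the example scores, which are made up for illustration.

```python
import numpy as np

def calibrate_threshold(good_scores, bad_scores):
    """Pick the Type VI score cutoff that best separates labeled examples.

    good_scores: scores of claims known to be in the expected domain.
    bad_scores:  scores of claims known to be domain mismatches.
    A claim is flagged when its score is >= the returned threshold.
    """
    good = np.asarray(good_scores, dtype=float)
    bad = np.asarray(bad_scores, dtype=float)
    best_threshold, best_accuracy = None, -1.0
    # Every observed score is a candidate cutoff (np.unique returns them sorted).
    for t in np.unique(np.concatenate([good, bad])):
        flagged_bad = np.mean(bad >= t)    # mismatches correctly flagged
        passed_good = np.mean(good < t)    # in-domain claims correctly passed
        balanced = 0.5 * (flagged_bad + passed_good)
        if balanced > best_accuracy:
            best_threshold, best_accuracy = float(t), float(balanced)
    return best_threshold, best_accuracy

# Hypothetical calibration set.
threshold, accuracy = calibrate_threshold(
    good_scores=[-0.4, -0.3, -0.1],
    bad_scores=[0.05, 0.2, 0.3],
)
print(threshold, accuracy)
```

For post-hoc auditing, the chosen cutoff would then be applied to the Type VI score of each claim in a document, with flagged claims routed to a human reviewer.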

Topics

LLM Hallucination Detection
Domain Mismatch
Sentence Embeddings
QR Decomposition
Model Risk Management

Best for: Machine Learning Engineer, AI Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Agus’s Substack.