Type VI Hallucination: When LLMs Apply the Wrong Domain’s Rules

Source: Agus’s Substack · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Advanced, long

Summary

Type VI hallucination detection addresses a gap in evaluating Large Language Model (LLM) outputs: claims that are factually correct in one domain but wrong for the context in which they are used. This "domain mismatch" arises when distinct fields share vocabulary but attach different meanings to it, as with general Machine Learning (ML) engineering versus financial services model risk management (MRM) under the SR 11-7 supervisory guidance. The proposed geometric approach builds an orthonormal basis for each domain's subspace from exemplar sentences, then measures how well the embedding of an LLM claim aligns with the expected domain compared with the other domains. A positive Type VI score indicates a domain mismatch, flagging outputs that might pass traditional factual checks (Types I-V) but are semantically inappropriate for the intended context.

Key takeaway

AI Engineers and Data Scientists building LLM applications in regulated or specialized fields should consider adding Type VI hallucination detection to catch contextually incorrect outputs. It matters most where shared vocabulary carries different meanings across domains (e.g., ML vs. regulatory compliance). The geometric check helps confirm that LLM outputs are not only factually accurate but also semantically appropriate for the target domain, reducing the risk of misinterpretation and non-compliance.

Key insights

Type VI hallucination detection identifies domain-specific semantic mismatches in LLM outputs using geometric embedding analysis.

Principles

Method

Build orthonormal bases for domain subspaces from exemplar sentences using QR decomposition. Measure a claim's "support" in each domain's subspace. A Type VI score is the difference between support in the best-matching wrong domain and the expected domain.
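
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the original article's implementation. It assumes embeddings are already available as plain vectors, and it takes "support" to be the norm of the unit-normalized claim's projection onto a domain subspace, which the summary does not spell out. The function names, domain labels, and 4-dimensional toy vectors are all hypothetical.

```python
import numpy as np


def domain_basis(exemplar_embeddings):
    """Orthonormal basis (as columns) for the span of one domain's exemplars."""
    # Rows are exemplar embeddings; transpose so each exemplar is a column.
    A = np.asarray(exemplar_embeddings, dtype=float).T
    # Reduced QR: Q has orthonormal columns spanning the exemplar columns
    # (assumption: exemplars are linearly independent and fewer than the
    # embedding dimension).
    Q, _ = np.linalg.qr(A, mode="reduced")
    return Q


def support(claim_embedding, Q):
    """Assumed definition: norm of the unit claim's projection onto span(Q)."""
    v = np.asarray(claim_embedding, dtype=float)
    v = v / np.linalg.norm(v)
    # Q.T @ v holds the projection's coordinates in the orthonormal basis,
    # so its norm equals the norm of the projection itself (between 0 and 1).
    return float(np.linalg.norm(Q.T @ v))


def type_vi_score(claim_embedding, bases, expected):
    """Support in the best-matching wrong domain minus support in the expected one."""
    supports = {name: support(claim_embedding, Q) for name, Q in bases.items()}
    wrong = {name: s for name, s in supports.items() if name != expected}
    best_wrong = max(wrong, key=wrong.get)
    return supports[best_wrong] - supports[expected], best_wrong


# Toy stand-ins for sentence embeddings (hypothetical values).
bases = {
    "ml_engineering": domain_basis([[1, 0, 0, 0], [1, 1, 0, 0]]),
    "model_risk_management": domain_basis([[0, 0, 1, 0], [0, 0, 1, 1]]),
}

# A claim that lies mostly in the ML subspace, checked against an MRM context.
claim = [0.9, 0.1, 0.1, 0.0]
score, wrong_domain = type_vi_score(claim, bases, expected="model_risk_management")
print(wrong_domain, score > 0)
```

Here the claim sits almost entirely in the ML engineering subspace, so the score is positive when the expected domain is MRM. In a real pipeline the exemplar and claim vectors would come from a sentence embedding model, and the threshold above zero at which a claim is flagged would need tuning on labeled examples.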

Best for: Machine Learning Engineer, AI Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Agus’s Substack.