Layer-Resolved Optimal Transport for Hallucination Detection in NMT and Abstractive Summarization

2026-06-11 · Source: Takara TLDR - Daily AI Papers · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, quick

Summary

Layer-resolved optimal transport (OT) is extended to unsupervised hallucination detection in neural machine translation (NMT) and abstractive summarization. Analyzing all six decoder layers of the Fairseq DE-EN model (N=3,414), the researchers found that Wass-to-Unif and Wass-to-Data are complementary detectors, with detection signal concentrated in layers L1-L4. Hallucinated NMT translations notably lack an initial exploratory attention phase. For abstractive summarization, the OT detector on AggreFact (N=1,116) achieved 57.2%/57.6% balanced accuracy on CNN/XSum, below the supervised MiniCheck-Flan-T5-L baseline (69.9%/74.3%). The gap arises because an unfaithful summary can attend to the correct source tokens and still misrepresent their content, a failure mode that is invisible to concentration-based OT. Structural experiments on T5-base confirmed a consistent decoder organization, with layer 3 showing peak attention concentration and layer 12 being critical for generation quality. OT is reliable when the failure is source disengagement but limited when faithfulness problems arise downstream of attention.
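The detector results above are reported as balanced accuracy, which is the mean of the recall on each class and is therefore less flattering than plain accuracy when one class dominates. A minimal sketch of the metric follows; the toy labels are invented for illustration and are not data from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of recall on the positive class and recall on the negative class."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present in y_true")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


# Toy labels (illustrative only): 1 = unfaithful, 0 = faithful.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]

score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"balanced accuracy = {score:.3f}, plain accuracy = {plain:.3f}")
```

Under this metric a detector that always predicts the majority class scores 50%, which is the baseline against which the reported 57.2%/57.6% should be read.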

Key takeaway

Machine learning engineers evaluating NMT or abstractive summarization models should note that layer-resolved optimal transport is effective at identifying hallucinations that stem from source disengagement. It is fundamentally limited, however, in detecting unfaithful summaries that misrepresent content despite attending to the correct source tokens. Consider combining OT with other faithfulness metrics to cover a broader range of failure modes.

Key insights

Optimal transport detects hallucinations caused by source disengagement, but not those caused by content misrepresentation.

Principles

Wass-to-Unif and Wass-to-Data are complementary hallucination detectors.
Hallucination detection is layer-dependent in NMT decoders.
Unfaithful summaries can attend correctly while misrepresenting content.
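The summary names Wass-to-Data without defining it. One plausible reading, sketched here purely as an assumption, scores a sample by its distance to the nearest attention-mass distributions collected from held-out outputs that are believed to be faithful. The function names, the unit-spacing ground cost between source positions, the choice of `k`, and the requirement that references share the sample's source length are all hypothetical choices made for this illustration.

```python
import numpy as np


def wasserstein_1d(p, q):
    """W1 distance between two distributions on positions 0..n-1.

    Assumes a ground cost of |i - j| between source positions, in which
    case W1 equals the summed absolute difference of the two CDFs.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())


def wass_to_data(mass, reference_masses, k=2):
    """Mean of the k smallest distances to held-out reference distributions.

    `reference_masses` is assumed to hold distributions of the same length
    as `mass`, taken from outputs judged to be faithful.
    """
    dists = sorted(wasserstein_1d(mass, ref) for ref in reference_masses)
    if not dists:
        raise ValueError("at least one reference distribution is required")
    return float(np.mean(dists[: min(k, len(dists))]))


# Toy reference set and samples (illustrative only).
refs = [
    [0.25, 0.25, 0.25, 0.25],
    [0.40, 0.30, 0.20, 0.10],
    [0.10, 0.20, 0.30, 0.40],
]
typical = [0.3, 0.3, 0.2, 0.2]      # resembles the references
collapsed = [0.0, 0.0, 0.0, 1.0]    # all mass on the final source token

print(wass_to_data(typical, refs), wass_to_data(collapsed, refs))
```

A data-derived reference of this kind can flag samples that look unusual relative to real outputs even when they are not far from uniform, which is one way the two detectors could complement each other.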

Method

Optimal transport measures the geometric distance between a cross-attention distribution and a reference distribution to detect source disengagement in NMT and summarization.
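The distance-to-reference idea can be sketched for the uniform reference case, computed separately for each decoder layer. This is a minimal illustration under stated assumptions rather than the authors' implementation: the input is assumed to be a `(layers, target_len, source_len)` array of cross-attention weights already averaged over heads, the ground cost between source positions is assumed to be `|i - j|`, and all function names are hypothetical.

```python
import numpy as np


def source_attention_mass(cross_attn):
    """Collapse a (target_len, source_len) attention matrix into one
    normalized distribution over source tokens."""
    mass = np.asarray(cross_attn, dtype=float).sum(axis=0)
    return mass / mass.sum()


def wasserstein_1d(p, q):
    """W1 distance on positions 0..n-1 with |i - j| ground cost (assumed),
    computed as the summed absolute difference of the two CDFs."""
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())


def wass_to_unif_per_layer(layer_attn):
    """Return one distance-to-uniform score per decoder layer."""
    scores = []
    for attn in layer_attn:
        mass = source_attention_mass(attn)
        uniform = np.full(mass.size, 1.0 / mass.size)
        scores.append(wasserstein_1d(mass, uniform))
    return scores


# Two toy "layers" (illustrative only): one spreads attention evenly over
# the source, the other puts every target step on the final source token.
engaged = np.eye(4)
collapsed = np.zeros((4, 4))
collapsed[:, -1] = 1.0

scores = wass_to_unif_per_layer(np.stack([engaged, collapsed]))
print(scores)
```

Keeping the scores per layer, rather than averaging them, is what allows a layer-resolved analysis to ask which layers carry the detection signal.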

In practice

Apply OT to cross-attention for NMT hallucination detection.
Use OT as an interpretability tool for attention mechanisms.
Recognize OT's limits when faithfulness failures are downstream of attention.

Topics

Neural Machine Translation
Abstractive Summarization
Hallucination Detection
Optimal Transport
Cross-Attention
Model Interpretability

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.