Don't ignore omissions!

2026-02-11 · Source: Ehud Reiter's Blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Intermediate, short

Summary

The NLP community pays far too little research attention to omissions in text generated by Large Language Models (LLMs), despite their real-world impact in high-stakes domains such as medicine and law. "Hallucinations" (false information) receive extensive attention, while "omissions" (missing important information) are largely overlooked: ACL 2025 shows 96 papers on hallucination versus 0 on omission, and EMNLP 2025 shows 64 versus 1. Wu et al. (2025) found that in 76% of the LLM-generated responses that caused serious harm in medical cases, the harm came from omissions rather than hallucinations. Omissions also pose significant risks in machine translation, risk reporting, summarization, weather forecasts, and coding assistants. Detecting omissions is harder than detecting hallucinations: it often requires domain-specific knowledge or a "gold standard" list of required content, and such lists are less common for summarization tasks.

Key takeaway

AI Architects and Research Scientists evaluating LLM safety in critical applications such as healthcare or legal tech should prioritize robust omission detection. Accuracy-focused benchmarks alone are insufficient and likely underestimate real-world risk. Incorporate domain expert review or "gold standard" content lists into evaluation protocols to check that all vital information is present; in many contexts, the consequences of omissions can outweigh those of hallucinations.

Key insights

LLM omissions are a critical, under-researched problem, especially in high-stakes domains like medicine.

Principles

Accuracy benchmarks underestimate LLM risks.
Domain knowledge is crucial for detecting omissions.
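
To illustrate why an accuracy-only score can hide omissions, here is a minimal sketch. It assumes each response has already been reduced to a set of atomic facts and that a domain expert has supplied the reference set. The function names and the toy medical facts are invented for illustration and are not taken from the original article.

```python
def factual_precision(generated: set[str], reference: set[str]) -> float:
    """Share of generated facts supported by the reference (penalizes hallucinations)."""
    if not generated:
        return 1.0
    return len(generated & reference) / len(generated)


def content_recall(generated: set[str], reference: set[str]) -> float:
    """Share of required reference facts that appear in the output (penalizes omissions)."""
    if not reference:
        return 1.0
    return len(generated & reference) / len(reference)


# Hypothetical expert-provided reference facts for one case.
reference_facts = {
    "diagnosis: pneumonia",
    "prescribe antibiotic",
    "allergy: penicillin",
    "follow-up in 48 hours",
}

# Hypothetical facts extracted from a model response: all true, but incomplete.
generated_facts = {"diagnosis: pneumonia", "prescribe antibiotic"}

precision = factual_precision(generated_facts, reference_facts)
recall = content_recall(generated_facts, reference_facts)
omitted = sorted(reference_facts - generated_facts)

print(f"precision={precision:.2f} recall={recall:.2f}")
print("omitted:", omitted)
```

In this toy case nothing in the response is false, so a hallucination-focused metric reports a perfect score, while the recall-style measure exposes that half of the required content, including the allergy, is missing.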

Method

Detecting omissions often involves comparing generated text against a "gold standard" list of required content or using domain experts to identify missing key information.
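
The checklist comparison described above could be sketched roughly as follows. This is an assumption-laden illustration, not the method from the original article: `RequiredItem`, `find_omissions`, and the example checklist are hypothetical, and matching is done by simple normalized phrase lookup, which will miss paraphrases and ignores negation. In practice the matching step would likely need domain experts or a stronger semantic matcher.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RequiredItem:
    """One entry in a hypothetical gold-standard content checklist."""
    name: str
    phrases: tuple[str, ...]  # acceptable surface forms for this item
    critical: bool = False    # True if omitting the item could cause serious harm


def normalize(text: str) -> str:
    """Lowercase and collapse every run of non-alphanumerics to a single space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def find_omissions(generated_text: str, checklist: list[RequiredItem]) -> list[RequiredItem]:
    """Return checklist items for which none of the phrases occurs in the text."""
    haystack = f" {normalize(generated_text)} "
    missing = []
    for item in checklist:
        # Pad with spaces so that phrases only match on whole-word boundaries.
        if not any(f" {normalize(p)} " in haystack for p in item.phrases):
            missing.append(item)
    return missing


# Hypothetical checklist a clinician might write for one discharge note.
checklist = [
    RequiredItem("diagnosis", ("pneumonia",)),
    RequiredItem("penicillin allergy", ("penicillin allergy", "allergic to penicillin"), critical=True),
    RequiredItem("follow-up", ("follow-up", "follow up")),
]

text = "The patient has pneumonia. Start antibiotics and follow-up in 48 hours."
missing = find_omissions(text, checklist)

for item in missing:
    label = "CRITICAL" if item.critical else "minor"
    print(f"omitted ({label}): {item.name}")
```

Flagging items as critical lets an evaluation report separate omissions that are merely incomplete from those that could cause harm, which is the distinction the medical findings above turn on.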

In practice

Prioritize omission detection in medical LLM deployments.
Develop domain-specific content checklists for evaluation.

Topics

LLM Omissions
Hallucination Detection
Medical NLP
NLG Evaluation
AI Safety

Best for: AI Architect, AI Scientist, Research Scientist, AI Researcher, NLP Engineer, AI Product Manager

Editorial summary, takeaway, and curation by AIssential. Original article published by Ehud Reiter's Blog.