Mechanisms of Prompt-Induced Hallucination in Vision-Language Models
Summary
A study by Rudman et al. investigates prompt-induced hallucinations (PIH) in large vision-language models (VLMs), in which a model follows the textual prompt even when it conflicts with the visual evidence. Using a controlled object-counting task with misaligned prompts (e.g., asking for four waterlilies when only three are present), the researchers found that VLMs often correct the overestimate at low object counts but increasingly conform to the prompt as the number of objects grows, even when the discrepancy is large. Through mechanistic analysis of three VLMs (LLaVA-OneVision-7B, Qwen2-VL-7B, and Janus-Pro-7B), the study identified a small set of attention heads, termed PIH-heads, whose ablation reduces hallucinations by at least 40% without additional training. These PIH-heads, located primarily in early language model layers, mediate prompt copying. Ablating them increases reliance on visual evidence, and the effect generalizes beyond counting to tasks such as color identification, with up to a 94.25% reduction in prompt-color copying.
Key takeaway
For AI engineers and research scientists developing or deploying VLMs, understanding and mitigating prompt-induced hallucinations is critical for reliability. Consider targeted attention head ablations, particularly in early language model layers, to reduce text-over-vision bias. This approach can improve visual grounding and factual accuracy without additional training, making models more robust in real-world applications where inputs may be noisy.
Key insights
VLMs hallucinate by prioritizing text over vision, a behavior traceable to specific, ablatable attention heads.
Principles
- PIH increases with object count.
- PIH-heads are concentrated in early LM layers.
- Ablation shifts reliance to visual evidence.
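The first principle can be checked by measuring how often a model copies a misaligned prompt at each true object count. The sketch below is a minimal illustration, not the study's evaluation code: the record layout `(true_count, prompted_count, predicted_count)` and the function name are assumptions made for this example.

```python
from collections import defaultdict


def prompt_copy_rate_by_count(records):
    """Return {true_count: fraction of misaligned prompts the model copied}.

    Each record is a hypothetical (true_count, prompted_count, predicted_count)
    tuple. Records whose prompt matches the image are skipped, because they
    cannot show prompt-induced hallucination.
    """
    totals = defaultdict(int)
    copies = defaultdict(int)
    for true_count, prompted_count, predicted_count in records:
        if prompted_count == true_count:
            continue  # aligned prompt: nothing to hallucinate
        totals[true_count] += 1
        if predicted_count == prompted_count:
            copies[true_count] += 1  # model echoed the wrong prompt
    return {count: copies[count] / totals[count] for count in sorted(totals)}
```

A copy rate that rises with the true count would match the pattern described above.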
Method
Identify PIH-mediating attention heads via mean ablation, then ablate these heads to reduce prompt-induced hallucinations and enhance visual grounding.
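Mean ablation replaces a head's output with its average output over a set of reference inputs, which removes the input-specific information the head carries while keeping activations in a typical range. The sketch below shows the operation on plain Python lists. It assumes per-head output vectors have already been extracted from the model; the function names and data layout are hypothetical and not taken from the study.

```python
from statistics import fmean


def mean_head_outputs(reference_runs):
    """Average each head's output vector over reference inputs.

    reference_runs: list of runs, where each run is a list of per-head
    output vectors (lists of floats). Returns one mean vector per head.
    """
    num_heads = len(reference_runs[0])
    dim = len(reference_runs[0][0])
    return [
        [fmean(run[h][d] for run in reference_runs) for d in range(dim)]
        for h in range(num_heads)
    ]


def mean_ablate(head_outputs, head_means, heads_to_ablate):
    """Return a copy of head_outputs with selected heads set to their means."""
    ablated = set(heads_to_ablate)
    return [
        list(head_means[h]) if h in ablated else list(vec)
        for h, vec in enumerate(head_outputs)
    ]
```

In a real model the same substitution would typically be applied inside the attention layer during the forward pass, before the head outputs are projected and combined.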
In practice
- Ablate PIH-heads to reduce VLM hallucinations.
- Focus on early LM layers for intervention.
- Test ablation effects across diverse tasks.
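One way to choose which heads to ablate is to score each candidate by how much its removal lowers the prompt-copy rate. The sketch below assumes a caller-supplied function, here called `copy_rate_with_ablation`, that runs an evaluation with a given set of heads ablated and returns the copy rate. That function, the `(layer, head)` tuples, and the one-head-at-a-time ranking are illustrative assumptions, not the study's selection procedure.

```python
def rank_heads_by_effect(candidate_heads, copy_rate_with_ablation, top_k):
    """Rank heads by the drop in prompt-copy rate when each is ablated alone.

    candidate_heads: iterable of hashable head identifiers, e.g. (layer, head).
    copy_rate_with_ablation: hypothetical callable taking a frozenset of heads
        and returning the measured prompt-copy rate with those heads ablated.
    Returns the top_k (head, reduction) pairs, largest reduction first.
    """
    baseline = copy_rate_with_ablation(frozenset())
    effects = []
    for head in candidate_heads:
        reduction = baseline - copy_rate_with_ablation(frozenset([head]))
        effects.append((head, reduction))
    effects.sort(key=lambda item: item[1], reverse=True)
    return effects[:top_k]
```

Running the same ranking on a second task, such as color identification, is a simple check of whether the selected heads generalize.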
Topics
- Vision-Language Models
- Prompt-Induced Hallucinations
- Attention Head Ablation
- Mechanistic Interpretability
- Object Counting Task
Best for: AI Engineer, Computer Vision Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.