Trust, but Don't Verify: Epistemic Blind Spots in LLM Source Evaluation
Summary
The study finds that large language models (LLMs) such as Claude, Qwen, and OLMo exhibit a critical vulnerability when synthesizing information from multiple sources. Across five models and three professional domains (venture capital, marketing, public health), LLMs reliably detect fabricated statistics in isolation (correct identification rates of 0.76–1.00) but fail to apply this capability during multi-source synthesis. How much a source influences the output is governed primarily by a "methodology-register gate" that responds to the stylistic presentation of analytical text, not to the numeric validity of its claims. Mechanistic analyses, including causal tracing and linear probes, confirm that models encode methodological register as a domain-general representation (probe AUC 0.83–0.92), while numeric-validity signals are suppressed to chance level during synthesis. Prompting mitigations, even oracle checklists, induce only blanket skepticism, not selective discernment. This "epistemic alignment" gap means LLMs trust sources based on apparent credibility, not internal consistency.
Key takeaway
If you are an AI scientist or machine learning engineer deploying LLMs in critical decision-making contexts, recognize that current models are susceptible to "epistemic blind spots." Your LLMs will prioritize the appearance of methodological rigor over the substance of numeric validity, even though they can detect fabrications in isolation. This vulnerability is amplified when a source lacks social consensus. Implement robust human-in-the-loop verification for quantitative data synthesis, because prompting alone fails to induce selective discernment.
Key insights
LLMs detect statistical fabrications in isolation but ignore them during multi-source synthesis, prioritizing stylistic credibility.
Principles
- LLM source influence is gated by methodology register, not numeric validity.
- Social consensus attenuates the influence of source presentation.
- Post-training pipelines reinforce stylistic shortcuts over numeric verification.
Method
The researchers ran factorial behavioral experiments across five LLM families and three professional domains. For the mechanistic analysis, they used linear probes, causal tracing, and component-level attribution.
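To make the probe AUC figures in the summary concrete, here is a minimal sketch of how a linear probe's AUC could be computed. It is not the study's code: the probe weights, the two-dimensional "activations", and the labels are hypothetical toy values, and the helper names `probe_score` and `roc_auc` are assumptions introduced only for illustration.

```python
def probe_score(weights, bias, activation):
    """Linear probe: a dot product of the activation with the weights, plus a bias."""
    return sum(w * a for w, a in zip(weights, activation)) + bias


def roc_auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: label 1 = "formal methodology register", 0 = "casual".
activations = [
    [2.0, 0.0], [1.0, 1.0], [0.2, 0.0],   # label 1
    [0.0, 1.0], [0.4, 0.0], [-1.0, 0.0],  # label 0
]
labels = [1, 1, 1, 0, 0, 0]
weights, bias = [1.0, -0.5], 0.0  # assumed probe parameters, not fitted here

scores = [probe_score(weights, bias, a) for a in activations]
auc = roc_auc(labels, scores)
print(f"probe AUC = {auc:.3f}")
```

An AUC near 1.0 means the probed property is linearly readable from the activations, while an AUC near 0.5 corresponds to the "chance" level the summary reports for numeric-validity signals during synthesis.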
In practice
- Test LLMs for "epistemic alignment" by presenting conflicting sources, some containing fabricated statistics, and checking whether the synthesis discounts them.
- Avoid relying on LLMs for critical synthesis of quantitative data without human verification.
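The first check above can be sketched as a small 2x2 factorial harness that crosses methodology register with numeric validity and compares the two main effects. This is an illustrative sketch under stated assumptions, not the study's protocol: the condition names, the `influence` measure (imagined here as the fraction of syntheses that adopt the source's claim), and `stub_influence` are all hypothetical, and the stub stands in for real model calls.

```python
from itertools import product
from statistics import mean

REGISTERS = ("formal", "casual")    # assumed levels for methodology register
VALIDITY = ("valid", "fabricated")  # assumed levels for numeric validity


def build_conditions():
    """Cross every register level with every validity level (2x2 factorial)."""
    return [{"register": r, "validity": v} for r, v in product(REGISTERS, VALIDITY)]


def stub_influence(condition):
    """Hypothetical stand-in for a model run. It mimics the reported failure
    mode: influence depends on register and ignores numeric validity."""
    return 0.8 if condition["register"] == "formal" else 0.3


def main_effects(results):
    """Difference in mean influence between the two levels of each factor."""
    def avg(key, value):
        return mean(r["influence"] for r in results if r[key] == value)

    return {
        "register_effect": avg("register", "formal") - avg("register", "casual"),
        "validity_effect": avg("validity", "valid") - avg("validity", "fabricated"),
    }


results = [dict(c, influence=stub_influence(c)) for c in build_conditions()]
effects = main_effects(results)
print(effects)
```

In a real evaluation, `stub_influence` would be replaced by a measurement taken from the model's synthesized output. A large register effect alongside a validity effect near zero would be the signature of the blind spot described here; a model with selective discernment should show the opposite pattern.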
Topics
- LLM Epistemic Alignment
- Source Evaluation
- Statistical Fabrication
- Causal Tracing
- Linear Probes
- Multi-source Synthesis
Best for: Research Scientist, CTO, VP of Engineering/Data, AI Scientist, Machine Learning Engineer, AI Ethicist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.