Trust, but Don't Verify: Epistemic Blind Spots in LLM Source Evaluation

2026-06-06 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, AI Ethics & Safety · Depth: Expert, extended

Summary

The study "Trust, but Don't Verify: Epistemic Blind Spots in LLM Source Evaluation" reveals that large language models (LLMs) like Claude, Qwen, and OLMo exhibit a critical vulnerability when synthesizing information from multiple sources. Across five models and three professional domains (venture capital, marketing, public health), LLMs reliably detect fabricated statistics in isolation (correct identification rates of 0.76–1.00) but fail to apply this capability during multi-source synthesis. Source influence is primarily governed by a "methodology-register gate" that responds to the stylistic presentation of analytical text, not the numeric validity of claims. Mechanistic analyses, including causal tracing and linear probes, confirm that models encode methodological register as a domain-general representation (probe AUC 0.83–0.92), while numeric-validity signals are suppressed to chance during synthesis. Prompting mitigations, even oracle checklists, only induce blanket skepticism, not selective discernment. This "epistemic alignment" gap means LLMs trust sources based on apparent credibility, not internal consistency.

Key takeaway

If you are an AI scientist or machine learning engineer deploying LLMs in critical decision-making contexts, treat current models as subject to "epistemic blind spots." They prioritize the appearance of methodological rigor over the substance of numeric validity, even though they can detect the same fabrications in isolation. The effect is stronger when a source is not accompanied by social consensus. Put human-in-the-loop verification in place for quantitative data synthesis, because prompting alone fails to induce selective discernment.

Key insights

LLMs detect statistical fabrications in isolation but ignore them during multi-source synthesis, prioritizing stylistic credibility.

Principles

LLM source influence is gated by methodology register, not numeric validity.
Social consensus attenuates the influence of source presentation.
Post-training pipelines reinforce stylistic shortcuts over numeric verification.

Method

Researchers conducted factorial behavioral experiments across five LLM families and three professional domains, using linear probes, causal tracing, and component-level attribution for mechanistic analysis.
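To make the linear-probe part of the method concrete, here is a minimal sketch of how a probe AUC can be computed. It is not the study's code: the activations are synthetic Gaussian vectors rather than real model hidden states, and the names fit_linear_probe, auc, make_synthetic_activations, and probe_auc are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def fit_linear_probe(X, y, lr=0.1, steps=500):
    """Fit a logistic-regression probe with plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probability of label 1
        grad = p - y                            # gradient of the log loss w.r.t. the logits
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def auc(scores, labels):
    """AUC: chance that a random positive outscores a random negative (ties count half)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

def make_synthetic_activations(n=400, dim=32, signal=1.0, seed=0):
    """ASSUMPTION: stand-in for hidden states. The label shifts the vectors
    along one random direction; signal=0.0 means the label is not encoded."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    X = rng.normal(size=(n, dim)) + signal * np.outer(2 * y - 1, direction)
    return X, y.astype(float)

def probe_auc(signal, seed=0):
    """Train on the first half of the data, report AUC on the held-out half."""
    X, y = make_synthetic_activations(signal=signal, seed=seed)
    half = len(y) // 2
    w, b = fit_linear_probe(X[:half], y[:half])
    return auc(X[half:] @ w + b, y[half:])

if __name__ == "__main__":
    print("encoded feature:", round(probe_auc(2.0), 3))
    print("absent feature: ", round(probe_auc(0.0), 3))
```

A feature that is linearly encoded yields a held-out AUC well above 0.5, while a feature that is absent yields an AUC near chance. That is the kind of contrast the summary describes between methodological register and numeric validity during synthesis.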

In practice

Test LLMs for "epistemic alignment" by presenting conflicting sources with fabricated statistics.
Avoid relying on LLMs for critical synthesis of quantitative data without human verification.
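One way to act on the testing advice above is a small 2x2 factorial check that crosses source register with numeric validity. The sketch below is an assumption-laden illustration, not the study's protocol: the "adoption rate" metric, the cell labels, and the helpers build_conditions and main_effects are hypothetical, and the example numbers are made up to show the arithmetic.

```python
from itertools import product
from statistics import mean

REGISTERS = ("methodological", "plain")  # stylistic presentation of the source
VALIDITY = ("valid", "fabricated")       # whether its statistics hold up

def build_conditions():
    """Enumerate the 2x2 factorial cells as (register, validity) pairs."""
    return list(product(REGISTERS, VALIDITY))

def main_effects(adoption):
    """Compute the two main effects from per-cell adoption rates.

    adoption maps (register, validity) to the fraction of trials in which
    the model's synthesis adopted that source's claim (hypothetical metric).
    """
    register_effect = mean(
        adoption[("methodological", v)] - adoption[("plain", v)] for v in VALIDITY
    )
    validity_effect = mean(
        adoption[(r, "valid")] - adoption[(r, "fabricated")] for r in REGISTERS
    )
    return register_effect, validity_effect

# Made-up adoption rates for illustration only; NOT results from the study.
example = {
    ("methodological", "valid"): 0.80,
    ("methodological", "fabricated"): 0.78,
    ("plain", "valid"): 0.40,
    ("plain", "fabricated"): 0.38,
}

if __name__ == "__main__":
    register_effect, validity_effect = main_effects(example)
    print(f"register effect: {register_effect:.2f}")
    print(f"validity effect: {validity_effect:.2f}")
```

In this illustrative table the register effect is large and the validity effect is close to zero, which is the pattern to watch for: the model's uptake tracks how a source is written rather than whether its numbers are sound. Filling the table requires running your own model on prompts for each cell and scoring the outputs, which this sketch deliberately leaves out.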

Topics

LLM Epistemic Alignment
Source Evaluation
Statistical Fabrication
Causal Tracing
Linear Probes
Multi-source Synthesis

Best for: Research Scientist, CTO, VP of Engineering/Data, AI Scientist, Machine Learning Engineer, AI Ethicist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.