Authority, Truth, and Citation Bias: A Large-Scale Multi-Domain Benchmark for Studying Epistemic Susceptibility in Large Language Models

2026-06-11 · Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, quick

Summary

AuthorityBench, a 220,564-prompt multi-domain benchmark, investigates how the presence of a citation affects Large Language Model (LLM) epistemic behavior, independent of factual content. It uses a 2x2 factorial design that crosses claim veracity with citation veracity across four domains: general knowledge, science, law, and medicine. The benchmark includes 40 prompt templates, four venue prestige tiers, and country-coded author names. Evaluating seven models, the researchers found that citations, real or fabricated, consistently increase hallucination rates compared to a no-citation baseline. The effect is most pronounced when fabricated citations accompany true claims: hallucination rates rise by 3 to 22 percentage points and reach 35 to 77% in general knowledge. Legal claims were more robust, while venue prestige and author demographics had negligible impact.

Key takeaway

NLP engineers deploying LLMs in citation-augmented systems should recognize that the presence of a citation, regardless of its veracity, can significantly increase hallucination rates. Systems are particularly vulnerable when fabricated citations accompany factually true claims, which raised hallucination rates by 3 to 22 percentage points in the benchmark. Add verification layers that go beyond checking whether a citation is present, especially in general knowledge domains, to reduce epistemic susceptibility and protect factual integrity.

Key insights

Citations, even fabricated ones, consistently increase LLM hallucination rates, especially when a fabricated citation accompanies a true claim.

Principles

Citation presence can induce LLM hallucination.
Fabricated citations with true claims are highly problematic.
Domain specificity impacts LLM susceptibility to bias.

Method

AuthorityBench uses a 2x2 factorial design, crossing claim veracity with citation veracity across four domains, 40 prompt templates, and varying venue prestige.
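
To make the factorial structure concrete, the sketch below enumerates the four cells of a 2x2 design and crosses them with domain and venue prestige tier. It is an illustration only, not the benchmark's generation code: the label strings, the tier numbering, and the function names are assumptions, and the no-citation baseline, prompt templates, and author names are left out.

```python
from itertools import product

# Hypothetical labels: the summary names these factors, not these identifiers.
CLAIM_VERACITY = ("true_claim", "false_claim")
CITATION_VERACITY = ("real_citation", "fabricated_citation")
DOMAINS = ("general_knowledge", "science", "law", "medicine")
PRESTIGE_TIERS = (1, 2, 3, 4)  # four venue prestige tiers; numbering is assumed


def factorial_cells():
    """Return the four cells of the 2x2 design (claim veracity x citation veracity)."""
    return list(product(CLAIM_VERACITY, CITATION_VERACITY))


def condition_grid():
    """Cross the 2x2 cells with domain and venue prestige tier."""
    return [
        {"claim": claim, "citation": citation, "domain": domain, "prestige_tier": tier}
        for (claim, citation), domain, tier in product(
            factorial_cells(), DOMAINS, PRESTIGE_TIERS
        )
    ]


if __name__ == "__main__":
    for cell in factorial_cells():
        print(cell)
    print(len(condition_grid()), "conditions in this illustrative grid")
```

The cell of most interest in the findings above is ("true_claim", "fabricated_citation"). The grid size here says nothing about how the 220,564 prompts are distributed, which the summary does not specify.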

In practice

Scrutinize LLM outputs even with citations.
Be wary of fabricated citations on true claims.
Consider domain robustness for LLM deployment.
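
One way to act on these points is to check whether a model's verdict on a claim changes when a citation is attached. The sketch below is a minimal illustration under stated assumptions, not a method from the paper: ask_model is a hypothetical callable that wraps whatever LLM client you use, and the prompt wording and exact-match comparison are placeholders.

```python
from typing import Callable, Dict, Union


def citation_sensitivity(
    ask_model: Callable[[str], str],  # hypothetical wrapper around an LLM client
    claim: str,
    citation: str,
) -> Dict[str, Union[str, bool]]:
    """Ask about the same claim with and without a citation and compare verdicts."""
    question = f"Is the following claim true? Answer yes or no. Claim: {claim}"
    baseline = ask_model(question)
    with_citation = ask_model(f"{question} (Source: {citation})")
    return {
        "baseline": baseline,
        "with_citation": with_citation,
        # Crude comparison; real answers would need proper normalization.
        "changed": baseline.strip().lower() != with_citation.strip().lower(),
    }


# Stub models for demonstration only; they stand in for a real LLM call.
def swayed_stub(prompt: str) -> str:
    return "yes" if "Source:" in prompt else "no"


def stable_stub(prompt: str) -> str:
    return "no"


if __name__ == "__main__":
    print(citation_sensitivity(swayed_stub, "An example claim.", "An example citation"))
    print(citation_sensitivity(stable_stub, "An example claim.", "An example citation"))
```

A verdict that flips when only the citation changes is a signal to route the output to independent verification rather than trusting the citation itself.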

Topics

Large Language Models
Citation Bias
Hallucination Rates
Epistemic Susceptibility
Benchmark Datasets
AuthorityBench

Code references

floating-reeds/AuthorityBench

Best for: Research Scientist, CTO, VP of Engineering/Data, AI Scientist, NLP Engineer, AI Ethicist

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.