Competing Biases underlie Overconfidence and Underconfidence in LLMs

2026-04-22 · Source: Nature Machine Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, extended

Summary

A Nature Machine Intelligence study published April 22, 2026, reports that large language models (LLMs) exhibit two competing biases that shape their confidence and decision-making: a choice-supportive bias and an overweighting of contradictory information. The researchers used a two-stage experimental paradigm with models including Gemma 3 12B, Gemma 3 27B, GPT-4o, and Llama 70B Instruct, testing their responses to factual queries and reasoning tasks. The choice-supportive bias leads LLMs to inflate their confidence and keep their initial answer when that answer is visible, reducing their tendency to change answers by 71% in some conditions. Conversely, LLMs update their confidence disproportionately in response to opposing advice, deviating from optimal Bayesian reasoning. Together these biases, identified across diverse models and datasets, explain a seemingly paradoxical pattern: overconfidence in initial choices and underconfidence when faced with strong contradictory evidence.

Key takeaway

If you develop or deploy LLMs in high-stakes applications, understanding these biases is crucial for improving reliability. Account for the choice-supportive bias by designing prompt interfaces that do not inadvertently reinforce the model's initial response, and mitigate the overweighting of contradictory advice with robust confidence calibration and uncertainty quantification. Doing so should improve the transparency and trustworthiness of LLM decision-making.

Key insights

LLMs exhibit both a choice-supportive bias and hypersensitivity to contradictory information, leading to overconfidence in their initial answers and underconfidence once those answers are challenged.

Principles

Self-consistency preservation drives choice-supportive bias.
Hypersensitivity to contradiction causes disproportionate updating.

Method

A two-stage, stateless querying paradigm was used to isolate the effects of initial answer visibility and advice type on LLM confidence and change-of-mind rates, comparing observed behavior to a Bayesian ideal observer.
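
To make the Bayesian ideal observer concrete, here is a minimal sketch in Python. It assumes a binary choice, an adviser with a known accuracy, and advice that is independent of the model's first answer. The function names (bayes_posterior, ideal_change_of_mind) are hypothetical and are not taken from the paper or its code repository.

```python
def bayes_posterior(prior_conf: float, advice_accuracy: float, advice_agrees: bool) -> float:
    """Posterior probability that the initial answer is correct after one piece of advice.

    Assumptions (illustrative, not from the paper): two answer options, and an
    adviser who is right with probability `advice_accuracy` regardless of the
    model's initial answer.
    """
    if not (0.0 < prior_conf < 1.0 and 0.0 < advice_accuracy < 1.0):
        raise ValueError("probabilities must lie strictly between 0 and 1")

    # Likelihood of seeing this advice if the initial answer is correct / wrong.
    if advice_agrees:
        like_if_correct, like_if_wrong = advice_accuracy, 1.0 - advice_accuracy
    else:
        like_if_correct, like_if_wrong = 1.0 - advice_accuracy, advice_accuracy

    numerator = prior_conf * like_if_correct
    return numerator / (numerator + (1.0 - prior_conf) * like_if_wrong)


def ideal_change_of_mind(prior_conf: float, advice_accuracy: float, advice_agrees: bool) -> bool:
    """An ideal observer switches only when the posterior drops below 0.5."""
    return bayes_posterior(prior_conf, advice_accuracy, advice_agrees) < 0.5


if __name__ == "__main__":
    # Opposing advice from a 70%-accurate adviser exactly cancels a 70% prior.
    print(bayes_posterior(0.7, 0.7, advice_agrees=False))
    # Supporting advice from the same adviser raises confidence.
    print(bayes_posterior(0.7, 0.7, advice_agrees=True))
```

Observed confidence and change-of-mind rates can then be compared against these values: a model that stays above the posterior shows overconfidence, and one that drops below it shows underconfidence.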

In practice

LLMs change their minds less often when their initial answers are visible.
Opposing advice is weighted 2-3 times more strongly than supportive advice.
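
One way to picture the asymmetric weighting is as a log-odds update in which opposing evidence is multiplied by a larger weight than supportive evidence. The sketch below is illustrative only: the function weighted_update, its parameters, and the default w_oppose=2.5 (chosen from within the 2-3x range stated above) are assumptions, not the paper's fitted model. Setting both weights to 1.0 recovers a standard Bayesian update for a binary choice with an adviser of known accuracy.

```python
import math


def logit(p: float) -> float:
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def weighted_update(
    prior_conf: float,
    advice_accuracy: float,
    advice_agrees: bool,
    w_support: float = 1.0,  # hypothetical weight on supportive advice
    w_oppose: float = 2.5,   # hypothetical weight on opposing advice
) -> float:
    """Confidence in the initial answer after advice, with asymmetric evidence weights."""
    evidence = logit(advice_accuracy)  # strength of the advice in log-odds
    if advice_agrees:
        shift = w_support * evidence
    else:
        shift = -w_oppose * evidence
    return sigmoid(logit(prior_conf) + shift)


if __name__ == "__main__":
    bayes = weighted_update(0.7, 0.7, advice_agrees=False, w_oppose=1.0)
    biased = weighted_update(0.7, 0.7, advice_agrees=False)
    # The overweighted update falls further below the Bayesian value.
    print(f"Bayesian: {bayes:.3f}, overweighted: {biased:.3f}")
```

With these illustrative weights, the same opposing advice pushes confidence well below the Bayesian posterior, which is the underconfidence pattern described above.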

Topics

LLM Confidence Dynamics
Choice-Supportive Bias
Contradictory Information Overweighting
Bayesian Decision-Making
Model Calibration

Code references

dharshsky/llm-confidence-biases

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Nature Machine Intelligence.