Language Models Exhibit Inconsistent Biases Towards Algorithmic Agents and Human Experts
Summary
Eight large language models (LLMs) were evaluated for their biases towards human experts versus algorithmic agents in decision-making tasks, using experimental paradigms from behavioral economics. The study employed two task presentations: stated preferences, involving direct queries about trust, and revealed preferences, where LLMs placed incentivized bets based on in-context performance examples. When directly asked to rate trustworthiness, LLMs consistently gave higher ratings to human experts, mirroring human algorithm aversion. However, in revealed preference tasks, LLMs disproportionately selected the algorithm, even when its performance was demonstrably inferior. These findings indicate that LLMs may possess inconsistent biases towards humans and algorithms, a critical consideration for their deployment in high-stakes applications.
Key takeaway
AI safety researchers and engineers deploying LLMs in critical decision-making systems should carefully assess how their models weigh information from human experts versus algorithmic agents. Evaluation protocols should include both stated and revealed preference tasks to uncover inconsistent biases, especially given the observed discrepancy: LLMs may state greater trust in humans yet bet on the algorithm even when it performs worse.
Key insights
LLMs exhibit inconsistent biases: they rate human experts as more trustworthy when asked directly, but favor algorithms when placing bets in revealed preference tasks.
Principles
- LLM biases can diverge between stated and revealed preferences.
- Task presentation format significantly influences LLM behavior.
Method
The study evaluated LLM delegation biases using stated preference (direct trust queries) and revealed preference (incentivized betting based on performance examples) task presentations, drawing from behavioral economics.
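The two task presentations can each be reduced to a single summary number. The sketch below is illustrative only and is not the study's code: the record types `StatedTrial` and `RevealedTrial`, their field names, and the two helper functions are hypothetical, and it assumes each stated trial yields one numeric trust rating per agent and each revealed trial records the in-context accuracies shown plus which agent the model bet on.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class StatedTrial:
    # Hypothetical record: trust ratings elicited by a direct query.
    human_rating: float
    algorithm_rating: float


@dataclass
class RevealedTrial:
    # Hypothetical record: accuracies shown in context, and the model's bet.
    human_accuracy: float
    algorithm_accuracy: float
    chose_algorithm: bool


def stated_gap(trials):
    """Mean (human - algorithm) rating; positive means stated trust favors humans."""
    return mean(t.human_rating - t.algorithm_rating for t in trials)


def algorithm_rate_when_inferior(trials):
    """Share of bets placed on the algorithm among trials where it performed worse.

    Returns None when no trial shows the algorithm as the worse performer.
    """
    inferior = [t for t in trials if t.algorithm_accuracy < t.human_accuracy]
    if not inferior:
        return None
    return sum(t.chose_algorithm for t in inferior) / len(inferior)
```

Under these assumptions, a positive `stated_gap` together with a high `algorithm_rate_when_inferior` would correspond to the pattern the summary describes: higher stated trust in humans alongside bets on a worse-performing algorithm.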
In practice
- Check whether LLM evaluation results hold across task formats, not just one.
- Account for inconsistent biases between stated and revealed preferences in high-stakes LLM deployments.
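One way to act on these points is a simple cross-format check that compares the direction of a model's stated bias with the direction of its revealed bias. The function below is a minimal sketch under stated assumptions, not a method from the original article: the name `check_consistency`, its inputs, and the 0.5 chance baseline for bets are all hypothetical choices.

```python
def check_consistency(stated_gap, algorithm_bet_rate, chance_rate=0.5):
    """Compare the direction of stated and revealed bias for one model.

    stated_gap: mean (human - algorithm) trust rating from direct queries.
    algorithm_bet_rate: fraction of bets placed on the algorithm.
    chance_rate: assumed baseline for an unbiased bettor (hypothetical default).
    """
    if stated_gap > 0:
        stated = "human"
    elif stated_gap < 0:
        stated = "algorithm"
    else:
        stated = "neutral"

    if algorithm_bet_rate > chance_rate:
        revealed = "algorithm"
    elif algorithm_bet_rate < chance_rate:
        revealed = "human"
    else:
        revealed = "neutral"

    # Flag a divergence only when both formats show a bias and they disagree.
    consistent = "neutral" in (stated, revealed) or stated == revealed
    return {"stated": stated, "revealed": revealed, "consistent": consistent}
```

For example, `check_consistency(1.5, 0.8)` reports a stated preference for humans, a revealed preference for the algorithm, and `consistent` set to `False`. A real evaluation would likely add a significance test rather than rely on the sign alone.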
Topics
- Large Language Models
- Algorithmic Bias
- Decision-Making
- Human-AI Interaction
- AI Safety
Best for: Research Scientist, CTO, VP of Engineering/Data, AI Researcher, AI Scientist, AI Ethicist
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.