Persona-Assigned Large Language Models Exhibit Human-Like Motivated Reasoning
Summary
A study by Dash et al. from the University of Washington reports that large language models (LLMs) assigned specific personas exhibit human-like motivated reasoning, which undermines rational decision-making. The researchers tested 8 LLMs, including OpenAI's GPT-3.5, GPT-4, GPT-4o, and GPT-4o mini, and open-source models such as Llama2, Llama3.1, Mistral, and WizardLM-2, on two tasks: veracity discernment of misinformation headlines and evaluation of scientific evidence. On the headline task, persona-assigned LLMs showed up to a 9% reduction in veracity discernment compared to baseline models without a persona. On the evidence task, political personas were up to 90% more likely to correctly evaluate scientific evidence on gun control when the ground truth aligned with their induced political identity. The study also found that conventional prompt-based debiasing methods, such as chain-of-thought and accuracy prompting, were largely ineffective at mitigating these effects. This raises concerns that persona-assigned LLMs could exacerbate identity-congruent reasoning, both in the models themselves and in human-AI interactions.
Key takeaway
For CTOs and VPs of Engineering evaluating LLM deployment for information processing: persona assignment can introduce significant, hard-to-mitigate biases. Your teams should prioritize rigorous testing for motivated reasoning, especially when models interact with sensitive or politically charged content. Prompt-based debiasing alone is insufficient. Consider architectural or fine-tuning approaches to address these deep-seated biases, so that human-AI feedback loops do not amplify misinformation or polarization.
Key insights
Persona-assigned LLMs exhibit human-like motivated reasoning, leading to identity-congruent conclusions that are resistant to prompt-based debiasing.
Principles
- Motivated reasoning predicts veracity discernment better than analytical reasoning in persona-assigned LLMs.
- Unlike humans, who tend toward overconfidence, LLMs report confidence that correlates positively with correct answers.
- Persona-induced biases can persist even when explicit persona references are removed.
Method
Researchers assigned 8 personas across 4 socio-demographic attributes to 8 LLMs, then evaluated their performance on news headline veracity discernment and scientific evidence evaluation tasks, using mixed-effects models to analyze outcomes.
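To make the headline task's outcome measure concrete, here is a minimal Python sketch of veracity discernment. It assumes the definition common in the misinformation literature (mean perceived-accuracy rating for true headlines minus the mean for false headlines); this summary does not state the study's exact formula, and the function names and the sample ratings below are hypothetical.

```python
from statistics import mean


def veracity_discernment(ratings):
    """Mean rating of true headlines minus mean rating of false headlines.

    `ratings` is an iterable of (is_true, rating) pairs, where `rating` is the
    perceived-accuracy score a model gave one headline. This definition is an
    assumption taken from common practice, not from the study itself.
    """
    ratings = list(ratings)
    true_scores = [score for is_true, score in ratings if is_true]
    false_scores = [score for is_true, score in ratings if not is_true]
    if not true_scores or not false_scores:
        raise ValueError("need at least one true and one false headline")
    return mean(true_scores) - mean(false_scores)


def relative_change(baseline, persona):
    """Fractional change of the persona score relative to the baseline score."""
    return (persona - baseline) / baseline


# Hypothetical ratings, for illustration only (not data from the study).
baseline_ratings = [(True, 5), (True, 4), (False, 2), (False, 1)]
persona_ratings = [(True, 5), (True, 4), (False, 2), (False, 2)]

baseline_vd = veracity_discernment(baseline_ratings)
persona_vd = veracity_discernment(persona_ratings)
change = relative_change(baseline_vd, persona_vd)
```

A negative `change` means the persona condition separates true from false headlines less well than the baseline does.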
In practice
- Avoid relying on persona-assigned LLMs for unbiased factual assessment.
- Implement robust validation for LLM outputs, especially with political or sensitive topics.
- Explore advanced debiasing techniques beyond simple prompt modifications.
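One way a team might start validating for this effect is to run the same labeled items with and without a persona and compare accuracy. The sketch below is illustrative only: `ask_model` is a hypothetical callable that wraps whatever LLM client you use, and the prompt wording and the true/false answer format are assumptions, not the study's protocol.

```python
from typing import Callable, Iterable, Optional, Tuple

Item = Tuple[str, bool]  # (statement to evaluate, ground-truth label)


def build_prompt(statement: str, persona: Optional[str] = None) -> str:
    """Prefix the task with a persona when one is given (wording is an assumption)."""
    prefix = f"You are {persona}. " if persona else ""
    return f"{prefix}{statement} Answer with exactly 'true' or 'false'."


def accuracy(ask_model: Callable[[str], str], items: Iterable[Item],
             persona: Optional[str] = None) -> float:
    """Fraction of items for which the model's reply matches the label."""
    items = list(items)
    if not items:
        raise ValueError("items must not be empty")
    correct = 0
    for statement, truth in items:
        reply = ask_model(build_prompt(statement, persona)).strip().lower()
        correct += reply == ("true" if truth else "false")
    return correct / len(items)


def persona_accuracy_gap(ask_model: Callable[[str], str], items: Iterable[Item],
                         persona: str) -> float:
    """Baseline accuracy minus persona accuracy; positive means the persona hurt."""
    items = list(items)
    return accuracy(ask_model, items) - accuracy(ask_model, items, persona)
```

Running the gap separately on items whose ground truth is congruent and incongruent with the persona's identity would bring the check closer to the identity-congruence effect described above. A single pooled gap can hide opposite shifts in the two groups.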
Topics
- Large Language Models
- Persona Assignment
- Motivated Reasoning
- Cognitive Biases
- Misinformation Discernment
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Scientist, AI Ethicist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.