What Do People Actually Want From AI? Mapping Preference Plurality
Summary
Large Language Models (LLMs) are often fine-tuned using Reinforcement Learning from Human Feedback (RLHF) to align with user preferences, but this method has significant limitations. An analysis of 1,500 open-ended responses from the PRISM dataset across 75 countries reveals that people's preferences are highly diverse: most values are requested by fewer than a quarter of respondents, with truthfulness the sole exception at 49%. Even "truthfulness" hides divergent meanings, covering requests for sourced claims, expert opinions, or even unpopular views. Human-like model behavior and AI guardrails are both controversial. The study also found that users make contextual distinctions (e.g., "by default" versus "if requested") that binary comparisons fail to capture. According to the study, these findings expose fundamental problems in current alignment practices and help explain persistent hallucination rates.
Key takeaway
AI scientists and ethicists developing alignment strategies should move beyond aggregated, binary preference models. Current RLHF methods likely flatten diverse user values and misinterpret terms like "truthfulness." Design systems that accommodate preference plurality and contextual distinctions (default vs. requested) to address user needs and reduce issues like persistent hallucinations.
Key insights
People's AI preferences are diverse, context-dependent, and often contradictory, challenging current single-model alignment methods.
Principles
- Aggregating preferences flattens diverse values.
- Same words hide divergent meanings.
- Contextual distinctions are crucial for preferences.
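To make the first principle concrete, here is a minimal sketch of how a majority-style aggregate can discard most of what respondents asked for. The responses, value labels, and the 0.5 threshold are hypothetical toy inputs chosen for illustration; they are not drawn from the PRISM dataset and this is not the study's method.

```python
from collections import Counter

# Hypothetical toy data: each set holds the values one respondent asked for.
responses = [
    {"truthfulness", "conciseness"},
    {"truthfulness", "humor"},
    {"guardrails"},
    {"human-likeness", "truthfulness"},
    {"no guardrails", "conciseness"},
    {"creativity"},
]

def value_shares(responses):
    """Return the fraction of respondents who mention each value."""
    counts = Counter(value for r in responses for value in r)
    return {value: n / len(responses) for value, n in counts.items()}

def majority_values(shares, threshold=0.5):
    """Keep only values requested by at least `threshold` of respondents."""
    return {value for value, share in shares.items() if share >= threshold}

shares = value_shares(responses)
kept = majority_values(shares)   # values that survive aggregation
dropped = set(shares) - kept     # values the aggregate silently discards

print("kept:", sorted(kept))
print("dropped:", sorted(dropped))
```

In this toy set only one value clears the threshold, while opposing requests such as "guardrails" and "no guardrails" both disappear from the aggregate rather than being represented as a disagreement.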
In practice
- Avoid binary comparisons for complex preferences.
- Probe deeper into stated user values.
- Recognize controversial AI features.
Topics
- Large Language Models
- Reinforcement Learning from Human Feedback
- AI Alignment
- User Preferences
- Epistemic Violence
- PRISM dataset
Best for: Research Scientist, AI Scientist, AI Ethicist
Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.