What Do People Actually Want From AI? Mapping Preference Plurality

2026-06-04 · Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Social Sciences & Behavioral Studies · Depth: Expert, quick

Summary

Large Language Models (LLMs) are often fine-tuned with Reinforcement Learning from Human Feedback (RLHF) to align them with user preferences, but the method has significant limitations. An analysis of 1,500 open-ended responses from the PRISM dataset, spanning 75 countries, shows that preferences are highly diverse: most values are requested by fewer than a quarter of respondents, and truthfulness is the sole exception at 49%. Even "truthfulness" hides divergent meanings, covering requests for sourced claims, expert opinions, or even unpopular views. Some features are contested outright: respondents disagree over human-like model behavior and over AI guardrails. The study also found that users draw contextual distinctions (e.g., "by default" versus "if requested") that binary comparisons fail to capture. According to the study, these gaps expose fundamental problems in current alignment practice and help explain persistent hallucination rates.

Key takeaway

If you are an AI scientist or ethicist developing alignment strategies, move beyond aggregated, binary preference models. Current RLHF methods likely flatten diverse user values and misread terms like "truthfulness." Design systems that accommodate preference plurality and contextual distinctions (default vs. requested) so they address what users actually ask for and reduce problems such as persistent hallucinations.

Key insights

People's AI preferences are diverse, context-dependent, and often contradictory, challenging current single-model alignment methods.

Principles

Aggregating preferences flattens diverse values.
Same words hide divergent meanings.
Contextual distinctions are crucial for preferences.
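
The first principle can be illustrated with a toy calculation. The sketch below is not the study's method: the respondent data, the value labels, and the 50% threshold are all hypothetical, chosen only to show how keeping the most widely requested values leaves many respondents with none of their stated values represented.

```python
from collections import Counter

# Hypothetical toy data (not from PRISM): each set holds the values
# one respondent asked for in an open-ended answer.
respondents = [
    {"truthfulness", "conciseness"},
    {"truthfulness", "empathy"},
    {"creativity"},
    {"guardrails", "truthfulness"},
    {"no_guardrails", "humor"},
    {"human_likeness"},
]

def value_shares(responses):
    """Fraction of respondents who mention each value."""
    counts = Counter(v for r in responses for v in r)
    n = len(responses)
    return {value: count / n for value, count in counts.items()}

def majority_values(responses, threshold=0.5):
    """Values that survive a simple threshold-based aggregation (assumed rule)."""
    return {v for v, s in value_shares(responses).items() if s >= threshold}

def uncovered(responses, kept):
    """Respondents for whom none of their requested values survived."""
    return [r for r in responses if not (r & kept)]

shares = value_shares(respondents)
kept = majority_values(respondents)
left_out = uncovered(respondents, kept)

print(f"kept after aggregation: {sorted(kept)}")
print(f"respondents with no surviving value: {len(left_out)} of {len(respondents)}")
```

In this toy set only one value clears the threshold, and half of the respondents end up with nothing they asked for in the aggregated target. Two of the dropped values, `guardrails` and `no_guardrails`, directly conflict, which a single aggregated target cannot represent at all.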

In practice

Avoid binary comparisons for complex preferences.
Probe deeper into stated user values.
Recognize controversial AI features.
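
One way to avoid binary comparisons is to record the condition attached to a preference. The sketch below is an assumption-laden illustration, not a format used by the study or by PRISM: the `Condition` labels, the `Preference` record, and both helper functions are hypothetical names introduced here.

```python
from dataclasses import dataclass
from enum import Enum

class Condition(Enum):
    # Hypothetical labels mirroring the "by default" / "if requested" distinction.
    BY_DEFAULT = "by default"
    IF_REQUESTED = "if requested"
    NEVER = "never"

@dataclass(frozen=True)
class Preference:
    behavior: str
    condition: Condition

def to_binary(pref):
    """Collapse to wants / does not want, as a binary comparison would see it."""
    return pref.condition is not Condition.NEVER

def should_apply(pref, user_requested):
    """Decide whether the behavior applies in a given interaction."""
    if pref.condition is Condition.BY_DEFAULT:
        return True
    if pref.condition is Condition.IF_REQUESTED:
        return user_requested
    return False

always_humor = Preference("humor", Condition.BY_DEFAULT)
humor_on_request = Preference("humor", Condition.IF_REQUESTED)

# Both users look identical once the condition is dropped...
print(to_binary(always_humor), to_binary(humor_on_request))
# ...but they want different behavior when nothing was requested.
print(should_apply(always_humor, False), should_apply(humor_on_request, False))
```

The two users are indistinguishable under `to_binary`, yet they expect opposite behavior in an interaction where humor was not requested. Keeping the condition as part of the record preserves that difference.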

Topics

Large Language Models
Reinforcement Learning from Human Feedback
AI Alignment
User Preferences
Epistemic Violence
PRISM dataset

Best for: Research Scientist, AI Scientist, AI Ethicist

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.