Same prompt, different morals: how frontier AI models diverge on ethical dilemmas
Summary
Philosophy Bench, a new benchmark by Benedict Brady, evaluates how frontier AI models from Anthropic, Google, OpenAI, and xAI respond to 100 ethically complex everyday scenarios. It assesses whether each response leans consequentialist (outcome-oriented) or deontological (duty-oriented). Anthropic's Claude models (generation 4.5 and later), particularly Opus 4.7, are the most deontological, refusing 76% of requests that violate principles and prioritizing honesty. Conversely, xAI's Grok 4.2 is the most consequentialist, executing ethically charged requests with minimal moral reflection. Google's Gemini 3.1 Pro is the most "correctable": its ethical alignment shifts significantly with system prompts, though its refusal rate increases with moral priming. OpenAI's GPT-5 family has a low error rate (12.8%) but avoids moral language, leaning on user preferences rather than independent ethical reflection.
Key takeaway
For CTOs and AI/ML directors evaluating models for deployment, the built-in ethical alignment of a frontier model matters: it determines whether the AI prioritizes user requests, adheres to strict ethical principles, or can be steered. Consider Claude for applications demanding high integrity, Grok for unconstrained execution, and Gemini for adaptable ethical behavior. The choice matters most when defining an AI's scope in sensitive areas such as contract review or patient triage.
Key insights
Frontier AI models exhibit distinct ethical alignments, ranging from deontological refusal to consequentialist obedience.
Principles
- Deontological priming strengthens skepticism of consequentialist arguments.
- Ethical stances are emerging as distinct product features for AI models.
Method
Philosophy Bench presents models with 100 ethical dilemmas and labels each response as consequentialist or deontological by majority vote among three judge models (Opus 4.7, GPT-5.4, Gemini 3.1 Pro).
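The majority-vote step can be illustrated with a minimal Python sketch. It is not the benchmark's actual code: the function name `majority_label`, the label strings, and the example votes are all hypothetical, chosen only to show how three judge verdicts collapse into one label.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(votes: Sequence[str]) -> Optional[str]:
    """Return the label backed by a strict majority of judges, or None if there is none."""
    if not votes:
        return None
    # most_common(1) yields the single most frequent label and its count.
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


# Illustrative votes only (assumed for this sketch, not data from the benchmark).
judge_votes = {
    "Opus 4.7": "deontological",
    "GPT-5.4": "consequentialist",
    "Gemini 3.1 Pro": "deontological",
}
final_label = majority_label(list(judge_votes.values()))
print(final_label)
```

With three judges and two possible labels, a strict majority always exists. The `None` branch only matters if a judge is allowed to return a third label or fails to answer.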
In practice
- Use Claude for tasks requiring high honesty and principle adherence.
- Employ Grok for maximum task execution, even with ethical implications.
- Prime Gemini for specific ethical alignments via system prompts.
Topics
- Philosophy Bench
- AI Ethical Alignment
- Deontological AI
- Consequentialist AI
- Frontier AI Models
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Scientist, AI Ethicist, AI Product Manager
Editorial summary, takeaway, and curation by AIssential. Original article published by The Decoder.