The Refined Counterfactual Prisoner's Dilemma
Summary
The "Refined Counterfactual Prisoner's Dilemma" is a thought experiment designed to illustrate a potential flaw in expected utility maximization: the assumption that an agent stops caring about counterfactual worlds once it has made an observation. Inspired by Scott Garrabrant's critique of utility theory, the dilemma posits an omniscient predictor, Omega, who flips a coin, reveals the result, and then demands $1 regardless of the outcome. Omega also predicts what the agent would have done had the coin landed the other way. If it predicts that the agent would have refused to pay in that counterfactual scenario, it inflicts $1 million in damage. Because the setup is symmetric, an agent that ignores the counterfactual branch and refuses the trivial payment takes the $1 million penalty whichever way the coin lands, while an agent that pays in both branches loses only $1. The suggestion is that a decision theory that fails under perfect prediction likely has deeper problems.
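The payoff structure described above can be tabulated in a few lines. This is only an illustrative sketch: the dollar amounts come from the summary, while the function names (`payoff`, `policy_table`) and the encoding of a policy as a pair of booleans are assumptions made here for illustration and do not appear in the original article.

```python
from itertools import product

PAYMENT = 1          # dollars Omega demands after revealing the coin
DAMAGE = 1_000_000   # dollars of damage if Omega predicts counterfactual refusal


def payoff(pays_here: bool, pays_in_other_branch: bool) -> int:
    """Payoff in the observed branch, given the agent's choice in both branches.

    Assumes Omega's prediction of the other branch is always correct.
    """
    cost = PAYMENT if pays_here else 0
    penalty = 0 if pays_in_other_branch else DAMAGE
    return -(cost + penalty)


def policy_table() -> dict:
    """Map (pay on heads?, pay on tails?) to (payoff if heads, payoff if tails)."""
    table = {}
    for pay_heads, pay_tails in product([True, False], repeat=2):
        table[(pay_heads, pay_tails)] = (
            payoff(pay_heads, pay_tails),  # the coin landed heads
            payoff(pay_tails, pay_heads),  # the coin landed tails
        )
    return table


if __name__ == "__main__":
    for policy, outcomes in policy_table().items():
        print(policy, outcomes)
```

Under these assumptions, holding the other branch fixed, refusing always saves $1 in the observed branch. Yet the policy that refuses in both branches loses $1 million either way, while the policy that pays in both loses only $1.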
Key takeaway
AI scientists developing decision-making algorithms should critically evaluate their models' assumptions about counterfactuals and updatelessness. If an agent's decision theory fails when confronted with a perfect predictor like Omega, it likely harbors fundamental issues that could lead to suboptimal or harmful outcomes in complex, real-world scenarios where predictive capabilities are advanced. Ensure your agents account for consequences in states they did not observe.
Key insights
Ignoring counterfactual outcomes in decision-making can lead to significant, avoidable losses when facing perfect predictors.
Principles
- Expected utility maximization may be flawed when it assumes agents stop caring about counterfactual worlds after an observation.
- Updatelessness challenges traditional utility theory.
In practice
- Consider counterfactuals in decision theory.
- Test decision theories against perfect predictors.
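One way to act on these two points is a small test harness that runs a candidate agent against a simulated perfect predictor. The sketch below is a toy under stated assumptions: it models an agent as a deterministic function from the observed coin result to a pay/refuse decision, and it models Omega's prediction by simply calling the agent on the unobserved result. The names `run_against_perfect_predictor`, `worst_case`, and the three sample agents are hypothetical and are not taken from the original article.

```python
from typing import Callable

# Assumption: an agent is a deterministic function from observation to "pay?".
Agent = Callable[[str], bool]


def run_against_perfect_predictor(
    agent: Agent, coin: str, payment: int = 1, damage: int = 1_000_000
) -> int:
    """Return the agent's payoff after observing `coin` ("heads" or "tails")."""
    other = "tails" if coin == "heads" else "heads"
    # Perfect prediction is modelled by simulating the agent on the other result.
    would_pay_counterfactually = agent(other)
    loss = payment if agent(coin) else 0
    if not would_pay_counterfactually:
        loss += damage
    return -loss


def worst_case(agent: Agent) -> int:
    """Worst payoff over both coin results; no coin bias needs to be assumed."""
    return min(run_against_perfect_predictor(agent, c) for c in ("heads", "tails"))


def always_pay(observation: str) -> bool:
    return True


def never_pay(observation: str) -> bool:
    # Stands in for an agent that ignores the branch it did not observe.
    return False


def pay_only_on_heads(observation: str) -> bool:
    return observation == "heads"


if __name__ == "__main__":
    for agent in (always_pay, never_pay, pay_only_on_heads):
        print(agent.__name__, worst_case(agent))
```

Taking the worst case over both results avoids having to assume the coin is fair, which the summary does not state. A real evaluation would need a richer agent model than a two-input function.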
Topics
- Decision Theory
- Expected Utility Maximization
- Counterfactual Reasoning
- Thought Experiments
- Perfect Predictors
Best for: AI Scientist, AI Researcher, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by AI Alignment Forum.