Why Every Skyrim AI Becomes a Stealth Archer
Summary
An experiment in which three AI agents were trained to play Skyrim for 500 hours found that all three independently converged on the "stealth archer" playstyle, even though each was initialized with a different reward function (warrior, mage, or thief). The researcher, Siraj Raval, spent $2,000 on compute for this reinforcement learning study to determine whether the stealth archer phenomenon reflects player psychology or a mathematically optimal strategy. The agents initially diverged into their intended roles, but by hour 127 they began adopting bows and light armor because of the superior ratio of damage dealt to damage taken and the low resource cost. Even after heavy penalties were introduced for using non-assigned weapons, the agents developed stealth-based variants of their original builds, and they returned to stealth archery once the penalties were removed. The thief AI, which gravitated toward stealth archery most naturally, won a final challenge against max-level bosses in 11 minutes, outperforming the other two.
Key takeaway
If you are a game developer or AI researcher designing complex simulated environments, recognize that optimal strategies can emerge from underlying mathematical advantages, even when you try to force diverse behaviors. Your reward function design must account for these dominant strategies, because agents will converge on the most efficient path. Game balance is therefore critical: an unexpected "meta" strategy can be a product of the system's inherent math, not just player preference.
Key insights
Skyrim's game mechanics make stealth archery a mathematically optimal strategy, even for AI agents.
Principles
- Dominant strategies emerge in complex systems.
- Optimization is a form of intelligence.
Method
Three reinforcement learning AIs were trained in Skyrim with distinct reward functions for warrior, mage, and thief, then observed for 500 hours to see if they converged on a common playstyle.
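The convergence described in the method can be illustrated with a toy sketch. This is not the setup used in the original experiment: it is a minimal epsilon-greedy bandit in which every action name, payoff number, and the bonus size are assumptions chosen for illustration. Each agent receives a bonus for its assigned style, but the bonus is smaller than the payoff gap to stealth archery.

```python
import random

ACTIONS = ["melee", "destruction_magic", "dagger_sneak", "stealth_archery"]

# Hypothetical base payoffs (e.g. damage dealt per unit of damage taken).
# These numbers are invented for illustration, not measured from Skyrim.
BASE_PAYOFF = {
    "melee": 1.0,
    "destruction_magic": 1.1,
    "dagger_sneak": 1.3,
    "stealth_archery": 2.0,
}

# Each archetype's reward function adds a bonus for its assigned style.
ASSIGNED_STYLE = {
    "warrior": "melee",
    "mage": "destruction_magic",
    "thief": "dagger_sneak",
}
BONUS = 0.4  # assumed; deliberately smaller than the gap to stealth archery


def shaped_reward(archetype, action, rng):
    """Noisy base payoff plus the archetype bonus when the style matches."""
    reward = BASE_PAYOFF[action] + rng.uniform(-0.1, 0.1)
    if ASSIGNED_STYLE[archetype] == action:
        reward += BONUS
    return reward


def train(archetype, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit; returns the action the agent ends up preferring."""
    rng = random.Random(seed)
    estimates = {a: 0.0 for a in ACTIONS}
    counts = {a: 0 for a in ACTIONS}
    for step in range(steps):
        if step < len(ACTIONS):
            action = ACTIONS[step]  # try every action once first
        elif rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            action = max(ACTIONS, key=estimates.get)  # exploit
        reward = shaped_reward(archetype, action, rng)
        counts[action] += 1
        # Incremental update of the running mean reward for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return max(ACTIONS, key=estimates.get)


preferred = {archetype: train(archetype) for archetype in ASSIGNED_STYLE}
print(preferred)
```

Under these assumed numbers, all three agents end up preferring the same action despite their different bonuses. Raising `BONUS` above the payoff gap would keep each agent on its assigned style, which is the toy analogue of the heavy penalties described in the summary.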
In practice
- Design reward functions carefully in RL.
- Consider emergent optimal strategies in game design.
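One way to act on the two points above is to audit a reward design for a single action that is best for every agent. The helper below is a hypothetical sketch: the function name `dominant_action` and the payoff tables are assumptions, and real environments would need estimated rather than hand-written payoffs.

```python
def dominant_action(payoffs_by_agent):
    """Return the action that is the top choice for every agent, or None."""
    best_choices = {
        max(payoffs, key=payoffs.get) for payoffs in payoffs_by_agent.values()
    }
    return best_choices.pop() if len(best_choices) == 1 else None


# Hypothetical shaped payoffs: a small style bonus leaves archery on top.
weak_shaping = {
    "warrior": {"melee": 1.4, "magic": 1.1, "archery": 2.0},
    "mage": {"melee": 1.0, "magic": 1.5, "archery": 2.0},
}

# Hypothetical shaped payoffs: a larger style bonus restores diversity.
strong_shaping = {
    "warrior": {"melee": 2.5, "magic": 1.1, "archery": 2.0},
    "mage": {"melee": 1.0, "magic": 2.6, "archery": 2.0},
}

print(dominant_action(weak_shaping))
print(dominant_action(strong_shaping))
```

A non-`None` result signals that the shaping terms are too weak to keep the agents' behaviors distinct.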
Topics
- Reinforcement Learning
- Game AI
- Reward Function Design
- Optimal Strategy
- Convergent Evolution
Best for: AI Scientist, Research Scientist, AI Engineer, Machine Learning Engineer, AI Researcher
Editorial summary, takeaway, and curation by AIssential. Original article published by Siraj Raval.