Reframing AI Loss of Control: What It Is, How to Have It, How to Lose It
Summary
The paper "Reframing AI Loss of Control: What It Is, How to Have It, How to Lose It" addresses the surprisingly weak foundational understanding of "control" in existing discourse on AI loss of control. It establishes a working definition of control as "the ability to set plausibly attainable goals and reliably achieve those goals." The authors then detail four aspects an entity needs in order to be in control: the capacity to continually set and re-set plausible goals, a functioning control loop (sensing, decision-making, intervention), sufficient requisite variety to handle environmental disturbances, and adequate goal alignment among subsystems. The analysis extends to how AI systems can disrupt each of these aspects, causing human entities to lose control at individual, coordinated-group, and species-wide scales, often at capability levels below superintelligence. The work emphasizes that control is not a binary state and that partial loss is common, and it advocates for resilient systems capable of absorbing failures.
Key takeaway
Policymakers developing AI governance frameworks should move beyond binary notions of AI control and focus on building multi-layered resilience. Prioritize mechanisms that support goal re-setting, robust control loops, sufficient system variety, and continuous goal alignment at individual, group, and species scales, so that inevitable failures are absorbed rather than propagating into catastrophe.
Key insights
Control is the ability to set and reliably achieve plausible goals, with AI-induced loss occurring across individual, group, and species scales.
Principles
- Control is goal-centric: "setting and getting goals."
- Control requires four aspects: goal-setting, control loop, requisite variety, goal alignment.
- Control is not binary; partial loss is common and recoverable.
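To make the control loop and requisite variety concrete, here is a minimal sketch rather than anything taken from the paper. The `Controller` class, the integer state model, and the `success_rate` helper are all hypothetical illustrations. The sketch assumes a disturbance simply shifts the state and that the controller can only correct it if the exact counter-move is in its repertoire.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    """Hypothetical toy controller; not a construct from the paper."""
    goal: int             # the target state the entity has set
    responses: frozenset  # interventions available (its "variety")

    def step(self, state: int, disturbance: int) -> int:
        # Sense: observe the state after the disturbance hits.
        observed = state + disturbance
        # Decide: work out the correction that would restore the goal.
        needed = self.goal - observed
        # Intervene: only possible if that correction is in the repertoire.
        if needed in self.responses:
            return observed + needed
        return observed  # no matching response, so the error persists


def success_rate(controller: Controller, disturbances: list) -> float:
    """Fraction of disturbances after which the goal state is restored."""
    hits = 0
    for d in disturbances:
        final = controller.step(controller.goal, d)
        hits += final == controller.goal
    return hits / len(disturbances)


disturbances = [-2, -1, 0, 1, 2]
full = Controller(goal=10, responses=frozenset({-2, -1, 0, 1, 2}))
narrow = Controller(goal=10, responses=frozenset({-1, 0, 1}))

print(success_rate(full, disturbances))    # 1.0
print(success_rate(narrow, disturbances))  # 0.6
```

The narrower controller still reaches its goal most of the time, which is one way to picture control as a matter of degree rather than a binary state.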
Method
The paper proposes a framework for analyzing control by inverting its four aspects (goal-setting, control loop, requisite variety, goal alignment) to identify specific failure mechanisms. Each inverted aspect gives a structured way to ask how an AI system could undermine control.
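The inversion step can be sketched as a small checklist. This is an illustrative assumption about how such an analysis might be encoded, not the authors' method: the `ControlAssessment` class, the wording of the failure descriptions, and the helper functions are all hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class ControlAssessment:
    """Hypothetical record of which of the four aspects currently hold."""
    goal_setting: bool
    control_loop: bool
    requisite_variety: bool
    goal_alignment: bool


# Each aspect, inverted into the failure it implies when absent.
# The wording is illustrative, not quoted from the paper.
FAILURE_MODES = {
    "goal_setting": "cannot set or re-set plausible goals",
    "control_loop": "sensing, decision-making, or intervention is broken",
    "requisite_variety": "too few responses for the disturbances faced",
    "goal_alignment": "subsystems pursue conflicting goals",
}


def failure_mechanisms(assessment: ControlAssessment) -> list:
    """List the failure descriptions for every aspect that does not hold."""
    return [
        FAILURE_MODES[f.name]
        for f in fields(assessment)
        if not getattr(assessment, f.name)
    ]


def aspects_intact(assessment: ControlAssessment) -> int:
    """Count how many of the four aspects still hold."""
    return sum(getattr(assessment, f.name) for f in fields(assessment))


# Example: an entity that can still set goals and act on them,
# but is outpaced by its environment and has misaligned subsystems.
partial = ControlAssessment(
    goal_setting=True,
    control_loop=True,
    requisite_variety=False,
    goal_alignment=False,
)

for mechanism in failure_mechanisms(partial):
    print(mechanism)
print(aspects_intact(partial), "of 4 aspects intact")
```

Reporting a count of intact aspects instead of a single yes/no flag mirrors the point that partial loss of control is common.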
In practice
- Cultivate epistemic diversity to counter AI-curated content.
- Implement metacognitive monitoring for AI delegation.
- Build reversibility requirements into AI procurement.
Topics
- AI Governance
- AI Safety
- Control Theory
- Goal Alignment
- Requisite Variety
- Cybernetics
Best for: Research Scientist, AI Scientist, Policy Maker, AI Ethicist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.