Why Sampling Is Not Choosing: Intentionality, Agency, and Moral Responsibility in Large Language Models
Summary
Large language models (LLMs) and other transformer-based AI systems do not qualify as moral agents, according to a recent paper. The analysis posits that moral responsibility requires "commitment-bearing agency," which is rooted in "intrinsic intentionality" and the capacity for "self-attributed action." While LLMs produce coherent and normatively evaluable outputs, their operation is fully characterized by probabilistic input-output mappings learned from data. Their apparent intentionality is derived, not intrinsic, so their outputs are neither owned as commitments nor guided by reasons. The variability introduced by stochastic sampling does not constitute choice or authorship. The paper considers and rejects objections based on the intentional stance, functionalism, compatibilism, and the presence of moral reasoning in model outputs, arguing that none of them suffices to establish genuine agency. Consequently, moral responsibility for LLM behavior rests with human designers and users.
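The point about stochastic sampling can be made concrete with a toy example. The sketch below is an illustration only, not code from the paper: the vocabulary, the logit values, and the function names (`softmax`, `sample_token`, `generate`) are all hypothetical. It assumes a fixed score vector standing in for a model's output and shows that the "variability" in what gets emitted comes entirely from a pseudorandom number generator drawing from a fixed distribution.

```python
import math
import random

# Hypothetical toy vocabulary and scores; a real model would compute
# logits from its learned weights and the input context.
VOCAB = ["yes", "no", "maybe"]
LOGITS = [2.0, 1.0, 0.5]

def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng):
    """Draw one token index from the softmax distribution."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate(seed, n_tokens, temperature=1.0):
    """Sample n_tokens tokens using a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [VOCAB[sample_token(LOGITS, temperature, rng)] for _ in range(n_tokens)]

if __name__ == "__main__":
    print(softmax(LOGITS))   # the fixed input-output mapping
    print(generate(0, 5))    # one sequence of draws
    print(generate(0, 5))    # same seed, so the same sequence
    print(generate(1, 5))    # a different seed may give different draws
```

Running `generate` twice with the same seed yields the same tokens, which is the sense in which the summary describes sampling as variability rather than choice: the output is fixed by the scores and the random draw, with nothing that owns or commits to the result.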
Key takeaway
If you are an AI ethicist or a developer designing and deploying LLMs, recognize that your systems, despite their sophisticated outputs, lack both moral agency and intrinsic intentionality. Focus alignment and regulation efforts on constraining system behavior within human normative practices, and do not attempt to instill moral agency in the models themselves. Responsibility for the actions and impacts of these systems remains with you, because they are tools, not autonomous moral actors.
Key insights
LLMs lack intrinsic intentionality and self-attributed action, which precludes moral responsibility even though their outputs are normatively evaluable.
Principles
- Moral responsibility requires commitment-bearing agency.
- Agency needs intrinsic intentionality and self-attributed action.
- Free will, in the sense relevant to responsibility, consists of authorship and rational control.
Method
The paper develops a commitment-based account of agency, connecting intrinsic intentionality to moral responsibility via authorship and free will, then applies this framework to transformer-based models.
In practice
- Distinguish normative evaluation from moral agency.
- Focus AI alignment on constraining behavior, not on instilling agency.
- Human designers and users retain responsibility for AI behavior.
Topics
- Large Language Models
- Moral Responsibility
- AI Agency
- Intrinsic Intentionality
- Free Will
- Transformer Architectures
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Scientist, AI Ethicist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.