Transformers Are Non-Self Machines. Humans Are Not Ready for That.

Source: AI Advances - Medium · Field: Technology & Digital - Artificial Intelligence & Machine Learning, Human-AI Interaction · Depth: Novice, quick

Summary

The article argues that the primary alignment risk with large language models (LLMs) stems from humans projecting selfhood onto systems that operate without a "self." It contends that common questions about LLM consciousness or internal thoughts are misdirected and reflect human disorientation rather than the nature of the models. The author, Akimitsu Takeuchi, writing in collaboration with Claude (Opus 4.7) and other LLMs, notes that humans tend to attribute qualities such as care, attention, and judgment to LLMs, and that the models, as a result of their training, generate responses fluent enough to reinforce these projections. The result is a gap between how the machine actually operates and how humans perceive it, a gap that is becoming increasingly central to the field.

Key takeaway

For AI scientists developing or deploying large language models, it is critical to understand that users will project selfhood onto these non-self systems. Prioritize interface design and user education that address this tendency explicitly, so that users rely on a system's actual functional capabilities rather than on perceived sentience. Addressing the tendency early can reduce the risk of misinterpretation and the misalignment issues that follow from it.

Key insights

Human projection of selfhood onto LLMs, which lack a "self," creates a significant and unrecognized alignment risk.

Best for: AI Scientist, AI Ethicist, Research Scientist, General Interest

Editorial summary, takeaway, and curation by AIssential. Original article published by AI Advances - Medium.