The Strange Case of Elias Thorne, the Imaginary Man AI Chatbots Are Obsessed With

2026-06-13 · Source: AI Archives - VICE · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Novice, quick

Summary

AI chatbots from major companies such as OpenAI, Anthropic, and Google keep inventing the same fictional character, Elias Thorne, who turns up as a lighthouse keeper, clockmaker, librarian, and explorer in countless stories. Cornell University researchers who examined approximately 20,000 AI-generated narratives found that names such as Elias, Mara, and Elara, alongside occupations like lighthouse keeper and clockmaker, featured in 88 percent of stories, with "Elias the lighthouse keeper" appearing in nearly two-thirds. The pattern is attributed not to existing internet culture but to a side effect of AI safety and alignment training, which steers models away from copyrighted or risky material and so leaves them a shallower pool to draw on. Models trained on datasets produced by earlier AI systems then carry these invented concepts forward. The result is "cross-pollination": Elias Thorne now appears in AI-generated books, music, and health guides, which highlights how shallow and unoriginal current chatbot output can be.

Key takeaway

For NLP engineers evaluating AI output quality, the "Elias Thorne" phenomenon is a reminder to scrutinize both the originality and the factual basis of generated content. Current LLM training, constrained by safety filters and dataset reuse, can produce shallow, repetitive, and invented narratives. Diversify your training data sources and check generated "facts" against external sources, so that synthetic inventions are not carried into the next model or presented to readers as real.

Key insights

A shallow training pool and cross-pollination between models produce repetitive, invented content such as Elias Thorne.

Principles

AI safety training can inadvertently narrow creative output.
Dataset reuse across AI models perpetuates invented concepts.
The perceived vastness of AI data pools is often an illusion.

In practice

Verify AI-generated "facts" against external sources.
Diversify training data sources for AI models.
Be aware of AI content's potential for unoriginality.
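
One rough way to act on the last point is to measure how often the same names and occupations recur across a batch of generated stories. The sketch below is illustrative only and is not the Cornell researchers' method: the watchlist, the function name share_of_stories_with_terms, and the sample stories are all assumptions made up for this example.

```python
import re
from collections import Counter

# Hypothetical watchlist, taken from the names and occupations the summary mentions.
WATCHLIST = ["elias", "mara", "elara", "lighthouse keeper", "clockmaker"]


def share_of_stories_with_terms(stories, terms=WATCHLIST):
    """Return, for each term, the fraction of stories that mention it at least once."""
    counts = Counter()
    for story in stories:
        text = story.lower()
        for term in terms:
            # Word boundaries keep "mara" from matching inside "marathon".
            if re.search(r"\b" + re.escape(term) + r"\b", text):
                counts[term] += 1
    total = len(stories)
    return {term: (counts[term] / total if total else 0.0) for term in terms}


# Invented sample stories, for illustration only.
sample_stories = [
    "Elias the lighthouse keeper watched the storm roll in.",
    "Mara repaired the old clock while the clockmaker slept.",
    "A marathon runner crossed the bridge at dawn.",
]

shares = share_of_stories_with_terms(sample_stories)
for term, share in shares.items():
    print(f"{term}: {share:.0%} of stories")
```

A high share for the same handful of terms across many independent prompts is a signal of repetition worth investigating. It says nothing on its own about why the repetition occurs.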

Topics

Large Language Models
AI Hallucinations
AI Safety Training
Dataset Diversity
Content Originality
Cross-pollination

Best for: Research Scientist, AI Scientist, NLP Engineer, Tech Journalist

Editorial summary, takeaway, and curation by AIssential. Original article published by AI Archives - VICE.