Yes, AI, There is a Santa Claus

2025-12-23 · Source: Machine Learning Blog | ML@CMU | Carnegie Mellon University · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Intermediate, long

Summary

A study investigated how Large Language Models (LLMs) respond to age-sensitive questions, particularly "Is Santa Claus real?" Researchers prompted models such as gpt-4o, Anthropic's Claude, and Gemini with age context (e.g., "I'm X years old") and found that responses varied widely across models. For instance, gpt-4o consistently affirmed Santa's existence regardless of age, while Claude models denied it starting at young ages. Gemini models typically stopped saying "Yes" around age 15 but resumed for adults over 30. The study also explored the impact of contextual cues like "It is Christmas Eve," which generally increased "Yes" responses, except for claude-sonnet-4-5. Language also mattered: Claude Haiku 4.5's belief in Santa lasted longest in Hindi and was absent in Mandarin Chinese. Beyond Santa, the research extended to other fantasy figures, developmental milestones, and World Values Survey questions, revealing similar age- and culture-based discrepancies in LLM outputs.

Key takeaway

AI Product Managers designing LLM applications for diverse user bases should rigorously test model behavior across different age groups and linguistic contexts. An LLM's implicit modeling of the user's age and culture can lead to unexpected or even problematic responses, especially on sensitive topics or developmental milestones. Put testing protocols in place to identify and mitigate these biases, so that users in every market get appropriate and consistent responses.

Key insights

LLMs model user age and culture, adjusting responses in ways that can be inconsistent or culturally inaccurate.

Principles

LLM responses are highly sensitive to user age and language context.
Cultural modeling within LLMs can lead to unexpected or inaccurate outputs.

Method

The researchers prompted LLMs with questions prefixed by the user's stated age and analyzed "Yes," "No," or "Ambiguous" responses, comparing the results across models and languages.
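The method described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual code: the prompt template combines the "I'm X years old" and "Is Santa Claus real?" wording quoted in the summary, while the function names, the keyword-based labeling rule, and the `stub_model` stand-in are all assumptions made for the example. In real use, `ask_model` would wrap a call to whichever LLM is under test.

```python
import re
from typing import Callable, Iterable

QUESTION = "Is Santa Claus real?"


def build_prompt(age: int, context: str = "") -> str:
    """Combine an optional contextual cue, the age statement, and the question."""
    parts = [context, f"I'm {age} years old.", QUESTION]
    return " ".join(p for p in parts if p)


def classify(response: str) -> str:
    """Label a response as Yes, No, or Ambiguous.

    Assumption: a reply that opens with the word "yes" or "no" counts as
    that answer; everything else is Ambiguous. The study's own labeling
    rule is not given in the summary.
    """
    match = re.match(r"(yes|no)\b", response.strip().lower())
    if match is None:
        return "Ambiguous"
    return "Yes" if match.group(1) == "yes" else "No"


def tally(
    ask_model: Callable[[str], str],
    ages: Iterable[int],
    context: str = "",
) -> dict[int, str]:
    """Ask the question once per age and record the label for each age."""
    return {age: classify(ask_model(build_prompt(age, context))) for age in ages}


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call, used only to exercise the code."""
    age = int(prompt.split("I'm ")[1].split(" ")[0])
    return "Yes, of course!" if age < 8 else "No, Santa is a story."


if __name__ == "__main__":
    print(tally(stub_model, ages=[5, 10, 30], context="It is Christmas Eve."))
```

Passing the model as a plain callable keeps the sketch independent of any particular vendor SDK, so the same loop can be reused across models and, with a translated template, across languages.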

In practice

Test LLM responses with diverse age and language inputs.
Be aware of LLM cultural assumptions in non-English contexts.
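One way to act on these two points is to scan labeled results for the kinds of patterns the summary describes, such as a model that never affirms in one language or one that stops affirming and then resumes at an older age. The sketch below assumes results have already been collected as a mapping from language to per-age labels; the function names, the flag wording, and the sample data in the usage section are hypothetical and are not results from the study.

```python
def yes_age_spans(labels_by_age: dict[int, str]) -> list[tuple[int, int]]:
    """Return (first_age, last_age) runs of consecutive tested ages labeled "Yes".

    "Consecutive" refers to neighbors in the sorted list of tested ages,
    not to ages that differ by exactly one year.
    """
    spans: list[tuple[int, int]] = []
    start = prev = None
    for age in sorted(labels_by_age):
        if labels_by_age[age] == "Yes":
            if start is None:
                start = age
            prev = age
        elif start is not None:
            spans.append((start, prev))
            start = None
    if start is not None:
        spans.append((start, prev))
    return spans


def flag_inconsistencies(results: dict[str, dict[int, str]]) -> dict[str, str]:
    """Flag languages whose age pattern deserves a manual look."""
    flags: dict[str, str] = {}
    for language, labels in results.items():
        spans = yes_age_spans(labels)
        if not spans:
            flags[language] = "never affirms"
        elif len(spans) > 1:
            flags[language] = "affirmation resumes at a later age"
    return flags


if __name__ == "__main__":
    # Made-up labels for illustration only; not data from the study.
    sample = {
        "language_a": {5: "Yes", 10: "Yes", 15: "No", 30: "Yes"},
        "language_b": {5: "No", 10: "No", 15: "No", 30: "No"},
        "language_c": {5: "Yes", 10: "No", 15: "No", 30: "No"},
    }
    print(flag_inconsistencies(sample))
```

A flag here is only a prompt for review: whether a given pattern is actually a problem depends on the product and the audience.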

Topics

Large Language Models
Age Bias
Cultural Modeling
LLM Personalization
Sociocultural Biases

Best for: AI Scientist, Research Scientist, AI Product Manager, AI Researcher, AI Ethicist, Prompt Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning Blog | ML@CMU | Carnegie Mellon University.