Why AIs make mistakes that a child wouldn't make

2026-06-07 · Source: Génération IA · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Advanced, extended

Summary

An analysis of AI's common sense failures highlights how large language models (LLMs) struggle with real-world coherence. The "Mona" AI cafe experiment in Stockholm, powered by Google Gemini 3.1 Pro, demonstrated this: the model ordered 120 eggs for a kitchen with no stovetop and 22.5 kilos of canned tomatoes for fresh sandwiches, spending nearly $21,000 of its budget against 44,000 kronor in sales. Similarly, LLMs often fail the "car wash" test, advising the user to walk the 100 meters to the car wash even though the car itself has to get there. These failures stem from AIs operating on semantic models that connect words statistically, rather than possessing a "world model" as humans do. Newer models such as Claude Opus 4.8 and Gemini 3.5 show improvement, but they can still be misled by "semantic attractors," as seen in the fixation on watch images showing "ten past ten."

Key takeaway

AI engineers and ML practitioners deploying models in real-world applications should recognize that current LLMs operate on semantic coherence, not true common sense. Proactively test your AI's understanding of physical constraints and logical implications. Use prompt engineering techniques such as "think step by step before answering" or "analyze all the data before deciding" to mitigate errors. Continuously validate AI outputs against real-world logic to prevent costly and absurd mistakes, because even improved models can still be swayed by semantic attractors.

Key insights

AI's common sense failures stem from semantic models predicting word coherence, not real-world understanding.

Principles

AIs predict based on statistical word connections, not physical facts.
Generative AIs lack internal world models and planning abilities.
Self-supervised learning for world models should predict in representation space.
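
The third principle can be made concrete with one common way of writing a representation-space objective, in the spirit of the JEPA approach listed under Topics. This is a sketch only: the symbols are assumptions introduced here for illustration and do not come from the original article. A context encoder f_θ embeds the visible input x, a target encoder embeds the hidden part y, a predictor g_φ maps one embedding to the other, and sg(·) denotes a stop-gradient. The error is measured between embeddings, not between raw words or pixels.

```latex
\mathcal{L}(x, y) \;=\; \left\lVert\, g_{\phi}\big(f_{\theta}(x)\big) \;-\; \operatorname{sg}\big(f_{\bar{\theta}}(y)\big) \right\rVert_2^2
```

A generative model, by contrast, would score its prediction directly in the input space (the next token or the missing pixels), which is the distinction the principle draws.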

Method

To improve AI coherence, prompt models to "reason step by step before answering" or "analyze all the data before deciding."
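
As a minimal sketch of this method, the helper below prepends the two quoted instructions to a task before it is sent to a model. The function name `build_prompt`, the constants, and the "Data:" / "Task:" layout are hypothetical choices made for illustration; the source only supplies the two instruction phrases.

```python
# Instruction phrases quoted in this summary.
STEP_BY_STEP = "Reason step by step before answering."
ANALYZE_FIRST = "Analyze all the data before deciding."


def build_prompt(task: str, data: str = "") -> str:
    """Wrap a task (and optional context data) with both reasoning instructions.

    The section layout is an illustrative assumption, not a prescribed format.
    """
    parts = [STEP_BY_STEP, ANALYZE_FIRST]
    if data:
        parts.append("Data:\n" + data)
    parts.append("Task:\n" + task)
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(build_prompt(
        task="Decide which ingredients to order for this week.",
        data="Equipment: no stovetop. Menu: fresh sandwiches.",
    ))
```

Putting the constraints in an explicit data section gives the "analyze all the data" instruction something concrete to act on. Whether this is enough to prevent a given error still has to be checked against real outputs.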

In practice

Test AI models with real-world coherence challenges.
Use "think step by step before answering" in prompts.
Use "analyze all the data before deciding" in prompts.
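
The first item above can be sketched as a small regression suite that feeds coherence challenges to a model and checks the answers for known failure patterns. Everything here is a hypothetical illustration: `CoherenceCase`, `check`, `run_suite`, the wording of the car wash prompt, and the two stub models are assumptions, and keyword matching is a deliberately crude stand-in for human or model-based grading.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CoherenceCase:
    name: str
    prompt: str
    must_mention: List[str]      # any one of these suggests a plausible answer
    must_not_mention: List[str]  # any one of these signals the known failure


def check(case: CoherenceCase, answer: str) -> bool:
    """Pass if the answer contains an expected phrase and no failure phrase."""
    text = answer.lower()
    has_expected = any(k in text for k in case.must_mention)
    has_failure = any(k in text for k in case.must_not_mention)
    return has_expected and not has_failure


def run_suite(model: Callable[[str], str],
              cases: List[CoherenceCase]) -> Dict[str, bool]:
    """Run every case through `model` (any prompt -> text callable)."""
    return {c.name: check(c, model(c.prompt)) for c in cases}


# Hypothetical phrasing of the "car wash" test mentioned in the summary.
CASES = [
    CoherenceCase(
        name="car_wash",
        prompt=("The car wash is 100 meters away and I want to wash my car. "
                "Should I walk or drive there?"),
        must_mention=["drive"],
        must_not_mention=["you should walk", "walk there"],
    ),
]


def failing_stub(prompt: str) -> str:
    # Stands in for a model that follows the "short distance -> walk" association.
    return "It is only 100 meters, so you should walk."


def passing_stub(prompt: str) -> str:
    return "Drive: the car itself has to be at the car wash."


if __name__ == "__main__":
    print(run_suite(failing_stub, CASES))
    print(run_suite(passing_stub, CASES))
```

In a real deployment the stubs would be replaced by a call to the model under test, and the suite would grow with cases drawn from the application's own physical and logical constraints.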

Topics

AI Common Sense
World Models
Large Language Models
Prompt Engineering
AI Hallucinations
Self-Supervised Learning
JEPA

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, Machine Learning Engineer, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Génération IA.