Why AIs make mistakes that a child wouldn't make

Source: Génération IA · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems) · Depth: Advanced, extended

Summary

An analysis of AI's common-sense failures shows how large language models (LLMs) struggle with real-world coherence. The "Mona" AI cafe experiment in Stockholm, powered by Google Gemini 3.1 Pro, illustrated this: the AI ordered 120 eggs despite having no stovetop and 22.5 kilos of canned tomatoes for fresh sandwiches, consuming nearly $21,000 of its budget for 44,000 kronor in sales. Similarly, LLMs often fail the "car wash" test by suggesting that the user walk the 100 meters to the car wash, which leaves behind the car that needs washing. These failures arise because AIs operate on semantic models that connect words statistically, rather than on a "world model" of the kind humans possess. While newer models like Claude Opus 4.8 and Gemini 3.5 show improvement, they can still be misled by "semantic attractors," as seen in the fixation on "ten past ten" in watch images.

Key takeaway

AI engineers and ML practitioners deploying models in real-world applications should recognize that current LLMs operate on semantic coherence, not true common sense. Proactively test your AI's understanding of physical constraints and logical implications. Use prompt engineering techniques such as "think step by step before answering" or "analyze all the data before deciding" to mitigate errors. Continuously validate AI outputs against real-world logic to prevent costly and absurd mistakes, because models can still be swayed by semantic attractors.
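One way to picture "validating outputs against real-world logic" is a small rule check that runs before an AI-proposed action is executed. The sketch below is illustrative only: the `OrderLine` type, the `REQUIRED_EQUIPMENT` table, the `check_order` function, and the sample equipment and menu data are all hypothetical and are not taken from the Mona experiment's actual setup.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class OrderLine:
    """One line of a purchase order proposed by the model (hypothetical type)."""
    item: str
    quantity: float
    unit: str


# Hypothetical rule table: ingredient -> equipment needed to use it.
REQUIRED_EQUIPMENT = {"eggs": "stovetop"}


def check_order(
    lines: Iterable[OrderLine],
    equipment: Set[str],
    menu_ingredients: Set[str],
) -> List[str]:
    """Return human-readable problems; an empty list means no rule fired."""
    problems = []
    for line in lines:
        needed = REQUIRED_EQUIPMENT.get(line.item)
        if needed is not None and needed not in equipment:
            problems.append(f"{line.item}: requires {needed}, which is not available")
        if line.item not in menu_ingredients:
            problems.append(f"{line.item}: not used by any menu item")
    return problems


# Assumed example data, loosely modeled on the orders described in the summary.
proposed = [
    OrderLine("eggs", 120, "units"),
    OrderLine("canned tomatoes", 22.5, "kg"),
]
equipment = {"espresso machine", "fridge"}
menu_ingredients = {"bread", "fresh tomatoes", "cheese", "eggs"}

problems = check_order(proposed, equipment, menu_ingredients)
for problem in problems:
    print("BLOCKED:", problem)
```

A check like this encodes only the constraints someone thought to write down, so it complements rather than replaces human review of what the model proposes.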

Key insights

AI's common-sense failures stem from semantic models that predict which words fit together, not from any understanding of the real world.

Method

To improve AI coherence, prompt models to "reason step by step before answering" or "analyze all the data before deciding."
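As a minimal sketch of this method, the two instructions can be prepended to every question before it is sent to a model. The `build_prompt` helper and its `facts` parameter are hypothetical names introduced here for illustration; the sketch only builds the prompt text and does not call any model API.

```python
from typing import Optional, Sequence

# The two instructions quoted in the Method section.
REASONING_INSTRUCTIONS = (
    "Reason step by step before answering.",
    "Analyze all the data before deciding.",
)


def build_prompt(question: str, facts: Optional[Sequence[str]] = None) -> str:
    """Prefix a question with the reasoning instructions and any known facts."""
    parts = list(REASONING_INSTRUCTIONS)
    if facts:
        parts.append("Known facts:")
        parts.extend(f"- {fact}" for fact in facts)
    parts.append(f"Question: {question}")
    return "\n".join(parts)


# Assumed wording of the "car wash" test; the original prompt is not given.
prompt = build_prompt(
    "I want to wash my car. The car wash is 100 meters away. Should I walk or drive?",
    facts=["The car must be at the car wash to be washed."],
)
print(prompt)
```

Stating the relevant physical constraint explicitly, as the `facts` list does here, gives the model something concrete to reason over, though the instructions alone do not guarantee a correct answer.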

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, Machine Learning Engineer, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Génération IA.