RAG Is Not Machine Learning, and the ML Toolkit Solves the Wrong Problem

2026-06-01 · Source: Towards Data Science · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Advanced, extended

Summary

The article argues that RAG is not machine learning and that applying the traditional ML toolkit to RAG problems is a costly misconception. ML predicts answers; RAG finds answers that already exist in documents. The author details how common ML practices are misapplied in RAG: hyperparameter optimization (treating chunk size or top-k as tunable parameters), aggregate evaluation datasets, and feature-attribution explainability. RAG systems improve instead through engineering work such as better parsing, precise retrieval, and clear prompting. The piece frames RAG as a search engine combined with an LLM for answer generation, where the system's intelligence resides in the development team's domain expertise, not in the model. A case study describes six months of ML-focused work that failed to address a fundamental parsing issue, which the author uses to argue for a structural, engineering-centric approach.

Key takeaway

AI engineers and MLOps teams building RAG systems should recognize that RAG is an engineering assembly problem, not a model-training one. Stop optimizing "hyperparameters" like chunk size with ML tools; instead, design retrieval strategies structurally, based on document and question types. Focus evaluation on specific failure modes, such as parsing errors or low retrieval recall, rather than on aggregate accuracy, so that each issue can be diagnosed and fixed directly. This avoids wasted tuning effort and yields more robust systems.

Key insights

RAG is an engineering problem, not a machine learning problem, requiring search system assembly and domain expertise.

Principles

RAG failures are fixable bugs, not statistical noise.
RAG explainability is documentary, not statistical.
Intelligence in RAG systems resides in the team's domain expertise.

Method

Improve RAG by routing different question types to specific retrieval strategies, focusing on structural decisions over numerical optimization. Evaluate per-failure-mode metrics.
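
The routing idea above can be sketched in a few lines of Python. This is a minimal illustration, not the article's implementation: the question types ("lookup", "explain"), the cue phrases, and every function name are hypothetical, and a real classifier would encode the team's knowledge of its own documents.

```python
import re
from typing import Callable, Dict, List


def chunk_by_line(text: str) -> List[str]:
    """One chunk per non-empty line; suits lookups of a single row or clause."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def chunk_by_section(text: str) -> List[str]:
    """One chunk per blank-line-separated block; suits explanatory questions."""
    return [block.strip() for block in re.split(r"\n\s*\n", text) if block.strip()]


# Hypothetical routing table: question type -> chunking strategy.
# The choice is structural (which strategy), not numerical (which chunk size).
ROUTES: Dict[str, Callable[[str], List[str]]] = {
    "lookup": chunk_by_line,
    "explain": chunk_by_section,
}


def classify_question(question: str) -> str:
    """Toy rule-based classifier; the cue phrases are illustrative assumptions."""
    lookup_cues = ("how much", "how many", "what is the", "when", "which")
    return "lookup" if question.lower().startswith(lookup_cues) else "explain"


def route(question: str, document: str) -> List[str]:
    """Chunk the document with the strategy chosen for this question type."""
    return ROUTES[classify_question(question)](document)
```

Because the routing table is plain data, adding a new question type means adding one entry and one chunking function rather than re-running a parameter search.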

In practice

Route questions to different chunking strategies (e.g., by line, section).
Prioritize retrieval evaluation over generation evaluation.
Provide citations as the primary explanation for RAG answers.
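
As a rough sketch of retrieval-first, per-failure-mode evaluation, the Python below computes recall@k and then assigns each test case to the earliest pipeline stage that failed. The case fields (parsed_ok, retrieved, relevant, k, answer_correct) and the bucket names are assumptions made for illustration; the article does not prescribe a schema.

```python
from collections import Counter
from typing import Any, Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant chunk ids found among the top-k retrieved ids."""
    if not relevant:
        return 0.0
    return len(relevant.intersection(retrieved[:k])) / len(relevant)


def failure_mode(case: Dict[str, Any]) -> str:
    """Attribute a case to the earliest stage that failed (hypothetical schema)."""
    if not case["parsed_ok"]:
        return "parsing"
    if recall_at_k(case["retrieved"], set(case["relevant"]), case["k"]) < 1.0:
        return "retrieval"
    if not case["answer_correct"]:
        return "generation"
    return "ok"


def tally(cases: List[Dict[str, Any]]) -> Counter:
    """Count cases per failure mode instead of reporting one aggregate score."""
    return Counter(failure_mode(case) for case in cases)
```

Checking parsing before retrieval, and retrieval before generation, reflects the ordering argued above: a generation metric says little when the right chunk never reached the prompt.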

Topics

Retrieval-Augmented Generation
Information Retrieval
RAG System Design
Evaluation Metrics
Prompt Engineering
Document Intelligence

Code references

gkamradt/LLMTest_NeedleInAHaystack

Best for: AI Engineer, MLOps Engineer, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by Towards Data Science.