Production-Grade Text-to-SQL Agent with Claude Code, LangGraph, Langfuse, FastAPI and Qdrant

2026-06-24 · Source: Naturallanguageprocessing on Medium · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Data Science & Analytics · Depth: Advanced, long

Summary

This article details the construction of a production-grade Text-to-SQL agent, leveraging Claude Sonnet 4.6 for SQL generation and self-correction, and Claude Haiku 4.5 for hallucination evaluation. The system integrates a 7-node LangGraph 0.2 state machine, a RAG pipeline with Voyage AI voyage-3 embeddings and Qdrant for schema and knowledge base retrieval, and a FastAPI backend with a Streamlit UI. It features robust SQL validation using sqlglot, read-only PostgreSQL execution, and a self-correction loop with up to 3 retries. Observability is provided by Langfuse SDK v4, tracing every agent run and node-level latency. The agent is tested against a realistic UDogRetail dataset and evaluated using GEval for SQL correctness, schema faithfulness, and explanation hallucination. All components are containerized with Docker and Docker Compose, ensuring a reproducible and scalable deployment.

Key takeaway

For AI Engineers building production-grade Text-to-SQL agents, prioritize a robust RAG pipeline and explicit state management. You should implement pre-execution SQL validation and a self-correction loop to enhance reliability and security. Integrate comprehensive observability like Langfuse to trace agent behavior and token usage. Rigorous evaluation with LLM-as-judge metrics is crucial for preventing hallucinations and ensuring schema faithfulness. This guides your prompt tuning and architectural decisions effectively.

Key insights

Building robust Text-to-SQL agents requires RAG, explicit state machines, and rigorous evaluation to prevent hallucinations and ensure reliability.

Principles

Ground SQL generation in retrieved schema context.
Explicitly label critical constraints in prompts.
Enforce read-only access for SQL execution.

Method

The agent uses a 7-node LangGraph state machine: RETRIEVE (Qdrant), GENERATE (Claude), VALIDATE (sqlglot), EXECUTE (PostgreSQL), EXPLAIN, CORRECT (up to 3 retries), CLARIFY.

In practice

Use "include_raw=True" for LLM calls to log token counts.
Implement sqlglot for pre-execution SQL validation.
Sanitize Decimal objects to float for JSON serialization.

Topics

Text-to-SQL
LLM Agents
LangGraph
Retrieval-Augmented Generation
Langfuse
Qdrant

Code references

Best for: AI Engineer, Machine Learning Engineer, MLOps Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Naturallanguageprocessing on Medium.