I Turned an Archived 23K-Star Text-to-SQL Project Into a Self-Hosted Tool That Actually Works Out…

Source: Towards AI - Medium · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Software Development & Engineering, Data Science & Analytics) · Depth: Intermediate, short

Summary

DataChat is a new self-hosted text-to-SQL chat interface, forked from the archived Vanna.ai project, designed to provide an out-of-the-box solution for generating SQL queries from natural language. The original Vanna.ai, which garnered over 23,000 GitHub stars, was archived in March 2026, leaving a gap for self-hosting enthusiasts. DataChat addresses several critical issues found in the original codebase, including the lack of a schema explorer, manual schema refresh requirements, hardcoded database switching, complex frontend build processes, and serialization crashes with complex data types. It integrates a full schema sidebar, automatic schema refreshing, command-line database switching, and an automated frontend build. DataChat supports both cloud LLMs like Gemini (gemini-2.5-flash) and local LLMs via Ollama (mistral-small3.1:latest), allowing users to keep data entirely local. It is MIT licensed and currently supports PostgreSQL and BigQuery.
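To illustrate the kind of data a schema sidebar needs, here is a minimal sketch in Python. It is not DataChat's actual code: the function name `build_schema_tree` and the sample rows are hypothetical, and the query assumes PostgreSQL's standard `information_schema.columns` view. Re-running the query on a timer or on demand is one way an automatic refresh could work.

```python
from collections import defaultdict

# Standard information_schema query for PostgreSQL (assumption: this is one
# plausible source for a schema explorer; running it needs a live connection).
SCHEMA_QUERY = """
SELECT table_schema, table_name, column_name, data_type
FROM information_schema.columns
WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
ORDER BY table_schema, table_name, ordinal_position
"""


def build_schema_tree(rows):
    """Group (schema, table, column, type) rows into
    {schema: {table: [(column, type), ...]}} for display in a sidebar."""
    tree = defaultdict(lambda: defaultdict(list))
    for schema, table, column, data_type in rows:
        tree[schema][table].append((column, data_type))
    # Convert to plain dicts so the result is easy to serialize or inspect.
    return {schema: dict(tables) for schema, tables in tree.items()}


# Hypothetical rows, shaped like the result of SCHEMA_QUERY.
sample_rows = [
    ("public", "orders", "id", "integer"),
    ("public", "orders", "total", "numeric"),
    ("public", "customers", "id", "integer"),
]
schema_tree = build_schema_tree(sample_rows)
```

The grouping step is kept separate from the database call so that the same function could be fed rows from PostgreSQL or from an equivalent BigQuery metadata query.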

Key takeaway

For AI Engineers or MLOps teams seeking a self-hosted text-to-SQL solution, DataChat removes the main friction points of the archived Vanna.ai codebase. Consider deploying it to give non-SQL users natural language access to your data: the frontend build and schema refresh are automated, a schema explorer is built in, and local LLM support via Ollama keeps data on your own infrastructure.

Key insights

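DataChat turns an archived text-to-SQL project into a functional, self-hosted tool by fixing specific usability and technical issues, such as the missing schema explorer, manual schema refreshes, and serialization crashes on complex data types.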

Method

The author forked the Vanna.ai 2.0.2 codebase, implemented a schema sidebar, automated schema refreshes and frontend builds, and patched serialization issues to create DataChat.
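The summary does not say which data types caused the serialization crashes or how they were patched. As a hedged sketch of the general technique, the Python example below passes a fallback function to `json.dumps` so that values commonly returned by database drivers (`Decimal`, dates, `UUID`, raw bytes) are converted instead of raising `TypeError`. The function names `to_jsonable` and `dump_rows` are hypothetical and not taken from DataChat.

```python
import json
from datetime import date, datetime
from decimal import Decimal
from uuid import UUID


def to_jsonable(value):
    """Fallback for json.dumps: convert types the json module cannot encode."""
    if isinstance(value, Decimal):
        return str(value)  # a string preserves exact precision
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    if isinstance(value, UUID):
        return str(value)
    if isinstance(value, (bytes, bytearray, memoryview)):
        return bytes(value).hex()
    # Anything else is still an error, so unexpected types are not hidden.
    raise TypeError(f"Cannot serialize value of type {type(value).__name__}")


def dump_rows(rows):
    """Serialize query result rows (a list of dicts) to a JSON string."""
    return json.dumps(rows, default=to_jsonable)


# Hypothetical result row containing types that plain json.dumps rejects.
row = {
    "amount": Decimal("19.99"),
    "created": datetime(2024, 1, 2, 3, 4, 5),
    "id": UUID("12345678-1234-5678-1234-567812345678"),
    "blob": b"\x00\xff",
}
encoded = dump_rows([row])
```

`json.dumps` only calls the `default` function for objects it cannot encode natively, so ordinary strings, numbers, and nulls pass through unchanged.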

Best for: AI Engineer, MLOps Engineer, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Towards AI - Medium.