Querying an astronomical database using large language models: the ALeRCE text-to-SQL system

2026-06-16 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Astrophysical Data Systems · Depth: Advanced, quick

Summary

The ALeRCE text-to-SQL system uses large language models (LLMs) and in-context learning to enable natural language querying of the database of ALeRCE, a community broker for the Zwicky Transient Facility and the Vera C. Rubin Observatory. To support development and evaluation, the authors constructed a dataset of 110 natural language/SQL pairs. The system employs a four-module, step-by-step generation framework (schema linking, query classification, prompt decomposition, and self-correction) that consistently outperforms a direct-inference baseline. An evaluation of thirteen LLMs showed that the self-correction module significantly reduces execution errors. For Claude Opus 4.6, perfect-match (PM) performance on row (column) identifiers reached 0.97 (0.94) for simple queries, compared with 0.44 (0.72) for medium queries and 0.59 (0.49) for hard queries. Top-performing LLMs included Claude Opus 4.6, Gemini 2.5 Pro, Gemini 3 Flash, and GPT-5.2-Codex.
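This summary does not spell out how the PM score is computed. As a rough illustration only, the sketch below assumes that a query counts as a perfect match when the set of identifiers returned by the generated SQL equals the set returned by the reference SQL, and that the score is the fraction of such queries. The function name `perfect_match_rate` and its inputs are hypothetical, not taken from the original article.

```python
from typing import Hashable, Iterable, Sequence


def perfect_match_rate(
    predicted: Sequence[Iterable[Hashable]],
    gold: Sequence[Iterable[Hashable]],
) -> float:
    """Fraction of queries whose predicted identifier set equals the gold set.

    ASSUMPTION: "perfect match" is read here as exact set equality of the
    returned identifiers (row ids or column names). The original article
    may define the metric differently.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same queries")
    if not gold:
        return 0.0  # no queries to score
    # Order and duplicates are ignored by comparing sets.
    hits = sum(set(p) == set(g) for p, g in zip(predicted, gold))
    return hits / len(gold)
```

The same function can be applied once to row identifiers and once to column names, which would yield a pair of scores in the row (column) form reported above.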

Key takeaway

AI engineers implementing natural language interfaces for complex databases such as astronomical archives should prioritize a multi-module, step-by-step generation framework over direct inference. A self-correction module is critical for reducing execution errors in generated SQL queries. When selecting an LLM, consider the top performers in this evaluation (Claude Opus 4.6, Gemini 2.5 Pro, Gemini 3 Flash, and GPT-5.2-Codex), and check how each handles queries of varying complexity.

Key insights

The ALeRCE text-to-SQL system uses LLMs and a step-by-step framework to query astronomical databases via natural language.

Principles

Step-by-step generation outperforms direct inference.
Self-correction modules reduce execution errors.
Query complexity impacts text-to-SQL performance.

Method

A four-module framework (schema linking, query classification, prompt decomposition, and self-correction) generates executable SQL from natural language queries.
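To make the flow of the four modules concrete, here is a minimal sketch of how such a pipeline could be wired together. It is not the ALeRCE implementation: the prompt wording, the `TASK:` prefixes, the function names, the `LLM` callable type, and the use of SQLite as the execution backend are all assumptions made for illustration.

```python
import sqlite3
from typing import Callable, Dict, List

# ASSUMPTION: the LLM is any callable that maps a prompt string to a reply.
LLM = Callable[[str], str]
Schema = Dict[str, List[str]]  # table name -> column names


def link_schema(question: str, schema: Schema, llm: LLM) -> Schema:
    """Schema linking: ask the model which tables the question needs."""
    reply = llm(
        f"TASK: schema_linking\nTables: {', '.join(schema)}\nQuestion: {question}"
    )
    picked = [name.strip() for name in reply.split(",")]
    return {name: schema[name] for name in picked if name in schema}


def classify(question: str, llm: LLM) -> str:
    """Query classification: label the question simple, medium, or hard."""
    reply = llm(f"TASK: classify\nQuestion: {question}").strip().lower()
    return reply if reply in {"simple", "medium", "hard"} else "hard"


def build_prompt(question: str, tables: Schema, difficulty: str) -> str:
    """Prompt decomposition: harder questions ask for intermediate steps."""
    lines = ["TASK: generate"]
    lines += [f"Table {t}({', '.join(cols)})" for t, cols in tables.items()]
    if difficulty != "simple":
        lines.append("Plan the sub-queries first, then write the final SQL.")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


def generate_sql(
    question: str,
    schema: Schema,
    conn: sqlite3.Connection,
    llm: LLM,
    max_repairs: int = 2,
) -> str:
    """Run the four modules in order and return the final SQL string."""
    tables = link_schema(question, schema, llm)
    difficulty = classify(question, llm)
    prompt = build_prompt(question, tables, difficulty)
    sql = llm(prompt)
    # Self-correction: execute the query and feed any error back to the model.
    for _ in range(max_repairs):
        try:
            conn.execute(sql).fetchall()
            return sql  # executed cleanly
        except sqlite3.Error as exc:
            sql = llm(f"TASK: repair\nSQL: {sql}\nError: {exc}\n{prompt}")
    return sql  # last repair attempt, returned without re-checking
```

The self-correction loop is the part that targets execution errors directly: a query that fails against the database is returned to the model together with the error message, rather than being handed to the user as is.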

In practice

Query astronomical databases with natural language.
Evaluate LLMs for text-to-SQL tasks.
Improve query accuracy via self-correction.

Topics

Text-to-SQL
Large Language Models
Astronomical Databases
ALeRCE System
Prompt Engineering
Self-correction

Best for: NLP Engineer, AI Scientist, AI Engineer, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.