Semantic Layers for Reliable LLM-Powered Data Analytics: A Paired Benchmark of Accuracy and Hallucination Across Three Frontier Models
Summary
A study benchmarked three frontier large language models (LLMs), Claude Opus 4.7, Claude Sonnet 4.6, and GPT-5.4, on 100 natural-language questions over the Cleaned Contoso Retail Dataset in ClickHouse. It tested the effect of giving each model explicit business semantics, in the form of a 4 KB hand-authored markdown document, in addition to the database schema. Supplying this semantic layer raised accuracy by 17 to 23 percentage points for every model. With the semantic document, the three models were statistically indistinguishable at 67.7-68.7% accuracy; without it, they were again statistically indistinguishable at 45.5-50.5%. Explicit business semantics, rather than model choice within a tier, therefore account for the significant variance in performance, because they change the nature of the task the LLM is asked to solve.
Key takeaway
AI architects and NLP engineers building natural-language interfaces for analytical databases should focus first on creating robust semantic layers. Providing explicit business semantics, such as a markdown document detailing measures and conventions, does far more to improve LLM accuracy and reduce hallucinations than selecting a specific frontier model. Prioritize comprehensive semantic documentation to get reliable and accurate data analytics.
Key insights
Explicit business semantics significantly improve LLM accuracy and reduce hallucination in natural-language data querying.
Principles
- Schema alone is insufficient for reliable LLM data querying.
- Semantic layers suppress text-to-SQL errors structurally.
- Model choice is less critical than semantic context.
Method
Benchmarked three LLMs on 100 natural-language questions over a retail dataset, using a paired single-shot protocol to compare performance with and without a 4 KB semantic markdown document.
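A paired comparison like the one described above can be sketched in a few lines of Python. This summary does not state how answers were scored or which significance test the study used, so the exact McNemar test here is an assumption (a common choice for paired right/wrong outcomes), and the function name and toy data are purely illustrative.

```python
from math import comb


def paired_summary(baseline, with_semantics):
    """Compare per-question correctness flags from two runs over the same questions.

    baseline:        list of bools, schema-only condition
    with_semantics:  list of bools, schema + semantic document condition
    Both lists must be in the same question order (that is what makes it paired).
    """
    if len(baseline) != len(with_semantics):
        raise ValueError("paired runs must cover the same questions")
    n = len(baseline)

    # Discordant pairs: questions where exactly one condition was correct.
    gained = sum(1 for b, s in zip(baseline, with_semantics) if not b and s)
    lost = sum(1 for b, s in zip(baseline, with_semantics) if b and not s)

    acc_base = sum(baseline) / n
    acc_sem = sum(with_semantics) / n

    # Exact two-sided McNemar test (assumed, not taken from the study):
    # a binomial test with p = 0.5 on the discordant pairs only.
    m = gained + lost
    k = min(gained, lost)
    if m == 0:
        p_value = 1.0
    else:
        tail = sum(comb(m, i) for i in range(k + 1)) / 2**m
        p_value = min(1.0, 2 * tail)

    return {
        "acc_base": acc_base,
        "acc_sem": acc_sem,
        "delta_pp": round((acc_sem - acc_base) * 100, 1),  # percentage points
        "gained": gained,
        "lost": lost,
        "p_value": p_value,
    }


# Toy data for six questions; not results from the study.
toy_baseline = [True, False, False, False, True, False]
toy_semantic = [True, True, True, False, True, True]
toy_result = paired_summary(toy_baseline, toy_semantic)
```

Because each question is answered under both conditions, only the questions whose outcome changes carry information about the effect of the semantic document, which is why the test looks at the discordant pairs rather than at the two accuracy figures separately.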
In practice
- Augment database schemas with semantic documentation.
- Use markdown for encoding business rules and conventions.
- Prioritize semantic context over model selection.
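As a rough illustration of these recommendations, the sketch below assembles a text-to-SQL prompt with and without a markdown semantic document. Everything in it is hypothetical: the table, the column names, the business rules, and the prompt wording are made up for the example and are not taken from the study or from the Contoso dataset.

```python
# Hypothetical schema excerpt (illustrative only, not the benchmark's schema).
SCHEMA_DDL = """
CREATE TABLE sales (
    order_date Date,
    store_id UInt32,
    quantity UInt32,
    unit_price Decimal(18, 2),
    unit_cost Decimal(18, 2)
) ENGINE = MergeTree ORDER BY order_date;
"""

# Hypothetical semantic document: measures and conventions written as markdown.
SEMANTIC_DOC = """
# Business semantics (example)
## Measures
- Revenue = sum(quantity * unit_price)
- Margin = sum(quantity * (unit_price - unit_cost))
## Conventions
- "Last year" means the previous full calendar year.
- If a question cannot be answered from the schema, say so instead of guessing.
"""

MAX_DOC_BYTES = 4096  # assumed budget, mirroring the 4 KB document size


def build_prompt(question, schema_ddl, semantic_doc=None):
    """Build a single-shot prompt; the semantic section is added only if provided."""
    parts = [
        "Translate the question into one ClickHouse SQL query.",
        "## Schema",
        schema_ddl.strip(),
    ]
    if semantic_doc:
        if len(semantic_doc.encode("utf-8")) > MAX_DOC_BYTES:
            raise ValueError("semantic document exceeds the size budget")
        parts += ["## Business semantics", semantic_doc.strip()]
    parts += ["## Question", question.strip()]
    return "\n\n".join(parts)


question = "What was the margin by store last year?"
schema_only_prompt = build_prompt(question, SCHEMA_DDL)
augmented_prompt = build_prompt(question, SCHEMA_DDL, SEMANTIC_DOC)
```

Keeping the semantic document as a separate, optional argument makes it easy to run the same question under both conditions, and keeping it in markdown means domain experts can review and edit the business rules without touching code.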
Topics
- LLM-Powered Data Analytics
- Semantic Layers
- Text-to-SQL
- Hallucination Mitigation
- Model Benchmarking
Best for: AI Architect, AI Engineer, NLP Engineer, AI Scientist, Machine Learning Engineer, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.