Larch: Learned Query Optimization for Semantic Predicates
Summary
Larch is a framework for optimizing the execution of semantic filters in AI SQL queries, targeting the high inference cost and latency of Large Language Model (LLM)-enabled database systems. Semantic operators are often treated as black boxes, which leaves little for a conventional optimizer to work with. Larch builds on two observations: semantic operators are slow enough that there is ample room for optimization at runtime, and the semantic embeddings that accompany unstructured data can be used efficiently. The framework comes in two variants. Larch-A2C encodes filter expression trees with an embedding-augmented Gated Graph Neural Network and models the choice of evaluation order as a Markov decision process. Larch-Sel uses a supervised learning model to predict selectivities, then applies dynamic programming to find the optimal evaluation order. Across diverse real-world and synthetic workloads, both variants consistently outperform existing semantic filter optimization techniques such as Palimpzest and Quest, achieving a 3x-19x reduction in total token cost overhead.
Key takeaway
For Machine Learning Engineers optimizing AI SQL queries, Larch shows that the evaluation order of semantic filters is worth optimizing in its own right. Consider learned optimization techniques, driven by semantic embeddings and predictive models, to reduce LLM token costs and inference latency when querying unstructured data at scale.
Key insights
Larch optimizes AI SQL semantic filters by learning evaluation order and predicting selectivities, significantly reducing token costs.
Principles
- High latency in semantic operators allows for complex runtime optimization.
- Semantic embeddings enable efficient comparisons for AI_FILTER prompts.
- Treating semantic operators as black boxes hinders traditional optimization.
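To illustrate why embedding comparisons are cheap next to an LLM call, here is a minimal sketch that scores rows against a filter prompt by cosine similarity. It is not Larch's model: the summary only states that embeddings augment the network input, and the function names `cosine_similarity` and `rank_rows_by_prompt`, as well as the toy two-dimensional vectors, are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_rows_by_prompt(prompt_vec, row_vecs):
    """Return row indices sorted from most to least similar to the prompt."""
    scores = [cosine_similarity(prompt_vec, v) for v in row_vecs]
    return sorted(range(len(row_vecs)), key=lambda i: scores[i], reverse=True)


# Hypothetical embeddings: one filter prompt and three rows.
prompt = [1.0, 0.0]
rows = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ranking = rank_rows_by_prompt(prompt, rows)
print(ranking)
```

Each comparison is a handful of arithmetic operations on vectors that already exist, which is why such signals can be computed for every row without touching the LLM.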
Method
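Larch-A2C encodes filter expression trees with an embedding-augmented Gated Graph Neural Network (GGNN) and models the choice of evaluation order as a Markov decision process (MDP). Larch-Sel predicts filter selectivities with a supervised model, then applies dynamic programming to pick the evaluation order.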
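The "predict selectivities, then run dynamic programming" idea can be sketched for the simplest case, a conjunction of filters with short-circuit evaluation. This is an illustrative sketch, not Larch-Sel's actual algorithm. It assumes the predicates are independent, that each has a known per-row token cost, and that selectivities are already predicted; `expected_cost` and `best_and_order` are hypothetical names.

```python
def expected_cost(order, selectivity, cost):
    """Expected per-row cost of evaluating an AND chain in the given order."""
    total, reach = 0.0, 1.0  # reach = probability the row gets this far
    for i in order:
        total += reach * cost[i]
        reach *= selectivity[i]
    return total


def best_and_order(selectivity, cost):
    """Subset DP: cheapest expected order for a conjunction of predicates.

    Assumes independent predicates, so the chance that a row passes a set
    of predicates is the product of their selectivities.
    """
    n = len(selectivity)
    full = 1 << n

    # pass_prob[mask] = probability a row passes every predicate in mask.
    pass_prob = [1.0] * full
    for mask in range(1, full):
        low = (mask & -mask).bit_length() - 1
        pass_prob[mask] = pass_prob[mask & (mask - 1)] * selectivity[low]

    # best[mask] = minimum expected cost to evaluate exactly the set mask.
    best = [float("inf")] * full
    last = [-1] * full
    best[0] = 0.0
    for mask in range(1, full):
        for i in range(n):
            if mask & (1 << i):
                prev = mask ^ (1 << i)
                cand = best[prev] + pass_prob[prev] * cost[i]
                if cand < best[mask]:
                    best[mask], last[mask] = cand, i

    # Walk back through the recorded choices to recover the order.
    order, mask = [], full - 1
    while mask:
        i = last[mask]
        order.append(i)
        mask ^= 1 << i
    order.reverse()
    return order, best[full - 1]


# Hypothetical predicted selectivities and per-row token costs.
sel = [0.9, 0.1, 0.5]
tok = [100.0, 100.0, 100.0]
order, total = best_and_order(sel, tok)
print(order, total)
```

With equal costs the most selective filter runs first, giving the order `[1, 2, 0]` at roughly 115 expected tokens per row, versus roughly 199 for the written order. Under the independence assumption a simple sort would give the same answer. The subset formulation is shown because it extends to cases where the pass probability of a set is not a plain product.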
In practice
- Optimize AI SQL queries with semantic filters.
- Reduce LLM token costs in database operations.
- Improve performance of unstructured data queries.
Topics
- Learned Query Optimization
- Semantic Filters
- AI SQL Queries
- Large Language Models
- Token Cost Reduction
- Gated Graph Neural Networks
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.