Larch: Learned Query Optimization for Semantic Predicates

Source: Machine Learning · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics) · Depth: Expert, quick

Summary

Larch is a framework for optimizing the execution of semantic filters in AI SQL queries, targeting the high inference costs and latencies of Large Language Model (LLM)-enabled database systems. Semantic operators are often treated as black boxes. Larch instead builds on two observations: the high latency of semantic operators leaves significant room for optimization at runtime, and the semantic embeddings that accompany unstructured data provide an efficient signal to exploit. The framework comes in two variants. Larch-A2C encodes filter expression trees with an embedding-augmented Gated Graph Neural Network (GGNN) and models the choice of evaluation order as a Markov decision process (MDP). Larch-Sel uses a supervised learning model to predict selectivities, then applies dynamic programming to find the optimal evaluation order. Across diverse real-world and synthetic workloads, both variants consistently outperform existing semantic filter optimization techniques such as Palimpzest and Quest, reducing total token cost overhead by 3x-19x.

Key takeaway

For Machine Learning Engineers optimizing AI SQL queries, Larch shows that the evaluation order of semantic filters is worth optimizing in its own right. Consider adopting learned optimization techniques to cut LLM token costs and inference latencies. By exploiting semantic embeddings and predictive models, this approach makes complex queries over unstructured data more efficient and scalable.

Key insights

Larch optimizes AI SQL semantic filters by learning evaluation order and predicting selectivities, significantly reducing token costs.
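
To see why evaluation order affects token cost, consider a conjunction of semantic filters evaluated with short-circuiting. The formula below is a standard textbook cost model, not one taken from the original article: it assumes independent filters, where c_i is a hypothetical per-row token cost, s_i is the fraction of rows that pass filter i, and π is the evaluation order.

```latex
\mathbb{E}[\mathrm{cost}(\pi)] \;=\; \sum_{k=1}^{n} c_{\pi(k)} \prod_{j=1}^{k-1} s_{\pi(j)}

% Illustrative numbers (assumed): c = (100, 100) tokens, s = (0.9, 0.1)
% order (1, 2): 100 + 0.9 \cdot 100 = 190 tokens per row
% order (2, 1): 100 + 0.1 \cdot 100 = 110 tokens per row
```

Under these assumptions, running the more selective filter first nearly halves the expected cost, which is why accurate selectivity estimates matter.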

Principles

Method

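Larch-A2C encodes filter expression trees with a Gated Graph Neural Network (GGNN) and models evaluation order as a Markov decision process (MDP). Larch-Sel predicts selectivities via supervised learning, then applies dynamic programming to choose the evaluation order.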
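
The second stage of a Larch-Sel-style pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a flat conjunction of independent filters with short-circuit evaluation, and the function name `best_order`, the per-row token costs, and the selectivities are all hypothetical. In a real system the selectivities would come from the learned predictor.

```python
from functools import lru_cache
from typing import List, Sequence, Tuple


def best_order(
    costs: Sequence[float], selectivities: Sequence[float]
) -> Tuple[float, List[int]]:
    """Return (expected cost per row, evaluation order) for ANDed filters.

    Assumptions (not from the source): filters are independent, evaluation
    short-circuits on the first failing filter, costs[i] is the token cost
    of running filter i on one row, and selectivities[i] is its pass rate.
    """
    n = len(costs)
    if n != len(selectivities):
        raise ValueError("costs and selectivities must have the same length")
    full = (1 << n) - 1

    @lru_cache(maxsize=None)
    def solve(mask: int) -> Tuple[float, Tuple[int, ...]]:
        # mask marks the filters a row has already passed.
        if mask == full:
            return 0.0, ()
        best_cost, best_tail = float("inf"), ()
        for i in range(n):
            if mask & (1 << i):
                continue
            tail_cost, tail = solve(mask | (1 << i))
            # Pay for filter i now; pay for the rest only if the row passes.
            cost = costs[i] + selectivities[i] * tail_cost
            if cost < best_cost:
                best_cost, best_tail = cost, (i,) + tail
        return best_cost, best_tail

    cost, order = solve(0)
    return cost, list(order)


if __name__ == "__main__":
    # Hypothetical token costs and predicted selectivities for three filters.
    print(best_order([120, 80, 200], [0.9, 0.5, 0.1]))
```

With these illustrative inputs the sketch selects the order [1, 2, 0] at an expected cost of about 186 tokens per row. The subset recursion visits 2^n states, so it only suits small numbers of filters; under the independence assumption a simple sort by cost / (1 - selectivity) gives the same order, but the dynamic-programming form is easier to extend to richer cost models.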

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.