How Airtable Built the Search Layer Behind Their AI Features

2025-12-15 · Source: ByteByteGo Newsletter · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, Software Development & Engineering · Depth: Advanced, long

Summary

Airtable developed a semantic search layer for its AI features, Omni and linked record recommendations, enabling natural language queries over customer data. The architecture, built on Milvus, was shaped primarily by one observation: about 75% of customer databases (bases) are idle in any given week. Key design priorities were 500-millisecond query latency at the 99th percentile, high-throughput writes, horizontal scalability across millions of isolated bases, and self-hosting. Airtable adopted a "one partition per base" strategy within Milvus and addressed the resulting performance issues with hierarchical capping: 400 collections per cluster, each with up to 1,000 partitions. For vector indexing, it selected HNSW for its high recall and predictable performance, accepting that the index must be held in memory. Airtable mitigated that memory cost by using Milvus's ability to offload idle "cold" partitions, the roughly 75% of bases noted above, from memory to storage. Recovery re-embeds data from the source using an existing asynchronous pipeline.

Key takeaway

For AI Architects designing vector search systems for multi-tenant applications: prioritize understanding your specific data access patterns and isolation needs over generic algorithm choices. Your system's performance and cost efficiency will hinge on how you partition tenants and how you handle idle data. Consider hierarchical capping for scalability, and reuse existing data pipelines for recovery rather than building a separate backup path.

Key insights

Data access patterns, not algorithms, dictate optimal system architecture for vector search.

Principles

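Distributed systems with large flat namespaces often need hierarchical caps (for example, a limit on collections per cluster and on partitions per collection) to keep performance predictable.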
Vector search involves a tradeoff between memory, latency, and recall.
Measure actual usage patterns before optimizing for cost.
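
The memory side of that tradeoff can be made concrete with a rough estimate. The sketch below uses a common rule of thumb for HNSW graphs: about 4 bytes per float32 dimension, plus 4-byte neighbor ids for up to 2*M links on the base layer. The vector count, dimension, and M value are hypothetical and are not figures from Airtable; the sketch also assumes idle bases hold a proportional share of the vectors, and real indexes carry additional overhead.

```python
def hnsw_bytes_per_vector(dim: int, m: int) -> int:
    """Rough per-vector footprint: float32 values plus base-layer links.

    Rule of thumb only; upper graph layers and allocator overhead are ignored.
    """
    vector_bytes = dim * 4   # float32 components
    link_bytes = m * 2 * 4   # up to 2*M neighbor ids of 4 bytes each at layer 0
    return vector_bytes + link_bytes


def resident_bytes(num_vectors: int, dim: int, m: int, resident_fraction: float) -> int:
    """Estimated memory when only a fraction of the vectors stays loaded."""
    if not 0.0 <= resident_fraction <= 1.0:
        raise ValueError("resident_fraction must be between 0 and 1")
    return int(num_vectors * resident_fraction * hnsw_bytes_per_vector(dim, m))


# Hypothetical workload: 10 million vectors, 1024 dimensions, M = 16.
full = resident_bytes(10_000_000, 1024, 16, 1.0)
hot_only = resident_bytes(10_000_000, 1024, 16, 0.25)  # 75% idle and offloaded
print(f"all loaded: {full / 1e9:.1f} GB, hot only: {hot_only / 1e9:.1f} GB")
```

Measuring the idle fraction first is what makes the second number meaningful: the saving scales directly with how much data is actually cold.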

Method

Airtable implemented a hierarchical capping strategy: each Milvus cluster is capped at 400 collections, and each collection at 1,000 partitions, with one partition per customer base. The caps keep performance predictable as the system scales out to millions of isolated bases.
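
A minimal sketch of how such a two-level cap could map bases to locations, assuming bases are placed sequentially into the next free slot. The `PlacementDirectory` class and its naming are hypothetical illustrations, not Airtable's implementation or a Milvus API; only the two cap values come from the summary above.

```python
from dataclasses import dataclass, field

COLLECTIONS_PER_CLUSTER = 400        # cap from the summary
PARTITIONS_PER_COLLECTION = 1_000    # cap from the summary
BASES_PER_CLUSTER = COLLECTIONS_PER_CLUSTER * PARTITIONS_PER_COLLECTION


@dataclass(frozen=True)
class Placement:
    cluster: int
    collection: int
    partition: int


def placement_for_slot(slot: int) -> Placement:
    """Map a zero-based slot number to (cluster, collection, partition)."""
    if slot < 0:
        raise ValueError("slot must be non-negative")
    cluster, within_cluster = divmod(slot, BASES_PER_CLUSTER)
    collection, partition = divmod(within_cluster, PARTITIONS_PER_COLLECTION)
    return Placement(cluster, collection, partition)


@dataclass
class PlacementDirectory:
    """Hypothetical directory: one partition per base, assigned in order."""
    _by_base: dict[str, Placement] = field(default_factory=dict)

    def place(self, base_id: str) -> Placement:
        # A base keeps its partition once assigned.
        if base_id not in self._by_base:
            self._by_base[base_id] = placement_for_slot(len(self._by_base))
        return self._by_base[base_id]


directory = PlacementDirectory()
print(directory.place("base_a"))
print(directory.place("base_b"))
```

Under these caps one cluster holds at most 400,000 bases, so reaching millions of bases means adding clusters rather than growing any single collection or cluster past its limit.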

In practice

Use HNSW for high-recall, low-latency vector search.
Offload idle partitions to storage to reduce memory costs.
Design recovery around existing data pipelines.
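
The offloading advice above depends on knowing which partitions are idle. Below is a minimal sketch of that decision, assuming a last-access timestamp is tracked per partition and that "idle" means no access for seven days. The function name, the threshold, and the sample data are hypothetical, and the actual load and release calls against the vector store are deliberately left out.

```python
from datetime import datetime, timedelta

# Assumption: a partition counts as cold after seven days without access.
IDLE_THRESHOLD = timedelta(days=7)


def split_hot_cold(
    last_access: dict[str, datetime],
    now: datetime,
    threshold: timedelta = IDLE_THRESHOLD,
) -> tuple[list[str], list[str]]:
    """Return (hot, cold) partition names based on time since last access."""
    hot: list[str] = []
    cold: list[str] = []
    for partition, seen in last_access.items():
        if now - seen >= threshold:
            cold.append(partition)   # candidate for offloading to storage
        else:
            hot.append(partition)    # keep loaded in memory
    return sorted(hot), sorted(cold)


# Made-up sample data for illustration.
now = datetime(2025, 12, 15)
sample = {
    "base_a": now - timedelta(hours=2),
    "base_b": now - timedelta(days=30),
    "base_c": now - timedelta(days=8),
}
hot, cold = split_hot_cold(sample, now)
print("keep in memory:", hot)
print("offload:", cold)
```

A cold partition would be loaded again on its next query, so the threshold trades memory savings against the occasional slower first request.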

Topics

Vector Databases
Milvus
Semantic Search
Multi-tenancy
HNSW Index
Data Partitioning
AI Architecture

Best for: AI Architect, AI Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by ByteByteGo Newsletter.