How Instacart Built a Search for Billions of Products

· Source: ByteByteGo Newsletter · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Data Science & Analytics · Depth: Intermediate, long

Summary

Instacart's search infrastructure evolved from a dual-system design, with Elasticsearch handling keyword search and a custom FAISS service handling semantic search, to a unified Postgres-based solution. The company first ran into trouble with Elasticsearch's denormalized data model: rapidly changing grocery item data generated billions of writes per day, which caused catastrophic indexing loads. Instacart migrated keyword search to Postgres, using GIN indexes and ts_rank, and the migration reduced the write workload by a factor of ten. It later added semantic search through the pgvector extension, co-locating vectors with relational data. The consolidation eliminated parallel network calls and application-layer joins, and it yielded a 2x speedup and a 6% reduction in zero-result searches. Recall also improved, because inventory data can be pre-filtered before the semantic search runs.
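The effect of pre-filtering on recall can be illustrated with a small, self-contained sketch. It is not Instacart's implementation: the catalog, the two-dimensional embeddings, and the function names are all hypothetical, and a brute-force cosine ranking stands in for a real vector index. It only shows why filtering on inventory before the nearest-neighbour step can return results where filtering afterwards returns none.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical catalog rows: (item_id, in_stock, embedding).
ITEMS = [
    ("oat_milk_a", False, (0.99, 0.10)),
    ("oat_milk_b", False, (0.98, 0.15)),
    ("oat_milk_c", False, (0.97, 0.20)),
    ("almond_milk", True, (0.90, 0.40)),
    ("soy_milk", True, (0.85, 0.50)),
    ("bread", True, (0.10, 0.99)),
]

def top_k(items, query, k):
    """Brute-force nearest neighbours by cosine similarity."""
    return sorted(items, key=lambda it: cosine(it[2], query), reverse=True)[:k]

def post_filter(query, k):
    """Take the k nearest items first, then drop the out-of-stock ones."""
    return [it[0] for it in top_k(ITEMS, query, k) if it[1]]

def pre_filter(query, k):
    """Restrict to in-stock items first, then take the k nearest."""
    available = [it for it in ITEMS if it[1]]
    return [it[0] for it in top_k(available, query, k)]

QUERY = (1.0, 0.0)  # hypothetical embedding of the shopper's query
post_results = post_filter(QUERY, k=3)
pre_results = pre_filter(QUERY, k=3)
print(post_results)
print(pre_results)
```

In this toy data the three closest items are all out of stock, so the post-filtered search comes back empty while the pre-filtered search still returns three available items.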

Key takeaway

AI architects and machine learning engineers who design search systems with high write volumes and dynamic data should consider consolidating search infrastructure within a relational database such as Postgres, using extensions such as pgvector. This approach can reduce operational overhead, lower query latency by bringing compute closer to the data, and improve search quality through pre-filtering, all of which affect user experience and revenue.

Key insights

Consolidating search compute with data improves performance and simplifies complex, high-write workloads.
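Why a denormalized model inflates write volume can be shown with a toy count. The sketch below is an assumption about the general mechanism, not a reconstruction of Instacart's data model: the store list, product fields, and function names are hypothetical, and the numbers are arbitrary rather than a derivation of the reported tenfold reduction.

```python
# Hypothetical toy catalog: one product sold in four stores.
STORES = ["s1", "s2", "s3", "s4"]

def denormalized_writes_for_rename(docs, product_id, new_name):
    """Each (product, store) document carries its own copy of the name,
    so a rename has to rewrite every matching document."""
    writes = 0
    for (pid, _store), doc in docs.items():
        if pid == product_id:
            doc["name"] = new_name
            writes += 1
    return writes

def normalized_writes_for_rename(products, product_id, new_name):
    """The name lives in a single product row; per-store listings only
    reference the product, so a rename is one write."""
    products[product_id]["name"] = new_name
    return 1

# Denormalized: product attributes are copied into every store document.
docs = {("p1", s): {"name": "Oat Milk", "price": 4.99} for s in STORES}

# Normalized: product attributes and per-store listings are kept apart.
products = {"p1": {"name": "Oat Milk"}}
listings = {("p1", s): {"price": 4.99} for s in STORES}

denorm_writes = denormalized_writes_for_rename(docs, "p1", "Oat Milk 1L")
norm_writes = normalized_writes_for_rename(products, "p1", "Oat Milk 1L")
print(denorm_writes, norm_writes)  # prints "4 1"
```

The gap grows with the number of documents that share an attribute, which is the situation a catalog with frequently changing item data tends to create.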

Method

Instacart migrated from Elasticsearch to Postgres for keyword search, then integrated semantic search via pgvector, enabling real-time attribute filtering and unified data management.
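A rough sketch of what such a setup could look like is given below as SQL held in Python strings. The table name, column names, toy vector dimension, and helper function are hypothetical, and the `%(name)s` placeholders assume a psycopg-style driver. The SQL features themselves (GIN expression indexes, `to_tsvector`, `plainto_tsquery`, `ts_rank`, and pgvector's `vector` type and `<=>` cosine-distance operator) are standard Postgres and pgvector. Index choice for the vector column is left out, and nothing here is confirmed as Instacart's actual schema.

```python
# Hypothetical schema: relational attributes and the embedding share one row.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE items (
    item_id   bigint PRIMARY KEY,
    store_id  bigint NOT NULL,
    name      text NOT NULL,
    in_stock  boolean NOT NULL,
    embedding vector(3)  -- toy dimension, for illustration only
);

-- GIN expression index backing full-text keyword search on the name.
CREATE INDEX items_name_fts
    ON items USING GIN (to_tsvector('english', name));
"""

# Keyword search: match with @@ and rank the matches with ts_rank.
KEYWORD_SQL = """
SELECT item_id, name,
       ts_rank(to_tsvector('english', name),
               plainto_tsquery('english', %(q)s)) AS score
FROM items
WHERE store_id = %(store_id)s
  AND in_stock
  AND to_tsvector('english', name) @@ plainto_tsquery('english', %(q)s)
ORDER BY score DESC
LIMIT %(k)s;
"""

# Semantic search: store and inventory filters sit in the same statement
# as the vector ordering, so no application-layer join is needed.
SEMANTIC_SQL = """
SELECT item_id, name
FROM items
WHERE store_id = %(store_id)s
  AND in_stock
ORDER BY embedding <=> %(query_vec)s::vector
LIMIT %(k)s;
"""

def to_vector_literal(vec):
    """Format a sequence of numbers as a pgvector text literal, e.g. '[1.0,2.0]'."""
    return "[" + ",".join(str(float(x)) for x in vec) + "]"

def semantic_params(store_id, query_vec, k):
    """Build the parameter dict expected by SEMANTIC_SQL."""
    return {"store_id": store_id, "query_vec": to_vector_literal(query_vec), "k": k}

params = semantic_params(store_id=42, query_vec=[0.1, 0.2, 0.3], k=10)
print(params)
```

With a live connection, the statements would be run through the driver's usual execute call, for example `cursor.execute(SEMANTIC_SQL, params)`. Because the filters and the vector ordering live in one query against one table, the database can apply the inventory filter alongside the similarity ranking.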

Best for: AI Architect, Machine Learning Engineer, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by ByteByteGo Newsletter.