How DoorDash Optimized Item Availability at Scale Using Elasticsearch

· Source: HackerNoon · Field: Technology & Digital — Software Development & Engineering, Cloud Computing & IT Infrastructure · Depth: Intermediate, medium

Summary

DoorDash sped up item availability checks for its high-traffic homepage by replacing live menu service calls with an Elasticsearch-based index. Querying the menu service directly produced high latency (P99 above 300 ms) because of request fan-out and frequent cache misses on fast-changing availability data. The team then indexed real-time menu updates into Elasticsearch and explored three indexing strategies: nested documents, which yielded a P99 of 600 ms because of expensive joins; encoded terms, which cut P99 to 350 ms but increased index size by 6x; and integer range fields backed by BKD trees, which reached a P99 of 250 ms with storage back at the baseline. That final result mattered because the homepage has a sub-300 ms rendering budget.

Key takeaway

For software engineers building high-traffic, real-time data services, querying a transactional source directly for availability is often unsustainable at scale. Prefer indexing dynamic data into a purpose-built search engine such as Elasticsearch. For time-based availability in particular, consider range field types, which are backed by BKD trees, to keep both latency and storage low, and validate changes by replaying production traffic.
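One way to check a replayed traffic run against the sub-300 ms budget is to compute the P99 of the recorded latencies. The sketch below is illustrative only: the function names `p99` and `within_budget` and the nearest-rank percentile definition are assumptions, not details from the original article.

```python
def p99(latencies_ms):
    """Nearest-rank 99th percentile of a non-empty list of latencies (ms)."""
    ordered = sorted(latencies_ms)
    # ceil(0.99 * n) computed with integers to avoid float rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def within_budget(latencies_ms, budget_ms=300):
    """True when the P99 of the replayed requests is under the budget."""
    return p99(latencies_ms) < budget_ms


# Hypothetical replay result: 98 fast requests and two slow outliers.
replayed = [100] * 98 + [350, 400]
print(p99(replayed), within_budget(replayed))
```

Two slow requests out of a hundred are enough to fail the check here, which is why tail latency rather than the average is the number to watch.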

Key insights

Serving availability data at high scale requires indexing it and choosing index field types carefully.

Principles

Method

Index real-time availability updates into Elasticsearch, using integer range fields with BKD trees to represent time windows for efficient range queries.
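The method above can be sketched as Elasticsearch request bodies built in Python. This is a minimal illustration, not DoorDash's actual schema: the field names `store_id` and `available`, the minute-of-week encoding, and the helper names are all assumptions. The `integer_range` field type and the `term` query against a range field are standard Elasticsearch features.

```python
import json

MINUTES_PER_DAY = 24 * 60


def minute_of_week(day, hour, minute=0):
    """Encode a weekly time as an integer; day 0 is Monday (assumed convention)."""
    return day * MINUTES_PER_DAY + hour * 60 + minute


# Hypothetical index mapping: each item stores its availability windows
# as integer ranges, which Elasticsearch indexes in a BKD tree.
ITEM_MAPPING = {
    "mappings": {
        "properties": {
            "store_id": {"type": "keyword"},
            "available": {"type": "integer_range"},
        }
    }
}


def item_document(store_id, windows):
    """Build a document with one range per (start, end) window, end exclusive."""
    return {
        "store_id": store_id,
        "available": [{"gte": start, "lt": end} for start, end in windows],
    }


def availability_query(store_ids, now):
    """Match items in the given stores whose windows contain `now`."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"terms": {"store_id": store_ids}},
                    # A term query on a range field matches when the
                    # value falls inside any indexed range.
                    {"term": {"available": {"value": now}}},
                ]
            }
        }
    }


# Example: an item served Monday 11:00 to 14:00, queried at Monday 12:30.
lunch = item_document("store-1", [(minute_of_week(0, 11), minute_of_week(0, 14))])
query = availability_query(["store-1"], minute_of_week(0, 12, 30))
print(json.dumps(lunch))
print(json.dumps(query))
```

Storing each window as a single range value keeps one document per item, avoiding both the join cost of nested documents and the index growth of enumerating every available time slot as a separate term.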

In practice

Topics

Best for: Software Engineer, Data Engineer, IT Professional

Editorial summary, takeaway, and curation by AIssential. Original article published by HackerNoon.