How Netflix Live Streams to 100 Million Devices in 60 Seconds

Source: ByteByteGo Newsletter · Field: Technology & Digital (Software Development & Engineering, Cloud Computing & IT Infrastructure) · Depth: Advanced, long

Summary

Netflix's Live Origin is a custom-built, multi-tenant microservice running on AWS EC2 that manages real-time video delivery for live streaming at massive scale, including events such as the Tyson vs. Paul fight, which drew 65 million concurrent streams. It relies on redundant regional pipelines and intelligent segment selection to protect quality, and its manifest design uses fixed 2-second segment durations and millisecond-grain caching to optimize delivery through Open Connect. Storage evolved from AWS S3 to a custom Key-Value Storage Abstraction built on Apache Cassandra and EVCache, which provides high write availability, low-latency replication, and read scalability up to 200 gigabits per second. Key architectural decisions include publishing isolation, priority-based rate limiting for critical traffic, and hierarchical metadata caching to mitigate "404 storms". Together, these choices balance write reliability, read scalability, and operational flexibility for global live events.
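
To make the fixed-duration point concrete: when every segment lasts exactly 2 seconds, a segment's index follows from elapsed time alone, so a cache could in principle tell whether a requested segment can exist yet without asking the origin. The Python sketch below illustrates only that arithmetic. The function names, the stream start time, and the rejection rule are assumptions made for illustration, not Netflix's implementation.

```python
SEGMENT_DURATION_MS = 2_000  # fixed 2-second segments, as stated in the summary


def live_edge_index(stream_start_ms: int, now_ms: int) -> int:
    """Index of the newest segment that can be complete at now_ms.

    Segment i covers [start + i*D, start + (i+1)*D) and is complete only
    once its end time has passed. Returns -1 if no segment is complete yet.
    """
    elapsed = now_ms - stream_start_ms
    if elapsed < SEGMENT_DURATION_MS:
        return -1
    return elapsed // SEGMENT_DURATION_MS - 1


def could_exist(segment_index: int, stream_start_ms: int, now_ms: int) -> bool:
    """Hypothetical check: is the segment at or behind the live edge?"""
    return 0 <= segment_index <= live_edge_index(stream_start_ms, now_ms)
```

Under these assumptions, a request for a segment ahead of the live edge could be answered at the edge rather than reaching the origin as a miss, which is one plausible way fixed durations help contain "404 storms".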

Key takeaway

Netflix's Live Origin architecture, which has supported 65 million concurrent streams, provides a blueprint for high-scale, low-latency live content delivery. It combines redundant pipelines, intelligent segment selection, and a custom Cassandra/EVCache storage system that achieves 25 ms median latency and 200 Gbps+ read throughput. The design offers useful lessons for AI/ML professionals building resilient real-time inference or data-processing pipelines under extreme load.
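
As a rough illustration of segment selection across redundant pipelines, the Python sketch below picks, for a single segment index, the available candidate with the fewest reported defects. The `Candidate` type, its fields, and the tie-breaking rule are hypothetical; the summary does not describe the actual selection criteria.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    pipeline: str    # hypothetical label for the regional pipeline
    available: bool  # whether this pipeline published the segment at all
    defects: int     # hypothetical defect count reported for the segment


def select_segment(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Return the available candidate with the fewest defects, or None."""
    usable = [c for c in candidates if c.available]
    if not usable:
        return None
    # min() returns the first minimal item, so earlier pipelines win ties.
    return min(usable, key=lambda c: c.defects)
```

For example, given one pipeline that failed to publish and another that published a clean segment, `select_segment` returns the second; if neither published, it returns `None` and the caller has to decide how to respond.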

Topics

Best for: Software Engineer, DevOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by ByteByteGo Newsletter.