Announcing General Availability of Real-Time Mode for Apache Spark Structured Streaming on Databricks
Summary
Databricks has announced the General Availability of Real-Time Mode (RTM) in Spark Structured Streaming, which brings millisecond-level latency to existing Spark APIs. Spark can now serve ultra-low-latency use cases such as real-time fraud detection, live personalization, and AI agent context generation, which previously required a separate specialized engine such as Apache Flink. Early adopters include Coinbase, which achieves sub-100ms P99 latencies for risk management; DraftKings, which uses RTM for fraud detection feature computation; and MakeMyTrip, which saw a 7% uplift in click-through rates with sub-50ms P50 latencies for personalized search. RTM changes how the Spark engine executes through continuous data flow, pipeline scheduling, and streaming shuffle, so data is processed as it arrives rather than in periodic microbatches. Benchmarks cited in the announcement indicate that RTM can be up to 92% faster than Flink on certain customer workloads, while also simplifying architecture by unifying batch and streaming operations.
Key takeaway
For AI Architects and ML Engineers building real-time applications, Spark Structured Streaming's Real-Time Mode can remove the need for a separate streaming engine such as Flink. You can consolidate real-time and batch processing on a single Spark platform, which reduces operational overhead, prevents "logic drift," and speeds up development of low-latency features for fraud detection, personalization, and AI agent steering.
Key insights
Spark Structured Streaming's Real-Time Mode delivers millisecond latency, unifying batch and streaming workloads.
Principles
- Continuous data flow reduces latency.
- Unified architecture simplifies operations.
- Pipeline scheduling enhances throughput.
Method
RTM achieves its performance gains in three ways: it replaces periodic batching with continuous data flow, schedules pipeline stages to run simultaneously, and uses streaming shuffle to pass data between tasks immediately.
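The difference between periodic batching and continuous data flow can be illustrated with a toy latency model. This is a minimal sketch and not Spark's implementation: the function names, the arrival times, and the fixed `processing_ms` cost are all assumptions made for illustration. It assumes that a microbatch engine holds each event until the next trigger boundary, while a continuous engine starts work on each event as soon as it arrives.

```python
import math


def microbatch_latencies(arrivals_ms, interval_ms, processing_ms=0):
    """Toy model: each event waits for the next trigger boundary, then is processed.

    arrivals_ms   -- hypothetical event arrival times in milliseconds
    interval_ms   -- hypothetical trigger interval of the microbatch engine
    processing_ms -- assumed fixed processing cost per event
    """
    latencies = []
    for t in arrivals_ms:
        # First trigger boundary at or after the arrival time.
        boundary = math.ceil(t / interval_ms) * interval_ms
        latencies.append(boundary + processing_ms - t)
    return latencies


def continuous_latencies(arrivals_ms, processing_ms=0):
    """Toy model: each event is processed on arrival, so only processing cost remains."""
    return [processing_ms for _ in arrivals_ms]


if __name__ == "__main__":
    arrivals = [10, 250, 990, 1000]  # hypothetical arrival times (ms)
    print("microbatch:", microbatch_latencies(arrivals, interval_ms=1000, processing_ms=5))
    print("continuous:", continuous_latencies(arrivals, processing_ms=5))
```

In this model the microbatch latency is dominated by how long an event waits for the next boundary, which is the waiting time that continuous data flow removes. The model ignores scheduling and shuffle costs, so it says nothing about the absolute latencies RTM achieves.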
In practice
- Detect fraud with sub-100ms latency.
- Generate real-time context for AI agents.
- Power personalized experiences with sub-50ms latency.
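Latency targets such as "sub-100ms P99" and "sub-50ms P50" are percentile statements about measured end-to-end latencies. The sketch below shows one way to check such a target, using the nearest-rank percentile definition (one of several in common use). The helper names and the sample data are hypothetical, and this is not how the organizations named above measured their results.

```python
import math


def percentile(samples_ms, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples_ms:
        raise ValueError("samples_ms must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_target(samples_ms, p, budget_ms):
    """True if the p-th percentile latency is strictly below the budget."""
    return percentile(samples_ms, p) < budget_ms


if __name__ == "__main__":
    # Hypothetical latency samples in milliseconds: 1, 2, ..., 100.
    samples = list(range(1, 101))
    print("P50:", percentile(samples, 50))
    print("P99:", percentile(samples, 99))
    print("sub-100ms P99 met:", meets_target(samples, 99, 100))
```

A P99 target is much stricter than a P50 target at the same budget, because it constrains the slowest 1% of events and not just the typical case.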
Topics
- Spark Structured Streaming
- Real-Time Mode
- Low-Latency Streaming
- Apache Flink
- Fraud Detection
Best for: CTO, AI Architect, AI Engineer, Machine Learning Engineer, Data Engineer, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Databricks.