Announcing General Availability of Real-Time Mode for Apache Spark Structured Streaming on Databricks

2026-03-19 · Source: Databricks · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Software Development & Engineering · Depth: Advanced, short

Summary

Databricks has announced the General Availability of Real-Time Mode (RTM) in Spark Structured Streaming, which brings millisecond-level latency to existing Spark APIs. Spark can now serve ultra-low-latency use cases such as real-time fraud detection, live personalization, and AI agent context generation, which previously required a separate specialized engine such as Apache Flink. Early adopters include Coinbase, which achieves sub-100ms P99 latencies for risk management; DraftKings, which uses RTM to compute fraud detection features; and MakeMyTrip, which saw a 7% uplift in click-through rates with sub-50ms P50 latencies for personalized search. RTM changes how the Spark engine executes a query through continuous data flow, pipeline scheduling, and streaming shuffle, so data is processed as it arrives rather than in periodic microbatches. According to the benchmarks cited in the announcement, RTM can be up to 92% faster than Flink for certain customer workloads, and it simplifies architecture by unifying batch and streaming operations.

Key takeaway

For AI Architects and ML Engineers building real-time applications, Spark Structured Streaming's Real-Time Mode removes the need for a separate streaming engine such as Flink. You can consolidate real-time and batch processing onto a single Spark platform, which reduces operational overhead, prevents "logic drift" (the same business logic diverging between separate batch and streaming implementations), and speeds up development of low-latency features for fraud detection, personalization, and AI agent steering.

Key insights

Spark Structured Streaming's Real-Time Mode delivers millisecond latency, unifying batch and streaming workloads.

Principles

Continuous data flow reduces latency.
Unified architecture simplifies operations.
Pipeline scheduling enhances throughput.

Method

RTM achieves its performance gains through three changes: it replaces periodic batching with continuous data flow, schedules pipeline stages simultaneously, and uses streaming shuffle to pass data between tasks immediately.
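To make the latency argument concrete, here is a minimal toy model in plain Python. It is not Spark code and does not reflect how RTM is implemented. It only contrasts two simplified execution styles under stated assumptions: in the micro-batch model, records wait for the next trigger boundary and each stage finishes the whole batch before the next stage starts; in the continuous model, every record flows through all stages as soon as it arrives, with no queueing. The function names, arrival times, and stage costs are all hypothetical.

```python
import math


def microbatch_latencies(arrivals_ms, interval_ms, stage_costs_ms):
    """End-to-end latency per record under a simplified micro-batch model.

    Assumptions (toy model only): a trigger fires every `interval_ms`,
    each stage processes the whole batch before the next stage starts,
    and a batch never overruns into the next interval.
    """
    # Group arrivals by the trigger boundary that would pick them up.
    batches = {}
    for t in arrivals_ms:
        boundary = math.ceil(t / interval_ms) * interval_ms
        batches.setdefault(boundary, []).append(t)

    latencies = []
    for boundary, batch in sorted(batches.items()):
        # Stage by stage: total work is per-record cost times batch size.
        done = boundary + sum(cost * len(batch) for cost in stage_costs_ms)
        latencies.extend(done - t for t in batch)
    return latencies


def continuous_latencies(arrivals_ms, stage_costs_ms):
    """Latency per record when each record streams through all stages
    immediately on arrival (toy model, no queueing or backpressure)."""
    per_record = sum(stage_costs_ms)
    return [per_record for _ in arrivals_ms]


# Hypothetical arrival times (ms) and per-record stage costs (ms).
arrivals = [10, 250, 490, 510, 990]
stage_costs = [2, 3]

batch_result = microbatch_latencies(arrivals, 500, stage_costs)
stream_result = continuous_latencies(arrivals, stage_costs)
print("micro-batch:", batch_result)
print("continuous: ", stream_result)
```

In this model the micro-batch latency is dominated by the wait for the trigger boundary, which is the cost that continuous data flow is meant to remove. Real workloads also involve queueing, network transfer, and state access, none of which are modeled here.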

In practice

Detect fraud with sub-100ms latency.
Generate real-time context for AI agents.
Power personalized experiences with sub-50ms latency.
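The latency targets above are stated as percentiles (P50, P99). As a small illustration of how such targets can be checked, the sketch below computes nearest-rank percentiles over a list of latency samples and compares them with a budget. The sample values are made up for illustration and are not measurements from any of the organizations mentioned; the `percentile` helper is hypothetical, not part of Spark or Databricks.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up end-to-end latencies in milliseconds (illustrative only).
latencies_ms = [12, 18, 22, 25, 31, 35, 40, 44, 47, 95]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)

# Budgets mirroring the targets named in the text.
meets_p50_budget = p50 < 50
meets_p99_budget = p99 < 100
print(f"P50={p50}ms P99={p99}ms ok={meets_p50_budget and meets_p99_budget}")
```

A P99 target is much stricter than a P50 target at the same threshold, because a single slow record in a hundred is enough to miss it.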

Topics

Spark Structured Streaming
Real-Time Mode
Low-Latency Streaming
Apache Flink
Fraud Detection

Code references

databricks-solutions/latency-benchmarks

Best for: CTO, AI Architect, AI Engineer, Machine Learning Engineer, Data Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Databricks.