Mercedes-Benz Builds a Cross-Cloud Data Mesh with Delta Sharing and Intelligent Replication, Cutting Costs by 66%

Source: Databricks · Field: Technology & Digital (Cloud Computing & IT Infrastructure, Data Science & Analytics, Artificial Intelligence & Machine Learning) · Depth: Intermediate, short

Summary

Mercedes-Benz, a luxury automotive brand, is transitioning to "data-defined vehicles," which requires seamless, secure, and cost-effective data sharing across its multi-cloud (AWS and Azure) and multi-region infrastructure. A critical challenge was giving Azure-based data consumers access to approximately 60 TB of frequently updated after-sales data stored in AWS. This setup led to high egress costs, stale data caused by weekly full loads, and format incompatibility (Iceberg on AWS vs. Delta on Azure). To address this, Mercedes-Benz implemented a hybrid solution that combines Databricks Delta Sharing for secure cross-cloud data exchange with a controlled local replication mechanism based on Delta Deep Clone. Replicated data is updated incrementally and consumed locally on Azure, which significantly reduces egress costs while leaving flexibility in how fresh the data needs to be.

Key takeaway

For CTOs or VPs of Engineering managing multi-cloud data architectures, this case demonstrates how a hybrid Delta Sharing and replication strategy can significantly reduce cross-cloud egress costs. Evaluate your data freshness requirements against transfer costs: use direct Delta Shares where consumers need real-time access, and incremental replication for less time-sensitive, high-volume datasets, to optimize cloud spend and improve data accessibility.
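
The freshness-versus-transfer-cost trade-off can be illustrated with a small model. The following is a minimal sketch, not Mercedes-Benz's actual method: the `Dataset` fields, the read count, the weekly change fraction, and the assumption that every direct read scans the whole table are all hypothetical simplifications. Only the 60 TB figure comes from the article.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    size_tb: float                 # total table size in TB
    reads_per_week: int            # hypothetical: full cross-cloud scans per week
    weekly_change_fraction: float  # hypothetical: share of data rewritten per week
    max_staleness_hours: float     # how stale consumers can tolerate the data


def weekly_egress_tb(ds: Dataset) -> dict:
    """Estimate weekly cross-cloud transfer volume (TB) for each access pattern.

    Simplification: a direct share transfers the full table on every read,
    while incremental replication transfers only the changed data once.
    """
    return {
        "direct_share": ds.size_tb * ds.reads_per_week,
        "incremental_replication": ds.size_tb * ds.weekly_change_fraction,
    }


def choose_strategy(ds: Dataset, sync_interval_hours: float) -> str:
    """Pick the access pattern for a dataset.

    If consumers need data fresher than the replication schedule can
    deliver, only the direct share qualifies. Otherwise pick whichever
    pattern moves less data across clouds.
    """
    if ds.max_staleness_hours < sync_interval_hours:
        return "direct_share"
    volumes = weekly_egress_tb(ds)
    return min(volumes, key=volumes.get)


if __name__ == "__main__":
    # 60 TB is from the article; every other number is illustrative only.
    after_sales = Dataset(size_tb=60.0, reads_per_week=3,
                          weekly_change_fraction=0.25, max_staleness_hours=48.0)
    print(weekly_egress_tb(after_sales))
    print(choose_strategy(after_sales, sync_interval_hours=24.0))
```

Multiplying the estimated volumes by your provider's actual egress price turns the comparison into a cost figure. The model deliberately ignores storage and compute costs on the replica side, which a real evaluation would need to include.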

Key insights

A hybrid data sharing and replication strategy optimizes cross-cloud data access, balancing cost and freshness.

Principles

Method

Mercedes-Benz used Unity Catalog as a global catalog, Delta Sharing for cross-cloud/cross-region exchange, and Delta Deep Clone for periodic, incremental replication of large datasets to minimize egress costs for less time-sensitive workloads.
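
As a rough sketch of the replication step, the helper below builds a Databricks SQL `DEEP CLONE` statement that a scheduled job could run on the consumer side. Re-running `CREATE OR REPLACE TABLE ... DEEP CLONE` against the same target is generally expected to copy only what changed since the previous run, but verify that behaviour for shared tables in your own workspace. The function name and the catalog, schema, and table names are hypothetical, and the article does not show the actual jobs Mercedes-Benz runs.

```python
import re

# Conservative identifier check so names can be safely interpolated into SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _qualified(name: str) -> str:
    """Validate a three-level Unity Catalog name and backtick-quote each part."""
    parts = name.split(".")
    if len(parts) != 3 or not all(_IDENT.match(p) for p in parts):
        raise ValueError(f"expected catalog.schema.table, got {name!r}")
    return ".".join(f"`{p}`" for p in parts)


def deep_clone_statement(shared_table: str, local_table: str) -> str:
    """Build the SQL that refreshes a local replica from a shared source table."""
    return (
        f"CREATE OR REPLACE TABLE {_qualified(local_table)} "
        f"DEEP CLONE {_qualified(shared_table)}"
    )


if __name__ == "__main__":
    # Hypothetical names: a catalog mounted from a Delta Share and a local catalog.
    sql = deep_clone_statement(
        shared_table="aftersales_share.sales.service_events",
        local_table="azure_local.sales.service_events",
    )
    print(sql)
    # In a Databricks notebook or job this would be executed with: spark.sql(sql)
```

Running such a statement on a schedule (for example daily instead of weekly) is what would set the freshness of the local copy. Consumers that need fresher data would query the shared table directly instead.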

In practice

Topics

Best for: CTO, VP of Engineering/Data, Executive, Data Engineer, AI Architect, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by Databricks.