How Figma Upgraded Data Pipeline from Multi-Day Latency to Real-Time
Summary
Figma overhauled its data synchronization architecture, moving from a full sync cron job to an incremental synchronization pipeline using Change Data Capture (CDC). The original system, established in 2020, copied entire database tables daily into S3 and then Snowflake. By 2023, Figma's rapid growth had turned it into a bottleneck: it cost millions annually in dedicated database replicas and delivered data that was days old. The new pipeline, built from lower-level components such as Amazon RDS for snapshots, Kafka for streaming CDC events, and Snowflake stored procedures for incremental merges, cut data latency from over 30 hours to under three hours. This custom solution also eliminated millions in replica costs and improved performance for tables ten times larger, while enabling new features like sync-on-demand and full change history queries.
Key takeaway
For MLOps Engineers managing growing data infrastructure, adopting incremental synchronization with Change Data Capture (CDC) is crucial. This approach, as demonstrated by Figma, can drastically improve data freshness and reduce the operational costs associated with full data copies. Consider building a custom solution with components like Kafka and Snowflake if off-the-shelf tools cannot handle your scale or specific cloud integrations, and make robust validation and automation integral to the design.
Key insights
Incremental synchronization via Change Data Capture significantly improves data freshness and reduces costs for large-scale data pipelines.
Principles
- Decouple data sources from destinations using streaming platforms.
- Validate data pipelines independently to ensure correctness.
- Automate bootstrap and validation processes for reliability.
Method
Figma implemented CDC to capture changes from the database write-ahead log and stream them to Kafka, from which Snowflake consumes them for incremental merges. The process starts with an initial snapshot and then applies changes continuously, with careful timestamp alignment between the snapshot and the change stream to prevent data loss.
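The snapshot-plus-changes idea can be illustrated with a small in-memory sketch. This is not Figma's implementation: the `ChangeEvent` type, the `position` field (standing in for a log position or timestamp), and the `merge_changes` function are all hypothetical names, and the sketch assumes each event carries a full row image. It shows why alignment matters: the change stream is assumed to start at or before the snapshot, so events the snapshot already reflects are skipped rather than lost or applied twice.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical CDC event; field names are assumptions for illustration."""
    op: str                # "insert", "update", or "delete"
    key: int               # primary key of the affected row
    row: Optional[dict]    # full row image, or None for deletes
    position: int          # log position or timestamp; higher means later


def merge_changes(
    snapshot: Dict[int, dict],
    snapshot_position: int,
    events: Iterable[ChangeEvent],
) -> Dict[int, dict]:
    """Apply change events to a snapshot and return the merged table."""
    # Keep only the newest event per key, ignoring events that the
    # snapshot already includes (position <= snapshot_position).
    latest: Dict[int, ChangeEvent] = {}
    for event in events:
        if event.position <= snapshot_position:
            continue
        previous = latest.get(event.key)
        if previous is None or event.position > previous.position:
            latest[event.key] = event

    # Copy the snapshot so the caller's data is left untouched.
    table = {key: dict(row) for key, row in snapshot.items()}
    for key, event in latest.items():
        if event.op == "delete":
            table.pop(key, None)
        else:
            table[key] = dict(event.row or {})
    return table
```

Starting the stream slightly before the snapshot and filtering on position is what makes the overlap safe; starting it after the snapshot would leave a gap in which changes could be missed.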
In practice
- Evaluate custom build vs. vendor solutions for data volume and cost.
- Implement independent validation workflows for data integrity.
- Use versioning and atomic view updates for zero-downtime re-bootstraps.
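The last two practices can be combined in a minimal sketch: load each re-bootstrap as a new version, validate it against the source independently, and only then switch readers over in one step. The `Warehouse` class, `table_fingerprint`, and the in-memory dictionaries are hypothetical stand-ins, not Figma's code; in a real warehouse the switch would likely be a view redefinition rather than an attribute assignment.

```python
import hashlib
import json
from typing import Dict, Optional

Rows = Dict[int, dict]


def table_fingerprint(rows: Rows) -> str:
    """Checksum of a table that does not depend on insertion order."""
    digest = hashlib.sha256()
    for key in sorted(rows):
        digest.update(json.dumps([key, rows[key]], sort_keys=True).encode())
    return digest.hexdigest()


class Warehouse:
    """Toy destination that keeps several table versions side by side."""

    def __init__(self) -> None:
        self.versions: Dict[int, Rows] = {}
        self.live_version: Optional[int] = None

    def load_version(self, version: int, rows: Rows) -> None:
        # A re-bootstrap writes to a fresh version; readers are unaffected.
        self.versions[version] = {key: dict(row) for key, row in rows.items()}

    def promote(self, version: int, source_rows: Rows) -> bool:
        """Validate a version against the source, then make it live."""
        if table_fingerprint(self.versions[version]) != table_fingerprint(source_rows):
            return False  # validation failed; the old version stays live
        # A single assignment stands in for an atomic view update.
        self.live_version = version
        return True

    def read(self) -> Rows:
        if self.live_version is None:
            raise RuntimeError("no version has been promoted yet")
        return self.versions[self.live_version]
```

Because readers only ever see `live_version`, a failed or half-finished bootstrap never becomes visible, which is the property that makes re-bootstraps zero-downtime.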
Topics
- Data Pipeline Upgrade
- Change Data Capture
- Incremental Synchronization
- Kafka Streaming Platform
- Snowflake Data Warehouse
Best for: MLOps Engineer, AI Engineer, Entrepreneur, Data Engineer, Software Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by ByteByteGo Newsletter.