How Agoda Built a Single Source of Truth for Financial Data

Source: ByteByteGo Newsletter · Field: Technology & Digital — Data Science & Analytics, Software Development & Engineering, Artificial Intelligence & Machine Learning · Depth: Intermediate, medium

Summary

Agoda, a major online travel agency, consolidated its separate financial data pipelines into a single Financial Unified Data Pipeline (FINUDP) to address data inconsistency, duplicate processing, and the lack of centralized quality control. Previously, the Data Engineering, Business Intelligence, and Data Analysis teams each maintained their own pipelines, which led to discrepancies in financial reporting and to operational inefficiencies. FINUDP is built on Apache Spark and processes millions of financial data points each day, providing high data availability and quality to downstream teams such as Finance and Planning. Its key non-functional requirements are hourly data freshness, automated reliability checks, and maintainability through peer-reviewed designs and shadow testing. The centralization effort improved data trust and consistency, and the pipeline achieved 95.6% uptime last year.

Key takeaway

For Data Engineers managing critical financial data, centralizing separate pipelines into a unified system like FINUDP eliminates inconsistencies between teams and improves data reliability. Prioritize robust data quality checks, shadow-test every code change, and establish clear data contracts with upstream providers. This approach requires careful stakeholder management and performance optimization, but it significantly improves data integrity and trust across the organization.
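A data contract with an upstream provider can be made concrete as a small, checkable object. The sketch below is illustrative only: the `DataContract` class, the `validate_batch` function, the field names, and the one-hour staleness window are all assumptions chosen to echo the hourly freshness requirement, not Agoda's actual schema or tooling. It uses plain Python rather than Spark so that it runs on its own.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract: required fields, their types, and a freshness limit."""
    required_fields: dict          # field name -> expected Python type
    max_staleness: timedelta       # how old a batch may be before it is rejected


def validate_batch(rows, loaded_at, contract, now=None):
    """Return a list of violations; an empty list means the batch passes."""
    now = now or datetime.now(timezone.utc)
    violations = []

    # Freshness check: the batch must have been loaded recently enough.
    if now - loaded_at > contract.max_staleness:
        violations.append(f"stale batch: loaded {now - loaded_at} ago")

    # Schema check: every required field must be present and correctly typed.
    for i, row in enumerate(rows):
        for name, expected_type in contract.required_fields.items():
            if row.get(name) is None:
                violations.append(f"row {i}: missing {name}")
            elif not isinstance(row[name], expected_type):
                violations.append(f"row {i}: {name} is not {expected_type.__name__}")
    return violations


# Illustrative contract for an upstream bookings table (names are assumptions).
bookings_contract = DataContract(
    required_fields={"booking_id": int, "amount": Decimal},
    max_staleness=timedelta(hours=1),
)
```

A pipeline step could call `validate_batch` before processing and refuse to publish when the returned list is non-empty, which keeps contract violations visible to the upstream owner instead of surfacing later as a reporting discrepancy.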

Key insights

Consolidating disparate data pipelines into a unified system enhances data consistency and operational efficiency.

Method

Agoda's FINUDP architecture consists of source tables, an Apache Spark execution layer with monitoring, a data lake for storage, and downstream consumers. The design emphasizes hourly data freshness, automated validation, and rigorous change management, including shadow testing of code changes.
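Shadow testing, as described here, means running a changed version of the pipeline logic alongside the current one on the same input and comparing the results before the change ships. The following is a minimal sketch of that comparison in plain Python rather than Spark. The function `shadow_compare`, the two example transformations, and the `booking_id`, `gross`, and `fee` fields are hypothetical and are not taken from the original article.

```python
from decimal import Decimal, ROUND_HALF_UP


def shadow_compare(rows, current_fn, candidate_fn, key="booking_id"):
    """Run both versions on the same input and return the records that differ."""
    current = {r[key]: r for r in current_fn(rows)}
    candidate = {r[key]: r for r in candidate_fn(rows)}
    diffs = {}
    # Union of keys also catches rows that one version drops or adds.
    for k in current.keys() | candidate.keys():
        if current.get(k) != candidate.get(k):
            diffs[k] = (current.get(k), candidate.get(k))
    return diffs


def current_net(rows):
    # Current (production) logic: net = gross - fee, no rounding.
    return [{"booking_id": r["booking_id"], "net": r["gross"] - r["fee"]} for r in rows]


def candidate_net(rows):
    # Candidate change under test: round the net amount to two decimal places.
    cents = Decimal("0.01")
    return [
        {
            "booking_id": r["booking_id"],
            "net": (r["gross"] - r["fee"]).quantize(cents, rounding=ROUND_HALF_UP),
        }
        for r in rows
    ]


sample = [
    {"booking_id": 1, "gross": Decimal("100.00"), "fee": Decimal("2.50")},
    {"booking_id": 2, "gross": Decimal("80.005"), "fee": Decimal("1.00")},
]

# Only booking 2 differs, because rounding changes 79.005 to 79.01.
diffs = shadow_compare(sample, current_net, candidate_net)
```

An empty result suggests the change is behavior-preserving on the sampled input; a non-empty result gives reviewers the exact records to inspect before deciding whether the difference is intended.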

Best for: Data Engineer, MLOps Engineer, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by ByteByteGo Newsletter.