Every Advertiser Invoice Comes From a Single Row. Here’s the Schema That Makes That Possible.

· Source: Data Engineering on Medium · Field: Technology & Digital — Data Science & Analytics, Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Intermediate, long

Summary

Discord Quests Ads employs a "Gold layer" data schema designed to track performance advertising from user impression to final billing, ensuring every invoice originates from a single row. This constellation schema, driven by specific business questions, features three tiers of fact tables—impression, action, conversion, and billing—alongside dimension tables like `dim_quest` and `dim_user_quest_profile`. Key design principles include defining a clear row grain, separating concerns for different event types, building in MMP (Mobile Measurement Partner) parity with `discord_counted` and `mmp_counted` columns, and maintaining a full audit trail. The `fct_advertiser_billing` table calculates `billable_completions` as `MIN(quest_completions, mmp_completions)` and flags discrepancies exceeding a 2% `parity_gap_pct`.

Key takeaway

For data architects designing complex analytical systems, prioritize defining business questions before schema design to ensure every table serves a clear purpose. Embed reconciliation logic and audit trails directly into your Gold layer, like Discord's `discord_counted` and `mmp_counted` columns, to simplify billing accuracy and fraud detection. This approach streamlines data governance and ensures financial traceability, reducing operational overhead and improving trust in your data products.

Key insights

Designing data schemas around business questions and defining a clear row grain simplifies complex billing and ML data flows.

Principles

Method

Implement a constellation schema with tiered fact tables (impression, action, conversion, billing) and daily snapshot dimensions, ensuring data flows from user interaction to final invoice.

In practice

Topics

Best for: Data Engineer, Data Scientist, MLOps Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Data Engineering on Medium.