Every Advertiser Invoice Comes From a Single Row. Here’s the Schema That Makes That Possible.
Summary
Discord Quests Ads employs a "Gold layer" data schema designed to track performance advertising from user impression to final billing, ensuring every invoice originates from a single row. This constellation schema, driven by specific business questions, features three tiers of fact tables—impression, action, conversion, and billing—alongside dimension tables like `dim_quest` and `dim_user_quest_profile`. Key design principles include defining a clear row grain, separating concerns for different event types, building in MMP (Mobile Measurement Partner) parity with `discord_counted` and `mmp_counted` columns, and maintaining a full audit trail. The `fct_advertiser_billing` table calculates `billable_completions` as `MIN(quest_completions, mmp_completions)` and flags discrepancies exceeding a 2% `parity_gap_pct`.
Key takeaway
For data architects designing complex analytical systems, prioritize defining business questions before schema design to ensure every table serves a clear purpose. Embed reconciliation logic and audit trails directly into your Gold layer, like Discord's `discord_counted` and `mmp_counted` columns, to simplify billing accuracy and fraud detection. This approach streamlines data governance and ensures financial traceability, reducing operational overhead and improving trust in your data products.
Key insights
Designing data schemas around business questions and defining a clear row grain simplifies complex billing and ML data flows.
Principles
- Define a clear row grain for every fact table.
- Separate concerns for delivery, conversion, and billing data.
- Ensure every billing number is traceable to a raw event.
Method
Implement a constellation schema with tiered fact tables (impression, action, conversion, billing) and daily snapshot dimensions, ensuring data flows from user interaction to final invoice.
In practice
- Store ML model scores at impression time for feedback loops.
- Include internal and external completion counts in one fact table for immediate parity checks.
- Use `is_eligible` at impression time for fraud detection and delivery bug identification.
Topics
- Data Modeling
- Constellation Schema
- Data Architecture
- Billing Systems
- Performance Advertising
- ML Feature Engineering
- Data Quality
Best for: Data Engineer, Data Scientist, MLOps Engineer
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by Data Engineering on Medium.