Making User-Sequence Data More Cost-Efficient, Faster, and Easier to Use
Summary
Pinterest's Ads Feature Engineering, Core ML, ML Data, and User Understanding teams redesigned their user-sequence platform to address high costs and fragility in ML data stacks. User sequences, defined as ordered lists of recent, relevant events with enrichments, are critical for ranking, retrieval, and recommendation systems, powering training, offline analysis, and online inference. The redesign focused on providing a consistent "events → enriched signals → sequences" contract, improving cost-efficiency, accelerating new event/enrichment onboarding, and supporting both real-time and batch production paths. Key architectural changes included a "one definition, many runtimes" approach using configuration-as-code (Python/JSON), a shared execution engine, a lambda architecture for freshness and completeness, and columnar, time-partitioned storage. This initiative led to significant infrastructure cost reductions, faster onboarding, and improved engagement metrics on major recommendation surfaces.
Key takeaway
For MLOps Engineers building or scaling user-sequence platforms for ranking and recommendation systems, adopting a "one definition, many runtimes" approach is crucial. By defining sequences and enrichments once via configuration-as-code and leveraging a shared execution engine with columnar storage, you can achieve substantial infrastructure cost reductions and accelerate new signal onboarding. This strategy ensures data consistency across training and serving, improving overall system quality and developer productivity.
Key insights
A unified definition for user sequences ensures consistency across real-time indexing, batch processing, and online serving.
Principles
- Unify signal definitions across runtimes.
- Use configuration-as-code for sequences.
- Employ a lambda architecture for freshness and completeness.
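As a rough illustration of the lambda principle, the sketch below merges a complete but stale batch view with a fresh real-time view into one sequence. All names here (`Event`, `merge_views`, `max_len`) are hypothetical, and the rule that the batch copy wins on duplicate event ids is an assumption, not a detail confirmed by the original article.

```python
from typing import NamedTuple


class Event(NamedTuple):
    event_id: str  # hypothetical unique id used for de-duplication
    ts: int        # event time, epoch seconds
    action: str


def merge_views(batch, realtime, max_len):
    """Merge a batch view and a real-time view into one recent-first sequence."""
    by_id = {}
    # Insert real-time events first, then batch events, so that the batch
    # copy overwrites the real-time copy when both carry the same id
    # (assumption: batch data is the corrected source of truth).
    for ev in list(realtime) + list(batch):
        by_id[ev.event_id] = ev
    # Newest first; event_id breaks timestamp ties deterministically.
    ordered = sorted(by_id.values(), key=lambda e: (e.ts, e.event_id), reverse=True)
    return ordered[:max_len]
```

The batch view supplies completeness for older events, while the real-time view contributes events that the latest batch run has not yet seen.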
Method
Define user sequences and enrichments once via configuration-as-code. A shared execution engine, deployed in a lambda architecture, then processes events for the real-time, batch, and serving paths and stores the resulting sequences in columnar, time-partitioned storage.
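A minimal sketch of "one definition, many runtimes", assuming a JSON sequence definition and a single shared function that both a batch job and a streaming handler call. The schema fields (`event_types`, `max_length`, `enrichments`) and every function and class name are illustrative assumptions rather than Pinterest's actual configuration format or engine.

```python
import json

# Hypothetical sequence definition, written once as configuration.
SEQUENCE_DEFINITION = json.loads("""
{
  "name": "recent_engagements",
  "event_types": ["click", "save"],
  "max_length": 3,
  "enrichments": ["category"]
}
""")


def apply_definition(definition, events, enrichment_lookup):
    """Shared engine logic: filter, enrich, order newest-first, truncate."""
    allowed = set(definition["event_types"])
    rows = []
    for ev in events:
        if ev["type"] not in allowed:
            continue
        row = {"item_id": ev["item_id"], "type": ev["type"], "ts": ev["ts"]}
        for field in definition["enrichments"]:
            # Missing enrichments become None instead of failing the sequence.
            row[field] = enrichment_lookup.get(ev["item_id"], {}).get(field)
        rows.append(row)
    rows.sort(key=lambda r: r["ts"], reverse=True)
    return rows[: definition["max_length"]]


def batch_runtime(definition, events_by_user, enrichment_lookup):
    """Batch path: rebuild every user's sequence from the full event log."""
    return {
        user: apply_definition(definition, events, enrichment_lookup)
        for user, events in events_by_user.items()
    }


class StreamingRuntime:
    """Real-time path: update one user's sequence as each event arrives."""

    def __init__(self, definition, enrichment_lookup):
        self.definition = definition
        self.lookup = enrichment_lookup
        self.buffers = {}

    def on_event(self, user_id, event):
        buffer = self.buffers.setdefault(user_id, [])
        buffer.append(event)
        return apply_definition(self.definition, buffer, self.lookup)
```

Because both runtimes delegate to `apply_definition`, the same events yield the same sequence on either path, which is the training and serving consistency the method aims for.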
In practice
- Implement sequence definitions as configuration-as-code.
- Use a shared engine for streaming and batch processing.
- Store sequences in columnar, time-partitioned format.
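The storage point above can be sketched with plain Python structures: rows are grouped into daily partitions, and each partition holds one list per column so that a reader can scan a single column over a date range. This is only an in-memory illustration under assumed names (`partition_key`, `to_columnar_partitions`, `read_column`) and an assumed `dt=YYYY-MM-DD` partition scheme; a production system would typically rely on a columnar file format instead, and the original article's storage details are not reproduced here.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(ts):
    """Map an epoch-seconds timestamp to a daily partition name (UTC)."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("dt=%Y-%m-%d")


def to_columnar_partitions(rows, columns):
    """Group row dicts by day and store each partition as column -> values."""
    partitions = defaultdict(lambda: {c: [] for c in columns})
    for row in rows:
        part = partitions[partition_key(row["ts"])]
        for c in columns:
            part[c].append(row.get(c))
    return dict(partitions)


def read_column(partitions, column, start_key, end_key):
    """Read one column from the partitions in [start_key, end_key] only."""
    values = []
    # ISO-style dates sort lexicographically, so string comparison
    # is enough to prune partitions outside the requested range.
    for key in sorted(partitions):
        if start_key <= key <= end_key:
            values.extend(partitions[key][column])
    return values
```

Time partitioning lets a reader skip whole days outside its window, and the columnar layout lets it touch only the fields it needs, which is where the cost savings of this layout usually come from.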
Topics
- User Sequences
- ML Data Infrastructure
- Recommendation Systems
- Lambda Architecture
- Configuration-as-Code
- Columnar Storage
Best for: AI Engineer, Machine Learning Engineer, MLOps Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by Pinterest Engineering Blog - Medium.