Making User-Sequence Data More Cost-Efficient, Faster, and Easier to Use

Source: Pinterest Engineering Blog - Medium · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics, Cloud Computing & IT Infrastructure) · Depth: Advanced, long

Summary

Pinterest's Ads Feature Engineering, Core ML, ML Data, and User Understanding teams redesigned their user-sequence platform to address high costs and fragility in ML data stacks. User sequences, defined as ordered lists of recent, relevant events with enrichments, are critical for ranking, retrieval, and recommendation systems, powering training, offline analysis, and online inference. The redesign focused on providing a consistent "events → enriched signals → sequences" contract, improving cost-efficiency, accelerating new event/enrichment onboarding, and supporting both real-time and batch production paths. Key architectural changes included a "one definition, many runtimes" approach using configuration-as-code (Python/JSON), a shared execution engine, a lambda architecture for freshness and completeness, and columnar, time-partitioned storage. This initiative led to significant infrastructure cost reductions, faster onboarding, and improved engagement metrics on major recommendation surfaces.

Key takeaway

For MLOps Engineers building or scaling user-sequence platforms for ranking and recommendation systems, adopting a "one definition, many runtimes" approach is crucial. By defining sequences and enrichments once via configuration-as-code and leveraging a shared execution engine with columnar storage, you can achieve substantial infrastructure cost reductions and accelerate new signal onboarding. This strategy ensures data consistency across training and serving, improving overall system quality and developer productivity.

Key insights

A unified definition for user sequences ensures consistency across real-time indexing, batch processing, and online serving.
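
A minimal sketch of what "one definition, many runtimes" could look like in Python. This is not Pinterest's implementation: the event shape, `SequenceDefinition`, `build_sequence`, `add_category`, and both runtime wrappers are hypothetical names introduced here for illustration. The point it shows is that when the batch and streaming paths both call the same engine with the same definition, they cannot drift apart.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

# Hypothetical event shape: {"user_id", "ts", "type", "item_id"}
Event = Dict[str, object]


@dataclass(frozen=True)
class SequenceDefinition:
    """A sequence declared once, as configuration-as-code (illustrative)."""
    name: str
    event_types: FrozenSet[str]        # which raw events qualify
    max_length: int                    # how many recent events to keep
    enrich: Callable[[Event], Event]   # enrichment applied to each event


def build_sequence(defn: SequenceDefinition, events: List[Event]) -> List[Event]:
    """Shared engine: filter, enrich, order newest-first, truncate."""
    selected = [defn.enrich(e) for e in events if e["type"] in defn.event_types]
    selected.sort(key=lambda e: e["ts"], reverse=True)
    return selected[: defn.max_length]


def add_category(event: Event) -> Event:
    # Assumption: a static lookup stands in for a real enrichment service.
    categories = {"pin_1": "home_decor", "pin_2": "recipes"}
    return {**event, "category": categories.get(event["item_id"], "unknown")}


ENGAGEMENT_SEQ = SequenceDefinition(
    name="user_engagement",
    event_types=frozenset({"click", "save"}),
    max_length=2,
    enrich=add_category,
)


def batch_runtime(defn: SequenceDefinition, all_events: List[Event]) -> List[Event]:
    """Batch path: process the full event log in one pass."""
    return build_sequence(defn, all_events)


class StreamingRuntime:
    """Streaming path: recompute the sequence as each event arrives."""

    def __init__(self, defn: SequenceDefinition) -> None:
        self.defn = defn
        self._raw: List[Event] = []   # unbounded here; a real system would trim
        self.sequence: List[Event] = []

    def on_event(self, event: Event) -> None:
        self._raw.append(event)
        self.sequence = build_sequence(self.defn, self._raw)
```

Feeding the same events through `batch_runtime` and through `StreamingRuntime.on_event` one at a time yields identical sequences, which is the training/serving consistency property this insight describes.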

Method

Define user sequences and enrichments once via configuration-as-code. Run that definition through a shared execution engine on the real-time, batch, and serving paths, combine the real-time and batch paths in a lambda architecture for freshness and completeness, and write the results to columnar, time-partitioned storage.
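
The lambda-architecture step of this method can be sketched as a merge of a complete but delayed batch view with a fresh but partial real-time view. The article summary does not describe the merge rules, so everything below is an assumption made for illustration: the `event_id` and `ts` fields, the rule that the batch copy wins on duplicates, and the `dt=YYYY-MM-DD` daily partition key.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, List

Event = Dict[str, object]  # hypothetical shape: {"event_id", "ts", ...}


def merge_views(batch_events: List[Event],
                realtime_events: List[Event],
                max_length: int) -> List[Event]:
    """Merge batch and real-time views into one newest-first sequence.

    Assumption: events carry a unique "event_id"; when both views hold the
    same event, the batch copy is kept because it is treated as canonical.
    """
    merged: Dict[object, Event] = {}
    for event in realtime_events:
        merged[event["event_id"]] = event
    for event in batch_events:
        merged[event["event_id"]] = event  # batch overrides real-time
    ordered = sorted(
        merged.values(),
        key=lambda e: (e["ts"], e["event_id"]),  # event_id breaks ts ties
        reverse=True,
    )
    return ordered[:max_length]


def partition_key(ts_seconds: int) -> str:
    """Map a Unix timestamp (seconds, UTC) to a daily partition key."""
    day = datetime.fromtimestamp(ts_seconds, tz=timezone.utc)
    return "dt=" + day.strftime("%Y-%m-%d")


def partition_by_day(events: List[Event]) -> Dict[str, List[Event]]:
    """Group events by daily partition, as a time-partitioned store might."""
    partitions: Dict[str, List[Event]] = defaultdict(list)
    for event in events:
        partitions[partition_key(event["ts"])].append(event)
    return dict(partitions)
```

Time partitioning lets a reader scan only the days it needs. In a columnar format, each partition would additionally store fields column by column so that a job reading only `ts` and `event_id` skips the rest.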

Topics

Best for: AI Engineer, Machine Learning Engineer, MLOps Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by Pinterest Engineering Blog - Medium.