From Clicks to Conversions: Architecting Shopping Conversion Candidate Generation at Pinterest
Summary
Pinterest developed and refined a dedicated shopping conversion candidate generation model that optimizes for offsite conversion events, which are sparser and noisier than onsite engagement signals. Launched in 2023, the initial model increased shopping conversion volume by 2.3% and clickthrough rate by 1.5%. Iterations in 2025 introduced a unified multi-task architecture and an advertiser-level loss function, improving return on ad spend (RoAS) for US shopping campaigns by 3.1%. Key technical designs include a multi-surface model, dual positive signals (conversions and click-duration-weighted engagement), and negative sampling. The architecture evolved in two steps: moving from a sequential DCN v2 and MLP to a parallel design yielded an 11% gain in offline recall@1000, and moving from a multi-head to a unified multi-task approach boosted recall@100 by 42% for conversion tasks. The system serves over 600 million monthly active users.
Key takeaway
AI architects designing large-scale recommendation systems should prioritize robust data strategies for sparse signals. Implement multi-task learning with weighted auxiliary objectives to stabilize training and improve generalization. Consider running the deep and cross-network components of your retrieval models in parallel to capture richer feature interactions, especially when optimizing for high-value, low-frequency events like offsite conversions. In Pinterest's case, this approach improved both conversion metrics and advertiser RoAS.
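To make "multi-task learning with weighted auxiliary objectives" concrete, here is a minimal sketch. It assumes each task is trained with a softmax loss over in-batch negatives and that the tasks are combined by a fixed weighted sum; the summary does not state Pinterest's exact loss, so the function names, the in-batch negative scheme, and the 0.3 auxiliary weight are all illustrative assumptions.

```python
import math


def in_batch_softmax_loss(scores):
    """Mean softmax cross-entropy over a batch.

    scores[i][j] is the similarity of query i to item j. The diagonal entry
    scores[i][i] is the positive pair; the rest of row i acts as negatives.
    (In-batch negatives are an assumption, not confirmed by the source.)
    """
    total = 0.0
    for i, row in enumerate(scores):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(s - m) for s in row))
        total += log_z - row[i]
    return total / len(scores)


def multi_task_loss(task_losses, task_weights):
    """Weighted sum of per-task losses (weights here are hypothetical)."""
    return sum(task_weights[name] * loss for name, loss in task_losses.items())


# Toy similarity matrices for two tasks sharing one model.
conversion_scores = [[4.0, 0.5], [0.2, 3.0]]   # sparse primary task
engagement_scores = [[2.0, 1.0], [0.5, 1.5]]   # denser auxiliary task

losses = {
    "conversion": in_batch_softmax_loss(conversion_scores),
    "engagement": in_batch_softmax_loss(engagement_scores),
}
total = multi_task_loss(losses, {"conversion": 1.0, "engagement": 0.3})
```

The auxiliary weight controls how much the denser engagement signal is allowed to shape the shared representation; in practice it would be tuned rather than fixed in advance.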
Key insights
Optimizing for sparse offsite conversions requires multi-task learning, robust data design, and parallel architectural components.
Principles
- Combine sparse conversion data with weighted engagement signals.
- Train a single model across multiple surfaces for data efficiency.
- Parallelize cross-network and deep network learning for richer interactions.
Method
The model uses a two-tower retrieval architecture in which DCN v2 cross layers run in parallel with an MLP, optimized via a unified multi-task loss function that incorporates advertiser-level signals.
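The parallel design can be sketched in a few lines. The example below is a dependency-free illustration, not Pinterest's implementation: it uses the standard DCN v2 cross layer, x_{l+1} = x0 * (W x_l + b) + x_l, feeds the same input to a cross stack and an MLP, concatenates the two outputs, and projects them to a normalized embedding. The helper names, layer sizes, and random untrained weights are assumptions made for illustration.

```python
import math
import random


def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def cross_layer(x0, xl, w, b):
    """DCN v2 cross layer: x_{l+1} = x0 * (W x_l + b) + x_l, elementwise."""
    wx = matvec(w, xl)
    return [a * (c + d) + e for a, c, d, e in zip(x0, wx, b, xl)]


def dense_relu(x, w, b):
    """One fully connected layer followed by ReLU."""
    return [max(0.0, v + bi) for v, bi in zip(matvec(w, x), b)]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def init_tower(rng, dim, hidden, emb_dim, n_cross=2):
    """Random, untrained parameters for one tower (illustration only)."""
    return {
        "cross": [(random_matrix(dim, dim, rng), [0.0] * dim) for _ in range(n_cross)],
        "mlp": [(random_matrix(hidden, dim, rng), [0.0] * hidden)],
        "proj": random_matrix(emb_dim, dim + hidden, rng),
    }


def parallel_tower(x0, params):
    """Run the cross stack and the MLP side by side on the same input, then
    concatenate both outputs and project to an L2-normalized embedding."""
    xc = x0
    for w, b in params["cross"]:
        xc = cross_layer(x0, xc, w, b)
    xd = x0
    for w, b in params["mlp"]:
        xd = dense_relu(xd, w, b)
    emb = matvec(params["proj"], xc + xd)  # xc + xd is list concatenation
    norm = math.sqrt(sum(v * v for v in emb)) or 1.0
    return [v / norm for v in emb]


rng = random.Random(0)
user_tower = init_tower(rng, dim=4, hidden=3, emb_dim=2)
item_tower = init_tower(rng, dim=4, hidden=3, emb_dim=2)

user_emb = parallel_tower([0.2, -1.0, 0.5, 0.3], user_tower)
item_emb = parallel_tower([1.0, 0.1, -0.4, 0.7], item_tower)
score = sum(u * v for u, v in zip(user_emb, item_emb))  # cosine similarity
```

In a sequential design the MLP would instead consume the output of the cross stack; the parallel form lets each branch see the raw features, which is the change the summary credits with the offline recall gain.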
In practice
- Implement click duration-based re-weighting for engagement data.
- Use advertiser-level loss to stabilize sparse conversion signals.
- Adopt parallel DCN v2 and MLP for enhanced feature interaction learning.
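As an illustration of the first practice above, the sketch below turns click dwell time into a per-example training weight. The summary does not give Pinterest's weighting function, so the log-scaled curve, the 5-second bounce threshold, and the 120-second cap are hypothetical choices, as are the function names.

```python
import math


def click_duration_weight(duration_s, min_s=5.0, cap_s=120.0):
    """Map a click's dwell time in seconds to a weight in [0, 1].

    Hypothetical scheme: clicks shorter than min_s get zero weight (treated
    as bounces); the weight then grows logarithmically and saturates at cap_s.
    """
    if duration_s < min_s:
        return 0.0
    return min(1.0, math.log1p(duration_s) / math.log1p(cap_s))


def weighted_mean_loss(losses, weights):
    """Average per-example losses using per-example weights."""
    total = sum(weights)
    if total == 0.0:
        return 0.0
    return sum(l * w for l, w in zip(losses, weights)) / total


# Toy batch of engagement positives: (dwell time in seconds, per-example loss).
clicks = [(2.0, 0.9), (30.0, 0.4), (300.0, 0.1)]
weights = [click_duration_weight(d) for d, _ in clicks]
batch_loss = weighted_mean_loss([l for _, l in clicks], weights)
```

Conversion positives would typically keep a weight of 1.0, so that long, deliberate clicks supplement the sparse conversion signal without letting accidental clicks dominate it.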
Topics
- Shopping Conversion
- Candidate Generation
- Two-tower Models
- DCN v2
- Multi-task Learning
- Feature Engineering
- Pinterest Ads
Best for: Machine Learning Engineer, AI Architect, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Pinterest Engineering Blog - Medium.