Enhancing Ad Relevance: Integrating Real-Time Context into Sequential Recommender Models
Summary
Pinterest engineers introduced the Contextual Sequential Two Tower Model on May 8, 2026, improving ad relevance by integrating real-time online context into their Transformer-based sequential recommender. The prior model relied only on offline user conversion history and struggled on contextual surfaces like Related Pins, where it accounted for less than 1% of impressions. The new architecture adds a context layer to the query tower that combines real-time subject Pin features, such as aggregated interest categories, with historical sequence data. Training uses synthetic augmented data derived from positive labels, while inference is hybrid: the Transformer encoding is computed offline and the context layer is computed online. In offline evaluation, the model achieved a 3x to 10x increase in Recall@K. Online, it boosted median candidate relevance by ~275–300%, improved overall Related Pins ad relevance by 1.08%, and delivered a ~0.7% lift in Return on Ad Spend (ROAS), reaching ~1.4% in top countries.
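For readers unfamiliar with the offline metric, here is a minimal sketch of Recall@K under one common definition: the share of relevant items that appear in the top K retrieved candidates. The summary does not say how Pinterest computes the metric, so the function name, inputs, and toy data are illustrative assumptions.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant items found in the top-k of a ranked candidate list."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    # Use a set so a duplicated candidate is not counted twice.
    hits = len(set(ranked_ids[:k]) & relevant)
    return hits / len(relevant)


# Toy comparison (made-up ids, not Pinterest data).
relevant = ["d", "e"]
baseline = recall_at_k(["a", "b", "c", "d"], relevant, k=3)
contextual = recall_at_k(["d", "a", "e", "b"], relevant, k=3)
```

A multiplicative gain such as "3x" is then the ratio of the new model's Recall@K to the baseline's at the same K.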
Key takeaway
If you are an AI engineer optimizing ad recommendation systems on contextual surfaces, prioritize integrating real-time user context. Existing sequential models that are purely offline likely underperform on surfaces like Related Pins. By adopting a hybrid inference approach and training with synthetic augmented data, you can achieve significant lifts in ad relevance and Return on Ad Spend. Explore more advanced fusion techniques, such as cross-attention, to further strengthen contextual signal integration.
Key insights
Integrating real-time context into sequential recommenders significantly boosts ad relevance and business performance.
Principles
- Offline historical data alone limits contextual relevance.
- Synthetic data enables training on online-only features.
- Hybrid inference combines pre-computed and real-time signals.
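The hybrid inference principle can be sketched as follows: the expensive sequence embedding is looked up from an offline store, and only a lightweight context layer runs at request time. This is a toy illustration, not Pinterest's serving code; the store, function names, and the single linear projection are all assumptions.

```python
# Hypothetical offline store: user id -> Transformer sequence embedding,
# assumed to be refreshed by a batch job rather than at request time.
OFFLINE_SEQUENCE_EMBEDDINGS = {"user_1": [0.5, -1.0]}


def context_layer(sequence_emb, context_feats, weights, bias):
    """Online step: concatenate both inputs and apply one linear projection."""
    x = list(sequence_emb) + list(context_feats)
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def query_embedding(user_id, context_feats, weights, bias):
    """Hybrid inference: offline lookup plus online context computation."""
    sequence_emb = OFFLINE_SEQUENCE_EMBEDDINGS[user_id]
    return context_layer(sequence_emb, context_feats, weights, bias)


# Toy parameters: 2 sequence dims + 2 context dims -> 2 output dims.
WEIGHTS = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
BIAS = [0.0, 0.0]
embedding = query_embedding("user_1", [0.25, 2.0], WEIGHTS, BIAS)
```

The split keeps the Transformer off the request path, so the online cost is limited to a lookup and a small projection.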
Method
The Contextual Sequential Two Tower Model adds a context layer to the query tower that concatenates real-time features with the Transformer output. The model is trained on synthetic augmented data and served with hybrid offline/online inference.
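One way to picture the synthetic augmented data is sketched below. It assumes, purely for illustration, that the context features for a training row are synthesized from the positive label itself, since offline conversion logs carry no real subject Pin. The summary does not spell out the exact construction, and every field name here is hypothetical.

```python
def make_synthetic_row(user_history, positive_item):
    """Pair a user's offline history with context synthesized from the label."""
    # Assumption: the positive item's categories stand in for the
    # aggregated interest categories of a real-time subject Pin.
    context = sorted(set(positive_item["interest_categories"]))
    return {
        "history": list(user_history),
        "context_categories": context,
        "label": positive_item["id"],
    }


row = make_synthetic_row(
    ["pin_1", "pin_7"],
    {"id": "ad_42", "interest_categories": ["home_decor", "diy", "home_decor"]},
)
```

Under this assumption the context is strongly correlated with the label, which is one reason regularizing the context layer during training would matter.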
In practice
- Augment user embeddings with demographic features.
- Apply high dropout in the context layer during training.
- Investigate cross-attention for context fusion.
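As a rough illustration of the dropout recommendation above, the sketch below applies standard inverted dropout to the context features during training only. The function name and the 0.5 rate are assumptions; the summary says "high dropout" without giving a value or the reason for it. One plausible motivation is to keep the model from leaning too heavily on context signals.

```python
import random


def context_dropout(features, rate, rng, training=True):
    """Inverted dropout: zero each feature with probability `rate` while training."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(features)
    keep = 1.0 - rate
    # Surviving features are scaled by 1/keep so the expected value is unchanged.
    return [f / keep if rng.random() < keep else 0.0 for f in features]


rng = random.Random(0)
context = [1.0, 2.0, 3.0, 4.0]
train_out = context_dropout(context, rate=0.5, rng=rng)
serve_out = context_dropout(context, rate=0.5, rng=rng, training=False)
```

At serving time the features pass through unchanged, so no rescaling is needed online.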
Topics
- Ad Relevance
- Sequential Recommender Models
- Real-Time Context
- Two-Tower Models
- Hybrid Inference
- Synthetic Data Augmentation
Best for: Machine Learning Engineer, AI Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by Pinterest Engineering Blog - Medium.