From Candidates to Clicks: The Engineering Anatomy of Ranking
Summary
This article, Part 6 of the "RecSys for MLEs" series, details the ranking layer in recommendation systems and contrasts it with the retrieval layer. Ranking models score only 100 to 2,000 candidates, which gives them a per-item budget roughly 200,000 times larger than retrieval (0.1 ms vs. 0.0000005 ms per item) and makes complex features and deep neural networks affordable. The piece covers the three main feature types: dense (continuous values), sparse (high-cardinality IDs mapped through embedding tables), and cross features. It reviews the pre-neural era's Logistic Regression and Gradient Boosted Decision Trees (GBDTs), highlighting their limitations with feature interactions and sparse data. Finally, it introduces Google's influential 2016 Wide & Deep architecture, which combines a linear "wide" component for memorization with a deep neural network for generalization, setting the stage for automatic feature-crossing models like DCN v2.
Key takeaway
For AI Engineers designing large-scale recommendation systems, understanding the distinct roles and computational budgets of retrieval versus ranking is crucial. Your ranking models can and should leverage dense cross-attention, deep neural networks with hundreds of millions of parameters, and rich user history features, which are infeasible at retrieval scale. Embrace architectures like Wide & Deep to balance memorization of specific patterns with generalization to unseen user-item interactions.
Key insights
Ranking models optimize precision on a small candidate set, using a significantly higher per-item computational budget than retrieval models.
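The size of that budget gap follows directly from the per-item figures quoted in the summary. The short check below reproduces the arithmetic; the 100-million-item corpus is a hypothetical value chosen only to put the retrieval figure in context, and the function name is illustrative.

```python
# Per-item latency figures quoted in the summary, in milliseconds.
RANKING_MS_PER_ITEM = 0.1
RETRIEVAL_MS_PER_ITEM = 0.0000005

# How many times more compute ranking can spend on each item.
ratio = RANKING_MS_PER_ITEM / RETRIEVAL_MS_PER_ITEM


def stage_budget_ms(num_items: int, ms_per_item: float) -> float:
    """Total time a stage spends if every item costs ms_per_item."""
    return num_items * ms_per_item


# Upper end of the 100-2,000 candidate range from the summary.
ranking_total = stage_budget_ms(2_000, RANKING_MS_PER_ITEM)

# ASSUMPTION: a 100M-item corpus, used purely for illustration.
retrieval_total = stage_budget_ms(100_000_000, RETRIEVAL_MS_PER_ITEM)

print(f"per-item ratio: {ratio:,.0f}x")
print(f"ranking 2,000 candidates: {ranking_total:.0f} ms")
print(f"retrieval over 100M items: {retrieval_total:.0f} ms")
```

Under these assumptions both stages land in the same order of magnitude of total latency, even though the per-item cost differs by five orders of magnitude.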
Principles
- Retrieval prioritizes recall; ranking prioritizes precision.
- Memorization and generalization require distinct architectural components.
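To make the second principle concrete, here is a minimal sketch of a Wide & Deep style forward pass in plain Python. It is not the original article's or Google's implementation: the vocabularies, layer sizes, random weights, and the single memorized cross feature are all assumptions for illustration, and a real model would learn both parts jointly with a framework such as TensorFlow or PyTorch.

```python
import math
import random

random.seed(0)
EMB_DIM = 4   # assumed embedding width
HIDDEN = 8    # assumed hidden layer width


def rand_vec(n, scale=0.1):
    """Small random weights standing in for trained parameters."""
    return [random.uniform(-scale, scale) for _ in range(n)]


# Deep side: one embedding table per sparse feature (hypothetical IDs).
user_emb = {u: rand_vec(EMB_DIM) for u in ("u1", "u2")}
item_emb = {i: rand_vec(EMB_DIM) for i in ("i1", "i2", "i3")}

# One hidden layer: (two embeddings + one dense feature) -> HIDDEN -> logit.
IN_DIM = 2 * EMB_DIM + 1
W1 = [rand_vec(IN_DIM) for _ in range(HIDDEN)]
b1 = [0.0] * HIDDEN
w2 = rand_vec(HIDDEN)

# Wide side: one weight per hand-crafted (user, item) cross feature.
# A large weight here memorizes "u1 engages with i2".
wide_weights = {("u1", "i2"): 2.0}


def deep_logit(user, item, dense):
    """Generalization path: embeddings -> ReLU hidden layer -> scalar."""
    x = user_emb[user] + item_emb[item] + [dense]
    h = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    return sum(w * v for w, v in zip(w2, h))


def wide_logit(user, item):
    """Memorization path: linear lookup over cross features."""
    return wide_weights.get((user, item), 0.0)


def predict(user, item, dense):
    """Sum both logits, then apply a sigmoid to get a click probability."""
    z = wide_logit(user, item) + deep_logit(user, item, dense)
    return 1.0 / (1.0 + math.exp(-z))


p_memorized = predict("u1", "i2", dense=0.5)  # pair seen by the wide part
p_unseen = predict("u1", "i1", dense=0.5)     # pair scored by the deep part only
```

The memorized pair gets a much higher score because the wide weight dominates, while the unseen pair still receives a score from the embeddings alone, which is the division of labor the principle describes.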
Method
Recommendation systems employ a two-stage architecture: retrieval (e.g., Two-Tower + FAISS) for recall-focused candidate generation, followed by ranking for precision-focused sorting using richer features and models.
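The two stages can be sketched as a small pipeline. This is a toy under stated assumptions: the item vectors, the ranking-only features, and the scoring weights are invented, and the brute-force dot-product scan stands in for an approximate nearest neighbor index such as FAISS.

```python
import heapq

# Hypothetical 2-d item embeddings, as an item tower might produce.
ITEM_VECS = {
    "a": [0.9, 0.1],
    "b": [0.8, 0.3],
    "c": [0.1, 0.9],
    "d": [0.7, 0.2],
    "e": [-0.5, 0.4],
}

# Hypothetical richer features that only the ranking stage can afford.
ITEM_FEATURES = {
    "a": {"ctr_7d": 0.02, "fresh": 0.1},
    "b": {"ctr_7d": 0.05, "fresh": 0.2},
    "c": {"ctr_7d": 0.09, "fresh": 0.8},
    "d": {"ctr_7d": 0.08, "fresh": 0.9},
    "e": {"ctr_7d": 0.01, "fresh": 0.5},
}


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def retrieve(user_vec, k):
    """Recall stage: cheap top-k by dot product over the whole corpus."""
    return heapq.nlargest(
        k, ITEM_VECS, key=lambda item: dot(user_vec, ITEM_VECS[item])
    )


def rank(user_vec, candidates):
    """Precision stage: rescore the few candidates with extra features.

    The weights below are made up; a trained ranking model would go here.
    """
    def score(item):
        feats = ITEM_FEATURES[item]
        return (0.5 * dot(user_vec, ITEM_VECS[item])
                + 10.0 * feats["ctr_7d"]
                + 0.5 * feats["fresh"])
    return sorted(candidates, key=score, reverse=True)


user = [1.0, 0.2]
candidates = retrieve(user, k=3)
ranked = rank(user, candidates)
print("retrieved:", candidates)
print("ranked:   ", ranked)
```

Retrieval narrows five items to three using similarity alone, and ranking then reorders those three once engagement and freshness signals are taken into account.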
In practice
- Use embedding tables for high-cardinality sparse features.
- Combine linear models with deep networks for robust ranking.
- Prioritize interaction features for predictive power.
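As a rough illustration of the first and third points, the sketch below maps raw IDs into an embedding table and builds a cross feature from two categorical values. Hashing IDs into a fixed number of buckets is an assumption introduced here (the summary does not say how the article indexes its tables), and the table size, dimension, and helper names are hypothetical.

```python
import random
import zlib

NUM_BUCKETS = 1_000  # assumed table size; production tables are far larger
EMB_DIM = 4          # assumed embedding width

random.seed(0)
# Embedding table: one small vector per bucket, randomly initialized
# here in place of trained values.
table = [[random.gauss(0.0, 0.01) for _ in range(EMB_DIM)]
         for _ in range(NUM_BUCKETS)]


def bucket(raw_id: str) -> int:
    """Map an arbitrary string ID to a table row with a stable hash."""
    return zlib.crc32(raw_id.encode("utf-8")) % NUM_BUCKETS


def embed(raw_id: str):
    """Look up the dense vector for a high-cardinality sparse feature."""
    return table[bucket(raw_id)]


def cross(user_segment: str, item_category: str) -> int:
    """Cross feature: treat the pair of values as one categorical ID."""
    return bucket(f"{user_segment}_x_{item_category}")


user_vector = embed("user_123")
interaction_id = cross("students", "laptops")
interaction_vector = table[interaction_id]
```

The cross ID can feed either a linear weight or its own embedding row, so the model sees the interaction directly instead of having to infer it from the two features separately.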
Topics
- Recommendation Systems
- Ranking Models
- Feature Engineering
- Wide & Deep Learning
- Deep Learning Architectures
Best for: AI Engineer, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by MLWhiz: Recs|ML|GenAI.