GenPage: Towards End-to-End Generative Homepage Construction at Netflix

2026-06-29 · Source: Netflix TechBlog - Medium · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Software Development & Engineering · Depth: Advanced, extended

Summary

Netflix has introduced GenPage, an end-to-end generative system for constructing personalized homepages that replaces its traditional multi-stage recommendation pipeline. GenPage uses a single decoder-only transformer that autoregressively generates the entire homepage (rows, entities, and layout) from the user context and the request. The approach targets whole-page optimization, better scaling, and greater flexibility. The system uses custom tokenization for computational efficiency and product control, and its training recipe is pretraining followed by post-training via Weighted Binary Classification (WBC) or Reinforcement Learning (RL). In online A/B tests against Netflix's mature production recommender, GenPage delivered statistically significant gains in core user engagement metrics and reduced end-to-end serving latency by 20%. Offline analysis showed that enriching the prompt improved WBC loss by 6.9%, roughly five times the 1.3% loss reduction obtained by scaling model capacity from 120M to 900M parameters.

Key takeaway

AI/ML engineers designing or optimizing large-scale personalized recommendation systems should consider an end-to-end generative approach like GenPage. It can collapse a complex multi-stage pipeline into a single model; in Netflix's A/B tests it also produced statistically significant engagement gains and cut end-to-end serving latency by 20%. Prioritize enriching the model's context and prompt: in Netflix's offline analysis this yielded larger gains (6.9% WBC loss improvement) than scaling model capacity from 120M to 900M parameters (1.3%).

Key insights

A single generative transformer can replace complex multi-stage recommenders for structured, whole-page optimization.

Principles

End-to-end generative models simplify complex ML stacks.
Prompt enrichment can outweigh model capacity scaling.
RL post-training enables whole-page optimization.

Method

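GenPage tokenizes the user context and request, then autoregressively generates the homepage (rows, entities, and layout) with a decoder-only transformer. Training consists of pretraining via next-token prediction, followed by post-training with Weighted Binary Classification (WBC) or Reinforcement Learning (RL) for page-level optimization.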
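
The generation loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the token names (`<page>`, `<row>`, `</page>`), the `CATALOG` list, and `toy_score_fn`, which is a hand-written stand-in for the transformer and not Netflix's model or tokenizer. It only shows the shape of greedy autoregressive decoding, where each new page token is chosen from scores conditioned on the prompt and on the page generated so far.

```python
from typing import Callable, Dict, List

# Hypothetical special tokens and catalog; none of these names come from the article.
BOS, ROW, EOP = "<page>", "<row>", "</page>"
CATALOG = ["title:A", "title:B", "title:C", "title:D"]

ScoreFn = Callable[[List[str], List[str]], Dict[str, float]]


def toy_score_fn(prompt: List[str], page: List[str]) -> Dict[str, float]:
    """Stand-in for a decoder-only transformer: scores candidate next tokens."""
    if ROW not in page:
        return {ROW: 1.0}  # a page must open with a row
    last_row = len(page) - 1 - page[::-1].index(ROW)
    items_in_row = len(page) - 1 - last_row
    if items_in_row >= 2:
        # Close the page after two full rows, otherwise start a new row.
        return {EOP: 1.0} if page.count(ROW) >= 2 else {ROW: 1.0}
    # Prefer unused titles; boost titles that appear in the user's history.
    return {
        t: (2.0 if f"watched:{t}" in prompt else 1.0)
        for t in CATALOG
        if t not in page
    }


def generate_page(prompt: List[str], score_fn: ScoreFn, max_tokens: int = 16) -> List[str]:
    """Greedy autoregressive decoding: append the best-scoring token each step."""
    page = [BOS]
    while len(page) < max_tokens:
        scores = score_fn(prompt, page)
        if not scores:
            break  # nothing left to emit
        next_token = max(scores, key=scores.get)
        page.append(next_token)
        if next_token == EOP:
            break
    return page


page = generate_page(["device:tv", "watched:title:C"], toy_score_fn)
print(page)
```

In a real system the score function would be a forward pass of the trained model and decoding might sample rather than take the argmax; the loop structure is the part this sketch is meant to convey.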

In practice

Use custom tokenization for domain-specific data.
Employ multi-cadence incremental training for freshness.
Enforce business rules via constrained decoding.
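
As an illustration of the last point, constrained decoding is commonly implemented by masking disallowed tokens before the next token is selected. The sketch below assumes two hypothetical business rules (no title repeated on a page, and a per-request set of restricted titles); the function names and rules are illustrative and are not taken from the article.

```python
import math
from typing import Dict, List, Set


def banned_tokens(page_so_far: List[str], restricted: Set[str]) -> Set[str]:
    """Hypothetical rules: no repeats on the page, no restricted titles."""
    return set(page_so_far) | restricted


def constrained_argmax(logits: Dict[str, float], banned: Set[str]) -> str:
    """Pick the highest-scoring token after masking banned tokens to -inf."""
    masked = {
        tok: (-math.inf if tok in banned else score)
        for tok, score in logits.items()
    }
    best = max(masked, key=masked.get)
    if masked[best] == -math.inf:
        raise ValueError("no allowed token remains under the constraints")
    return best


# Toy logits standing in for model output at one decoding step.
logits = {"title:A": 3.0, "title:B": 2.5, "title:C": 1.0}
choice = constrained_argmax(
    logits,
    banned_tokens(page_so_far=["title:A"], restricted={"title:B"}),
)
print(choice)
```

Because the mask is applied at decoding time, rules of this kind can change without retraining the model, which is one reason the technique suits product-level control.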

Topics

Generative Recommenders
Transformer Models
Reinforcement Learning
Prompt Engineering
Netflix Personalization
End-to-End ML

Best for: AI Architect, MLOps Engineer, AI Scientist, Machine Learning Engineer, AI Engineer, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by Netflix TechBlog - Medium.