
GenPage: Towards End-to-End Generative Homepage Construction at Netflix
Netflix introduces GenPage, a generative transformer that replaces the traditional multi-stage homepage recommendation pipeline with a single decoder-only model. GenPage treats user history and request context as a tokenized prompt and autoregressively generates the full two-dimensional homepage layout, rows and entities together, in real time. The system combines a domain-specific tokenizer for efficiency and product control, a pretraining-then-post-training recipe in which post-training uses either weighted binary classification or GRPO-based reinforcement learning (RL), and several production-specific techniques: semantic embedding fusion for cold start, multi-cadence incremental training, constrained decoding for business rules, and hybrid row decoding for latency. In online A/B tests against a mature production system, GenPage achieved statistically significant gains on the core engagement metric while reducing end-to-end serving latency by 20%. Offline, enriching the prompt context outperforms scaling model capacity in the current regime, and RL post-training increases homepage diversity even without an explicit diversity objective.
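To make the idea of constrained autoregressive decoding concrete, here is a minimal toy sketch. It is not GenPage's implementation: the token names, the `score_fn` stand-in for the model, and the two rules (a banned set and no repeated tokens) are all assumptions chosen for illustration. It only shows the general pattern of filtering candidate tokens against rules before picking the next one.

```python
from typing import Callable, Dict, List, Set


def constrained_greedy_decode(
    score_fn: Callable[[List[str]], Dict[str, float]],
    prompt: List[str],
    vocab: List[str],
    banned: Set[str],
    max_new_tokens: int,
) -> List[str]:
    """Greedy decoding that only considers tokens allowed by simple rules.

    score_fn is a hypothetical stand-in for a decoder-only model: given the
    tokens so far, it returns a score for each candidate next token.
    """
    generated: List[str] = []
    for _ in range(max_new_tokens):
        scores = score_fn(prompt + generated)
        # Illustrative rules: drop banned tokens and tokens already emitted.
        allowed = [t for t in vocab if t not in banned and t not in generated]
        if not allowed:
            break
        best = max(allowed, key=lambda t: scores.get(t, float("-inf")))
        generated.append(best)
        if best == "<eos>":
            break
    return generated


def toy_score_fn(tokens: List[str]) -> Dict[str, float]:
    # Context-independent toy scores; a real model would condition on tokens.
    return {
        "row:trending": 3.0,
        "title:A": 2.5,
        "title:B": 2.0,
        "title:C": 1.0,
        "<eos>": 0.5,
    }


vocab = ["row:trending", "title:A", "title:B", "title:C", "<eos>"]
page = constrained_greedy_decode(
    toy_score_fn,
    prompt=["user:history", "ctx:tv"],
    vocab=vocab,
    banned={"title:B"},
    max_new_tokens=4,
)
print(page)
```

In this toy run, `title:B` never appears in the output even though it outscores `title:C`, because the rule removes it from the candidate set before selection. A production system would more likely apply such rules as masks over model logits, but that detail is an assumption here rather than something the summary states.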