RankGraph-2: Lifecycle Co-Design for Billion-Node Graph Learning in Recommendation

Source: Artificial Intelligence · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics) · Depth: Expert, quick

Summary

RankGraph-2 is a framework deployed at Meta for billion-node graph-based retrieval in recommendation systems. It co-designs graph construction, representation learning, and real-time serving so that each stage's requirements shape the others: serving needs a co-learned cluster index, and training benefits from pre-computed neighborhoods. The approach is applied to similarity-based retrieval (U2U2I and U2I2I). The system reduces hundreds of trillions of edges to hundreds of billions using subsampling with popularity bias correction, pre-computes multi-hop neighborhoods via personalized PageRank, and co-learns a residual-quantization cluster index. The resulting system cuts serving computational cost by 83%. RankGraph-2 achieves 3.8x higher recall than a GAT + Deep Graph Infomax model and 2.1x higher recall than PyTorch-BigGraph, delivers up to +0.96% CTR and +2.75% CVR, and powers over 20 retrieval launches.
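The summary does not say how the popularity-corrected subsampling is implemented. As a rough illustration only, the sketch below assumes a simple rule: an edge to an item is kept with probability min(1, (cap / degree) ** alpha), so edges to very popular items are thinned while edges to tail items survive. The names `keep_probability`, `subsample_edges`, `cap`, and `alpha` are hypothetical and are not taken from RankGraph-2.

```python
import random
from collections import Counter


def keep_probability(degree, cap=2.0, alpha=0.5):
    """Assumed rule: down-weight edges that touch high-degree (popular) items."""
    return min(1.0, (cap / degree) ** alpha)


def subsample_edges(edges, cap=2.0, alpha=0.5, seed=0):
    """Return a subsample of (user, item) edges, thinning popular items more.

    This is an illustrative stand-in, not the production sampler.
    """
    rng = random.Random(seed)
    # Item popularity is approximated by the number of edges per item.
    item_degree = Counter(item for _, item in edges)
    kept = []
    for user, item in edges:
        if rng.random() < keep_probability(item_degree[item], cap, alpha):
            kept.append((user, item))
    return kept


if __name__ == "__main__":
    # One very popular item and one tail item.
    edges = [(f"u{n}", "popular") for n in range(1000)] + [("u0", "tail")]
    kept = subsample_edges(edges)
    print(len(edges), "->", len(kept))
```

With this rule the single tail edge is always retained, while only a small fraction of the 1,000 edges to the popular item remain. A real system would tune the rule against retrieval quality rather than fix `cap` and `alpha` by hand.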

Key takeaway

For AI engineers building large-scale recommendation systems, the lesson is to treat graph learning as a lifecycle co-design problem. Integrate graph construction, representation learning, and serving requirements from the outset rather than optimizing each in isolation. RankGraph-2's 83% serving cost reduction and its 3.8x and 2.1x recall gains over the GAT + Deep Graph Infomax and PyTorch-BigGraph baselines indicate that this strategy makes billion-node retrieval more efficient and translates into CTR and CVR gains. Consider pre-computing neighborhoods and co-training indexes to streamline your infrastructure.
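To make "pre-computing neighborhoods" concrete, here is a minimal sketch of personalized PageRank by power iteration on a toy adjacency dictionary, followed by a top-k cut that could be stored ahead of training. It is a small in-memory illustration under assumed parameters (`alpha` as the restart probability, `iters`, `k`), not the distributed computation a billion-node graph would need, and the function names are hypothetical.

```python
def personalized_pagerank(adj, source, alpha=0.15, iters=50):
    """Power iteration for p = alpha * e_source + (1 - alpha) * P^T p.

    adj maps each node to a list of its out-neighbors.
    """
    scores = {source: 1.0}
    for _ in range(iters):
        nxt = {source: alpha}  # restart mass always returns to the source
        for node, mass in scores.items():
            nbrs = adj.get(node, [])
            if not nbrs:
                # Dangling node: send its walk mass back to the source.
                nxt[source] = nxt.get(source, 0.0) + (1 - alpha) * mass
                continue
            share = (1 - alpha) * mass / len(nbrs)
            for nb in nbrs:
                nxt[nb] = nxt.get(nb, 0.0) + share
        scores = nxt
    return scores


def top_k_neighborhood(adj, source, k, alpha=0.15, iters=50):
    """Return the k highest-scoring nodes other than the source itself."""
    scores = personalized_pagerank(adj, source, alpha, iters)
    ranked = sorted(
        (n for n in scores if n != source),
        key=lambda n: scores[n],
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Undirected path graph a - b - c - d.
    adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    print(top_k_neighborhood(adj, "a", k=2))
```

On the path graph, nodes closer to the source receive more mass, so the two-node neighborhood of `a` consists of `b` and then `c`. Storing such lists once lets training read multi-hop context without walking the graph at each step.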

Key insights

Co-designing graph construction, representation learning, and real-time serving stages is critical for scalable billion-node graph retrieval.

Method

RankGraph-2 subsamples edges with popularity bias correction, pre-computes multi-hop neighborhoods with personalized PageRank, and co-learns a residual-quantization cluster index.
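For readers unfamiliar with residual quantization, the sketch below shows the encoding step only: each level picks the nearest centroid to the current residual, and the sequence of centroid indices acts as a coarse-to-fine cluster ID. The codebooks here are fixed by hand for illustration, whereas the source describes an index that is co-learned; `rq_encode`, `rq_decode`, and the toy codebooks are hypothetical.

```python
def nearest(codebook, vec):
    """Index of the centroid with the smallest squared Euclidean distance."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vec, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def rq_encode(vec, codebooks):
    """Greedy residual quantization: quantize, subtract, repeat per level."""
    residual = list(vec)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes, residual


def rq_decode(codes, codebooks):
    """Reconstruct the approximation by summing the selected centroids."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


if __name__ == "__main__":
    codebooks = [
        [[0, 0], [10, 0], [0, 10]],        # coarse level
        [[0, 0], [1, 0], [0, 1], [1, 1]],  # fine level, applied to the residual
    ]
    codes, residual = rq_encode([10.9, 1.2], codebooks)
    print(codes, rq_decode(codes, codebooks), residual)
```

In this toy case the vector (10.9, 1.2) maps to the code sequence [1, 3], which decodes to (11, 1). Items that share a code prefix fall in the same cluster, which is what allows a serving path to look up cluster members instead of scoring every candidate.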

Best for: AI Architect, AI Scientist, Research Scientist, Machine Learning Engineer, AI Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.