Bridging the Semantic-Collaborative Gap: An Asymmetric Graph Architecture for Cold-Start Item Recommendation

Source: cs.AI updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics) · Depth: Expert, extended

Summary

Shallow-RHS is an asymmetric graph architecture, deployed in Tubi's production retrieval system, for cold-start recommendation of new content and new devices. It formulates cold-start as an inductive graph-completion problem on a temporal bipartite device–content graph and is built on the Kumo GNN platform. The Left-Hand Side (LHS) device tower uses message passing over watch history to capture collaborative signals, while the Right-Hand Side (RHS) content tower is intentionally shallow: it encodes content solely from intrinsic features such as metadata and LLM-based semantic embeddings (e.g., from OpenAI). Because the content tower has no access to interaction history, training forces it to map intrinsic features into a collaborative-filtering-aware embedding space. For device cold-start, cohort-based embeddings are constructed from demographic features. Large-scale online A/B tests at Tubi showed consistent relative improvements, including a +0.42% lift in global Total View Time (TVT) and a 13% increase in cold-title promotion speed.
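To make the asymmetry concrete, here is a minimal sketch of the two towers' forward pass. It is an illustration under stated assumptions, not Tubi's or Kumo's implementation: the function names, the toy weights `W`, the 3-dimensional feature vectors, the mean aggregation, and the weight sharing between towers are all hypothetical simplifications. The point it shows is that the content tower reads only an item's own features, so a title with zero interactions can still be scored against a device whose embedding comes from its watch history.

```python
def linear(weights, features):
    """Apply a single linear layer: one output value per weight row."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]

def content_tower(weights, item_features):
    """Shallow RHS: embed an item from its intrinsic features only (no neighbors)."""
    return linear(weights, item_features)

def device_tower(weights, watched_item_features):
    """LHS: one round of message passing, here a plain mean over watched-item embeddings.

    Assumption: a real device tower would be a GNN with its own parameters;
    reusing the content weights just keeps this sketch small.
    """
    if not watched_item_features:
        raise ValueError("no watch history; a cold-device fallback is needed")
    messages = [content_tower(weights, f) for f in watched_item_features]
    dim = len(messages[0])
    return [sum(m[i] for m in messages) / len(messages) for i in range(dim)]

def score(device_emb, item_emb):
    """Dot-product relevance between a device and an item."""
    return sum(d * c for d, c in zip(device_emb, item_emb))

# Toy weights; in practice these would be learned end to end.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]

# Intrinsic features of titles the device has watched (hypothetical values).
history = [[1.0, 0.0, 0.5], [1.0, 0.2, 0.0]]

# Two brand-new titles with no interactions at all.
cold_similar = [0.9, 0.1, 0.3]
cold_different = [0.0, 1.0, 0.0]

device = device_tower(W, history)
s_sim = score(device, content_tower(W, cold_similar))
s_diff = score(device, content_tower(W, cold_different))
print(s_sim > s_diff)
```

In this toy setup the cold title whose features resemble the watch history scores higher than the dissimilar one, even though neither has been watched by anyone.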

Key takeaway

Machine learning engineers building recommendation systems with cold-start challenges should consider an asymmetric graph architecture like Shallow-RHS. It produces collaborative-filtering-aware embeddings for new content immediately from intrinsic features alone, and for new devices via demographic cohorts. In Tubi's online A/B tests this improved view time and cold-title promotion speed, which suggests that effective recommendations are possible even without historical interaction data.
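The summary does not say how the cohort embeddings are computed, so the following is only one plausible reading: average the embeddings of warm devices that share a demographic bucket, and hand that average to any new device in the same bucket. The demographic fields (`platform`, `region`), the helper names, and the fallback vector are all hypothetical.

```python
from collections import defaultdict

def cohort_key(demographics):
    """Bucket a device by coarse demographic features (hypothetical fields)."""
    return (demographics["platform"], demographics["region"])

def build_cohort_embeddings(warm_devices):
    """Average the embeddings of warm devices within each cohort."""
    sums = {}
    counts = defaultdict(int)
    for demographics, emb in warm_devices:
        key = cohort_key(demographics)
        if key not in sums:
            sums[key] = [0.0] * len(emb)
        sums[key] = [s + e for s, e in zip(sums[key], emb)]
        counts[key] += 1
    return {key: [s / counts[key] for s in total] for key, total in sums.items()}

def embed_cold_device(demographics, cohorts, default):
    """Return the cohort embedding for a new device, or a default if the cohort is unseen."""
    return cohorts.get(cohort_key(demographics), default)

# Warm devices: (demographics, embedding learned from watch history).
warm = [
    ({"platform": "tv", "region": "us"}, [1.0, 0.0]),
    ({"platform": "tv", "region": "us"}, [0.0, 1.0]),
    ({"platform": "mobile", "region": "us"}, [2.0, 2.0]),
]

cohorts = build_cohort_embeddings(warm)
fallback = [0.0, 0.0]  # assumption: some global default for unseen cohorts

new_tv = embed_cold_device({"platform": "tv", "region": "us"}, cohorts, fallback)
new_web = embed_cold_device({"platform": "web", "region": "us"}, cohorts, fallback)
print(new_tv, new_web)
```

Because the cohort vector lives in the same space as the warm-device embeddings, it can be scored against content embeddings with the same dot product used for devices that do have history.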

Key insights

An asymmetric graph architecture can bridge the semantic-collaborative gap for cold-start recommendations by aligning intrinsic features with behavioral signals.

Principles

Method

Formulate cold-start as an inductive graph-completion problem on a temporal bipartite device–content graph. Use an asymmetric two-tower architecture (Shallow-RHS) in which the content tower is shallow and feature-only, and the device tower uses message passing. Train with a temporal softmax link-prediction loss.
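The training objective can be sketched as follows. This is a generic softmax link-prediction loss with a temporal cutoff, written as an assumption about what "temporal" means here: when predicting an interaction at time t, the device representation may only use interactions strictly before t. The event tuples, the mean pooling, and the fixed item embeddings are hypothetical stand-ins for the real towers.

```python
import math

def history_before(events, device_id, cutoff):
    """Temporal constraint: keep only this device's interactions strictly before the cutoff."""
    return [item for (dev, item, ts) in events if dev == device_id and ts < cutoff]

def mean_pool(item_embs, item_ids):
    """Stand-in for the device tower: average the embeddings of visible items."""
    dim = len(item_embs[0])
    return [sum(item_embs[i][k] for i in item_ids) / len(item_ids) for k in range(dim)]

def softmax_link_loss(device_emb, item_embs, positive):
    """Negative log-probability of the positive item under a softmax over all candidates."""
    scores = [sum(d * c for d, c in zip(device_emb, emb)) for emb in item_embs]
    top = max(scores)  # subtract the max for numerical stability
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_z - scores[positive]

# Events are (device_id, item_index, timestamp); values are illustrative only.
events = [("d1", 0, 10), ("d1", 1, 20), ("d1", 2, 30)]

# Item embeddings as the content tower might produce them from intrinsic features.
item_embs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

# Predict the last event using only what happened before it.
target_device, target_item, target_ts = events[-1]
visible = history_before(events, target_device, target_ts)
device_emb = mean_pool(item_embs, visible)
loss = softmax_link_loss(device_emb, item_embs, target_item)
print(visible, loss)
```

The cutoff keeps the target interaction out of the device's own history, which prevents label leakage. In a full model the gradient of this loss would flow into both towers, which is how the feature-only content tower is pushed toward a space that reflects collaborative behavior.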

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.