Relational Foundation Models for Enterprise Data [Jure Leskovec] - 768
Summary
Kumo, co-founded by Jure Leskovec, has built a Relational Foundation Model (RFM) for reasoning over structured relational data. According to Kumo, this pre-trained model can make accurate predictions on any database and predictive task without task-specific model training. It works by extracting labeled in-context examples and their surrounding subgraphs from a database; a pre-trained neural network then processes them in a single forward pass to generate predictions. Unlike conventional tabular deep learning, which relies on a single flattened, feature-engineered table, RFM learns directly from raw, multi-table data by treating the database as a graph of relationships. Kumo reports a 5% relative accuracy gain over state-of-the-art supervised models, rising to 12% with fine-tuning. RFM is deployed at companies such as DoorDash, Reddit, and Coinbase for fraud detection, customer churn prediction, and recommender systems, and is described as handling noisy, incomplete, and cold-start data effectively.
Key takeaway
For Machine Learning Engineers and Data Scientists working with complex, multi-table enterprise data, Kumo's Relational Foundation Model offers an alternative to traditional feature engineering. It is worth exploring for its reported accuracy gains (5% relative over state-of-the-art supervised models, rising to 12% with fine-tuning) and for simpler model deployment, especially in fraud detection, customer behavior modeling, and recommendation systems. The approach reduces manual effort and is reported to hold up on noisy or sparse datasets, so you can spend time integrating predictions into downstream business processes rather than iterating on features.
Key insights
Relational Foundation Models use graph neural networks for in-context learning on raw multi-table data, enabling accurate predictions without task-specific training.
Principles
- Databases are graphs of entity relationships.
- Learning directly on raw relational data can outperform manual feature engineering.
- Pre-trained models excel with limited data.
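The first principle, that a database is a graph of entity relationships, can be illustrated with a minimal sketch. It assumes a hypothetical two-table database (customers and orders) held in plain Python lists; the table names, columns, and helper functions are invented for illustration and are not Kumo's API. Each row becomes a node, each foreign-key reference becomes an edge, and a breadth-first search collects the subgraph around one entity.

```python
from collections import defaultdict, deque

# Hypothetical toy database: two tables linked by a foreign key (customer_id).
customers = [
    {"customer_id": 1, "country": "US"},
    {"customer_id": 2, "country": "DE"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 25.0},
    {"order_id": 11, "customer_id": 1, "amount": 40.0},
    {"order_id": 12, "customer_id": 2, "amount": 15.0},
]

def build_graph(customers, orders):
    """Turn rows into nodes and foreign-key references into undirected edges."""
    adjacency = defaultdict(set)
    for row in customers:
        # Touch the key so customers without orders still appear as nodes.
        adjacency[("customers", row["customer_id"])]
    for row in orders:
        order_node = ("orders", row["order_id"])
        customer_node = ("customers", row["customer_id"])
        adjacency[order_node].add(customer_node)
        adjacency[customer_node].add(order_node)
    return adjacency

def neighborhood(adjacency, start, hops):
    """Collect every node within `hops` edges of `start` (breadth-first search)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen

graph = build_graph(customers, orders)
subgraph_nodes = neighborhood(graph, ("customers", 1), hops=1)
```

Here `subgraph_nodes` holds customer 1 together with its two orders. A production system would sample such neighborhoods across many tables and time ranges; this sketch only shows the row-to-node, key-to-edge mapping.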
Method
RFM uses in-context learning: it extracts labeled subgraphs from a database, which a pre-trained neural network processes in a single forward pass to predict on unlabeled data.
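The in-context pattern described above can be sketched in a few lines. This is a toy stand-in under stated assumptions, not Kumo's model: the pre-trained neural network is replaced by a simple distance-weighted vote, each "subgraph" is reduced to a hypothetical list of a customer's order amounts, and the `encode` and `predict_in_context` names are invented for illustration. What it does show is the shape of the method: labeled examples are supplied at prediction time, and the answer comes from one pass over them with no training step.

```python
import math

def encode(subgraph):
    """Stand-in encoder: summarise a subgraph (a list of order amounts) as a vector.

    Assumption for illustration only; a real RFM would apply a pre-trained
    network to the raw subgraph rather than hand-picked summaries.
    """
    return (float(len(subgraph)), float(sum(subgraph)))

def predict_in_context(context, query_subgraph):
    """Predict a label for the query from labeled context examples, without training.

    Stand-in for the single forward pass: each context example votes for its
    label with a weight that shrinks as its distance to the query grows.
    """
    query_vector = encode(query_subgraph)
    scores = {}
    for subgraph, label in context:
        distance = math.dist(query_vector, encode(subgraph))
        weight = 1.0 / (1.0 + distance)
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)

# Hypothetical labeled in-context examples: (customer's order amounts, churned?)
context = [
    ([25.0, 40.0, 30.0], False),
    ([20.0, 35.0], False),
    ([5.0], True),
    ([], True),
]

# Unlabeled query entity: one small order, so it resembles the churned examples.
prediction = predict_in_context(context, [8.0])
```

Changing the task means changing only the labels attached to the context examples, which is why no retraining is needed in this pattern.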
In practice
- Detect fraud in financial transactions.
- Optimize ad click-through rates.
- Predict customer churn or next best actions.
Topics
- Relational Foundation Models
- Graph Neural Networks
- In-Context Learning
- Enterprise Data
- Fraud Detection
- Recommender Systems
Best for: AI Engineer, Investor, CTO, AI Scientist, Machine Learning Engineer, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by The TWIML AI Podcast with Sam Charrington.