Relational Foundation Models for Enterprise Data [Jure Leskovec] - 768

Source: The TWIML AI Podcast with Sam Charrington · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Life Sciences & Biology · Depth: Expert, extended

Summary

Kumo, co-founded by Jure Leskovec, has built a Relational Foundation Model (RFM) for reasoning over structured relational data. The pre-trained model can make accurate predictions on any database and predictive task without task-specific model training. It extracts labeled in-context examples and subgraphs from the database, and a pre-trained neural network processes them in a single forward pass to generate predictions. Unlike conventional tabular deep learning, which relies on a single flattened, feature-engineered table, RFM learns directly from raw multi-table data by treating the database as a graph of relationships. Kumo reports a 5% relative accuracy gain over state-of-the-art supervised models, rising to 12% with fine-tuning. RFM is deployed at companies such as DoorDash, Reddit, and Coinbase for fraud detection, customer churn prediction, and recommender systems, where it is reported to handle noisy, incomplete, and cold-start data effectively.
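To make the "database as a graph" idea concrete, here is a minimal sketch in plain Python. It is not Kumo's implementation or API. The tables, column names, and the `build_graph` helper are hypothetical and exist only for illustration. Each row becomes a node and each foreign-key reference becomes an edge, so no join or flattening step is needed.

```python
from collections import defaultdict

# Hypothetical toy database: two tables linked by orders.user_id -> users.user_id.
tables = {
    "users": [
        {"user_id": 1, "signup": "2024-01-05"},
        {"user_id": 2, "signup": "2024-02-11"},
    ],
    "orders": [
        {"order_id": 10, "user_id": 1, "amount": 25.0},
        {"order_id": 11, "user_id": 1, "amount": 40.0},
        {"order_id": 12, "user_id": 2, "amount": 15.0},
    ],
}
primary_keys = {"users": "user_id", "orders": "order_id"}
# For each table: foreign-key column -> referenced table.
foreign_keys = {"orders": {"user_id": "users"}}


def build_graph(tables, primary_keys, foreign_keys):
    """Rows become nodes keyed by (table, primary key); foreign keys become undirected edges."""
    adjacency = defaultdict(set)
    for table, rows in tables.items():
        pk = primary_keys[table]
        for row in rows:
            node = (table, row[pk])
            adjacency[node]  # touch the key so rows without links still appear as nodes
            for column, target_table in foreign_keys.get(table, {}).items():
                target = (target_table, row[column])
                adjacency[node].add(target)
                adjacency[target].add(node)
    return adjacency


graph = build_graph(tables, primary_keys, foreign_keys)
neighbors_of_user_1 = graph[("users", 1)]  # the two order rows that reference user 1
```

A flattened single-table approach would first aggregate `orders` into per-user columns. Here the raw rows stay intact and the relationships are carried by the edges.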

Key takeaway

For machine learning engineers and data scientists working with complex, multi-table enterprise data, Kumo's Relational Foundation Model offers a compelling alternative to manual feature engineering. It is worth evaluating for its reported accuracy gains (a 5% relative improvement over state-of-the-art supervised models, rising to 12% with fine-tuning) and for simpler model deployment, especially in fraud detection, customer behavior modeling, and recommendation systems. The approach reduces manual effort and improves performance on noisy or sparse datasets, so you can focus on integrating predictions into downstream business processes rather than on endless feature iteration.

Key insights

Relational Foundation Models use graph neural networks for in-context learning on raw multi-table data, enabling accurate predictions without task-specific training.

Method

RFM uses in-context learning: it extracts labeled subgraphs from a database, which a pre-trained neural network processes in a single forward pass to predict on unlabeled data.
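The in-context flow described above can be sketched as follows. This is a toy illustration under loud assumptions, not Kumo's model. The data, the `extract_subgraph` and `encode` helpers, and the distance-weighted vote standing in for the pre-trained network are all hypothetical. The point it shows is that labeled subgraphs are supplied as context at prediction time and no parameters are updated.

```python
# Hypothetical toy data: order rows per user, with churn labels known for some users only.
orders = [
    {"user_id": 1, "amount": 25.0},
    {"user_id": 1, "amount": 40.0},
    {"user_id": 1, "amount": 30.0},
    {"user_id": 2, "amount": 15.0},
    {"user_id": 3, "amount": 22.0},
    {"user_id": 3, "amount": 35.0},
    {"user_id": 4, "amount": 12.0},
]
known_churn = {1: 0, 2: 1, 3: 0}  # user 4 has no label and is the prediction target


def extract_subgraph(user_id):
    """Collect the rows linked to one user (its one-hop neighborhood)."""
    return [row for row in orders if row["user_id"] == user_id]


def encode(subgraph):
    """Stand-in encoder: (order count, total spend). A real model would learn this."""
    return (len(subgraph), sum(row["amount"] for row in subgraph))


def predict_in_context(context, query_subgraph):
    """Stand-in for the pre-trained network: a single pass with no parameter updates.

    Returns a distance-weighted average of the context labels, so labeled
    examples that look like the query dominate the prediction.
    """
    query_vec = encode(query_subgraph)
    weighted_sum = total_weight = 0.0
    for subgraph, label in context:
        example_vec = encode(subgraph)
        dist = sum((a - b) ** 2 for a, b in zip(query_vec, example_vec)) ** 0.5
        weight = 1.0 / (1.0 + dist)
        weighted_sum += weight * label
        total_weight += weight
    return weighted_sum / total_weight


# Labeled in-context examples: (subgraph, label) pairs pulled from the database.
context = [(extract_subgraph(user), label) for user, label in known_churn.items()]
score = predict_in_context(context, extract_subgraph(4))  # churn score in [0, 1]
```

User 4 has a single small order, much like user 2, who churned, so the score lands closer to 1 than to 0. Swapping in a different task only changes which labels populate `context`; nothing is retrained.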

Best for: AI Engineer, Investor, CTO, AI Scientist, Machine Learning Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by The TWIML AI Podcast with Sam Charrington.