Introducing AI Runtime: Scalable, Serverless NVIDIA GPUs on Databricks for Training and Finetuning

2026-03-19 · Source: Databricks · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, Data Science & Analytics · Depth: Intermediate, quick

Summary

Databricks has announced the Public Preview of AI Runtime (AIR), a new training stack that simplifies on-demand distributed GPU training for advanced AI workloads. AIR provides serverless access to NVIDIA A10 and H100 GPUs directly within Databricks Notebooks, so there are no clusters to manage and billing covers only active GPU usage. For production GPU workloads, it integrates with Databricks' orchestration suite, including Lakeflow Jobs and Declarative Automation Bundles (DABs). The runtime is optimized for distributed deep learning: it bundles performance enhancements such as RDMA and high-performance data loading, and it ships with pre-installed dependencies and support for frameworks such as PyTorch, Ray, and Hugging Face Transformers. AIR also offers centralized governance and observability through MLflow and Unity Catalog. Distributed training is currently supported across 8 H100 GPUs in a single node.

Key takeaway

For AI Scientists and Research Scientists struggling with GPU infrastructure and the complexities of distributed training, Databricks' AI Runtime offers a streamlined alternative. On-demand A10 and H100 GPUs, pre-optimized environments, and integrated orchestration tools let them focus on model development, and can reduce setup and debugging time from days to hours.

Key insights

AI Runtime simplifies distributed GPU training by offering serverless, on-demand access and optimized tools within Databricks.

Principles

Focus on modeling, not infrastructure
Pay-as-you-go GPU compute
Integrate training with data pipelines
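
The pay-as-you-go principle can be illustrated with a little arithmetic. The sketch below compares billing only for active GPU time with billing for a cluster that stays provisioned all day. The hourly rate, the hours, and the helper name billed_cost are all hypothetical placeholders chosen for illustration; they are not Databricks pricing.

```python
def billed_cost(hours: float, rate_per_gpu_hour: float, num_gpus: int = 1) -> float:
    """Return the cost of running num_gpus GPUs for the given number of hours."""
    if hours < 0 or rate_per_gpu_hour < 0 or num_gpus < 1:
        raise ValueError("hours and rate must be non-negative, num_gpus at least 1")
    return hours * rate_per_gpu_hour * num_gpus


# All numbers below are made up for illustration only.
ACTIVE_HOURS = 3.0        # time the GPUs actually spend training
PROVISIONED_HOURS = 8.0   # time an always-on cluster would stay up
RATE = 10.0               # hypothetical price per GPU-hour
NUM_GPUS = 8              # one node of 8 GPUs

# Billed for active usage only.
serverless_cost = billed_cost(ACTIVE_HOURS, RATE, NUM_GPUS)

# Billed for the whole provisioned window, including idle time.
always_on_cost = billed_cost(PROVISIONED_HOURS, RATE, NUM_GPUS)

idle_cost_avoided = always_on_cost - serverless_cost
print(f"active-only: {serverless_cost}, always-on: {always_on_cost}, "
      f"idle cost avoided: {idle_cost_avoided}")
```

Under these assumed numbers, the difference between the two totals is exactly the cost of the idle hours, which is the portion that active-usage billing avoids.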

Method

Configure a notebook to run on A10 or H100 GPUs, orchestrate production runs with Lakeflow Jobs, and train with the pre-installed distributed frameworks (PyTorch, Ray, Hugging Face Transformers), using MLflow for observability.
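
This summary does not show AIR's own launch API, so the following is a framework-agnostic sketch in plain Python of one idea behind data-parallel training on an 8-GPU node: each worker (rank) processes its own slice of the dataset. The function name shard_indices is hypothetical. The strided, wrap-around split is an assumption modeled on how PyTorch's DistributedSampler divides indices when shuffling is off; it is not a description of AIR internals.

```python
import math


def shard_indices(num_samples: int, world_size: int, rank: int) -> list[int]:
    """Return the dataset indices that one worker would process in an epoch.

    Indices are padded by wrapping around to the start so that every rank
    receives the same number of samples, then dealt out in a strided pattern.
    """
    if world_size < 1:
        raise ValueError("world_size must be at least 1")
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")

    per_rank = math.ceil(num_samples / world_size)
    total = per_rank * world_size

    # Wrap around so the padded list has exactly `total` entries.
    padded = [i % num_samples for i in range(total)]

    # Rank r takes positions r, r + world_size, r + 2 * world_size, ...
    return padded[rank:total:world_size]


# Example: 10 samples split across 8 workers, one per GPU on a single node.
WORLD_SIZE = 8
shards = [shard_indices(10, WORLD_SIZE, rank) for rank in range(WORLD_SIZE)]
for rank, shard in enumerate(shards):
    print(f"rank {rank}: {shard}")
```

Equal-sized shards matter because synchronized data-parallel training expects every worker to run the same number of steps per epoch; the cost is that a few samples are seen twice when the dataset size is not a multiple of the worker count.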

In practice

Train LLMs like MPT and DBRX
Develop computer vision models
Fine-tune LLMs for agentic tasks

Topics

AI Runtime
Distributed GPU Training
LLM Fine-tuning
MLOps
NVIDIA GPUs

Best for: AI Scientist, Research Scientist, CTO, AI Researcher, Machine Learning Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Databricks.