Toward Compiler World Models: Learning Latent Dynamics for Efficient Tensor Program Search

2026-06-08 · Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Expert, quick

Summary

A world-model-inspired evaluator improves tensor program optimization by modeling schedule evaluation as action-conditioned latent dynamics. Implemented within TVM AutoScheduler, the method targets two problems: the enormous schedule search space, and existing auto-schedulers that treat each candidate as a static code snapshot. It rolls out scheduling actions in a continuous latent space with a lightweight transition model, which avoids expensive Abstract Syntax Tree (AST) mutation and repeated code encoding. On representative subgraphs, it improves latency over Ansor by 1.37x on GPU and 1.54x on CPU under the same 64-trial budget. It also matches Ansor-10K within 2.2% geometric mean while using 10x fewer measurements. For full-model inference, it is faster than PyTorch and PyTorch-opt (cuDNN) by 4.61x and 3.67x geometric mean, respectively.

Key takeaway

For machine learning engineers optimizing tensor programs, evaluating schedules dynamically rather than as static snapshots yields measurable gains. If you currently rely on a static auto-scheduler such as Ansor, consider a world-model-inspired evaluator: the reported results are 1.37x (GPU) and 1.54x (CPU) lower subgraph latency than Ansor at the same 64-trial budget, parity with Ansor-10K within 2.2% using 10x fewer measurements, and a 4.61x geometric-mean speedup in full-model inference over PyTorch.

Key insights

Modeling tensor program optimization as latent dynamics improves efficiency and performance over static evaluation.

Principles

Evaluating a schedule as a sequence of actions outperforms scoring static code snapshots.
Rolling out actions in latent space avoids the cost of AST mutation and repeated code encoding.
Action-conditioned dynamics capture dependencies.

Method

The method treats schedule evaluation as action-conditioned latent dynamics. Instead of mutating the AST and re-encoding the code for every candidate, it rolls out scheduling actions in a continuous latent space using a lightweight transition model. The resulting dynamic representation is then combined with action and hardware features to rank candidates.
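The rollout-and-rank flow described above can be illustrated with a minimal NumPy sketch. It is not the article's implementation: the dimensions, the tanh transition, the mean-pooled action features, the linear scoring head, the "higher score is better" convention, and every name below are assumptions chosen for illustration, and the weights are random rather than learned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for the latent state, action embedding, and hardware features.
LATENT_DIM, ACTION_DIM, HW_DIM = 16, 4, 3

# Randomly initialised parameters stand in for weights a real system would learn.
W_z = rng.normal(scale=0.1, size=(LATENT_DIM, LATENT_DIM))
W_a = rng.normal(scale=0.1, size=(ACTION_DIM, LATENT_DIM))
w_score = rng.normal(size=LATENT_DIM + ACTION_DIM + HW_DIM)


def transition(z, a):
    """One latent step: next state from the current state and one action embedding."""
    return np.tanh(z @ W_z + a @ W_a)


def rollout(z0, actions):
    """Apply a sequence of action embeddings to the initial latent state."""
    z = z0
    for a in actions:
        z = transition(z, a)
    return z


def score(z0, actions, hw):
    """Combine the final latent state with action and hardware features."""
    z_final = rollout(z0, actions)
    features = np.concatenate([z_final, actions.mean(axis=0), hw])
    return float(features @ w_score)


def rank(z0, candidates, hw):
    """Return candidate indices from highest to lowest score, plus the scores."""
    scores = np.array([score(z0, c, hw) for c in candidates])
    return np.argsort(-scores), scores


# The unscheduled program is encoded once; candidates differ only in their actions.
z0 = rng.normal(size=LATENT_DIM)
hw = np.array([1.0, 0.0, 0.5])  # placeholder hardware feature vector
candidates = [rng.normal(size=(n, ACTION_DIM)) for n in (3, 5, 4)]
order, scores = rank(z0, candidates, hw)
```

The point of the sketch is the cost structure: the program is encoded once into `z0`, and each candidate schedule then costs only a few small matrix products, with no code regeneration or re-encoding per candidate.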

In practice

Integrate into TVM AutoScheduler for optimization.
Apply to GPU and CPU tensor program search.
Accelerate full-model inference relative to PyTorch and PyTorch-opt (cuDNN) baselines.

Topics

Tensor Program Optimization
Compiler World Models
Latent Dynamics
TVM AutoScheduler
Machine Learning Systems
GPU Optimization

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.