Toward Compiler World Models: Learning Latent Dynamics for Efficient Tensor Program Search
Summary
A world-model-inspired evaluator for tensor program optimization models schedule evaluation as action-conditioned latent dynamics. Implemented within TVM AutoScheduler, it targets two problems: the enormous schedule search space, and the tendency of existing auto-schedulers to treat each candidate as a static code snapshot. Rather than mutating the Abstract Syntax Tree (AST) and re-encoding the code for every candidate, it rolls out scheduling actions in a continuous latent space with a lightweight transition model. With the same 64-trial budget, it improves representative-subgraph latency over Ansor by 1.37x on GPU and 1.54x on CPU. It also comes within 2.2% (geometric mean) of Ansor-10K while using 10x fewer measurements, and it speeds up full-model inference over PyTorch and PyTorch-opt (cuDNN) by 4.61x and 3.67x geometric mean, respectively.
Key takeaway
For Machine Learning Engineers optimizing tensor programs: if you currently rely on a static auto-scheduler such as Ansor, a world-model-inspired evaluator is worth considering. The reported gains over Ansor are 1.37x (GPU) and 1.54x (CPU) on representative-subgraph latency at the same 64-trial budget. Full-model inference is 4.61x faster than PyTorch in geometric mean, and results come within 2.2% of Ansor-10K with 10x fewer measurements.
Key insights
Modeling tensor program optimization as latent dynamics improves efficiency and performance over static evaluation.
Principles
- Dynamic schedule evaluation outperforms static snapshots.
- Latent space modeling reduces computational overhead.
- Action-conditioned dynamics capture dependencies between successive scheduling actions.
Method
The method models schedule evaluation as action-conditioned latent dynamics. It rolls out scheduling actions in a continuous latent space with a lightweight transition model, then combines the resulting dynamic representation with action and hardware features to rank candidates.
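As a rough illustration of the idea, the rollout-then-rank flow might be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the dimensions, the tanh transition, the mean-pooled action summary, the linear scorer, and every name here (`transition`, `rollout`, `score`, `rank`) are hypothetical, and the weights are random stand-ins for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the source does not specify any of these.
LATENT_DIM = 16   # size of the latent schedule state
ACTION_DIM = 4    # size of an encoded scheduling action
HW_DIM = 3        # size of the hardware feature vector

# Random stand-ins for learned parameters (assumption: a single linear
# layer with tanh for the transition, a linear head for the score).
W_z = rng.normal(scale=0.1, size=(LATENT_DIM, LATENT_DIM))
W_a = rng.normal(scale=0.1, size=(ACTION_DIM, LATENT_DIM))
w_score = rng.normal(scale=0.1, size=LATENT_DIM + ACTION_DIM + HW_DIM)


def transition(z, a):
    """One latent step: next state from the current state and one action."""
    return np.tanh(z @ W_z + a @ W_a)


def rollout(z0, actions):
    """Apply a sequence of actions in latent space; no code is re-encoded."""
    z = z0
    for a in actions:
        z = transition(z, a)
    return z


def score(z0, actions, hw):
    """Combine the final latent state with action and hardware features."""
    z_final = rollout(z0, actions)
    action_summary = actions.mean(axis=0)  # assumed pooling of the actions
    features = np.concatenate([z_final, action_summary, hw])
    return float(features @ w_score)


def rank(z0, candidates, hw):
    """Return candidate indices ordered from highest to lowest score."""
    scores = [score(z0, acts, hw) for acts in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Usage: three hypothetical candidates with 2, 3, and 5 scheduling actions.
z0 = rng.normal(size=LATENT_DIM)
hw = np.array([1.0, 0.0, 0.5])
candidates = [rng.normal(size=(n, ACTION_DIM)) for n in (2, 3, 5)]
order = rank(z0, candidates, hw)
```

The point of the sketch is the cost structure: the initial program is encoded once into `z0`, and each additional candidate costs only a few small matrix products per action, so only the top-ranked candidates would need to be compiled and measured on hardware.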
In practice
- Integrate into TVM AutoScheduler for optimization.
- Apply to GPU and CPU tensor program search.
- Accelerate PyTorch model inference.
Topics
- Tensor Program Optimization
- Compiler World Models
- Latent Dynamics
- TVM AutoScheduler
- Machine Learning Systems
- GPU Optimization
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.