Automating GPU Kernel Translation with AI Agents: cuTile Python to cuTile.jl

Source: NVIDIA Technical Blog · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Software Development & Engineering) · Depth: Advanced, medium

Summary

NVIDIA CUDA Tile (cuTile) is a tile-based programming model for GPU kernels. cuTile.jl brings this model to Julia, so custom GPU kernels can be written without CUDA C++. This gives Julia's scientific computing ecosystem, including differential equations and physics simulations, access to optimized GPU acceleration. Translating existing cuTile Python kernels to cuTile.jl is difficult because the two differ semantically in indexing, broadcasting, memory layout, and loop forms, and a mistake in any of these can produce silent data corruption rather than a compiler error. To address this, NVIDIA developed an AI-assisted workflow, packaged as an LLM skill in TileGym, that systematizes the translation by encoding critical rules, API mappings, and validation steps. The skill supports accurate, repeatable GPU kernel translation across domain-specific languages (DSLs), demonstrated on matrix multiplication and softmax examples.
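To make the "silent data corruption" point concrete, here is a minimal sketch in plain Python. It does not use any cuTile or cuTile.jl API. It assumes the usual conventions of the two host languages (0-based, row-major indexing on the Python/NumPy side and 1-based, column-major indexing on the Julia side), and all function names are hypothetical. Every lookup below stays in bounds, so nothing raises; the wrong reads simply return the wrong element.

```python
# Illustrative only: plain Python, no cuTile API. All names are hypothetical.
ROWS, COLS = 2, 3

# Flat buffer written by a row-major producer: logical matrix [[0, 1, 2], [3, 4, 5]]
buffer = [0, 1, 2, 3, 4, 5]


def read_row_major(buf, i, j, cols=COLS):
    """0-based (i, j) lookup assuming row-major storage (the NumPy default)."""
    return buf[i * cols + j]


def read_col_major(buf, i, j, rows=ROWS):
    """0-based (i, j) lookup assuming column-major storage (the Julia default)."""
    return buf[j * rows + i]


def julia_to_python_index(i1, j1):
    """Convert a 1-based Julia index pair to a 0-based Python index pair."""
    return i1 - 1, j1 - 1


# Target: row 0, column 1 of the logical matrix (Julia would call it element (1, 2)).
correct = read_row_major(buffer, 0, 1)

# Pitfall 1: same indices, wrong layout assumption. In bounds, wrong element.
wrong_layout = read_col_major(buffer, 0, 1)

# Pitfall 2: a 1-based index pair used as if it were 0-based. Also in bounds.
off_by_one = read_row_major(buffer, 1, 2)

# Applying the index conversion recovers the intended element.
fixed = read_row_major(buffer, *julia_to_python_index(1, 2))

print(correct, wrong_layout, off_by_one, fixed)
```

Because neither mistake triggers an error, only a comparison against known-good values exposes it, which is why the workflow pairs translation rules with validation.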

Key takeaway

AI engineers and research scientists porting GPU kernels between domain-specific languages such as cuTile Python and cuTile.jl should use structured AI agent skills. This approach, exemplified by TileGym's conversion skill, captures critical translation rules and pitfalls, which reduces manual effort and helps prevent silent data corruption. Systematizing the process with validated examples and static checkers makes kernel translations faster and more reliable.

Key insights

AI-assisted workflows can translate GPU kernels between DSLs by encoding domain-specific rules and pitfalls.

Method

The method involves analyzing source kernels, applying API mappings and critical rules, running static validation, testing against reference implementations, and debugging using a structured guide.
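The static validation and reference testing steps can be sketched as follows. This is a minimal illustration under stated assumptions, not TileGym's actual checker: the regex rule, the function names, and the tolerances are all hypothetical, and the "translated output" is a stand-in list rather than the result of a real kernel. It uses only the Python standard library.

```python
import math
import re

# Hypothetical static rule: a literal 0 as the first index in Julia source is
# suspicious, because Julia arrays are 1-based. Matches "[0]" and "[0, ...".
ZERO_INDEX = re.compile(r"\[\s*0\s*[\],]")


def static_check(julia_source):
    """Return the 1-based line numbers that contain a literal zero index."""
    return [
        number
        for number, line in enumerate(julia_source.splitlines(), start=1)
        if ZERO_INDEX.search(line)
    ]


def reference_softmax(xs):
    """Numerically stable softmax, used as the trusted reference."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matches_reference(candidate, reference, rel_tol=1e-6, abs_tol=1e-8):
    """Element-wise comparison of a candidate output against the reference."""
    return len(candidate) == len(reference) and all(
        math.isclose(c, r, rel_tol=rel_tol, abs_tol=abs_tol)
        for c, r in zip(candidate, reference)
    )


# Static validation on a made-up snippet of translated source.
sample = "a = x[0]\nb = x[1]\nc = y[0, 2]"
flagged = static_check(sample)

# Reference testing: a rotated copy simulates an indexing bug in a translation.
expected = reference_softmax([1.0, 2.0, 3.0])
shifted = expected[1:] + expected[:1]

print(flagged)
print(matches_reference(list(expected), expected))
print(matches_reference(shifted, expected))
```

Note that the rotated output still sums to 1, so a loose sanity check would pass it; an element-wise comparison against a reference implementation is what catches the error.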

Best for: Machine Learning Engineer, AI Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by NVIDIA Technical Blog.