From AI Agents to Faster Kernels: Ben Burtenshaw & Felix LeClair (AI Plumbers #2)
Summary
Ben Burtenshaw of Hugging Face, an advocate for high-performance kernels, discusses the Kernels Hub, a new library designed to simplify the installation of optimized kernels like Flash Attention 3 and 4. The hub cuts installation time from hours to seconds, which mainly benefits users on single instances and older hardware by lowering costs and saving time. The discussion also covers the use of AI agents, specifically Claude, to generate and optimize kernels. Initial attempts with Claude 4.5 were challenging, but defining "skills" with best practices and example scripts significantly improved its performance, enabling it to generate a kernel for Diffusers on H100 that was 50% faster than the baseline. The same "skill" approach also improved open-weight models like Kimi 2.5 and reduced their token consumption.
Key takeaway
Computer Vision Engineers seeking to optimize model performance on diverse hardware should explore the Hugging Face Kernels Hub. This resource simplifies access to high-performance kernels, including those generated by AI agents, potentially reducing operational costs and extending the utility of older hardware. Consider contributing to the Kernels GitHub repository to help expand support for a wider range of hardware configurations.
Key insights
AI agents can generate optimized kernels, but structured "skills" and robust infrastructure are crucial for widespread adoption and benefit.
Principles
- Optimized kernels extend hardware lifespan.
- Standardized kernel declarations enhance education.
Method
Define agent "skills" using best practices, reference files, and example scripts to guide kernel generation, then benchmark performance with generated test sets.
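The benchmarking step of this method can be sketched in a few lines. The following is a minimal, CPU-only illustration, not the harness used in the talk: it generates a seeded test set, checks that a candidate implementation matches the baseline, and reports a median-time speedup. All names here (`make_test_set`, `evaluate`, `baseline_scale`, `candidate_scale`) are hypothetical, and the two "kernels" are plain Python stand-ins for a reference implementation and an agent-generated one.

```python
import random
import time
from statistics import median
from typing import Callable, Dict, List

Vector = List[float]
Kernel = Callable[[Vector], Vector]


def make_test_set(num_cases: int, size: int, seed: int = 0) -> List[Vector]:
    """Generate a reproducible set of random input vectors."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(size)] for _ in range(num_cases)]


def outputs_match(expected: Vector, actual: Vector, tol: float = 1e-9) -> bool:
    """Compare two outputs element-wise within an absolute tolerance."""
    return len(expected) == len(actual) and all(
        abs(e - a) <= tol for e, a in zip(expected, actual)
    )


def median_runtime(fn: Kernel, cases: List[Vector], repeats: int) -> float:
    """Time one full pass over the test set, repeated, and return the median."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for case in cases:
            fn(case)
        timings.append(time.perf_counter() - start)
    return median(timings)


def evaluate(baseline: Kernel, candidate: Kernel, cases: List[Vector],
             repeats: int = 5) -> Dict[str, object]:
    """Check correctness first, then compare median runtimes."""
    correct = all(outputs_match(baseline(c), candidate(c)) for c in cases)
    base_t = median_runtime(baseline, cases, repeats)
    cand_t = median_runtime(candidate, cases, repeats)
    speedup = base_t / cand_t if cand_t > 0 else float("inf")
    return {"correct": correct, "baseline_s": base_t,
            "candidate_s": cand_t, "speedup": speedup}


def baseline_scale(v: Vector) -> Vector:
    """Stand-in for a reference implementation (hypothetical)."""
    out = []
    for x in v:
        out.append(x * 2.0)
    return out


def candidate_scale(v: Vector) -> Vector:
    """Stand-in for an agent-generated implementation (hypothetical)."""
    return [x * 2.0 for x in v]


if __name__ == "__main__":
    cases = make_test_set(num_cases=20, size=1000, seed=42)
    print(evaluate(baseline_scale, candidate_scale, cases))
```

For real GPU kernels, a harness like this would also need warm-up runs and device synchronization before reading the clock; those are omitted here to keep the sketch dependency-free.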
In practice
- Use Kernels Hub for rapid kernel installation.
- Explore agent-generated kernels for older hardware.
- Pick up "good first issues" in the Hugging Face Kernels GitHub repository.
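As a rough illustration of the first item, loading a prebuilt kernel from the Hub might look like the sketch below. It assumes the `kernels` Python package and PyTorch are installed and that a supported GPU is available. The repository `kernels-community/activation` and its `gelu_fast(out, x)` function are taken from the `kernels` project's own example rather than from the talk, and the wrapper names `load_hub_kernel` and `gelu_with_hub_kernel` are hypothetical.

```python
def load_hub_kernel(repo_id: str = "kernels-community/activation"):
    """Download (or reuse a cached copy of) a prebuilt kernel module from the Hub.

    Assumption: the `kernels` package is installed (`pip install kernels`).
    The import is lazy so this file can be loaded without it.
    """
    from kernels import get_kernel

    return get_kernel(repo_id)


def gelu_with_hub_kernel(x):
    """Apply the Hub's fast GELU to a CUDA tensor and return a new tensor.

    Assumption: `x` is a torch tensor on a GPU that the kernel build supports.
    """
    import torch

    activation = load_hub_kernel()
    out = torch.empty_like(x)     # the kernel writes into a preallocated output
    activation.gelu_fast(out, x)  # signature as shown in the kernels project example
    return out
```

Calling `gelu_with_hub_kernel(torch.randn(10, 10, dtype=torch.float16, device="cuda"))` would fetch a binary matching the local environment instead of compiling from source, which is where the install-time savings described above come from.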
Topics
- High-Performance Kernels
- Hugging Face Kernels Hub
- AI Agents
- Hardware Optimization
- Machine Learning Infrastructure
Best for: Computer Vision Engineer, Machine Learning Engineer, AI Engineer, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by HuggingFace.