Holotron-12B - High Throughput Computer Use Agent

2026-03-17 · Source: Hugging Face - Blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Advanced, short

Summary

H Company has released Holotron-12B, a 12-billion-parameter multimodal computer-use model post-trained from the open NVIDIA Nemotron-Nano-2 VL model on proprietary data and optimized for high-throughput inference in production agentic workloads. Its hybrid State-Space Model (SSM) and attention architecture delivers more than 2x the throughput of Holo2-8B on a single H100 GPU, reaching 8.9k tokens/s at a concurrency of 100 thanks to efficient VRAM utilization. After post-training on approximately 14 billion tokens, its WebVoyager score rose from 35.1% to 80.5%, and it showed strong gains on localization benchmarks such as OS-World-G, GroundUI, and WebClick. The model is available on Hugging Face under an NVIDIA Open Model License, and its results suggest that the Nemotron VL model is a strong foundation for real-world multimodal agents. H Company plans to build on the newly announced NVIDIA Nemotron 3 Omni to further scale agentic intelligence for commercial computer-use deployments.
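To make the VRAM point concrete, here is a minimal Python sketch of why a hybrid SSM and attention stack can hold more concurrent sequences in memory than a pure-attention stack. Every dimension below (layer counts, head sizes, state size, sequence length) is a hypothetical placeholder chosen for illustration, not a published Holotron-12B or Holo2-8B value; only the general scaling behavior is the point.

```python
def kv_cache_bytes(attn_layers, kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    """Memory for keys and values: grows linearly with sequence length."""
    # Factor of 2 covers one key tensor and one value tensor per attention layer.
    return 2 * attn_layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem


def ssm_state_bytes(ssm_layers, state_elems_per_layer, batch, bytes_per_elem=2):
    """Memory for recurrent SSM state: fixed per sequence, independent of length."""
    return ssm_layers * state_elems_per_layer * batch * bytes_per_elem


# Hypothetical configuration (assumptions, not real model specs).
SEQ_LEN = 16_384      # tokens of context per request
BATCH = 100           # concurrent requests
KV_HEADS = 8
HEAD_DIM = 128
STATE_ELEMS = 1_000_000  # assumed SSM state elements per layer

# Pure attention: all 40 hypothetical layers keep a KV cache.
pure_attention = kv_cache_bytes(40, KV_HEADS, HEAD_DIM, SEQ_LEN, BATCH)

# Hybrid: 4 attention layers keep a KV cache, 36 SSM layers keep a fixed state.
hybrid = (
    kv_cache_bytes(4, KV_HEADS, HEAD_DIM, SEQ_LEN, BATCH)
    + ssm_state_bytes(36, STATE_ELEMS, BATCH)
)

GIB = 1024 ** 3
print(f"pure attention cache: {pure_attention / GIB:.1f} GiB")
print(f"hybrid cache:         {hybrid / GIB:.1f} GiB")
```

Under these assumed numbers the hybrid cache is several times smaller, and the gap widens as context length grows because only the attention layers scale with it. Real savings depend on the actual layer mix and state size.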

Key takeaway

Holotron-12B is a 12B multimodal computer-use agent model post-trained from NVIDIA Nemotron-Nano-2 VL that uses a hybrid State-Space Model (SSM) and attention architecture for high-throughput inference. It delivers more than 2x the throughput of Holo2-8B (8.9k tokens/s at a concurrency of 100 on a single H100) and raises its WebVoyager score from 35.1% to 80.5% through post-training. This makes it well suited to throughput-bound agentic workloads such as data generation and online reinforcement learning, and to scaling real-world autonomous computer-use deployments.
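The headline figures can be unpacked with some quick arithmetic. The Python sketch below uses only the numbers quoted above, plus two labeled assumptions: that aggregate throughput is shared evenly across concurrent requests, and a hypothetical one-billion-token data-generation job that does not come from the source.

```python
def per_request_rate(aggregate_tokens_per_s, concurrency):
    """Average tokens/s seen by each request, assuming an even split (an assumption)."""
    return aggregate_tokens_per_s / concurrency


def hours_to_generate(total_tokens, aggregate_tokens_per_s):
    """Wall-clock hours to emit total_tokens at a sustained aggregate rate."""
    return total_tokens / aggregate_tokens_per_s / 3600


AGGREGATE_TPS = 8_900   # "8.9k tokens/s" from the summary
CONCURRENCY = 100       # concurrency level from the summary

rate = per_request_rate(AGGREGATE_TPS, CONCURRENCY)
print(f"per-request rate: {rate:.0f} tokens/s")

# WebVoyager scores quoted in the summary.
before, after = 35.1, 80.5
print(f"absolute gain: {after - before:.1f} percentage points")
print(f"relative gain: {after / before:.2f}x")

# Hypothetical workload (not from the source): 1 billion generated tokens on one GPU.
HYPOTHETICAL_JOB_TOKENS = 1_000_000_000
print(f"hypothetical job: {hours_to_generate(HYPOTHETICAL_JOB_TOKENS, AGGREGATE_TPS):.1f} hours")
```

The per-request figure is an average only; actual latency per request depends on prompt length, image inputs, and scheduler behavior.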

Topics

Holotron-12B
Multimodal Models
State-Space Models
Agentic AI
NVIDIA Nemotron

Best for: AI Architect, AI Engineer, Machine Learning Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Hugging Face - Blog.