Cohere's 30B Coding Agent Beats Models 4x Its Size on One H100 — and It Shouldn't
Summary
Cohere released North Mini Code 1.0 on June 9, 2026, under an Apache 2.0 license. The model has 30 billion parameters, of which only 3 billion are active. On a real-world coding benchmark it scores just 0.6 points behind Claude Opus 4.6 and competes with models of around 120 billion parameters, four times its size. It runs on a single H100 GPU and carries no per-token charge. The release marks a strategic pivot for Cohere, traditionally an enterprise RAG company, into the developer model market, where it challenges established players with an efficient and accessible coding agent.
Key takeaway
For AI engineers evaluating coding agents or optimizing inference costs, Cohere's North Mini Code 1.0 deserves a close look. A model that rivals much larger ones while running on a single H100 with no per-token charge challenges assumptions about how much scale agentic coding requires. Add this Apache 2.0-licensed model to your testing pipeline to assess its agentic coding capabilities and its potential for cost savings in production deployments.
Key insights
Cohere's small, open-weight North Mini Code 1.0 unexpectedly rivals much larger models in agentic coding performance while running on minimal hardware.
Principles
- Smaller models can achieve competitive performance.
- Strategic shifts can open new market segments.
In practice
- Evaluate North Mini Code for agentic coding tasks.
- Consider open-weight models for cost efficiency.
- Test smaller models on single H100 setups.
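Before testing on a single H100, the memory claim can be sanity-checked with rough arithmetic. The sketch below is illustrative only: the parameter counts come from the summary, while the 80 GB H100 capacity, the storage precisions, and the 20% reserve for KV cache and activations are assumptions, and the function names are hypothetical.

```python
# Back-of-the-envelope check: do the weights of a 30B-parameter model fit on one H100?
# Parameter counts come from the summary; everything else is an illustrative assumption.

TOTAL_PARAMS = 30e9     # total parameters (from the summary)
ACTIVE_PARAMS = 3e9     # active parameters (from the summary)
H100_MEMORY_GB = 80.0   # assumption: the 80 GB H100 variant

# Assumption: common storage precisions; the summary does not say which one is used.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


def fits_on_gpu(num_params: float, bytes_per_param: float,
                gpu_gb: float = H100_MEMORY_GB,
                reserve_fraction: float = 0.2) -> bool:
    """True if the weights fit after reserving a fraction of GPU memory
    for KV cache and activations (the 20% default is an assumption)."""
    usable_gb = gpu_gb * (1.0 - reserve_fraction)
    return weight_memory_gb(num_params, bytes_per_param) <= usable_gb


if __name__ == "__main__":
    for name, nbytes in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(TOTAL_PARAMS, nbytes)
        ok = fits_on_gpu(TOTAL_PARAMS, nbytes)
        print(f"{name}: {gb:.0f} GB of weights, fits with 20% reserve: {ok}")
```

Under these assumptions, 16-bit weights take roughly 60 GB, which leaves limited headroom on an 80 GB card for long agentic contexts. If the model keeps all of its parameters resident in memory, as sparse designs typically do, the 3 billion active parameters lower compute per token but not the memory footprint, so the total count is the one to size against.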
Topics
- Cohere
- North Mini Code
- Coding Agents
- Large Language Models
- Model Performance
- Apache 2.0 License
- H100 GPU
Best for: CTO, VP of Engineering/Data, AI Architect, AI Engineer, Machine Learning Engineer, Director of AI/ML
Editorial summary, takeaway, and curation by AIssential. Original article published by Towards AI - Medium.