Cohere open-sources a coding agent that runs on a single H100

Source: VentureBeat · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Software Development & Engineering) · Depth: Advanced, short

Summary

Cohere has open-sourced North Mini Code, a 30-billion-parameter mixture-of-experts (MoE) coding agent designed for agentic software engineering, including sub-agent orchestration and code review. Launched Tuesday, the model activates 3 billion parameters per token, supports a 256,000-token context window, and runs on a single H100 GPU. It is available on Hugging Face under an Apache 2.0 license and targets tasks such as architecture mapping and terminal work. Cohere claims 2.8x higher output throughput and a 30% inter-token latency advantage over Mistral Devstral Small 2. Independent testing by Artificial Analysis, however, found that it generates three times the output tokens of comparable models, which could increase inference costs. Artificial Analysis ranks it 8th of 127 models for output speed (210 tokens/second) and 18th on the Intelligence Index.
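
As a rough sanity check on the single-GPU claim, the parameter counts above can be turned into a weight-memory estimate. This is a back-of-the-envelope sketch, not Cohere's own sizing: the 80 GB H100 capacity and the bytes-per-parameter values are assumptions, and the KV cache for a long context, activations, and runtime overhead are all ignored.

```python
# Back-of-the-envelope weight-memory estimate for a 30B-total / 3B-active MoE model.
# Parameter counts come from the summary; everything else is an assumption.

TOTAL_PARAMS = 30e9     # total parameters (from the summary)
ACTIVE_PARAMS = 3e9     # parameters active per token (from the summary)
H100_MEMORY_GB = 80.0   # ASSUMPTION: the 80 GB H100 variant


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights only, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


# Share of the model that participates in each token's forward pass.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# ASSUMPTION: illustrative storage precisions, not Cohere's published formats.
precisions = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}

# All experts must be resident in memory even though only a few run per token,
# so the footprint is driven by TOTAL_PARAMS, not ACTIVE_PARAMS.
footprints = {
    name: weight_memory_gb(TOTAL_PARAMS, nbytes)
    for name, nbytes in precisions.items()
}

if __name__ == "__main__":
    print(f"active fraction per token: {active_fraction:.0%}")
    for name, gb in footprints.items():
        headroom = H100_MEMORY_GB - gb
        print(f"{name}: {gb:.0f} GB weights, {headroom:.0f} GB left on the GPU")
```

Under these assumptions, 16-bit weights alone would occupy about 60 GB of an 80 GB card. The sparse activation reduces compute per token, not the memory needed to hold the experts, so headroom for a 256,000-token context is worth measuring on real workloads rather than inferring from this estimate.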

Key takeaway

For AI engineers building high-volume agentic coding pipelines, Cohere's North Mini Code presents a compelling open-source, on-premises alternative to managed models like Claude Fable 5. Model the total inference cost carefully, weighing North Mini Code's higher output verbosity against the $50 per million output tokens charged by proprietary solutions. Prioritize models with purpose-built agentic training, and evaluate their performance in real-world terminal environments.
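
The cost comparison in this takeaway can be sketched as a small per-task model. This is a minimal illustration, not a pricing tool: the GPU hourly rate and the baseline token count are hypothetical placeholders, and the 210 tokens/second figure is treated as the sustained output rate of one self-hosted GPU, which is itself an assumption (batched serving would change it). Input-token costs, idle GPU time, and differences in task success rate are ignored.

```python
# Per-task output-cost sketch: self-hosted verbose model vs. managed per-token pricing.
from dataclasses import dataclass


@dataclass
class SelfHostedConfig:
    gpu_cost_per_hour: float     # USD per GPU-hour (HYPOTHETICAL placeholder)
    tokens_per_second: float     # sustained output tokens/second on that GPU (ASSUMPTION)
    verbosity_multiplier: float  # output tokens relative to a comparable model


def self_hosted_cost_per_task(cfg: SelfHostedConfig, baseline_output_tokens: float) -> float:
    """GPU cost to generate one task's output, scaled up for extra verbosity."""
    tokens = baseline_output_tokens * cfg.verbosity_multiplier
    gpu_hours = tokens / cfg.tokens_per_second / 3600
    return gpu_hours * cfg.gpu_cost_per_hour


def managed_cost_per_task(price_per_million_output: float, output_tokens: float) -> float:
    """API cost for one task's output at a flat per-million-token price."""
    return output_tokens / 1_000_000 * price_per_million_output


# 3x verbosity and 210 tokens/second come from the summary; $3/hour is a placeholder.
cfg = SelfHostedConfig(gpu_cost_per_hour=3.0, tokens_per_second=210.0, verbosity_multiplier=3.0)
BASELINE_TOKENS = 10_000  # HYPOTHETICAL output tokens a comparable model needs per task

self_hosted = self_hosted_cost_per_task(cfg, BASELINE_TOKENS)
managed = managed_cost_per_task(50.0, BASELINE_TOKENS)

if __name__ == "__main__":
    print(f"self-hosted: ${self_hosted:.4f} per task")
    print(f"managed:     ${managed:.4f} per task")
```

The verbosity multiplier is applied only to the self-hosted side because the managed price already bills for whatever the managed model emits. Replacing the placeholders with measured throughput, real GPU pricing, and observed token counts per task is what makes the comparison meaningful.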

Key insights

Cohere's North Mini Code offers an open-source, locally deployable MoE coding agent for complex software engineering tasks.

Method

Cohere trained North Mini Code through two stages of supervised fine-tuning, followed by reinforcement learning across 70,000 verifiable tasks and three agent scaffolds (SWE-Agent, Mini-SWE-Agent, OpenCode).

Best for: AI Architect, AI Product Manager, Entrepreneur, AI Engineer, Machine Learning Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by VentureBeat.