ChinAI #349: Tokens Made in China?

2022-03-07 · Source: ChinAI Newsletter · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Public Policy & Governance · Depth: Intermediate, quick

Summary

On February 24, 2026, OpenRouter data showed that three Chinese models, MiniMax M2.5, Moonshot AI's Kimi K2.5, and Zhipu GLM-5, were the most popular models on its API aggregation platform, which is favored by individual developers and AI start-ups. Their popularity stems from a superior price-to-performance ratio: MiniMax M2.5 scored 80.2% on a software engineering task against 80.8% for Claude Opus 4.6, but cost only $0.30 per million tokens versus Claude's $5. The cost advantage is attributed to China's 40% lower electricity prices and to algorithmic efficiencies such as DeepSeek V3's MoE architecture, which makes inference roughly 36 times cheaper than GPT-4o. The article highlights how this trend, particularly in "agentic flows run by U.S. firms," represents an "invisible" export of Chinese computing power and electricity, challenging the narrative of U.S.-China tech decoupling.
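
To make the price-to-performance comparison concrete, here is a small sketch that uses only the two scores and two prices quoted in the summary. The names `MODELS`, `cost_per_point`, and `price_ratio` are illustrative, and "dollars per million tokens per benchmark point" is an assumed metric for illustration, not one the original article uses. The summary does not say whether the prices refer to input or output tokens, so the figures are taken at face value.

```python
# Figures quoted in the summary: (benchmark score in %, USD per million tokens).
MODELS = {
    "MiniMax M2.5": (80.2, 0.30),
    "Claude Opus 4.6": (80.8, 5.00),
}


def cost_per_point(score_pct: float, usd_per_mtok: float) -> float:
    """USD per million tokens divided by benchmark score (hypothetical metric)."""
    return usd_per_mtok / score_pct


def price_ratio(cheaper: str, pricier: str) -> float:
    """How many times higher the second model's token price is than the first's."""
    return MODELS[pricier][1] / MODELS[cheaper][1]


if __name__ == "__main__":
    for name, (score, price) in MODELS.items():
        per_point = cost_per_point(score, price)
        print(f"{name}: {score}% at ${price:.2f}/Mtok ({per_point:.4f} $/Mtok per point)")

    ratio = price_ratio("MiniMax M2.5", "Claude Opus 4.6")
    gap = MODELS["Claude Opus 4.6"][0] - MODELS["MiniMax M2.5"][0]
    print(f"Token price ratio: {ratio:.1f}x, score gap: {gap:.1f} points")
```

On these numbers the token price differs by a factor of about 17 while the benchmark scores differ by 0.6 percentage points, which is the gap the summary describes.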

Key takeaway

AI developers and start-ups that prioritize cost-efficiency should evaluate Chinese large language models such as MiniMax M2.5 and Kimi K2.5, especially for agentic workflows. Their significantly lower token costs, driven by cheaper electricity and algorithmic advantages, make them a compelling alternative to more expensive Western models. However, be mindful of potential data residency and compliance issues for sensitive corporate data, as API requests are processed through Chinese data centers.

Key insights

Chinese AI models are gaining traction among U.S. developers due to their compelling price-to-performance ratio.

Principles

Cost sensitivity drives AI model adoption.
Algorithmic efficiency reduces inference costs.
Tokens bypass traditional trade barriers.

Method

Chinese models achieve cost efficiency through lower electricity prices and advanced architectures like Mixture-of-Experts (MoE) that activate only a subset of parameters during inference.
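
The idea of activating only a subset of parameters can be sketched with a toy top-k router. This is a minimal illustration under stated assumptions, not DeepSeek V3's actual design: the helper names (`route_top_k`, `moe_forward`), the expert count, and the choice to apply softmax over only the selected experts' gate scores (one common routing variant) are all hypothetical. Real experts are feed-forward networks, not scalar functions.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    order = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([gate_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, gate_logits, k):
    """Run only the k selected experts and return their weighted sum."""
    routed = route_top_k(gate_logits, k)
    output = sum(weight * experts[i](x) for i, weight in routed)
    return output, [i for i, _ in routed]


if __name__ == "__main__":
    random.seed(0)
    num_experts, k = 8, 2  # illustrative sizes only
    # Toy experts: expert s multiplies its input by s.
    experts = [lambda x, s=s: s * x for s in range(1, num_experts + 1)]
    # In a real model the gate scores come from a learned layer applied to the token.
    gate_logits = [random.gauss(0, 1) for _ in range(num_experts)]
    y, used = moe_forward(3.0, experts, gate_logits, k)
    print(f"experts used: {used} ({k} of {num_experts} evaluated)")
    print(f"output: {y:.3f}")
```

Because only `k` of the experts run per token, compute per token scales with `k` rather than with the total number of experts, which is the source of the inference savings the summary attributes to MoE.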

In practice

Consider Chinese models for cost-sensitive AI agent development.
Evaluate models based on price-to-performance benchmarks.
Explore MoE architectures for inference cost reduction.

Topics

Chinese AI Models
OpenRouter Platform
Price-Performance Ratio
AI Agent Workflows
US-China Tech Relations

Best for: Machine Learning Engineer, NLP Engineer, CTO, AI Engineer, AI Researcher, Policy Maker

Editorial summary, takeaway, and curation by AIssential. Original article published by ChinAI Newsletter.