DeepSeek V4 - almost on the frontier, a fraction of the price
Summary
DeepSeek AI released two preview models in its V4 series on April 24, 2026: DeepSeek-V4-Pro and DeepSeek-V4-Flash. Both are Mixture-of-Experts models with a 1-million-token context window, released under an MIT license. DeepSeek-V4-Pro, with 1.6T total parameters and 49B active, is now the largest open-weights model, ahead of Kimi K2.6 and GLM-5.1. DeepSeek-V4-Flash has 284B total parameters and 13B active. The headline is pricing: DeepSeek-V4-Flash costs $0.14 per million input tokens and $0.28 per million output tokens, making it the cheapest small model, while DeepSeek-V4-Pro costs $1.74 per million input tokens and $3.48 per million output tokens, making it the most affordable large frontier model. The low prices follow from architectural improvements that reduce single-token FLOPs and KV cache size at long context relative to DeepSeek-V3.2.
Key takeaway
For AI engineers evaluating large language models for deployment, DeepSeek-V4-Pro and DeepSeek-V4-Flash are compelling cost-performance options. Both can cut inference costs substantially relative to other frontier models, especially for applications that need 1M-token contexts. Watch for quantized versions from teams such as Unsloth, which may make local deployment on consumer hardware possible and reduce operating costs further.
Key insights
DeepSeek V4 models pair near-frontier performance with the lowest prices in their size classes, made possible by lower compute and memory cost per token.
Principles
- Efficiency drives competitive pricing.
- MoE architectures scale parameters effectively.
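The second principle can be made concrete with the parameter counts quoted in the summary. The short sketch below computes what share of each model's parameters is active for a single token. The function and variable names are illustrative, not part of any DeepSeek tooling.

```python
# Parameter counts as quoted in the summary: (total, active per token).
MODELS = {
    "DeepSeek-V4-Pro": (1.6e12, 49e9),
    "DeepSeek-V4-Flash": (284e9, 13e9),
}


def active_fraction(total_params: float, active_params: float) -> float:
    """Share of a Mixture-of-Experts model's parameters used for each token."""
    return active_params / total_params


if __name__ == "__main__":
    for name, (total, active) in MODELS.items():
        share = active_fraction(total, active)
        print(f"{name}: {share:.1%} of parameters active per token")
```

Only about 3% (Pro) and under 5% (Flash) of the weights take part in each token, which is how total parameter count can grow without a matching rise in per-token compute.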
Method
DeepSeek-V4 models gain efficiency by requiring fewer FLOPs per token and a smaller KV cache than DeepSeek-V3.2, particularly at 1M-token contexts.
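To see why KV cache size matters at this context length, the sketch below applies the textbook cache-size formula for a standard transformer with grouped key/value heads. It is not a description of DeepSeek's architecture, which this summary does not detail, and the layer count, head count, and head dimension are hypothetical values chosen only to show the scaling.

```python
def kv_cache_bytes(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    seq_len: int,
    bytes_per_value: int = 2,  # 2 bytes per value for fp16/bf16
) -> int:
    """KV cache size for one sequence in a standard (non-compressed) transformer.

    Each layer stores one key vector and one value vector per KV head per token,
    hence the leading factor of 2.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


if __name__ == "__main__":
    # Hypothetical configuration, NOT DeepSeek-V4's real dimensions.
    for seq_len in (8_000, 128_000, 1_000_000):
        size = kv_cache_bytes(n_layers=60, n_kv_heads=8, head_dim=128, seq_len=seq_len)
        print(f"{seq_len:>9,} tokens: {size / 1e9:,.1f} GB")
```

Because the cache grows linearly with sequence length, an uncompressed cache for this hypothetical configuration reaches hundreds of gigabytes per sequence at 1M tokens. That is why shrinking the cache has a direct effect on serving cost at long context.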
In practice
- Use DeepSeek-V4-Flash for cost-sensitive tasks that a small model can handle.
- Consider DeepSeek-V4-Pro for applications that need a large frontier-class model at a lower price.
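Choosing between the two models often comes down to a per-request cost estimate. The sketch below uses the list prices quoted in the summary. The function name and the example token counts are illustrative, and it ignores any caching discounts or other pricing tiers the provider may offer.

```python
# List prices in USD per million tokens, as quoted in the summary: (input, output).
PRICES_PER_MILLION = {
    "DeepSeek-V4-Flash": (0.14, 0.28),
    "DeepSeek-V4-Pro": (1.74, 3.48),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at list price."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a full 1M-token prompt with a 2,000-token answer.
    for model in PRICES_PER_MILLION:
        cost = request_cost(model, input_tokens=1_000_000, output_tokens=2_000)
        print(f"{model}: ${cost:.4f} per request")
```

At these prices a full-context request costs roughly $0.14 on Flash and $1.75 on Pro, so the long prompt dominates the bill and the choice of model changes the cost by about a factor of twelve.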
Topics
- DeepSeek V4
- Mixture-of-Experts
- Large Language Models
- AI Model Pricing
- Model Efficiency
Best for: CTO, VP of Engineering/Data, AI Engineer, AI Scientist, Machine Learning Engineer, Director of AI/ML
Editorial summary, takeaway, and curation by AIssential. Original article published by Simon Willison's Weblog.