Sarvam Challenges DeepSeek On Benchmarks

· Source: AIM Network · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Novice, medium

Summary

Bengaluru-based Sarvam AI has released two open-weight large language models, Sarvam 30B and Sarvam 105B, trained in India using compute from the IndiaAI Mission. The 105B model performs strongly on agentic browsing tasks, outperforming DeepSeek-Coder-1 on BrowseComp, but trails rivals on coding benchmarks such as SWE-bench. The models' main strength is their handling of 22 Indian languages across 12 scripts, supported by a custom tokenizer. The 30B model is optimized for conversational AI, while the 105B model powers Sarvam's AI chatbot, "Indus." Indian startup founders have responded positively, and some plan to integrate the models into new products. Sarvam is also engaging with enterprise customers and partnering with Indian state governments, including Maharashtra and Odisha, and with the State Bank of India, to drive adoption and contribute to India's open-source AI ecosystem.

Key takeaway

For NLP engineers or CTOs evaluating LLMs for the Indian market, Sarvam AI's models merit priority consideration for their coverage of 22 Indian languages and their grasp of cultural nuance. Global benchmarks give only a rough guide to capability; the models' performance on Indic tasks and their enterprise adoption potential, including government partnerships, suggest a strong foundation for localized applications. Consider integrating these models to serve Indian customers effectively and to contribute to the national AI stack.

Key insights

Sarvam AI's models prioritize Indic language proficiency and real-world utility over global benchmark dominance.

Method

Sarvam AI trained its models on Indian compute, collected data from diverse Indian sources (web and books), and developed a tokenizer covering 22 Indian languages across 12 scripts so that the models can capture cultural nuances.
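
To give a sense of why script coverage in a tokenizer matters, here is a minimal Python sketch. It is not Sarvam's tokenizer and makes no claim about how that tokenizer works; the helper names `script_of` and `utf8_cost` and the sample words are illustrative assumptions. It only shows that text in Indic scripts costs three UTF-8 bytes per code point, versus one for basic Latin letters, which is one reason a vocabulary built mostly from English text tends to split Indic words into many more pieces.

```python
import unicodedata


def script_of(ch: str) -> str:
    """Rough script label: the first word of the Unicode character name."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def utf8_cost(text: str) -> dict:
    """Code points, UTF-8 bytes, and bytes per code point for a string."""
    n_chars = len(text)
    n_bytes = len(text.encode("utf-8"))
    return {"chars": n_chars, "bytes": n_bytes, "bytes_per_char": n_bytes / n_chars}


# Illustrative sample words (hypothetical test data, not from the article).
samples = {
    "English": "India",
    "Hindi": "भारत",
    "Bengali": "ভারত",
    "Tamil": "இந்தியா",
}

for language, word in samples.items():
    scripts = sorted({script_of(c) for c in word})
    print(language, scripts, utf8_cost(word))
```

A tokenizer trained with these scripts in mind can assign whole syllables or words to single vocabulary entries, rather than falling back to byte-level fragments.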

Best for: NLP Engineer, CTO, VP of Engineering/Data, Tech Journalist, AI Product Manager, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by AIM Network.