Indian AI lab Sarvam has unveiled a new generation of models built to prove that open source can compete on speed, cost, and capability. The lineup spans 30B- and 105B-parameter large language models using a mixture-of-experts design, plus text-to-speech, speech-to-text, and a document-focused vision model. It is a clear wager that efficient, locally tuned systems can pry market share from closed platforms dominated by U.S. and Chinese giants.
Framed as a practical push rather than a race to sheer size, the release centers on real-time applications in Indian languages, from voice assistants to enterprise chat. The company says the new models were trained from scratch using government-backed compute under the IndiaAI Mission, with infrastructure from data center operator Yotta and technical support from Nvidia.

Inside the Models: Architecture, Context, and Training
Sarvam’s 30B and 105B models adopt a mixture-of-experts (MoE) architecture that routes each token through only a subset of parameters. That selective activation is designed to keep inference costs in check while preserving depth for complex reasoning. The 30B model supports a 32,000-token context window for fast, interactive use; the 105B model extends to 128,000 tokens for long workflows such as multi-document analysis and code or policy review.
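To make the routing idea concrete, here is a minimal sketch of top-k expert routing in PyTorch. The dimensions, expert count, and k value are illustrative defaults for exposition; they say nothing about Sarvam's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal MoE layer: each token activates only k of n_experts feed-forward blocks."""

    def __init__(self, d_model=512, n_experts=8, k=2, d_ff=2048):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts)  # router produces a score per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (n_tokens, d_model)
        scores = self.gate(x)                       # (n_tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # keep only the k best experts per token
        weights = F.softmax(weights, dim=-1)        # normalize over the chosen k
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for slot in range(self.k):
                mask = idx[:, slot] == e            # tokens that picked expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = TopKMoE()
tokens = torch.randn(4, 512)
print(layer(tokens).shape)  # torch.Size([4, 512])
```

The key property is visible in the forward pass: only k of the expert blocks run for any given token, so per-token compute scales with the active experts rather than the full parameter count.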
The 30B was pre-trained on roughly 16 trillion tokens, a scale that reflects the growing norm among high-performing open models. The 105B was trained on trillions more, with a deliberate emphasis on Indian languages. Alongside the LLMs, the speech-to-text and text-to-speech models target low-latency conversational agents, while the vision model focuses on parsing and understanding documents—a recurring need in banking, insurance, and public services.
Crucially, Sarvam says the models were not fine-tuned derivatives of someone else’s checkpoint. Training from scratch allows tighter control over data curation, tokenizer choices for Indic scripts, and safety filters. It also makes eventual licensing and commercial use cleaner for enterprises wary of provenance questions.
Why This Bet Matters for Open Source Adoption
Open models have been steadily closing the gap with proprietary systems on common benchmarks, with community evaluations like the LMSYS Chatbot Arena frequently ranking top open contenders near commercial leaders. Meta’s Llama family and Mistral’s MoE models have shown that thoughtful scaling strategies can deliver strong performance without unlimited compute budgets.
Sarvam’s move doubles down on that trajectory with a regional twist: make models small enough to deploy affordably, strong enough for multilingual tasks, and permissive enough to run on-prem. For regulated sectors—finance, healthcare, government—those three requirements increasingly determine adoption more than headline scores on academic benchmarks.
There’s also a cost reality. Inference, not training, often dominates total cost of ownership over time. MoE designs that keep most parameters idle per token can reduce serving costs without sacrificing headroom for complex prompts. That calculus is where open source can shine, enabling teams to fine-tune for domain tasks and run workloads on commodity or regional cloud GPUs.
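A rough sketch of that arithmetic, using purely hypothetical numbers since Sarvam has not published active-parameter counts for its 105B model:

```python
# Illustrative only: the 12B "active" figure below is an assumption, not a spec.
total_params  = 105e9   # all parameters stored in memory
active_params = 12e9    # parameters actually exercised per token (hypothetical)

# Common rule of thumb: ~2 FLOPs per active parameter per generated token.
moe_flops_per_token   = 2 * active_params
dense_flops_per_token = 2 * total_params

print(f"active fraction:            {active_params / total_params:.0%}")   # ~11%
print(f"per-token compute vs dense: {moe_flops_per_token / dense_flops_per_token:.0%}")
```

Memory footprint still scales with total parameters, but per-token compute, and therefore much of the serving bill, tracks the active subset.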

Local Advantages and Use Cases Across India
India’s market is tailor-made for this push. The country has over 800 million internet users and a mobile-first culture where voice is the interface of choice for new adopters. With 22 scheduled languages recognized in the Constitution and hundreds of widely spoken dialects, generic English-centric systems routinely miss intent and nuance.
Sarvam’s speech models and document parser target high-frequency pain points: customer support in Hindi or Tamil, onboarding in Bengali or Marathi, form understanding across mixed-language PDFs, and compliance workflows that fuse chat, voice, and document intelligence. A 128K-token window lets a model take in dozens of pages of policy or case law at once and distill them into coherent next steps—vital for law firms, public agencies, and large enterprises.
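As a back-of-envelope check on what 128K tokens actually holds (the per-page figures below are assumptions, and Indic scripts often tokenize less efficiently than English, so real capacity may be lower):

```python
context_tokens  = 128_000
words_per_page  = 500    # assumption for a dense policy document
tokens_per_word = 1.3    # rough figure for Latin-script English text

pages = context_tokens / (words_per_page * tokens_per_word)
print(f"~{pages:.0f} pages fit in one context window")  # roughly 200 pages
```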
The company is also framing an application layer: enterprise tools under Sarvam for Work and a conversational agent platform called Samvaad. If executed well, that stack can shorten pilots and make it easier for CIOs to justify production launches instead of lingering in proofs of concept.
Funding, Partnerships, and Risks for Sarvam AI
Founded in 2023, Sarvam has raised more than $50 million from investors that include Lightspeed Venture Partners, Khosla Ventures, and Peak XV Partners. The training run tapped national compute resources via the IndiaAI Mission, a sign of policy alignment around domestic AI capacity and multilingual access.
Two open questions will shape adoption. First, how open is “open”? Sarvam plans to release the 30B and 105B, but has not specified whether training data or full training code will be public—details that matter for transparency, replication, and research. Second, how do these models fare on multilingual benchmarks like FLORES-200, IndicGLUE, or end-to-end speech metrics on Indic datasets? Enterprises will want rigorous, third-party evaluations and clear safety tooling.
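For a sense of what such third-party checks look like in practice, here is a minimal sketch of scoring translation output with sacrebleu's chrF++ metric, the variant commonly reported alongside FLORES-200 results. The sentences are illustrative placeholders, not real model output:

```python
import sacrebleu  # pip install sacrebleu

# Placeholder hypothesis from a model under test and one FLORES-style Hindi reference;
# a real evaluation would run over the full dev/devtest sets.
hypotheses = ["भारत एक विविध देश है।"]
references = [["भारत एक विविधतापूर्ण देश है।"]]  # one reference stream

# word_order=2 turns chrF into chrF++.
score = sacrebleu.corpus_chrf(hypotheses, references, word_order=2)
print(score)
```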
A Measured Path to Scale for Open AI Deployment
Leadership signaled a restrained approach to model size, emphasizing targeted capability over raw parameter counts. That stance echoes a broader industry pivot: bigger is useful, but better routing, smarter data, and careful domain adaptation often deliver more return per dollar. In a price-sensitive market, that pragmatism could be the difference between pilots and real deployments.
If Sarvam can convert its architectural bets and local language focus into tangible cost and quality wins, it strengthens the case that open source is not just a research curiosity but a viable backbone for enterprise AI in emerging and multilingual markets. The next few quarters—licensing terms, benchmark releases, and early customer case studies—will reveal whether the gamble pays off.
