Surpassing Scale by Efficiency: A Compact 135M Parameter Foundational LLM Natively Adapted for the Bangla Language
Summary
bangla-smollm-135m is a compact, 135-million-parameter, decoder-only foundational large language model designed for efficient Bangla language modeling. It targets the computational cost of deploying large models on edge, mobile, and decentralized hardware for low-resource, non-Latin scripts. To that end, the model uses a deterministic intersect-and-append token merging strategy, applied between TituLLMs and SmolLM2-135M, which mitigates subword script fragmentation without compromising early pretrained parameter states. In zero-shot multi-task benchmark evaluations (PIQA_bn, OpenBookQA_bn, CommonsenseQA_bn, and Bangla_MMLU), bangla-smollm-135m matches or exceeds models twice its size, such as Gemma-3-270m, and reaches parity with models in the 1B-parameter tier. The model is publicly available at rnnandi/bangla-smollm-135m.
Key takeaway
NLP engineers developing LLMs for low-resource languages should consider compact, specialized architectures like bangla-smollm-135m. The model shows that strategic token merging can bring a 135M-parameter model to performance parity with 1B-parameter models, enabling efficient deployment on computationally constrained edge or mobile systems. Evaluate similar efficiency-focused adaptation techniques to overcome subword fragmentation and reduce infrastructure costs for non-Latin-script applications.
Key insights
A compact 135M parameter LLM for Bangla achieves 1B-tier performance by efficiently merging tokens, overcoming low-resource script challenges.
Principles
- Efficiency can surpass scale for low-resource languages.
- Token merging can prevent script fragmentation.
- Preserve early pretrained states during adaptation.
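To make the fragmentation principle concrete, the sketch below measures tokens per word (often called fertility) for a short Bangla phrase under two toy tokenizers. The function names, the greedy longest-match segmentation, and the two-word vocabulary are illustrative assumptions and are not the tokenizers used by the model. The one hard fact relied on is that every Bangla code point occupies three bytes in UTF-8, so a byte-level tokenizer with no Bangla merges emits three tokens per character.

```python
def byte_fallback_tokens(word: str) -> list[str]:
    """Worst case: byte-level tokenization with no merges covering the script."""
    return [f"<0x{b:02X}>" for b in word.encode("utf-8")]


def greedy_tokens(word: str, vocab: set[str]) -> list[str]:
    """Greedy longest-match segmentation (a simple stand-in for a trained
    subword tokenizer); unknown characters fall back to raw bytes."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No vocabulary entry starts here: emit the character as bytes.
            tokens.extend(byte_fallback_tokens(word[i]))
            i += 1
    return tokens


def fertility(text: str, tokenize) -> float:
    """Average number of tokens per whitespace-separated word."""
    words = text.split()
    if not words:
        return 0.0
    return sum(len(tokenize(w)) for w in words) / len(words)


text = "বাংলা ভাষা"  # "Bangla language": 5 and 4 code points respectively
bangla_vocab = {"বাংলা", "ভাষা"}  # hypothetical merged-in tokens

print(fertility(text, byte_fallback_tokens))                       # 13.5
print(fertility(text, lambda w: greedy_tokens(w, bangla_vocab)))   # 1.0
```

Lower fertility means shorter sequences for the same text, which is where the efficiency gain for a small model comes from.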
Method
A deterministic intersect-and-append token merging strategy, applied between TituLLMs and SmolLM2-135M, adapts foundational models for non-Latin scripts while preserving early pretrained parameter states.
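The summary does not spell out the merge procedure, so the following is only one plausible reading of "intersect-and-append", written as a minimal sketch. It assumes that tokens present in both vocabularies keep their base-model IDs (so their pretrained embedding rows stay untouched) and that donor-only tokens are appended after the base vocabulary in a fixed order. The function name `intersect_and_append` and the toy vocabularies are hypothetical.

```python
def intersect_and_append(
    base_vocab: dict[str, int],
    donor_vocab: dict[str, int],
) -> tuple[dict[str, int], list[str]]:
    """Merge donor tokens into a base vocabulary without moving base IDs.

    Assumed semantics (not confirmed by the source):
      - intersect: tokens in both vocabularies keep their base ID
      - append: donor-only tokens receive fresh IDs after the base range
    """
    merged = dict(base_vocab)  # every base token keeps its original ID
    next_id = max(base_vocab.values()) + 1 if base_vocab else 0

    # Sort by (donor ID, token) so the result never depends on dict order.
    donor_only = sorted(
        (tok for tok in donor_vocab if tok not in base_vocab),
        key=lambda tok: (donor_vocab[tok], tok),
    )
    for tok in donor_only:
        merged[tok] = next_id
        next_id += 1
    return merged, donor_only


# Toy vocabularies standing in for the base and Bangla-aware tokenizers.
base = {"<s>": 0, "the": 1, "ing": 2}
donor = {"<s>": 0, "বাংলা": 1, "ing": 2, "ভাষা": 3}

merged, added = intersect_and_append(base, donor)
print(added)            # ['বাংলা', 'ভাষা']
print(merged["বাংলা"])   # 3
print(merged["ভাষা"])    # 4
```

Under this reading, only the appended IDs need new embedding rows; every row the base model already learned stays at its original index, which is what preserving pretrained parameter states would require at the vocabulary level.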
In practice
- Deploy 135M models on edge/mobile systems.
- Use token merging for non-Latin script adaptation.
- Evaluate with PIQA_bn, OpenBookQA_bn, CommonsenseQA_bn, Bangla_MMLU.
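For the evaluation step, a common way to run zero-shot multiple-choice benchmarks is to score each answer option by its log-likelihood under the model and pick the highest. The sketch below shows that pattern only; the summary does not say which scoring protocol the authors used. `loglik` is a hypothetical callable that a real setup would back with the model, and `toy_loglik` is a stand-in so the example runs without downloading anything.

```python
from typing import Callable, Sequence

# (context, continuation) -> log-likelihood of the continuation
LogLik = Callable[[str, str], float]


def predict(question: str, choices: Sequence[str], loglik: LogLik,
            normalize: bool = True) -> int:
    """Return the index of the highest-scoring choice.

    With normalize=True the score is divided by the choice length in
    characters, so longer answers are not penalized just for being long.
    """
    def score(choice: str) -> float:
        s = loglik(question, choice)
        return s / max(len(choice), 1) if normalize else s

    return max(range(len(choices)), key=lambda i: score(choices[i]))


def accuracy(examples: Sequence[tuple[str, Sequence[str], int]],
             loglik: LogLik) -> float:
    """Fraction of (question, choices, gold_index) examples answered correctly."""
    correct = sum(predict(q, c, loglik) == gold for q, c, gold in examples)
    return correct / len(examples)


def toy_loglik(context: str, continuation: str) -> float:
    """Stand-in scorer that always prefers the answer "yes"."""
    per_char = -0.1 if continuation == "yes" else -1.0
    return per_char * len(continuation)


examples = [
    ("q1", ["no", "yes"], 1),
    ("q2", ["yes", "maybe"], 0),
    ("q3", ["yes", "no"], 1),   # the toy scorer gets this one wrong
]
print(accuracy(examples, toy_loglik))
```

Swapping `toy_loglik` for a function that sums token log-probabilities from the actual model would turn this into a working harness for the Bangla benchmarks listed above.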
Topics
- Foundational LLMs
- Bangla Language Processing
- Low-Resource NLP
- Model Efficiency
- Token Merging Strategy
- Edge AI Deployment
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.