Surpassing Scale by Efficiency: A Compact 135M Parameter Foundational LLM Natively Adapted for the Bangla Language

2026-06-15 · Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Expert, quick

Summary

bangla-smollm-135m is a compact, 135-million-parameter, decoder-only foundational large language model designed for efficient language modeling in the Bangla script. It targets the computational cost of deploying large models on edge, mobile, and decentralized hardware for low-resource, non-Latin scripts. The model uses a deterministic intersect-and-append token merging strategy, applied between TituLLMs and SmolLM2-135M, that mitigates subword script fragmentation without compromising early pretrained parameter states. In zero-shot multi-task benchmark evaluations, including PIQA_bn, OpenBookQA_bn, CommonsenseQA_bn, and Bangla_MMLU, bangla-smollm-135m performs comparably to or better than models twice its size, such as Gemma-3-270m, and reaches parity with models in the 1B-parameter tier. The model is publicly available at rnnandi/bangla-smollm-135m.

Key takeaway

NLP engineers developing LLMs for low-resource languages should consider compact, specialized architectures like bangla-smollm-135m. The model shows that strategic token merging can bring a 135M-parameter model to parity with 1B-parameter models, enabling efficient deployment on computationally constrained edge or mobile systems. Evaluate similar efficiency-focused adaptation techniques to overcome subword fragmentation and reduce infrastructure costs for non-Latin-script applications.
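
To put the edge-deployment argument in rough numbers, the sketch below estimates weight storage alone at a few common precisions. The parameter counts are the nominal figures implied by the model names and tiers (135M, 270M, 1B). The function name and the precision table are illustrative assumptions, and activations, KV cache, and runtime overhead are deliberately left out, so the results are lower bounds rather than measured footprints.

```python
# Bytes needed to store one parameter at each precision (standard widths).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_footprint_mb(num_params: int, dtype: str) -> float:
    """Return weight storage in megabytes (10**6 bytes), weights only."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


# Nominal parameter counts; real checkpoints may differ slightly.
models = {
    "135M tier": 135_000_000,
    "270M tier": 270_000_000,
    "1B tier": 1_000_000_000,
}

for name, params in models.items():
    row = ", ".join(
        f"{dtype}: {weight_footprint_mb(params, dtype):.1f} MB"
        for dtype in BYTES_PER_PARAM
    )
    print(f"{name:10s} {row}")
```

At fp16, the 135M tier needs about 270 MB for weights, against roughly 2,000 MB for a 1B model, which is the gap that matters on memory-constrained devices.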

Key insights

A compact 135M parameter LLM for Bangla achieves 1B-tier performance by efficiently merging tokens, overcoming low-resource script challenges.

Principles

Efficiency can surpass scale for low-resource languages.
Token merging can mitigate subword script fragmentation.
Preserve early pretrained states during adaptation.
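
Script fragmentation can be made concrete with a fertility measure: the average number of tokens a tokenizer spends per word. The sketch below is illustrative only. It compares a worst-case byte-level fallback with a toy vocabulary that contains whole Bangla words; neither stands in for the actual SmolLM2 or TituLLMs tokenizers, and all function names are hypothetical.

```python
def byte_fallback_tokens(word: str) -> list[bytes]:
    """Worst case: every UTF-8 byte of the word becomes its own token."""
    return [bytes([b]) for b in word.encode("utf-8")]


def greedy_tokens(word: str, vocab: set[str]) -> list[str]:
    """Greedy longest-match segmentation.

    Characters not covered by any vocabulary entry fall back to
    single-character tokens.
    """
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab or j == i + 1:
                tokens.append(word[i:j])
                i = j
                break
    return tokens


def fertility(tokenize, words: list[str]) -> float:
    """Average number of tokens per word."""
    return sum(len(tokenize(w)) for w in words) / len(words)


words = ["বাংলা", "ভাষা"]          # "Bangla", "language"
toy_vocab = {"বাংলা", "ভাষা"}      # assumed: vocabulary with whole-word entries

print("byte fallback:", fertility(byte_fallback_tokens, words))
print("toy vocab:    ", fertility(lambda w: greedy_tokens(w, toy_vocab), words))
```

Each Bangla code point occupies three bytes in UTF-8, so a tokenizer with no Bangla entries can spend three tokens per character, while a vocabulary that covers the script spends far fewer.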

Method

A deterministic intersect-and-append token merging strategy, applied between TituLLMs and SmolLM2-135M, adapts foundational models for non-Latin scripts while preserving early pretrained parameter states.
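
One plausible reading of intersect-and-append is sketched below; the summary does not spell out the procedure, so treat this as an assumption rather than the authors' implementation. Under this reading, tokens shared by both vocabularies (the intersection) keep their base IDs, and donor-only tokens are appended after the last base ID in a fixed order. Because no existing ID moves, the pretrained embedding rows stay aligned with their tokens. The function name and the toy vocabularies are hypothetical.

```python
def intersect_and_append(
    base_vocab: dict[str, int], donor_vocab: dict[str, int]
) -> dict[str, int]:
    """Merge donor tokens into a base vocabulary without moving base IDs.

    Assumed procedure: shared tokens keep the base ID; donor-only tokens
    are appended in donor-ID order (token string as tie-breaker), which
    makes the result deterministic.
    """
    merged = dict(base_vocab)
    next_id = max(base_vocab.values(), default=-1) + 1
    for token, _donor_id in sorted(donor_vocab.items(), key=lambda kv: (kv[1], kv[0])):
        if token in merged:
            continue  # intersection: keep the base ID untouched
        merged[token] = next_id  # append: new ID past the base range
        next_id += 1
    return merged


# Toy vocabularies, not the real SmolLM2-135M or TituLLMs vocabularies.
base_vocab = {"<s>": 0, "the": 1, "ing": 2, "া": 3}
donor_vocab = {"<s>": 0, "বাংলা": 1, "া": 2, "ভাষা": 3}

merged = intersect_and_append(base_vocab, donor_vocab)

# Every base token keeps its original ID, so existing embedding rows stay valid.
assert all(merged[tok] == idx for tok, idx in base_vocab.items())
print(merged)
```

In a real adaptation, the embedding matrix would then be extended with one new row per appended token; how those rows are initialized is not stated in this summary.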

In practice

Deploy 135M models on edge/mobile systems.
Use token merging for non-Latin script adaptation.
Evaluate with PIQA_bn, OpenBookQA_bn, CommonsenseQA_bn, Bangla_MMLU.
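
Zero-shot multiple-choice benchmarks such as those listed above are commonly scored by asking the model for the log-likelihood of each answer option and picking the highest. The sketch below shows that pattern under stated assumptions: the summary does not say which scoring rule the authors used, the length normalization is one common choice among several, and toy_loglikelihood is a hypothetical stand-in for a real model call.

```python
from typing import Callable, Sequence

Scorer = Callable[[str, str], float]  # (context, continuation) -> log-likelihood


def pick_answer(question: str, options: Sequence[str], loglikelihood: Scorer) -> int:
    """Return the index of the option with the best length-normalized score."""
    scores = [loglikelihood(question, opt) / max(len(opt), 1) for opt in options]
    return max(range(len(options)), key=scores.__getitem__)


def accuracy(examples, loglikelihood: Scorer) -> float:
    """Fraction of examples where the top-scoring option is the gold answer."""
    correct = sum(
        pick_answer(question, options, loglikelihood) == gold
        for question, options, gold in examples
    )
    return correct / len(examples)


def toy_loglikelihood(context: str, continuation: str) -> float:
    """Hypothetical scorer: rewards word overlap, penalizes length.

    Replace with the summed token log-probabilities from an actual model.
    """
    overlap = len(set(context.split()) & set(continuation.split()))
    return float(overlap) - len(continuation)


# Toy examples in (question, options, gold_index) form, not benchmark data.
examples = [
    ("ice melts when it is heated", ["it is frozen", "it is heated"], 1),
    ("the sun rises in the east", ["in the east", "in the west"], 0),
]

print("accuracy:", accuracy(examples, toy_loglikelihood))
```

Swapping toy_loglikelihood for a function that queries a real model leaves the rest of the loop unchanged, which makes it easy to compare models of different sizes on the same examples.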

Topics

Foundational LLMs
Bangla Language Processing
Low-Resource NLP
Model Efficiency
Token Merging Strategy
Edge AI Deployment

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.