Evaluating Small Language Models for English-to-Portuguese Translation: Impact of Model Scale and Quantization
Summary
A study benchmarked dozens of Small Language Models (SLMs) ranging from 135M to 20B parameters for English-to-Portuguese translation, evaluating their performance across various architectures and quantization schemes (FP16, Q8_0, Q4_K_M). Researchers used the FLORES-101 (Portuguese subset, 1,012 sentences) and OPUS-100 (~10k sentences) datasets, measuring translation quality with BLEU, chrF, and BERTScore. Statistical analysis, including Friedman tests and Wilcoxon signed-rank post-hoc comparisons, showed that 8-bit quantization (Q8_0) largely preserves semantic quality. 4-bit quantization (Q4_K_M) degraded quality to a statistically significant degree in about half of the configurations, but the effect sizes were negligible to small and mainly affected lower-capacity models. The research also found only a weak correlation between model scale and translation quality, with medium-sized models sometimes outperforming larger ones.
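To make the statistical step concrete, here is a minimal pure-Python sketch of the Friedman statistic applied to three quantization levels. The scores are made-up placeholders, not results from the study, and the function names are illustrative. The sketch omits the tie correction that library implementations typically apply, and the p-value helper is only valid for exactly three conditions (2 degrees of freedom).

```python
import math

def rank_row(values):
    """Return 1-based ranks for one row, averaging ranks across ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def friedman_statistic(rows):
    """Friedman chi-square (no tie correction).

    rows: one list per model, holding one score per condition.
    """
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, r in enumerate(rank_row(row)):
            rank_sums[j] += r
    return 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)

def p_value_df2(chi2):
    """Chi-square survival function for 2 degrees of freedom (k = 3 only)."""
    return math.exp(-chi2 / 2.0)

# Hypothetical chrF scores per model for (FP16, Q8_0, Q4_K_M); NOT from the study.
toy_scores = [
    [55.1, 55.0, 54.2],
    [48.3, 48.4, 47.1],
    [60.2, 60.1, 59.8],
    [41.0, 40.8, 39.5],
]
chi2 = friedman_statistic(toy_scores)
print(f"chi2={chi2:.3f}, approx p={p_value_df2(chi2):.4f}")
```

With only a handful of models the chi-square approximation is rough; a significant Friedman result would then be followed by pairwise Wilcoxon signed-rank tests, as the study describes.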
Key takeaway
AI engineers designing English-to-Portuguese translation pipelines should prioritize 8-bit quantization (Q8_0), which cuts computational and deployment costs without substantial loss of semantic quality. Do not assume that larger models translate better; evaluate medium-sized SLMs as well, since they can match or exceed their larger counterparts depending on architecture and pretraining.
Key insights
8-bit quantization maintains translation quality in SLMs, while model scale weakly correlates with performance.
Principles
- 8-bit quantization (Q8_0) preserves semantic quality.
- Model scale weakly correlates with translation quality.
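One way to check the scale-versus-quality relationship on your own candidates is a rank correlation. The sketch below computes Spearman's rho in pure Python; Spearman is an assumption here, since the summary does not say which correlation measure the study used, and the example numbers are hypothetical.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, averaging ranks across ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)

# Hypothetical parameter counts (billions) and quality scores; NOT from the study.
params_b = [0.135, 1.0, 3.0, 8.0, 20.0]
quality = [30.0, 52.0, 58.0, 55.0, 57.0]
print(f"rho={spearman(params_b, quality):.3f}")
```

A rho close to 0 on real measurements would support the principle above; a value near 1 would suggest that, for the models tested, size does track quality.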
Method
SLMs (135M-20B params) were benchmarked for English-to-Portuguese translation using FP16, Q8_0, and Q4_K_M quantization on FLORES-101 and OPUS-100 datasets, evaluating with BLEU, chrF, and BERTScore.
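Of the three metrics, chrF is the simplest to illustrate. The following is a simplified sentence-level sketch of a character n-gram F-score, assuming n-gram orders 1 to 6, beta = 2, and whitespace removed before counting. It is not the study's evaluation code and will not reproduce the exact numbers of a reference implementation such as sacreBLEU.

```python
from collections import Counter

def char_ngrams(text, n):
    """Count character n-grams after removing whitespace (a simplification)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))

def chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF: F-beta over averaged n-gram precision and recall."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp and not ref:
            continue  # both strings are too short for this order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()) if hyp else 0.0)
        recalls.append(overlap / sum(ref.values()) if ref else 0.0)
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p == 0.0 or r == 0.0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)

# Illustrative sentences only.
print(chrf("o gato preto dorme", "o gato branco dorme"))
```

Because beta = 2 weights recall more heavily than precision, a hypothesis that omits reference content is penalized more than one that adds extra characters.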
In practice
- Prioritize 8-bit quantization for efficiency.
- Consider medium-sized SLMs over larger ones.
- Compare model families and pretraining, not just parameter count, when selecting for quality.
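To put the efficiency argument in numbers, the sketch below estimates weight storage per quantization scheme. It assumes Q8_0 and Q4_K_M refer to the GGUF formats used by llama.cpp, which the names suggest but the summary does not state. Q8_0 stores 32 weights in 34 bytes (8.5 bits per weight); the Q4_K_M figure is an approximate average for a mixed scheme and varies by model. Activations, KV cache, and runtime overhead are ignored.

```python
# Approximate bits per weight. The Q4_K_M value is a rough assumption.
BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q8_0": 8.5,     # 32 int8 weights + one fp16 scale = 34 bytes per block
    "Q4_K_M": 4.85,  # mixed-precision scheme; average differs between models
}

def weight_gib(num_params, bits_per_weight):
    """Estimated size of the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2 ** 30

def footprint_table(num_params):
    """Return {scheme: GiB} for every scheme in BITS_PER_WEIGHT."""
    return {name: weight_gib(num_params, bpw) for name, bpw in BITS_PER_WEIGHT.items()}

# Example: a hypothetical 8B-parameter model.
for scheme, gib in footprint_table(8_000_000_000).items():
    print(f"{scheme:7s} ~{gib:.1f} GiB")
```

Under these assumptions, Q8_0 needs roughly half the weight memory of FP16, which is the saving the first recommendation above relies on.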
Topics
- Small Language Models
- English-to-Portuguese Translation
- Model Quantization
- Machine Translation Evaluation
- FLORES-101 Dataset
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.