ModernBERT is more efficient than conventional BERT for chest CT findings classification in Japanese radiology reports
Summary
A study published in Scientific Reports on April 3, 2026, compared three Japanese language models (BERT Base, JMedRoBERTa, and ModernBERT) for multi-label classification of 18 chest CT findings from radiology reports. The researchers fine-tuned all three models under identical conditions on the CT-RATE-JPN dataset. ModernBERT was the most efficient: its tokenizer produced significantly fewer tokens, and training and inference were faster, while in-domain performance remained comparable (74.7% exact match accuracy vs. 72.7% for BERT Base). However, on RR-Findings, an external, domain-shifted dataset, ModernBERT showed the largest decline in exact match accuracy, and BERT Base outperformed both JMedRoBERTa and ModernBERT. Even so, ModernBERT retained reasonable ranking ability, as indicated by smaller differences in average precision. The study highlights ModernBERT's computational advantages for in-domain tasks but underscores its sensitivity to linguistic variability in real-world clinical data.
Key takeaway
NLP engineers building solutions for Japanese radiology reports should consider ModernBERT for its more compact tokenization and its faster training and inference. If the application must handle diverse or domain-shifted real-world clinical data, prioritize extensive and varied training data or apply domain-specific calibration strategies to limit performance degradation across heterogeneous clinical environments.
Key insights
ModernBERT offers computational efficiency for Japanese medical text classification but requires diverse data for robustness.
Principles
- Efficiency does not always equate to generalizability.
- Domain shift significantly impacts model performance.
Method
Three Japanese language models (BERT Base, JMedRoBERTa, ModernBERT) were fine-tuned on the CT-RATE-JPN dataset for multi-label classification of 18 chest CT findings, then evaluated on an internal test set and an external, domain-shifted RR-Findings dataset.
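The two metrics named in the summary, exact match accuracy and average precision, can be sketched in plain Python. This is a minimal illustration rather than the study's evaluation code: the 0.5 decision threshold, the function names, and the three-label toy data (standing in for the 18 findings) are all assumptions.

```python
from typing import List, Sequence


def binarize(probs: Sequence[Sequence[float]], threshold: float = 0.5) -> List[List[int]]:
    """Turn per-label probabilities into 0/1 predictions.

    The 0.5 threshold is an assumption, not a value taken from the study.
    """
    return [[1 if p >= threshold else 0 for p in row] for row in probs]


def exact_match_accuracy(y_true: Sequence[Sequence[int]],
                         y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of reports whose entire label vector is predicted correctly."""
    if not y_true:
        raise ValueError("y_true must not be empty")
    matches = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return matches / len(y_true)


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision for a single label.

    Reports are ranked by descending score; precision is averaged over the
    ranks at which a true positive appears.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Toy data: 4 reports, 3 hypothetical findings (the study used 18).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 0]]
probs = [[0.9, 0.2, 0.7], [0.1, 0.8, 0.4], [0.6, 0.3, 0.1], [0.2, 0.1, 0.3]]

y_pred = binarize(probs)
exact_match = exact_match_accuracy(y_true, y_pred)

# One average precision value per label (column).
per_label_ap = [
    average_precision([row[j] for row in y_true], [row[j] for row in probs])
    for j in range(3)
]

print(exact_match)   # 0.75
print(per_label_ap)  # [1.0, 1.0, 1.0]
```

In this toy example the thresholded predictions miss one label in the third report, so exact match accuracy is 0.75, yet every label is ranked perfectly. That is the kind of gap the summary describes: exact match can fall while ranking quality, measured by average precision, holds up.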
In practice
- Use ModernBERT for in-domain Japanese medical NLP.
- Prioritize diverse training data for clinical deployment.
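One possible form of the domain-specific calibration mentioned in the takeaway is per-label threshold tuning on a small labeled sample from the target site. The sketch below is an assumption about what such calibration could look like, not a method described in the study; the function names, the candidate threshold grid, and the toy data are hypothetical.

```python
from typing import List, Sequence


def f1_score(labels: Sequence[int], preds: Sequence[int]) -> float:
    """F1 for one label; returns 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(labels, preds) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(labels, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(labels, preds) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def calibrate_thresholds(
    y_true: Sequence[Sequence[int]],
    probs: Sequence[Sequence[float]],
    candidates: Sequence[float] = (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9),
) -> List[float]:
    """Pick, for each label, the candidate threshold with the highest F1.

    y_true and probs are assumed to come from a labeled sample of the
    target (domain-shifted) data. Ties keep the lowest threshold.
    """
    n_labels = len(y_true[0])
    thresholds = []
    for j in range(n_labels):
        labels = [row[j] for row in y_true]
        scores = [row[j] for row in probs]
        best_t, best_f1 = 0.5, -1.0
        for t in candidates:
            preds = [1 if s >= t else 0 for s in scores]
            f1 = f1_score(labels, preds)
            if f1 > best_f1:
                best_t, best_f1 = t, f1
        thresholds.append(best_t)
    return thresholds


# Toy target-domain sample: 4 reports, 3 hypothetical findings.
target_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 0]]
target_probs = [[0.9, 0.2, 0.7], [0.1, 0.8, 0.4], [0.6, 0.3, 0.1], [0.2, 0.1, 0.3]]

thresholds = calibrate_thresholds(target_true, target_probs)
print(thresholds)  # [0.3, 0.3, 0.5]
```

Threshold tuning only helps when the model still ranks positives above negatives on the new data, so it complements rather than replaces more diverse training data. The sample used for tuning should be kept separate from any data used to report final performance.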
Topics
- ModernBERT
- Japanese Radiology Reports
- Chest CT Classification
- Natural Language Processing
- Transformer Models
Best for: NLP Engineer, AI Scientist, Research Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine learning : nature.com subject feeds.