Weighting What Matters: Boosting Sample Efficiency in Medical Report Generation via Token Reweighting

2026-04-24 · Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Medical Devices & Health Technology · Depth: Advanced, medium

Summary

Researchers from the Technical University of Munich and Imperial College London evaluated a token reweighting method to improve sample efficiency when training vision-language models (VLMs) for medical report generation. The approach modifies the standard cross-entropy loss to prioritize semantically salient tokens with high clinical importance, such as the words that distinguish "multiple drusen" from "no drusen." Experiments focused on ophthalmological report generation, using a VLM comprising a pretrained image encoder, a Llama3-3B language model, and a projector layer. The study found that the reweighted loss consistently improved report quality across dataset scales, specifically for AMD staging and biomarker identification, and reached comparable performance with up to ten times less training data. The diagnostic keyword set, which includes terms like "healthy" and "fluid," proved particularly effective.

Key takeaway

For AI scientists developing medical VLMs in data-scarce environments, adding token reweighting to the loss function can substantially improve sample efficiency. The method makes it possible to reach high report quality, particularly for critical diagnostic tasks like AMD staging and biomarker identification, with significantly less annotated data. Consider curating specific diagnostic keyword sets to maximize the benefit: a reweighted model can potentially outperform a model trained without reweighting on three times more data.

Key insights

Token reweighting in VLM training significantly boosts sample efficiency for medical report generation.

Principles

Not all tokens in medical reports carry equal clinical relevance.
Loss functions should reflect the semantic importance of tokens.

Method

Identify clinically significant tokens using predefined keyword sets (e.g., quantitative, diagnostic). Replace the standard cross-entropy loss with a normalized weighted objective that upweights these tokens: $\mathcal{L}_{\text{tw}}(x,\lambda)=-\frac{1}{\Lambda}\sum\limits_{i=1}^{T}\lambda_{i}\log p_{\theta}(x_{i}\mid x_{<i},I_{\kappa}(x))$, where $\lambda_{i}$ is the weight assigned to token $x_{i}$, $T$ is the number of tokens in the report, and $\Lambda$ is the normalizing constant.
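To make the objective concrete, here is a minimal sketch in plain Python, not the authors' implementation. It assumes that $\Lambda$ is the sum of the token weights (the summary only says the objective is "normalized"), that the per-token probabilities are already available from the model, and that a weight of 5.0 for the salient token is purely illustrative. The function name `token_weighted_loss` is hypothetical.

```python
import math

def token_weighted_loss(token_probs, weights):
    """Normalized weighted negative log-likelihood for one report.

    token_probs[i]: model probability of reference token x_i given the
                    preceding tokens and the image.
    weights[i]:     lambda_i, the weight of token x_i.
    """
    if len(token_probs) != len(weights):
        raise ValueError("token_probs and weights must have equal length")
    total_weight = sum(weights)  # Lambda; ASSUMED to be the sum of the weights
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_nll = -sum(w * math.log(p) for w, p in zip(weights, token_probs))
    return weighted_nll / total_weight

# Toy report "no drusen present": the model is unsure about the token "no".
probs = [0.5, 0.9, 0.9]
uniform = token_weighted_loss(probs, [1.0, 1.0, 1.0])     # plain mean cross-entropy
reweighted = token_weighted_loss(probs, [5.0, 1.0, 1.0])  # "no" upweighted (illustrative value)
```

With all weights equal, the function reduces to the usual mean cross-entropy. Upweighting the uncertain token "no" raises the loss for this report, so an error on that token contributes more to the training signal than errors on the surrounding words.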

In practice

Use diagnostic keywords for highest impact.
Apply token reweighting to data-constrained medical domains.
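One way to turn a keyword set into per-token weights is sketched below. This is an illustration under stated assumptions, not the paper's procedure: it matches whole whitespace-separated words, whereas a real VLM operates on subword tokens, and the helper `keyword_weights`, the `boost` value of 5.0, and the `base` value of 1.0 are all hypothetical. The keyword set contains only the two diagnostic terms the summary names; the full set is not given here.

```python
def keyword_weights(tokens, keywords, boost=5.0, base=1.0):
    """Return one weight per token: `boost` for keyword tokens, `base` otherwise."""
    keyword_set = {k.lower() for k in keywords}
    return [
        boost if tok.lower().strip(".,;:") in keyword_set else base
        for tok in tokens
    ]

# Only the two diagnostic terms named in the summary; the full set is not listed.
DIAGNOSTIC_KEYWORDS = {"healthy", "fluid"}

tokens = "Subretinal fluid is present .".split()
weights = keyword_weights(tokens, DIAGNOSTIC_KEYWORDS)
# weights -> [1.0, 5.0, 1.0, 1.0, 1.0]
```

The resulting list can be used as the $\lambda_{i}$ values in the weighted objective. With subword tokenization, every piece of a matched keyword would need to receive the boosted weight.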

Topics

Vision-Language Models
Medical Report Generation
Sample Efficiency
Token Reweighting
Ophthalmology

Best for: AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.