RBCorr: Response Bias Correction in Language Models

2026-02-16 · Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Advanced, extended

Summary

RBCorr is a novel, low-cost response bias correction strategy for language models (LMs) that addresses option preference biases in fixed-response questions. The researchers tested RBCorr on 12 open-weight LMs from the Falcon3, Gemma3, and Llama3.1 families, across yes-no, entailment, and multiple-choice questions drawn from datasets such as ARITH, bAbI, SNLI, MultiNLI, and MMLU. The study found that LMs exhibit significant response bias, which RBCorr effectively eliminates while boosting or maintaining model performance, particularly for smaller LMs. The method mean-normalizes the log-probabilities (LogProbs) of the response options using a small, class-balanced calibration set (typically 100 questions). Compared with existing methods such as Contextual Calibration (CC) and Batch Calibration (BC), RBCorr often yields superior bias reduction and competitive accuracy gains, achieving up to 29% recovered accuracy on the bAbI dataset. However, the study also found that LogProbs-based correction terms are highly context-specific and do not reliably transfer across models, datasets, or prompt formats.

Key takeaway

For NLP engineers and AI scientists evaluating or deploying open-weight language models, implementing RBCorr can significantly improve the accuracy and fairness of fixed-response evaluations, especially for smaller models. Your team should integrate this calibration-based method to uncover latent model performance and ensure benchmarks reflect true capabilities rather than inherent response biases. Be aware that correction terms are not transferable: each model, dataset, and prompt configuration needs its own calibration.

Key insights

RBCorr effectively mitigates language model response bias and improves performance using a simple, low-cost LogProbs-based calibration method.

Principles

Response bias is prevalent in LMs and reduces performance.
Bias correction terms are highly context-specific and non-transferable.
Larger and instruction-tuned models inherently show less response bias.

Method

RBCorr mean-normalizes LogProbs: it estimates the mean LogProbs value of each response option on a small, class-balanced calibration set, then subtracts those per-option means from the corresponding LogProbs on the evaluation set.

In practice

Use RBCorr to improve smaller LM performance on closed-form tasks.
Apply RBCorr for fairer LM evaluation by removing label bias.
Calibrate bias correction terms specifically for each model, dataset, and prompt.
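
One way to act on the points above is to draw a class-balanced calibration sample and store the resulting correction terms under the exact model, dataset, and prompt they were estimated for. The sketch below is an assumption about how a team might organize this, not something taken from the paper or its repository: `sample_balanced`, `store_correction`, `lookup_correction`, and the placeholder identifiers such as `model-a` are all hypothetical.

```python
import random
from collections import defaultdict

def sample_balanced(examples, n_total, seed=0):
    """Draw n_total examples with an equal number per gold label.

    examples: list of (question, gold_label) pairs.
    """
    by_label = defaultdict(list)
    for question, label in examples:
        by_label[label].append((question, label))
    per_label, remainder = divmod(n_total, len(by_label))
    if remainder:
        raise ValueError("n_total must be divisible by the number of labels")
    rng = random.Random(seed)
    sample = []
    for label in sorted(by_label):
        pool = by_label[label]
        if len(pool) < per_label:
            raise ValueError(f"not enough examples for label {label!r}")
        sample.extend(rng.sample(pool, per_label))
    return sample

# Correction terms are keyed by (model, dataset, prompt) and never shared,
# because the source reports that they do not transfer reliably.
corrections = {}

def store_correction(model, dataset, prompt_id, terms):
    corrections[(model, dataset, prompt_id)] = terms

def lookup_correction(model, dataset, prompt_id):
    key = (model, dataset, prompt_id)
    if key not in corrections:
        raise KeyError(f"no calibration for {key}; run calibration first")
    return corrections[key]

# Hypothetical usage with an imbalanced pool of 30 "yes" and 10 "no" items.
pool = [(f"q{i}", "yes") for i in range(30)]
pool += [(f"q{i}", "no") for i in range(30, 40)]
calib_set = sample_balanced(pool, n_total=10)

store_correction("model-a", "dataset-x", "prompt-v1", [-0.4, -1.5])
terms = lookup_correction("model-a", "dataset-x", "prompt-v1")
```

Raising an error on a missing key, rather than falling back to terms from a similar configuration, is a deliberate design choice here: it makes accidental reuse across prompts or datasets fail loudly.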

Topics

Language Model Bias
Response Bias Correction
LogProbs Calibration
Model Evaluation
Bias Transferability

Code references

ombbhatt/rbcorr_bias_correction

Best for: NLP Engineer, AI Scientist, Research Scientist, AI Researcher, AI Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.