AdaJudge: Adaptive Multi-Perspective Judging for Reward Modeling
Summary
AdaJudge is a reward-modeling framework for aligning large language models with human preferences. It targets two limitations of current reward-model architectures: static pooling strategies that do not match task-dependent preference signals, and representations inherited from backbones that were optimized for generation rather than discrimination. AdaJudge addresses both in two stages. First, gated refinement blocks map backbone representations into a discrimination-oriented space. Second, an adaptive multi-view pooling module replaces the static readout, dynamically routing and combining evidence from last-token, mean, and attention pooling experts. In experiments on RM-Bench and JudgeBench, AdaJudge with Qwen3-8B scores 71.1 on RM-Bench and 66.0 on JudgeBench, outperforming Skywork-Reward-Llama-3.1-8B by 1.0 and 3.7 points, respectively. It also stabilizes smaller models such as Phi-3.5-mini-instruct, whose RM-Bench score recovers from 34.4 to 61.5.
Key takeaway
If you are an ML engineer or AI scientist developing reward models, relying on static pooling and generation-optimized backbones can limit accuracy and stability, especially on diverse or complex evaluation tasks. Consider adaptive frameworks like AdaJudge, which jointly refine representations and dynamically aggregate evidence. The reported results suggest this approach can substantially improve accuracy and prevent representation collapse in smaller models, making preference learning more robust and better aligned with the task.
Key insights
Adaptive representation refinement and multi-perspective aggregation are crucial for robust reward modeling across diverse LLM evaluation tasks.
Principles
- Fixed pooling strategies create an inductive bias mismatch across heterogeneous evaluation tasks.
- Generative LLM backbones are suboptimal for fine-grained preference discrimination.
- Dynamic routing of pooling experts improves performance on complex, reasoning-intensive tasks.
Method
AdaJudge refines backbone hidden states via K depth-gated attention blocks, then dynamically aggregates evidence using a prompt-conditioned gating network that combines last-token, mean, and attention pooling experts.
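The aggregation step can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the linear gate, the single attention query vector, and the names `adaptive_pool`, `gate_weights`, and `prompt_vec` are assumptions made for illustration, and the refinement blocks are omitted entirely.

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def last_token_pool(hidden):
    # hidden: list of token vectors; return the final token's vector.
    return list(hidden[-1])

def mean_pool(hidden):
    # Average each dimension over all tokens.
    n = len(hidden)
    return [sum(col) / n for col in zip(*hidden)]

def attention_pool(hidden, query):
    # Dot-product scores against a (hypothetical) learned query vector.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden]
    weights = softmax(scores)
    dim = len(hidden[0])
    return [sum(w * h[d] for w, h in zip(weights, hidden)) for d in range(dim)]

def adaptive_pool(hidden, prompt_vec, gate_weights, query):
    # Assumed gate: one linear score per expert computed from a prompt vector,
    # followed by a softmax over the three experts.
    logits = [sum(w * p for w, p in zip(row, prompt_vec)) for row in gate_weights]
    gate = softmax(logits)
    views = [
        last_token_pool(hidden),
        mean_pool(hidden),
        attention_pool(hidden, query),
    ]
    dim = len(views[0])
    pooled = [sum(g * v[d] for g, v in zip(gate, views)) for d in range(dim)]
    return pooled, gate

# Toy usage: three tokens with two-dimensional hidden states.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, gate = adaptive_pool(
    hidden,
    prompt_vec=[1.0, 0.0],
    gate_weights=[[5.0, 0.0], [0.0, 0.0], [0.0, 0.0]],
    query=[0.0, 0.0],
)
```

In this toy call the gate puts most of its weight on the last-token expert. A different prompt vector would shift the mixture, which is the behavior a fixed readout cannot provide.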
In practice
- Implement depth-gated refinement blocks to transform backbone representations for discrimination.
- Utilize a mixture-of-pooling experts with dynamic, prompt-conditioned routing for aggregation.
- Train with Focal Bradley–Terry loss and entropy regularization for stable convergence.
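The training objective in the last bullet can be sketched as follows. The summary does not give the exact formula, so this assumes the standard focal weighting applied to the Bradley–Terry preference probability, plus an entropy bonus on the pooling gate. The names `gamma` and `lam`, their default values, and the sign of the entropy term are assumptions, not values from the paper.

```python
import math

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def focal_bt_loss(r_chosen, r_rejected, gamma=2.0):
    # Bradley-Terry probability that the chosen response is preferred.
    p = sigmoid(r_chosen - r_rejected)
    p = min(max(p, 1e-12), 1.0)  # guard against log(0)
    # Focal factor (1 - p)^gamma down-weights pairs that are already easy.
    return -((1.0 - p) ** gamma) * math.log(p)

def gate_entropy(gate):
    # Shannon entropy of the routing distribution over pooling experts.
    return -sum(g * math.log(g) for g in gate if g > 0.0)

def total_loss(r_chosen, r_rejected, gate, gamma=2.0, lam=0.01):
    # Assumed form: subtracting the entropy rewards spreading weight
    # across experts rather than collapsing onto a single one.
    return focal_bt_loss(r_chosen, r_rejected, gamma) - lam * gate_entropy(gate)

# Toy usage: a correctly ranked pair and a near-uniform gate.
loss = total_loss(r_chosen=1.5, r_rejected=0.2, gate=[0.4, 0.3, 0.3])
```

With `gamma=0` the focal term reduces to the plain Bradley–Terry loss, which makes it easy to compare the two objectives on the same preference pairs.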
Topics
- Reward Modeling
- LLM Alignment
- Adaptive Pooling
- Representation Learning
- Multi-Perspective Aggregation
- Transformer Architectures
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.