Human Label Variation as Stable Signal: Learning Annotator-Specific Explanation Behavior via Cross-Annotator Preference Optimization
Summary
The study "Human Label Variation as Stable Signal" investigates whether large language models (LLMs) can learn and reproduce annotator-specific reasoning and preferences from free-text explanations. The authors analyze human label variation (HLV) on two sentence-pair tasks, natural language inference and paraphrase judgment, each annotated by four annotators. Individual annotator patterns are weak at the level of a single annotation because input content dominates, but they become discernible once input-content effects are reduced and annotations are aggregated. The work compares prompting and supervised fine-tuning (SFT) baselines, then introduces Cross-Annotator Preference Optimization (CAPO). In the experiments, prompting is unstable, SFT improves behavior capture, and CAPO further improves aggregation-aware imitation and judge-based attribution while preserving target-specific reasoning patterns. The results suggest that HLV can serve as a stable signal for scalable, explanation-based annotation.
Key takeaway
For machine learning engineers building explanation-based annotation systems, this research indicates that modeling annotator-specific reasoning patterns can make an LLM's output more faithful to individual annotators. Rather than optimizing for label agreement alone, consider training on individual annotator histories, with SFT as a baseline and a method such as Cross-Annotator Preference Optimization (CAPO) on top. This offers a path to more scalable and nuanced data labeling, with the potential to reduce annotation costs and improve explanation quality.
Key insights
Large language models can learn and reproduce individual annotator explanation styles from human label variation.
Principles
- HLV reveals annotator reasoning.
- Annotator patterns stabilize with aggregation.
- Preference optimization enhances imitation.
Method
Cross-Annotator Preference Optimization (CAPO) contrasts a target annotator's response with other valid but less target-specific annotations for the same input to learn individual explanation behavior.
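The pairing step described above can be sketched in Python. This is a minimal illustration under assumptions, not the authors' implementation: the `Annotation` record, the `build_preference_pairs` helper, and the toy data are all hypothetical, and this summary does not say how pairs are filtered or which loss is applied to them. The sketch only shows the contrast itself: for each input, the target annotator's response is treated as preferred over every other annotator's response to that same input.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record: one annotator's label and explanation for one input."""
    item_id: str
    annotator: str
    label: str
    explanation: str


def build_preference_pairs(annotations, target):
    """Return (item_id, chosen, rejected) triples for one target annotator.

    chosen   = the target annotator's annotation for an input
    rejected = another annotator's annotation for the same input
    Inputs the target did not annotate, or that nobody else annotated,
    contribute no pairs.
    """
    by_item = defaultdict(list)
    for ann in annotations:
        by_item[ann.item_id].append(ann)

    pairs = []
    for item_id, group in by_item.items():
        chosen = [a for a in group if a.annotator == target]
        rejected = [a for a in group if a.annotator != target]
        for c in chosen:
            for r in rejected:
                pairs.append((item_id, c, r))
    return pairs


# Made-up toy data, for illustration only.
annotations = [
    Annotation("p1", "A", "entailment", "The second sentence restates the first."),
    Annotation("p1", "B", "neutral", "The second sentence adds an unstated detail."),
    Annotation("p1", "C", "entailment", "Both sentences describe the same event."),
    Annotation("p1", "D", "neutral", "The timing is not specified."),
    Annotation("p2", "A", "contradiction", "The quantities disagree."),
    Annotation("p2", "B", "contradiction", "One sentence negates the other."),
    Annotation("p3", "B", "neutral", "Not enough context."),
]

pairs = build_preference_pairs(annotations, target="A")
for item_id, chosen, rejected in pairs:
    print(item_id, "prefer", chosen.annotator, "over", rejected.annotator)
```

Triples in this shape could then be fed to a preference-optimization trainer. Note that a rejected response may share the target's label (as in `p2` here), which matches the idea that the alternatives are valid but less target-specific rather than wrong.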
In practice
- Use SFT for initial behavior capture.
- Apply CAPO for refined imitation.
- Ground annotations in annotator histories.
Topics
- Human Label Variation
- Large Language Models
- Cross-Annotator Preference Optimization
- Supervised Fine-Tuning
- Natural Language Inference
- Explanation-based Annotation
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.