CAMO: A Class-Aware Minority-Optimized Ensemble for Robust Language Model Evaluation on Imbalanced Data

2026-04-10 · Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, extended

Summary

CAMO (Class-Aware Minority-Optimized) is a novel ensemble technique designed to address class imbalance in real-world classification tasks, which typically causes traditional ensembles to favor majority classes and degrade minority-class performance. CAMO employs a hierarchical decision procedure that dynamically boosts underrepresented classes by integrating vote distributions, confidence calibration, and inter-model uncertainty. The method was validated on two highly imbalanced, domain-specific benchmarks: the DIAR-AI/Emotion dataset and the ternary BEA 2025 dataset. The researchers benchmarked CAMO against seven established ensemble algorithms using eight language models (three LLMs and five SLMs) under both zero-shot and fine-tuned settings. With fine-tuned models, CAMO consistently achieved the highest strict macro F1-score, which the authors present as evidence that it is a reliable, domain-neutral framework for imbalanced classification.
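
To illustrate why the summary reports macro F1 rather than accuracy, here is a minimal sketch of the ordinary unweighted macro F1 on a toy imbalanced example. The "strict" variant mentioned above is not defined in this summary, so the code computes plain macro F1 only; the function name `macro_f1` and the toy labels are assumptions made for illustration.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (plain macro F1, not the paper's 'strict' variant)."""
    # Assumption: score every label seen in either the gold or predicted labels.
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class that is never predicted correctly contributes an F1 of 0.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Toy data: nine majority examples, one minority example,
# and a classifier that always predicts the majority class.
y_true = ["majority"] * 9 + ["minority"]
y_pred = ["majority"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
```

Here accuracy is 0.9 while macro F1 is 9/19 (about 0.47), because the missed minority class counts as much as the majority class.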

Key takeaway

For AI engineers and research scientists building classification systems on imbalanced datasets, CAMO offers a way to improve performance on critical minority classes. Consider integrating it into model evaluation and deployment pipelines, especially after fine-tuning language models, the setting in which it achieved the highest macro F1-scores. This can support fairer, more transparent outcomes in sensitive applications such as fraud detection or rare disease diagnosis.

Key insights

CAMO is a hierarchical ensemble method that dynamically boosts minority class predictions using confidence and uncertainty.

Principles

Minority predictions are signals, not noise.
Ensemble choice depends on model properties.
Fairness requires reducing bias against minority classes.

Method

CAMO uses a seven-stage hierarchical decision process, including unanimity checks, strong minority consensus, isolated high-confidence minority votes, and uncertainty-triggered prioritization, combined with dynamic minority boosting.
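
The cascade idea can be sketched as follows. This is not the authors' algorithm: the summary names only four of the seven stages and gives no thresholds, so the stage order, every threshold (`strong_frac`, `high_conf`, `max_disagreement`), the `boost` factor, the fallback rule, and the function name `cascade_vote` are all assumptions made for illustration.

```python
from collections import Counter


def cascade_vote(preds, confs, minority, strong_frac=0.5, high_conf=0.9,
                 max_disagreement=0.5, boost=1.5):
    """Illustrative minority-aware cascade (hypothetical thresholds throughout).

    preds: one predicted label per model
    confs: one confidence in [0, 1] per model
    minority: set of labels treated as minority classes
    """
    n = len(preds)
    counts = Counter(preds)

    # Check 1: unanimity. If every model agrees, accept the shared label.
    if len(counts) == 1:
        return preds[0]

    # Check 2: strong minority consensus. A minority label backed by at
    # least `strong_frac` of the models wins outright.
    minority_votes = [(label, k) for label, k in counts.items() if label in minority]
    if minority_votes:
        label, k = max(minority_votes, key=lambda item: item[1])
        if k / n >= strong_frac:
            return label

    # Check 3: isolated high-confidence minority vote. Even a lone model can
    # win if it predicts a minority label with confidence >= `high_conf`.
    confident = [(c, p) for p, c in zip(preds, confs) if p in minority and c >= high_conf]
    if confident:
        return max(confident, key=lambda item: item[0])[1]

    # Check 4: uncertainty-triggered prioritization. When the models disagree
    # strongly, prefer the minority label with the most total confidence.
    disagreement = 1 - max(counts.values()) / n
    if disagreement > max_disagreement and minority_votes:
        totals = Counter()
        for p, c in zip(preds, confs):
            if p in minority:
                totals[p] += c
        return max(totals, key=totals.get)

    # Fallback: confidence-weighted vote with a multiplicative minority boost.
    scores = Counter()
    for p, c in zip(preds, confs):
        scores[p] += c * (boost if p in minority else 1.0)
    return max(scores, key=scores.get)
```

For example, with four confident-enough "joy" votes and one "surprise" vote at confidence 0.95, the sketch returns "surprise" when "surprise" is listed as a minority class; with the same votes but a 0.5 confidence on "surprise", it falls through to the weighted vote and returns "joy".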

In practice

Apply CAMO as a post-hoc pipeline component that operates on the outputs of already-trained models.
Use 4-bit quantization for memory-efficient fine-tuning.
Employ multi-seed training to lower variance and estimate uncertainty.
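
As a rough illustration of the multi-seed point above, the sketch below averages class probabilities from several training seeds and derives two simple uncertainty signals: the normalized entropy of the averaged distribution and the fraction of seeds that disagree with the most common prediction. The summary does not say which uncertainty measure CAMO uses, so both measures and the function name `seed_uncertainty` are assumptions.

```python
import math


def seed_uncertainty(seed_probs):
    """Summarize agreement across seeds for a single example.

    seed_probs: list of probability vectors, one per training seed,
                each with at least two classes and summing to 1.
    Returns (mean_probs, normalized_entropy, disagreement).
    """
    n_seeds = len(seed_probs)
    n_classes = len(seed_probs[0])

    # Averaging over seeds reduces the variance of any single run.
    mean_probs = [sum(p[c] for p in seed_probs) / n_seeds for c in range(n_classes)]

    # Entropy of the averaged distribution, scaled to [0, 1].
    entropy = -sum(q * math.log(q) for q in mean_probs if q > 0)
    normalized_entropy = entropy / math.log(n_classes)

    # Share of seeds whose top class differs from the most common top class.
    votes = [max(range(n_classes), key=p.__getitem__) for p in seed_probs]
    top_count = max(votes.count(v) for v in set(votes))
    disagreement = 1 - top_count / n_seeds

    return mean_probs, normalized_entropy, disagreement


# Three seeds that agree on class 0 with varying confidence.
mean_probs, norm_entropy, disagreement = seed_uncertainty(
    [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]
)
```

A downstream ensemble rule could use either signal as its uncertainty input; which one is appropriate depends on how well calibrated the individual models are.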

Topics

CAMO Ensemble
Class Imbalance
Minority Class Optimization
Language Model Evaluation
Hierarchical Decision Process

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.