Benchmarking Instance-Dependent Label Noise with Controlled Corruptions

2026-06-12 · Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

CILN is a benchmark-generation framework that creates synthetic instance-dependent label noise (IDN) through controlled input corruptions. Unlike existing methods, which generate noise implicitly via imperfect annotators, CILN has a diverse voter pool label the corrupted instances, making both the source and the severity of ambiguity explicit and controllable. The framework was used to construct 90 benchmark settings across multiple corruption families and severity levels, using datasets such as CIFAR-10, MNIST, and Adult. Experiments showed that CILN's benchmarks exhibit genuine instance-dependent noise and diverse confusion structures. Notably, on CIFAR-10, CILN produced label distributions closer to human uncertainty than an existing synthetic IDN benchmark. These corruption-mediated IDN benchmarks also exposed failure modes in popular noisy-label learning methods, including Co-Teaching and DivideMix, that were not observed under comparable levels of rater-fallibility noise. This suggests that noise structure, not only noise rate, significantly influences benchmark difficulty and algorithm behavior.

Key takeaway

If you evaluate noisy-label learning methods, traditional synthetic benchmarks may not fully expose algorithm vulnerabilities. Consider adding CILN-generated benchmarks to your evaluation pipeline to test robustness against explicit, corruption-mediated instance-dependent noise. In the reported experiments, these benchmarks revealed failure modes in methods such as Co-Teaching and DivideMix that comparable rater-fallibility noise did not. The practical lesson is to test against diverse noise structures of the kind found in real-world data, not only against different noise rates.

Key insights

CILN generates explicit, controllable instance-dependent label noise via input corruptions, revealing new algorithm failure modes.

Principles

Noise structure impacts algorithm behavior.
Explicit ambiguity improves IDN benchmarks.
Controlled corruptions create genuine IDN.
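
One way to probe the first principle is to compare a method on IDN labels against a control with the same class-level confusion but no instance dependence. The sketch below is an illustration only, not the paper's rater-fallibility baseline: it estimates a class transition matrix from clean and noisy labels, then resamples labels from that matrix alone. The function names are hypothetical.

```python
import numpy as np

def transition_matrix(y_clean, y_noisy, n_classes):
    """Empirical P(noisy = j | clean = i), one row per clean class."""
    counts = np.zeros((n_classes, n_classes))
    np.add.at(counts, (y_clean, y_noisy), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Classes with no examples keep an all-zero row; they are never sampled from.
    return counts / np.where(row_sums == 0, 1, row_sums)

def class_conditional_control(y_clean, y_noisy, n_classes, seed=0):
    """Resample noisy labels using only the class-level transition matrix.

    The result has (in expectation) the same confusion structure and noise
    rate as y_noisy, but which instances get flipped no longer depends on
    the inputs. This is an assumed control, not the paper's protocol.
    """
    rng = np.random.default_rng(seed)
    T = transition_matrix(y_clean, y_noisy, n_classes)
    y_control = np.empty_like(y_clean)
    for c in range(n_classes):
        idx = np.flatnonzero(y_clean == c)
        if idx.size:
            y_control[idx] = rng.choice(n_classes, size=idx.size, p=T[c])
    return y_control, T
```

If a method degrades on the IDN labels but not on this control, the gap is attributable to which instances were mislabeled rather than to how many.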

Method

CILN generates instance-dependent label noise by corrupting inputs and then having a diverse voter pool label the corrupted instances, which makes both the source and the severity of ambiguity explicit.
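
The corrupt-then-vote idea can be sketched on toy data. Everything concrete here is an assumption made for illustration and not taken from the paper: the Gaussian-blob dataset, additive Gaussian noise as the single corruption family, nearest-centroid voters made diverse through bootstrap samples and random feature subsets, and sampling the noisy label from the vote distribution.

```python
import numpy as np

def make_toy_data(rng, n_per_class=200, n_classes=3, dim=8):
    """Gaussian blobs standing in for a clean dataset (hypothetical toy data)."""
    centers = rng.normal(scale=4.0, size=(n_classes, dim))
    X = np.concatenate([c + rng.normal(size=(n_per_class, dim)) for c in centers])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X, y

def corrupt(X, severity, rng):
    """One assumed corruption family: additive Gaussian noise scaled by severity."""
    return X + severity * rng.normal(size=X.shape)

def fit_voter_pool(X, y, n_voters, rng):
    """Nearest-centroid voters, each fit on a per-class bootstrap and a feature subset."""
    n_classes = int(y.max()) + 1
    dim = X.shape[1]
    pool = []
    for _ in range(n_voters):
        feats = rng.choice(dim, size=max(1, dim // 2), replace=False)
        centroids = []
        for c in range(n_classes):
            idx = np.flatnonzero(y == c)
            boot = rng.choice(idx, size=idx.size, replace=True)
            centroids.append(X[boot][:, feats].mean(axis=0))
        pool.append((feats, np.stack(centroids)))
    return pool

def vote_distribution(pool, X, n_classes):
    """Fraction of voters choosing each class, per instance."""
    votes = np.zeros((X.shape[0], n_classes))
    rows = np.arange(X.shape[0])
    for feats, centroids in pool:
        diff = X[:, feats][:, None, :] - centroids[None, :, :]
        votes[rows, (diff ** 2).sum(axis=2).argmin(axis=1)] += 1
    return votes / len(pool)

def corruption_mediated_labels(X, y, severity, n_voters=15, seed=0):
    """Voters are fit on clean data, then label the corrupted inputs."""
    rng = np.random.default_rng(seed)
    n_classes = int(y.max()) + 1
    pool = fit_voter_pool(X, y, n_voters, rng)
    X_cor = corrupt(X, severity, rng)
    soft = vote_distribution(pool, X_cor, n_classes)
    # Assumed labeling rule: draw one label per instance from its vote distribution.
    noisy = np.array([rng.choice(n_classes, p=p) for p in soft])
    return X_cor, soft, noisy
```

In this sketch the severity argument is the explicit control knob: raising it pushes more corrupted instances toward other classes' centroids, so the per-instance vote distributions spread out and the noise rate rises, while each instance's flip probability still depends on where its own corrupted features land.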

In practice

Evaluate noisy-label methods with CILN's IDN.
Explore diverse instance difficulty sources.
Compare IDN benchmarks to human uncertainty.
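
For the last item, one plausible comparison is an average divergence between a benchmark's per-instance label distributions and human soft labels for the same instances. The summary does not say which measure the paper uses, so the Jensen-Shannon divergence below is an assumed choice, and the function names are hypothetical.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Row-wise Jensen-Shannon divergence (natural log) between two
    arrays of label distributions with shape (n_instances, n_classes)."""
    p = np.clip(np.asarray(p, dtype=float), eps, None)
    q = np.clip(np.asarray(q, dtype=float), eps, None)
    p = p / p.sum(axis=1, keepdims=True)
    q = q / q.sum(axis=1, keepdims=True)
    m = 0.5 * (p + q)
    kl_pm = (p * np.log(p / m)).sum(axis=1)
    kl_qm = (q * np.log(q / m)).sum(axis=1)
    return 0.5 * kl_pm + 0.5 * kl_qm

def mean_distance_to_human(benchmark_soft, human_soft):
    """Average per-instance divergence; lower means closer to human uncertainty."""
    return float(js_divergence(benchmark_soft, human_soft).mean())
```

Computing this score for two benchmarks against the same human soft labels gives a simple ranking; the divergence is bounded by log 2, so scores are comparable across datasets with different numbers of classes.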

Topics

Instance-Dependent Label Noise
Benchmark Generation
Data Corruption
Noisy-Label Learning
Machine Learning Evaluation

Best for: Research Scientist, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.