Learning under noisy supervision is governed by a feedback-truth gap
Summary
A study by Schonfeld and Wisnia introduces the "feedback–truth gap," a quantitative framework for why learners follow immediate feedback rather than objective truth when feedback is integrated faster than task structure can be evaluated. The gap is mathematically inevitable whenever feedback integration and truth evaluation run on different timescales, and it vanishes only when the two rates match. The researchers tested this prediction in three systems: neural networks trained with noisy labels (30 datasets, 2,700 runs), human probabilistic reversal learning (N=292), and human reward/punishment learning with concurrent EEG (N=25). The gap appeared in all three but was regulated differently: dense networks accumulated it as memorization, sparse-residual architectures suppressed it, and humans showed transient over-commitment followed by active recovery. Neural over-commitment (approximately 0.04–0.10) was amplified tenfold into behavioral commitment (d=3.3–3.9), which indicates that the gap's consequences depend on the regulatory mechanism each system uses.
Key takeaway
For AI researchers building robust learning algorithms, the feedback–truth gap is a useful lens. When feedback is integrated faster than task structure can be evaluated, a model will follow noisy feedback over the true structure. Architectural constraints such as sparse-residual bottlenecks can regulate the gap and limit memorization, particularly in high-noise settings, and in the study these architectures suppressed the gap that dense networks accumulated.
Key insights
When feedback is integrated faster than truth can be evaluated, a "feedback–truth gap" inevitably arises in learning systems.
Principles
- The feedback–truth gap is universal across diverse learning systems.
- Gap magnitude scales with noise rate and timescale ratio.
- Regulation determines whether the gap causes damage or facilitates adaptation.
Method
The authors propose a two-timescale model in which fast feedback integration and slower truth evaluation create a gap. They test it on neural networks, human reversal learning, and EEG-based reward/punishment tasks, using metrics such as AUG_pos and T*.
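The two-timescale idea can be illustrated with a toy sketch. This is not the authors' model: it assumes two leaky integrators (exponential moving averages) that track the same input at different rates, and it defines the gap as their difference. The function name `two_timescale_gap` and its parameters are hypothetical, chosen only for illustration.

```python
def two_timescale_gap(signal, fast_rate, slow_rate, init=0.0):
    """Track one input with two leaky integrators and return their difference.

    Toy assumption (not taken from the paper): `fast_rate` stands in for
    feedback integration and `slow_rate` for truth evaluation.
    """
    if not (0.0 < fast_rate <= 1.0 and 0.0 < slow_rate <= 1.0):
        raise ValueError("rates must lie in (0, 1]")
    fast = slow = init
    gaps = []
    for x in signal:
        fast += fast_rate * (x - fast)  # fast estimate moves quickly toward x
        slow += slow_rate * (x - slow)  # slow estimate lags behind
        gaps.append(fast - slow)        # gap between the two estimates
    return gaps


if __name__ == "__main__":
    step = [1.0] * 50  # constant input switched on at the first step
    mismatched = two_timescale_gap(step, fast_rate=0.5, slow_rate=0.1)
    matched = two_timescale_gap(step, fast_rate=0.3, slow_rate=0.3)
    print("peak gap, mismatched rates:", max(mismatched))
    print("peak gap, matched rates:", max(matched))
```

For a constant unit input starting from zero, the gap at step t in this toy is (1 - slow_rate)^t - (1 - fast_rate)^t. It is identically zero when the two rates are equal and otherwise rises and then decays, which mirrors the qualitative claim that the gap vanishes only when the rates match.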
In practice
- Use sparse-residual architectures to suppress memorization in neural networks.
- Monitor early training dynamics to predict later memorization regimes.
- Consider timescale ratios when designing learning algorithms.
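One way to monitor training dynamics for memorization is sketched below. It assumes a synthetic-noise setup in which both the noisy training labels and the original clean labels are available. These helpers are illustrative stand-ins, not the paper's AUG_pos or T* metrics, and the names `memorization_rate` and `feedback_truth_accuracy_gap` are hypothetical.

```python
def memorization_rate(predictions, noisy_labels, clean_labels):
    """Fraction of corrupted examples on which the model predicts the noisy label."""
    corrupted = [
        (p, n)
        for p, n, c in zip(predictions, noisy_labels, clean_labels)
        if n != c  # keep only examples whose label was flipped
    ]
    if not corrupted:
        return 0.0
    return sum(p == n for p, n in corrupted) / len(corrupted)


def feedback_truth_accuracy_gap(predictions, noisy_labels, clean_labels):
    """Accuracy against the noisy labels minus accuracy against the clean labels."""
    total = len(predictions)
    if total == 0:
        raise ValueError("predictions must not be empty")
    acc_feedback = sum(p == y for p, y in zip(predictions, noisy_labels)) / total
    acc_truth = sum(p == y for p, y in zip(predictions, clean_labels)) / total
    return acc_feedback - acc_truth


if __name__ == "__main__":
    clean = [0, 1, 1, 1]
    noisy = [0, 1, 0, 0]   # last two labels are corrupted
    preds = [0, 1, 0, 0]   # model reproduces the noisy labels exactly
    print(memorization_rate(preds, noisy, clean))
    print(feedback_truth_accuracy_gap(preds, noisy, clean))
```

Logging both quantities once per epoch on the training set would show whether a model is drifting toward the noisy labels early in training. Whether such early curves predict the later regime as the study reports would need to be checked against the paper's own metrics.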
Topics
- Feedback-Truth Gap
- Noisy Label Learning
- Neural Network Memorization
- Human Reinforcement Learning
- Sparse-Residual Networks
Best for: AI Researcher, AI Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.NE updates on arXiv.org.