Adopting a human developmental visual diet yields robust and shape-based AI vision
Summary
A new study introduces a "developmental visual diet" (DVD) for training AI vision systems, modeled on human visual maturation from birth to 25 years of age. Over the course of training, the approach gradually increases the visual acuity, contrast sensitivity, and chromatic sensitivity simulated in the input images. It targets a known misalignment between artificial and human vision: AI models often rely on texture rather than shape. Experiments with various deep neural networks (DNNs), including ResNet-50, trained on datasets such as mini-ecoset, ecoset, and ImageNet-1K, showed that DVD-trained models achieved a significantly higher shape bias (up to 0.94, comparable to human levels of 0.90-0.97) than baseline models (0.2-0.4). These models also recognized abstract shapes embedded in complex backgrounds better than large AI foundation models did, and they were more robust to image degradations (e.g., blur, noise, weather effects) and adversarial attacks. The research highlights that guiding *how* a model learns, rather than just *how much*, offers a resource-efficient path to more human-like and robust AI vision.
Key takeaway
AI Engineers and Research Scientists developing computer vision systems should consider integrating the Developmental Visual Diet (DVD) preprocessing pipeline into their training regimes. This method, which simulates human visual maturation, significantly enhances shape bias, abstract shape recognition, and robustness against image degradations and adversarial attacks; on abstract shape recognition, DVD-trained models even outperformed large foundation models. Adopting DVD can lead to more human-aligned and reliable AI vision systems without massive increases in data or model parameters, which makes it a computationally efficient path to improved performance.
Key insights
Mimicking human visual development in AI training fosters shape-based perception and robustness.
Principles
- Gradual visual input improves AI robustness.
- Contrast sensitivity is key for shape bias.
- Developmental order is crucial for learning.
Method
The DVD pipeline applies age-dependent Gaussian blur for visual acuity, frequency-domain thresholding for contrast sensitivity, and linear interpolation for chromatic sensitivity to training images, simulating human visual maturation.
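The three stages named above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear maturation curve, the parameter values (`max_sigma`, `max_threshold`), the order of the stages, and all function names are hypothetical, and the summary does not specify how age maps to blur strength, contrast threshold, or color saturation.

```python
import numpy as np

def maturity(age_months, full_at_months=300.0):
    """Hypothetical linear maturation curve in [0, 1]; 300 months = 25 years."""
    return float(np.clip(age_months / full_at_months, 0.0, 1.0))

def gaussian_blur(img, sigma):
    """Blur an HxWxC image with a Gaussian of std `sigma` pixels (circular boundary)."""
    if sigma <= 0:
        return img.copy()
    h, w = img.shape[:2]
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    # Fourier transform of a unit-area Gaussian: exp(-2 * pi^2 * sigma^2 * f^2)
    kernel = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    spec = np.fft.fft2(img, axes=(0, 1)) * kernel[:, :, None]
    return np.real(np.fft.ifft2(spec, axes=(0, 1)))

def contrast_threshold(img, threshold):
    """Zero out Fourier components weaker than `threshold` times the peak amplitude."""
    spec = np.fft.fft2(img, axes=(0, 1))
    amp = np.abs(spec)
    keep = amp >= threshold * amp.max(axis=(0, 1), keepdims=True)
    return np.real(np.fft.ifft2(spec * keep, axes=(0, 1)))

def chromatic(img, saturation):
    """Linearly interpolate between a grayscale copy (0.0) and full color (1.0)."""
    gray = img.mean(axis=2, keepdims=True)
    return (1.0 - saturation) * gray + saturation * img

def dvd_transform(img, age_months, max_sigma=4.0, max_threshold=0.05):
    """Apply the three stages to an HxWx3 float image with values in [0, 1]."""
    m = maturity(age_months)
    out = gaussian_blur(img, max_sigma * (1.0 - m))         # acuity (assumed scaling)
    out = contrast_threshold(out, max_threshold * (1.0 - m))  # contrast (assumed scaling)
    out = chromatic(out, m)                                  # color (assumed scaling)
    return np.clip(out, 0.0, 1.0)
```

In a training loop, `age_months` would be raised as training progresses, so early batches are blurry, low-contrast, and nearly gray while late batches are close to the original images. How training time maps onto simulated age is a design choice the sketch leaves open.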
In practice
- Implement DVD preprocessing for robust vision models.
- Prioritize contrast sensitivity development in training.
- Evaluate models with cue-conflict and abstract shape benchmarks.
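For the cue-conflict evaluation, shape bias is commonly computed as the fraction of shape-consistent decisions among all decisions that match either the shape or the texture category of a conflicting image. The sketch below assumes that definition; the summary does not spell out the formula, and the function name and label format are hypothetical.

```python
def shape_bias(predictions, shape_labels, texture_labels):
    """Fraction of shape decisions among predictions matching shape or texture."""
    shape_hits = 0
    texture_hits = 0
    for pred, shape, texture in zip(predictions, shape_labels, texture_labels):
        if shape == texture:
            continue  # not a cue-conflict image, so it carries no information
        if pred == shape:
            shape_hits += 1
        elif pred == texture:
            texture_hits += 1
        # predictions matching neither cue are ignored
    decided = shape_hits + texture_hits
    if decided == 0:
        raise ValueError("no prediction matched a shape or texture label")
    return shape_hits / decided

# Toy example: two shape decisions, one texture decision, one miss.
preds = ["cat", "elephant", "car", "dog"]
shapes = ["cat", "cat", "car", "bird"]
textures = ["elephant", "elephant", "clock", "bear"]
bias = shape_bias(preds, shapes, textures)
```

A value near 1.0 means the model classifies almost entirely by shape, and a value near 0.0 means it classifies almost entirely by texture. This is the scale on which the reported figures (up to 0.94 for DVD-trained models, 0.2-0.4 for baselines) are read.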
Topics
- Developmental Visual Diet
- Shape-based AI Vision
- Contrast Sensitivity Development
- Abstract Shape Recognition
- Adversarial Robustness
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Nature Machine Intelligence.