The Importance of Phase in Neural Representations: An Internal Oppenheim-Lim Test of Image Classifiers
Summary
A study of four image classifiers (PRISM2D, GFNet, ViT-B/16, and ResNet-50) finds that, as in human perception, image identity is carried mainly by Fourier phase (or sign) rather than magnitude. In the "Internal Oppenheim-Lim Test," the researchers transplant the phase of one image onto the magnitude of another at a chosen hidden layer and observe how the prediction shifts. For PRISM2D, GFNet, and ViT-B/16, predictions consistently follow the phase donor, and removing image-specific magnitude has minimal impact on accuracy. ResNet-50 initially appears to deviate, but a pre-ReLU intervention reveals a strong latent sign code in its later blocks, while the model's readout consumes a channel-wise spatial average. The architectures therefore appear to share a phase/sign identity code that is exposed differently depending on rectification and readout geometry, which offers a mechanistic explanation for the observed texture-shape gap between CNNs and attention models.
Key takeaway
If you design or debug image classifiers, note that the models studied here rely on Fourier phase, not magnitude, for image identity. If a model shows weak shape bias or poor robustness to texture variations, check whether its rectification and readout geometry obscure or misread phase information. Consider designing layers that explicitly preserve or enhance phase representations, which could improve shape recognition and reduce reliance on superficial texture cues.
Key insights
Image classifiers predominantly encode identity via Fourier phase/sign, not magnitude, across diverse architectures.
Principles
- Image identity rides on phase/sign.
- Magnitude is largely dispensable for readout.
- Rectification and readout geometry affect code exposure.
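To make the second principle concrete, here is a minimal NumPy sketch of what "removing magnitude" can look like. It assumes a 2D FFT over the spatial axes of a (channels, height, width) array and sets every Fourier magnitude to 1; the summary does not say which transform or replacement magnitude the study uses, and the name `flatten_magnitude` is hypothetical.

```python
import numpy as np

def flatten_magnitude(x):
    """Keep the Fourier phase of x over its last two axes; set every magnitude to 1.

    Assumption: a spatial 2D FFT and a unit magnitude, chosen for illustration only.
    """
    spectrum = np.fft.fft2(x, axes=(-2, -1))
    phase_only = np.exp(1j * np.angle(spectrum))
    # For real input the phase-only spectrum stays conjugate-symmetric,
    # so the inverse transform is real up to rounding error.
    return np.fft.ifft2(phase_only, axes=(-2, -1)).real

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))  # hypothetical stand-in for hidden activations
x_flat = flatten_magnitude(x)
```

Feeding `x_flat` instead of `x` into the rest of a network is one way to check how much of a prediction survives once image-specific magnitude is gone.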
Method
The "Internal Oppenheim-Lim Test" transplants Fourier phase from one image onto another's magnitude at a chosen hidden layer to causally assess its influence on classifier predictions.
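The transplant step can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the study's implementation: it assumes a 2D FFT over the spatial axes of a (channels, height, width) activation array, and the function name `transplant_phase` is hypothetical.

```python
import numpy as np

def transplant_phase(mag_donor, phase_donor):
    """Combine the Fourier magnitude of mag_donor with the phase of phase_donor.

    Assumption: the transform is a 2D FFT over the last two (spatial) axes.
    """
    f_mag = np.fft.fft2(mag_donor, axes=(-2, -1))
    f_phase = np.fft.fft2(phase_donor, axes=(-2, -1))
    hybrid_spectrum = np.abs(f_mag) * np.exp(1j * np.angle(f_phase))
    # Both inputs are real, so the hybrid spectrum is conjugate-symmetric
    # and the inverse transform is real up to rounding error.
    return np.fft.ifft2(hybrid_spectrum, axes=(-2, -1)).real

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 8, 8))  # hypothetical activations for image A
b = rng.standard_normal((4, 8, 8))  # hypothetical activations for image B
hybrid = transplant_phase(a, b)     # magnitude from A, phase from B
```

In the test described above, the hybrid would replace the original activations at the chosen layer, and the question is whether the final prediction follows image A (the magnitude donor) or image B (the phase donor).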
In practice
- Analyze phase representations in models.
- Design architectures prioritizing phase.
- Investigate how readout mechanisms expose or hide phase.
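When looking at readouts, one standard identity is worth keeping in mind: a channel-wise spatial average equals the zero-frequency (DC) Fourier coefficient divided by the number of spatial positions. The NumPy check below uses a hypothetical 7x7 feature map; whether this identity accounts for the ResNet-50 behavior is not stated in the summary, so treat it as a starting point for investigation rather than the study's explanation.

```python
import numpy as np

rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 7, 7))  # hypothetical (channels, height, width) features

# Channel-wise spatial average, as consumed by an average-pooling readout.
gap = feat.mean(axis=(-2, -1))

# Zero-frequency coefficient of the spatial 2D FFT, one per channel.
dc = np.fft.fft2(feat, axes=(-2, -1))[..., 0, 0]
dc_mean = dc.real / (feat.shape[-2] * feat.shape[-1])
```

Here `gap` and `dc_mean` agree up to floating-point error, so a purely linear spatial average sees only the DC term of each channel.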
Topics
- Neural Networks
- Image Classification
- Fourier Phase
- Representation Learning
- CNNs
- Vision Transformers
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.