The Importance of Phase in Neural Representations: An Internal Oppenheim-Lim Test of Image Classifiers

Source: Artificial Intelligence · Field: Technology & Digital / Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

A study of four image classifiers (PRISM2D, GFNet, ViT-B/16, and ResNet-50) finds that, as in human perception, image identity is carried primarily by Fourier phase (or sign) rather than by magnitude. In an "Internal Oppenheim-Lim Test," the researchers transplanted the phase of one image onto the magnitude of another at a chosen hidden layer and observed how predictions shifted. For PRISM2D, GFNet, and ViT-B/16, predictions consistently followed the phase donor, and removing image-specific magnitude had minimal impact on accuracy. ResNet-50 initially appeared to deviate, but a pre-ReLU intervention revealed a strong latent sign code in its later blocks; its readout consumes a channel-wise spatial average. The results suggest that these architectures share a phase/sign identity code and differ mainly in how rectification and readout geometry expose it, which offers a mechanistic explanation for the texture-shape gap observed between CNNs and attention models.
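
The remark about ResNet-50's readout has a simple Fourier reading, sketched here under stated assumptions: a channel-wise spatial average equals the zero-frequency (DC) coefficient of that channel's 2D DFT divided by the number of spatial positions, and after ReLU that coefficient is non-negative. The feature-map shape and the random values are hypothetical stand-ins, not data from the study.

```python
import numpy as np

rng = np.random.default_rng(0)
pre_relu = rng.standard_normal((4, 7, 7))  # hypothetical (channels, H, W) feature map
post_relu = np.maximum(pre_relu, 0.0)      # rectification discards negative values

h, w = post_relu.shape[-2:]

# Readout input: one spatial average per channel (global average pooling).
pooled = post_relu.mean(axis=(-2, -1))

# The same quantity recovered from the 2D DFT: the [0, 0] bin is the sum over
# all spatial positions, so dividing by H * W gives the spatial mean.
dc = np.fft.fft2(post_relu, axes=(-2, -1))[:, 0, 0]
pooled_from_dc = dc.real / (h * w)
```

This is consistent with the summary's point that rectification and readout geometry affect how a sign code is exposed, though the study's own analysis may differ.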

Key takeaway

For Machine Learning Engineers designing or debugging image classifiers, the practical point is that these models carry image identity in Fourier phase (or sign), not magnitude. If a model shows weak shape bias or poor robustness to texture variation, check whether its rectification and readout geometry obscure phase information. Layers designed to preserve or expose phase representations may improve shape recognition and reduce reliance on superficial texture cues.

Key insights

Image classifiers predominantly encode identity via Fourier phase/sign, not magnitude, across diverse architectures.

Method

The "Internal Oppenheim-Lim Test" transplants Fourier phase from one image onto another's magnitude at a chosen hidden layer to causally assess its influence on classifier predictions.
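
The transplant described above can be sketched with NumPy on a single activation tensor. This is a minimal illustration, not the study's code: the function names `transplant_phase` and `transplant_sign`, the (channels, H, W) layout, and the random stand-in activations are all assumptions, and feeding the hybrid back into a model's forward pass is omitted. The sign variant assumes that "sign" means the elementwise sign of a real-valued activation.

```python
import numpy as np

def transplant_phase(mag_donor, phase_donor):
    """Hybrid activation: Fourier magnitude of one map, Fourier phase of another.

    Both inputs are real arrays of identical shape whose last two axes are
    spatial (an assumed layout, e.g. (channels, H, W)).
    """
    spec_mag = np.fft.fft2(mag_donor, axes=(-2, -1))
    spec_phase = np.fft.fft2(phase_donor, axes=(-2, -1))
    hybrid_spec = np.abs(spec_mag) * np.exp(1j * np.angle(spec_phase))
    # Real inputs give a conjugate-symmetric hybrid spectrum, so the inverse
    # transform is real up to rounding error; drop the residual imaginary part.
    return np.fft.ifft2(hybrid_spec, axes=(-2, -1)).real

def transplant_sign(mag_donor, sign_donor):
    """Assumed real-valued analogue: absolute value of one map, sign of another."""
    return np.abs(mag_donor) * np.sign(sign_donor)

rng = np.random.default_rng(0)
act_a = rng.standard_normal((8, 14, 14))  # stand-in activation for image A
act_b = rng.standard_normal((8, 14, 14))  # stand-in activation for image B

# Magnitude from A, phase from B, at one hypothetical hidden layer.
hybrid = transplant_phase(act_a, act_b)
```

In an experiment along these lines, `hybrid` would replace the layer's activation before the rest of the forward pass, and the resulting prediction would be compared against the labels of the magnitude donor and the phase donor.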

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.