From Data Statistics to Feature Geometry: How Correlations Shape Superposition

2026-03-10 · Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Advanced, quick

Summary

A new study introduces Bag-of-Words Superposition (BOWS), a controlled setting in which a network encodes binary bag-of-words representations of internet text, to investigate how feature correlations influence superposition in neural networks. Prior research focused mainly on sparse, uncorrelated features and treated interference between superposed features as noise to be minimized. This work shows that when features are correlated, interference can instead be constructive. In BOWS, features arrange themselves according to their co-activation patterns, so that features active together interfere constructively while ReLUs still prevent false positives. This arrangement, which is particularly prevalent in models trained with weight decay, naturally forms semantic clusters and cyclical structures. It offers an explanation for patterns observed in real language models that the traditional account of superposition did not explain.

Key takeaway

For research scientists investigating neural network interpretability, this work suggests that any account of superposition should include feature correlations. If interference can be constructive rather than only noise, that changes how you analyze feature representations and how you choose regularization. A concrete next step is to examine how weight decay affects feature geometry in your own models and whether semantic clusters or cyclical structures emerge.

Key insights

Feature correlations can enable constructive interference in neural network superposition, forming semantic clusters.

Principles

Correlated features arrange by co-activation patterns.
Weight decay promotes constructive superposition arrangements.
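
One way to probe the first principle is to compare how often features co-activate in the data with how close their learned directions are. The sketch below is illustrative only and is not taken from the paper or its repository: the function names are hypothetical, and it assumes a binary activation matrix `x` of shape (documents, features) and a weight matrix `W` of shape (hidden, features) whose columns are the feature directions.

```python
import numpy as np

def coactivation_correlation(x):
    """Pearson correlation between feature columns of a 0/1 activation matrix.

    Note: a feature that is constant across all documents yields NaN entries.
    """
    return np.corrcoef(x, rowvar=False)

def feature_cosine_similarity(W):
    """Cosine similarity between the columns of W (one column per feature)."""
    norms = np.linalg.norm(W, axis=0, keepdims=True)
    unit = W / np.clip(norms, 1e-12, None)
    return unit.T @ unit

def geometry_alignment(x, W):
    """Correlation between pairwise co-activation and pairwise direction similarity.

    Only feature pairs (i, j) with i < j are used, so each pair counts once.
    """
    rows, cols = np.triu_indices(x.shape[1], k=1)
    data_side = coactivation_correlation(x)[rows, cols]
    weight_side = feature_cosine_similarity(W)[rows, cols]
    return float(np.corrcoef(data_side, weight_side)[0, 1])
```

A value of `geometry_alignment` close to 1 would indicate that features that fire together also point in similar directions, which is the kind of arrangement the summary describes.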

Method

Bag-of-Words Superposition (BOWS) encodes binary text representations to study feature geometry under correlation, using ReLUs to manage false positives.
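
A minimal sketch of this kind of setup follows. The summary does not specify the architecture, so the tied-weight linear encoder with a ReLU decoder used here is an assumption borrowed from common toy models of superposition, and the helper names `binary_bag_of_words` and `reconstruct` are hypothetical.

```python
import numpy as np

def binary_bag_of_words(docs, vocab):
    """Return a (num_docs, vocab_size) 0/1 matrix: 1 if the word occurs in the document."""
    index = {word: i for i, word in enumerate(vocab)}
    x = np.zeros((len(docs), len(vocab)), dtype=np.float32)
    for row, doc in enumerate(docs):
        for token in doc.lower().split():  # naive whitespace tokenization (assumption)
            if token in index:
                x[row, index[token]] = 1.0
    return x

def reconstruct(x, W, b):
    """Compress x to W.shape[0] dimensions, then decode with the same weights and a ReLU.

    W has shape (hidden, features); b has shape (features,).
    """
    h = x @ W.T                        # (num_docs, hidden)
    return np.maximum(0.0, h @ W + b)  # ReLU clips negative pre-activations to zero
```

To see how the ReLU can prevent a false positive, take one hidden dimension with `W = [[1.0, 0.5]]` and `b = [0.0, -0.5]`. For the input `[1, 0]` the pre-activations are `[1.0, 0.0]`: the interference that the first feature leaks onto the second is cancelled by the negative bias, and the ReLU keeps anything below zero switched off.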

In practice

Explore feature geometry in language models.
Test whether weight decay strengthens semantic clustering in your own models.
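
As a starting point for such experiments, the sketch below trains a small tied-weight ReLU autoencoder with an optional L2 penalty on the weights, so that runs with and without weight decay can be compared. It is not the paper's training setup: plain full-batch gradient descent, the squared-error loss, the L2 form of weight decay, and the name `train_tied_autoencoder` are all assumptions made for illustration.

```python
import numpy as np

def train_tied_autoencoder(x, hidden_dim, weight_decay=0.0, lr=0.02, steps=500, seed=0):
    """Fit y = ReLU(x W^T W + b) to x by full-batch gradient descent.

    weight_decay is the coefficient of an L2 penalty weight_decay * sum(W**2).
    Returns W (hidden_dim, features), b (features,), and the reconstruction
    loss recorded before each update (the penalty term is not included).
    """
    rng = np.random.default_rng(seed)
    num_docs, num_features = x.shape
    W = 0.3 * rng.standard_normal((hidden_dim, num_features))
    b = np.zeros(num_features)
    losses = []
    for _ in range(steps):
        h = x @ W.T                 # encode: (num_docs, hidden_dim)
        z = h @ W + b               # decode pre-activation: (num_docs, features)
        y = np.maximum(0.0, z)      # ReLU output
        r = y - x                   # reconstruction residual
        losses.append(float((r ** 2).sum() / num_docs))
        dz = 2.0 * r * (z > 0) / num_docs   # gradient through the ReLU
        # W appears in both the encoder and the decoder, so both paths contribute.
        grad_W = h.T @ dz + (dz @ W.T).T @ x + 2.0 * weight_decay * W
        grad_b = dz.sum(axis=0)
        W -= lr * grad_W
        b -= lr * grad_b
    return W, b, losses
```

Training the same data with `weight_decay=0.0` and with a positive value, then comparing the cosine similarities between the columns of `W`, is one simple way to check whether the penalty changes how correlated features are arranged.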

Topics

Mechanistic Interpretability
Superposition
Feature Geometry
Bag-of-Words Superposition
Neural Networks

Code references

LucasPrietoAl/correlations-feature-geometry

Best for: Research Scientist, AI Researcher, AI Scientist, Deep Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.