Compositionality Emerges in a Narrow Depth-Connectivity Regime: Architecture Constraints and Solution Manifolds

Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

This work shows that compositionality, which is crucial for generalization in neural networks, emerges only within a "narrow connectivity-depth sweet spot." On the connectivity axis, it appears only in certain sparse networks, and which connections are retained matters more than the overall sparsity level. On the depth axis, it peaks within a narrow, target-dependent range; both shallower and deeper networks fail. When these architectural conditions are not met, gradient descent converges to fractured, non-compositional solutions. To address this, the authors introduce similarity-based pruning (SP) to recover compositional connectivity and a heuristic depth predictor to estimate the optimal depth. The empirical findings are supported by a theoretical framework built on compositional sparsity and feature-interference bounds.

Key takeaway

For AI Scientists designing neural network architectures, the critical point is that compositionality is not a given but an emergent property of specific depth and connectivity regimes. Experiment with similarity-based pruning and the heuristic depth predictor to locate the "sweet spot" for your target task, and avoid architectures that silently converge to fractured solutions. Prioritize which connections are kept over the overall sparsity level to improve generalization.

Key insights

Compositionality in neural networks arises only within a precise, narrow regime of network depth and connectivity.

Principles

Method

The paper introduces similarity-based pruning (SP) to recover compositional connectivity and a heuristic depth predictor to estimate optimal compositional depth.
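This summary does not say how SP scores connections, so the following is only an illustrative sketch and not the authors' algorithm. It assumes a connection is scored by the absolute correlation between the activations of the two units it joins, and that the highest-scoring fraction is kept. The function name `similarity_prune_mask`, the `keep_fraction` parameter, and the toy target are all hypothetical.

```python
import numpy as np

def similarity_prune_mask(pre_acts, post_acts, keep_fraction):
    """Hypothetical similarity-based mask (an assumption, not the paper's SP).

    pre_acts:  (n_samples, n_pre) activations of the input-side units.
    post_acts: (n_samples, n_post) activations of the output-side units.
    Returns a boolean mask of shape (n_post, n_pre); True means "keep".
    """
    # Center and unit-normalize each unit's activations across samples.
    pre = pre_acts - pre_acts.mean(axis=0)
    post = post_acts - post_acts.mean(axis=0)
    pre = pre / (np.linalg.norm(pre, axis=0) + 1e-12)
    post = post / (np.linalg.norm(post, axis=0) + 1e-12)

    # Absolute correlation between every (post, pre) pair of units.
    sim = np.abs(post.T @ pre)

    # Keep the k most similar pairs; everything below the k-th score is pruned.
    k = max(1, int(round(keep_fraction * sim.size)))
    threshold = np.partition(sim.ravel(), -k)[-k]
    return sim >= threshold

# Toy compositional target: each hidden unit depends on a disjoint input pair.
rng = np.random.default_rng(0)
x = rng.normal(size=(512, 4))
h = np.stack([x[:, 0] + x[:, 1], x[:, 2] - x[:, 3]], axis=1)

mask = similarity_prune_mask(x, h, keep_fraction=0.5)
print(mask.astype(int))
```

In this toy setting the mask keeps the two inputs that feed each hidden unit and drops the rest. A random mask at the same 50% sparsity would usually not do so, which is one way to read the claim that which connections survive matters more than how many.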

Best for: Research Scientist, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.