Dropout Neural Network Training Viewed from a Percolation Perspective

Source: stat.ML updates on arXiv.org · Field: Technology & Digital, Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences · Depth: Expert, extended

Summary

A new study investigates whether "percolation" occurs when deep neural networks (NNs) are trained with dropout, the regularization technique introduced by G. Hinton et al. (2012), and what effect it has on training. The authors model dropout's random removal of connections with new percolation models on rectangular layered networks, distinguishing bond percolation (DropConnect) from site percolation (the original dropout). They characterize how network topology (depth L, width W) determines the probability that a path exists between the input and output layers, and establish critical behavior. The theory shows that, in NNs without biases, this percolative effect can cause training to break down and prevent learning; a heuristic argument extends the breakdown to NNs with biases. For deep networks in particular, the number of training steps T(n) required to avoid the issue can grow exponentially, or even doubly exponentially, with depth.

Key takeaway

For AI scientists designing or training deep neural networks, this research identifies a "percolation problem": excessively deep networks, especially those without biases, can fail to learn because dropout leaves too few intact input-output paths. Consider the network's width-to-depth ratio when choosing an architecture, and adjust training duration accordingly. For very deep networks, expect the training time needed for effective learning to grow exponentially, or even doubly exponentially, with depth; otherwise parameters may stagnate.

Key insights

Dropout's random connection removal in deep NNs can lead to a "percolation problem" where input-output paths vanish, preventing learning.
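
The role of biases can be illustrated with a minimal sketch, assuming a small bias-free ReLU network with dropout applied to hidden units (the site model). The function `forward`, the mask layout, and the sizes are hypothetical choices for illustration, not the paper's construction, and the usual rescaling of kept activations is omitted. When a mask removes every unit of one hidden layer, no input-output path survives and the output is zero for any input.

```python
import random

def forward(x, weights, masks):
    """Bias-free ReLU MLP. masks[k][j] == 0 drops unit j of hidden layer k."""
    h = x
    for k, layer in enumerate(weights):
        # Each row of `layer` holds the incoming weights of one unit.
        z = [sum(w * h_i for w, h_i in zip(row, h)) for row in layer]
        if k < len(weights) - 1:
            # Hidden layer: ReLU, then the dropout mask.
            h = [max(0.0, z_j) * m for z_j, m in zip(z, masks[k])]
        else:
            # Output layer: linear, no dropout.
            h = z
    return h

rng = random.Random(0)
width, depth = 4, 3  # 3 weight matrices, hence 2 hidden layers
weights = [
    [[rng.gauss(0, 1) for _ in range(width)] for _ in range(width)]
    for _ in range(depth)
]

# Second hidden layer fully dropped: every input-output path is cut.
blocked = [[1] * width, [0] * width]

out_a = forward([1.0, 2.0, 3.0, 4.0], weights, blocked)
out_b = forward([-5.0, 0.5, 7.0, 1.0], weights, blocked)
print(out_a, out_b)
```

In such a step the activations after the blocked layer are all zero and the backward signal cannot pass the blocked layer, so every weight gradient is zero and the step teaches the network nothing. With biases, the layers after the cut still produce nonzero activations, which is one way to see why the bias-free case is the cleaner one to analyze.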

Method

The authors define new percolation models on rectangular layered networks (bond percolation for DropConnect, site percolation for the original dropout) to characterize the crossing probability and its impact on NN training.
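
To make the crossing probability concrete, here is a minimal sketch under simplifying assumptions that are not taken from the paper: adjacent layers are fully connected, each unit (site model) or each edge (bond model) is kept independently with probability `keep_prob`, and "crossing" means that at least one intact input-output path exists. The function names and parameters are hypothetical. In this toy setting the site case has a closed form, since a path exists exactly when every dropped-out layer keeps at least one unit, while the bond case is estimated by Monte Carlo.

```python
import random

def site_crossing_probability(width, depth, keep_prob):
    """Toy site model: `depth` layers, each keeping each of its `width`
    units independently with probability keep_prob. With full connectivity,
    a path exists iff every layer keeps at least one unit."""
    layer_survives = 1.0 - (1.0 - keep_prob) ** width
    return layer_survives ** depth

def bond_crossing_estimate(width, depth, keep_prob, trials=2000, seed=0):
    """Toy bond model: `depth` fully connected edge layers between layers of
    `width` units, each edge kept independently with probability keep_prob.
    Returns a Monte Carlo estimate of the crossing probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        reachable = width  # every input unit is a valid starting point
        for _ in range(depth):
            # A unit in the next layer is reached if at least one edge from
            # a currently reachable unit is kept; edges into different units
            # are disjoint, so these events are independent across units.
            p_reach = 1.0 - (1.0 - keep_prob) ** reachable
            reachable = sum(rng.random() < p_reach for _ in range(width))
            if reachable == 0:
                break
        hits += reachable > 0
    return hits / trials

for depth in (5, 20, 80):
    print(depth,
          site_crossing_probability(4, depth, 0.5),
          bond_crossing_estimate(4, depth, 0.5))
```

At fixed width, the toy site probability shrinks geometrically as depth grows, so the expected number of sampled masks before one contains a path grows exponentially with depth. This is only an intuition for the depth dependence described above, not a derivation of the paper's bounds.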

Best for: AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.