Stable Deep Reinforcement Learning via Isotropic Gaussian Representations
Summary
Deep reinforcement learning (DRL) systems frequently suffer from unstable training dynamics, marked by non-stationarity, representation collapse, and neuron dormancy. This research demonstrates that isotropic Gaussian embeddings offer a provably stable solution for tracking time-varying targets. Building on this, the authors propose Sketched Isotropic Gaussian Regularization (SIGReg), a computationally inexpensive method that shapes representations toward an isotropic Gaussian distribution during training. Empirically, SIGReg improves performance and stability across diverse domains. With PQN, it improved results on 51 of 57 Atari games (89.5%), with a mean AUC improvement of 889%. With PPO, it improved results on 68% of games, with a 25% average AUC gain. SIGReg also reduces representation collapse and neuron dormancy in settings such as CIFAR-10 and Isaac Gym, often matching or outperforming more complex second-order methods such as Kronecker-factored optimization at lower computational cost.
Key takeaway
For machine learning engineers developing deep reinforcement learning systems, non-stationarity is a core challenge. You should integrate Sketched Isotropic Gaussian Regularization (SIGReg) into your DRL pipelines. This lightweight method shapes representations toward an isotropic Gaussian distribution, which enhances training stability, reduces collapse, and mitigates neuron dormancy. In the reported experiments, adopting SIGReg improved performance and sample efficiency, often matching complex optimizers without their overhead.
Key insights
Isotropic Gaussian representations fundamentally stabilize deep RL by mitigating non-stationarity, preventing collapse, and reducing neuron dormancy.
Principles
- Isotropic Gaussian embeddings minimize sensitivity to target drift.
- Isotropy equalizes contraction across all representation dimensions.
- Among distributions with a fixed covariance, the Gaussian maximizes entropy, and its light tails keep extreme values under control.
Method
Sketched Isotropic Gaussian Regularization (SIGReg) projects embeddings onto random directions. It then applies a univariate distribution-matching loss to each projection, enforcing isotropy and Gaussianity efficiently.
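As a rough illustration of the two steps above, the following NumPy sketch projects a batch of embeddings onto random unit directions and scores each one-dimensional projection against a standard normal. The summary does not say which univariate loss SIGReg uses, so the characteristic-function comparison here is an assumption, as are the function name `sigreg_sketch_loss` and its parameters (`num_directions`, `num_freqs`, `t_max`). It is meant for intuition only and is not the authors' implementation. In actual training the same computation would be written in an autodiff framework so that gradients reach the encoder.

```python
import numpy as np


def sigreg_sketch_loss(embeddings, num_directions=64, num_freqs=17,
                       t_max=3.0, seed=0):
    """Hypothetical sketch of a sliced isotropic-Gaussian penalty.

    embeddings: array of shape (batch, dim).
    Returns a non-negative float that is small when every random 1-D
    projection of the batch looks like a standard normal.
    """
    rng = np.random.default_rng(seed)
    _, dim = embeddings.shape

    # Step 1: draw random directions and normalise them to unit length.
    directions = rng.standard_normal((dim, num_directions))
    directions /= np.linalg.norm(directions, axis=0, keepdims=True)
    projections = embeddings @ directions  # (batch, num_directions)

    # Step 2 (assumed loss): compare the empirical characteristic
    # function of each projection with that of N(0, 1), exp(-t^2 / 2),
    # on a fixed grid of frequencies t.
    t = np.linspace(-t_max, t_max, num_freqs)
    phase = projections[:, :, None] * t[None, None, :]
    empirical_cf = np.exp(1j * phase).mean(axis=0)  # (num_directions, num_freqs)
    target_cf = np.exp(-0.5 * t ** 2)               # (num_freqs,)

    # Mean squared gap over all directions and frequencies.
    return float(np.mean(np.abs(empirical_cf - target_cf[None, :]) ** 2))
```

Under this sketch, a batch sampled from a standard normal scores close to zero, while a fully collapsed batch (every embedding identical) scores much higher. Because an isotropic Gaussian has standard normal marginals along every unit direction, matching many random projections pushes the whole embedding distribution toward isotropy without estimating a full covariance matrix.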
In practice
- Apply SIGReg to DRL agents to improve training stability.
- Use SIGReg to reduce representation collapse and neuron dormancy.
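To check whether a regularizer is actually reducing dormancy, it helps to track a dormancy metric during training. The sketch below uses one common definition from the DRL literature: a neuron counts as dormant when its mean absolute activation, normalized by the layer average, falls at or below a threshold. The summary does not state which definition or threshold the paper uses, so the function name `dormant_fraction` and the default `tau` are assumptions for illustration.

```python
import numpy as np


def dormant_fraction(activations, tau=0.025):
    """Fraction of neurons in one layer considered dormant.

    activations: array of shape (batch, num_neurons), post-nonlinearity.
    tau: hypothetical threshold on the normalised activation score.
    """
    # Average magnitude of each neuron over the batch.
    mean_abs = np.abs(activations).mean(axis=0)
    layer_mean = mean_abs.mean()

    # If the entire layer is silent, treat every neuron as dormant.
    if layer_mean == 0.0:
        return 1.0

    # Normalise by the layer average so the score is scale-free.
    scores = mean_abs / layer_mean
    return float((scores <= tau).mean())
```

Logging this value per layer with and without the regularizer gives a simple way to compare runs; a rising fraction over training is a typical sign that capacity is being lost.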
Topics
- Deep Reinforcement Learning
- Representation Learning
- Isotropic Gaussian Representations
- SIGReg
- Training Stability
- Non-stationarity
Best for: Research Scientist, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.