Stable Deep Reinforcement Learning via Isotropic Gaussian Representations

2026-06-06 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

Deep reinforcement learning (DRL) training is frequently unstable, marked by non-stationarity, representation collapse, and neuron dormancy. The authors show that isotropic Gaussian embeddings are provably stable when tracking time-varying targets. Building on this, they propose Sketched Isotropic Gaussian Regularization (SIGReg), a computationally inexpensive method that shapes representations toward an isotropic Gaussian distribution during training. Empirically, SIGReg improves performance and stability across diverse domains. With PQN, it improved 51 of 57 Atari games (89.5%), with a mean AUC improvement of 889%; with PPO, it improved 68% of games, with a 25% average AUC gain. SIGReg also reduces representation collapse and neuron dormancy in settings such as CIFAR-10 and Isaac Gym, and often matches or outperforms more complex second-order methods such as Kronecker-factored optimization at lower computational cost.

Key takeaway

For machine learning engineers building deep reinforcement learning systems, non-stationarity is a core challenge, and Sketched Isotropic Gaussian Regularization (SIGReg) is worth integrating into DRL pipelines. This lightweight method shapes representations toward an isotropic Gaussian distribution, which improves training stability, reduces representation collapse, and mitigates neuron dormancy. In the reported experiments, SIGReg improves performance and sample efficiency, often matching more complex optimizers without their overhead.

Key insights

Isotropic Gaussian representations fundamentally stabilize deep RL by mitigating non-stationarity, preventing collapse, and reducing neuron dormancy.

Principles

Isotropic Gaussian embeddings minimize sensitivity to target drift.
Isotropy equalizes contraction across all representation dimensions.
Among distributions with a given covariance, the Gaussian maximizes entropy and has well-controlled tails.

Method

Sketched Isotropic Gaussian Regularization (SIGReg) projects embeddings onto random directions. It then applies a univariate distribution-matching loss to each projection, enforcing isotropy and Gaussianity efficiently.
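
The two steps described above (random projections, then a per-projection univariate loss) can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the summary does not say which univariate loss is used, so the sketch assumes a characteristic-function match against the standard normal, and the function name `sigreg_sketch` and its parameters (`num_directions`, `num_freqs`, `t_max`) are hypothetical.

```python
import numpy as np

def sigreg_sketch(z, num_directions=64, num_freqs=17, t_max=3.0, seed=None):
    """Illustrative isotropic-Gaussian penalty for embeddings z of shape (n, d).

    Assumption: the univariate loss is a squared distance between the
    empirical characteristic function (CF) of each 1-D projection and the
    CF of N(0, 1). The source only says "univariate distribution-matching
    loss", so this choice is a stand-in.
    """
    rng = np.random.default_rng(seed)
    n, d = z.shape

    # Step 1: sketch. Draw random unit directions and project the embeddings.
    dirs = rng.standard_normal((d, num_directions))
    dirs /= np.linalg.norm(dirs, axis=0, keepdims=True)
    proj = z @ dirs                                   # (n, num_directions)

    # Step 2: compare each projection's empirical CF with exp(-t^2 / 2),
    # the CF of a standard normal, on a grid of frequencies t.
    t = np.linspace(-t_max, t_max, num_freqs)         # (num_freqs,)
    tx = proj[:, :, None] * t[None, None, :]          # (n, k, num_freqs)
    ecf_real = np.cos(tx).mean(axis=0)                # real part of the CF
    ecf_imag = np.sin(tx).mean(axis=0)                # imaginary part
    target = np.exp(-0.5 * t ** 2)                    # N(0, 1) CF is real
    err = (ecf_real - target) ** 2 + ecf_imag ** 2
    return float(err.mean())

# Usage: an isotropic Gaussian batch scores low, a rank-one batch scores high.
rng = np.random.default_rng(0)
gaussian_batch = rng.standard_normal((4096, 32))
direction = rng.standard_normal(32)
direction /= np.linalg.norm(direction)
collapsed_batch = np.outer(rng.standard_normal(4096), direction)
loss_gaussian = sigreg_sketch(gaussian_batch, seed=1)
loss_collapsed = sigreg_sketch(collapsed_batch, seed=1)
```

In a training loop, a differentiable version of this penalty (written in the agent's own framework) would be added to the RL loss with a small weight. Because every random projection of an isotropic Gaussian is itself a standard normal, matching many one-dimensional projections is a cheap proxy for matching the full distribution.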

In practice

Apply SIGReg to DRL agents to improve training stability.
Use SIGReg to reduce representation collapse and neuron dormancy.
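
To check whether a regularizer is actually helping, it is useful to track collapse and dormancy during training. The sketch below shows two diagnostics that are common in the literature; the summary does not state which metrics the paper uses, so the definitions here (a normalized-activation threshold for dormancy, an entropy-based effective rank for collapse), the threshold `tau`, and the function names are assumptions.

```python
import numpy as np

def dormant_fraction(acts, tau=0.025):
    """Fraction of neurons whose normalized mean |activation| is <= tau.

    acts: (batch, neurons) post-activation outputs of one layer.
    Assumption: a neuron's score is its mean absolute activation divided
    by the layer-wide average of that quantity.
    """
    score = np.abs(acts).mean(axis=0)
    score = score / (score.mean() + 1e-8)
    return float((score <= tau).mean())

def effective_rank(z):
    """Entropy-based effective rank of centered embeddings z of shape (n, d).

    Values near d suggest the representation uses all dimensions;
    values near 1 suggest collapse onto a single direction.
    """
    s = np.linalg.svd(z - z.mean(axis=0), compute_uv=False)
    p = s / (s.sum() + 1e-12)
    p = p[p > 0]                      # avoid log(0)
    return float(np.exp(-(p * np.log(p)).sum()))

# Usage on synthetic data: half of the ReLU units are forced to zero.
rng = np.random.default_rng(0)
acts = np.maximum(rng.standard_normal((256, 8)), 0.0)
acts[:, 4:] = 0.0
frac = dormant_fraction(acts)

healthy = rng.standard_normal((2048, 16))
collapsed = np.outer(rng.standard_normal(2048), np.ones(16))
rank_healthy = effective_rank(healthy)
rank_collapsed = effective_rank(collapsed)
```

Logging both numbers per layer, with and without the regularizer, gives a quick before-and-after comparison.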

Topics

Deep Reinforcement Learning
Representation Learning
Isotropic Gaussian Representations
SIGReg
Training Stability
Non-stationarity

Code references

asahebpa/IsoGaussian-DRL

Best for: Research Scientist, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.