EMAgnet: Parameter-Space EMA Regularization for Policy Gradient Self-Play in Large Games

Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Gaming & Interactive Media · Depth: Expert, quick

Summary

EMAgnet is a parameter-space exponential moving average (EMA) regularization method for policy gradient self-play in large two-player zero-sum imperfect-information games. Traditional magnet regularization pulls the policy toward a fixed uniform distribution, which treats every action alike. EMAgnet instead regularizes toward an EMA of the last-iterate policy's parameters, so the regularization target moves as the agent's strategy improves. Compared with PPO self-play using uniform-magnet regularization, under both linear and power-law annealing schedules, EMAgnet reached lower exploitability in most tested environments. Its gains were consistent, particularly in games with strictly dominated strategies and exploration challenges.

Key takeaway

For machine learning engineers building self-play algorithms for large imperfect-information games: evaluate EMAgnet's parameter-space EMA regularization as an alternative to uniform-magnet regularization in policy gradient systems. Its regularization target tracks the evolving strategy, and it showed lower exploitability in most tested environments, with consistent gains in games that contain strictly dominated strategies.

Key insights

EMAgnet applies adaptive parameter-space EMA regularization to policy gradient self-play and reached lower exploitability than uniform-magnet regularization in most tested environments.

Principles

Method

EMAgnet regularizes policy gradient methods by targeting an exponential moving average (EMA) of the last-iterate policy's parameters, allowing the regularization target to adapt as the agent's strategy improves.
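
The idea above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy softmax policy over a flat list of logits, a KL(policy || magnet) penalty as the form of the regularizer, and simple linear and power-law decay formulas for the regularization coefficient. All function names and the exact formulas are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Turn a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with full support in q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def ema_update(magnet_params, policy_params, decay):
    """Move the magnet parameters toward the latest policy parameters.

    The average is taken in parameter space, not over action probabilities.
    """
    return [decay * m + (1.0 - decay) * p
            for m, p in zip(magnet_params, policy_params)]


def linear_coeff(step, total_steps, start):
    """Assumed linear annealing: decays from `start` to 0 over total_steps."""
    return start * max(0.0, 1.0 - step / total_steps)


def power_law_coeff(step, start, exponent):
    """Assumed power-law annealing: start / (1 + step) ** exponent."""
    return start / (1.0 + step) ** exponent


def regularized_loss(pg_loss, policy_params, magnet_params, coeff):
    """Policy gradient loss plus an assumed KL penalty toward the magnet."""
    policy = softmax(policy_params)
    magnet = softmax(magnet_params)
    return pg_loss + coeff * kl_divergence(policy, magnet)


if __name__ == "__main__":
    policy_params = [0.0, 0.0, 0.0]
    magnet_params = list(policy_params)  # magnet starts at the initial policy
    for step in range(3):
        # Stand-in for a PPO update: nudge the first logit upward.
        policy_params = [policy_params[0] + 0.5] + policy_params[1:]
        coeff = linear_coeff(step, total_steps=3, start=0.1)
        loss = regularized_loss(0.0, policy_params, magnet_params, coeff)
        magnet_params = ema_update(magnet_params, policy_params, decay=0.9)
        print(step, round(loss, 6), [round(m, 4) for m in magnet_params])
```

With a uniform magnet, `magnet_params` would stay fixed at equal logits for the whole run. Here the magnet follows the policy with a lag set by `decay`, so the penalty measures drift from the agent's recent behavior rather than from uniform play.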

Best for: Research Scientist, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.