MARLIN: Multi-Agent Reinforcement Learning for Incremental DAG Discovery

2026-03-24 · Source: cs.LG updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Advanced, extended

Summary

MARLIN (Multi-Agent Reinforcement Learning for Incremental DAG Discovery) is a framework for efficient, online causal structure learning from observational data. It targets two limitations of existing reinforcement learning (RL) methods for causal discovery: they are often too slow for online use, and they struggle with non-stationary data distributions. MARLIN employs an intra-batch DAG generation policy that maps a continuous real-valued space to the DAG space. It integrates two RL agents, a state-specific agent and a state-invariant agent, to incrementally uncover causal relationships, and it uses a factored action space to improve parallelization. In experiments on synthetic datasets (Linear-Gaussian, Linear-Exponential, Quadratic Regression, Gaussian Process) and real-world datasets (OnlineBoutique, Secure Water Treatment, Water Distribution), the authors report that MARLIN outperforms state-of-the-art methods in both efficiency and effectiveness, particularly on incremental DAG learning and root cause analysis tasks.

Key takeaway

For research scientists building causal discovery systems for dynamic, online environments, MARLIN offers an approach to incremental DAG learning. Its multi-agent reinforcement learning design disentangles state-specific and state-invariant causal mechanisms, which lets it adapt efficiently to non-stationary data streams. Consider MARLIN when your causal models must stay accurate in real time while data arrives continuously and system states evolve.

Key insights

MARLIN efficiently learns causal DAGs incrementally in online settings using a multi-agent reinforcement learning approach.

Principles

Disentangle state-specific and state-invariant causal relationships.
Map continuous real-valued space to DAG space for efficient RL.
Factor action space to enable parallel computation.
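
The second principle can be illustrated with a small sketch. This is not MARLIN's actual mapping, which the summary does not specify; it assumes one common construction in which each node gets a real-valued potential and each ordered pair gets a real-valued score, and an edge i -> j is kept only when node i has a strictly lower potential than node j. The names vector_to_dag, is_acyclic, potentials, scores, and threshold are hypothetical.

```python
import numpy as np

def vector_to_dag(potentials, scores, threshold=0.0):
    """Map real-valued parameters to a binary adjacency matrix that is acyclic.

    Assumed construction (not taken from the paper):
    potentials has shape (d,), one real number per node;
    scores has shape (d, d), one real number per ordered node pair.
    Edge i -> j is kept only if potentials[i] < potentials[j]
    and scores[i, j] > threshold.
    """
    potentials = np.asarray(potentials, dtype=float)
    scores = np.asarray(scores, dtype=float)
    # order_mask[i, j] is True when node i strictly precedes node j.
    order_mask = potentials[:, None] < potentials[None, :]
    return (order_mask & (scores > threshold)).astype(int)

def is_acyclic(adj):
    """Kahn-style check: repeatedly strip nodes that have no incoming edges."""
    adj = np.array(adj, dtype=int)
    remaining = list(range(adj.shape[0]))
    while remaining:
        sub = adj[np.ix_(remaining, remaining)]
        # Column k of sub holds the incoming edges of remaining[k].
        sources = [n for k, n in enumerate(remaining) if sub[:, k].sum() == 0]
        if not sources:
            return False  # every remaining node has a parent, so a cycle exists
        remaining = [n for n in remaining if n not in sources]
    return True

# Any real-valued input yields a valid DAG, so a policy can act in R^d x R^(d*d).
rng = np.random.default_rng(0)
d = 5
adj = vector_to_dag(rng.normal(size=d), rng.normal(size=(d, d)))
print(adj, is_acyclic(adj))
```

Because edges only point from lower to higher potential, a cycle would require a potential to be strictly smaller than itself, so acyclicity holds by construction and no separate acyclicity penalty is needed in this sketch.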

Method

MARLIN uses an intra-batch DAG generation policy and two RL agents (state-specific and state-invariant) within an incremental learning framework. It leverages a factored action space for parallelization and optimizes agents via an actor-critic approach with decoupling terms.
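
To make the idea of a factored action space concrete, here is a minimal sketch under stated assumptions. It uses a generic diagonal Gaussian policy in which every action component is sampled independently, so sampling and log-probability evaluation vectorize across components. The summary does not give MARLIN's actual parameterization or its decoupling terms, so the names factored_log_prob, sample_factored_action, mean, and log_std are hypothetical.

```python
import numpy as np

def factored_log_prob(action, mean, log_std):
    """Per-component Gaussian log-density; the joint log-prob is the sum."""
    std = np.exp(log_std)
    return (-0.5 * ((action - mean) / std) ** 2
            - log_std
            - 0.5 * np.log(2.0 * np.pi))

def sample_factored_action(mean, log_std, rng):
    """Sample all action components independently in one vectorized call."""
    noise = rng.standard_normal(mean.shape)
    action = mean + np.exp(log_std) * noise
    return action, factored_log_prob(action, mean, log_std)

# Hypothetical setup: one real-valued action component per ordered node pair.
rng = np.random.default_rng(0)
d = 4
mean = np.zeros((d, d))
log_std = np.full((d, d), -1.0)

action, per_factor = sample_factored_action(mean, log_std, rng)
# Independence across factors means the joint log-prob is a plain sum,
# which is the quantity a policy-gradient or actor-critic update would use.
joint_log_prob = per_factor.sum()
print(action.shape, joint_log_prob)
```

The design choice shown here is that no component's distribution depends on another component's sampled value, which is what allows all of them to be drawn and scored in parallel.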

In practice

Apply MARLIN for real-time causal discovery in dynamic systems.
Use MARLIN-M for enhanced parallelization and faster runtime.
Implement disentangled agents for robust adaptation to data shifts.

Topics

Multi-Agent Reinforcement Learning
Causal Discovery
Incremental Learning
Directed Acyclic Graphs
Online Causal Discovery

Best for: Research Scientist, AI Researcher, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.LG updates on arXiv.org.