Multi-action Tangled Program Graphs for Multi-task Reinforcement Learning with Continuous Control

Source: cs.AI updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems) · Depth: Expert, short

Summary

Researchers from IETR introduce Multi-Action Tangled Program Graphs (MATPG), a Genetic Programming (GP) algorithm for Multi-Task Reinforcement Learning (MTRL) in continuous control environments. MATPG extends the Tangled Program Graph (TPG) algorithm, originally designed for discrete-action MTRL, by aggregating MAPLE agents and evolving a control flow that decides which agent to activate. In single-task RL, MATPG was shown to perform similarly to MAPLE. The work also introduces a new benchmark built on the MuJoCo Half Cheetah environment from Gymnasium, with five distinct, randomly positioned obstacles that each require a different behavior. On this benchmark, MATPG combined with lexicase selection achieves superior performance in multi-task continuous control. The study also highlights that the evolved decision flow graph is fully interpretable.
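Lexicase selection, mentioned above, picks parents by filtering the population on one test case at a time in random order, which tends to preserve specialists that excel on a single task. The following is a minimal sketch of the standard algorithm, not the authors' implementation: the function name `lexicase_select`, the per-task score table, and the "higher is better" convention are illustrative assumptions, and the summary does not say which lexicase variant the paper uses.

```python
import random


def lexicase_select(population, scores, rng=random):
    """Select one parent with plain lexicase selection.

    scores[i][t] is the (assumed) score of individual i on task t,
    where higher is better. Names and layout are illustrative only.
    """
    candidates = list(range(len(population)))
    tasks = list(range(len(scores[0])))
    rng.shuffle(tasks)  # a fresh random task order for every selection event

    for t in tasks:
        # Keep only the candidates that are best on the current task.
        best = max(scores[i][t] for i in candidates)
        candidates = [i for i in candidates if scores[i][t] == best]
        if len(candidates) == 1:
            break

    # Any remaining ties are broken uniformly at random.
    return population[rng.choice(candidates)]
```

With real-valued returns, exact ties are rare, so practitioners often relax the "best" filter with a tolerance (epsilon-lexicase); that variant is not shown here.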

Key takeaway

For research scientists working on multi-task reinforcement learning, MATPG offers an interpretable genetic programming approach to continuous control. Consider evaluating MATPG with lexicase selection in your experiments: on the obstacle-based MuJoCo Half Cheetah benchmark it achieved superior performance, and its evolved decision flow graph can be inspected directly to see how decisions are made.

Key insights

MATPG combines genetic programming with agent aggregation to deliver interpretable, high-performing multi-task reinforcement learning for continuous control.

Principles

Method

MATPG aggregates MAPLE agents and establishes a control flow to activate them, enabling a single model to learn multiple behaviors in continuous MTRL environments.
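The idea of a control flow that activates one of several aggregated agents can be sketched as a small decision graph. This is a hypothetical illustration in the spirit of TPG-style bidding, not the paper's data structures: the `Agent`, `Edge`, and `Team` classes, the `decide` function, the bidding rule, the hand-written lambdas standing in for evolved programs, and the clipping to [-1, 1] are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Union

Observation = Sequence[float]


@dataclass
class Agent:
    """Leaf node: a stand-in for a multi-action agent with one
    program per continuous action dimension (hypothetical layout)."""
    name: str
    programs: List[Callable[[Observation], float]]

    def act(self, obs: Observation) -> List[float]:
        # Assumed action range of [-1, 1]; outputs are clipped to it.
        return [max(-1.0, min(1.0, p(obs))) for p in self.programs]


@dataclass
class Edge:
    bid: Callable[[Observation], float]  # program that bids on the observation
    target: Union["Team", Agent]         # next decision node or a leaf agent


@dataclass
class Team:
    """Internal node: follows the edge whose program bids highest."""
    edges: List[Edge]


def decide(root: Team, obs: Observation):
    """Walk the graph from the root until an agent is reached.

    Assumes every team has at least one edge leading to an Agent,
    so the walk always terminates even if teams reference each other.
    """
    node, visited = root, set()
    while isinstance(node, Team):
        visited.add(id(node))
        # Ignore edges that would revisit a team, to avoid cycles.
        options = [e for e in node.edges
                   if not (isinstance(e.target, Team) and id(e.target) in visited)]
        node = max(options, key=lambda e: e.bid(obs)).target
    return node.name, node.act(obs)


# Toy graph: obs[0] is a made-up "obstacle proximity" feature.
run = Agent("run", [lambda o: 1.0, lambda o: 0.5])
jump = Agent("jump", [lambda o: -2.0, lambda o: 0.9])
root = Team([Edge(lambda o: 1.0 - o[0], run), Edge(lambda o: o[0], jump)])
```

Calling `decide(root, [0.9])` routes to the `jump` agent and `decide(root, [0.1])` to the `run` agent. Because the returned name records which agent was activated, the path through the graph can be read off directly, which is the kind of interpretability the summary refers to.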

Best for: Research Scientist, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.