Decoupling Numerical and Structural Parameters: An Empirical Study on Adaptive Genetic Algorithms via Deep Reinforcement Learning for the Large-Scale TSP

2026-03-24 · Source: cs.NE updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Combinatorial Optimization · Depth: Advanced, long

Summary

A new study introduces DRLGA, a dual-level Deep Reinforcement Learning Genetic Algorithm framework, to address the challenge of parameter configuration in Evolutionary Algorithms (EAs) for large-scale Traveling Salesman Problems (TSPs). This framework decouples control variables into numerical parameters (e.g., crossover and mutation rates) and structural parameters (e.g., population size and operator switching), hypothesizing distinct roles. Utilizing a Recurrent PPO agent with LSTM, DRLGA dynamically regulates these parameters, acting as a probe into evolutionary dynamics. Experiments show DRLGA policies outperform static baselines, reducing the optimality gap by approximately 45% on the largest tested instance (rl5915). Ablation analysis reveals that structural plasticity is the decisive factor in preventing stagnation and escaping local optima, while numerical tuning offers local refinement. The source code is available on GitHub for reproducibility.

Key takeaway

For research scientists developing optimization algorithms for NP-hard problems, this work suggests a paradigm shift. You should prioritize designing systems that dynamically adapt structural parameters, such as population size and operator modes, over merely fine-tuning numerical rates. Focusing on structural plasticity will yield greater scalability and prevent stagnation in high-dimensional search spaces, leading to more robust and effective solutions for complex problems like the TSP.

Key insights

Dynamic structural adaptation is more critical than numerical fine-tuning for scalable evolutionary algorithms.

Principles

Structural plasticity prevents stagnation.
Numerical tuning offers local refinement.
Zero-shot generalization is achievable with scale-invariant state representation.

Method

A dual-level DRL framework with a Recurrent PPO agent and LSTM dynamically controls numerical and structural parameters in Genetic Algorithms, treating optimization as a POMDP.

In practice

Prioritize dynamic structural reconfiguration in algorithm design.
Use LSTM-based DRL for non-Markovian evolutionary histories.
Employ curriculum learning for robust generalization.

Topics

Deep Reinforcement Learning
Genetic Algorithms
Traveling Salesman Problem
Adaptive Parameter Control
Combinatorial Optimization

Code references

StarDream1314/DRLGA-TSP

Best for: Research Scientist, AI Researcher, AI Scientist, Machine Learning Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.NE updates on arXiv.org.