FORGE: Self-Evolving Agent Memory With No Weight Updates via Population Broadcast

Source: Artificial Intelligence · Field: Technology & Digital: Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, quick

Summary

FORGE (Failure-Optimized Reflective Graduation and Evolution) is a staged, population-based protocol that improves LLM agent decision-making through self-generated natural-language memory injected into the prompt, with no gradient updates. Its inner loop is Reflexion-style: a reflection agent turns failed trajectories into reusable knowledge artifacts, stored as textual heuristics (Rules), few-shot demonstrations (Examples), or a combination of both (Mixed). Its outer loop propagates the best-performing agent's memory to the rest of the population at each stage and graduates instances that have converged. The evaluation used CybORG CAGE-2, a 30-step stochastic network-defense POMDP, with four models: Gemini-2.5-Flash-Lite, Grok-4-Fast, Llama-4-Maverick, and Qwen3-235B. Across 12 model-representation conditions (four models, three memory representations), FORGE improved average evaluation return by 1.7x to 7.7x over zero-shot baselines and by 29% to 72% over Reflexion baselines, and reduced major-failure rates to approximately 1%.
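
To make the three memory representations concrete, here is a minimal Python sketch of how Rules, Examples, and Mixed memory might be rendered into a prompt. The `Memory` class, the `build_prompt` function, and the prompt layout are all hypothetical names and choices made for illustration; the summary does not describe FORGE's actual prompt format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Memory:
    """Prompt-injected natural-language memory (hypothetical structure)."""
    rules: List[str] = field(default_factory=list)     # textual heuristics
    examples: List[str] = field(default_factory=list)  # few-shot demonstrations


def build_prompt(task: str, memory: Memory, mode: str) -> str:
    """Render a prompt for one representation: 'rules', 'examples', or 'mixed'.

    The section headers and ordering are assumptions, not the paper's format.
    """
    if mode not in {"rules", "examples", "mixed"}:
        raise ValueError(f"unknown mode: {mode}")
    parts = [task]
    if mode in {"rules", "mixed"} and memory.rules:
        parts.append("Rules:\n" + "\n".join(f"- {r}" for r in memory.rules))
    if mode in {"examples", "mixed"} and memory.examples:
        parts.append("Examples:\n" + "\n\n".join(memory.examples))
    return "\n\n".join(parts)
```

Because the memory lives entirely in the prompt, switching representation is a rendering choice and never touches model weights.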

Key takeaway

For research scientists developing LLM agents for complex, stochastic environments, FORGE offers a way to improve agent performance and reduce failure rates without costly model retraining. Consider a population-based memory evolution system with a broadcast mechanism, particularly if your models show high zero-shot failure rates: the reported results suggest this approach can narrow capability gaps and improve decision-making in challenging POMDPs such as network defense.

Key insights

FORGE lets LLM agents evolve their own memory via population broadcast, improving decision-making without weight updates: on CybORG CAGE-2 it raised average evaluation return by 1.7x to 7.7x over zero-shot baselines.

Method

FORGE uses a staged, population-based protocol with an inner Reflexion-style loop for memory generation (Rules, Examples, Mixed) and an outer loop for propagating best-performing memory and graduating converged instances.
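
The two loops can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: `run_episode` and `reflect` are hypothetical callables standing in for the environment rollout and the reflection agent, a "failure" is assumed to be a return below `failure_threshold`, and "converged" is assumed to mean a stage-average return at or above `graduate_threshold`. The source does not specify either criterion.

```python
from typing import Callable, List


def run_forge_sketch(
    population_size: int,
    stages: int,
    episodes_per_stage: int,
    run_episode: Callable[[List[str]], float],   # hypothetical: memory -> episode return
    reflect: Callable[[List[str], float], str],  # hypothetical: failure -> new memory item
    failure_threshold: float,                    # assumed failure criterion
    graduate_threshold: float,                   # assumed convergence criterion
) -> List[str]:
    """Return the best memory found; no model weights are touched anywhere."""
    memories: List[List[str]] = [[] for _ in range(population_size)]
    active = set(range(population_size))
    best_memory: List[str] = []
    best_score = float("-inf")

    for _ in range(stages):
        if not active:
            break  # every instance has graduated
        stage_scores = {}
        for i in sorted(active):
            returns = []
            for _ in range(episodes_per_stage):
                ret = run_episode(memories[i])
                returns.append(ret)
                # Inner loop: reflect only on failed trajectories.
                if ret < failure_threshold:
                    memories[i].append(reflect(memories[i], ret))
            stage_scores[i] = sum(returns) / len(returns)

        # Outer loop: pick the best-performing agent of this stage.
        champion = max(stage_scores, key=stage_scores.get)
        if stage_scores[champion] > best_score:
            best_score = stage_scores[champion]
            best_memory = list(memories[champion])

        # Broadcast the champion's memory to every active agent.
        for i in active:
            memories[i] = list(memories[champion])

        # Graduate instances whose stage score meets the assumed criterion.
        active = {i for i in active if stage_scores[i] < graduate_threshold}

    return best_memory
```

Copying the champion's memory (rather than sharing one list) lets each agent diverge again in the next stage, which is what keeps the population useful as a search mechanism.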

Best for: Research Scientist, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.