ReasoningBank: Enabling agents to learn from experience

2026-04-21 · Source: The latest research from Google · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Expert, medium

Summary

ReasoningBank, an agent memory framework developed by Jun Yan and Chen-Yu Lee at Google Cloud, enables AI agents to keep learning from both successful and failed experiences after deployment. Unlike existing memory methods that store exhaustive action records or only successful workflows, ReasoningBank distills high-level, transferable reasoning patterns from successes and strategic guardrails from failures. Each memory item has a title, a description, and distilled content, and the framework runs as a continuous loop of retrieval, extraction, and consolidation. When combined with memory-aware test-time scaling (MaTTS), it gains richer learning signals from parallel and sequential exploration. Evaluated with Gemini-2.5-Flash on the WebArena and SWE-Bench-Verified benchmarks, ReasoningBank improved success rates by 8.3% and 4.6%, respectively, and reduced execution steps compared to memory-free baselines.
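The three-field memory item described above can be pictured with a minimal sketch. The class name `MemoryItem`, the `retrieve` helper, and the word-overlap scoring are all assumptions made for illustration; the summary does not say how ReasoningBank retrieves items.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryItem:
    """One memory entry with the three fields named in the summary."""
    title: str        # short handle for the strategy
    description: str  # brief note on when the strategy applies
    content: str      # the distilled reasoning pattern or guardrail


def _tokens(text: str) -> set[str]:
    # Lowercased whitespace tokens; deliberately simplistic.
    return set(text.lower().split())


def retrieve(bank: list[MemoryItem], query: str, k: int = 1) -> list[MemoryItem]:
    """Return the k items whose title and description share the most words with the query.

    Word overlap is a hypothetical stand-in for whatever retrieval the real system uses.
    """
    q = _tokens(query)
    ranked = sorted(
        bank,
        key=lambda item: len(q & _tokens(item.title + " " + item.description)),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical example items, not taken from the original article.
bank = [
    MemoryItem(
        "Check pagination",
        "when listing orders on a web page",
        "Look for a next-page control before concluding the list is complete.",
    ),
    MemoryItem(
        "Reproduce before patching",
        "when fixing a failing test in a repository",
        "Run the failing test first and confirm the error before editing code.",
    ),
]
top = retrieve(bank, "find all orders on the web shop page")
```

Keeping the title and description short and separate from the content lets retrieval match on a compact summary while the agent receives the full distilled guidance.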

Key takeaway

For AI architects designing persistent, long-running agents, ReasoningBank makes it possible to keep learning from both successes and failures after deployment. You should consider integrating this framework to move beyond simple trajectory or workflow memories: it distills higher-level strategic insights and, in the reported evaluation, raised success rates while reducing execution steps. The result can be more robust and adaptable agents whose reasoning evolves over time.

Key insights

ReasoningBank enables AI agents to learn generalizable strategies from both successes and failures for continuous self-evolution.

Principles

Distill high-level reasoning patterns, not just actions.
Learn from both successful and failed experiences.
Memory-aware scaling enhances learning signals.

Method

ReasoningBank uses a closed-loop workflow of memory retrieval, environmental interaction, LLM-as-a-judge self-assessment, and distillation of insights from trajectories into structured memory items.
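The closed loop described above can be sketched as a single function that wires the four stages together. Everything here is a toy under stated assumptions: `run_task`, the stub agent, the stub judge, and the append-only consolidation are hypothetical stand-ins, and in a real system the acting, judging, and distilling steps would be LLM calls.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class MemoryItem:
    title: str
    description: str
    content: str


def run_task(
    task: str,
    bank: list[MemoryItem],
    retrieve: Callable[[list[MemoryItem], str], list[MemoryItem]],
    act: Callable[[str, list[MemoryItem]], list[str]],
    judge: Callable[[str, list[str]], bool],
    distill: Callable[[str, list[str], bool], list[MemoryItem]],
) -> bool:
    """One pass of the loop: retrieve, interact, self-assess, distill, consolidate."""
    hints = retrieve(bank, task)            # 1. memory retrieval
    trajectory = act(task, hints)           # 2. environment interaction
    success = judge(task, trajectory)       # 3. self-assessment (an LLM judge in practice)
    bank.extend(distill(task, trajectory, success))  # 4. distill, then append (assumed consolidation)
    return success


# --- Toy stand-ins so the sketch runs end to end (all hypothetical) ---
def retrieve_all(bank: list[MemoryItem], task: str) -> list[MemoryItem]:
    return list(bank)


def toy_act(task: str, hints: list[MemoryItem]) -> list[str]:
    # The toy agent only searches first when it has at least one hint.
    return ["search", "submit"] if hints else ["submit"]


def toy_judge(task: str, trajectory: list[str]) -> bool:
    return "search" in trajectory


def toy_distill(task: str, trajectory: list[str], success: bool) -> list[MemoryItem]:
    if success:
        return [MemoryItem("Search first", "worked on: " + task,
                           "Searching before submitting led to success.")]
    return [MemoryItem("Do not submit blind", "failed on: " + task,
                       "Submitting without searching failed; gather evidence first.")]


bank: list[MemoryItem] = []
first = run_task("find order total", bank, retrieve_all, toy_act, toy_judge, toy_distill)
second = run_task("find order total", bank, retrieve_all, toy_act, toy_judge, toy_distill)
```

In this toy run the first attempt fails and leaves a guardrail in the bank, and the second attempt retrieves it and succeeds, which is the failure-to-guardrail path the method emphasizes.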

In practice

Use LLM-as-a-judge for self-assessment.
Incorporate counterfactual signals from failures.
Apply parallel or sequential scaling for richer learning.
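A rough sketch of the parallel-scaling idea from the list above: run several independent attempts at the same task, judge each one, and contrast the successes with the failures to find what made the difference. The function names, the set-based `contrast` heuristic, and the toy rollout are assumptions for illustration only; the source does not describe how MaTTS compares trajectories, and a real system would likely use an LLM for that step.

```python
from __future__ import annotations

from typing import Callable


def parallel_rollouts(
    task: str,
    k: int,
    rollout: Callable[[str, int], list[str]],
    judge: Callable[[str, list[str]], bool],
) -> tuple[list[list[str]], list[list[str]]]:
    """Run k independent attempts and split them by the judge's verdict.

    Attempts are run in a plain loop here; since they do not depend on
    each other, they could be executed concurrently.
    """
    wins: list[list[str]] = []
    losses: list[list[str]] = []
    for i in range(k):
        trajectory = rollout(task, i)
        (wins if judge(task, trajectory) else losses).append(trajectory)
    return wins, losses


def contrast(wins: list[list[str]], losses: list[list[str]]) -> set[str]:
    """Actions present in every successful attempt and absent from every failed one."""
    if not wins:
        return set()
    in_all_wins = set.intersection(*(set(t) for t in wins))
    in_any_loss = set().union(*(set(t) for t in losses))
    return in_all_wins - in_any_loss


# --- Toy stand-ins (hypothetical) ---
def toy_rollout(task: str, i: int) -> list[str]:
    # Even-numbered attempts apply a filter first; odd-numbered ones skip it.
    if i % 2 == 0:
        return ["open_list", "apply_filter", "read_total"]
    return ["open_list", "read_total"]


def toy_judge(task: str, trajectory: list[str]) -> bool:
    return "apply_filter" in trajectory


wins, losses = parallel_rollouts("sum filtered orders", 4, toy_rollout, toy_judge)
signal = contrast(wins, losses)
```

The contrast step is where failed attempts earn their keep: without them, every action in a successful trajectory looks equally important.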

Topics

ReasoningBank
Agent Memory Framework
Learning from Failure
Memory-aware Test-Time Scaling
WebArena Benchmark

Code references

google-research/reasoning-bank

Best for: AI Architect, Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by The latest research from Google.