GUIDE: Guided Updates for In-context Decision Evolution in LLM-Driven Spacecraft Operations

2026-04-16 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, long

Summary

GUIDE (Guided Updates for In-context Decision Evolution) is a non-parametric policy improvement framework designed for LLM-driven spacecraft operations, addressing the limitations of static prompting in dynamic environments. It enables cross-episode adaptation without requiring model weight updates by evolving a structured, state-conditioned "playbook" of natural-language decision rules. A lightweight acting model handles real-time control, while an offline reflection process updates the playbook based on prior mission trajectories. Evaluated in an adversarial orbital interception task within the Kerbal Space Program Differential Games environment, GUIDE consistently outperformed static baselines. The framework demonstrates that context evolution in LLM agents can function as a policy search mechanism over structured decision rules for real-time, closed-loop spacecraft interaction, particularly in scenarios requiring adaptive reasoning under uncertainty.

Key takeaway

For research scientists developing autonomous agents for dynamic, real-time control systems, GUIDE offers a compelling alternative to traditional weight-update learning. You should consider implementing a "teacher-student" architecture where a lightweight online agent is guided by an evolving, natural-language playbook. This approach allows for continuous adaptation to unpredictable environments, like adversarial space operations, without the computational burden or deployment constraints of retraining large models.

Key insights

GUIDE lets an LLM agent improve from one episode to the next by evolving natural-language decision rules offline, with no weight updates, while a fixed acting model handles real-time control.

Principles

Separate online execution from offline policy improvement.
Context evolution can serve as a learnable policy object.
Structured natural language rules can encode adaptive behavior.
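
The first principle can be sketched as a loop in which the playbook stays frozen while an episode runs and changes only between episodes. This is a minimal illustration, not the paper's implementation: `evolve_playbook`, `run_episode`, and `reflect` are hypothetical names, and representing the playbook as a plain string is an assumption.

```python
from typing import Callable, Dict, List


def evolve_playbook(
    playbook: str,
    run_episode: Callable[[str], Dict],
    reflect: Callable[[str, List[Dict]], str],
    n_episodes: int,
) -> str:
    """Alternate online execution and offline playbook revision.

    run_episode: hypothetical callable that flies one episode with the
        acting model conditioned on the given (frozen) playbook and
        returns a trajectory record.
    reflect: hypothetical callable that reads the current playbook and
        past trajectories and returns a revised playbook.
    """
    history: List[Dict] = []
    for _ in range(n_episodes):
        # Online phase: the playbook is read-only during the episode.
        trajectory = run_episode(playbook)
        history.append(trajectory)
        # Offline phase: the playbook is revised only between episodes.
        playbook = reflect(playbook, history)
    return playbook
```

Because the acting model's weights never change in this loop, the playbook text is the only object being searched over, which is the sense in which context evolution acts as a policy object.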

Method

GUIDE uses a "teacher-student" approach: a fixed acting model executes real-time control based on a dynamic playbook, while an offline meta-reasoning LLM (Reflector/Curator) updates this playbook via ADD/UPDATE/REMOVE operations using $\epsilon$-biased reflection sampling from past trajectories.
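
The playbook edits described above can be pictured with a small data structure. The sketch below rests on several assumptions: the `Rule` fields, the rule-ID keys, and the rendered "IF ... THEN ..." format are all hypothetical, and the summary does not say how $\epsilon$-biased sampling is biased. Here it is read as "explore a uniformly random past trajectory with probability epsilon, otherwise reflect on the lowest-scoring one", which is only one plausible interpretation.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Rule:
    condition: str  # natural-language state condition (hypothetical field)
    action: str     # natural-language guidance for the acting model


@dataclass
class Playbook:
    rules: Dict[str, Rule] = field(default_factory=dict)

    def apply(self, op: str, rule_id: str, rule: Optional[Rule] = None) -> None:
        """Apply one ADD / UPDATE / REMOVE edit proposed by the reflection step."""
        if op in ("ADD", "UPDATE") and rule is None:
            raise ValueError(f"{op} requires a rule")
        if op == "ADD":
            if rule_id in self.rules:
                raise KeyError(f"rule {rule_id!r} already exists")
            self.rules[rule_id] = rule
        elif op == "UPDATE":
            if rule_id not in self.rules:
                raise KeyError(f"rule {rule_id!r} does not exist")
            self.rules[rule_id] = rule
        elif op == "REMOVE":
            del self.rules[rule_id]  # raises KeyError if the rule is missing
        else:
            raise ValueError(f"unknown operation {op!r}")

    def render(self) -> str:
        """Serialize the rules as text to place in the acting model's context."""
        return "\n".join(
            f"- IF {r.condition} THEN {r.action}" for r in self.rules.values()
        )


def sample_for_reflection(
    trajectories: List[dict], epsilon: float, rng: random.Random
) -> dict:
    """ASSUMED reading of epsilon-biased sampling, not taken from the paper:
    with probability epsilon pick uniformly at random, otherwise pick the
    trajectory with the lowest 'score' (a hypothetical key)."""
    if rng.random() < epsilon:
        return rng.choice(trajectories)
    return min(trajectories, key=lambda t: t["score"])
```

A reflection pass under these assumptions would pick a trajectory with `sample_for_reflection`, ask the offline LLM for edits, and feed each edit to `Playbook.apply` before the next episode.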

In practice

Use a playbook of state-conditioned rules for LLM adaptation.
Implement a two-tiered guard-avoidance regime for spacecraft.
Apply UCB1 for selecting among multiple playbook versions.
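
For the last item, standard UCB1 treats each playbook version as a bandit arm and picks the version with the highest mean reward plus an exploration bonus. The sketch below is the textbook rule, not the paper's code: `ucb1_select` and its arguments are hypothetical names, and it assumes episode rewards are scaled to [0, 1].

```python
import math
from typing import List


def ucb1_select(
    counts: List[int], total_rewards: List[float], c: float = math.sqrt(2)
) -> int:
    """Return the index of the playbook version to run next.

    counts[i]        -- episodes already run with version i
    total_rewards[i] -- summed reward for version i (assumed in [0, 1] per episode)
    c                -- exploration weight; sqrt(2) gives the classic UCB1 bonus
    """
    # Try every version once before comparing confidence bounds.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    total = sum(counts)
    scores = [
        total_rewards[i] / counts[i] + c * math.sqrt(math.log(total) / counts[i])
        for i in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)
```

The bonus shrinks as a version accumulates episodes, so a newly curated playbook still gets tried even when an older version has a higher average so far.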

Topics

GUIDE Framework
LLM Spacecraft Operations
In-context Policy Evolution
Natural Language Playbook
Kerbal Space Program

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Robotics Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.