Functional Cache Grafting: Robust and Rapid Code-Policy Synthesis for Embodied Agents

2026-06-12 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

FCGraft (Functional Cache Grafting) addresses two limitations of CodeLLM-based policy generation for embodied agents: slow decoding and limited robustness. It maintains a library of validated, function-level code skeletons together with their associated Transformer key–value (KV) caches. New policies are synthesized by cache grafting, which has two parts: stitching combines pre-validated function segments, and patching locally adapts code regions with minimal additional decoding. This eliminates redundant prefill computation and reuses validated control structures. In experiments on the embodied benchmarks ALFRED, TEACh, and RLBench, and on real-world robotic manipulation, all using Qwen2.5-Coder-14B, FCGraft achieves an 18.31% higher task success rate and 2.3x faster policy synthesis than RAGCache.

Key takeaway

For robotics engineers deploying CodeLLM-based control in dynamic, open-domain environments, FCGraft targets two common problems: synthesis latency and reliability. Its function-level KV caching is reported to make policy synthesis 2.3x faster and to raise task success rate by 18.31% compared with RAGCache. Consider a cache-grafting approach when embodied agents need stable, responsive control policies, especially in time-critical or unpredictable scenarios.

Key insights

Function-level KV caching enables robust and rapid code policy synthesis for embodied agents.

Principles

Function-level KV reuse boosts efficiency and robustness.
Cache-stitching and patching form an interdependent pipeline.
Semantic-aware cache management retains diverse functions.
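
To make the third principle concrete, here is a minimal sketch of one way a cache manager could favor diversity: when over capacity, it evicts the entry whose nearest neighbor in embedding space is most similar, so near-duplicate functions go first. This is an illustrative heuristic, not FCGraft's actual management scheme, which this summary does not specify. The function names, the toy three-dimensional embeddings, and the helpers `most_redundant` and `evict_to_capacity` are all assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_redundant(embeddings):
    """Return the key whose nearest neighbor is most similar to it.

    Ties go to the entry that was inserted first.
    """
    best_key, best_sim = None, -2.0
    for key, vec in embeddings.items():
        nearest = max(
            (cosine(vec, other) for k, other in embeddings.items() if k != key),
            default=-1.0,
        )
        if nearest > best_sim:
            best_key, best_sim = key, nearest
    return best_key


def evict_to_capacity(embeddings, capacity):
    """Drop the most redundant entries until at most `capacity` remain."""
    kept = dict(embeddings)
    evicted = []
    while len(kept) > capacity:
        victim = most_redundant(kept)
        evicted.append(victim)
        del kept[victim]
    return kept, evicted


# Hypothetical function embeddings: the first two are near-duplicates.
embeddings = {
    "pick_up": [1.0, 0.0, 0.0],
    "grasp_object": [0.9, 0.1, 0.0],
    "navigate_to": [0.0, 1.0, 0.0],
    "open_drawer": [0.0, 0.0, 1.0],
}
kept, evicted = evict_to_capacity(embeddings, capacity=3)
print(sorted(kept), evicted)
```

A plain least-recently-used policy could instead keep both grasping variants and drop the only navigation function; a similarity-based rule like this one is one way to avoid that.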

Method

FCGraft stores function-level KV caches in a two-tier system. Cache-stitching composes policies from cached functions, cache-patching applies localized error correction, and a semantic-aware management scheme governs which caches are retained.
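
The flow described above can be sketched as a toy simulation. It is not the authors' implementation: the KV cache is a plain list of placeholder strings, `toy_prefill` stands in for a real model prefill pass, and the "hot" and "cold" tiers are an assumed reading of the two-tier system (for example, accelerator memory backed by host memory). All class and function names are hypothetical. The sketch also ignores positional and cross-segment attention effects that stitching real Transformer KV caches would have to handle.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class CacheEntry:
    name: str   # function name, e.g. "pick_up"
    code: str   # validated function skeleton
    kv: list    # stand-in for the KV cache computed for `code`


def toy_prefill(code):
    """Stand-in for a prefill pass: one placeholder item per whitespace token."""
    return [f"kv:{tok}" for tok in code.split()]


class TwoTierCache:
    """Small hot tier with least-recently-used demotion into a cold tier."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()
        self.cold = {}

    def put(self, entry):
        self.hot[entry.name] = entry
        self.hot.move_to_end(entry.name)
        while len(self.hot) > self.hot_capacity:
            name, demoted = self.hot.popitem(last=False)
            self.cold[name] = demoted

    def get(self, name):
        if name in self.hot:
            self.hot.move_to_end(name)
            return self.hot[name]
        if name in self.cold:
            entry = self.cold.pop(name)
            self.put(entry)  # promote back into the hot tier
            return entry
        return None


def stitch(cache, names):
    """Compose a policy from cached segments; also report names not cached."""
    segments, missing = [], []
    for name in names:
        entry = cache.get(name)
        if entry is None:
            missing.append(name)
        else:
            # Copy so that later patches do not alter the validated library entry.
            segments.append(CacheEntry(entry.name, entry.code, list(entry.kv)))
    return segments, missing


def patch(segments, name, old, new):
    """Edit one segment and recompute the toy KV for that segment only.

    Returns the number of recomputed items.
    """
    for seg in segments:
        if seg.name == name:
            seg.code = seg.code.replace(old, new)
            seg.kv = toy_prefill(seg.code)
            return len(seg.kv)
    raise KeyError(name)


cache = TwoTierCache(hot_capacity=2)
for name, code in [
    ("navigate_to", "def navigate_to(target): move(target)"),
    ("pick_up", "def pick_up(obj): grasp(obj)"),
    ("place_on", "def place_on(obj, surface): release(obj, surface)"),
]:
    cache.put(CacheEntry(name, code, toy_prefill(code)))

segments, missing = stitch(cache, ["navigate_to", "pick_up", "open_door"])
recomputed = patch(segments, "pick_up", "grasp(obj)", "grasp(obj, force=0.5)")
total = sum(len(seg.kv) for seg in segments)
print(missing, recomputed, total)
```

In this toy run only the patched segment is recomputed, while the stitched `navigate_to` segment reuses its stored entries unchanged. A function absent from the library (`open_door` here) would still need to be generated and validated before it could be cached.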

In practice

Deploy on mobile and manipulator robots for dynamic tasks.
Enhance CodeLLM control in time-critical scenarios.

Topics

Functional Cache Grafting
CodeLLMs
Embodied Agents
Robotic Manipulation
Key-Value Caching
Policy Synthesis

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Robotics Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.