Beyond tokens: a unified framework for latent communication in LLM-based multi-agent systems
Summary
The paper introduces a unified framework for latent communication in LLM-based multi-agent systems. It targets three limitations of natural-language protocols: high inference cost, irreversible information loss during discretization, and ambiguity. The framework categorizes eighteen representative methods published between 2024 and 2026 along three orthogonal axes: WHAT information is communicated (Embeddings, Hidden States, KV-Caches), WHICH sender–receiver alignment is used (latent-space or layer alignment), and HOW information is fused into the receiver (concatenation, prepending, mathematical operations, cross-attention, or cache restoration). The analysis identifies five major design patterns and six open challenges, including cross-architecture alignment and the security of latent channels. Its stated aim is to provide a shared vocabulary and lower the entry barrier for new researchers.
Key takeaway
For AI architects designing multi-agent LLM systems, this framework highlights critical trade-offs in communication protocols. Evaluate latent communication for tightly coupled, intermediate exchanges between agents where latency or information density is paramount, and consider KV-cache methods in particular for their significant speedups. Be aware that cross-architecture alignment and channel interpretability remain key challenges that require careful design or training.
Key insights
Latent communication in LLM multi-agent systems reduces cost and information loss by exchanging continuous representations directly.
Principles
- The KV-cache carries the most information but has the highest cost and the strongest architecture dependence.
- Homogeneous agents benefit from simple, training-free layer mapping strategies.
- Concatenation and prepending are common, simple, and often training-free fusion methods.
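To make the fusion principle concrete, here is a minimal sketch of prepending. It assumes that "prepending" means placing the sender's latent vectors in front of the receiver's input embeddings along the sequence dimension, and that both already share the receiver's hidden size. The function name `prepend_latents` and the plain-list representation are illustrative choices, not taken from the paper.

```python
from typing import List

Vector = List[float]


def prepend_latents(sender_latents: List[Vector],
                    receiver_embeddings: List[Vector]) -> List[Vector]:
    """Return a new sequence with the sender's vectors placed first.

    Assumption: every vector already has the receiver's hidden size.
    If it does not, an alignment step would be needed before fusion.
    """
    if not receiver_embeddings:
        raise ValueError("receiver_embeddings must not be empty")
    hidden_size = len(receiver_embeddings[0])
    for vec in sender_latents:
        if len(vec) != hidden_size:
            raise ValueError(
                "sender vector width does not match receiver hidden size"
            )
    # No trained parameters are involved: fusion is plain sequence joining.
    return [list(v) for v in sender_latents] + [list(v) for v in receiver_embeddings]


# Toy values for illustration only.
sender = [[0.1, 0.2], [0.3, 0.4]]
receiver = [[1.0, 1.0]]
fused = prepend_latents(sender, receiver)
```

The width check is where the alignment axis shows up: when sender and receiver do not share a hidden size, simple joining no longer applies.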
Method
The unified framework organizes latent communication methods along three axes: WHAT (information type), WHICH (sender–receiver alignment), and HOW (information fusion strategy). It uses these axes to categorize 18 methods systematically.
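The three axes can be sketched as a small data structure, which may help when cataloguing methods. The axis values below come from the summary above. The class names, the `group_by_axis` helper, and the example entries are hypothetical, and treating each axis as a single choice per method is a simplifying assumption.

```python
from dataclasses import dataclass
from enum import Enum


class What(Enum):
    EMBEDDINGS = "embeddings"
    HIDDEN_STATES = "hidden_states"
    KV_CACHE = "kv_cache"


class Which(Enum):
    LATENT_SPACE = "latent_space_alignment"
    LAYER = "layer_alignment"


class How(Enum):
    CONCATENATION = "concatenation"
    PREPENDING = "prepending"
    MATH_OPERATION = "mathematical_operation"
    CROSS_ATTENTION = "cross_attention"
    CACHE_RESTORATION = "cache_restoration"


@dataclass(frozen=True)
class MethodProfile:
    name: str
    what: What
    which: Which
    how: How


def group_by_axis(methods, axis):
    """Group method names by their value on one axis ('what', 'which', 'how')."""
    if axis not in ("what", "which", "how"):
        raise ValueError(f"unknown axis: {axis}")
    groups = {}
    for m in methods:
        groups.setdefault(getattr(m, axis), []).append(m.name)
    return groups


# Hypothetical entries for illustration only; not methods from the paper.
methods = [
    MethodProfile("example_a", What.KV_CACHE, Which.LAYER, How.CACHE_RESTORATION),
    MethodProfile("example_b", What.EMBEDDINGS, Which.LATENT_SPACE, How.PREPENDING),
    MethodProfile("example_c", What.KV_CACHE, Which.LAYER, How.CONCATENATION),
]
by_what = group_by_axis(methods, "what")
```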
In practice
- Prefer latent channels for tightly coupled, intermediate agent communication.
- Consider KV-cache methods when the needed information is available after the sender's prefill phase, so the sender's decode step can be skipped.
- Use learned alignment for heterogeneous agent architectures.
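The guidance above can be read as a rough decision rule. The sketch below encodes it as a hypothetical helper, `suggest_channel`. The parameter names, return format, and natural-language fallback are assumptions made for illustration, and the rule is a heuristic reading of the bullets rather than a procedure from the paper.

```python
from typing import Optional, Dict


def suggest_channel(tightly_coupled: bool,
                    same_architecture: bool,
                    prefill_only: bool) -> Dict[str, Optional[str]]:
    """Heuristic reading of the guidance above (illustrative only)."""
    if not tightly_coupled:
        # Assumption: loosely coupled agents keep a text protocol.
        return {"channel": "natural_language", "payload": None, "alignment": None}

    # KV-cache transfer is suggested when the sender only needs to prefill,
    # so that its decode step can be skipped.
    payload = "kv_cache" if prefill_only else "embeddings_or_hidden_states"

    # Homogeneous agents can use training-free layer mapping;
    # heterogeneous agents need a learned alignment.
    alignment = (
        "training_free_layer_mapping" if same_architecture else "learned_alignment"
    )
    return {"channel": "latent", "payload": payload, "alignment": alignment}


choice = suggest_channel(tightly_coupled=True, same_architecture=False, prefill_only=True)
```

In this example the helper suggests a latent channel carrying the KV-cache with a learned alignment, which matches the heterogeneous case in the list.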
Topics
- Latent Communication
- Multi-Agent Systems
- LLM Communication Protocols
- KV-Cache
- Hidden States
- Embeddings
- Agent Alignment
Code references
- enochliu98/Awesome-Latent-Communication
- chaudatascience/cipher_multiagent_debate
- XiaoDu-flying/Interlat
- LittleDinoC/StateDelta
- jacobfa/mot
Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.