Do LLMs Benefit From Their Own Words?

· Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

A recent study investigates whether large language models (LLMs) benefit from conditioning on their own prior responses in multi-turn conversations. Researchers compared standard full-context prompting with a user-turn-only approach, which omits all previous assistant responses, across three open reasoning models and one proprietary model. Surprisingly, removing prior assistant responses did not affect response quality on a large fraction of turns, and this method reduced cumulative context lengths by up to 10x. The analysis revealed that 36.4% of multi-turn conversations consist of self-contained prompts, where follow-up prompts provide sufficient instruction from user turns alone. In cases where user-turn-only prompting outperformed full context, the study identified "context pollution," where models over-conditioned on their own previous responses, leading to errors, hallucinations, or stylistic artifacts that propagated across turns. These findings led to the design of a context-filtering approach that selectively omits assistant-side context.

Key takeaway

For AI Architects and MLOps Engineers optimizing LLM deployments, consider implementing selective context filtering or user-turn-only prompting. This approach can significantly reduce memory consumption by up to 10x and mitigate "context pollution" errors, leading to improved response quality and more efficient inference, especially in multi-turn conversational agents. Evaluate your specific use cases to identify turns where assistant history is redundant or detrimental.

Key insights

LLMs often do not benefit from their own prior responses, and omitting them can improve quality and reduce context length.

Principles

Method

The study compared full-context prompting with user-turn-only prompting, analyzing response quality and context length across multiple LLMs and identifying instances of context pollution.

In practice

Topics

Best for: AI Architect, MLOps Engineer, AI Engineer, AI Researcher, AI Scientist, NLP Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.