Mistletoe: Stealthy Acceleration-Collapse Attacks on Speculative Decoding
Summary
Researchers have identified a new mechanism-level vulnerability in speculative decoding, a technique used to accelerate large language model (LLM) inference. The vulnerability, exploited by what the authors term an "acceleration-collapse attack," stems from the drafter model's inherently imperfect approximation of the target model's distribution. The proposed attack, Mistletoe, subtly reduces the average accepted length ($\tau$) of draft tokens, which is critical for speculative decoding's efficiency, without visibly altering the target model's output quality or perplexity. Mistletoe achieves this by jointly optimizing a degradation objective, which decreases drafter-target agreement, and a semantic-preservation objective, using null-space projection to resolve conflicts between the two. Experiments on Vicuna-7B and Vicuna-13B across several speculative decoding systems (Medusa, Hydra, EAGLE, EAGLE-2, EAGLE-3) and datasets (MT-Bench, HumanEval, GSM8K) consistently show substantial reductions in speedup and $\tau$, demonstrating the attack's effectiveness and generalizability.
Key takeaway
Engineering teams deploying LLMs with speculative decoding should recognize that acceleration mechanisms introduce an attack surface beyond traditional output robustness. Monitor for abnormal reductions in average accepted length ($\tau$), and design verification that is resilient to adversarial prompt perturbations. This proactive approach helps mitigate performance-robustness risks and prevent stealthy efficiency degradation.
Key insights
Speculative decoding's efficiency is vulnerable to stealthy attacks that degrade draft token acceptance without altering output semantics.
Principles
- Drafter-target mismatch creates an attack surface.
- Preserving semantics is crucial for attack stealthiness.
- Null-space projection resolves conflicting optimization objectives.
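To make the first principle concrete, the sketch below shows how drafter-target mismatch can translate into fewer accepted draft tokens. It assumes standard speculative sampling with a single chain of draft tokens and an i.i.d. per-token acceptance rate, which is a simplification: the tree-based systems evaluated in the paper (Medusa, EAGLE, and others) use different acceptance schemes. The function names, toy distributions, and draft length are hypothetical, and the per-step count here includes the token produced by the verification pass itself, which may not match the paper's convention for $\tau$.

```python
def acceptance_rate(target_probs, drafter_probs):
    """Per-token acceptance probability under standard speculative sampling.

    Equals sum_x min(p(x), q(x)), i.e. 1 minus the total variation distance
    between the target distribution p and the drafter distribution q.
    """
    return sum(min(p, q) for p, q in zip(target_probs, drafter_probs))


def expected_tokens_per_step(alpha, draft_len):
    """Expected tokens generated per verification step.

    Assumes a chain of `draft_len` drafted tokens, each accepted
    independently with probability `alpha` (a simplifying assumption).
    """
    if alpha >= 1.0:
        return draft_len + 1
    return (1 - alpha ** (draft_len + 1)) / (1 - alpha)


# Toy next-token distributions over a 3-token vocabulary (illustrative only).
target = [0.7, 0.2, 0.1]
aligned_drafter = [0.6, 0.3, 0.1]      # close to the target
mismatched_drafter = [0.2, 0.3, 0.5]   # pushed away from the target

alpha_aligned = acceptance_rate(target, aligned_drafter)
alpha_mismatched = acceptance_rate(target, mismatched_drafter)

for name, alpha in [("aligned", alpha_aligned), ("mismatched", alpha_mismatched)]:
    tokens = expected_tokens_per_step(alpha, draft_len=4)
    print(f"{name}: acceptance={alpha:.2f}, tokens/step={tokens:.2f}")
```

Under these assumptions, any input that widens the gap between the two distributions lowers the acceptance rate and therefore the tokens gained per verification step, even though the target model's own output distribution is unchanged.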
Method
Mistletoe optimizes a discrete suffix by maximizing KL divergence between target and drafter distributions for proposed tokens, while constraining semantic drift via null-space projection and KL-threshold filtering.
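The two ingredients named here can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes the degradation and semantic-preservation objectives each yield a gradient vector, and it resolves the conflict by removing from the degradation gradient its component along the semantic gradient (a projection onto the null space of a single constraint direction). The paper's actual projection operator, the direction of the KL term, and how the result drives discrete suffix updates are not specified in this summary; all names below are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities.

    Which distribution plays the role of p (target or drafter) is an
    assumption here; the summary does not state the direction.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def project_out(grad_degrade, grad_semantic):
    """Remove the component of grad_degrade that lies along grad_semantic.

    The result is orthogonal to grad_semantic, so a small step along it
    leaves the semantic objective unchanged to first order.
    """
    norm_sq = sum(s * s for s in grad_semantic)
    if norm_sq == 0.0:
        return list(grad_degrade)
    coef = sum(d * s for d, s in zip(grad_degrade, grad_semantic)) / norm_sq
    return [d - coef * s for d, s in zip(grad_degrade, grad_semantic)]


# Toy gradients (illustrative only).
grad_degrade = [1.0, 2.0, 0.0]    # direction that increases target-drafter KL
grad_semantic = [0.0, 1.0, 0.0]   # direction that changes output semantics

projected = project_out(grad_degrade, grad_semantic)
print(projected)  # [1.0, 0.0, 0.0]

# KL grows as the drafter distribution moves away from the target's.
target = [0.7, 0.2, 0.1]
print(kl_divergence(target, [0.6, 0.3, 0.1]) < kl_divergence(target, [0.2, 0.3, 0.5]))  # True
```

The design intent of the projection is that the attack only moves in directions the semantic objective is insensitive to, which is what keeps the perturbation stealthy.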
In practice
- Monitor average accepted length ($\tau$) for anomalies.
- Implement robust verification mechanisms.
- Treat adversarial prompt perturbations as part of the LLM security threat model.
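The monitoring recommendation above could be prototyped roughly as follows. This is a sketch under stated assumptions: the serving stack is assumed to expose the number of draft tokens accepted at each verification step, and a healthy baseline $\tau$ is assumed to be known from normal traffic. The class name, window size, and drop ratio are hypothetical and would need tuning per deployment.

```python
from collections import deque
from statistics import mean


class AcceptedLengthMonitor:
    """Flags a sustained drop in average accepted length over a rolling window."""

    def __init__(self, baseline_tau, window=50, drop_ratio=0.7):
        self.baseline_tau = baseline_tau    # tau observed on normal traffic
        self.drop_ratio = drop_ratio        # alert when the mean falls below this fraction
        self.samples = deque(maxlen=window)

    def record(self, accepted_len):
        """Add one verification step's accepted length; return True on anomaly."""
        self.samples.append(accepted_len)
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough data yet to judge
        return mean(self.samples) < self.drop_ratio * self.baseline_tau


# Usage: normal traffic, then a run of steps where almost nothing is accepted.
monitor = AcceptedLengthMonitor(baseline_tau=4.0, window=5, drop_ratio=0.7)
normal_flags = [monitor.record(n) for n in [4, 4, 4, 4, 4]]
attack_flags = [monitor.record(1) for _ in range(5)]
print(any(normal_flags), attack_flags[-1])  # False True
```

A rolling mean is only one possible detector. Tracking $\tau$ per prompt or per client, rather than globally, may make it easier to attribute a drop to specific inputs.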
Topics
- Mistletoe Attack
- Speculative Decoding
- LLM Inference Acceleration
- Null-Space Projection
- Drafter-Target Mismatch
Best for: Research Scientist, CTO, VP of Engineering/Data, AI Scientist, Machine Learning Engineer, AI Security Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.