Voice AI Systems Are Vulnerable to Hidden Audio Attacks
Summary
New research, to be presented at the IEEE Symposium on Security and Privacy, shows that large audio-language models (LALMs) can be "hijacked" through imperceptible sounds embedded in audio clips. The modified clips sound unchanged to human listeners, yet they manipulate a model's behavior with an average success rate of 79 to 96 percent, regardless of the instructions the user provides. The technique, dubbed AudioHijack, was tested against 13 leading open models, including commercial AI voice services from Microsoft and Mistral. It forced models to run sensitive web searches, download files from attacker-controlled sources, and send emails containing user data. The method exploits a security flaw in LALM design that lets malicious instructions hide in manipulated audio. It works even when someone other than the attacker is using the model, for example when the audio reaches the model through an online video or an AI transcription service.
Key takeaway
For CTOs and VPs of Engineering overseeing AI/ML deployments, this research highlights a critical, unaddressed vulnerability in LALMs. Prioritize evaluating how well your audio-language models withstand adversarial audio, especially attacks that operate without the user noticing. Monitor the models' internal attention mechanisms, and layer additional protections rather than relying on a single-point defense, to guard against data exfiltration and unauthorized tool use.
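One way to picture an additional layer of protection against unauthorized tool use is a policy gate that sits between the model and its tools. The sketch below is illustrative only and is not taken from the research. The tool names, the `ToolCall` type, and the `gate_tool_call` function are hypothetical, and it assumes your application can tell whether untrusted audio is in the model's context.

```python
from dataclasses import dataclass, field

# Hypothetical tool names, chosen to mirror the behaviors named in the summary
# (web searches, file downloads, outbound email).
SENSITIVE_TOOLS = {"web_search", "download_file", "send_email"}


@dataclass
class ToolCall:
    """A tool invocation requested by the model (hypothetical shape)."""
    name: str
    arguments: dict = field(default_factory=dict)


def gate_tool_call(call: ToolCall, audio_in_context: bool, user_confirmed: bool) -> str:
    """Decide whether a model-requested tool call may run.

    Assumption: any audio in the context may carry hidden instructions,
    so sensitive tools need explicit user confirmation while audio is present.
    """
    if call.name not in SENSITIVE_TOOLS:
        return "allow"
    if not audio_in_context:
        return "allow"
    return "allow" if user_confirmed else "needs_confirmation"


if __name__ == "__main__":
    request = ToolCall("send_email", {"to": "someone@example.com"})
    print(gate_tool_call(request, audio_in_context=True, user_confirmed=False))
```

A gate like this does not detect adversarial audio. It only limits what a hijacked model can do without the user noticing, so it complements model-level defenses rather than replacing them.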
Key insights
Imperceptible audio manipulations can hijack large audio-language models and force them to carry out unauthorized commands with high success rates.
Principles
- Generative models are vulnerable to adversarial audio attacks.
- Context-agnostic adversarial audio signals are reusable.
Method
AudioHijack uses an optimization algorithm to adjust digital audio waveforms. It approximates the fine-grained feedback needed to steer generative models toward specific malicious behaviors, while shaping the changes to sound like natural reverberation.
In practice
- Attackers could embed malicious instructions in online videos or music clips.
- Attackers could play malicious audio during Zoom calls that an AI transcription service is processing.
Topics
- Voice AI Security
- Adversarial Audio Attacks
- Large Audio-Language Models
- AudioHijack Technique
- Generative AI Vulnerabilities
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Security Engineer, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by IEEE Spectrum.