Voice AI Systems Are Vulnerable to Hidden Audio Attacks

2026-05-17 · Source: IEEE Spectrum · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy · Depth: Advanced, short

Summary

New research, to be presented at the IEEE Symposium on Security and Privacy, reveals that large audio-language models (LALMs) can be "hijacked" through imperceptible sounds embedded in audio clips. The modifications are undetectable by human ears, yet they manipulate a model's behavior with an average success rate of 79 to 96 percent, regardless of user-provided instructions. The technique, dubbed AudioHijack, was tested against 13 leading open models, including commercial AI voice services from Microsoft and Mistral. It forced models to run sensitive web searches, download files from attacker-controlled sources, and send emails containing user data. The method exploits a security flaw in LALM design that lets malicious instructions hide inside manipulated audio. The attacker does not need to be the one using the model: the audio can reach a victim's model through online videos or AI transcription services.

Key takeaway

For CTOs and VPs of Engineering overseeing AI/ML deployments, this research highlights a critical, unaddressed vulnerability in LALMs. You should prioritize evaluating the resilience of your audio-language models against adversarial audio attacks, especially those that can operate without user awareness. Implement robust monitoring of internal attention mechanisms and explore additional layers of protection beyond single-point defenses to safeguard against data exfiltration and unauthorized tool use.
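The takeaway does not say how attention monitoring would be implemented. As one illustrative possibility only, the sketch below measures what share of a model's attention mass lands on audio tokens and flags a request when that share is unusually high. The function names, the attention-matrix layout (rows are generated tokens, columns are input tokens), and the 0.8 threshold are all assumptions made for this example, not details from the research.

```python
import numpy as np

def audio_attention_share(attn, is_audio):
    """Fraction of total attention mass placed on audio tokens.

    attn: 2-D array, rows = generated tokens, columns = input tokens (assumed layout).
    is_audio: boolean mask over input tokens, True where the token comes from audio.
    """
    attn = np.asarray(attn, dtype=float)
    is_audio = np.asarray(is_audio, dtype=bool)
    total = attn.sum()
    if total == 0:
        return 0.0
    return float(attn[:, is_audio].sum() / total)

def flag_suspicious(attn, is_audio, threshold=0.8):
    """Hypothetical rule: flag when audio tokens dominate attention.

    The threshold is arbitrary here and would need calibration on benign traffic.
    """
    return audio_attention_share(attn, is_audio) > threshold

# Toy attention maps over two input tokens: [text instruction, audio].
benign = [[0.7, 0.3], [0.6, 0.4]]
hijacked = [[0.1, 0.9], [0.2, 0.8]]
mask = [False, True]

benign_flag = flag_suspicious(benign, mask)
hijacked_flag = flag_suspicious(hijacked, mask)
```

A check like this is a single-point defense, so under the takeaway's own advice it would be one layer among several rather than a complete safeguard.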

Key insights

Imperceptible audio manipulations can hijack large audio-language models, forcing unauthorized commands with high success rates.

Principles

Generative models that accept audio input are vulnerable to adversarial audio attacks.
Adversarial audio signals that work regardless of the user's instructions (context-agnostic signals) are reusable.

Method

AudioHijack uses an optimization algorithm to tweak digital audio waveforms until the model exhibits a specific malicious behavior. The algorithm approximates the fine-grained feedback it needs from generative models, and it shapes the changes so that they sound like natural reverberation.
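The general idea of optimizing a perturbation that resembles reverberation can be illustrated with a deliberately simplified toy. This is not the AudioHijack algorithm: the "model" is a hypothetical linear scorer, and the perturbation is a short impulse response whose echo taps are kept under a decaying envelope. All names and parameter values are assumptions chosen for the example.

```python
import numpy as np

def apply_reverb(x, h):
    """Convolve signal x with impulse response h, truncated to len(x)."""
    return np.convolve(x, h)[: len(x)]

def toy_score(x, w):
    """Stand-in for a model's preference for a target behavior (linear, hypothetical)."""
    return float(w @ x)

def optimize_reverb(x, w, taps=32, steps=50, lr=0.05, max_tail=0.1, decay=0.9):
    """Projected gradient ascent on the echo taps of an impulse response.

    The direct path h[0] stays at 1; each echo tap is clipped to a decaying
    envelope so the result resembles faint reverberation.
    """
    n = len(x)
    envelope = max_tail * decay ** np.arange(1, taps)
    tail = np.zeros(taps - 1)
    # For a linear scorer the gradient w.r.t. tap k is constant:
    # d(score)/d(h[k]) = sum over n >= k of w[n] * x[n - k].
    grad = np.array([w[k:] @ x[: n - k] for k in range(1, taps)])
    for _ in range(steps):
        tail = np.clip(tail + lr * grad, -envelope, envelope)
    return np.concatenate(([1.0], tail)), envelope

rng = np.random.default_rng(0)
x = rng.standard_normal(256)  # placeholder "clean audio"
w = rng.standard_normal(256)  # placeholder scorer weights

h, envelope = optimize_reverb(x, w)
before = toy_score(x, w)
after = toy_score(apply_reverb(x, h), w)
```

A real audio-language model is neither linear nor directly differentiable end to end in this way, which is presumably where the approximated feedback mentioned above comes in. The toy only shows how a constraint on the perturbation's shape can coexist with an optimization objective.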

In practice

Attackers could embed malicious instructions in online videos or music clips.
Attackers could broadcast malicious audio during Zoom calls that are processed by AI transcription services.

Topics

Voice AI Security
Adversarial Audio Attacks
Large Audio-Language Models
AudioHijack Technique
Generative AI Vulnerabilities

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Security Engineer, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by IEEE Spectrum.