Pushing the Frontier of Black-Box LVLM Attacks via Fine-Grained Detail Targeting

2026-02-19 · Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Computer Vision & Pattern Recognition · Depth: Expert, quick

Summary

M-Attack-V2 is a black-box adversarial attack method designed to improve transferability against Large Vision-Language Models (LVLMs). It addresses a limitation of prior state-of-the-art transfer-based approaches such as M-Attack: high-variance gradients caused by ViT translation sensitivity and structural asymmetry in image crops. M-Attack-V2 introduces Multi-Crop Alignment (MCA), which averages gradients from multiple local views to reduce variance on the source side. It also incorporates Auxiliary Target Alignment (ATA), which uses a small auxiliary set to obtain a smoother target manifold instead of relying on aggressive augmentation. In addition, it reinterprets momentum as Patch Momentum, which replays historical crop gradients, and uses a refined patch-size ensemble (PE+). Together, these changes raise reported attack success rates on frontier LVLMs: from 8% to 30% on Claude-4.0, from 83% to 97% on Gemini-2.5-Pro, and from 98% to 100% on GPT-5.

Key takeaway

For research scientists developing or evaluating LVLM security, M-Attack-V2 shows that stabilizing gradients and refining local alignment makes black-box attacks transfer substantially better. Consider integrating these gradient-denoising and alignment techniques into your adversarial testing frameworks to assess model vulnerabilities more robustly, especially for advanced models such as GPT-5 and Gemini-2.5-Pro.

Key insights

M-Attack-V2 improves black-box LVLM attacks by denoising gradients and refining local alignment for better transferability.

Principles

High-variance gradients destabilize optimization.
Averaging gradients from multiple views reduces variance.
Smoother target manifolds improve attack transferability.
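
The variance-reduction principle can be checked numerically. The following is a minimal NumPy sketch, not code from the paper. It assumes each view's gradient equals a fixed true gradient plus independent Gaussian noise, which is an idealization (gradients from overlapping crops are correlated in practice). The function name `averaged_gradient_variance` and all of its parameters are hypothetical.

```python
import numpy as np

def averaged_gradient_variance(num_views, dim=64, noise_std=1.0, trials=2000, seed=0):
    """Empirical per-coordinate variance of the mean of `num_views` noisy gradients.

    Assumption: each view's gradient is the true gradient plus i.i.d. Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    true_grad = np.ones(dim)
    # Shape (trials, num_views, dim): one noisy gradient per view per trial.
    noisy = true_grad + noise_std * rng.standard_normal((trials, num_views, dim))
    # Average across views, as a multi-view scheme would before taking a step.
    averaged = noisy.mean(axis=1)
    # Variance across trials, then averaged over coordinates.
    return float(averaged.var(axis=0).mean())

var_single = averaged_gradient_variance(num_views=1)
var_multi = averaged_gradient_variance(num_views=8)
print(f"1 view: {var_single:.3f}, 8 views: {var_multi:.3f}")
```

Under the independence assumption, the variance of the mean of K views falls roughly as 1/K, so eight views should show about one eighth of the single-view variance. Correlated views would reduce variance by less.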

Method

M-Attack-V2 uses Multi-Crop Alignment (MCA) for source-side gradient averaging, Auxiliary Target Alignment (ATA) for target-side manifold smoothing, and Patch Momentum with a patch-size ensemble (PE+) to strengthen transferable directions.

In practice

Apply MCA to reduce gradient variance by averaging over multiple crops.
Use ATA with a small auxiliary set to smooth the target manifold.
Implement Patch Momentum to replay historical crop gradients.
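
To make the first and third steps concrete, here is a toy NumPy sketch of an update loop that averages gradients over several random crops and feeds them into a momentum term under an L-infinity budget. It is an illustration under stated assumptions, not the authors' implementation: the "encoder" is a fixed random linear map standing in for a surrogate model, the loss is a squared distance to a single target vector, and the momentum is a generic exponential accumulation rather than the paper's Patch Momentum formulation. ATA and PE+ are not modeled. All function names and parameters are hypothetical.

```python
import numpy as np

def crop_loss_and_grad(image, top, left, size, weights, target):
    """Squared distance between a toy linear embedding of one crop and the target,
    plus the gradient scattered back onto the full image."""
    crop = image[top:top + size, left:left + size]
    residual = weights @ crop.ravel() - target
    grad_crop = 2.0 * (weights.T @ residual).reshape(size, size)
    grad_full = np.zeros_like(image)
    grad_full[top:top + size, left:left + size] = grad_crop
    return float(residual @ residual), grad_full

def multi_crop_momentum_steps(image, weights, target, crop_size=8, num_crops=4,
                              steps=50, eps=0.1, step_size=0.01, decay=0.9, seed=0):
    """Toy loop: average gradients over random crops, accumulate momentum,
    take a signed step, and clip the perturbation to [-eps, eps]."""
    rng = np.random.default_rng(seed)
    delta = np.zeros_like(image)
    momentum = np.zeros_like(image)
    max_offset = image.shape[0] - crop_size
    for _step in range(steps):
        grad = np.zeros_like(image)
        for _crop in range(num_crops):
            top, left = rng.integers(0, max_offset + 1, size=2)
            _, g = crop_loss_and_grad(image + delta, top, left,
                                      crop_size, weights, target)
            grad += g / num_crops  # multi-crop gradient averaging
        # Generic momentum on the L1-normalized averaged gradient (an assumption).
        momentum = decay * momentum + grad / (np.abs(grad).sum() + 1e-12)
        delta = np.clip(delta - step_size * np.sign(momentum), -eps, eps)
    return delta

# Synthetic inputs: a random 16x16 "image" and a random linear map on 8x8 crops.
rng = np.random.default_rng(1)
image = rng.random((16, 16))
weights = rng.standard_normal((16, 64)) / 8.0
target = rng.standard_normal(16)
delta = multi_crop_momentum_steps(image, weights, target)
print("max |delta|:", np.abs(delta).max())
```

The clipping step keeps every entry of `delta` within the budget `eps` regardless of how noisy the crop gradients are. A real evaluation would replace the linear map with surrogate vision encoders and use the target construction described in the paper.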

Topics

Black-box Attacks
Large Vision-Language Models
Adversarial Machine Learning
Gradient-based Attacks
Transferability

Code references

vila-lab/M-Attack-V2

Best for: Research Scientist, CTO, VP of Engineering/Data, AI Researcher, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.