Latent Anchor-Driven Test Generation for Deep Neural Networks

2026-06-04 · Source: cs.SE updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Cybersecurity & Data Privacy · Depth: Expert, extended

Summary

Latte is a black-box testing framework for Deep Neural Networks (DNNs) that addresses three limitations of existing latent-space test generation methods: limited control over exploration, low failure diversity, and semantic drift. It encodes input seeds with a pre-trained VQ-VAE, applies a seed-centered, one-step latent mutation guided by anchors sampled from alternative classes, and decodes the mutated points back to the input space. Evaluated on five datasets, including MNIST, CIFAR10, and ImageNet, and ten DNN models, such as LeNet-5 and ResNet50, Latte consistently improves fault exposure and behavioral diversity. Under matched budgets, it outperforms baselines such as SINVAD and Mimicry in failure count, diversity, and testing efficiency while keeping seed-relative semantic drift low.

Key takeaway

Machine learning engineers and AI scientists developing or deploying DNNs in safety-critical applications should consider Latte for black-box testing. In the reported evaluation, it improves fault exposure and behavioral diversity over existing latent-space methods while keeping semantic drift low, and it does so more efficiently under matched budgets. Applying it can help uncover a broader range of model weaknesses and decision instabilities before deployment.

Key insights

Latte employs anchor-guided, seed-centric latent space exploration to generate diverse, fault-revealing DNN test cases with low semantic drift.

Principles

Latent space mutation preserves input plausibility.
Anchor-guided exploration targets decision instability.
Controlled exploration balances fault exposure and semantic proximity.

Method

Encode input seeds via VQ-VAE. Sample anchors from alternative classes. Mutate latent representations along seed-anchor directions. Quantize and decode to input space for testing.
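
The mutation and quantization steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the linear interpolation rule, the `step` parameter, the function names, and the random toy codebook are all assumptions introduced here, and the VQ-VAE encoder and decoder are left out entirely.

```python
import numpy as np


def quantize(z, codebook):
    """Snap each latent vector in z (n, d) to its nearest codebook entry (k, d)."""
    # Squared Euclidean distance between every latent vector and every code.
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)
    return codebook[indices], indices


def mutate_toward_anchor(z_seed, z_anchor, step):
    """Take one seed-centered step along the seed-to-anchor direction.

    Assumed rule: linear interpolation, where step=0 keeps the seed
    and step=1 lands on the anchor.
    """
    return z_seed + step * (z_anchor - z_seed)


# Toy stand-ins for encoder outputs (hypothetical shapes and values).
rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 4))                  # 16 codes of dimension 4
z_seed = codebook[rng.integers(0, 16, size=8)]       # latents of one seed
z_anchor = codebook[rng.integers(0, 16, size=8)]     # latents of an other-class anchor

z_mutated = mutate_toward_anchor(z_seed, z_anchor, step=0.3)
z_quantized, code_indices = quantize(z_mutated, codebook)
# In a full pipeline, z_quantized would be passed to the VQ-VAE decoder
# and the decoded input would be fed to the model under test.
```

Keeping the step small keeps the mutated latent close to the seed, which is one plausible way to read the "seed-centered" constraint; quantizing afterwards keeps the result on codes the decoder was trained to reconstruct.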

In practice

Use a pre-trained VQ-VAE for stable latent representations.
Tune the exploration degree (e.g., E=3) to balance fault exposure against semantic drift.
Apply both single-model and multi-model testing oracles.
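
The two oracle types can be illustrated with a short sketch. The summary does not spell out how Latte defines them, so the definitions below are assumptions: the single-model oracle flags a generated input whose prediction differs from the seed's label, and the multi-model oracle flags an input on which the models disagree (a differential-testing check). Function names and example labels are hypothetical.

```python
import numpy as np


def single_model_failures(seed_labels, mutant_preds):
    """Flag mutants whose predicted label differs from the seed's label."""
    return np.asarray(seed_labels) != np.asarray(mutant_preds)


def multi_model_failures(preds_per_model):
    """Flag inputs on which at least one model disagrees with the first model.

    preds_per_model has shape (n_models, n_inputs).
    """
    preds = np.asarray(preds_per_model)
    return (preds != preds[0]).any(axis=0)


# Hypothetical labels and predictions for four generated inputs.
seed_labels = [3, 7, 1, 4]
model_a = [3, 2, 1, 9]
model_b = [3, 2, 5, 9]
model_c = [3, 2, 1, 9]

single = single_model_failures(seed_labels, model_a)
multi = multi_model_failures([model_a, model_b, model_c])
```

Here the single-model check flags the second and fourth inputs for `model_a`, while the multi-model check flags only the third, where `model_b` disagrees with the others. The two oracles therefore surface different failures, which is one reason to apply both.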

Topics

Deep Neural Networks
Black-box Testing
Latent Space Exploration
VQ-VAE
Test Generation
Fault Exposure

Code references

beanduan22/Latte

Best for: Computer Vision Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Security Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.