Real-Time AttentionBender: Granular Interactive Network Bending of Video Diffusion Transformers

2026-04-24 · Source: cs.CV updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Computer Vision and Pattern Recognition, Emerging Technologies & Innovation · Depth: Expert, short

Summary

Real-Time AttentionBender is a tool for granular, interactive manipulation of Video Diffusion Transformers (DiTs), designed to give users more creative agency over generative video models. Released as a plugin within the DayDream Scope ecosystem and built on open-source real-time Wan pipelines, it exposes self-attention, cross-attention, and feed-forward networks for independent control. Users can target specific diffusion steps, DiT layers, prompt tokens, and individual hidden neurons. Because changes take effect live, the tool fosters "material intimacy" with the model: a responsive, hands-on sense of how each component shapes the generated video. The authors, Adam Cole, Rebecca Fiebrink, and Mick Grierson, position AttentionBender as both an XAIxArts probe into transformer internals and an expressive instrument for exploring aesthetics beyond default model outputs. The paper was accepted to the ACM Creativity & Cognition XAIxArts Workshop 2026 and revised on June 8, 2026.

Key takeaway

For creative technologists and AI scientists building generative video applications, Real-Time AttentionBender points to an alternative to prompt-only interfaces. Consider adding granular network bending to your workflow: manipulating the attention and feed-forward networks of a Video Diffusion Transformer directly builds "material intimacy" with the model, lets you reach aesthetic outcomes outside its default outputs, and shows how individual components affect the generated video.

Key insights

Real-Time AttentionBender offers granular, interactive control over Video Diffusion Transformers, enhancing creative agency and model understanding.

Principles

Direct network bending boosts creative agency.
Live manipulation builds model intimacy.
XAI tools can be expressive instruments.

Method

Real-Time AttentionBender runs as a DayDream Scope plugin built on open-source real-time Wan pipelines. It exposes three DiT components (self-attention, cross-attention, and feed-forward networks) for independent, real-time manipulation, and each manipulation can be scoped to specific diffusion steps, layers, prompt tokens, and hidden neurons.
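
The targeting granularity described above can be illustrated with a small sketch. It is not the AttentionBender or DayDream Scope API: `BendSpec`, `apply_bend`, the component labels, and the choice of a multiplicative scale as the bend are all hypothetical, and activations are simplified to a plain table indexed by token and hidden unit. It only shows how one manipulation might be scoped to a component, a set of diffusion steps, layers, tokens, and neurons.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# Hypothetical names throughout; this is not the plugin's actual interface.


@dataclass(frozen=True)
class BendSpec:
    component: str                            # e.g. "self_attn", "cross_attn", "ffn"
    steps: Optional[FrozenSet[int]] = None    # None means every diffusion step
    layers: Optional[FrozenSet[int]] = None   # None means every DiT layer
    tokens: Optional[FrozenSet[int]] = None   # None means every token position
    neurons: Optional[FrozenSet[int]] = None  # None means every hidden unit
    scale: float = 1.0                        # multiplicative bend; 1.0 is a no-op


def _hit(selected: Optional[FrozenSet[int]], index: int) -> bool:
    """An unset selector matches everything; otherwise the index must be listed."""
    return selected is None or index in selected


def apply_bend(
    activations: List[List[float]],
    spec: BendSpec,
    component: str,
    step: int,
    layer: int,
) -> List[List[float]]:
    """Return a bent copy of activations[token][neuron].

    The input is never modified. If the spec does not target this
    component, step, and layer, the copy is returned unchanged.
    """
    targeted = (
        component == spec.component
        and _hit(spec.steps, step)
        and _hit(spec.layers, layer)
    )
    if not targeted:
        return [row[:] for row in activations]
    return [
        [
            value * spec.scale
            if _hit(spec.tokens, t) and _hit(spec.neurons, n)
            else value
            for n, value in enumerate(row)
        ]
        for t, row in enumerate(activations)
    ]


# Double neurons 0 and 2 of token 1, only in cross-attention at step 0, layer 3.
spec = BendSpec(
    component="cross_attn",
    steps=frozenset({0}),
    layers=frozenset({3}),
    tokens=frozenset({1}),
    neurons=frozenset({0, 2}),
    scale=2.0,
)
activations = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
bent = apply_bend(activations, spec, "cross_attn", step=0, layer=3)
```

In a real pipeline a function like this would presumably be called from inside each transformer block on tensors rather than lists; how the plugin actually intercepts activations is not described in this summary.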

In practice

Explore novel video aesthetics.
Probe transformer internal workings.
Enhance creative control in video generation.

Topics

Video Diffusion Transformers
Network Bending
Generative Video
Human-Computer Interaction
Explainable AI
Creative AI Tools

Best for: Computer Vision Engineer, Research Scientist, Creative Technologist, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.