Safe Autoregressive Image Generation with Iterative Self-Improving Codebooks
Summary
Iterative Self-Improving Codebooks (ISIC) are proposed to improve safety in autoregressive image generation. Unlike diffusion models, autoregressive generators sequentially predict discretized visual tokens drawn from a codebook. ISIC uses the unified multimodal model's own understanding to identify unsafe generated images, so no human annotation is required. The method iterates over two steps. First, the model flags its unsafe generations to construct harmful and safe image-text pairs, which then guide updates to the codebook's inherent representations so that harmful mappings are eliminated. Second, the codebook is adaptively fine-tuned within the harmless space on the safe image-text pairs to maintain generation quality. The two steps repeat until no further safety improvement is observed, yielding a safety-enhanced codebook without external feedback.
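To make "discretized visual tokens from a codebook" concrete, here is a minimal sketch of a nearest-neighbor codebook lookup. It assumes a vector-quantization style tokenizer, which the summary does not confirm. The function names and the toy 2-D codebook are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_token(feature: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to `feature` (squared L2)."""
    def sqdist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sqdist(feature, codebook[i]))


def quantize(features: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Map each continuous feature vector to a discrete token id."""
    return [nearest_token(f, codebook) for f in features]


def decode(tokens: Sequence[int], codebook: Sequence[Vector]) -> List[Vector]:
    """Look token ids back up in the codebook."""
    return [codebook[t] for t in tokens]


# Toy codebook with three entries (assumption: real codebooks hold thousands
# of high-dimensional entries).
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
features = [[0.1, -0.2], [3.5, 4.2], [0.9, 1.3]]
tokens = quantize(features, codebook)   # [0, 2, 1]
recon = decode(tokens, codebook)        # [[0.0, 0.0], [4.0, 4.0], [1.0, 1.0]]
```

Under this assumption, every image that uses a given token id decodes through the same codebook entry, which is why editing codebook entries can change what the model is able to render.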
Key takeaway
For AI scientists developing autoregressive image generation models, this self-improving codebook approach offers a way to improve safety without external feedback. Consider using the model's own judgment to identify and mitigate harmful outputs, which reduces reliance on costly human annotation. Because each round pairs a safety update with fine-tuning on safe image-text pairs, the method aims to keep improving safety while preserving generation quality.
Key insights
Unified multimodal models can self-identify and eliminate unsafe image generations through iterative codebook refinement.
Principles
- Leverage the model's own judgment for safety.
- Update the codebook's inherent representations to remove harmful mappings.
- Iterative refinement improves safety.
Method
The unified model identifies its own unsafe generations and constructs harmful and safe image-text pairs, which guide the codebook update. Adaptive fine-tuning on the safe pairs then preserves generation quality.
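The control flow of this loop can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the callables `generate`, `is_unsafe`, `remove_harmful`, and `finetune_on_safe` are hypothetical placeholders for the model's generation, self-judgment, codebook update, and adaptive fine-tuning steps, and the stopping rule (halt when the unsafe rate no longer drops) is one possible reading of "until no further safety improvement is observed".

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (prompt, generated image), both stand-ins here


def self_improve(
    prompts: List[str],
    generate: Callable[[str], str],
    is_unsafe: Callable[[str, str], bool],
    remove_harmful: Callable[[List[Pair], List[Pair]], None],
    finetune_on_safe: Callable[[List[Pair]], None],
    max_rounds: int = 10,
) -> List[float]:
    """Run the two-step loop and return the unsafe rate measured each round."""
    if not prompts:
        raise ValueError("prompts must not be empty")
    history: List[float] = []
    for _ in range(max_rounds):
        pairs = [(p, generate(p)) for p in prompts]
        flags = [is_unsafe(p, img) for p, img in pairs]  # model judges itself
        harmful = [pr for pr, bad in zip(pairs, flags) if bad]
        safe = [pr for pr, bad in zip(pairs, flags) if not bad]
        rate = len(harmful) / len(pairs)
        improved = not history or rate < history[-1]
        history.append(rate)
        if not improved or not harmful:
            break  # no further safety improvement, or nothing left to fix
        remove_harmful(harmful, safe)   # step 1: update the codebook
        finetune_on_safe(safe)          # step 2: preserve quality
    return history


# Toy stand-ins: a set of "blocked" prompts plays the role of the codebook.
blocked = set()
unsafe_prompts = {"b", "c"}

def toy_generate(prompt: str) -> str:
    bad = prompt in unsafe_prompts and prompt not in blocked
    return ("unsafe:" if bad else "safe:") + prompt

def toy_is_unsafe(prompt: str, image: str) -> bool:
    return image.startswith("unsafe:")

def toy_remove(harmful: List[Pair], safe: List[Pair]) -> None:
    blocked.add(harmful[0][0])  # fix one harmful mapping per round

def toy_finetune(safe: List[Pair]) -> None:
    pass  # quality-preserving fine-tuning is omitted in this toy

rates = self_improve(["a", "b", "c", "d"], toy_generate, toy_is_unsafe,
                     toy_remove, toy_finetune)
# rates == [0.5, 0.25, 0.0]
```

Passing the four steps in as callables keeps the loop independent of any particular model. The toy version fixes only one harmful mapping per round so that the iteration is visible in the returned rates.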
In practice
- Automate safety filtering in generation.
- Refine codebooks without human labels.
- Maintain quality during safety updates.
Topics
- Autoregressive Models
- Image Generation
- Model Safety
- Codebook Learning
- Multimodal AI
- Self-Improvement
Best for: Computer Vision Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Security Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.