Scaling Multi-Reference Image Generation with Dynamic Reward Optimization
Summary
New research introduces OmniRef-Bench, a benchmark for evaluating multi-reference image generation (MRIG) models in complex scenarios that combine diverse reference image types with large numbers of reference images. Evaluations on OmniRef-Bench show that current mainstream open-source models perform poorly on these complex MRIG tasks, and that performance degrades significantly as the number of mixed-type reference images increases. To address this limitation, the authors propose DyRef, a two-stage training framework. The first stage uses supervised fine-tuning to establish foundational capabilities for complex MRIG. The second stage combines Difficulty-aware Advantage Reweighting (DAR), which dynamically adjusts optimization, with Discriminative Reward Scaling (DRS), which strengthens policy optimization by increasing reward differences within a group. Experiments show that DyRef substantially improves open-source model performance on both OmniRef-Bench and single-image editing benchmarks, which the authors present as evidence of its effectiveness and generalization.
Key takeaway
Machine learning engineers developing multi-reference image generation systems should consider adopting the DyRef two-stage training framework. The authors report that it significantly improves model performance in complex scenarios involving numerous and diverse reference images, a known weakness of current open-source models. Evaluate your models against the new OmniRef-Bench to assess their capabilities and identify areas for improvement, especially when scaling the number of reference inputs.
Key insights
DyRef, a two-stage framework, significantly improves multi-reference image generation by addressing complex scenarios and scaling challenges.
Principles
- Existing benchmarks fail to evaluate complex MRIG scenarios.
- Model performance degrades with more mixed-type reference images.
- Dynamic reward adjustment improves complex MRIG tasks.
Method
DyRef is a two-stage framework: supervised fine-tuning first, followed by policy optimization that uses Difficulty-aware Advantage Reweighting (DAR) and Discriminative Reward Scaling (DRS).
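The summary does not give the formulas behind DAR or DRS, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes rewards in [0, 1] for a group of samples generated from the same prompt, uses a min-max stretch as a hypothetical stand-in for DRS, and uses a weight based on the group's mean reward as a hypothetical stand-in for DAR. The function names and the `alpha` parameter are illustrative assumptions.

```python
from statistics import mean


def discriminative_scale(rewards, eps=1e-8):
    """Hypothetical DRS stand-in: min-max stretch rewards within one group.

    Tightly clustered rewards (e.g. 0.90, 0.92, 0.94) are spread over
    roughly [0, 1], which widens the differences inside the group.
    """
    lo, hi = min(rewards), max(rewards)
    return [(r - lo) / (hi - lo + eps) for r in rewards]


def difficulty_weight(rewards, alpha=1.0):
    """Hypothetical DAR stand-in: weight a group by how hard it looks.

    Assumes rewards lie in [0, 1]; a low mean reward is treated as a
    sign of a difficult prompt and yields a larger weight.
    """
    difficulty = 1.0 - mean(rewards)
    return 1.0 + alpha * difficulty


def reweighted_advantages(rewards, alpha=1.0):
    """Mean-centered advantages on scaled rewards, times the group weight."""
    scaled = discriminative_scale(rewards)
    baseline = mean(scaled)
    weight = difficulty_weight(rewards, alpha)
    return [weight * (s - baseline) for s in scaled]


# Placeholder reward groups (made-up values, not results from the paper).
easy_group = [0.90, 0.92, 0.94]
hard_group = [0.30, 0.32, 0.34]
easy_adv = reweighted_advantages(easy_group)
hard_adv = reweighted_advantages(hard_group)
```

In this sketch both groups have the same raw spread, but the harder group (lower mean reward) ends up with larger advantages, so it contributes more to the policy update. The advantages are only mean-centered; dividing by the group standard deviation afterwards would cancel a purely linear rescaling of the rewards.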
In practice
- Evaluate MRIG models using the OmniRef-Bench benchmark.
- Apply the DyRef framework to enhance open-source models.
- Utilize DAR for dynamic optimization in complex tasks.
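When evaluating as suggested above, it can help to break scores down by the number of reference images, since that is where the summary says performance degrades. The sketch below is a generic aggregation helper, not part of OmniRef-Bench; the record layout `(num_references, score)` and the function name are assumptions, and the numbers are placeholders rather than reported results.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_reference_count(records):
    """Average scores per reference-image count.

    `records` is an iterable of (num_references, score) pairs; this
    layout is a hypothetical convention, not a benchmark format.
    """
    buckets = defaultdict(list)
    for num_refs, score in records:
        buckets[num_refs].append(score)
    # Sort by reference count so any trend is easy to read off.
    return {n: mean(scores) for n, scores in sorted(buckets.items())}


# Placeholder records (made-up values for illustration only).
records = [(2, 0.81), (2, 0.79), (4, 0.66), (4, 0.70), (8, 0.52), (8, 0.48)]
summary = mean_score_by_reference_count(records)
```

Comparing the per-count averages before and after fine-tuning shows whether an improvement holds as reference inputs scale, rather than only on the easiest cases.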
Topics
- Multi-Reference Image Generation
- DyRef Framework
- OmniRef-Bench
- Reward Optimization
- Policy Optimization
- Computer Vision
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.