Analytic Score Optimization for Multi Dimension Video Quality Assessment
Summary
Researchers have developed UltraVQA, a large-scale multi-dimensional Video Quality Assessment (VQA) dataset of approximately 40,000 user-generated and professional video clips. At least three human raters score each video on five quality dimensions (Motion Quality, Motion Amplitude, Aesthetic Quality, Content Quality, and Clarity Quality) on a scale from 1.0 to 5.0 in steps of 0.5. The dataset also includes fine-grained sub-attribute labels and GPT-generated explanatory rationales based on the collective human judgments. To leverage these rich annotations, the team introduced Analytic Score Optimization (ASO), a post-training objective that reframes quality assessment as a regularized decision-making process. ASO provides a closed-form solution that naturally captures the ordinal nature of human ratings. Models trained with it outperform most baselines, including closed-source APIs and open-source models, and achieve lower mean absolute error (MAE) in quality prediction.
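To make the annotation scheme concrete, here is a minimal Python sketch of how per-dimension ratings on the 0.5-step scale might be validated, aggregated, and scored with MAE. The field names, the helper functions, and the choice of a plain mean to combine raters are all assumptions for illustration; the summary does not describe UltraVQA's actual file format or aggregation rule.

```python
from statistics import mean

# Hypothetical keys for the five dimensions named in the summary.
DIMENSIONS = [
    "motion_quality",
    "motion_amplitude",
    "aesthetic_quality",
    "content_quality",
    "clarity_quality",
]


def is_valid_score(score):
    """Return True if score lies on the 1.0-5.0 grid with 0.5 steps."""
    return 1.0 <= score <= 5.0 and float(score * 2).is_integer()


def aggregate(ratings):
    """Combine rater scores per dimension (a plain mean is an assumption)."""
    for dim, scores in ratings.items():
        if len(scores) < 3 or not all(is_valid_score(s) for s in scores):
            raise ValueError(f"bad ratings for {dim}: {scores}")
    return {dim: mean(scores) for dim, scores in ratings.items()}


def mean_absolute_error(predictions, references):
    """Per-dimension MAE over paired lists of {dimension: score} dicts."""
    return {
        dim: mean(abs(p[dim] - r[dim]) for p, r in zip(predictions, references))
        for dim in DIMENSIONS
    }


# One video rated by three people on every dimension.
video_ratings = {dim: [3.0, 3.5, 4.0] for dim in DIMENSIONS}
reference = aggregate(video_ratings)
prediction = {dim: 3.0 for dim in DIMENSIONS}
errors = mean_absolute_error([prediction], [reference])
```

Reporting the error per dimension, rather than as a single pooled number, keeps the multi-dimensional framing visible in the evaluation.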
Key takeaway
Research scientists developing VQA models should consider adopting multi-dimensional datasets like UltraVQA and integrating Analytic Score Optimization (ASO) into their post-training pipelines. Because the approach captures the ordinal nature of human ratings and supplies rationales for interpretability, it may improve model accuracy and alignment with human perception relative to traditional single-score or continuous-regression methods, and it can yield more nuanced, justifiable quality assessments.
Key insights
Multi-dimensional VQA with interpretable annotations and a novel optimization method improves video quality assessment.
Principles
- Single-score VQA obscures diverse quality factors.
- Ordinal nature of human ratings is crucial for VQA.
- Rationale supervision improves model interpretability.
Method
Analytic Score Optimization (ASO) formulates discrete scoring as a KL-regularized one-step bandit problem and derives a closed-form optimal score policy over the discrete levels. That policy is used for soft-target learning, which the authors describe as stable and sample-efficient.
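A rough sketch of what such a closed form can look like: for a one-step bandit that maximizes expected reward minus beta times the KL divergence to a reference policy, the standard optimum is proportional to the reference probability times exp(reward / beta). The Python below applies that textbook result to the nine score levels. The reward (negative absolute distance to the human score), the value of `beta`, and the function name `closed_form_policy` are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

# Nine discrete levels: 1.0 to 5.0 in steps of 0.5.
LEVELS = np.arange(1.0, 5.5, 0.5)


def closed_form_policy(ref_probs, human_score, beta=0.5):
    """Optimum of E_pi[r] - beta * KL(pi || pi_ref) over discrete levels.

    The solution is pi*(s) proportional to pi_ref(s) * exp(r(s) / beta).
    """
    # Clip to avoid log(0) when the reference assigns zero mass to a level.
    ref_probs = np.clip(np.asarray(ref_probs, dtype=float), 1e-12, None)
    # Assumed reward: levels closer to the human score earn more.
    reward = -np.abs(LEVELS - human_score)
    logits = np.log(ref_probs) + reward / beta
    logits -= logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()


# With a uniform reference, the target peaks at the human score and
# decays symmetrically with ordinal distance from it.
uniform_ref = np.full(len(LEVELS), 1.0 / len(LEVELS))
target = closed_form_policy(uniform_ref, human_score=3.5)
```

Under this assumed reward, neighboring levels receive more probability mass than distant ones, which is one way a soft target can encode the ordinal structure of the ratings. A larger `beta` keeps the target closer to the reference policy; a smaller one sharpens it around the human score.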
In practice
- Use UltraVQA for multi-dimensional VQA training.
- Apply ASO for discrete quality score alignment.
- Incorporate rationale supervision for model transparency.
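As a hedged illustration of the second point, the sketch below shows one common way to train against a soft target over discrete score levels: cross-entropy between the target distribution and the model's softmax over per-level logits, with the expected score used as the point prediction. The summary does not specify the paper's actual loss or decoding rule, so the functions `soft_target_loss` and `expected_score` are hypothetical.

```python
import numpy as np

# Nine discrete levels: 1.0 to 5.0 in steps of 0.5.
LEVELS = np.arange(1.0, 5.5, 0.5)


def log_softmax(logits):
    """Numerically stable log-softmax over a 1-D array of logits."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max()
    return shifted - np.log(np.exp(shifted).sum())


def soft_target_loss(logits, target_probs):
    """Cross-entropy between a soft target and the model's distribution."""
    target_probs = np.asarray(target_probs, dtype=float)
    return float(-(target_probs * log_softmax(logits)).sum())


def expected_score(logits):
    """Point prediction: probability-weighted average of the levels."""
    probs = np.exp(log_softmax(logits))
    return float((probs * LEVELS).sum())


# A model that is uniform over levels versus one that favors 3.5.
flat_logits = np.zeros(len(LEVELS))
peaked_logits = flat_logits.copy()
peaked_logits[5] = 5.0  # index 5 corresponds to the 3.5 level
one_hot_target = np.eye(len(LEVELS))[5]

flat_loss = soft_target_loss(flat_logits, one_hot_target)
peaked_loss = soft_target_loss(peaked_logits, one_hot_target)
```

The same loss accepts any probability vector as `target_probs`, so a one-hot label and a smoothed ordinal target can be compared without changing the training code.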
Topics
- Video Quality Assessment
- Multi-dimensional VQA
- UltraVQA Dataset
- Analytic Score Optimization
- Vision-Language Models
Best for: Research Scientist, AI Researcher, AI Scientist, Computer Vision Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.