Subjective and Objective Quality-of-Experience Evaluation Study for Live Video Streaming
Summary
Researchers Zehao Zhu, Wei Sun, Jun Jia, Wei Wu, Sibin Deng, Kai Li, Ying Chen, Xiongkuo Min, Jia Wang, and Guangtao Zhai have developed a new Quality of Experience (QoE) evaluation model, Tao-QoE, for live video streaming. They also introduce the TaoLive QoE dataset, comprising 42 source videos from real live broadcasts and 1,155 distorted versions. These distorted videos incorporate conventional streaming issues like compression and stalling, alongside live-specific problems such as frame skipping and variable frame rates. A human study involving 20 participants generated subjective QoE scores for the TaoLive QoE dataset. The Tao-QoE model is an end-to-end deep learning solution that integrates multi-scale semantic features and optical flow-based motion features to predict retrospective QoE scores without relying on statistical Quality of Service (QoS) features. Experiments show Tao-QoE outperforms existing models on TaoLive QoE, six other QoE datasets, and eight User-Generated Content (UGC) Video Quality Assessment (VQA) datasets.
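The summary does not say how the 20 participants' ratings were combined into subjective QoE scores. A common convention, assumed here rather than taken from the paper, is a mean opinion score (MOS) per video with a normal-approximation 95% confidence interval. The function name, the video IDs, and the 1 to 5 rating scale in the test values are all hypothetical.

```python
from statistics import mean, stdev


def mean_opinion_scores(ratings):
    """Map each video ID to (MOS, 95% CI half-width).

    `ratings` maps a video ID to the list of scores given by participants.
    Assumption: plain averaging with no outlier or subject rejection.
    """
    summary = {}
    for video_id, scores in ratings.items():
        if len(scores) < 2:
            raise ValueError(f"need at least two ratings for {video_id!r}")
        mos = mean(scores)
        # Normal-approximation half-width: 1.96 * sample std / sqrt(n).
        half_width = 1.96 * stdev(scores) / len(scores) ** 0.5
        summary[video_id] = (mos, half_width)
    return summary
```

A wide interval for a video signals that participants disagreed, which is worth checking before treating its score as ground truth.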
Key takeaway
Research scientists developing or evaluating live video streaming QoE models should consider the TaoLive QoE dataset as a benchmark. Because it includes live-specific distortions such as frame skipping and variable frame rates alongside traditional issues such as compression and stalling, it offers a more realistic evaluation environment. The Tao-QoE results also suggest that combining optical flow-based motion features with multi-scale semantic features helps capture live video quality as users perceive it.
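To make two of these distortions concrete, the sketch below applies stalling and frame skipping to a list of frames. It is a toy illustration, not the procedure used to build TaoLive QoE; the helper names `add_stall` and `skip_frames` are hypothetical, and frames are represented by any Python objects (integers in the usage note).

```python
def add_stall(frames, at, duration):
    """Simulate a stall: repeat frames[at] for `duration` extra frames."""
    if not 0 <= at < len(frames):
        raise IndexError("stall position is outside the video")
    if duration < 0:
        raise ValueError("duration must be non-negative")
    return frames[:at + 1] + [frames[at]] * duration + frames[at + 1:]


def skip_frames(frames, at, count):
    """Simulate frame skipping: drop `count` frames right after index `at`."""
    if not 0 <= at < len(frames):
        raise IndexError("skip position is outside the video")
    if count < 0:
        raise ValueError("count must be non-negative")
    return frames[:at + 1] + frames[at + 1 + count:]
```

With `frames = list(range(6))`, `add_stall(frames, 2, 3)` freezes on frame 2 for three extra frames and then resumes, while `skip_frames(frames, 2, 2)` jumps from frame 2 straight to frame 5. The viewer sees a freeze in the first case and a discontinuity in the second.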
Key insights
The Tao-QoE model, which integrates semantic and motion features, and the TaoLive QoE dataset together advance quality assessment for live video streaming.
Principles
- Live streaming QoE requires specific distortion modeling.
- Semantic and motion features are crucial for QoE prediction.
- Optical flow effectively perceives stalling distortion.
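The intuition behind the last principle is that motion drops to zero while playback is frozen. The sketch below shows that idea with plain frame differencing as a crude stand-in for optical flow; it does not use PWC-Net and is not the paper's method. Frames are assumed to be equal-size grayscale images stored as nested lists, and the names `motion_energy` and `find_stalls` are hypothetical.

```python
def motion_energy(prev, curr):
    """Mean absolute pixel difference between two equal-size grayscale frames."""
    total = 0.0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for p, c in zip(row_prev, row_curr):
            total += abs(c - p)
            count += 1
    return total / count


def find_stalls(frames, threshold=0.0, min_length=2):
    """Return (first, last) frame-index pairs of runs with no motion.

    A run is reported when at least `min_length` consecutive frame
    transitions have motion energy at or below `threshold`.
    """
    stalls = []
    run_start = None
    for i in range(1, len(frames)):
        still = motion_energy(frames[i - 1], frames[i]) <= threshold
        if still and run_start is None:
            run_start = i - 1  # first frame of the frozen run
        elif not still and run_start is not None:
            if (i - 1) - run_start >= min_length:
                stalls.append((run_start, i - 1))
            run_start = None
    # Close a run that lasts until the final frame.
    if run_start is not None and (len(frames) - 1) - run_start >= min_length:
        stalls.append((run_start, len(frames) - 1))
    return stalls
```

A genuinely static scene would trigger this detector too, which is one reason a learned model that combines motion with semantic features is more robust than a fixed threshold.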
Method
The Tao-QoE model chains several sub-networks: a video restructuring sub-network, a Swin Transformer that extracts multi-scale semantic features, PWC-Net and a 3D-CNN that extract optical flow-based motion features, and a multi-scale feature fusion sub-network. A fully connected layer then regresses the fused features to a QoE score.
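The last two stages, fusion and regression, can be sketched in a few lines. The summary does not describe how Tao-QoE actually fuses its features, so this version assumes the simplest option: global average pooling of each feature map, concatenation, and a single linear layer. The feature maps stand in for the outputs of the Swin Transformer and the PWC-Net/3D-CNN branch, and every name, shape, and weight here is hypothetical.

```python
def global_average_pool(feature_map):
    """Reduce a feature map (list of 2D channels) to one mean per channel."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def fuse(semantic_maps, motion_maps):
    """Pool every scale of both branches and concatenate into one vector."""
    fused = []
    for feature_map in list(semantic_maps) + list(motion_maps):
        fused.extend(global_average_pool(feature_map))
    return fused


def regress(features, weights, bias):
    """Fully connected layer with a single output: the predicted QoE score."""
    if len(features) != len(weights):
        raise ValueError("feature and weight vectors differ in length")
    return sum(f * w for f, w in zip(features, weights)) + bias
```

Pooling each scale separately lets maps of different spatial sizes contribute to one fixed-length vector, which is what allows a single fully connected layer to consume them. In a trained model the weights and bias would be learned from subjective scores rather than set by hand.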
In practice
- Use the TaoLive QoE dataset for live streaming research.
- Integrate optical flow to detect motion distortions.
- Consider multi-scale semantic features for perceptual quality.
Topics
- Live Video Streaming
- Quality of Experience
- TaoLive QoE Dataset
- Tao-QoE Model
- Optical Flow Features
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.