Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Perceptual quality assessment of internet videos

Xu, Jiahua, Li, Jing, Zhou, Xingguang, Zhou, Wei, Wang, Baichao and Chen, Zhibo 2021. Perceptual quality assessment of internet videos. Presented at: MM '21: 29th ACM International Conference on Multimedia, Virtual Event China, 20 - 24 October 2021. MM '21: Proceedings of the 29th ACM International Conference on Multimedia. Association for Computing Machinery, pp. 1248-1257. 10.1145/3474085.3475486

Full text not available from this repository.


With the fast proliferation of online video sites and social media platforms, user-generated, professionally generated and occupationally generated content (UGC, PGC and OGC) videos are streamed and explosively shared over the Internet. Consequently, it is urgent to monitor the content quality of these Internet videos to guarantee the user experience. However, most existing modern video quality assessment (VQA) databases include only UGC videos and cannot meet the demands of other kinds of Internet videos with real-world distortions. To this end, we collect 1,072 videos from Youku, a leading Chinese video hosting service platform, to establish the Internet video quality assessment database (Youku-V1K). A special sampling method based on several quality indicators is adopted to maximize the content and distortion diversity within a limited database, and a probabilistic graphical model is applied to recover reliable labels from noisy crowdsourcing annotations. Based on the properties of Internet videos originating from Youku, we propose a spatio-temporal distortion-aware model (STDAM). First, the model works blindly, meaning no pristine reference video is required. Second, the model is familiarized with diverse content by pre-training on large-scale image quality assessment databases. Third, to measure spatial and temporal distortions, we introduce a graph convolution and attention module to extract and enhance the features of the input video. Besides, we leverage motion information and integrate frame-level features into video-level features via a bi-directional long short-term memory (Bi-LSTM) network. Experimental results on the self-built database and public VQA databases demonstrate that our model outperforms state-of-the-art methods and exhibits promising generalization ability.
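The final aggregation step described above (frame-level features combined into a video-level representation via a bidirectional recurrent pass, followed by a scalar quality prediction) can be sketched as follows. This is a minimal illustrative sketch only: the function names, the feature dimensions, and the use of a simple bidirectional exponential-smoothing recurrence in place of the paper's actual Bi-LSTM are all assumptions, not the authors' implementation.

```python
from typing import List

def aggregate_frame_features(frames: List[List[float]],
                             alpha: float = 0.5) -> List[float]:
    """Turn per-frame feature vectors into one video-level vector.

    A simplified stand-in for the paper's Bi-LSTM: a forward and a
    backward exponential-smoothing recurrence over time, whose states
    are concatenated and mean-pooled across frames.
    """
    T, D = len(frames), len(frames[0])

    # Forward pass: state carries smoothed history from earlier frames.
    fwd, state = [], [0.0] * D
    for t in range(T):
        state = [alpha * s + (1.0 - alpha) * x
                 for s, x in zip(state, frames[t])]
        fwd.append(state[:])

    # Backward pass: same recurrence, run from the last frame inward.
    bwd, state = [None] * T, [0.0] * D
    for t in reversed(range(T)):
        state = [alpha * s + (1.0 - alpha) * x
                 for s, x in zip(state, frames[t])]
        bwd[t] = state[:]

    # Video-level feature: temporal mean of concatenated [fwd; bwd] states.
    video = [0.0] * (2 * D)
    for t in range(T):
        for d in range(D):
            video[d] += fwd[t][d] / T
            video[D + d] += bwd[t][d] / T
    return video

def predict_score(video_feat: List[float],
                  weights: List[float], bias: float = 0.0) -> float:
    """Hypothetical linear regression head mapping features to a quality score."""
    return bias + sum(w * v for w, v in zip(weights, video_feat))
```

In the real model the per-frame vectors would come from the pre-trained, distortion-aware spatial backbone with graph convolution and attention; here they are just placeholders so the temporal aggregation is easy to follow.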

Item Type: Conference or Workshop Item (Paper)
Date Type: Published Online
Status: Published
Schools: Computer Science & Informatics
Publisher: Association for Computing Machinery
ISBN: 978-1-4503-8651-7
Last Modified: 27 Sep 2023 15:45
