Zhang, Fang-Lue, Wu, Xian, Li, Rui-Long, Wang, Jue, Zheng, Zhao-Heng and Hu, Shi-Min
PDF - Accepted Post-Print Version (10MB)
Abstract
Personal videos often contain visual distractors: objects captured accidentally that draw viewers' attention away from the main subjects. We propose a method to automatically detect and localize these distractors by learning from a manually labeled dataset. To achieve spatially and temporally coherent detection, we extract features at the Temporal-Superpixel (TSP) level and classify them with a traditional SVM-based learning framework. We also experiment with end-to-end learning using Convolutional Neural Networks (CNNs), which achieves slightly higher performance than other methods. The classification result is further refined in a post-processing step based on graph-cut optimization. Experimental results show that our method achieves an accuracy of 81% and a recall of 86%. We demonstrate several ways of removing the detected distractors to improve video quality, including video hole filling, video frame replacement, and camera path re-planning. User study results show that our method can significantly improve the aesthetic quality of videos.
Item Type: | Article |
---|---|
Date Type: | Publication |
Status: | Published |
Schools: | Computer Science & Informatics |
Publisher: | Institute of Electrical and Electronics Engineers (IEEE) |
ISSN: | 1520-9210 |
Date of First Compliant Deposit: | 25 December 2017 |
Date of Acceptance: | 12 December 2017 |
Last Modified: | 27 Nov 2024 02:15 |
URI: | https://orca.cardiff.ac.uk/id/eprint/107790 |
Citation Data
Cited 13 times in Scopus.