Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Fine-Grained Video Retrieval With Scene Sketches

Zuo, Ran, Deng, Xiaoming, Chen, Keqi, Zhang, Zhengming, Lai, Yu-Kun ORCID: https://orcid.org/0000-0002-2094-5680, Liu, Fang, Ma, Cuixia, Wang, Hao, Liu, Yong-Jin and Wang, Hongan 2023. Fine-Grained Video Retrieval With Scene Sketches. IEEE Transactions on Image Processing 32, pp. 3136-3149. 10.1109/TIP.2023.3278474

PDF - Accepted Post-Print Version
Available under License Creative Commons Attribution.

Abstract

Benefiting from the intuitiveness and naturalness of sketch interaction, sketch-based video retrieval (SBVR) has received considerable attention in the video retrieval research area. However, most existing SBVR research still cannot accurately retrieve videos by fine-grained scene content. To address this problem, in this paper we investigate a new task, which focuses on retrieving the target video by using a fine-grained storyboard sketch that depicts the scene layout and the visual characteristics (e.g., appearance, size, and pose) of the major foreground instances in the video; we call this task “fine-grained scene-level SBVR”. The most challenging issue in this task is how to perform scene-level cross-modal alignment between sketch and video. Our solution consists of two parts. First, we construct a scene-level sketch-video dataset called SketchVideo, in which sketch-video pairs are provided and each pair contains a clip-level storyboard sketch and several keyframe sketches (corresponding to video frames). Second, we propose a novel deep learning architecture called Sketch Query Graph Convolutional Network (SQ-GCN). In SQ-GCN, we first adaptively sample the video frames to improve video encoding efficiency, and then construct appearance and category graphs to jointly model visual and semantic alignment between sketch and video. Experiments show that our fine-grained scene-level SBVR framework with the SQ-GCN architecture outperforms state-of-the-art fine-grained retrieval methods. The SketchVideo dataset and SQ-GCN code are available on the project webpage: https://iscas-mmsketch.github.io/FG-SL-SBVR/
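
To make the retrieval setting concrete, the following is a minimal sketch of ranking videos against a sketch query. It is not the SQ-GCN architecture and is not taken from the authors' code: the function names (`sample_frames`, `encode_video`, `rank_videos`), the cosine-distance sampling rule, the `threshold` parameter, and the mean-pooled video embedding are all assumptions made for illustration. Feature vectors are assumed to be precomputed and videos are assumed to be non-empty.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sample_frames(frames, threshold=0.1):
    """Hypothetical adaptive sampling: keep a frame only when its cosine
    distance to the last kept frame exceeds `threshold`."""
    kept = []
    for frame in frames:
        if not kept or 1.0 - cosine(kept[-1], frame) > threshold:
            kept.append(frame)
    return kept


def encode_video(frames, threshold=0.1):
    """Mean-pool the sampled frame features (an assumed stand-in for a
    learned video encoder). Expects at least one frame."""
    kept = sample_frames(frames, threshold)
    dim = len(kept[0])
    return [sum(f[i] for f in kept) / len(kept) for i in range(dim)]


def rank_videos(sketch, videos, threshold=0.1):
    """Return video names sorted by similarity to the sketch embedding,
    most similar first."""
    scores = {
        name: cosine(sketch, encode_video(frames, threshold))
        for name, frames in videos.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Toy 2-D features; real features would come from trained encoders.
    videos = {
        "a": [[1.0, 0.0], [1.0, 0.0], [0.9, 0.1]],
        "b": [[0.0, 1.0], [0.0, 1.0]],
    }
    print(rank_videos([1.0, 0.0], videos))
```

In this toy setup, near-duplicate frames are dropped before pooling, so fewer frames need encoding. The paper's appearance and category graphs would replace the single cosine score used here.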

Item Type: Article
Date Type: Published Online
Status: Published
Schools: Computer Science & Informatics
Publisher: Institute of Electrical and Electronics Engineers
ISSN: 1057-7149
Date of First Compliant Deposit: 8 June 2023
Date of Acceptance: 5 May 2023
Last Modified: 08 Nov 2023 10:39
URI: https://orca.cardiff.ac.uk/id/eprint/160246
