Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

MLCVNet: multi-level context VoteNet for 3D object detection

Xie, Qian, Lai, Yu-kun ORCID: https://orcid.org/0000-0002-2094-5680, Wu, Jing ORCID: https://orcid.org/0000-0001-5123-9861, Wang, Zhoutao, Zhang, Yiming, Xu, Kai and Wang, Jun 2020. MLCVNet: multi-level context VoteNet for 3D object detection. Presented at: IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13-19 June 2020. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, pp. 10444-10453. 10.1109/CVPR42600.2020.01046

Full text: MLCVNet-CVPR2020.pdf — PDF, Accepted Post-Print Version (2MB)

Abstract

In this paper, we address the 3D object detection task by capturing multi-level contextual information with the self-attention mechanism and multi-scale feature fusion. Most existing 3D object detection methods recognize objects individually, without considering the contextual information between them. In contrast, we propose Multi-Level Context VoteNet (MLCVNet) to recognize 3D objects correlatively, building on the state-of-the-art VoteNet. We introduce three context modules into the voting and classifying stages of VoteNet to encode contextual information at different levels. Specifically, a Patch-to-Patch Context (PPC) module is employed to capture contextual information between point patches before they vote for their corresponding object centroid points. Subsequently, an Object-to-Object Context (OOC) module is incorporated before the proposal and classification stage to capture the contextual information between object candidates. Finally, a Global Scene Context (GSC) module is designed to learn the global scene context. Together, these modules capture contextual information at the patch, object and scene levels. Our method effectively improves detection accuracy, achieving new state-of-the-art detection performance on challenging 3D object detection datasets, i.e., SUN RGB-D and ScanNet. Our code is available at https://github.com/NUAAXQ/MLCVNet.
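The core idea shared by the PPC and OOC modules described above is to let a set of candidates (point patches, or object proposals) attend to one another via self-attention, so each one's feature is refined by context from all the others. The following is a minimal NumPy sketch of that mechanism, not the paper's implementation: the learned query/key/value projections of a real attention module are omitted (identity projections are assumed), and the function name and shapes are illustrative only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_context(features):
    """Toy self-attention over a set of N feature vectors of shape (N, C).

    Each element's output adds a context term: an attention-weighted sum
    over all elements, loosely mimicking how a patch-level (PPC) or
    object-level (OOC) context module lets candidates exchange information
    before voting / classification. Learned projections are omitted here.
    """
    q = k = v = features                       # (N, C); identity projections assumed
    scale = np.sqrt(features.shape[-1])
    attn = softmax(q @ k.T / scale, axis=-1)   # (N, N) pairwise affinities, rows sum to 1
    return features + attn @ v                 # residual connection keeps the original signal

rng = np.random.default_rng(0)
patches = rng.standard_normal((8, 16))         # e.g. 8 point patches, 16-D features each
out = self_attention_context(patches)
print(out.shape)                               # (8, 16): same shape, context-enhanced
```

A scene-level (GSC) module would instead pool a single global descriptor and inject it into every candidate; the attention pattern above is the patch/object-level variant.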

Item Type: Conference or Workshop Item (Paper)
Date Type: Published Online
Status: Published
Schools: Computer Science & Informatics
Publisher: IEEE
ISBN: 9781728171692
ISSN: 1063-6919
Date of First Compliant Deposit: 30 March 2020
Date of Acceptance: 27 February 2020
Last Modified: 26 Jan 2023 22:27
URI: https://orca.cardiff.ac.uk/id/eprint/130653

Citation Data

Cited 90 times in Scopus.
