Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Online human action recognition with spatial and temporal skeleton features using a distributed camera network

Liu, Guoliang, Zhang, Qinghui, Cao, Yichao, Tian, Guohui and Ji, Ze 2021. Online human action recognition with spatial and temporal skeleton features using a distributed camera network. International Journal of Intelligent Systems 36 (12), pp. 7389-7411. 10.1002/int.22591

PDF - Accepted Post-Print Version (3MB)


Online action recognition is an important task for human-centered intelligent services. However, it remains highly challenging because human actions vary widely and unpredictably in their spatial and temporal scales. In this paper, we propose the following core ideas to address the online action recognition problem. First, we combine spatial and temporal skeleton features to represent human actions; these include not only geometrical features but also multiscale motion features, so that both the spatial and the temporal information of the actions is covered. We use an efficient one-dimensional convolutional neural network to fuse the spatial and temporal features and train it for action recognition. Second, we propose a group sampling method that combines previous action frames with current action frames. It is based on the hypothesis that neighboring frames are largely redundant, and the sampling mechanism ensures that long-term contextual information is also considered. Third, the skeletons from multiview cameras are fused in a distributed manner, which can improve human pose accuracy in the case of occlusions. Finally, we propose a RESTful client-server service architecture that deploys the proposed online action recognition module on a remote server as a public service, so that camera networks with limited onboard computational resources can still perform online action recognition. We evaluated our model on the JHMDB and UT-Kinect data sets, achieving accuracies of 80.1% and 96.9%, respectively. Our online experiments show that our memory group sampling mechanism is far superior to the traditional sliding window.
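The contrast between group sampling and a sliding window can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how groups are formed or how frames are drawn from them, so the equal-sized contiguous groups, the middle-frame default, and the names `sliding_window` and `group_sample` are all assumptions made for illustration.

```python
import random


def sliding_window(frames, k):
    """Baseline: keep only the k most recent frames."""
    return list(frames[-k:])


def group_sample(frames, k, rng=None):
    """Hypothetical group sampling: split the whole frame history into k
    contiguous groups and take one frame per group (assumed scheme)."""
    n = len(frames)
    if n <= k:
        # Not enough history to form k non-empty groups; use everything.
        return list(frames)
    # Group i covers frames[bounds[i]:bounds[i + 1]].
    bounds = [i * n // k for i in range(k + 1)]
    picks = []
    for lo, hi in zip(bounds, bounds[1:]):
        if rng is not None:
            idx = rng.randrange(lo, hi)  # random frame within the group
        else:
            idx = (lo + hi) // 2         # deterministic: middle of the group
        picks.append(frames[idx])
    return picks


history = list(range(100))            # stand-in for 100 skeleton frames
print(sliding_window(history, 5))     # only the most recent frames
print(group_sample(history, 5))       # frames spread over the whole history
```

With the same budget of five frames, the sliding window sees only the tail of the sequence, while the grouped sample spans the full history, which is the intuition behind exploiting redundancy between neighboring frames to retain long-term context.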

Item Type: Article
Date Type: Publication
Status: Published
Schools: Engineering
Publisher: Wiley
ISSN: 0884-8173
Date of First Compliant Deposit: 13 September 2021
Date of Acceptance: 23 July 2021
Last Modified: 07 Nov 2023 03:26

Citation Data

Cited 3 times in Scopus.
