This is the Accepted Manuscript version of an article that proposes HAR-Depth, a framework that combines sequential learning and shape learning with the novel concept of the depth history image (DHI) to address the challenges of human action recognition (HAR). Results suggest that the proposed method outperforms other state-of-the-art algorithms reported in the earlier literature in terms of overall accuracy, kappa parameter, and precision.
The UNT College of Engineering strives to educate and train engineers and technologists who have the vision to recognize and solve the problems of society. The college comprises six degree-granting departments of instruction and research.
Physical Description
13 p.
Notes
Abstract: Human action recognition (HAR) is a challenging task due to the presence of the pose and temporal variations in the action videos. To address these challenges, HAR-Depth is proposed in this paper with sequential and shape learning along with the novel concept of depth history image (DHI). A deep bidirectional long short term memory (DBiLSTM) is constructed for sequential learning to model the temporal relationship existing between the action frames. Action information in each frame is extracted using pre-trained convolutional neural network (CNN). The depth information of each action frame is estimated and projected onto the X-Y plane to form the DHI. During shape learning, the shape information through DHI is used to train a deep pre-trained CNN network. By leveraging the trained knowledge of the pre-trained network, overfitting issue is handled. The finetuned network is used to recognize actions from query DHI images. Data augmentation is adopted to avoid overfitting of the network by virtually increasing the training set. The proposed work is evaluated on publicly available datasets like KTH, UCF sports, JHMDB, UCF101, and HMDB51 and achieves the performance accuracy of 97.67%, 95.00%, 73.13%, 92.97%, and 69.74% respectively. The results on these datasets suggest that the proposed work of this paper performs better in terms of overall accuracy, kappa parameter and precision compared to the other state-of-the-art algorithms present in the earlier reported literature.
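The abstract reports results in terms of overall accuracy, kappa parameter, and precision. As a minimal sketch of how such metrics are commonly derived from a confusion matrix, the example below assumes that "kappa parameter" refers to Cohen's kappa and that precision is macro-averaged over classes; the record does not confirm either choice. The function names and the small two-class matrix are hypothetical and are not taken from the article.

```python
import math
from typing import List

Matrix = List[List[int]]  # rows = true class, columns = predicted class


def overall_accuracy(cm: Matrix) -> float:
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total


def cohen_kappa(cm: Matrix) -> float:
    """Cohen's kappa: agreement corrected for chance agreement.

    Assumption: this is what the abstract calls the 'kappa parameter'.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = sum(cm[k][k] for k in range(n)) / total
    row_totals = [sum(cm[k]) for k in range(n)]
    col_totals = [sum(cm[i][k] for i in range(n)) for k in range(n)]
    p_expected = sum(row_totals[k] * col_totals[k] for k in range(n)) / total ** 2
    if p_expected == 1.0:
        return math.nan  # kappa is undefined when chance agreement is total
    return (p_observed - p_expected) / (1.0 - p_expected)


def macro_precision(cm: Matrix) -> float:
    """Unweighted mean of per-class precision (an assumed averaging scheme)."""
    n = len(cm)
    per_class = []
    for k in range(n):
        predicted_as_k = sum(cm[i][k] for i in range(n))
        # A class that is never predicted contributes a precision of 0.
        per_class.append(cm[k][k] / predicted_as_k if predicted_as_k else 0.0)
    return sum(per_class) / n


# Hypothetical two-class confusion matrix, for illustration only.
example_cm = [[8, 2],
              [1, 9]]
print(overall_accuracy(example_cm))
print(cohen_kappa(example_cm))
print(macro_precision(example_cm))
```

Kappa discounts the agreement that a classifier would reach by chance, which is one reason it is often reported alongside raw accuracy on datasets with many classes.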
Publication Title: IEEE Transactions on Emerging Topics in Computational Intelligence
Volume: 5
Issue: 5
Page Start: 813
Page End: 825
Peer Reviewed: Yes
Collections
This article is part of the following collection of related materials.
UNT Scholarly Works
Materials from the UNT community's research, creative, and scholarly activities and UNT's Open Access Repository. Access to some items in this collection may be restricted.
Sahoo, Suraj Prakash; Ari, Samit; Mahapatra, Kamalakanta & Mohanty, Saraju P. HAR-Depth: A Novel Framework for Human Action Recognition Using Sequential Learning and Depth Estimated History Images, article, August 24, 2020; (https://digital.library.unt.edu/ark:/67531/metadc1913264/: accessed May 1, 2024), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu; crediting UNT College of Engineering.