
ConvNets-based action recognition from skeleton motion maps

Journal Article


Abstract


  • With the advance of deep learning, deep-learning-based action recognition has become an important research topic in computer vision. Skeleton sequences are often encoded into images, such as Joint Trajectory Maps (JTMs), so that Convolutional Neural Networks (ConvNets) can be applied. However, such encodings cannot effectively capture long-term temporal information. To address this problem, this paper presents an effective method to encode the spatial-temporal information of skeleton sequences into color texture images, referred to as Temporal Pyramid Skeleton Motion Maps (TPSMMs); ConvNets are then applied to extract discriminative features from the TPSMMs for human action recognition. The TPSMMs not only capture short-term temporal information but also embed the long-term dynamics over the full duration of an action. The proposed method has been verified and achieves state-of-the-art results on the widely used UTD-MHAD, MSRC-12 Kinect Gesture and SYSU-3D datasets.
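The abstract's core idea, accumulating per-frame joint motion into 2D maps at multiple temporal scales, can be sketched roughly as follows. This is not the authors' implementation: the joint-to-pixel projection, the motion measure (summed absolute frame differences), and the pyramid layout (2^l segments at level l) are all illustrative assumptions, and the paper's actual color-texture encoding is richer than this grayscale sketch.

```python
import numpy as np

def skeleton_motion_map(joints, size=64):
    """Accumulate per-joint frame-to-frame motion into one 2D map.
    joints: array of shape (T, J, 3) -- T frames, J joints, xyz coords.
    The motion measure and projection here are illustrative choices."""
    T, J, _ = joints.shape
    motion = np.abs(np.diff(joints, axis=0))   # (T-1, J, 3) frame-to-frame motion
    mag = motion.sum(axis=2)                   # (T-1, J) motion magnitude per joint
    # Normalize xy joint positions into pixel coordinates.
    xy = joints[1:, :, :2]
    lo, hi = xy.min(axis=(0, 1)), xy.max(axis=(0, 1))
    px = ((xy - lo) / (hi - lo + 1e-8) * (size - 1)).astype(int)
    img = np.zeros((size, size))
    for t in range(T - 1):
        for j in range(J):
            img[px[t, j, 1], px[t, j, 0]] += mag[t, j]
    return img / (img.max() + 1e-8)            # normalize to [0, 1]

def temporal_pyramid_maps(joints, levels=3, size=64):
    """Build motion maps at several temporal scales: level l splits the
    sequence into 2**l segments and maps each segment separately, so
    coarse levels keep long-term dynamics and fine levels keep short-term
    motion (7 maps in total for levels=3: 1 + 2 + 4)."""
    maps = []
    T = joints.shape[0]
    for level in range(levels):
        n_segments = 2 ** level
        for s in range(n_segments):
            # Overlap by one frame so each segment has a motion difference.
            seg = joints[s * T // n_segments : (s + 1) * T // n_segments + 1]
            if seg.shape[0] >= 2:
                maps.append(skeleton_motion_map(seg, size))
    return maps
```

In the paper these maps feed a ConvNet as image input; in this sketch they are plain normalized grayscale arrays that could be stacked or colorized before classification.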

UOW Authors


  •   Chen, Yanfang (external author)
  •   Wang, Liwei (external author)
  •   Li, Chuankun (external author)
  •   Hou, Yonghong (external author)
  •   Li, Wanqing

Publication Date


  • 2019

Citation


  • Chen, Y., Wang, L., Li, C., Hou, Y. & Li, W. (2019). ConvNets-based action recognition from skeleton motion maps. Multimedia Tools and Applications, Online First 1-19.

Scopus Eid


  • 2-s2.0-85074850199

Number Of Pages


  • 18

Start Page


  • 1

End Page


  • 19

Volume


  • Online First

Place Of Publication


  • United States
