
ConvNets-based action recognition from depth maps through virtual cameras and pseudocoloring

Conference Paper


Abstract


  • In this paper, we propose to adopt ConvNets to recognize human actions from depth maps on relatively small datasets based on Depth Motion Maps (DMMs). In particular, three strategies are developed to effectively leverage the capability of ConvNets in mining discriminative features for recognition. Firstly, different viewpoints are mimicked by rotating virtual cameras around the subject, represented by the 3D points of the captured depth maps. This not only synthesizes more data from the captured sequences, but also makes the trained ConvNets view-tolerant. Secondly, DMMs are constructed and further enhanced for recognition by encoding them into pseudo-RGB images, turning the spatio-temporal motion patterns into textures and edges. Lastly, by transferring models originally trained on ImageNet for image classification, three ConvNets are trained independently on the color-coded DMMs constructed in the three orthogonal planes. The proposed algorithm was extensively evaluated on the MSRAction3D, MSRAction3DExt and UTKinect-Action datasets and achieved state-of-the-art results on these datasets.
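The DMM construction summarized in the abstract accumulates the absolute frame-to-frame motion of a depth sequence after projecting each frame onto the front, side, and top planes. The sketch below is an illustrative reconstruction of that standard DMM formulation, not the authors' released code; the binary side/top occupancy projections and the `n_bins` depth quantization are assumptions made for the example.

```python
import numpy as np

def project_three_views(depth, n_bins=64):
    """Project one depth frame onto the three orthogonal planes.

    front: the raw depth map itself (H x W);
    side / top: binary occupancy maps over quantized depth bins
    (n_bins is an illustrative choice, not from the paper).
    """
    H, W = depth.shape
    valid = depth > 0                      # ignore missing-depth pixels
    zmax = depth.max() if depth.max() > 0 else 1.0
    z = np.zeros_like(depth, dtype=int)
    z[valid] = np.minimum(
        (depth[valid] / zmax * (n_bins - 1)).astype(int), n_bins - 1
    )
    front = depth.astype(float)
    side = np.zeros((H, n_bins))
    top = np.zeros((n_bins, W))
    ys, xs = np.nonzero(valid)
    side[ys, z[valid]] = 1.0               # occupancy seen from the side
    top[z[valid], xs] = 1.0                # occupancy seen from above
    return front, side, top

def depth_motion_maps(depth_seq, n_bins=64):
    """DMM_v = sum_i |proj_v(frame_{i+1}) - proj_v(frame_i)| per view v."""
    views = [project_three_views(f, n_bins) for f in depth_seq]
    dmms = [np.zeros_like(v) for v in views[0]]
    for prev, curr in zip(views[:-1], views[1:]):
        for k in range(3):
            dmms[k] += np.abs(curr[k] - prev[k])
    return dict(zip(("front", "side", "top"), dmms))
```

Each of the three resulting maps would then be pseudo-colored (e.g. via a rainbow-style colormap) into an RGB image and fed to its own ImageNet-pretrained ConvNet, per the abstract.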

Authors


  •   Wang, Pichao (external author)
  •   Li, Wanqing
  •   Gao, Zhimin (external author)
  •   Tang, Chang (external author)
  •   Zhang, Jing (external author)
  •   Ogunbona, Philip O.

Publication Date


  • 2015

Citation


  • Wang, P., Li, W., Gao, Z., Tang, C., Zhang, J. & Ogunbona, P. (2015). ConvNets-based action recognition from depth maps through virtual cameras and pseudocoloring. In X. Zhou, A. F. Smeaton & Q. Tian (Eds.), Proceedings of the 23rd ACM international conference on Multimedia (pp. 1119-1122). United States: ACM.

Scopus EID


  • 2-s2.0-84962878607

RO Metadata URL


  • http://ro.uow.edu.au/eispapers/5381

Start Page


  • 1119

End Page


  • 1122

Place Of Publication


  • United States
