A Novel 3D Human Action Recognition Framework for Video Content Analysis.

Lecture Notes in Computer Science (2018)

Abstract
Understanding the meaning of human actions in videos with embedded 3D skeleton data is a new challenge in content-oriented video analysis. In this paper, we propose combining temporal patterns of joint positions with the currently popular Long Short-Term Memory (LSTM) based learning to improve both accuracy and robustness. Since 3D actions are composed of sub-actions, we first propose the Wavelet Temporal Pattern (WTP), which uses a wavelet transform to extract a representation of the temporal pattern of each sub-action. We then define a novel Relation-aware LSTM (R-LSTM) structure that extracts features by modeling the long-term spatio-temporal correlation between body parts. Treating WTP and R-LSTM features as heterogeneous representations of human actions, we next fuse them with an Auto-Encoder network to obtain a more effective action descriptor for classification. Experimental results on the large-scale, challenging NTU-RGB+D dataset and on several other 3D human action datasets, including UT-Kinect and Florence 3D actions, demonstrate the effectiveness of the proposed method.
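To make the idea of a wavelet-based temporal pattern concrete, the following is a minimal sketch that decomposes the trajectory of a single joint coordinate with a multi-level Haar wavelet transform and flattens the coefficients into a feature vector. The abstract does not state which wavelet family, how many decomposition levels, or how sub-actions are segmented, so the Haar wavelet, the `levels` parameter, the function names, and the sample trajectory are all assumptions made for illustration and are not the authors' WTP implementation.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail) coefficient lists, each half as long
    as the input. The input length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Multi-level decomposition, ordered coarse to fine:
    [approx_L, detail_L, ..., detail_1]."""
    coeffs = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        coeffs.insert(0, detail)  # finer bands end up later in the list
    coeffs.insert(0, approx)
    return coeffs


def temporal_pattern_descriptor(trajectory, levels=2):
    """Hypothetical descriptor: concatenate all wavelet bands into one vector."""
    return [c for band in haar_decompose(trajectory, levels) for c in band]


# Assumed example: x-coordinate of one joint over 8 frames of a sub-action.
joint_x = [0.0, 0.1, 0.3, 0.6, 0.6, 0.3, 0.1, 0.0]
features = temporal_pattern_descriptor(joint_x, levels=2)
```

In this sketch the coarse approximation band captures the overall shape of the motion, while the detail bands capture faster changes between neighbouring frames. A full pipeline would apply the same decomposition to every joint coordinate of every sub-action before fusing the result with learned features.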
Keywords
Video analysis, 3D action recognition, Long short-term memory