Development of human motion prediction strategy using inception residual block

arXiv (2023)

Abstract
Human motion prediction is a crucial task in computer vision and robotics, with applications such as human-robot interaction, human action tracking for airport security systems, autonomous car navigation, and computer gaming. However, predicting human motion from past actions is extremely challenging because spatial and temporal features are difficult to detect correctly. We propose an Inception Residual Block (IRB) to detect temporal features in human poses, exploiting its inherent ability to process multiple kernels and thereby capture salient features. Specifically, we use multiple 1-D convolutional neural networks (CNNs) with different kernel sizes and input sequence lengths and concatenate their outputs to obtain a suitable embedding. Because the kernels stride over different receptive fields, they detect smaller and larger salient features at multiple temporal scales. Our main contribution is a residual connection between the input and the output of the inception block, which provides continuity between the previously observed pose and the next predicted pose. With this architecture, the model learns prior knowledge about human poses much better, and we achieve much higher prediction accuracy, as detailed in the paper. We further propose feeding the output of the IRB into a Graph Convolutional Network (GCN) because of its better spatial feature learning capability. We perform a parametric analysis to guide the design of our model. Finally, we evaluate our approach on the Human3.6M and CMU MoCap datasets and compare our short-term and long-term predictions with state-of-the-art methods; our model achieves better results for most poses, for reasons elaborated in the paper.
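The temporal part of the idea (parallel 1-D convolutions with different kernel sizes, concatenation, and a skip connection from input to output) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the kernel sizes (1, 3, 5), the channel counts, the ReLU activation, the zero "same" padding, and the 1x1 projection used to match channel counts before the residual addition are all assumptions made for illustration. For simplicity every branch sees the full input sequence, whereas the abstract also varies the input sequence length per branch.

```python
import random

def conv1d_same(x, w, b):
    """1-D cross-correlation with zero 'same' padding.
    x: [C_in][T], w: [C_out][C_in][K] with K odd, b: [C_out] -> [C_out][T]."""
    c_in, t_len = len(x), len(x[0])
    k = len(w[0][0])
    pad = k // 2
    xp = [[0.0] * pad + list(ch) + [0.0] * pad for ch in x]  # pad the time axis
    out = []
    for o in range(len(w)):
        row = []
        for t in range(t_len):
            acc = b[o]
            for c in range(c_in):
                for j in range(k):
                    acc += w[o][c][j] * xp[c][t + j]
            row.append(acc)
        out.append(row)
    return out

def init_conv(rng, c_out, c_in, k, scale=0.1):
    """Random weights and zero bias for one convolution (hypothetical init)."""
    w = [[[rng.uniform(-scale, scale) for _ in range(k)]
          for _ in range(c_in)] for _ in range(c_out)]
    return w, [0.0] * c_out

def relu(x):
    return [[max(0.0, v) for v in ch] for ch in x]

def make_block(c_in, branch_channels=4, kernel_sizes=(1, 3, 5), seed=0):
    """One conv branch per kernel size, plus a 1x1 projection back to c_in."""
    rng = random.Random(seed)
    branches = [init_conv(rng, branch_channels, c_in, k) for k in kernel_sizes]
    proj = init_conv(rng, c_in, branch_channels * len(kernel_sizes), 1)
    return branches, proj

def inception_residual_block(x, branches, proj):
    """Run all branches on x, concatenate along channels, project, add x."""
    feats = []
    for w, b in branches:
        feats.extend(relu(conv1d_same(x, w, b)))  # channel-wise concatenation
    w_p, b_p = proj
    y = conv1d_same(feats, w_p, b_p)  # 1x1 conv restores the input channel count
    # Residual connection: the output stays anchored to the observed input.
    return [[xv + yv for xv, yv in zip(xc, yc)] for xc, yc in zip(x, y)]

# Toy input: 3 coordinate channels observed over 10 frames.
x = [[float(t + c) for t in range(10)] for c in range(3)]
branches, proj = make_block(c_in=3)
y = inception_residual_block(x, branches, proj)
```

Because of the skip connection, the block only has to model the change relative to the observed sequence; if the projection weights are zero, the output is exactly the input.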
Keywords
Inception module, Human3.6M, Residual connection, Graph convolution network