P-CNN: Pose-based CNN Features for Action Recognition

2015 IEEE International Conference on Computer Vision (ICCV)

Citations: 727 | Views: 255
Abstract
This work targets human action recognition in video. While recent methods typically represent actions by statistics of local video features, here we argue for the importance of a representation derived from human pose. To this end we propose a new Pose-based Convolutional Neural Network descriptor (P-CNN) for action recognition. The descriptor aggregates motion and appearance information along tracks of human body parts. We investigate different schemes of temporal aggregation and experiment with P-CNN features obtained both for automatically estimated and manually annotated human poses. We evaluate our method on the recent and challenging JHMDB and MPII Cooking datasets. For both datasets our method shows consistent improvement over the state of the art.
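The aggregation step described in the abstract can be illustrated with a small sketch. It assumes per-frame feature vectors have already been computed for each body-part track (for example by a CNN applied to part-centered patches) and shows one plausible temporal aggregation scheme: element-wise min and max pooling over frames, concatenated across parts. The function names `aggregate_track` and `video_descriptor`, the dictionary layout, and the choice of min/max pooling with L2 normalization are assumptions made for illustration; the abstract does not specify which schemes were used.

```python
import numpy as np


def aggregate_track(frame_features):
    """Pool the per-frame descriptors of one body-part track over time.

    frame_features: array-like of shape (T, D), one D-dim vector per frame.
    Returns a 2*D vector: element-wise min followed by element-wise max.
    (Min/max pooling is an assumed scheme, chosen for illustration.)
    """
    f = np.asarray(frame_features, dtype=float)
    if f.ndim != 2 or f.shape[0] == 0:
        raise ValueError("expected a non-empty (T, D) array of frame features")
    return np.concatenate([f.min(axis=0), f.max(axis=0)])


def video_descriptor(part_tracks):
    """Concatenate aggregated descriptors of all body parts into one vector.

    part_tracks: dict mapping a (hypothetical) part name to its (T, D)
    per-frame features. Parts are visited in sorted name order so the
    layout of the output vector is deterministic.
    """
    pooled = [aggregate_track(part_tracks[name]) for name in sorted(part_tracks)]
    v = np.concatenate(pooled)
    norm = np.linalg.norm(v)
    # L2-normalize so videos of different lengths are comparable.
    return v / norm if norm > 0 else v


# Toy usage with made-up 2-dim features for two parts.
toy_tracks = {
    "hand": [[1.0, 2.0], [3.0, 0.0]],
    "body": [[0.0, 1.0]],
}
descriptor = video_descriptor(toy_tracks)
```

The resulting fixed-length vector could then be passed to any standard classifier; separate descriptors for appearance and motion inputs would simply be concatenated in the same way.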
Keywords
pose-based CNN features, human action recognition, local video features, pose-based convolutional neural network descriptor, human body part tracking, P-CNN features, manually annotated human poses, JHMDB dataset, MPII Cooking dataset