Actions as points: a simple and efficient detector for skeleton-based temporal action detection

Mach. Vis. Appl. (2023)

Abstract
Temporal action detection (TAD), which aims to determine both the temporal fragment and the category of a human action from a continuous data stream, remains a challenging problem in human–robot interaction, somatosensory games and security monitoring. In this paper, we present a novel one-stage skeleton-based TAD method, Action-CenterNet (ACNet), with a simple anchor-free, fully convolutional encoder-decoder pipeline. Our approach encodes skeleton position and motion sequences from multiple persons into multi-channel skeleton images, which are then preprocessed by view-invariant and translation-scale-invariant transforms. ACNet models each action fragment as a center point along the time dimension and generates a keypoint heatmap to locate and classify action fragments. To obtain accurate temporal coordinates, the discretization error caused by the output stride of the network is also learned. Compared with two-stage methods, ACNet is end-to-end differentiable and flexible. As an anchor-free method, ACNet also avoids the drawbacks of the anchor boxes used in anchor-based TAD methods. Experimental results on the PKU-MMD, NTU RGB-D and HITvs datasets reveal the excellent performance of our approach.
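The center-point idea and the learned discretization error can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Gaussian-shaped heatmap target (borrowed from the general CenterNet recipe), the peak threshold and the stride value are all assumptions made for illustration. It shows how a fragment center is mapped to a low-resolution time index plus a fractional offset, and how adding the offset back before multiplying by the output stride recovers the original temporal coordinate.

```python
import math


def encode_center(start, end, stride):
    """Map a fragment [start, end] (in frames) to a low-resolution
    center index and the fractional offset lost by discretization.
    Hypothetical helper, not taken from the paper."""
    center = (start + end) / 2.0
    scaled = center / stride          # position on the downsampled time axis
    index = int(math.floor(scaled))   # heatmap bin that holds the center
    offset = scaled - index           # discretization error in [0, 1)
    return index, offset


def gaussian_heatmap(length, index, sigma):
    """1D heatmap target peaking at `index` (assumed Gaussian shape)."""
    return [math.exp(-((t - index) ** 2) / (2.0 * sigma ** 2))
            for t in range(length)]


def decode_peaks(heatmap, offsets, stride, threshold=0.5):
    """Return fragment centers (in frames) for local maxima of the heatmap.
    The offset at each peak corrects the stride-induced rounding."""
    centers = []
    for t, score in enumerate(heatmap):
        left = heatmap[t - 1] if t > 0 else float("-inf")
        right = heatmap[t + 1] if t < len(heatmap) - 1 else float("-inf")
        if score >= threshold and score > left and score >= right:
            centers.append((t + offsets[t]) * stride)
    return centers


# Usage: a fragment spanning frames 10..31 with an assumed output stride of 4.
index, offset = encode_center(10, 31, stride=4)
heatmap = gaussian_heatmap(16, index, sigma=1.0)
offsets = [0.0] * 16
offsets[index] = offset
centers = decode_peaks(heatmap, offsets, stride=4)
```

Without the offset term, the decoded center would snap to a multiple of the stride (frame 20 here instead of 20.5), which is the discretization error the abstract refers to. In the method itself, one heatmap channel per action category would let the peak location also carry the class label.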
Keywords
Temporal action detection, Skeleton data, Action-CenterNet, Object detection