Hierarchical Spatial-Temporal Network for Skeleton-Based Temporal Action Segmentation

Chenwei Tan, Tao Sun, Talas Fu, Yuhan Wang, Minjie Xu, Shenglan Liu

PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT X (2024)

Abstract
Skeleton-based Temporal Action Segmentation (TAS) plays an important role in analyzing long videos of motion-centered human actions. Recent approaches model spatial and temporal information simultaneously in a single spatial-temporal topological graph, which incurs high computational cost because of the graph's size. In addition, multi-modal skeleton data carries rich semantic information that has not been fully exploited. This paper proposes a Hierarchical Spatial-Temporal Network (HSTN) for skeleton-based TAS. In HSTN, the Multi-Branch Transfer Fusion (MBTF) module uses a multi-branch graph convolution structure with an attention mechanism to capture spatial dependencies in multi-modal skeleton data. The Multi-Scale Temporal Convolution (MSTC) module then aggregates the spatial information and models temporal information at multiple scales to capture long-range dependencies. Extensive experiments on two challenging datasets show that the proposed method outperforms state-of-the-art (SOTA) methods.
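
To make the idea of multi-scale temporal modeling concrete, the following is a minimal sketch, not the authors' MSTC implementation. The abstract does not specify the module's design, so every detail here is an assumption made for illustration: the function names, the kernel size of 3, the dilation rates (1, 2, 4), and fusing branches by averaging. It applies parallel dilated 1D convolutions to a single per-frame feature sequence; larger dilations cover longer temporal ranges at the same kernel size.

```python
from typing import List, Sequence


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """1D dilated convolution (cross-correlation) over time with zero 'same' padding."""
    center = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t + (i - center) * dilation
            if 0 <= idx < len(x):  # frames outside the sequence count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def multi_scale_temporal(
    x: Sequence[float],
    kernel: Sequence[float] = (1 / 3, 1 / 3, 1 / 3),  # hypothetical smoothing kernel
    dilations: Sequence[int] = (1, 2, 4),              # hypothetical dilation rates
) -> List[float]:
    """Run one dilated branch per scale and fuse the branches by averaging (an assumption)."""
    branches = [dilated_conv1d(x, kernel, d) for d in dilations]
    return [sum(vals) / len(branches) for vals in zip(*branches)]


def receptive_field(kernel_size: int, dilation: int) -> int:
    """Number of frames spanned by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1
```

With a kernel size of 3, the three assumed branches span 3, 5, and 9 frames respectively, which is how a multi-scale design can reach longer-range context without enlarging the kernel. In a real network the kernels would be learned and applied per channel to the spatially aggregated features.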
Keywords
Temporal action segmentation, Multi-modal fusion, Graph convolution