Compact Representation and Reliable Classification Learning for Point-Level Weakly-Supervised Action Localization

IEEE Transactions on Image Processing (2022)

Abstract
Point-level weakly-supervised temporal action localization (P-WSTAL) aims to localize the temporal extents of action instances and identify their categories, using only a single point label per action instance during training. Because frame-level annotations are sparse, most existing models follow the localization-by-classification pipeline. However, this pipeline has two major issues: large intra-action variation caused by the task gap between classification and localization, and noisy classification learning caused by unreliable pseudo training samples. In this paper, we propose a novel framework, CRRC-Net, which introduces a co-supervised feature learning module and a probabilistic pseudo label mining module to address both issues simultaneously. Specifically, the co-supervised feature learning module exploits the complementary information in different modalities to learn more compact feature representations. Furthermore, the probabilistic pseudo label mining module uses the feature distances from action prototypes to estimate the likelihood of pseudo samples and rectify their labels for more reliable classification learning. Comprehensive experiments on different benchmarks show that our method performs favorably against state-of-the-art methods.
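The prototype-distance idea described in the abstract can be illustrated with a minimal sketch. This is not CRRC-Net's implementation: the abstract does not specify how prototypes are built or how distances become likelihoods, so the mean-of-annotated-frames prototypes, the softmax over negative Euclidean distances, the `temperature` and `threshold` parameters, and all function names below are assumptions made for illustration only.

```python
import math

def class_prototypes(features, point_labels):
    """Average the features of point-annotated frames per class (assumed prototype rule).

    features: list of per-frame feature vectors.
    point_labels: dict mapping frame index -> class id (the sparse point labels).
    """
    sums, counts = {}, {}
    for idx, cls in point_labels.items():
        vec = features[idx]
        if cls not in sums:
            sums[cls] = [0.0] * len(vec)
            counts[cls] = 0
        sums[cls] = [s + v for s, v in zip(sums[cls], vec)]
        counts[cls] += 1
    return {cls: [s / counts[cls] for s in sums[cls]] for cls in sums}

def pseudo_label_likelihood(feature, prototypes, temperature=1.0):
    """Softmax over negative Euclidean distances to each prototype (assumed likelihood)."""
    dists = {cls: math.dist(feature, proto) for cls, proto in prototypes.items()}
    # Subtract the smallest distance so the largest exponent is exp(0) = 1.
    d_min = min(dists.values())
    weights = {cls: math.exp(-(d - d_min) / temperature) for cls, d in dists.items()}
    total = sum(weights.values())
    return {cls: w / total for cls, w in weights.items()}

def rectify_label(feature, prototypes, threshold=0.6):
    """Return (class, likelihood) for the most likely class, or (None, likelihood)
    when the likelihood falls below the hypothetical reliability threshold."""
    probs = pseudo_label_likelihood(feature, prototypes)
    best = max(probs, key=probs.get)
    if probs[best] >= threshold:
        return best, probs[best]
    return None, probs[best]

# Toy usage: five 2-D frame features, two of which carry point labels.
features = [[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.0, 4.2], [0.1, 0.1]]
point_labels = {0: 0, 2: 1}
prototypes = class_prototypes(features, point_labels)
print(rectify_label(features[4], prototypes))  # frame close to the class-0 prototype
print(rectify_label([2.0, 2.0], prototypes))   # equidistant frame, treated as unreliable
```

In this sketch a frame that lies close to one prototype receives a confident pseudo label, while a frame equidistant from two prototypes is left unlabeled, which mirrors the stated goal of filtering unreliable pseudo samples before classification learning.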
Keywords
Compact representation, reliable classification, point-level weakly-supervised action localization