Deep feature enhancing and selecting network for weakly supervised temporal action localization

Journal of Visual Communication and Image Representation (2021)

Abstract
Weakly supervised temporal action localization is a challenging computer vision problem in which only video-level labels are available and no temporal annotations are used for supervision. Most existing methods identify only the most discriminative snippets and ignore other relevant ones. To address this problem, we propose a deep feature enhancing and selecting network. It generates multiple masks in order to capture more complete temporal intervals of actions while maintaining high classification accuracy. We further propose a novel selection strategy that balances the influence of the multiple masks and improves model performance. We evaluate the proposed method on the THUMOS’14 and ActivityNet datasets, and the results show the effectiveness of our approach for weakly supervised temporal action localization.
Keywords
Weakly supervised, Temporal action localization, Deep learning