Motion Accumulation and Selection Network for Video Understanding

2022 International Conference on Machine Learning, Cloud Computing and Intelligent Mining (MLCCIM) (2022)

Abstract
Numerous studies have demonstrated the significance of motion information for video understanding. Most previous methods rely heavily on temporal differences of features extracted by convolutional neural networks (CNNs) to represent motion. However, this type of motion representation has two potential drawbacks: 1) the difference operation may leave the contour of the extracted moving target incomplete; 2) treating all extracted motion features equally may allow some motion features to contribute nothing to classification, or even to harm it. To address these two challenges, we propose a motion accumulation and selection network (MAS-Net) based on a novel embedded 2D CNN. Experimental results on typical video datasets show that MAS-Net achieves state-of-the-art performance on both Something-Something V1 and V2 while keeping the computational load relatively low.
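The temporal-difference representation that the abstract attributes to prior methods, and the first drawback it names, can be illustrated with a minimal sketch. This is not MAS-Net and not any specific published model: the function name `temporal_difference`, the (T, C, H, W) layout, and the toy "feature maps" (a uniform square shifted by one pixel) are assumptions made purely for illustration.

```python
import numpy as np


def temporal_difference(features):
    """Subtract each frame's feature map from the next one.

    features: array of shape (T, C, H, W) (assumed layout).
    Returns an array of shape (T - 1, C, H, W).
    """
    features = np.asarray(features, dtype=np.float32)
    if features.ndim != 4 or features.shape[0] < 2:
        raise ValueError("expected shape (T, C, H, W) with T >= 2")
    return features[1:] - features[:-1]


# Toy input (hypothetical): a uniform 3x3 square moves one pixel to the right.
frames = np.zeros((2, 1, 5, 6), dtype=np.float32)
frames[0, 0, 1:4, 1:4] = 1.0  # square at columns 1..3
frames[1, 0, 1:4, 2:5] = 1.0  # square at columns 2..4

motion = temporal_difference(frames)

# Only the trailing and leading edges of the square respond; the overlapping
# interior cancels, so the moving target's outline is only partly recovered.
print(motion[0, 0])
```

In this toy case the difference is nonzero only in the column the square leaves and the column it enters, which is one way to picture the incomplete-contour issue described above.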
Keywords
video understanding, motion accumulation, selection network