Dual Attention Convolutional Network for Action Recognition

IET Image Processing(2020)

引用 12|浏览18
暂无评分
摘要
Action recognition has been an active research area for many years. Extracting discriminative spatial and temporal features of different actions plays a key role in accomplishing this task. Current popular methods of action recognition are mainly based on two-stream Convolutional Networks (ConvNets) or 3D ConvNets. However, the computational cost of two-stream ConvNets is high for the requirement of optical flow while 3D ConvNets takes too much memory because they have a large amount of parameters. To alleviate such problems, the authors propose a Dual Attention ConvNet (DANet) based on dual attention mechanism which consists of spatial attention and temporal attention. The former concentrates on main motion objects in a video frame by using ConvNet structure and the latter captures related information of multiple video frames by adopting self-attention. Their network is entirely based on 2D ConvNet and takes in only RGB frames. Experimental results on UCF-101 and HMDB-51 benchmarks demonstrate that DANet gets comparable results among leading methods, which proves the effectiveness of the dual attention mechanism.
更多
查看译文
关键词
image sequences,video signal processing,image motion analysis,feature extraction,image recognition,convolutional neural nets,image colour analysis
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要