STA-GCN: two-stream graph convolutional network with spatial–temporal attention for hand gesture recognition
The Visual Computer(2020)
摘要
Skeleton-based hand gesture recognition is an active research topic in computer graphics and computer vision and has a wide range of applications in VR/AR and robotics. Although the spatial–temporal graph convolutional network has been successfully used in skeleton-based hand gesture recognition, these works often use a fixed spatial graph according to the hand skeleton tree or use a fixed graph on the temporal dimension, which may not be optimal for hand gesture recognition. In this paper, we propose a two-stream graph attention convolutional network with spatial–temporal attention for hand gesture recognition. We adopt pose stream and motion stream as the two input streams for our network. In pose stream, we use the joint in each frame as the input; In motion stream, we use the joint offsets between neighboring frames as the input. We propose a new temporal graph attention module to model the temporal dependency and also use a spatial graph attention module to construct dynamic skeleton graph. For each stream, we adopt graph convolutional network with spatial–temporal attention to extract the features. Then, we concatenate the feature of the pose stream and motion stream for gesture recognition. We achieve the competitive performance on the main hand gesture recognition benchmark datasets, which demonstrates the effectiveness of our method.
更多查看译文
关键词
Hand gesture recognition, Graph convolutional network, Spatial-temporal attention
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络