Mining Spatial and Spatio-Temporal ROIs for Action Recognition.

dblp(2016)

Abstract
Author(s): Lian, Xiaochen | Advisor(s): Yuille, Alan Loddon

In this paper, we propose an approach to classify action sequences. We observe that in action sequences the critical features for discriminating between actions occur only within sub-regions of the image. Hence deep network approaches that address the entire image are at a disadvantage. This motivates our strategy, which uses static and spatio-temporal visual cues to isolate static and spatio-temporal regions of interest (ROIs). We then use weakly supervised learning to train deep network classifiers using the ROIs as input. More specifically, we combine multiple instance learning (MIL) with convolutional neural networks (CNNs) to select discriminative action cues. This yields classifiers for static images, using the static ROIs, as well as classifiers for short image sequences (16 frames), using spatio-temporal ROIs. Extensive experiments performed on the UCF101 and HMDB51 benchmarks show that both types of classifiers perform well individually and achieve state-of-the-art performance when combined. We also show qualitatively that the ROIs selected by our algorithms capture the most relevant parts of the image sequences.
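The MIL idea mentioned in the abstract can be illustrated with a small sketch. It treats each video as a bag of ROIs with only a video-level label, aggregates per-ROI class scores into a bag-level score, and trains on the bag. Max-pooling over ROIs is an assumption here (a common MIL aggregation); the abstract does not say which aggregation is used, and the function names and scores below are hypothetical. In practice the per-ROI scores would come from a CNN rather than being hard-coded.

```python
import math

def mil_bag_scores(roi_scores):
    """Aggregate per-ROI class scores into one bag-level score per class.

    roi_scores has shape [num_rois][num_classes].
    Assumption: max-pooling over ROIs; the source does not specify this.
    """
    num_classes = len(roi_scores[0])
    return [max(roi[c] for roi in roi_scores) for c in range(num_classes)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def select_roi(roi_scores, label):
    """Index of the ROI that scores highest for the given class."""
    return max(range(len(roi_scores)), key=lambda i: roi_scores[i][label])

def bag_loss(roi_scores, label):
    """Cross-entropy on bag-level probabilities (needs only the video label)."""
    probs = softmax(mil_bag_scores(roi_scores))
    return -math.log(probs[label])

# Hypothetical scores for 3 ROIs and 2 action classes.
scores = [[0.1, 2.0], [1.5, 0.3], [0.2, 0.4]]
bag = mil_bag_scores(scores)        # one score per class for the whole bag
best = select_roi(scores, label=1)  # ROI most responsible for class 1
```

With max-pooling, only the highest-scoring ROI per class contributes to the bag loss, which is one way a weakly supervised model can end up selecting discriminative regions without ROI-level labels.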
Keywords
Convolutional neural network,Supervised learning,Discriminative model,Pattern recognition,Sensory cue,Machine learning,Image (mathematics),Computer science,Action recognition,Artificial intelligence