Scene Categorization by Deeply Learning Gaze Behavior in a Semisupervised Context

IEEE Transactions on Cybernetics (2021)

Abstract
Accurately recognizing scene categories with sophisticated spatial configurations is a useful technique in computer vision and intelligent systems, e.g., scene understanding and autonomous driving. Deep recognition models have recently achieved competitive accuracies. Nevertheless, these deep architectures cannot explicitly characterize human visual perception, that is, the sequence of gaze allocation and the subsequent cognitive processes when viewing each scene. In this paper, a novel spatially aware aggregation network is proposed for scene categorization, where human gaze behavior is discovered in a semisupervised setting. In particular, because semantically labeling a large quantity of scene images is labor-intensive, a semisupervised and structure-preserved non-negative matrix factorization (NMF) is proposed to detect a set of visually/semantically salient regions in each scene. Afterward, a gaze shifting path (GSP) is constructed to characterize how humans perceive each scene picture. To deeply describe each GSP, a novel spatially aware CNN termed SA-Net is developed. It accepts input regions of various shapes and statistically aggregates all the salient regions along each GSP. Finally, the deep GSP features learned from all the scene images are fused into an image kernel, which is subsequently integrated into a kernel SVM to categorize different scenes. Comparative experiments on six scene image sets have shown the advantage of our method.
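
For orientation, the sketch below shows plain, unsupervised NMF with the standard multiplicative updates for the squared Frobenius error. It is not the semisupervised, structure-preserved variant proposed in the paper, whose additional terms are not given in the abstract. The toy matrix X, the rank, and all function names are illustrative assumptions.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def frobenius_error(x, w, h):
    """Squared Frobenius norm of X - WH."""
    wh = matmul(w, h)
    return sum((xv - av) ** 2 for xr, ar in zip(x, wh) for xv, av in zip(xr, ar))

def nmf(x, rank, iters=200, seed=0, eps=1e-9):
    """Plain NMF (X ~ WH, W >= 0, H >= 0) via multiplicative updates.

    Assumption: this is the generic baseline only; no label information
    or structure-preserving regularizer is included.
    """
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    # Strictly positive random initialization keeps the updates well defined.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), elementwise
        wt = transpose(w)
        num = matmul(wt, x)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (X H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num = matmul(x, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

# Hypothetical toy data: rows could stand for image regions, columns for features.
X = [[1.0, 0.0, 2.0],
     [0.0, 3.0, 1.0],
     [2.0, 0.0, 4.0],
     [0.0, 6.0, 2.0]]

W0, H0 = nmf(X, rank=2, iters=0)   # initial random factors
W, H = nmf(X, rank=2, iters=200)   # factors after the updates
err_before = frobenius_error(X, W0, H0)
err_after = frobenius_error(X, W, H)
```

In a setting like the one described, the rows of W would indicate how strongly each region loads on each latent component, which is the kind of signal a saliency detector could threshold; how the paper actually selects salient regions is not stated here.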
Keywords
Deep model, machine learning, non-negative matrix factorization (NMF), scene categorization, semisupervised