Deep Global-Local Gazing: Including Global Scene Properties in Local Saliency Computation

Mobile Information Systems (2022)

Abstract
Visual saliency models imitate the attentive mechanism of the human visual system (HVS) to detect objects that stand out from their neighbors in a scene. Biological phenomena in the HVS, such as contextual cueing effects, suggest that contextual information from the whole scene guides the attentive mechanism: the saliency value of each image patch is influenced by its local visual features as well as by the context of the whole scene. Modern saliency models are based on deep convolutional neural networks. Because convolutional operators act locally and share weights, such networks inherently have difficulty capturing global and location-dependent features; moreover, these models compute saliency pixel-wise from local features. It is therefore necessary to supply global features alongside local ones. In this regard, we propose two approaches for capturing contextual information from the scene. In the first, we introduce a shift-variant fully connected component that captures global and location-dependent information. In the second, instead of the native CNN of our base model, we use a VGGNet to capture the global context of the scene. To show the effectiveness of our methods, we use them to extend the SAM-ResNet saliency model. We evaluate the proposed approaches on four challenging saliency benchmark datasets; the experimental results show that our methods outperform existing state-of-the-art saliency prediction models.
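As a concrete illustration of the first approach, a shift-variant fully connected branch can be sketched in PyTorch as below. This is a minimal sketch under assumed names and shapes (the class ShiftVariantGlobalBranch, the global_dim size, and the concatenation readout are all hypothetical), not the authors' actual SAM-ResNet extension.

```python
import torch
import torch.nn as nn


class ShiftVariantGlobalBranch(nn.Module):
    """Hypothetical sketch of a shift-variant fully connected global branch."""

    def __init__(self, in_channels: int, height: int, width: int, global_dim: int = 128):
        super().__init__()
        # A single Linear layer over the flattened feature map gives every
        # spatial position its own weights: no weight sharing, hence
        # shift-variant, unlike a convolution.
        self.fc = nn.Linear(in_channels * height * width, global_dim)

    def forward(self, local_feats: torch.Tensor) -> torch.Tensor:
        b, _, h, w = local_feats.shape
        # Global, location-dependent descriptor of the whole scene.
        global_vec = torch.relu(self.fc(local_feats.flatten(start_dim=1)))
        # Broadcast the descriptor to every position and concatenate it with
        # the local features, so a per-pixel saliency readout sees both.
        global_map = global_vec[:, :, None, None].expand(-1, -1, h, w)
        return torch.cat([local_feats, global_map], dim=1)


# Usage sketch: a 512-channel, 30x40 feature map from some backbone.
feats = torch.randn(2, 512, 30, 40)
branch = ShiftVariantGlobalBranch(in_channels=512, height=30, width=40)
out = branch(feats)  # shape: (2, 512 + 128, 30, 40)
```

Because the linear layer consumes the flattened map without weight sharing, the same feature placed at two different locations yields different global descriptors, which is exactly the location dependence the abstract argues plain convolutions lack.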
Keywords
global-local saliency computation, global-local scene properties