Learning Important Spatial Pooling Regions for Scene Classification

CVPR, pp. 3726-3733, 2014


Abstract

We address the false response influence problem when learning and applying discriminative parts to construct the mid-level representation in scene classification. It is often caused by the complexity of latent image structure when convolving part filters with input images. This problem makes mid-level representation, even after pooling, n…

Introduction
  • Finding discriminative parts [23] to construct a mid-level representation is one of the main streams in scene classification.
  • Discriminative parts, such as beds in bedrooms, washing machines in laundries, and other distinct components, are important for identifying scenes.
  • They are generally more useful than considering all pixels in an image simultaneously.
  • The constructed mid-level representation is the input to discriminative classifiers, e.g., a linear SVM (a minimal pipeline sketch follows this list).
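The summary above only names the pipeline stages, so the following is a minimal sketch of a generic part-based pipeline, not the authors' implementation: part filters are correlated with a dense feature map, the responses are pooled over spatial-pyramid cells, the pooled values are concatenated into a mid-level vector, and a linear classifier is trained on top. The HOG-like feature maps, the two-level pooling grid, and the `LinearSVC` classifier are illustrative assumptions.

```python
import numpy as np

def part_response_map(feature_map, part_filter):
    """Correlate one part filter with a dense feature map (valid positions).

    feature_map: (H, W, D) array of local descriptors (e.g., HOG cells).
    part_filter: (h, w, D) learned part template.
    Returns an (H - h + 1, W - w + 1) response map.
    """
    H, W, _ = feature_map.shape
    h, w, _ = part_filter.shape
    out = np.empty((H - h + 1, W - w + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(feature_map[y:y + h, x:x + w] * part_filter)
    return out

def spm_max_pool(response, levels=(1, 2)):
    """Max-pool a response map over spatial-pyramid grids (1x1 and 2x2 here)."""
    pooled = []
    for n in levels:
        rows = np.array_split(np.arange(response.shape[0]), n)
        cols = np.array_split(np.arange(response.shape[1]), n)
        pooled.extend(response[np.ix_(r, c)].max() for r in rows for c in cols)
    return np.asarray(pooled)

def mid_level_representation(feature_map, part_filters):
    """Concatenate the pooled responses of all parts into one vector."""
    return np.concatenate(
        [spm_max_pool(part_response_map(feature_map, f)) for f in part_filters]
    )

# Hypothetical usage with a linear classifier on top:
#   from sklearn.svm import LinearSVC
#   X = np.stack([mid_level_representation(m, filters) for m in feature_maps])
#   clf = LinearSVC(C=1.0).fit(X, labels)
```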
Highlights
  • Finding discriminative parts [23] to construct a mid-level representation is one of the main streams in scene classification.
  • The mid-level representation is built upon the part response map, similar to Spatial Pyramid Matching (SPM) [12].
  • False response is neglected after learning discriminative parts and constructing the mid-level representation in many systems. We address this issue by introducing important spatial pooling regions (ISPRs), visualized in Figure 1, which are learned jointly with discriminative part appearance in a unified optimization framework (a weighted-pooling sketch follows this list).
  • Noticing that previous methods still ignore the adverse impact of false response when constructing the image representation, we develop a new scheme with better suppression of false response in order to generate a more discriminative mid-level representation.
  • We have presented a useful model that utilizes part appearance and spatial configuration to improve scene classification.
  • The important spatial pooling regions proposed in this framework encourage spatial pooling to be performed more adaptively to resist false response.
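As a rough illustration of how a learned importance map could act during pooling: responses at locations the ISPR marks as unlikely are downweighted before pooling, so spurious filter activations contribute little to the representation. This is a sketch only; the paper's unified optimization that learns part filters and ISPRs jointly is not reproduced here, and the 4x4 grid and max pooling are assumptions.

```python
import numpy as np

def ispr_weighted_pool(response, importance, grid=(4, 4)):
    """Pool one part's response map with a learned spatial importance map.

    response:   (H, W) part-filter response map.
    importance: (H, W) non-negative weights standing in for the part's learned
                ISPR; near-zero weights suppress responses at locations where
                the part is unlikely, i.e., probable false responses.
    grid:       pooling layout; the 4x4 grid is an illustrative choice.
    Returns one pooled value per grid cell.
    """
    weighted = response * importance  # downweight implausible locations
    rows = np.array_split(np.arange(weighted.shape[0]), grid[0])
    cols = np.array_split(np.arange(weighted.shape[1]), grid[1])
    return np.asarray(
        [weighted[np.ix_(r, c)].max() for r in rows for c in cols]
    )
```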
Methods
  • Single-feature approaches compared: ROI [23], MM-scene [38], DPM [20], CENTRIST [33], Object Bank [15], RBoW [21], Patches [27], Hybrid-Parts [37], LPR [25], BoP [8], VC [17], VQ [17], Mode Seeking [4], and ISPR.
  • Multi-feature approaches compared: RSP [7], SP [12], SPMSM [11], Classemes [29], HIK [32], LScSPM [6], LPR [25], Hybrid-Parts + GIST-color + SP [37], LCSR [26], VC + VQ [17], ISPR, IFV [30], and ISPR + IFV.
  • The dimensionality of the mid-level representation is 2 × 64 × 15 = 1,920 (see the quick check below).
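A quick check of the stated size. The reading of the three factors (2 pooled statistics per part and region, 64 learned parts, 15 spatial regions) is an assumption; only the product, 1,920, comes from the summary.

```python
# Hypothetical factor interpretation; only the product is stated in the text.
n_stats, n_parts, n_regions = 2, 64, 15
dim = n_stats * n_parts * n_regions
assert dim == 1920
print(dim)  # 1920
```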
Results
  • Table 2 shows the difference: the proposed solution achieves 68.50% accuracy.
Conclusion
  • The authors have presented a useful model that utilizes part appearance and spatial configuration to improve scene classification.
  • The ISPRs proposed in this framework encourage spatial pooling to be performed more adaptively to resist false response.
  • Spatial information extracted from ISPRs enhances the discriminative power of the mid-level representation in classification.
  • The authors have evaluated the method on several representative datasets.
  • Using low-level features together with the new model results in high classification accuracy.
Tables
  • Table 1: Average classification accuracy of single-feature approaches on the MIT-Indoor dataset
  • Table 2: Average classification accuracy of state-of-the-art approaches fusing multiple features and our mid-level representation combined with IFV on the MIT-Indoor dataset
  • Table 3: Average classification accuracies on the 15-Scene dataset
  • Table 4: Average classification accuracies on the UIUC 8-Sport dataset
Related Work
  • Discovering discriminative parts is an effective technique for scene classification. The term “discriminative part” was originally introduced in object recognition [5]. As explained in [23], a scene can also be regarded as a combination of parts, called regions of interest (ROIs). Because discriminative parts provide a powerful representation of scenes, exploiting them has drawn much attention recently.

    This type of method can be understood in three ways. First, the discriminative power of learned parts is used to alleviate visual ambiguity: recent work [8, 24, 15, 16, 27, 17] discovered parts tied to specific visual concepts, that is, each learned part is expected to represent a cluster of visual objects. Second, unsupervised discovery of discriminative parts is the dominant approach: although handcrafted part filters are easier to comprehend, unsupervised frameworks [17, 28, 9, 10, 13, 20, 21, 25, 37] are more practical and efficient, especially for large-volume data.
Funding
  • This work is supported by a grant from the Research Grants Council of the Hong Kong SAR (project no. 413113) and by the NSF of China (key project no. 61133009).
References
  • [1] A. Bosch, A. Zisserman, and X. Muñoz. Scene classification using a hybrid generative/discriminative approach. PAMI, 2008.
  • [2] Y.-L. Boureau, F. Bach, Y. LeCun, and J. Ponce. Learning mid-level features for recognition. In CVPR, 2010.
  • [3] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR, 2005.
  • [4] C. Doersch, A. Gupta, and A. A. Efros. Mid-level visual element discovery as discriminative mode seeking. In NIPS, 2013.
  • [5] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. PAMI, 2010.
  • [6] S. Gao, I. W. Tsang, L.-T. Chia, and P. Zhao. Local features are not lonely – Laplacian sparse coding for image classification. In CVPR, 2010.
  • [7] Y. Jiang, J. Yuan, and G. Yu. Randomized spatial partition for scene recognition. In ECCV, 2012.
  • [8] M. Juneja, A. Vedaldi, C. Jawahar, and A. Zisserman. Blocks that shout: Distinctive parts for scene classification. In CVPR, 2013.
  • [9] H. Kang, M. Hebert, and T. Kanade. Discovering object instances from scenes of daily living. In ICCV, 2011.
  • [10] G. Kim and A. Torralba. Unsupervised detection of regions of interest using iterative link analysis. In NIPS, 2009.
  • [11] R. Kwitt, N. Vasconcelos, and N. Rasiwasia. Scene recognition on the semantic manifold. In ECCV, 2012.
  • [12] S. Lazebnik, C. Schmid, and J. Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In CVPR, 2006.
  • [13] Y. J. Lee and K. Grauman. Object-graphs for context-aware category discovery. In CVPR, 2010.
  • [14] L.-J. Li and L. Fei-Fei. What, where and who? Classifying events by scene and object recognition. In ICCV, 2007.
  • [15] L.-J. Li, H. Su, L. Fei-Fei, and E. P. Xing. Object bank: A high-level image representation for scene classification & semantic feature sparsification. In NIPS, 2010.
  • [16] L.-J. Li, H. Su, Y. Lim, and L. Fei-Fei. Objects as attributes for scene classification. In ECCV, 2012.
  • [17] Q. Li, J. Wu, and Z. Tu. Harvesting mid-level visual concepts from large-scale internet images. In CVPR, 2013.
  • [18] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 2004.
  • [19] A. Oliva and A. Torralba. Modeling the shape of the scene: A holistic representation of the spatial envelope. IJCV, 2001.
  • [20] M. Pandey and S. Lazebnik. Scene recognition and weakly supervised object localization with deformable part-based models. In ICCV, 2011.
  • [21] S. N. Parizi, J. G. Oberlin, and P. F. Felzenszwalb. Reconfigurable models for scene recognition. In CVPR, 2012.
  • [22] F. Perronnin, Y. Liu, J. Sanchez, and H. Poirier. Large-scale image retrieval with compressed Fisher vectors. In CVPR, 2010.
  • [23] A. Quattoni and A. Torralba. Recognizing indoor scenes. In CVPR, 2009.
  • [24] B. C. Russell, W. T. Freeman, A. A. Efros, J. Sivic, and A. Zisserman. Using multiple segmentations to discover objects and their extent in image collections. In CVPR, 2006.
  • [25] F. Sadeghi and M. F. Tappen. Latent pyramidal regions for recognizing scenes. In ECCV, 2012.
  • [26] A. Shabou and H. LeBorgne. Locality-constrained and spatially regularized coding for scene categorization. In CVPR, 2012.
  • [27] S. Singh, A. Gupta, and A. A. Efros. Unsupervised discovery of mid-level discriminative patches. In ECCV, 2012.
  • [28] S. Todorovic and N. Ahuja. Unsupervised category modeling, recognition, and segmentation in images. PAMI, 2008.
  • [29] L. Torresani, M. Szummer, and A. Fitzgibbon. Efficient object category recognition using classemes. In ECCV, 2010.
  • [30] A. Vedaldi and B. Fulkerson. VLFeat: An open and portable library of computer vision algorithms. http://www.vlfeat.org/, 2008.
  • [31] J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, and Y. Gong. Locality-constrained linear coding for image classification. In CVPR, 2010.
  • [32] J. Wu and J. M. Rehg. Beyond the Euclidean distance: Creating effective visual codebooks using the histogram intersection kernel. In ICCV, 2009.
  • [33] J. Wu and J. M. Rehg. CENTRIST: A visual descriptor for scene categorization. PAMI, 2011.
  • [34] J. Yang, K. Yu, Y. Gong, and T. Huang. Linear spatial pyramid matching using sparse coding for image classification. In CVPR, 2009.
  • [35] B. Yao, G. Bradski, and L. Fei-Fei. A codebook-free and annotation-free approach for fine-grained image categorization. In CVPR, 2012.
  • [36] J. Yuan, M. Yang, and Y. Wu. Mining discriminative co-occurrence patterns for visual recognition. In CVPR, 2011.
  • [37] Y. Zheng, Y.-G. Jiang, and X. Xue. Learning hybrid part filters for scene recognition. In ECCV, 2012.
  • [38] J. Zhu, L.-J. Li, L. Fei-Fei, and E. P. Xing. Large margin learning of upstream scene understanding models. In NIPS, 2010.