Recognizing Landmarks In Large-Scale Social Image Collections

LARGE-SCALE VISUAL GEO-LOCALIZATION (2016)

Abstract
The dramatic growth of social media websites over the last few years has created huge collections of online images and raised new challenges in organizing them effectively. One particularly intuitive way of browsing and searching images is by the geo-spatial location where they were taken, but most online images do not have GPS metadata associated with them. We consider the problem of recognizing popular landmarks in large-scale datasets of unconstrained consumer images by formulating a classification problem involving nearly 2 million images and 500 categories. The dataset and categories are formed automatically from geo-tagged photos from Flickr by looking for peaks in the spatial geo-tag distribution corresponding to frequently photographed landmarks. We learn models for these landmarks with a multiclass support vector machine, using classic vector-quantized interest point descriptors as features. We also incorporate the nonvisual metadata available on modern photo-sharing sites, showing that textual tags and temporal constraints lead to significant improvements in classification rate. Finally, we apply recent breakthroughs in deep learning with Convolutional Neural Networks, finding that these models can dramatically outperform the traditional recognition approaches to this problem, and even beat human observers in some cases. (This is an expanded and updated version of an earlier conference paper [23].)
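As a rough illustration of how categories might be formed from peaks in the geo-tag distribution, the sketch below bins photo coordinates into a fixed latitude/longitude grid and keeps the most populated cells. This is a simplified stand-in, not necessarily the authors' procedure: the abstract does not specify the peak-finding method, and the function name `top_geotag_peaks`, the grid size `cell_deg`, and the cutoff `k` are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def top_geotag_peaks(points, cell_deg=0.001, k=3):
    """Return the k most photographed grid cells as (cell_center, photo_count).

    Assumption: a fixed lat/lon grid approximates "peaks" in the geo-tag
    distribution. The grid size and k are illustrative, not from the paper.
    """
    counts = Counter()
    for lat, lon in points:
        # Quantize each coordinate to an integer grid index.
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1

    peaks = []
    for (i, j), n in counts.most_common(k):
        # Report the center of the cell as the candidate landmark location.
        center = ((i + 0.5) * cell_deg, (j + 0.5) * cell_deg)
        peaks.append((center, n))
    return peaks
```

Under this sketch, each returned cell would serve as one landmark category, and the geo-tagged photos falling inside it would become that category's labeled examples.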