Concept-Level Multimodal Ranking of Flickr Photo Tags via Recall Based Weighting.

MMCommons @ ACM Multimedia(2016)

引用 19|浏览22
暂无评分
摘要
Social media platforms allow users to annotate photos with tags that significantly facilitate an effective semantics understanding, search, and retrieval of photos. However, due to the manual, ambiguous, and personalized nature of user tagging, many tags of a photo are in a random order and even irrelevant to the visual content. Aiming to automatically compute tag relevance for a given photo, we propose a tag ranking scheme based on voting from photo neighbors derived from multimodal information. Specifically, we determine photo neighbors leveraging geo, visual, and semantics concepts derived from spatial information, visual content, and textual metadata, respectively. We leverage high-level features instead traditional low-level features to compute tag relevance. Experimental results on a representative set of 203,840 photos from the YFCC100M dataset confirm that above-mentioned multimodal concepts complement each other in computing tag relevance. Moreover, we explore the fusion of multimodal information to refine tag ranking leveraging recall based weighting. Experimental results on the representative set confirm that the proposed algorithm outperforms state-of-the-arts.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要