Knowledge Aided Consistency for Weakly Supervised Phrase Grounding

2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (2018)

Abstract
Given a natural language query, a phrase grounding system aims to localize the mentioned objects in an image. In the weakly supervised scenario, the mapping between image regions (i.e., proposals) and language is not available in the training set. Previous methods address this deficiency by training a grounding system to reconstruct the language information contained in input queries from predicted proposals. However, the optimization is guided solely by the reconstruction loss from the language modality, and ignores the rich visual information contained in proposals and useful cues from external knowledge. In this paper, we explore the consistency contained in both visual and language modalities, and leverage complementary external knowledge to facilitate weakly supervised grounding. We propose a novel Knowledge Aided Consistency Network (KAC Net), which is optimized by reconstructing both the input query and the proposal's information. To leverage complementary knowledge contained in the visual features, we introduce a Knowledge Based Pooling (KBP) gate to focus on query-related proposals. Experiments show that KAC Net provides a significant improvement on two popular datasets.
Keywords
weakly supervised phrase,natural language query,phrase grounding system,weakly supervised scenario,image regions,training set,language information,input queries,predicted proposals,reconstruction loss,language modality,leverage complementary external knowledge,weakly supervised grounding,Consistency Network,KAC Net,leverage complementary knowledge,visual features,query-related proposals,visual information