Structured Matching For Phrase Localization

COMPUTER VISION - ECCV 2016, PT VIII(2016)

引用 100|浏览289
暂无评分
摘要
In this paper we introduce a new approach to phrase localization: grounding phrases in sentences to image regions. We propose a structured matching of phrases and regions that encourages the semantic relations between phrases to agree with the visual relations between regions. We formulate structured matching as a discrete optimization problem and relax it to a linear program. We use neural networks to embed regions and phrases into vectors, which then define the similarities (matching weights) between regions and phrases. We integrate structured matching with neural networks to enable end-to-end training. Experiments on Flickr30K Entities demonstrate the empirical effectiveness of our approach.
更多
查看译文
关键词
Vision,Language
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要