Finding Images by Dialoguing with Image

Proceedings of the 27th ACM International Conference on Multimedia (2019)

Abstract
Image retrieval in complicated scenes is a challenging task that requires a comprehensive understanding of an image. In this paper, we propose a scene graph based image retrieval framework that combines scene graph generation with image retrieval and fine-tunes the search results via a dialogue mechanism. Specifically, we propose an image retrieval oriented scene graph generation model that takes an image and a text describing the image as inputs. The additional text input is used to control the generated scene graph: it provides information for a newly introduced attributes head to better predict the attributes, and at the same time helps construct an adjacency matrix. A Graph Convolutional Network is then used to gather information among nodes for precise relation estimation. Moreover, the scene graph can be modified by changing the text. Our proposed approach achieves state-of-the-art performance in both scene graph based image retrieval and scene graph generation on the Visual Genome dataset.
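To make the "gather information among nodes" step concrete, here is a minimal sketch of one graph-convolution layer over an adjacency matrix. It uses the common symmetric normalisation with self-loops, which is an assumption: the abstract does not state the exact propagation rule the authors use. The function name gcn_layer, the three-node toy graph, and the identity features and weights are all hypothetical and chosen only for illustration.

```python
import math

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 . H . W).

    adj    : n x n adjacency matrix (non-negative entries), as nested lists
    feats  : n x d_in node feature matrix H
    weight : d_in x d_out weight matrix W (learned in practice, fixed here)
    """
    n = len(adj)
    # Add self-loops so every node also keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Symmetric degree normalisation of the adjacency matrix.
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate neighbour features: norm @ feats.
    d_in = len(feats[0])
    agg = [[sum(norm[i][k] * feats[k][f] for k in range(n))
            for f in range(d_in)] for i in range(n)]
    # Linear transform followed by ReLU: relu(agg @ weight).
    d_out = len(weight[0])
    return [[max(0.0, sum(agg[i][f] * weight[f][o] for f in range(d_in)))
             for o in range(d_out)] for i in range(n)]

# Hypothetical toy scene graph: node 0 = "man", 1 = "horse", 2 = "hat".
# Edges man-horse and man-hat stand in for text-derived adjacency.
adj = [[0, 1, 1],
       [1, 0, 0],
       [1, 0, 0]]
identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
out = gcn_layer(adj, identity, identity)
```

With identity features and weights the output is simply the normalised adjacency matrix, so "horse" and "hat" exchange no information directly in a single layer; they only influence each other through "man" once a second layer is stacked.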
Keywords
deep learning, image retrieval, neural networks, scene graph generation