Structured Semantic Representation For Visual Question Answering

Dongchen Yu, Xing Gao, Hongkai Xiong

2018 25th IEEE International Conference on Image Processing (ICIP), 2018

Abstract
A number of models have been proposed to capture rich semantic representations in Visual Question Answering (VQA). In this paper, we illustrate the compositionality of general cognitive ability in VQA and take the linguistic structure of language into account in the semantic representation. We decompose the question into several components using a semantic tree and apply a tree-structured model to distill the sentence representation. In addition, we exploit the complementary images of the new dataset and optimize the classifier used to predict answers. We design a dual-path network, used during training on the new VQA 2.0 dataset, that leads the model to take effective advantage of the property of the dataset. Experiments show that our method obtains more useful information and improves performance.
Keywords
Structured semantic representation, Tree-LSTM, Visual question answering
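
The tree-structured model named in the abstract and keywords (Tree-LSTM) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which Tree-LSTM variant is used, so the Child-Sum formulation is assumed here, and the `Node` and `ChildSumTreeLSTM` classes, the dimensions, and the random weights are all hypothetical. Each node of the question's semantic tree combines its own word vector with the states of its children, and the root's hidden state serves as the sentence representation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class Node:
    """Hypothetical parse-tree node: a word vector plus child nodes."""

    def __init__(self, x, children=()):
        self.x = x
        self.children = list(children)


class ChildSumTreeLSTM:
    """Assumed variant: Child-Sum Tree-LSTM with untrained random weights."""

    def __init__(self, in_dim, hid_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.hid_dim = hid_dim
        # One (W, U, b) triple per gate: input, forget, output, update.
        self.W = {g: rng.normal(0, 0.1, (hid_dim, in_dim)) for g in "ifou"}
        self.U = {g: rng.normal(0, 0.1, (hid_dim, hid_dim)) for g in "ifou"}
        self.b = {g: np.zeros(hid_dim) for g in "ifou"}

    def forward(self, node):
        """Return (h, c) for the subtree rooted at node, computed bottom-up."""
        states = [self.forward(child) for child in node.children]
        # Sum of the children's hidden states (zero vector for a leaf).
        h_sum = sum((h for h, _ in states), np.zeros(self.hid_dim))

        def gate(g, h):
            return self.W[g] @ node.x + self.U[g] @ h + self.b[g]

        i = sigmoid(gate("i", h_sum))
        o = sigmoid(gate("o", h_sum))
        u = np.tanh(gate("u", h_sum))
        c = i * u
        # Each child gets its own forget gate, driven by that child's h.
        for h_k, c_k in states:
            c = c + sigmoid(gate("f", h_k)) * c_k
        h = o * np.tanh(c)
        return h, c
```

Calling `ChildSumTreeLSTM(in_dim, hid_dim).forward(root)` on the root of a tree returns the pair `(h, c)`; in a VQA pipeline the vector `h` would then be combined with image features before the answer classifier, a step this sketch does not cover.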