DSAMR: Dual-Stream Attention Multi-hop Reasoning for Knowledge-Based Visual Question Answering
Expert Systems with Applications (2024)
Abstract
Knowledge-based visual question answering aims to draw on external knowledge facts to answer questions about images. Most existing methods emphasize high-order associations between knowledge facts and questions but fail to consider the negative effects of unnecessary knowledge facts in multi-hop reasoning. In this paper, we propose a Dual-Stream Attention Multi-hop Reasoning (DSAMR) architecture that constructs two different attention streams to mitigate the influence of unnecessary knowledge facts. This dual-stream mechanism enables the model to reduce the attention weights on unnecessary knowledge while gathering essential knowledge by learning the implicit correlations between knowledge facts and questions. In addition, we design a hypergraph knowledge extraction module within the architecture that extracts optimal knowledge facts by evaluating the relevance of each knowledge fact to the question. Experimental results demonstrate the effectiveness of our method on both the knowledge-based visual question answering dataset KVQA and the multi-hop question answering dataset PathQuestion.
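The general idea of scoring each knowledge fact against the question and down-weighting the less relevant ones can be illustrated with a minimal sketch. This is not the DSAMR implementation: the abstract does not specify the hypergraph module or the two attention streams in detail, so the sketch substitutes plain cosine similarity, top-k selection, and a single softmax. The function names, the toy vectors, and the value of `k` are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def select_facts(question_vec, fact_vecs, k):
    """Score every fact against the question and keep the k most relevant.

    Hypothetical stand-in for a relevance-based knowledge extraction step;
    returns the kept fact indices (in original order) and their scores.
    """
    scores = [cosine(question_vec, f) for f in fact_vecs]
    ranked = sorted(range(len(fact_vecs)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:k])
    return kept, [scores[i] for i in kept]

def softmax(xs):
    """Numerically stable softmax, used here to turn scores into attention weights."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Toy embeddings (assumed values, not taken from KVQA or PathQuestion).
question = [1.0, 0.0, 1.0]
facts = [
    [1.0, 0.0, 0.9],    # closely related to the question
    [0.0, 1.0, 0.0],    # unrelated
    [0.5, 0.1, 0.6],    # related
    [-1.0, 0.2, -0.8],  # opposed
]

kept, kept_scores = select_facts(question, facts, k=2)
weights = softmax(kept_scores)
print(kept, weights)
```

In this toy setting the unrelated and opposed facts are dropped before attention is computed, so they cannot draw weight away from the facts that matter during later reasoning hops.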
Keywords
Visual question answering, Multi-hop reasoning, Dual-stream attention, Hypergraph