Generating Question Relevant Captions to Aid Visual Question Answering
ACL (1), pp. 3585-3594, 2019.
Abstract:
Visual question answering (VQA) and image captioning require a shared body of general knowledge connecting language and vision. We present a novel approach to improve VQA performance that exploits this connection by jointly generating captions that are targeted to help answer a specific visual question. The model is trained using an exi...