Generating Question Relevant Captions to Aid Visual Question Answering
arXiv: Computer Vision and Pattern Recognition, 2019.
Visual question answering (VQA) and image captioning require a shared body of general knowledge connecting language and vision. We present a novel approach to improve VQA performance that exploits this connection by jointly generating captions that are targeted to help answer a specific visual question. The model is trained using an exi...