VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation
ICCV, pp. 1829-1838, 2017.
Rich and dense human-labeled datasets are among the main enabling factors for the recent advances in vision-language understanding. Many seemingly distant annotations (e.g., semantic segmentation and visual question answering (VQA)) are inherently connected in that they reveal different levels and perspectives of human understanding about ...