Learning to Compose Dynamic Tree Structures for Visual Contexts
CVPR 2019, pages 6619-6628 (arXiv:1812.01880).
We propose a dynamic tree structure called VCTree that captures task-specific visual contexts, which can be encoded to support two high-level vision tasks: scene graph generation and visual Q&A.
We propose to compose dynamic tree structures that place the objects in an image into a visual context, helping visual reasoning tasks such as scene graph generation and visual Q&A. The dynamic structure varies from image to image and task to task, allowing more content- and task-specific message passing among objects. To construct a VCTree,...