Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos

CVPR, pp. 5948-5957, 2018.

Abstract:

Grounding textual phrases in visual content with standalone image-sentence pairs is a challenging task. When we consider grounding in instructional videos, this problem becomes profoundly more complex: the latent temporal structure of instructional videos breaks independence assumptions and necessitates contextual understanding for resolv...
