Data-efficient Alignment of Multimodal Sequences by Aligning Gradient Updates and Internal Feature Distributions

Jianan Wang
Boyang Li
Xiangyu Fan
Jing Lin

Abstract:

The task of video and text sequence alignment is a prerequisite step toward joint understanding of movie videos and screenplays. However, supervised methods face the obstacle of limited realistic training data. In this paper, we attempt to enhance the data efficiency of the end-to-end alignment network NeuMATCH [15]. Recent research [56] ...
