Data-efficient Alignment of Multimodal Sequences by Aligning Gradient Updates and Internal Feature Distributions

2021 IEEE Winter Conference on Applications of Computer Vision (WACV), 2021

Abstract
The task of video and text sequence alignment is a prerequisite step toward joint understanding of movie videos and screenplays. However, supervised methods face the obstacle of limited realistic training data. With this paper, we attempt to enhance the data efficiency of the end-to-end alignment network NeuMATCH [15]. Recent research [56] suggests that network components dealing with different moda...
Keywords
Training, Computer vision, Adaptive systems, Conferences, Training data, Motion pictures, Reservoirs