
MSVD-Turkish: A Large-Scale Dataset for Video Captioning in Turkish

2019 27th Signal Processing and Communications Applications Conference (SIU) (2019)

Abstract
Automatically generating natural language descriptions for videos, also known as video captioning, has recently been introduced as a challenging integrated vision and language problem. Although researchers have demonstrated numerous solutions for English, to date there has been no study on the Turkish language, owing to the lack of suitable datasets for training Turkish video captioning models. To address this, we construct a large-scale Turkish benchmark dataset by carefully translating the English descriptions of the MSVD dataset into Turkish. Moreover, we implement several neural models, including LSTM-based sequence-to-sequence architectures with temporal attention mechanisms, and report the performance of these strong baselines on our dataset. We hope that our dataset will serve as a good resource for future efforts on Turkish video captioning.
Keywords
Video captioning, computer vision, natural language processing, machine learning