ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video
CoRR(2024)
摘要
The Internet's wealth of content, with up to 60
starkly contrasts the global population, where only 18.8
and just 5.1
online information access. Unfortunately, automated processes for dubbing of
video - replacing the audio track of a video with a translated alternative -
remains a complex and challenging task due to pipelines, necessitating precise
timing, facial movement synchronization, and prosody matching. While end-to-end
dubbing offers a solution, data scarcity continues to impede the progress of
both end-to-end and pipeline-based methods. In this work, we introduce
Anim-400K, a comprehensive dataset of over 425K aligned animated video segments
in Japanese and English supporting various video-related tasks, including
automated dubbing, simultaneous translation, guided video summarization, and
genre/theme/style classification. Our dataset is made publicly available for
research purposes at https://github.com/davidmchan/Anim400K.
更多查看译文
关键词
Automated Dubbing,Speech Translation,Video,Anime,Datasets
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要