AUTOMATIC DUBBING OF VIDEOS WITH MULTIPLE SPEAKERS

user-5f165ac04c775ed682f5819f (2018)

Abstract

A machine-learning model is described that automatically converts the audio streams of audio-visual content from a source language to a destination language. In response to determining that an audio stream should be translated, a machine-learning-based dubbing model is invoked for the specific destination language. When the content has multiple speakers, voice embedding techniques are used to match the dubbed audio streams to the corresponding speakers. The sentiment in the original speaker’s voice is preserved by training the model on a targeted data set in the destination language.
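
The abstract does not say how the voice embeddings are computed or compared, so the following is only a minimal sketch of one plausible matching step, not the described method. It assumes each speaker and each dubbed segment has already been reduced to a fixed-length embedding vector by some upstream model, and it assigns every segment to the speaker whose embedding is closest by cosine similarity. The names `cosine_similarity` and `match_segments_to_speakers`, the toy two-dimensional vectors, and the choice of cosine similarity are all illustrative assumptions.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_segments_to_speakers(
    segment_embeddings: List[Sequence[float]],
    speaker_embeddings: Dict[str, Sequence[float]],
) -> List[str]:
    """Return, for each segment, the ID of the most similar speaker.

    Hypothetical helper: the embeddings are assumed to come from an
    upstream voice-embedding model that is not shown here.
    """
    if not speaker_embeddings:
        raise ValueError("at least one speaker embedding is required")
    assignments = []
    for segment in segment_embeddings:
        # Pick the speaker whose reference embedding is closest in angle.
        best_speaker = max(
            speaker_embeddings,
            key=lambda name: cosine_similarity(segment, speaker_embeddings[name]),
        )
        assignments.append(best_speaker)
    return assignments


# Toy example with made-up 2-D embeddings for two speakers.
speakers = {"speaker_a": [1.0, 0.0], "speaker_b": [0.0, 1.0]}
segments = [[0.9, 0.1], [0.2, 0.8]]
assignments = match_segments_to_speakers(segments, speakers)
```

In practice the embeddings would have many more dimensions, and a real system might also apply a similarity threshold so that a segment matching no known speaker is flagged instead of being forced onto the nearest one.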
