Emotionally Enhanced Talking Face Generation

PROCEEDINGS OF THE 1ST INTERNATIONAL WORKSHOP ON MULTIMEDIA CONTENT GENERATION AND EVALUATION, MCGE 2023: New Methods and Practice (2023)

Abstract
Several works have developed end-to-end pipelines for generating lip-synced talking faces with real-world applications, such as teaching and language translation in videos. However, these prior works fail to create realistic-looking videos since they focus little on people's expressions and emotions. Moreover, these methods' effectiveness largely depends on the faces in the training dataset, which means they may not perform well on unseen faces. To mitigate this, we build a talking face generation framework conditioned on a categorical emotion to generate videos with appropriate expressions, making them more realistic and convincing. With a broad range of six emotions, i.e., happiness, sadness, fear, anger, disgust, and neutral, we show that our model can adapt to arbitrary identities, emotions, and languages. Our proposed framework has a user-friendly web interface with a real-time experience for talking face generation with emotions. We also conduct a user study for subjective evaluation of our interface's usability, design, and functionality. Project page: https://midas.iiitd.edu.in/emo/.
Keywords
talking face generation,emotion capture,lip sync,multimodal