VideoMaster: a multimodal micro game video recreator

IJCAI (2023)

Abstract

To free humans from laborious video production, this paper proposes VideoMaster, a multimodal system with four capabilities: highlight extraction, video description, video dubbing, and video editing. It extracts interesting episodes from long game videos, generates subtitles for each episode, reads the subtitles aloud with synthesized speech, and finally re-creates a better short video through video editing. Notably, VideoMaster combines deep learning with traditional computer vision techniques to extract highlights with fine-to-coarse labels, uses a novel framework named PCSG-v (probabilistic context-sensitive grammar for video) to generate video descriptions, and imitates a target speaker's voice to read the descriptions. To the best of our knowledge, VideoMaster is the first multimedia system that can automatically produce product-level micro-videos without heavy human annotation.
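The four-stage flow described in the abstract (extract highlights, describe, dub, edit) can be pictured as a simple pipeline. The sketch below is illustrative only and is not the authors' implementation: every name in it (`Episode`, `extract_highlights`, `describe`, `dub`, `edit`, `recreate`) is hypothetical, the per-second "interest scores" and the threshold rule are assumptions standing in for the paper's deep learning and computer vision highlight detector, the template string stands in for PCSG-v, and the audio path is a placeholder rather than real speech synthesis.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Episode:
    """One highlight episode (hypothetical structure, not from the paper)."""
    start_s: float        # episode start time, in seconds
    end_s: float          # episode end time, in seconds
    label: str            # highlight label
    subtitle: str = ""    # filled in by the description stage
    audio_path: str = ""  # filled in by the dubbing stage


def extract_highlights(scores: List[float], threshold: float,
                       label: str = "highlight") -> List[Episode]:
    """Group consecutive seconds whose score >= threshold into episodes.

    Assumption: `scores` holds one interest score per second of video.
    """
    episodes: List[Episode] = []
    start = None
    for t, score in enumerate(scores):
        if score >= threshold and start is None:
            start = t  # a new episode begins
        elif score < threshold and start is not None:
            episodes.append(Episode(float(start), float(t), label))
            start = None  # the episode has ended
    if start is not None:  # episode runs to the end of the video
        episodes.append(Episode(float(start), float(len(scores)), label))
    return episodes


def describe(ep: Episode) -> Episode:
    """Placeholder for description generation: a fixed template."""
    ep.subtitle = f"{ep.label} from {ep.start_s:.0f}s to {ep.end_s:.0f}s"
    return ep


def dub(ep: Episode) -> Episode:
    """Placeholder for dubbing: records where synthesized audio would go."""
    ep.audio_path = f"dub_{int(ep.start_s)}_{int(ep.end_s)}.wav"
    return ep


def edit(episodes: List[Episode]) -> List[Tuple[float, float, str, str]]:
    """Placeholder for editing: an ordered cut list for the short video."""
    ordered = sorted(episodes, key=lambda ep: ep.start_s)
    return [(ep.start_s, ep.end_s, ep.subtitle, ep.audio_path) for ep in ordered]


def recreate(scores: List[float], threshold: float) -> List[Tuple[float, float, str, str]]:
    """Run the four stages in the order the abstract lists them."""
    episodes = extract_highlights(scores, threshold)
    return edit([dub(describe(ep)) for ep in episodes])


if __name__ == "__main__":
    for cut in recreate([0.1, 0.9, 0.8, 0.2, 0.7], threshold=0.5):
        print(cut)
```

Keeping each stage as a separate function that takes and returns an `Episode` mirrors the abstract's description of four distinct capabilities, so any one placeholder could be swapped for a real model without touching the others.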
