Semantic Text Summarization Of Long Videos

2017 IEEE Winter Conference on Applications of Computer Vision (WACV 2017)

Cited by 30 | Views 77
Abstract
Long videos captured by consumers are typically tied to some of the most important moments of their lives, yet ironically they are often the least frequently watched. The time required to retrieve and watch relevant sections can be daunting. In this work we propose novel techniques for summarizing and annotating long videos. Existing video summarization techniques focus exclusively on identifying keyframes and subshots; however, evaluating these summarized videos is a challenging task. Our work proposes methods to generate visual summaries of long videos and, in addition, techniques to annotate the videos and generate textual summaries using recurrent networks. Interesting segments of a long video are extracted based on image quality as well as cinematographic and consumer preferences. Keyframes from the most impactful segments are converted to textual annotations using sequential encoding and decoding deep learning models. Our summarization technique is benchmarked on the VideoSet dataset and evaluated by humans for informative and linguistic content. We believe this to be the first fully automatic method capable of simultaneous visual and textual summarization of long consumer videos.
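The abstract describes ranking video segments by image quality before captioning their keyframes. As a minimal illustrative sketch (not the authors' implementation), the snippet below scores frames with a simple variance-of-Laplacian sharpness proxy and keeps the top-k as keyframe candidates; the toy frame data, the sharpness proxy, and the `top_k` parameter are all assumptions for illustration.

```python
# Hedged sketch: rank frames by a sharpness proxy and select keyframes.
# The variance-of-Laplacian score and toy frames are illustrative assumptions,
# not the method from the paper.

def sharpness_score(frame):
    """Variance of a 1-D Laplacian over a grayscale frame (list of pixel rows)."""
    vals = []
    for row in frame:
        for i in range(1, len(row) - 1):
            # Discrete second derivative along the row.
            vals.append(row[i - 1] - 2 * row[i] + row[i + 1])
    if not vals:
        return 0.0
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

def select_keyframes(frames, top_k=2):
    """Return indices (in temporal order) of the top_k sharpest frames."""
    ranked = sorted(range(len(frames)),
                    key=lambda i: sharpness_score(frames[i]),
                    reverse=True)
    return sorted(ranked[:top_k])

# Toy "video": a flat (blurry-looking) frame and a high-contrast (sharp) frame.
flat = [[128, 128, 128, 128]] * 2
sharp = [[0, 255, 0, 255]] * 2
print(select_keyframes([flat, sharp, flat], top_k=1))  # → [1]
```

In the full pipeline described above, the selected keyframes would then be fed to an encoder-decoder captioning model to produce the textual summary.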
Keywords
semantic text summarization,annotating long videos,video summarization techniques,visual summaries,recurrent networks,image quality,cinematographic,consumer preference,sequential encoding,textual annotations,decoding deep learning models,VideoSet dataset