TagBook: A Semantic Video Representation without Supervision for Event Detection.

Masoud Mazloom, Xirong Li, Cees G. M. Snoek

IEEE Trans. Multimedia (2016)

Abstract

We consider the problem of event detection in video for scenarios where only a few, or even zero, examples are available for training. For this challenging setting, the prevailing solutions in the literature rely on a semantic video representation obtained from thousands of pretrained concept detectors. Different from existing work, we propose a new semantic video representation that is based on freely available social tagged videos only, without the need for training any intermediate concept detectors. We introduce a simple algorithm that propagates tags from a video's nearest neighbors, similar in spirit to the ones used for image retrieval, but redesign it for video event detection by including video source set refinement and varying the video tag assignment. We call our approach TagBook and study its construction, descriptiveness, and detection performance on the TRECVID 2013 and 2014 multimedia event detection datasets and the Columbia Consumer Video dataset. Despite its simple nature, the proposed TagBook video representation is remarkably effective for few-example and zero-example event detection, even outperforming very recent state-of-the-art alternatives building on supervised representations.
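The neighbor-based tag propagation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`cosine_similarity`, `propagate_tags`), the cosine similarity measure, the uniform neighbor voting, and the toy data are all assumptions made for illustration. The source set refinement and the varied tag assignment that the abstract describes are not modeled here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def propagate_tags(query_feature, source_videos, vocabulary, k=2):
    """Score each vocabulary tag by the fraction of the k nearest
    source videos that carry it (hypothetical uniform voting)."""
    ranked = sorted(
        source_videos,
        key=lambda video: cosine_similarity(query_feature, video[0]),
        reverse=True,
    )
    neighbors = ranked[:k]
    if not neighbors:
        return [0.0] * len(vocabulary)
    return [
        sum(1 for _, tags in neighbors if tag in tags) / len(neighbors)
        for tag in vocabulary
    ]


# Toy source set: (feature vector, user-provided tags). Values are invented.
source_videos = [
    ([1.0, 0.0], {"dog", "park"}),
    ([0.9, 0.1], {"dog", "birthday"}),
    ([0.0, 1.0], {"car", "repair"}),
]
vocabulary = ["dog", "park", "birthday", "car"]

# The resulting vector has one score per vocabulary tag.
tagbook_vector = propagate_tags([1.0, 0.05], source_videos, vocabulary, k=2)
print(tagbook_vector)  # [1.0, 0.5, 0.5, 0.0]
```

In this sketch the query video's two nearest neighbors both carry "dog", one carries "park", and one carries "birthday", so the video is represented by tag scores without any concept detector having been trained.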

Keywords

Event detection, Semantics, Detectors, Training, Multimedia communication, Streaming media, Feature extraction
