Selecting Relevant Web Trained Concepts for Automated Event Retrieval
2015 IEEE International Conference on Computer Vision (ICCV), 2015
Abstract
Complex event retrieval is a challenging research problem, especially when no training videos are available. An alternative to collecting training videos is to train a large semantic concept bank a priori. Given a text description of an event, event retrieval is performed by selecting concepts linguistically related to the event description and fusing the concept responses on unseen videos. However, defining an exhaustive concept lexicon and pre-training it requires vast computational resources. Therefore, recent approaches automate concept discovery and training by leveraging large amounts of weakly annotated web data. Compact visually salient concepts are automatically obtained by the use of concept pairs or, more generally, n-grams. However, not all visually salient n-grams are necessarily useful for an event query--some combinations of concepts may be visually compact but irrelevant--and this drastically affects performance. We propose an event retrieval algorithm that constructs pairs of automatically discovered concepts and then prunes those concepts that are unlikely to be helpful for retrieval. Pruning depends both on the query and on the specific video instance being evaluated. Our approach also addresses calibration and domain adaptation issues that arise when applying concept detectors to unseen videos. We demonstrate large improvements over other vision based systems on the TRECVID MED 13 dataset.
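The pipeline outlined in the abstract (select concepts related to the query, pair them, prune pairs per query and per video, then fuse the surviving responses) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the relatedness and detector scores, the use of `min` as a pair response, the mean as the fusion rule, and both thresholds are assumptions chosen only to make the idea concrete.

```python
from itertools import combinations
from statistics import mean

def select_concepts(relatedness, top_k):
    """Keep the top_k concepts most related to the event description.
    `relatedness` maps concept -> score in [0, 1] (hypothetical values)."""
    ranked = sorted(relatedness, key=relatedness.get, reverse=True)
    return ranked[:top_k]

def prune_pairs_for_query(pairs, relatedness, min_relatedness):
    """Query-level pruning (assumed rule): drop a pair if its weaker
    member is only loosely related to the event description."""
    return [(a, b) for a, b in pairs
            if min(relatedness[a], relatedness[b]) >= min_relatedness]

def score_video(detector_scores, pairs, min_response):
    """Video-level pruning and fusion (assumed rules): a pair's response is
    the weaker of its two detector scores; pairs that barely respond on this
    video are ignored, and the remaining responses are averaged."""
    responses = [min(detector_scores[a], detector_scores[b]) for a, b in pairs]
    kept = [r for r in responses if r >= min_response]
    return mean(kept) if kept else 0.0

# Hypothetical relatedness of each concept to an event query such as "grooming an animal".
relatedness = {"dog": 0.9, "grooming": 0.8, "brush": 0.6, "bath": 0.3, "car": 0.1}

concepts = select_concepts(relatedness, top_k=4)
pairs = list(combinations(concepts, 2))
query_pairs = prune_pairs_for_query(pairs, relatedness, min_relatedness=0.5)

# Hypothetical per-video concept detector outputs.
video_a = {"dog": 0.9, "grooming": 0.7, "brush": 0.1, "bath": 0.5, "car": 0.0}
video_b = {"dog": 0.3, "grooming": 0.1, "brush": 0.1, "bath": 0.0, "car": 0.8}

score_a = score_video(video_a, query_pairs, min_response=0.2)
score_b = score_video(video_b, query_pairs, min_response=0.2)
```

In this toy setup the query-level step removes every pair involving the loosely related concept, and the video-level step ignores pairs whose detectors barely respond on a given video, so the ranking depends on both the query and the video being scored. Calibration and domain adaptation, which the abstract also mentions, are not modeled here.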
Keywords
relevant Web trained concept, automated event retrieval, complex event retrieval, training video, text description, event description, exhaustive concept lexicon, computational resource, concept discovery, Web data, visually salient n-gram, event query, event retrieval algorithm, video instance, domain adaptation, concept detector, unseen video, vision based system, TRECVID MED 13 dataset