Event Specific Multimodal Pattern Mining For Knowledge Base Construction

MM '16: ACM Multimedia Conference, Amsterdam, The Netherlands, October 2016

Abstract
Knowledge bases, which consist of a collection of entities, attributes, and the relations between them, are widely used and important for many information retrieval tasks. Knowledge base schemas are often constructed manually by experts with domain knowledge of the field of interest. Once the knowledge base schema is generated, many tasks such as automatic content extraction and knowledge base population can be performed; these have been robustly studied by the Natural Language Processing community. However, current approaches ignore visual information that could be used to build or populate these structured ontologies. Preliminary work on visual knowledge base construction only explores limited basic objects and scene relations. In this paper, we propose a novel multimodal pattern mining approach towards constructing a high-level "event" schema semi-automatically, which has the capability to extend text-only methods for schema construction. We utilize a large unconstrained corpus of weakly supervised image-caption pairs related to high-level events such as "attack" and "demonstration" to both discover visual aspects of an event and name these visual components automatically. We compare our method with several state-of-the-art visual pattern mining approaches and demonstrate that our proposed method achieves dramatic improvements in the number of concepts discovered (33% gain), the semantic consistency of visual patterns (52% gain), and the correctness of pattern naming (150% gain).
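The core idea sketched in the abstract, mining co-occurrences between visual patterns and caption words from weakly supervised image-caption pairs and then naming each visual pattern by its most strongly associated word, can be illustrated with a minimal toy example. This is an assumption-laden sketch, not the paper's actual pipeline: the visual pattern ids (v_smoke, v_crowd, etc.) are hypothetical stand-ins for clusters of CNN activations, the corpus is fabricated for illustration, and PMI-based naming is a simplification of the pattern-naming step.

```python
from collections import Counter
from itertools import product
import math

# Toy image-caption corpus: each item is (set of visual pattern ids, caption).
# In the paper's setting, visual pattern ids would come from mining CNN
# feature activations over images; here they are hypothetical placeholders.
corpus = [
    ({"v_smoke", "v_crowd"}, "protesters clash with police during the demonstration"),
    ({"v_smoke", "v_fire"}, "smoke rises after the attack on the compound"),
    ({"v_crowd", "v_banner"}, "a large crowd marches with banners in the demonstration"),
    ({"v_fire"}, "fire engulfs a vehicle after the attack"),
]

STOPWORDS = {"the", "a", "with", "on", "in", "of", "after", "during"}

def tokenize(caption):
    return [w for w in caption.lower().split() if w not in STOPWORDS]

# Count visual patterns, caption words, and their co-occurrences per image.
vis_counts, word_counts, pair_counts = Counter(), Counter(), Counter()
for visual_patterns, caption in corpus:
    words = set(tokenize(caption))
    vis_counts.update(visual_patterns)
    word_counts.update(words)
    pair_counts.update(product(visual_patterns, words))

n_docs = len(corpus)

def pmi(v, w):
    """Pointwise mutual information between a visual pattern and a caption word."""
    p_vw = pair_counts[(v, w)] / n_docs
    if p_vw == 0:
        return float("-inf")
    p_v, p_w = vis_counts[v] / n_docs, word_counts[w] / n_docs
    return math.log(p_vw / (p_v * p_w))

# Name each visual pattern by its highest-PMI caption word.
for v in vis_counts:
    best_word = max(word_counts, key=lambda w: pmi(v, w))
    print(f"{v} -> {best_word}")
```

Running this prints an association such as "v_crowd -> demonstration", mirroring, in miniature, how co-occurrence statistics can both group visual evidence under an event and assign it a textual name.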
Keywords
Multimodal, Pattern Mining, Convolutional Neural Network, Event Schema, Multimodal Knowledge Base