Annotating Web Tables through Knowledge Bases: A Context-Based Approach

2020 7th Swiss Conference on Data Science (SDS), 2020

Abstract
The Web contains over 150 million tables, which together represent an invaluable source of semi-structured knowledge. Such tables are commonly referred to as Web tables, and are considerably easier to leverage in automated processes than completely unstructured, free-format text. Understanding the semantics of Web tables is important since they are used in various applications such as knowledge base augmentation, information retrieval, and natural language interfaces for databases. The task of understanding the semantics of a given Web table is known as Web table annotation. In recent years, it has been tackled through methods that enrich the table using existing knowledge bases, which contain valuable information on the domain at hand, its entities, and their mutual relationships.

In this paper, we present two novel and unsupervised Web table annotation methods, which leverage the context of the tables to better capture their semantics. Our first method is lookup-based and exploits text similarity to find reference entities in the knowledge base. The second method uses distributional vector representations – a.k.a. embeddings – of the Web tables to elicit their context and disambiguate their semantics. Experiments show that our proposed approach outperforms the state of the art in Web table annotation by up to 18%. Another contribution of this work is a manually corrected version of one of the popular gold standard datasets, Limaye, with annotations from DBpedia. Our dataset and code are publicly available.
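To make the lookup-based idea concrete, the following is a minimal sketch of matching table cells to knowledge base entities by text similarity. It is not the authors' implementation: the toy knowledge base `TOY_KB`, the helper names `similarity`, `annotate_cell`, and `annotate_column`, the `dbr:`-style identifiers, and the 0.8 threshold are all assumptions made for illustration. The sketch also ignores table context, which the paper's methods additionally exploit.

```python
from difflib import SequenceMatcher

# Hypothetical toy knowledge base: entity identifier -> label.
# A real system would query a knowledge base such as DBpedia instead.
TOY_KB = {
    "dbr:Zurich": "Zurich",
    "dbr:Geneva": "Geneva",
    "dbr:Bern": "Bern",
}


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def annotate_cell(cell: str, kb: dict, threshold: float = 0.8):
    """Return the best-matching entity id for a cell, or None if no
    label reaches the (assumed) similarity threshold."""
    best_id, best_score = None, 0.0
    for entity_id, label in kb.items():
        score = similarity(cell, label)
        if score > best_score:
            best_id, best_score = entity_id, score
    return best_id if best_score >= threshold else None


def annotate_column(cells, kb, threshold: float = 0.8):
    """Annotate every cell of a column independently."""
    return [annotate_cell(c, kb, threshold) for c in cells]
```

For example, `annotate_column(["Zurich", "Genevaa", "Tokyo"], TOY_KB)` links the first two cells to their entities despite the typo in the second, and leaves the third unlinked because no label is similar enough.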
Keywords
Web Table Annotation, Knowledge Base, Embeddings