Recoin: Relative Completeness in Wikidata.

WWW '18: The Web Conference 2018, Lyon, France, April 2018

Abstract
The collaborative knowledge base Wikidata is the central storage of Wikimedia projects, containing over 45 million data items. It acts as the hub for interlinking Wikipedia pages about the same item in different languages, automates features such as infoboxes in Wikipedia, and is increasingly used for other applications such as data enrichment and question answering. Tracking the quality of Wikidata is an important issue for the project. In this paper we focus on the completeness aspect. Wikis have adopted several automated techniques to track and manage completeness, yet these techniques are generally subjective and do not provide a clear quality estimate at the level of individual entities. We present an approach to measuring Relative Completeness in Wikidata by comparison with the data present for similar entities. This approach scales easily as new classes are introduced in the knowledge base, and it has been implemented for all available entities in Wikidata. The results give an intuition of how complete an entity is compared with other similar entities. We describe our implementation approach and discuss strategies and open challenges.
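
The idea of judging an entity against similar entities can be illustrated with a small sketch. This is not the implementation described in the paper; it is one plausible reading under stated assumptions: "similar entities" are given as a set of peers, the expected properties are the `top_k` most frequent properties among those peers, and the score is the fraction of expected properties the entity carries. The function names, the `top_k` parameter, the scoring rule, and the peer identifiers (`Q_a` to `Q_d`) are all hypothetical. The property IDs are real Wikidata properties (P31 instance of, P106 occupation, P569 date of birth, P19 place of birth), but the data is a toy example.

```python
from collections import Counter


def property_frequencies(peers):
    """Return, for each property, the fraction of peer entities that carry it."""
    counts = Counter(prop for props in peers.values() for prop in props)
    total = len(peers)
    return {prop: count / total for prop, count in counts.items()}


def relative_completeness(entity_props, peers, top_k=5):
    """Score an entity against the top_k most frequent properties of its peers.

    Assumption (hypothetical scoring rule): the score is the share of those
    expected properties that the entity already has.
    """
    freqs = property_frequencies(peers)
    # Rank by descending frequency; break ties by property ID for determinism.
    expected = sorted(freqs, key=lambda p: (-freqs[p], p))[:top_k]
    if not expected:
        return 1.0, []
    missing = [p for p in expected if p not in entity_props]
    score = 1 - len(missing) / len(expected)
    return score, missing


# Toy peer group: placeholder entity IDs mapped to the properties they carry.
peers = {
    "Q_a": {"P31", "P569", "P19", "P106"},
    "Q_b": {"P31", "P569", "P106"},
    "Q_c": {"P31", "P569", "P19"},
    "Q_d": {"P31", "P106"},
}

# An entity that has "instance of" and "occupation" but no date of birth.
score, missing = relative_completeness({"P31", "P106"}, peers, top_k=3)
print(f"score={score:.2f}, missing={missing}")
```

Because the expected properties are derived from the peers rather than from a fixed schema, a sketch like this needs no per-class configuration, which matches the abstract's point that the approach scales as new classes are introduced.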
Keywords
Wikidata, Wikipedia, Data Completeness, Data Quality, Knowledge Bases