Inference over the web (2011)
Abstract
The World Wide Web contains vast amounts of text written about nearly any topic imaginable. Recent work in Information Extraction has sought to recover the information stated in this text, aggregating it into massive bodies of knowledge. These knowledge bases have the potential to significantly improve future Web search engines and Web-based Question-Answering systems, allowing them to answer more complex queries. However, despite the Web's size, a large number of facts are never explicitly stated on it. Much of the knowledge available on the Web is implicit, and must be inferred from other facts, possibly stated on separate pages. A system wishing to access this implicit knowledge must not only determine what inferences should be made, but must also do so in a way that handles the noise, scale, and diversity of knowledge on the Web. This dissertation demonstrates that it is possible for systems to discover the implicit knowledge that exists within large knowledge bases extracted from the Web. It describes SHERLOCK-HOLMES, an unsupervised system that learns first-order Horn clauses from facts extracted from the Web. Experiments show that the rules it learns can infer many facts not explicitly stated in the corpus, and furthermore that the long-tailed nature of facts on the Web allows the system to learn and use the rules in a scalable way.
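To make the idea of inferring unstated facts from Horn clauses concrete, the following is a minimal sketch of applying a first-order Horn clause to extracted facts by forward chaining. It is an illustration only, not the SHERLOCK-HOLMES implementation: the function names (`unify`, `apply_rule`, `forward_chain`), the tuple encoding of facts, the `?`-prefixed variable convention, and the example relations are all assumptions made for this sketch. It also covers only deterministic rule application; it does not learn rules, and it does not model the noise handling the abstract describes.

```python
# Illustrative sketch only: all names and example facts here are hypothetical.
# A fact is a tuple (relation, arg1, arg2); terms starting with "?" are variables.

def unify(atom, fact, binding):
    """Extend `binding` so that `atom` matches `fact`; return None if impossible."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    extended = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if term.startswith("?"):
            # A variable must map to the same constant everywhere it appears.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def apply_rule(head, body, facts):
    """Return every instantiation of `head` whose body atoms all match `facts`."""
    bindings = [{}]
    for atom in body:
        next_bindings = []
        for binding in bindings:
            for fact in facts:
                extended = unify(atom, fact, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return {(head[0],) + tuple(b.get(t, t) for t in head[1:]) for b in bindings}


def forward_chain(rules, facts):
    """Apply all rules repeatedly until no new facts are produced."""
    known = set(facts)
    while True:
        new = set()
        for head, body in rules:
            new |= apply_rule(head, body, known) - known
        if not new:
            return known
        known |= new


# Hypothetical extracted facts, possibly stated on separate pages.
facts = {
    ("is_based_in", "Microsoft", "Redmond"),
    ("is_located_in", "Redmond", "Washington"),
}

# Hypothetical Horn clause:
# is_headquartered_in(?c, ?s) :- is_based_in(?c, ?city), is_located_in(?city, ?s)
rules = [
    (
        ("is_headquartered_in", "?c", "?s"),
        [("is_based_in", "?c", "?city"), ("is_located_in", "?city", "?s")],
    )
]

inferred = forward_chain(rules, facts) - facts
print(inferred)  # {('is_headquartered_in', 'Microsoft', 'Washington')}
```

The inferred fact appears in neither input fact on its own; it follows only from combining the two, which is the kind of implicit knowledge the abstract refers to.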
Keywords
future Web search engine, unsupervised system, knowledge base, large number, Web-based Question-Answering system, implicit knowledge, Information Extraction, large knowledge base, complex query, World Wide Web