SEISA: set expansion by iterative similarity aggregation.

WWW '11: 20th International World Wide Web Conference Hyderabad India March, 2011(2011)

引用 113|浏览31
暂无评分
摘要
In this paper, we study the problem of expanding a set of given seed entities into a more complete set by discovering other entities that also belong to the same concept set. A typical example is to use "Canon" and "Nikon" as seed entities, and derive other entities (e.g., "Olympus") in the same concept set of camera brands. In order to discover such relevant entities, we exploit several web data sources, including lists extracted from web pages and user queries from a web search engine. While these web data are highly diverse with rich information that usually cover a wide range of the domains of interest, they tend to be very noisy. We observe that previously proposed random walk based approaches do not perform very well on these noisy data sources. Accordingly, we propose a new general framework based on iterative similarity aggregation, and present detailed experimental results to show that, when using general-purpose web data for set expansion, our approach outperforms previous techniques in terms of both precision and recall.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要