Person Name Disambiguation in the Web Using Adaptive Threshold Clustering.

JASIST (2017)

Abstract
In this article, we present a new clustering algorithm for Person Name Disambiguation in web search results. The algorithm groups web results according to the individuals they refer to. The best state-of-the-art approaches require training data to learn the thresholds that decide when webpages are grouped. However, the ambiguity level of a person name on the web cannot be estimated in advance, and the results of those methods depend strongly on the thresholds obtained from the training collections. We introduce the concept of an adaptive threshold, which avoids the need for a prior supervised learning process and depends only on the content of the compared documents to decide whether they refer to the same person. We evaluated our approach on three datasets, obtaining results close to those of the most successful state-of-the-art algorithms that require such a learning process and outperforming the algorithms that do not require it.
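
The abstract does not give the threshold formula, so the following is only a minimal sketch of the general idea: two pages are grouped when their similarity exceeds a threshold computed from that pair alone, with no training step. The word-set representation, the overlap coefficient, the `1 / sqrt(n)` rule in `adaptive_threshold`, and the single-link merging are all assumptions made for illustration, not the authors' method.

```python
from itertools import combinations
from math import sqrt


def tokens(text):
    """Lowercased word set of a page (a deliberately crude representation)."""
    return set(text.lower().split())


def overlap(a, b):
    """Overlap coefficient: shared words divided by the smaller set's size."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def adaptive_threshold(a, b):
    """HYPOTHETICAL rule, not the paper's formula.

    The threshold is derived only from the two compared pages:
    the shorter the smaller page, the larger the share of its
    words that must be matched before the pair is grouped.
    """
    smaller = min(len(a), len(b))
    return 1.0 if smaller == 0 else 1.0 / sqrt(smaller)


def cluster(docs):
    """Group page indices; a pair is merged when it beats its own threshold."""
    sets = [tokens(d) for d in docs]
    parent = list(range(len(docs)))  # union-find forest over page indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(docs)), 2):
        if overlap(sets[i], sets[j]) > adaptive_threshold(sets[i], sets[j]):
            parent[find(j)] = find(i)

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    pages = [
        "john smith professor of physics at mit quantum optics research",
        "john smith physics mit lecture notes on quantum optics",
        "john smith guitarist new album tour dates rock band",
        "rock band tour with guitarist john smith album review",
    ]
    print(cluster(pages))
```

In this toy input, the only words shared across the two namesakes are the name itself, which falls below every pairwise threshold, so the physics pages and the music pages end up in separate groups. No threshold value is tuned on a training collection; each comparison computes its own.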
Keywords
adaptive threshold clustering, person, web