String Joins with Synonyms

Database Systems for Advanced Applications (2020)

Abstract
String matching is a fundamental operation in many applications such as data integration, information retrieval, and text mining. Since users express the same meaning in a variety of ways that are not textually similar, existing work has proposed variants of Jaccard similarity that use synonyms to capture semantics beyond textual similarity. However, because these variants employ set semantics, they may produce a non-negligible number of false positives in some applications, and their approximations may miss some true positives. In this paper, we define new match relationships between a pair of strings under synonym rules and develop an efficient algorithm to verify these relationships. In addition, we propose two filtering methods to prune non-matching string pairs. We also develop join algorithms with synonyms based on the filtering methods and the match relationships. Experimental results with real-life datasets confirm the effectiveness of our proposed algorithms.
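To make the motivation concrete, the following is a minimal sketch of token-set Jaccard similarity combined with a naive synonym rewrite. It is not the paper's algorithm or match definition; the helper names (`jaccard`, `apply_rules`), the single rule in `RULES`, and the example strings are all assumptions made for illustration. It shows how a synonym rule can raise the similarity of textually different strings, and how set semantics can also score a string with repeated and reordered tokens as a perfect match.

```python
def jaccard(a_tokens, b_tokens):
    """Jaccard similarity of two token lists under set semantics."""
    a, b = set(a_tokens), set(b_tokens)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def apply_rules(tokens, rules):
    """Rewrite tokens left to right, replacing each rule's left-hand side
    with its right-hand side. Assumes every left-hand side is non-empty."""
    out = []
    i = 0
    while i < len(tokens):
        for lhs, rhs in rules:
            if tuple(tokens[i:i + len(lhs)]) == lhs:
                out.extend(rhs)
                i += len(lhs)
                break
        else:
            # No rule matched at position i: keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


# Hypothetical synonym rule: "ny" means "new york".
RULES = [(("ny",), ("new", "york"))]

s = "ny university".split()
t = "new york university".split()

# Without synonyms the strings share only "university".
plain = jaccard(s, t)

# After rewriting with the rule, the token sets coincide.
with_syn = jaccard(apply_rules(s, RULES), apply_rules(t, RULES))

# Set semantics ignores token order and repetition, so this different
# string also receives a perfect score.
u = "york new york university".split()
false_positive = jaccard(apply_rules(s, RULES), u)

print(plain, with_syn, false_positive)
```

In this sketch the plain score is 0.25 and both synonym-aware scores are 1.0, the last of which is the kind of false positive that set semantics can introduce.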
Keywords
synonyms, string