Towards Representation Independent Similarity Search Over Graph Databases

ACM International Conference on Information and Knowledge Management (2016)

Abstract
Finding similar or strongly related entities in a graph database is a fundamental problem in data management and analytics. Similarity search algorithms usually leverage the structural properties of the data graph to quantify the degree of similarity or relevance between entities. Nevertheless, the same information can be represented in many different structures, and the structural properties observed over particular representations do not necessarily hold for alternative structures. Thus, these algorithms are effective on some representations and ineffective on others, i.e., they are not representation independent. We formally define the property of representation independence for similarity search algorithms as their robustness against transformations that modify the structure of databases and preserve their information content. We formalize two widespread groups of such transformations called relationship-reorganizing and entity-rearranging transformations. We show that current similarity search algorithms are not representation independent under these transformations and propose an algorithm called R-PathSim, which is provably robust under relationship-reorganizing transformations and a subset of entity-rearranging transformations. Our empirical results suggest that current similarity search algorithms except for R-PathSim are highly sensitive to the data representation. These results also indicate that R-PathSim is as effective as, or more effective than, other similarity search algorithms.
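As a rough illustration of the kind of structural score the abstract refers to, the sketch below computes PathSim (Sun et al., 2011), a well-known meta-path-based similarity measure that the name R-PathSim appears to build on. The abstract does not define R-PathSim, so this is not that algorithm. The toy edge list, the node names, and the helper names `path_counter` and `pathsim` are all assumptions made for illustration.

```python
from collections import Counter


def path_counter(edges):
    """Return a function counting length-2 paths x - m - y.

    `edges` is an iterable of (entity, middle) pairs, for example
    (author, venue). Repeated pairs are treated as parallel edges.
    All names here are hypothetical, not taken from the paper.
    """
    weights = {}
    for entity, middle in edges:
        weights.setdefault(entity, Counter())[middle] += 1

    def count(x, y):
        wx = weights.get(x, {})
        wy = weights.get(y, {})
        # Each shared middle node contributes (edges from x) * (edges from y).
        return sum(wx[m] * wy[m] for m in wx if m in wy)

    return count


def pathsim(edges, x, y):
    """PathSim over the symmetric meta-path entity - middle - entity."""
    count = path_counter(edges)
    denom = count(x, x) + count(y, y)
    return 2 * count(x, y) / denom if denom else 0.0


# Toy data (assumed): authors linked to the venues they published in.
toy_edges = [
    ("a", "V1"), ("a", "V1"), ("a", "V2"),
    ("b", "V1"), ("b", "V1"), ("b", "V2"),
    ("c", "V1"),
]

score_ab = pathsim(toy_edges, "a", "b")  # identical venue profiles
score_ac = pathsim(toy_edges, "a", "c")  # partially overlapping profiles
```

Because a score of this kind is built from counts of path instances, restructuring the same information (for example, routing a relationship through an intermediate node) can change those counts, which is the sensitivity to representation that the abstract describes.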
Keywords
Graph mining, Structural similarity search, Database design, Representation independence