VBLSH: Volume-balancing locality-sensitive hashing algorithm for K-nearest neighbors search

Information Sciences(2022)

引用 2|浏览10
暂无评分
摘要
K-nearest neighbors search (K-NNS) is a fundamental problem in many areas of machine learning and data mining. In an attempt to solve NNS problems by locality-sensitive hashing (LSH)-based algorithms and avoid the F1-trap, in this paper, we theoretically prove the necessity of maintaining close edges of each dimension, the independence of the partition location and the precision of retrieval. Based on the theories, a new method is proposed for constructing the hash function family. Then, a volume-balancing locality-sensitive hashing (VBLSH) algorithm is implemented. In this algorithm, the problem of constructing the hash function family is transformed into an optimization problem under a fixed number of hypercubes. By solving the optimization problem, different hyperplane intervals are set for each dimension to ensure that the hash functions can evenly segment the data space, where each segmented hypercube has a close volume and close side lengths. Finally, the analysis of the algorithm shows that the VBLSH algorithm has a high index coding efficiency and hash code calculation efficiency. The experimental results show that the VBLSH algorithm can simultaneously take into account the efficiency, precision and recall rate, and avoid the F1-trap; it shows excellent effectiveness on 4 datasets.
更多
查看译文
关键词
Information retrieval,K-nearest neighbors search,Locality-sensitive hashing,High-dimension massive data query
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要