A Learning to Tune Framework for LSH

Xiu Tang, Sai Wu, Gang Chen, Jinyang Gao, Wei Cao, Zhifei Pang

2021 IEEE 37th International Conference on Data Engineering (ICDE 2021)

Abstract

Nearest neighbor (NN) search in high-dimensional spaces is inherently computationally expensive due to the curse of dimensionality. As a well-known solution to approximate NN search, locality-sensitive hashing (LSH) can answer c-approximate NN (c-ANN) queries in sublinear time with a well-defined performance bound. The success of the LSH family mainly depends on the design of randomly projected hash functions. However, instead of randomly drawing hash functions from a conventional hashing family, such as Gaussian projection for Euclidean space, we ask whether there could be a set of data-sensitive hash functions with a higher capacity to distinguish nearby points from faraway points, while retaining a rigorous performance guarantee like conventional LSH. To this end, we propose a learning-to-tune framework, called LSH-tuning, which consists of a pruning model and a learning-to-rank model. The pruning model reduces the total number of hash tables to maximize the separating capacity on the given data distribution and minimize the storage overhead. The learning-to-rank model ranks hash tables based on their effectiveness for NN retrieval. We also provide a theoretical model that guides us to gradually search more hash tables and probe nearby buckets. Extensive experiments with real-world data demonstrate that LSH-tuning outperforms existing proposals with respect to both efficiency and storage overhead.
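For orientation, the conventional baseline that the abstract contrasts against (hash functions drawn at random from a Gaussian projection family for Euclidean space) might be sketched as follows. This is not the LSH-tuning method: the class name `GaussianLSH`, its parameters, and the default values are hypothetical, chosen only for illustration, and the pruning and learning-to-rank models are not represented.

```python
import math
import random
from collections import defaultdict


class GaussianLSH:
    """Illustrative random-projection LSH index (hypothetical names and defaults)."""

    def __init__(self, dim, num_tables=8, hashes_per_table=4,
                 bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.w = bucket_width
        # Each table holds k hash functions h(x) = floor((a . x + b) / w),
        # with a drawn from a standard Gaussian and b drawn uniformly from [0, w].
        self.funcs = [
            [([rng.gauss(0.0, 1.0) for _ in range(dim)],
              rng.uniform(0.0, bucket_width))
             for _ in range(hashes_per_table)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.points = []

    def _key(self, table_funcs, x):
        # Concatenate the k quantized projections into one bucket key.
        return tuple(
            math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / self.w)
            for a, b in table_funcs
        )

    def add(self, x):
        # Store the point and register its index in one bucket per table.
        idx = len(self.points)
        self.points.append(list(x))
        for funcs, table in zip(self.funcs, self.tables):
            table[self._key(funcs, x)].append(idx)
        return idx

    def query(self, q):
        # Collect candidates from the matching bucket of every table,
        # then return the index of the closest candidate by Euclidean distance.
        candidates = set()
        for funcs, table in zip(self.funcs, self.tables):
            candidates.update(table.get(self._key(funcs, q), []))
        if not candidates:
            return None
        return min(candidates, key=lambda i: math.dist(self.points[i], q))
```

In this sketch every table is probed and only the exact matching bucket is read. The abstract describes reducing the number of tables, ranking them, and gradually probing nearby buckets; none of that is attempted here.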

Keywords

LSH, KNN, Index, Neural Network
