Random Draw Forest: A Salient Index For Similarity Search Over Multimedia Data

2018 IEEE Fourth International Conference on Multimedia Big Data (BigMM), 2018

Abstract
Approximate nearest neighbor (ANN) search over high-dimensional multimedia data has become an essential service for online applications. Returning fast, high-quality results for unknown queries is the biggest challenge most algorithms face. Locality Sensitive Hashing (LSH) is a well-known ANN search algorithm, but it suffers from an inefficient index structure and poor accuracy in distributed settings. Traditional index structures have a most significant bits (MSB) problem: their indexing strategies implicitly assume that the bits from one direction of the hash value have higher priority. In this paper, we propose a new content-based index called Random Draw Forest (RDF), which not only applies a content-based partition strategy to reduce the search range for fast query response, but also uses shuffling permutations on hash values to solve the most significant bits problem. We also study the trade-off between query efficiency and accuracy after applying our partition strategy. In the experiments, we show the effect of the parameters and the salient performance of RDF compared with other LSH-based methods in meeting the demands of online ANN search.
Keywords
approximate nearest neighbor, locality sensitive hashing, index structure, distributed design
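
The abstract does not give RDF's construction, so the following is only a minimal sketch of the general idea it describes: a prefix-based index over LSH codes favors whichever bits come first, and shuffling the bit positions differently for each tree removes that fixed priority. Sign-random-projection hashing, the function names (`lsh_codes`, `build_forest`, `query_candidates`), and all parameter values are assumptions chosen for illustration; this is not the paper's algorithm and it omits the content-based partition strategy.

```python
import numpy as np

def lsh_codes(points, planes):
    """Sign-random-projection LSH (assumed hash family): one bit per hyperplane."""
    return (points @ planes.T > 0).astype(np.uint8)

def build_forest(codes, num_trees, prefix_len, rng):
    """Build one prefix table per tree, each over its own random bit permutation.

    A single prefix table always keys on the leading bits of the code, which
    is the MSB bias. Permuting the bit order per tree lets every bit act as a
    leading bit in some tree.
    """
    num_bits = codes.shape[1]
    forest = []
    for _ in range(num_trees):
        perm = rng.permutation(num_bits)        # hypothetical per-tree shuffle
        table = {}
        for idx, code in enumerate(codes):
            key = tuple(code[perm[:prefix_len]].tolist())
            table.setdefault(key, []).append(idx)
        forest.append((perm, table))
    return forest

def query_candidates(forest, query_code, prefix_len):
    """Union of the buckets the query falls into across all trees."""
    candidates = set()
    for perm, table in forest:
        key = tuple(query_code[perm[:prefix_len]].tolist())
        candidates.update(table.get(key, []))
    return candidates

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 16))               # toy feature vectors
planes = rng.normal(size=(12, 16))              # 12-bit hash codes
codes = lsh_codes(data, planes)
forest = build_forest(codes, num_trees=4, prefix_len=6, rng=rng)

query_code = lsh_codes(data[:1], planes)[0]     # query with a known data point
cands = query_candidates(forest, query_code, prefix_len=6)
```

In this sketch, a longer `prefix_len` shrinks each bucket and so the search range, while more trees widen the candidate set again; that is one simple way to see the efficiency versus accuracy trade-off the abstract mentions.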