On Densification for Minwise Hashing

UAI(2020)

引用 24|浏览82
暂无评分
摘要
One Permutation Hashing (OPH) is a significantly more efficient alternative to the popular minwise hashing. To produce a sketch of size k, OPH requires just one hash function whereas the classical minwise hashing requires k hash functions. However, OPH does not have the desirable locality sensitive hashing (LSH) property that is important for indexing. [Srivastava and Li 2014] proposed the novel idea of densification to produce LSH sketches from OPH sketches, and gave the first densification routine. In this paper, we give a necessary and sufficient condition for a densification routine to result in LSH sketches when applied to OPH sketches. Furthermore, we give a novel densification routine that for every input, takes O(k log k) time in expectation and achieves better variance than the previous best bound obtained by [Srivastava 2017]. The running time of the densification routine given in [Srivastava 2017] for worst case inputs is O(k(2)) in expectation.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要