Hierarchical Clustering better than Average-Linkage.
SODA '19: Symposium on Discrete Algorithms San Diego California January, 2019(2019)
摘要
Hierarchical Clustering (HC) is a widely studied problem in exploratory data analysis, usually tackled by simple agglomerative procedures like average-linkage, single-linkage or complete-linkage. In this paper we focus on two objectives, introduced recently to give insight into the performance of average-linkage clustering: a similarity based HC objective proposed by [21] and a dissimilarity based HC objective proposed by [9]. In both cases, we present tight counterexamples showing that average-linkage cannot obtain better than 1/3 and 2/3 approximations respectively (in the worst-case), settling an open question raised in [21]. This matches the approximation ratio of a random solution, raising a natural question: can we beat average-linkage for these objectives? We answer this in the affirmative, giving two new algorithms based on semidefinite programming with provably better guarantees.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络