Condensed Silhouette: An Optimized Filtering Process for Cluster Selection in K-Means

Procedia Computer Science (2020)

Abstract
In K-Means-based clustering algorithms, different initial seeds can lead to different clustering results. Selecting the best result among those obtained from different initial seeds is called the filtering process. The filtering process has three steps: (1) performing several clustering trials, (2) scoring each trial, and (3) choosing the trial with the best score. A typical method for scoring K-Means clustering results is the within-cluster sum of squares (WCSS). More advanced scoring methods exist, and they usually provide a better score at the cost of being more computationally demanding. In this paper, we propose Condensed Silhouette, a very efficient version of the Silhouette algorithm. To this end, we replace elements of the Silhouette algorithm with similar elements of the K-Means algorithm. This maintains the accuracy of Silhouette while significantly reducing the computational requirements of the method. Our experiments on 14 real datasets show the effectiveness of the proposed method.
Keywords
K-Means,Silhouette,Computation,WCSS,Evaluation,Davies Bouldin
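
The three-step filtering process described in the abstract can be illustrated with a minimal sketch that uses WCSS as the score. This is not the paper's Condensed Silhouette method, whose details the abstract does not give; it only shows the baseline workflow of running several trials, scoring each one, and keeping the best. The function names (`kmeans`, `wcss`, `filter_trials`), the toy data, and the seed range are all hypothetical and chosen for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed, max_iter=100):
    """Plain Lloyd-style K-Means; the seed controls the initial centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct data points as seeds
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


def wcss(points, labels, centroids):
    """Within-cluster sum of squares: lower means tighter clusters."""
    return sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))


def filter_trials(points, k, seeds):
    """Step 1: run one trial per seed. Step 2: score it. Step 3: keep the best."""
    best = None
    for seed in seeds:
        labels, centroids = kmeans(points, k, seed)
        score = wcss(points, labels, centroids)
        if best is None or score < best[0]:
            best = (score, seed, labels, centroids)
    return best


# Toy data (assumed for illustration): two well-separated groups of three points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
best_score, best_seed, best_labels, best_centroids = filter_trials(points, 2, range(5))
print(best_seed, best_score, best_labels)
```

Swapping `wcss` for a different scoring function changes only step 2 of the filter; for a score such as Silhouette, where higher is better, the comparison in `filter_trials` would need to be reversed.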