Cluster Center Initialization and Outlier Detection Based on Distance and Density for the K-Means Algorithm

intelligent systems design and applications(2018)

引用 2|浏览29
暂无评分
摘要
K-means algorithm, the most classic partition-based clustering method, has its disadvantages. If there are outliers in the data sets, the K-means algorithm may lead to serious deviation of the mean value. In addition, random initialization is very sensitive to the input data parameters. In this paper, we propose initialization and outlier detection based on distance and density for the K-means algorithm (KMIDDO), an improvement method to optimize the initial center points, especially it has more effective in the case of outliers. What’s more, we extend an outlier detection method to improve the clustering effect. We hope the distance between every two center points is as far as possible and the density of the center points are as large as they can. In terms of initialization, we calculate the distance and density of points. In the outliers detection, we take the outliers as a single class based on the distance and density. Experiments are conducted to illustrate the effectiveness and accuracy of the proposed algorithms on several synthetic and real datasets.
更多
查看译文
关键词
K-means, Outlier detection, Initial center points, Clustering
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要