Exploring the uniform effect of FCM clustering: A data distribution perspective

Knowledge-Based Systems (2016)

Abstract
Fuzzy c-means (FCM) is a well-known and widely used fuzzy clustering method. Although many studies have focused on improving the FCM algorithm or on its applications, the effect of data distributions on the performance of FCM still needs to be understood. In this paper, we present an organized study of FCM clustering from the perspective of data distribution. We first analyze the structure of the objective function of FCM and find that FCM has the same uniform effect as K-means: it also tends to produce clusters of relatively uniform sizes. The coefficient of variation (CV) is introduced to measure the variation of cluster sizes in a given data set. Then, based on the change in CV values between the original “true” cluster sizes and the cluster sizes produced by FCM clustering, a necessary but not sufficient criterion for validating FCM clustering is proposed from the data distribution perspective. Finally, our experiments on six synthetic data sets and ten real-world data sets further demonstrate the uniform effect of FCM. FCM tends to reduce the variation in cluster sizes when the CV value of the original data distribution is larger than 0.88, and to increase the variation when the variation of the original “true” cluster sizes is low.
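The CV-based comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `cluster_size_cv` and `cv_change` are hypothetical, the CV is assumed to be the sample standard deviation of the cluster sizes divided by their mean, and the FCM memberships are assumed to have already been hardened into one label per point (for example, by taking the cluster with the highest membership).

```python
from collections import Counter
from statistics import mean, stdev


def cluster_size_cv(labels):
    """Coefficient of variation of cluster sizes: stdev(sizes) / mean(sizes).

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    Clusters that receive no points do not appear in `labels` and are
    therefore not counted.
    """
    sizes = list(Counter(labels).values())
    if len(sizes) < 2:
        raise ValueError("at least two non-empty clusters are required")
    return stdev(sizes) / mean(sizes)


def cv_change(true_labels, predicted_labels):
    """CV of the clustering result minus CV of the 'true' cluster sizes.

    A negative value means the clustering made the cluster sizes more uniform.
    """
    return cluster_size_cv(predicted_labels) - cluster_size_cv(true_labels)


# Illustrative labels only (not data from the paper):
# "true" sizes are 90 and 10; the hypothetical clustering yields 60 and 40.
true_labels = [0] * 90 + [1] * 10
predicted_labels = [0] * 60 + [1] * 40
change = cv_change(true_labels, predicted_labels)
```

In this made-up example the change is negative, which is the direction the abstract reports for highly imbalanced data: the partition has more uniform cluster sizes than the original classes.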
Keywords
Fuzzy c-means (FCM), Data distribution, Uniform effect, Coefficient of variation (CV), Clustering