A Cluster Validity Indexing Method Based on Entropy for Solving Cluster Overlapping Problem.

Phen-Lan Lin, Ping-Hsuan Huang, Po-Whei Huang

Frontiers in Artificial Intelligence and Applications (2015)

Abstract
Data clustering is used in many fields, such as data mining, statistical data analysis, image analysis, and pattern recognition. Good clustering can reduce computation in related application programs; however, it is hard to achieve without knowing how many clusters a data set should be partitioned into, a situation that is common in many applications. The task of finding the optimal number of clusters is called cluster validity. In this paper, we propose a new cluster validity indexing method that aims to solve the cluster overlapping problem. Our method adapts the concept of a cluster validity index defined as the ratio of compactness to separation and enhances it by integrating an entropy-based weight into the definition of separation, so that the new weighted-separation of two overlapped clusters is larger than that of two non-overlapped clusters when the distance between the two cluster centroids is the same. Experiments on six synthetic datasets comprising 3 to 10 clusters, some of which overlap each other, demonstrate that our proposed method achieves 100% validity-index accuracy on all of these datasets and is superior to all other compared methods.
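The abstract does not give the formulas, so the following is only an illustrative sketch of the two ingredients it names, not the authors' index. It assumes the well-known Xie-Beni index as a stand-in for a compactness-to-separation ratio, and uses the mean Shannon entropy of the renormalized fuzzy memberships of two clusters as one possible overlap measure. The function names (fcm_memberships, xie_beni, pair_entropy) are hypothetical, and how the paper combines the entropy weight with separation is not reproduced here.

```python
import numpy as np

def fcm_memberships(X, C, m=2.0):
    """Standard Fuzzy C-means memberships of points X (n x d) for fixed centroids C (c x d)."""
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)  # squared distances, n x c
    d2 = np.maximum(d2, 1e-12)                               # avoid division by zero
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)              # each row sums to 1

def xie_beni(X, C, U, m=2.0):
    """Xie-Beni index: fuzzy compactness divided by the minimum squared centroid distance."""
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    compactness = ((U ** m) * d2).sum() / X.shape[0]
    cd2 = ((C[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(cd2, np.inf)                            # ignore self-distances
    return compactness / cd2.min()

def pair_entropy(U, j, k):
    """Assumed overlap measure: mean binary entropy (bits) of memberships in clusters j and k."""
    p = U[:, [j, k]]
    p = p / p.sum(axis=1, keepdims=True)                     # renormalize over the pair
    p = np.clip(p, 1e-12, 1.0)
    return float((-(p * np.log2(p)).sum(axis=1)).mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    C = np.array([[0.0, 0.0], [4.0, 0.0]])                   # same centroid distance in both cases
    for std in (0.5, 2.0):                                   # tight vs. overlapping clusters
        X = np.vstack([rng.normal(c, std, size=(200, 2)) for c in C])
        U = fcm_memberships(X, C)
        print(std, xie_beni(X, C, U), pair_entropy(U, 0, 1))
```

With the centroid distance held fixed, the overlapping case yields a higher pairwise entropy, which is the kind of signal an entropy-based weight on separation could draw on.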
Keywords
Cluster validity index,Fuzzy C-means,compactness,separation,weighted-separation