Fractal Mining - Self Similarity-Based Clustering And Its Applications

DATA MINING AND KNOWLEDGE DISCOVERY HANDBOOK, SECOND EDITION(2010)

引用 8|浏览10
暂无评分
摘要
Self-similarity is the property of being invariant with respect to the scale used to look at the data set. Self-similarity can be measured using the fractal dimension. Fractal dimension is an important charactaristics for many complex systems and can serve as a powerful representation technique. In this chapter, we present a new clustering algorithm, based on self-similarity properties of the data sets, and also its applications to other fields in Data Mining, such as projected clustering and trend analysis. Clustering is a widely used knowledge discovery technique. The new algorithm which we call Fractal Clustering (FC) places points incrementally in the cluster for which the change in the fractal dimension after adding the point is the least. This is a very natural way of clustering points, since points in the same clusterhave a great degree of self-similarity among them (and much less self-similarity with respect to points in other clusters). FC requires one scan of the data, is suspendable at will, providing the best answer possible at that point, and is incremental. We show via experiments that FC effectively deals with large data sets, high-dimensionality and noise and is capable of recognizing clusters of arbitrary shape.
更多
查看译文
关键词
self-similarity,clustering,projected clustering,trend analysis
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要