A novel density peak based semi-supervised clustering algorithm

2016 International Conference on Machine Learning and Cybernetics (ICMLC) (2016)

Abstract
With the rapid development of technology, acquiring and storing big data from various fields is no longer a problem. Instead, how to make use of the data has become an important and active research topic. Clustering is one of the key tasks in exploiting big data. However, the task faces a well-known challenge: it is difficult to incorporate prior information into the clustering results. In this paper, we propose a density-peak-based semi-supervised clustering algorithm that leverages the label information of a few seed objects to obtain a better clustering result. Specifically, we first adopt a density-based clustering algorithm to identify density peaks as candidate cluster centers for a dataset, and then propose a graph-based algorithm that assigns each center a class label using the given seed objects. Finally, we use the label information of the seed objects and the identified centers to generate must-link and cannot-link constraints for clustering. Extensive experiments on various publicly available datasets verify the effectiveness of the proposed method, and the results show that the proposed density-peak-based semi-supervised algorithm substantially outperforms existing methods.
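The abstract gives the pipeline only in outline, so the following Python sketch is an illustration under stated assumptions rather than the authors' implementation. It assumes the commonly used density peaks formulation (a local density `rho` and a distance `delta` to the nearest denser point, with centers ranked by `rho * delta`), a Gaussian kernel with cutoff `d_c`, and a symmetric k-nearest-neighbour graph searched with Dijkstra's algorithm so that each center takes the label of its graph-nearest seed. All function names and the parameters `d_c`, `n_centers`, and `k` are hypothetical. The final constrained clustering step is not specified in the abstract and is left out.

```python
import heapq
from itertools import combinations

import numpy as np


def density_peaks(X, d_c, n_centers):
    """Return (center indices, distance matrix); centers maximise rho * delta."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Assumed Gaussian-kernel density; subtract 1 to drop each point's self term.
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0
    delta = np.empty(len(X))
    for i in range(len(X)):
        denser = rho > rho[i]
        # A point with no denser point gets its largest distance by convention.
        delta[i] = dist[i, denser].min() if denser.any() else dist[i].max()
    return np.argsort(-(rho * delta))[:n_centers], dist


def knn_graph(dist, k):
    """Symmetric k-nearest-neighbour adjacency lists (an assumed graph choice)."""
    nearest = np.argsort(dist, axis=1)[:, 1:k + 1]  # column 0 is the point itself
    adj = [set() for _ in range(len(dist))]
    for i in range(len(dist)):
        for j in nearest[i]:
            adj[i].add(int(j))
            adj[int(j)].add(i)
    return adj


def dijkstra(adj, dist, source):
    """Shortest-path length from source to every node; inf if unreachable."""
    best = np.full(len(adj), np.inf)
    best[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best[u]:
            continue  # stale heap entry
        for v in adj[u]:
            nd = d + dist[u, v]
            if nd < best[v]:
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return best


def label_centers(dist, centers, seeds, k=5):
    """Give each center the label of its graph-nearest seed; seeds is {index: label}."""
    adj = knn_graph(dist, k)
    labels = {}
    for c in centers:
        path_len = dijkstra(adj, dist, int(c))
        nearest_seed = min(seeds, key=lambda s: path_len[s])
        labels[int(c)] = seeds[nearest_seed]
    return labels


def pairwise_constraints(labelled):
    """Same label -> must-link pair, different labels -> cannot-link pair."""
    must, cannot = [], []
    for (i, li), (j, lj) in combinations(sorted(labelled.items()), 2):
        (must if li == lj else cannot).append((i, j))
    return must, cannot
```

A typical call sequence would be `centers, dist = density_peaks(X, d_c, n_centers)`, then `labels = label_centers(dist, centers, seeds)`, then `pairwise_constraints({**seeds, **labels})`. The resulting pairs could be passed to any constrained clustering method. Note that if a center cannot reach any seed in the graph, this sketch falls back to the first seed, which a real implementation would need to handle explicitly.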
Keywords
Semi-supervised model, Seed object, Dijkstra algorithm, Density peaks algorithm