Hulling versus Clustering - Two Complementary Applications of Non-Negative Matrix Factorization

2021 IEEE Congress on Evolutionary Computation (CEC 2021)

Abstract
In this paper we compare two NMF-based techniques for dataset characterization: clustering and hulling. Characterizing a dataset means describing its content through a small number of characteristic representatives. Hulling (defined later) characterizes the data by stating that the data points lie between the representatives, while clustering characterizes the data by stating that each data point is close to one of the representatives. The precision of such a characterization is measured as the deviation from its underlying idea: the distance of the actual data points from the closest representative in the case of clustering, and from the interior of the hull spanned by the representatives in the case of hulling. We show that for low-dimensional data the precision of hull-based characterization is much better than that of clustering. Clustering and hulling are both examples of sophisticated optimization problems, and evolutionary algorithms are an excellent tool for solving such problems; however, their usefulness decreases for large, high-dimensional datasets. In this paper we discuss heuristics for hulling massive data, and we hope this will inspire the creation of an effective evolutionary algorithm dedicated to solving such problems.
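
To make the two error measures concrete, the following is a minimal Python sketch (not taken from the paper) that contrasts them on synthetic non-negative data. The clustering error is the mean distance of each point to its nearest representative; the hulling error is assumed here to be the mean Euclidean distance of each point to the convex hull spanned by the representatives. For simplicity the same k-means centroids serve as representatives for both measures, whereas the paper obtains its representatives via NMF-based optimization.

# Illustrative sketch only: contrasts the clustering error and an assumed
# hull-distance error on synthetic data; not the authors' implementation.
import numpy as np
from scipy.optimize import minimize
from sklearn.cluster import KMeans

def clustering_error(X, reps):
    """Mean distance from each point to its closest representative."""
    d = np.linalg.norm(X[:, None, :] - reps[None, :, :], axis=2)
    return d.min(axis=1).mean()

def dist_to_hull(x, reps):
    """Distance from x to the convex hull of the rows of reps, obtained by
    solving min_h ||x - reps.T @ h|| subject to h >= 0 and sum(h) = 1."""
    k = reps.shape[0]
    h0 = np.full(k, 1.0 / k)  # start from the uniform convex combination
    res = minimize(
        lambda h: np.linalg.norm(x - reps.T @ h),
        h0,
        bounds=[(0.0, 1.0)] * k,
        constraints={"type": "eq", "fun": lambda h: h.sum() - 1.0},
        method="SLSQP",
    )
    return res.fun

def hulling_error(X, reps):
    """Mean distance from each point to the hull of the representatives."""
    return np.mean([dist_to_hull(x, reps) for x in X])

rng = np.random.default_rng(0)
X = rng.random((200, 2))  # low-dimensional, non-negative data
reps = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X).cluster_centers_

print("clustering error:", clustering_error(X, reps))
print("hulling error:   ", hulling_error(X, reps))

The hull distance is computed by projecting each point onto the set of convex combinations of the representatives with a small SLSQP solve; any constrained least-squares routine would serve equally well.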
Keywords
Non-negative Matrix Factorization, clustering, hulling, k-means