# Spectral Embedded Clustering

IJCAI, pp. 1181–1186, 2009

Abstract

In this paper, we propose a new spectral clustering method, referred to as Spectral Embedded Clustering (SEC), to minimize the normalized cut criterion in spectral clustering as well as control the mismatch between the cluster assignment matrix and the low dimensional embedded representation of the data. SEC is based on the observation that the cluster assignment matrix can always be represented by a low dimensional linear mapping of the high-dimensional data.
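As context for the normalized cut criterion the abstract refers to, here is a minimal sketch of standard normalized-cut spectral clustering (in the style of Ng et al., 2001) using only NumPy. The Gaussian affinity, the bandwidth `sigma`, and the farthest-first k-means initialization are illustrative choices, not taken from the paper:

```python
import numpy as np

def spectral_clustering_ncut(X, k, sigma=1.0, iters=50):
    """Sketch of normalized-cut spectral clustering on data X (n, d)."""
    # Gaussian affinity with zero diagonal.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    A = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    # Symmetrically normalized affinity D^{-1/2} A D^{-1/2}; its top-k
    # eigenvectors equal the bottom-k eigenvectors of L_sym = I - D^{-1/2} A D^{-1/2}.
    d = np.maximum(A.sum(1), 1e-12)
    s = 1.0 / np.sqrt(d)
    M = s[:, None] * A * s[None, :]
    _, vecs = np.linalg.eigh(M)            # eigenvalues in ascending order
    F = vecs[:, -k:]                       # relaxed cluster indicator matrix
    F = F / (np.linalg.norm(F, axis=1, keepdims=True) + 1e-12)
    # Simple k-means on the embedded rows, farthest-first initialization.
    idx = [0]
    for _ in range(1, k):
        dist = ((F[:, None, :] - F[idx][None, :, :]) ** 2).sum(-1).min(1)
        idx.append(int(dist.argmax()))
    centers = F[idx].copy()
    for _ in range(iters):
        labels = ((F[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = F[labels == j].mean(0)
    return labels
```

SEC keeps the first (normalized cut) part of this pipeline but additionally constrains the relaxed indicator to stay close to a linear function of the input data.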

Introduction

- Clustering is a fundamental task in many machine learning, data mining and pattern recognition problems.
- Researchers usually first project the high dimensional data onto a low dimensional subspace via dimension reduction techniques such as Principal Component Analysis (PCA), and then cluster in that subspace.
- To achieve better clustering performance, several works perform K-means clustering and dimension reduction iteratively for high dimensional data [la Torre and Kanade, 2006; Ding and Li, 2007; Ye et al., 2007].
- However, DisKmeans (DKM) does not consider the geometric structure (a.k.a. the manifold) of the data.
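The project-then-cluster baseline described in the bullets above can be sketched as follows. This is a generic PCA-via-SVD reduction, not code from the paper; K-means would then be run on the returned scores:

```python
import numpy as np

def pca_reduce(X, m):
    """Project X (n, d) onto its top-m principal components, giving (n, m) scores."""
    Xc = X - X.mean(axis=0)                        # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:m].T                           # coordinates in the m-dim subspace
```

The iterative schemes cited above differ from this baseline in that they re-estimate the subspace and the cluster assignment alternately rather than fixing the PCA subspace once.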

Highlights

- Many clustering algorithms have been developed, such as K-means clustering, mixture models [McLachlan and Peel, 2000], spectral clustering [Ng et al., 2001; Shi and Malik, 2000; Yu and Shi, 2003], support vector clustering [Ben-Hur et al., 2001], and maximum margin clustering [Xu et al., 2005; Zhang et al., 2007; Li et al., 2009]
- Empirically, even on datasets with clear manifold structure such as COIL-20 and UMIST, SEC outperforms Spectral Clustering and Clustering with Local and Global Regularization, and it is competitive with Discriminative K-means on low dimensional datasets (e.g., Iris and Vote)
- Observing that the cluster assignment matrix can always be represented by a low dimensional linear mapping of the high-dimensional data, we propose Spectral Embedded Clustering (SEC)
- We prove that spectral clustering, Clustering with Local and Global Regularization, K-means and Discriminative K-means are all special cases of SEC in terms of their objective functions
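Based on the description in this summary (a normalized cut term plus a controlled mismatch between the assignment matrix and a linear embedding of the data), the SEC objective plausibly takes a form like the following; this is a hedged reconstruction, and the paper's exact notation may differ:

$$
\min_{F^{\top}F=I,\;W,\;b}\;
\operatorname{tr}\!\left(F^{\top} L F\right)
\;+\;\mu\left(\left\lVert X^{\top} W + \mathbf{1}b^{\top} - F \right\rVert_F^{2}
\;+\;\gamma\,\lVert W \rVert_F^{2}\right)
$$

where $L$ is the normalized graph Laplacian, $F$ is the relaxed cluster assignment matrix, $W$ and $b$ define the linear mapping from the data $X$, and $\mu$, $\gamma$ are the trade-off parameters referred to in the Results section. Driving these parameters to limiting values is plausibly how spectral clustering, CLGR, K-means and DKM arise as special cases.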

Methods

- The authors compare the proposed Spectral Embedded Clustering (SEC) with Spectral Clustering (SC) [Yu and Shi, 2003], CLGR [Wang et al., 2007], K-means (KM) and Discriminative K-means (DKM) [Ye et al., 2008].
- Spectral relaxation followed by spectral rotation is used to compute the cluster assignment matrix for SEC, SC and CLGR.
- The authors also implement K-means and Discriminative K-means using spectral relaxation followed by spectral rotation for cluster assignment.
- As K-means and Discriminative K-means become equivalent when the spectral relaxation is used, their results are denoted as KM-r in this work.
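As an illustration of the spectral rotation step mentioned above, here is a minimal sketch in the style of the multiclass discretization of Yu and Shi (2003): it alternates a per-row argmax discretization with an orthogonal Procrustes alignment. The identity initialization and the fixed iteration count are illustrative choices, not taken from the paper:

```python
import numpy as np

def spectral_rotation(F, iters=30):
    """Discretize a relaxed indicator matrix F (n, k) into cluster labels.

    Alternates between (1) the discrete indicator Y closest to F @ R
    (row-wise argmax) and (2) the rotation R that best aligns F with Y
    (orthogonal Procrustes step).
    """
    n, k = F.shape
    F = F / (np.linalg.norm(F, axis=1, keepdims=True) + 1e-12)
    R = np.eye(k)                              # start from the identity rotation
    for _ in range(iters):
        labels = (F @ R).argmax(1)             # discrete assignment per row
        Y = np.zeros((n, k))
        Y[np.arange(n), labels] = 1.0
        # Procrustes: R = argmin_{R^T R = I} ||Y - F R||_F via SVD of F^T Y.
        U, _, Vt = np.linalg.svd(F.T @ Y)
        R = U @ Vt
    return (F @ R).argmax(1)
```

In the spectral relaxation of K-means [Zha et al., 2001], F would be the matrix of top-k eigenvectors of the relevant Gram or affinity matrix; the routine above then recovers a hard assignment from it.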

Results

- The clustering results of all algorithms are reported in Table 2 and Table 3.
- The authors make the following observations: 1) When the traditional EM-like technique is used in KM and DKM to assign cluster labels, DKM and KM lead to different results.
- 5) For low dimensional datasets (e.g., Iris and Vote), SEC is slightly better than DKM over some range of the parameter μ, and DKM slightly outperforms SEC over other ranges of the parameter γ.
- For all high dimensional datasets, SEC outperforms DKM over most of the range of the parameter μ, in terms of both ACC and NMI.
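For reference, clustering accuracy (ACC) is conventionally computed under the best one-to-one matching between predicted and true labels, and NMI from the label contingency table. The sketch below uses the Hungarian algorithm from SciPy and arithmetic-mean normalization for NMI; the paper may use a different normalization (e.g., the geometric mean of the entropies):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def contingency(y_true, y_pred):
    """Counts C[i, j] = points with true label i and predicted label j."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    C = np.zeros((y_true.max() + 1, y_pred.max() + 1))
    for t, p in zip(y_true, y_pred):
        C[t, p] += 1
    return C

def clustering_acc(y_true, y_pred):
    """Accuracy under the best one-to-one label matching (Hungarian algorithm)."""
    C = contingency(y_true, y_pred)
    m = max(C.shape)
    Cp = np.zeros((m, m))
    Cp[:C.shape[0], :C.shape[1]] = C           # pad to square if label counts differ
    rows, cols = linear_sum_assignment(-Cp)    # maximize total matched count
    return Cp[rows, cols].sum() / len(np.asarray(y_true))

def nmi(y_true, y_pred):
    """Normalized mutual information with arithmetic-mean normalization."""
    P = contingency(y_true, y_pred)
    P = P / P.sum()
    pt, pp = P.sum(1), P.sum(0)
    nz = P > 0
    mi = (P[nz] * np.log(P[nz] / np.outer(pt, pp)[nz])).sum()
    h = lambda q: -(q[q > 0] * np.log(q[q > 0])).sum()
    denom = 0.5 * (h(pt) + h(pp))
    return mi / denom if denom > 0 else 1.0
```

Both measures are invariant to a permutation of cluster labels, which is why they are standard for comparing clustering algorithms such as those in Tables 2 and 3.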

Conclusion

- Observing that the cluster assignment matrix can always be represented by a low dimensional linear mapping of the high-dimensional data, the authors propose Spectral Embedded Clustering (SEC)

Tables

- Table 1: Dataset description
- Table 2: Performance comparison of clustering accuracy (ACC) for KM, DKM, KM-r, SC, CLGR and SEC on eight databases
- Table 3: Performance comparison of normalized mutual information (NMI) for KM, DKM, KM-r, SC, CLGR and SEC on eight databases

Funding

- This material is based upon work funded by the Singapore National Research Foundation Interactive Digital Media R&D Program (Grant No. NRF2008IDM-IDM-004-018) and NSFC (Grant No. 60835002)

Reference

- [Ben-Hur et al., 2001] A. Ben-Hur, D. Horn, H.T. Siegelmann, and V. Vapnik. Support vector clustering. Journal of Machine Learning Research, 2:125–137, 2001.
- [Cai et al., 2005] Deng Cai, Xiaofei He, and Jiawei Han. Document clustering using locality preserving indexing. IEEE Trans. Knowl. Data Eng., 17(12):1624–1637, 2005.
- [Ding and Li, 2007] Chris H. Q. Ding and Tao Li. Adaptive dimension reduction using discriminant analysis and k-means clustering. In ICML, pages 521–528, 2007.
- [Ding et al., 2002] Chris H. Q. Ding, Xiaofeng He, Hongyuan Zha, and Horst D. Simon. Adaptive dimension reduction for clustering high dimensional data. In ICDM, pages 147–154, 2002.
- [Jain and Dubes, 1988] A.K. Jain and R.C. Dubes. Algorithms for Clustering Data. Prentice Hall, Englewood Cliffs, NJ, 1988.
- [la Torre and Kanade, 2006] Fernando De la Torre and Takeo Kanade. Discriminative cluster analysis. In ICML, pages 241– 248, 2006.
- [Li et al., 2004] Tao Li, Sheng Ma, and Mitsunori Ogihara. Document clustering via adaptive subspace iteration. In SIGIR, pages 218–225, 2004.
- [Li et al., 2009] Y. Li, I.W. Tsang, J. T. Kwok, and Z. Zhou. Tighter and convex maximum margin clustering. In AISTATS, 2009.
- [McLachlan and Peel, 2000] G. McLachlan and D. Peel. Finite Mixture Models. John Wiley & Sons, New York, 2000.
- [Ng et al., 2001] Andrew Y. Ng, Michael I. Jordan, and Yair Weiss. On spectral clustering: Analysis and an algorithm. In NIPS, pages 849–856, 2001.
- [Shi and Malik, 2000] Jianbo Shi and Jitendra Malik. Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell., 22(8):888–905, 2000.
- [Wang et al., 2007] Fei Wang, Changshui Zhang, and Tao Li. Clustering with local and global regularization. In AAAI, pages 657– 662, 2007.
- [Wu and Schölkopf, 2007] M. Wu and B. Schölkopf. Transductive classification via local learning regularization. In AISTATS, pages 628–635, 2007.
- [Xu et al., 2005] L. Xu, J. Neufeld, B. Larson, and D. Schuurmans. Maximum margin clustering. In NIPS. MIT Press, Cambridge, MA, 2005.
- [Ye et al., 2007] Jieping Ye, Zheng Zhao, and Huan Liu. Adaptive distance metric learning for clustering. In CVPR, 2007.
- [Ye et al., 2008] Jieping Ye, Zheng Zhao, and Mingrui Wu. Discriminative k-means for clustering. In Advances in Neural Information Processing Systems 20, pages 1649–1656. 2008.
- [Ye, 2005] Jieping Ye. Characterization of a family of algorithms for generalized discriminant analysis on undersampled problems. Journal of Machine Learning Research, 6:483–502, 2005.
- [Ye, 2007] Jieping Ye. Least squares linear discriminant analysis. In ICML, pages 1087–1093, 2007.
- [Yu and Shi, 2003] Stella X. Yu and Jianbo Shi. Multiclass spectral clustering. In ICCV, pages 313–319, 2003.
- [Zelnik-Manor and Perona, 2004] Lihi Zelnik-Manor and Pietro Perona. Self-tuning spectral clustering. In NIPS, 2004.
- [Zha et al., 2001] Hongyuan Zha, Xiaofeng He, Chris H. Q. Ding, Ming Gu, and Horst D. Simon. Spectral relaxation for k-means clustering. In NIPS, pages 1057–1064, 2001.
- [Zhang et al., 2007] K. Zhang, I.W. Tsang, and J.T. Kwok. Maximum margin clustering made practical. In ICML, Corvallis, Oregon, USA, June 2007.
