
Spectral Embedded Clustering

IJCAI, pp. 1181–1186, 2009


Abstract

In this paper, we propose a new spectral clustering method, referred to as Spectral Embedded Clustering (SEC), to minimize the normalized cut criterion in spectral clustering as well as control the mismatch between the cluster assignment matrix and the low dimensional embedded representation of the data. SEC is based on the observation that the cluster assignment matrix can always be represented by a low dimensional linear mapping of the high-dimensional data.
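The abstract couples two terms: the normalized-cut objective of spectral clustering and a penalty on the mismatch between the cluster assignment matrix and a linear embedding of the data. One plausible way to write such a combined objective, reconstructed from this description rather than copied from the paper (the exact formulation and notation in the paper may differ), is

    \min_{F^{\top}F = I,\; W,\; b}\ \operatorname{tr}\!\left(F^{\top}\tilde{L}F\right)
      + \mu\left(\bigl\lVert X^{\top}W + \mathbf{1}b^{\top} - F \bigr\rVert_F^{2}
      + \gamma\,\lVert W\rVert_F^{2}\right)

where \tilde{L} is the normalized graph Laplacian built from the data, X is the data matrix, F is the relaxed cluster assignment matrix, (W, b) is a linear mapping with bias, and \mu, \gamma are trade-off parameters.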

Introduction
  • Clustering is a fundamental task in many machine learning, data mining and pattern recognition problems.
  • Researchers usually first project the high-dimensional data onto a low-dimensional subspace via some dimension reduction technique, such as Principal Component Analysis (PCA), and then cluster in that subspace (a minimal sketch of this two-stage pipeline follows this list).
  • To achieve better clustering performance, several works have been proposed that perform K-means clustering and dimension reduction iteratively for high-dimensional data [la Torre and Kanade, 2006; Ding and Li, 2007; Ye et al., 2007].
  • Discriminative K-means (DisKmeans), however, does not consider the geometric structure (i.e., the manifold) of the data.
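A minimal sketch of that two-stage pipeline using scikit-learn; the data shape, number of components and number of clusters below are illustrative assumptions, not values from the paper:

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1440, 1024))   # placeholder high-dimensional data

    # Stage 1: dimension reduction with PCA.
    Z = PCA(n_components=10).fit_transform(X)

    # Stage 2: K-means clustering on the projected data.
    labels = KMeans(n_clusters=20, n_init=10).fit_predict(Z)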
Highlights
  • Clustering is a fundamental task in many machine learning, data mining and pattern recognition problems.
  • Many clustering algorithms have been developed, such as K-means clustering, mixture models [McLachlan and Peel, 2000], spectral clustering [Ng et al., 2001; Shi and Malik, 2000; Yu and Shi, 2003], support vector clustering [Ben-Hur et al., 2001], and maximum margin clustering [Xu et al., 2005; Zhang et al., 2007; Li et al., 2009].
  • We have the following observations: 1) When the traditional EM-like technique is used in KM and DKM to assign cluster labels, DKM and KM lead to different results.
  • Even for datasets with a clear manifold structure, such as COIL-20 and UMIST, Spectral Embedded Clustering is still better than Spectral Clustering and Clustering with Local and Global Regularization. 5) For low-dimensional data sets (e.g., Iris and Vote), Spectral Embedded Clustering is slightly better than DKM over some ranges of the parameter μ, and DKM slightly outperforms Spectral Embedded Clustering over other ranges of the parameter γ.
  • Observing that the cluster assignment matrix can always be represented by a low-dimensional linear mapping of the high-dimensional data, we propose Spectral Embedded Clustering (SEC).
  • We prove that spectral clustering, Clustering with Local and Global Regularization, K-means and Discriminative K-means are all special cases of Spectral Embedded Clustering in terms of their objective functions.
Methods
  • The authors compare the proposed Spectral Embedded Clustering (SEC) with Spectral Clustering (SC) [Yu and Shi, 2003], CLGR [Wang et al., 2007], K-means (KM) and Discriminative K-means (DKM) [Ye et al., 2008].
  • Spectral relaxation followed by spectral rotation is used to compute the assignment matrix for SEC, SC and CLGR (a simplified sketch of the rotation step follows this list).
  • The authors also implement K-means and Discriminative K-means using spectral relaxation plus spectral rotation for cluster assignment.
  • As K-means and Discriminative K-means become equivalent when spectral relaxation is used, the authors denote these results as KM-r in this work.
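A simplified sketch of the spectral rotation (discretization) step in the spirit of Yu and Shi [2003]: given the relaxed continuous solution F (the top eigenvectors of the normalized Laplacian), alternately find the best rotation via orthogonal Procrustes and re-discretize. This is an illustrative reimplementation under those assumptions, not the authors' code:

    import numpy as np

    def spectral_rotation(F, n_iter=50):
        """Discretize a relaxed spectral solution F (n x k) into hard labels."""
        n, k = F.shape
        # Row-normalize the continuous eigenvector solution.
        F = F / (np.linalg.norm(F, axis=1, keepdims=True) + 1e-12)
        R = np.eye(k)                             # start from the identity rotation
        labels = np.argmax(F @ R, axis=1)
        for _ in range(n_iter):
            Y = np.eye(k)[labels]                 # one-hot cluster indicator matrix
            # Orthogonal Procrustes: rotation R minimizing ||F R - Y||_F.
            U, _, Vt = np.linalg.svd(F.T @ Y)
            R = U @ Vt
            new_labels = np.argmax(F @ R, axis=1)
            if np.array_equal(new_labels, labels):
                break                             # converged to a fixed point
            labels = new_labels
        return labels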
Results
  • The clustering results from various algorithms are reported in Table 2 and Table 3.
  • The authors have the following observations: 1) When the traditional EM-like technique is used in KM and DKM to assign cluster labels, DKM and KM lead to different results.
  • 5) For low-dimensional data sets (e.g., Iris and Vote), SEC is slightly better than DKM over some ranges of the parameter μ, and DKM slightly outperforms SEC over other ranges of the parameter γ.
  • For all high-dimensional data sets, SEC outperforms DKM over most of the range of the parameter μ in terms of both ACC and NMI (a sketch of how these two metrics are typically computed follows this list).
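A sketch of the standard way these two metrics are computed: clustering accuracy (ACC) uses the Hungarian algorithm to find the best one-to-one matching between predicted clusters and ground-truth classes, and NMI comes directly from scikit-learn. This reflects common practice; the paper's exact evaluation code is not shown on this page.

    import numpy as np
    from scipy.optimize import linear_sum_assignment
    from sklearn.metrics import normalized_mutual_info_score

    def clustering_accuracy(y_true, y_pred):
        """ACC: accuracy under the best one-to-one cluster-to-class matching."""
        y_true = np.asarray(y_true)
        y_pred = np.asarray(y_pred)
        k = int(max(y_true.max(), y_pred.max())) + 1
        counts = np.zeros((k, k), dtype=np.int64)
        for t, p in zip(y_true, y_pred):
            counts[p, t] += 1                     # cluster/class co-occurrence counts
        rows, cols = linear_sum_assignment(counts, maximize=True)
        return counts[rows, cols].sum() / y_true.size

    # nmi = normalized_mutual_info_score(y_true, y_pred)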
Conclusion
  • Observing that the cluster assignment matrix can always be represented by a low-dimensional linear mapping of the high-dimensional data, the authors propose Spectral Embedded Clustering.
Tables
  • Table 1: Dataset description
  • Table 2: Performance comparison of clustering accuracy (ACC) from KM, DKM, KM-r, SC, CLGR and SEC on eight databases
  • Table 3: Performance comparison of normalized mutual information (NMI) from KM, DKM, KM-r, SC, CLGR and SEC on eight databases
Funding
  • This material is based upon work funded by the Singapore National Research Foundation Interactive Digital Media R&D Program (Grant No. NRF2008IDM-IDM-004-018) and NSFC (Grant No. 60835002).
References
  • [Ben-Hur et al., 2001] A. Ben-Hur, D. Horn, H.T. Siegelmann, and V. Vapnik. Support vector clustering. Journal of Machine Learning Research, 2:125–137, 2001.
  • [Cai et al., 2005] Deng Cai, Xiaofei He, and Jiawei Han. Document clustering using locality preserving indexing. IEEE Trans. Knowl. Data Eng., 17(12):1624–1637, 2005.
  • [Ding and Li, 2007] Chris H. Q. Ding and Tao Li. Adaptive dimension reduction using discriminant analysis and K-means clustering. In ICML, pages 521–528, 2007.
  • [Ding et al., 2002] Chris H. Q. Ding, Xiaofeng He, Hongyuan Zha, and Horst D. Simon. Adaptive dimension reduction for clustering high dimensional data. In ICDM, pages 147–154, 2002.
  • [Jain and Dubes, 1988] A.K. Jain and R.C. Dubes. Algorithms for Clustering Data. Prentice Hall, Englewood Cliffs, NJ, 1988.
  • [la Torre and Kanade, 2006] Fernando De la Torre and Takeo Kanade. Discriminative cluster analysis. In ICML, pages 241–248, 2006.
  • [Li et al., 2004] Tao Li, Sheng Ma, and Mitsunori Ogihara. Document clustering via adaptive subspace iteration. In SIGIR, pages 218–225, 2004.
  • [Li et al., 2009] Y. Li, I.W. Tsang, J.T. Kwok, and Z. Zhou. Tighter and convex maximum margin clustering. In AISTATS, 2009.
  • [McLachlan and Peel, 2000] G. McLachlan and D. Peel. Finite Mixture Models. John Wiley & Sons, New York, 2000.
  • [Ng et al., 2001] Andrew Y. Ng, Michael I. Jordan, and Yair Weiss. On spectral clustering: Analysis and an algorithm. In NIPS, pages 849–856, 2001.
  • [Shi and Malik, 2000] Jianbo Shi and Jitendra Malik. Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell., 22(8):888–905, 2000.
  • [Wang et al., 2007] Fei Wang, Changshui Zhang, and Tao Li. Clustering with local and global regularization. In AAAI, pages 657–662, 2007.
  • [Wu and Schölkopf, 2007] M. Wu and B. Schölkopf. Transductive classification via local learning regularization. In AISTATS, pages 628–635, 2007.
  • [Xu et al., 2005] L. Xu, J. Neufeld, B. Larson, and D. Schuurmans. Maximum margin clustering. In NIPS. MIT Press, Cambridge, MA, 2005.
  • [Ye et al., 2007] Jieping Ye, Zheng Zhao, and Huan Liu. Adaptive distance metric learning for clustering. In CVPR, 2007.
  • [Ye et al., 2008] Jieping Ye, Zheng Zhao, and Mingrui Wu. Discriminative k-means for clustering. In Advances in Neural Information Processing Systems 20, pages 1649–1656, 2008.
  • [Ye, 2005] Jieping Ye. Characterization of a family of algorithms for generalized discriminant analysis on undersampled problems. Journal of Machine Learning Research, 6:483–502, 2005.
  • [Ye, 2007] Jieping Ye. Least squares linear discriminant analysis. In ICML, pages 1087–1093, 2007.
  • [Yu and Shi, 2003] Stella X. Yu and Jianbo Shi. Multiclass spectral clustering. In ICCV, pages 313–319, 2003.
  • [Zelnik-Manor and Perona, 2004] Lihi Zelnik-Manor and Pietro Perona. Self-tuning spectral clustering. In NIPS, 2004.
  • [Zha et al., 2001] Hongyuan Zha, Xiaofeng He, Chris H. Q. Ding, Ming Gu, and Horst D. Simon. Spectral relaxation for k-means clustering. In NIPS, pages 1057–1064, 2001.
  • [Zhang et al., 2007] K. Zhang, I.W. Tsang, and J.T. Kwok. Maximum margin clustering made practical. In ICML, 2007.