
Unsupervised Deep Embedding for Clustering Analysis

International Conference on Machine Learning (ICML), 2016


Abstract

Clustering is central to many data-driven application domains and has been studied extensively in terms of distance functions and grouping algorithms. Relatively little work has focused on learning representations for clustering. In this paper, we propose Deep Embedded Clustering (DEC), a method that simultaneously learns feature representations and cluster assignments using deep neural networks. DEC learns a mapping from the data space to a lower-dimensional feature space in which it iteratively optimizes a clustering objective. Our experimental evaluations on image and text corpora show significant improvement over state-of-the-art methods.

Introduction
  • Clustering, an essential data analysis and visualization tool, has been studied extensively in unsupervised machine learning from different perspectives: What defines a cluster? What is the right distance metric? How to efficiently group instances into clusters? How to validate clusters? And so on.
  • Little work has focused on the unsupervised learning of the feature space in which to perform clustering.
  • A notion of distance or dissimilarity is central to data clustering algorithms.
  • This, in turn, relies on representing the data in a feature space.
  • The k-means clustering algorithm (MacQueen et al., 1967), for example, uses the Euclidean distance between points in a given feature space, which for images might be raw pixels or gradient-orientation histograms.
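As a concrete illustration of this dependence on the feature space, the sketch below is a minimal, generic k-means implementation in NumPy that operates on whatever feature vectors it is given (raw pixels, HOG features, or a learned embedding). It is a textbook version for illustration only, not code from the paper.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Minimal k-means in the spirit of MacQueen et al. (1967): assign each
    point to its nearest centroid under Euclidean distance, recompute the
    centroids, and repeat until the assignments stop changing."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        # Euclidean distance of every point to every centroid, shape (n, k).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for j in range(k):
            if np.any(labels == j):  # skip empty clusters
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids
```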
Highlights
  • Clustering, an essential data analysis and visualization tool, has been studied extensively in unsupervised machine learning from different perspectives: What defines a cluster? What is the right distance metric? How to efficiently group instances into clusters? How to validate clusters? And so on
  • Numerous different distance functions and embedding methods have been explored in the literature
  • Little work has focused on the unsupervised learning of the feature space in which to perform clustering
  • We show qualitative and quantitative results that demonstrate the benefit of Deep Embedded Clustering compared to LDMGI and SEC
  • This paper presents Deep Embedded Clustering (DEC), an algorithm that clusters a set of data points in a jointly optimized feature space
Methods
  • Table 2 compares clustering accuracy for k-means, LDMGI, SEC, and DEC (including a DEC w/o backprop ablation) on MNIST, STL-HOG, REUTERS-10k, and the full REUTERS set; LDMGI and SEC do not scale to full REUTERS and are marked N/A.
  • The authors computed tf-idf features on the 2000 most frequently occurring word stems.
  • Since some algorithms do not scale to the full Reuters dataset, the authors sampled a random subset of 10,000 examples, called REUTERS-10k, for comparison purposes.
  • A summary of dataset statistics is shown in Table 1.
  • The authors normalize all datasets so that $\frac{1}{d}\|x_i\|_2^2$ is approximately 1, where $d$ is the dimensionality of the data-space point $x_i \in X$.
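A minimal sketch of this normalization, assuming a single global scale factor chosen so that the average of $\frac{1}{d}\|x_i\|_2^2$ over the dataset is 1 (the summary does not spell out whether the scale is global or per point, so that detail is an assumption):

```python
import numpy as np

def normalize_dataset(X):
    """Rescale an (n, d) data matrix so that (1/d) * ||x_i||_2^2 is
    approximately 1, enforced here on average with one global factor."""
    n, d = X.shape
    mean_sq = np.mean(np.sum(X ** 2, axis=1) / d)  # average of (1/d)||x_i||^2
    return X / np.sqrt(mean_sq)
```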
Results
  • Evaluation Metric

    The authors use the standard unsupervised evaluation metric and protocol for comparisons to other algorithms (Yang et al., 2010).
  • The clustering accuracy (Eq. 10) is $\mathrm{ACC} = \max_{m} \frac{\sum_{i=1}^{n} \mathbf{1}\{l_i = m(c_i)\}}{n}$, where $l_i$ is the ground-truth label, $c_i$ is the cluster assignment produced by the algorithm, and $m$ ranges over all possible one-to-one mappings between clusters and labels.
  • This metric takes a cluster assignment from an unsupervised algorithm and a ground truth assignment and finds the best matching between them.
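A minimal sketch of this accuracy metric, assuming integer cluster and label indices starting at 0; SciPy's linear_sum_assignment plays the role of the Hungarian method (Kuhn, 1955) for finding the best one-to-one mapping:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(labels_true, labels_pred):
    """Unsupervised clustering accuracy: fraction of points correctly
    assigned under the best one-to-one mapping between clusters and labels."""
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    k = int(max(labels_true.max(), labels_pred.max())) + 1
    # Contingency matrix: counts[c, l] = points in cluster c with true label l.
    counts = np.zeros((k, k), dtype=np.int64)
    for c, l in zip(labels_pred, labels_true):
        counts[c, l] += 1
    # Maximizing matched counts is the same as minimizing negated counts.
    rows, cols = linear_sum_assignment(-counts)
    return counts[rows, cols].sum() / labels_true.size
```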
Conclusion
  • The underlying assumption of DEC is that the initial classifier’s high confidence predictions are mostly correct
  • To verify that this assumption holds for the task and that the choice of P has the desired properties, the authors plot the magnitude of the gradient of L with respect to each embedded point, $|\partial L/\partial z_i|$, against its soft assignment $q_{ij}$ to a randomly chosen cluster (a sketch of how $q_{ij}$ and P are computed follows after this list).
  • This paper presents Deep Embedded Clustering (DEC), an algorithm that clusters a set of data points in a jointly optimized feature space.
  • DEC has the virtue of linear complexity in the number of data points which allows it to scale to large datasets
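The soft assignments q_ij and the target distribution P referenced above are defined in the paper via a Student's t kernel over embedded points and centroids, and a frequency-normalized square of Q. The NumPy sketch below follows those definitions; it is illustrative only (in practice the embedding, the centroids, and the KL objective are optimized jointly by backpropagation in a deep-learning framework).

```python
import numpy as np

def soft_assignments(Z, mu, alpha=1.0):
    """Student's t similarity q_ij between embedded points Z (n, k)
    and cluster centroids mu (K, k)."""
    dist_sq = np.sum((Z[:, None, :] - mu[None, :, :]) ** 2, axis=2)
    q = (1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Auxiliary target P: square q and normalize by cluster frequency,
    which emphasizes high-confidence assignments."""
    weight = q ** 2 / q.sum(axis=0)      # q_ij^2 / f_j with f_j = sum_i q_ij
    return weight / weight.sum(axis=1, keepdims=True)

def kl_loss(p, q):
    """Clustering objective L = KL(P || Q)."""
    return float(np.sum(p * np.log(p / q)))
```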
Tables
  • Table 1: Dataset statistics (# points, # classes, dimension)
  • Table 2: Comparison of clustering accuracy (Eq. 10) on four datasets
  • Table 3: Comparison of clustering accuracy (Eq. 10) on autoencoder (AE) features
  • Table 4: Clustering accuracy (Eq. 10) on an imbalanced subsample of MNIST
Related work
  • Clustering has been extensively studied in machine learning in terms of feature selection (Boutsidis et al, 2009; Liu & Yu, 2005; Alelyani et al, 2013), distance functions (Xing et al, 2002; Xiang et al, 2008), grouping methods (MacQueen et al, 1967; Von Luxburg, 2007; Li et al, 2004), and cluster validation (Halkidi et al, 2001). Space does not allow for a comprehensive literature study and we refer readers to (Aggarwal & Reddy, 2013) for a survey.

    One branch of popular methods for clustering is k-means (MacQueen et al., 1967) and Gaussian Mixture Models (GMM) (Bishop, 2006). These methods are fast and applicable to a wide range of problems. However, their distance metrics are limited to the original data space and they tend to be ineffective when input dimensionality is high (Steinbach et al., 2004).

    Several variants of k-means have been proposed to address issues with higher-dimensional input spaces. De la Torre & Kanade (2006) and Ye et al. (2008) perform joint dimensionality reduction and clustering by first clustering the data with k-means and then projecting the data into a lower-dimensional space in which the inter-cluster variance is maximized. This process is repeated in EM-style iterations until convergence. However, this framework is limited to linear embeddings; our method employs deep neural networks to perform the non-linear embedding that is necessary for more complex data.
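A rough sketch of that EM-style alternation, using scikit-learn's KMeans and LinearDiscriminantAnalysis as stand-ins for the clustering step and the inter-cluster-variance-maximizing projection; this is a simplified illustration under those assumptions, not a faithful reimplementation of De la Torre & Kanade (2006) or Ye et al. (2008).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def alternating_cluster_project(X, n_clusters, n_components, n_iter=10):
    """Alternate between k-means clustering and a linear projection that
    maximizes inter-cluster variance for the current labels.
    n_components must be at most n_clusters - 1 for the LDA step."""
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
    Z = X
    for _ in range(n_iter):
        # Project the original data into the discriminative subspace.
        Z = LinearDiscriminantAnalysis(n_components=n_components).fit_transform(X, labels)
        new_labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(Z)
        if np.array_equal(new_labels, labels):  # converged
            break
        labels = new_labels
    return labels, Z
```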
Funding
  • This work is in part supported by ONR N00014-13-1-0720, NSF IIS-1338054, and an Allen Distinguished Investigator Award
References
  • Aggarwal, Charu C and Reddy, Chandan K. Data clustering: algorithms and applications. CRC Press, 2013.
  • Alelyani, Salem, Tang, Jiliang, and Liu, Huan. Feature selection for clustering: A review. Data Clustering: Algorithms and Applications, 2013.
  • Bellman, R. Adaptive Control Processes: A Guided Tour. Princeton University Press, Princeton, New Jersey, 1961.
  • Bengio, Yoshua, Courville, Aaron, and Vincent, Pascal. Representation learning: A review and new perspectives. 2013.
  • Bishop, Christopher M. Pattern recognition and machine learning. Springer, New York, 2006.
  • Boutsidis, Christos, Drineas, Petros, and Mahoney, Michael W. Unsupervised feature selection for the k-means clustering problem. In NIPS, 2009.
  • Coates, Adam, Ng, Andrew Y, and Lee, Honglak. An analysis of single-layer networks in unsupervised feature learning. In International Conference on Artificial Intelligence and Statistics, pp. 215–223, 2011.
  • De la Torre, Fernando and Kanade, Takeo. Discriminative cluster analysis. In ICML, 2006.
  • Doersch, Carl, Singh, Saurabh, Gupta, Abhinav, Sivic, Josef, and Efros, Alexei. What makes Paris look like Paris? ACM Transactions on Graphics, 2012.
  • Girshick, Ross, Donahue, Jeff, Darrell, Trevor, and Malik, Jitendra. Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR, 2014.
  • Halkidi, Maria, Batistakis, Yannis, and Vazirgiannis, Michalis. On clustering validation techniques. Journal of Intelligent Information Systems, 2001.
  • Hinton, Geoffrey E and Salakhutdinov, Ruslan R. Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507, 2006.
  • Hornik, Kurt. Approximation capabilities of multilayer feedforward networks. Neural Networks, 4(2):251–257, 1991.
  • Jia, Yangqing, Shelhamer, Evan, Donahue, Jeff, Karayev, Sergey, Long, Jonathan, Girshick, Ross, Guadarrama, Sergio, and Darrell, Trevor. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014.
  • Krizhevsky, Alex, Sutskever, Ilya, and Hinton, Geoffrey E. ImageNet classification with deep convolutional neural networks. In NIPS, 2012.
  • Kuhn, Harold W. The Hungarian method for the assignment problem. Naval Research Logistics Quarterly, 2(1-2):83–97, 1955.
  • Le, Quoc V. Building high-level features using large scale unsupervised learning. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8595–8598. IEEE, 2013.
  • LeCun, Yann, Bottou, Leon, Bengio, Yoshua, and Haffner, Patrick. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
  • Lewis, David D, Yang, Yiming, Rose, Tony G, and Li, Fan. RCV1: A new benchmark collection for text categorization research. JMLR, 2004.
  • Li, Tao, Ma, Sheng, and Ogihara, Mitsunori. Entropy-based criterion in categorical clustering. In ICML, 2004.
  • Liu, Huan and Yu, Lei. Toward integrating feature selection algorithms for classification and clustering. IEEE Transactions on Knowledge and Data Engineering, 2005.
  • Long, Jonathan, Shelhamer, Evan, and Darrell, Trevor. Fully convolutional networks for semantic segmentation. arXiv preprint arXiv:1411.4038, 2014.
  • MacQueen, James et al. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, 1967.
  • Nair, Vinod and Hinton, Geoffrey E. Rectified linear units improve restricted Boltzmann machines. In ICML, 2010.
  • Nie, Feiping, Zeng, Zinan, Tsang, Ivor W, Xu, Dong, and Zhang, Changshui. Spectral embedded clustering: A framework for in-sample and out-of-sample spectral clustering. IEEE Transactions on Neural Networks, 2011.
  • Nigam, Kamal and Ghani, Rayid. Analyzing the effectiveness and applicability of co-training. In Proceedings of the Ninth International Conference on Information and Knowledge Management, 2000.
  • Srivastava, Nitish, Hinton, Geoffrey, Krizhevsky, Alex, Sutskever, Ilya, and Salakhutdinov, Ruslan. Dropout: A simple way to prevent neural networks from overfitting. JMLR, 2014.
  • Tian, Fei, Gao, Bin, Cui, Qing, Chen, Enhong, and Liu, Tie-Yan. Learning deep representations for graph clustering. In AAAI Conference on Artificial Intelligence, 2014.
  • van der Maaten, Laurens. Learning a parametric embedding by preserving local structure. In International Conference on Artificial Intelligence and Statistics, 2009.
  • van der Maaten, Laurens. Accelerating t-SNE using tree-based algorithms. JMLR, 2014.
  • van der Maaten, Laurens and Hinton, Geoffrey. Visualizing data using t-SNE. JMLR, 2008.
  • Vincent, Pascal, Larochelle, Hugo, Lajoie, Isabelle, Bengio, Yoshua, and Manzagol, Pierre-Antoine. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. JMLR, 2010.
  • Von Luxburg, Ulrike. A tutorial on spectral clustering. Statistics and Computing, 2007.
  • Xiang, Shiming, Nie, Feiping, and Zhang, Changshui. Learning a Mahalanobis distance metric for data clustering and classification. Pattern Recognition, 2008.
  • Xing, Eric P, Jordan, Michael I, Russell, Stuart, and Ng, Andrew Y. Distance metric learning with application to clustering with side-information. In NIPS, 2002.
  • Yan, Donghui, Huang, Ling, and Jordan, Michael I. Fast approximate spectral clustering. In ACM SIGKDD, 2009.
  • Yang, Yi, Xu, Dong, Nie, Feiping, Yan, Shuicheng, and Zhuang, Yueting. Image clustering using local discriminant models and global integration. IEEE Transactions on Image Processing, 2010.
  • Ye, Jieping, Zhao, Zheng, and Wu, Mingrui. Discriminative k-means for clustering. In NIPS, 2008.
  • Zeiler, Matthew D and Fergus, Rob. Visualizing and understanding convolutional networks. In ECCV, 2014.