Distance Metric Learning with Application to Clustering with Side-Information

NIPS, pp. 505-512, 2003

Abstract

Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many "plausible" ways, and if a clustering algorithm such as K-means initially fails to find one that is meaningful to a user, the only recourse may be for the user to manually tweak the metric until sufficiently good clusters are found.

Introduction
  • The performance of many learning and data-mining algorithms depends critically on their being given a good metric over the input space.
  • K-means, nearest-neighbor classifiers, and kernel algorithms such as SVMs all need to be given good metrics that reflect reasonably well the important relationships between the data (the sketch after this list illustrates how sensitive K-means is to the metric).
  • This problem is acute in unsupervised settings such as clustering, and is related to the perennial problem of there often being no "right" answer for clustering: if three algorithms are used to cluster a set of documents, and one clusters according to authorship, another according to topic, and a third according to writing style, who is to say which is the "right" answer?
  • This includes algorithms such as Multidimensional Scaling (MDS) [2] and Locally Linear Embedding (LLE) [9].
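To make the metric-sensitivity point concrete, here is a minimal illustration that is not from the paper: a toy dataset where K-means under the plain Euclidean metric is dominated by a noisy high-variance coordinate, while a simple diagonal rescaling of the space (a special case of the Mahalanobis metrics the paper learns) recovers the intended clusters. The data, the `kmeans` helper, and the 0.1 scale factor are all invented for this sketch.

```python
# Minimal illustration (not from the paper): clustering quality depends on
# the metric, and a diagonal rescaling can recover the intended clusters.
import numpy as np

rng = np.random.default_rng(0)
# Two classes separated along feature 1, swamped by noise in feature 2.
a = rng.normal(loc=[0.0, 0.0], scale=[0.3, 3.0], size=(100, 2))
b = rng.normal(loc=[2.0, 0.0], scale=[0.3, 3.0], size=(100, 2))
X = np.vstack([a, b])
y = np.array([0] * 100 + [1] * 100)

def kmeans(X, k=2, iters=50, seed=0):
    """Plain Lloyd's algorithm under the Euclidean metric."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        centers = np.array([X[labels == j].mean(0) if (labels == j).any()
                            else centers[j] for j in range(k)])
    return labels

def two_cluster_accuracy(labels, truth):
    # Cluster indices are arbitrary; score the better of the two matchings.
    acc = (labels == truth).mean()
    return max(acc, 1.0 - acc)

print("raw metric     :", two_cluster_accuracy(kmeans(X), y))
# Down-weighting the noisy coordinate, i.e. using a diagonal metric.
print("rescaled metric:", two_cluster_accuracy(kmeans(X * [1.0, 0.1]), y))
```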
Highlights
  • The performance of many learning and data-mining algorithms depends critically on their being given a good metric over the input space.
  • K-means, nearest-neighbor classifiers, and kernel algorithms such as SVMs all need to be given good metrics that reflect reasonably well the important relationships between the data. This problem is acute in unsupervised settings such as clustering, and is related to the perennial problem of there often being no "right" answer for clustering: if three algorithms are used to cluster a set of documents, and one clusters according to authorship, another according to topic, and a third according to writing style, who is to say which is the "right" answer? Worse, if an algorithm were to have clustered by topic, and we instead wanted it to cluster by writing style, there are relatively few systematic mechanisms for us to convey this to a clustering algorithm, and we are often left tweaking distance metrics by hand.
  • We are interested in the following problem: suppose a user indicates that certain points in an input space are considered by them to be "similar." Can we automatically learn a distance metric over R^n that respects these relationships, i.e., one that assigns small distances between the similar pairs? For instance, in the documents example, we might hope that, by giving it pairs of documents judged to be written in similar styles, it would learn to recognize the critical features for determining style.
  • We have presented an algorithm that, given examples of similar pairs of points in R^n, learns a distance metric that respects these relationships.
  • Our method is based on posing metric learning as a convex optimization problem, which allowed us to derive efficient, local-optima-free algorithms (a simplified sketch of the diagonal case follows this list).
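For the diagonal case, the paper's formulation minimizes the summed squared distances between similar pairs while a constraint keeps the dissimilar pairs apart; Xing et al. fold the constraint into the equivalent objective g(A) = sum_{(i,j) in S} ||x_i - x_j||_A^2 - log( sum_{(i,j) in D} ||x_i - x_j||_A ), where ||x||_A = sqrt(x^T A x), and solve it with Newton-Raphson. The sketch below is a simplified substitute, not the paper's code: it optimizes the same objective with projected gradient descent, and the function name, the step size, and the toy pair lists S and D are assumptions of this illustration.

```python
# Simplified sketch of learning a diagonal metric A = diag(w) by minimizing
# g(w) = sum_S ||x_i - x_j||_A^2 - log(sum_D ||x_i - x_j||_A)
# with projected gradient descent (the paper itself uses Newton-Raphson).
import numpy as np

def learn_diag_metric(X, S, D, lr=1e-3, iters=2000):
    # Squared coordinate-wise differences for each similar/dissimilar pair.
    dS = np.array([(X[i] - X[j]) ** 2 for i, j in S])  # shape (|S|, n)
    dD = np.array([(X[i] - X[j]) ** 2 for i, j in D])  # shape (|D|, n)
    w = np.ones(X.shape[1])                            # w = diag(A) >= 0
    for _ in range(iters):
        distD = np.sqrt(dD @ w + 1e-12)  # ||x_i - x_j||_A over pairs in D
        # Gradient of the similar-pair term minus that of log(sum_D distances).
        grad = dS.sum(axis=0) - (dD / (2.0 * distD[:, None])).sum(axis=0) / distD.sum()
        w = np.maximum(w - lr * grad, 0.0)  # project onto w >= 0
    return w

# Hypothetical toy usage: classes separated along feature 1, feature 2 is
# irrelevant noise, so the learned weight on feature 2 should collapse.
rng = np.random.default_rng(0)
n = 30
x1 = np.concatenate([rng.normal(0.0, 0.5, n), rng.normal(3.0, 0.5, n)])
x2 = rng.normal(0.0, 5.0, 2 * n)
X = np.column_stack([x1, x2])
S = [(i, i + 1) for i in range(0, n - 1, 2)] + \
    [(n + i, n + i + 1) for i in range(0, n - 1, 2)]   # within-class pairs
D = [(i, n + i) for i in range(n)]                     # across-class pairs
print(learn_diag_metric(X, S, D))  # weight on feature 1 should dominate
```

The w >= 0 projection is what keeps A = diag(w) positive semi-definite, and the log term plays the role of the paper's constraint that dissimilar pairs stay spread out.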
Methods
  • Experiments and Examples: The authors begin by giving some examples of distance metrics learned on artificial data, and show how the methods can be used to improve clustering performance.
  • 3.1 Examples of learned distance metrics: Consider the data shown in Figure 2(a), which is divided into two classes. Suppose that points in each class are "similar."
  • To visualize the learned metric, the authors use the fact discussed earlier that learning it is equivalent to finding a rescaling of the data that replaces each point x with A^(1/2)x (verified numerically in the sketch after this list).
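The rescaling fact is easy to check numerically: for a positive semi-definite A, the distance ||x - y||_A equals the ordinary Euclidean distance between A^(1/2)x and A^(1/2)y. A minimal sketch follows; the dimension and the randomly generated A are arbitrary choices for this check.

```python
# Numeric check that the Mahalanobis distance under A equals the
# Euclidean distance after rescaling each point x to A^(1/2) x.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)
y = rng.normal(size=3)
B = rng.normal(size=(3, 3))
A = B.T @ B  # an arbitrary symmetric positive semi-definite metric

# Matrix square root via eigendecomposition (valid since A is symmetric PSD).
vals, vecs = np.linalg.eigh(A)
A_sqrt = vecs @ np.diag(np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

d = x - y
mahalanobis = np.sqrt(d @ A @ d)                     # ||x - y||_A
euclidean = np.linalg.norm(A_sqrt @ x - A_sqrt @ y)  # after rescaling
assert np.isclose(mahalanobis, euclidean)
print(mahalanobis, euclidean)
```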
Conclusion
  • The authors have presented an algorithm that, given examples of similar pairs of points in R^n, learns a distance metric that respects these relationships.
  • The authors' method is based on posing metric learning as a convex optimization problem, which allowed them to derive efficient, local-optima-free algorithms.
  • The authors showed examples of diagonal and full metrics learned from simple artificial examples, and demonstrated on artificial and UCI datasets how the methods can be used to improve clustering performance.
References
  • C. Atkeson, A. Moore, and S. Schaal. Locally weighted learning. AI Review, 1996.
  • T. Cox and M. Cox. Multidimensional Scaling. Chapman & Hall, London, 1994.
  • C. Domeniconi and D. Gunopulos. Adaptive nearest neighbor classification using support vector machines. In Advances in Neural Information Processing Systems 14. MIT Press, 2002.
  • G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins Univ. Press, 1996.
  • T. Hastie and R. Tibshirani. Discriminant adaptive nearest neighbor classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18:607-616, 1996.
  • T. S. Jaakkola and D. Haussler. Exploiting generative models in discriminative classifiers. In Proc. of Tenth Conference on Advances in Neural Information Processing Systems, 1999.
  • I. T. Jolliffe. Principal Component Analysis. Springer-Verlag, New York, 1989.
  • R. Rockafellar. Convex Analysis. Princeton Univ. Press, 1970.
  • S. T. Roweis and L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290:2323-2326, 2000.
  • B. Schölkopf and A. Smola. Learning with Kernels. MIT Press, 2002.
  • N. Tishby, F. Pereira, and W. Bialek. The information bottleneck method. In Proc. of the 37th Allerton Conference on Communication, Control and Computing, 1999.
  • K. Wagstaff, C. Cardie, S. Rogers, and S. Schroedl. Constrained k-means clustering with background knowledge. In Proc. 18th International Conference on Machine Learning, 2001.