Reducing The Dimensionality Of Data With Neural Networks

SCIENCE 313, no. 5786 (2006): 504-507

Citations: 17271 | Views: 402
Abstract

High-dimensional data can be converted to low-dimensional codes by training a multilayer neural network with a small central layer to reconstruct high-dimensional input vectors. Gradient descent can be used for fine-tuning the weights in such "autoencoder" networks, but this works well only if the initial weights are close to a good solution. We describe an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.
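To make the setup concrete, here is a minimal sketch of such an autoencoder, assuming PyTorch: an encoder compresses each input vector into a small central code, a decoder reconstructs the input from that code, and gradient descent minimizes the reconstruction error. The layer sizes, random stand-in data, and learning rate are illustrative placeholders, and the random initialization used here is precisely what the paper's pretraining is meant to replace.

```python
# Minimal autoencoder sketch: a small central "code" layer, trained by
# gradient descent to reconstruct its own input. Assumes PyTorch; layer
# sizes, data, and learning rate are illustrative, not the paper's settings.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, n_input=784, n_code=30):
        super().__init__()
        # "Encoder": high-dimensional data -> low-dimensional code.
        self.encoder = nn.Sequential(
            nn.Linear(n_input, 256), nn.Sigmoid(),
            nn.Linear(256, n_code),
        )
        # "Decoder": code -> reconstruction of the data.
        self.decoder = nn.Sequential(
            nn.Linear(n_code, 256), nn.Sigmoid(),
            nn.Linear(256, n_input), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

x = torch.rand(64, 784)              # stand-in batch of input vectors
for _ in range(10):                  # gradient-descent fine-tuning loop
    optimizer.zero_grad()
    loss = loss_fn(model(x), x)      # reconstruction error
    loss.backward()
    optimizer.step()
```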

Introduction
  • Materials and Methods are available as supporting material on Science Online.
  • High-dimensional data can be converted to low-dimensional codes by training a multilayer neural network with a small central layer to reconstruct high-dimensional input vectors.
  • The authors describe an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.
Highlights
  • High-dimensional data can be converted to low-dimensional codes by training a multilayer neural network with a small central layer to reconstruct high-dimensional input vectors
  • We describe an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data
  • We describe a nonlinear generalization of principal components analysis that uses an adaptive, multilayer "encoder" network to transform the high-dimensional data into a low-dimensional code and a similar "decoder" network to recover the data from the code
  • After fine-tuning on all 60,000 training images, the autoencoder was tested on 10,000 new images and produced much better reconstructions than did principal components analysis (Fig. 2B)
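For context, the PCA baseline mentioned above reconstructs each vector from its projection onto the top principal directions. The NumPy-only sketch below, which uses random stand-in data rather than the MNIST images, illustrates how a 30-dimensional PCA reconstruction and its squared error could be computed; it is not the paper's evaluation code.

```python
# Linear baseline sketch: reconstruct data from a 30-dimensional PCA code
# and measure squared reconstruction error. NumPy only; the random matrix
# stands in for real image vectors and is purely illustrative.
import numpy as np

def pca_reconstruct(X, n_components=30):
    """Project X onto its top principal directions and map back to data space."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components]            # (n_components, n_features)
    codes = Xc @ W.T                 # low-dimensional codes
    return codes @ W + mean          # reconstructions in data space

X = np.random.rand(1000, 784)        # stand-in for 28x28 image vectors
X_hat = pca_reconstruct(X, n_components=30)
err = np.mean(np.sum((X - X_hat) ** 2, axis=1))
print(f"mean squared reconstruction error, 30-D PCA: {err:.3f}")
```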
Results
  • The probability of a training image can be raised by adjusting the weights and biases to lower the energy of that image and to raise the energy of similar, "confabulated" images that the network would prefer to the real data.
  • It can be shown that adding an extra layer always improves a lower bound on the log probability that the model assigns to the training data, provided the number of feature detectors per layer does not decrease and their weights are initialized correctly (9).
  • After pretraining multiple layers of feature detectors, the model is "unfolded" (Fig. 1) to produce encoder and decoder networks that initially use the same weights (see the sketch after this list).
  • To demonstrate that the pretraining algorithm allows them to fine-tune deep networks efficiently, the authors trained a very deep autoencoder on a synthetic data set containing images of "curves" that were generated from three randomly chosen points in two dimensions (8).
  • Without pretraining, the very deep autoencoder always reconstructs the average of the training data, even after prolonged fine-tuning (8).
  • Shallower autoencoders with a single hidden layer between the data and the code can learn without pretraining, but pretraining greatly reduces their total training time (8).
  • After fine-tuning on all 60,000 training images, the autoencoder was tested on 10,000 new images and produced much better reconstructions than did PCA (Fig. 2B).
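The pretraining and unfolding steps summarized above can be sketched as follows: each layer is trained as a restricted Boltzmann machine with one step of contrastive divergence, the hidden activities then become the data for the next layer, and the trained stack is unfolded so that the encoder and decoder start from the same (transposed) weights. This is a simplified NumPy illustration with placeholder layer sizes, learning rate, and data; details from the paper such as linear code units, momentum, and weight decay are omitted.

```python
# Sketch of layer-by-layer pretraining and unfolding. Each layer is trained
# as a restricted Boltzmann machine (RBM) with one step of contrastive
# divergence (CD-1); its hidden activities then serve as data for the next
# layer. NumPy only; sizes, learning rate, and data are placeholders.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def train_rbm(data, n_hidden, epochs=5, lr=0.1):
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_vis, b_hid = np.zeros(n_visible), np.zeros(n_hidden)
    for _ in range(epochs):
        v0 = data
        p_h0 = sigmoid(v0 @ W + b_hid)                    # hidden probabilities
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
        v1 = sigmoid(h0 @ W.T + b_vis)                    # "confabulated" data
        p_h1 = sigmoid(v1 @ W + b_hid)
        # CD-1 updates: lower the energy of the data, raise it for confabulations.
        W += lr * (v0.T @ p_h0 - v1.T @ p_h1) / len(data)
        b_vis += lr * (v0 - v1).mean(axis=0)
        b_hid += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b_hid

# Greedy stack: activities of one layer become the "data" for the next layer.
layer_sizes = [784, 256, 64, 30]                          # illustrative only
x = (rng.random((500, layer_sizes[0])) < 0.3).astype(float)
weights = []
for n_hidden in layer_sizes[1:]:
    W, b_hid = train_rbm(x, n_hidden)
    weights.append((W, b_hid))
    x = sigmoid(x @ W + b_hid)                            # drive the next layer

# "Unfold" the stack: the encoder uses the learned weights, and the decoder
# initially uses the same weights transposed.
encoder_init = [W for W, _ in weights]
decoder_init = [W.T for W, _ in reversed(weights)]
```

In the paper, this unfolded network is then fine-tuned as a whole by backpropagating the reconstruction error.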
Conclusion
  • The authors used a 625-2000-1000-500-30 autoencoder with linear input units to discover 30-dimensional codes for grayscale image patches that were derived from the Olivetti face data set (12); a sketch of this architecture follows the list.
  • It has been obvious since the 1980s that backpropagation through deep autoencoders would be very effective for nonlinear dimensionality reduction, provided that computers were fast enough, data sets were big enough, and the initial weights were close enough to a good solution.
  • Unlike nonparametric methods (15, 16), autoencoders give mappings in both directions between the data and code spaces, and they can be applied to very large data sets because both the pretraining and the fine-tuning scale linearly in time and space with the number of training cases.
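To illustrate the two-way mapping noted above, the sketch below builds an encoder and decoder with the 625-2000-1000-500-30 shape and maps stand-in image patches to 30-dimensional codes and back. It assumes PyTorch, uses random initialization in place of the paper's RBM pretraining, and treats the last layer of each half as linear, which is an assumption made for illustration rather than the paper's exact setup.

```python
# Sketch of an autoencoder with the 625-2000-1000-500-30 shape and of the
# two-way mapping it provides between data space and code space. Assumes
# PyTorch; weights here are randomly initialized (the paper initializes them
# from pretrained RBMs).
import torch
import torch.nn as nn

def mlp(dims, linear_last=True):
    """Stack of Linear+Sigmoid layers; optionally leave the last layer linear."""
    layers = []
    n_pairs = len(dims) - 1
    for i, (n_in, n_out) in enumerate(zip(dims[:-1], dims[1:])):
        layers.append(nn.Linear(n_in, n_out))
        if not (linear_last and i == n_pairs - 1):
            layers.append(nn.Sigmoid())
    return nn.Sequential(*layers)

sizes = [625, 2000, 1000, 500, 30]
encoder = mlp(sizes)          # data -> 30-D code
decoder = mlp(sizes[::-1])    # 30-D code -> data

patches = torch.rand(8, 625)              # stand-in for 25x25 grayscale patches
codes = encoder(patches)                  # forward mapping into code space
reconstructions = decoder(codes)          # inverse mapping back to data space
print(codes.shape, reconstructions.shape) # torch.Size([8, 30]) torch.Size([8, 625])
```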
Funding
  • The research of M.W. is supported by the Leibniz award 2000 of the Deutsche Forschungsgemeinschaft (DFG), that of S.L. through a Helmholtz-Hochschul-Nachwuchsgruppe (VH-NG-232)
  • Roweis for helpful discussions, and the Natural Sciences and Engineering Research Council of Canada for funding