Distributed learning of deep neural network over multiple agents.

Journal of Network and Computer Applications, (2018): 1-8

Cited by 75 | Viewed 75
EI

Abstract

In domains such as health care and finance, shortage of labeled data and computational resources is a critical issue while developing machine learning algorithms. To address the issue of labeled data scarcity in training and deployment of neural network-based systems, we propose a new technique to train deep neural networks over several d...

Introduction
  • Deep neural networks have become the new state of the art in classification and prediction of high dimensional data such as images, videos and bio-sensors.
  • Training of deep neural nets can be extremely data intensive, requiring the preparation of large-scale datasets collected from multiple entities [1, 2].
  • Deep neural architectures may also require large supercomputing resources and engineering oversight to reach optimal accuracy in real-world applications.
  • The authors attempt to solve these problems by proposing methods that enable training of neural networks using multiple data sources and a single supercomputing resource.
Highlights
  • Deep neural networks have become the new state of the art in classification and prediction of high dimensional data such as images, videos and bio-sensors.
  • When using deep neural networks, models trained on larger datasets have been shown to perform significantly better than those trained on smaller ones.
  • In this paper we present new methods to train deep neural networks over several data repositories.
  • We present algorithms for training neural networks without revealing the actual raw data.
  • We show how this algorithm can be beneficial in low-data scenarios by combining data from several sources. Such a method can be valuable when training on proprietary data sources where data sharing is not possible, and in areas such as biomedical imaging, where deep neural networks can be trained without revealing personal details of patients while minimizing the computational resources required on devices (a minimal sketch of one possible scheme follows this list).
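The page does not spell out the training protocol itself, so the sketch below is only a minimal, hypothetical illustration of one way multiple data holders could train a shared model without exchanging raw data: the network is assumed to be split at a cut layer, each agent keeps its first layer(s) and data locally, and only activations and the corresponding gradients cross the boundary. All class names, dimensions, and the alternating schedule are assumptions, not details taken from the paper.

```python
# Illustrative sketch only: each agent keeps its raw data and first layer locally,
# sends cut-layer activations to a shared server, and receives gradients back.
import numpy as np

rng = np.random.default_rng(0)

class Agent:
    """Data holder: keeps raw data local and shares only cut-layer activations."""
    def __init__(self, x, y, in_dim, cut_dim, lr=0.1):
        self.x, self.y, self.lr = x, y, lr
        self.W = rng.normal(scale=0.1, size=(in_dim, cut_dim))

    def forward(self):
        self.h = np.maximum(self.x @ self.W, 0.0)    # ReLU activations at the cut layer
        return self.h, self.y                        # the raw inputs x never leave the agent

    def backward(self, grad_h):
        grad_h = grad_h * (self.h > 0)               # back-propagate through the ReLU
        self.W -= self.lr * self.x.T @ grad_h

class Server:
    """Central compute resource: trains the remaining layers on received activations."""
    def __init__(self, cut_dim, n_classes, lr=0.1):
        self.V = rng.normal(scale=0.1, size=(cut_dim, n_classes))
        self.lr = lr

    def step(self, h, y):
        logits = h @ self.V
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        grad_logits = (p - np.eye(p.shape[1])[y]) / len(y)  # softmax cross-entropy gradient
        grad_h = grad_logits @ self.V.T                     # only this gradient is sent back
        self.V -= self.lr * h.T @ grad_logits
        return grad_h

# Two agents with disjoint toy data and one shared server (all sizes hypothetical).
agents = [Agent(rng.normal(size=(32, 8)), rng.integers(0, 3, size=32), 8, 16)
          for _ in range(2)]
server = Server(cut_dim=16, n_classes=3)
for _ in range(5):                       # training rounds
    for agent in agents:                 # agents take turns updating the shared server layers
        h, y = agent.forward()
        agent.backward(server.step(h, y))
```

Under this assumption, the server never observes raw inputs, and each agent only needs enough computation for its own local layers.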
Methods
  • Experiments and Applications

    The authors implement the algorithm and protocol using the Python bindings for Caffe [35] (a driver sketch follows this list).
  • The authors demonstrate that the method works across a range of different topologies and experimentally verify identical results when training over multiple agents.
  • The authors experimentally verify the method's correctness by implementing it and training it on a wide array of datasets and topologies, including MNIST, ILSVRC-12 and CIFAR-10.
  • Table 1 lists datasets and topologies along with their test accuracies.
  • As shown in Table 1, the network converges to similar accuracies when training over several agents in a distributed fashion.
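Since the experiments are implemented through the Python bindings for Caffe, a minimal driver sketch is shown below; the solver prototxt file names, the snapshot path, and the round-robin hand-off between agents are hypothetical placeholders rather than the paper's exact training loop.

```python
# A minimal sketch of driving training through Caffe's Python bindings (pycaffe).
# File names and the hand-off scheme are hypothetical; this is not the paper's protocol.
import caffe

caffe.set_mode_cpu()                                  # or caffe.set_mode_gpu()

# One solver definition per agent, each pointing at that agent's own data source.
agent_solver_files = ['agent0_solver.prototxt', 'agent1_solver.prototxt']

snapshot = None
for training_round in range(10):
    for solver_file in agent_solver_files:
        solver = caffe.SGDSolver(solver_file)
        if snapshot is not None:
            solver.net.copy_from(snapshot)            # continue from the previous agent's weights
        solver.step(100)                              # run 100 SGD iterations on local data
        snapshot = 'latest.caffemodel'                # hypothetical snapshot path
        solver.net.save(snapshot)
```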
Results
  • When using deep neural networks, models trained on larger datasets have been shown to perform significantly better than those trained on smaller ones.
Conclusion
  • Conclusions and Future Work

    In this paper the authors present new methods to train deep neural networks over several data repositories.
  • The authors show how this algorithm can be beneficial in low-data scenarios by combining data from several sources.
  • Such a method can be beneficial for training on proprietary data sources when data sharing is not possible.
  • It can be of value in areas such as biomedical imaging, where deep neural networks can be trained without revealing personal details of patients while minimizing the computational resources required on devices.
Tables
  • Table 1: Accuracies when training with the multi-agent algorithm vs. training on a single machine
  • Table 2: Comparison of how accuracy improves as more data is added during training
Related Work
  • Deep neural networks have proven to be an effective tool to classify and segment high dimensional data such as images [3], audio and videos [4]. Deep models can be several hundred layers deep [5] and can have millions of parameters, requiring large amounts of computational resources and creating the need for research into distributed training methodologies [6]. Interesting techniques include distributed gradient optimization [7, 8], online learning with delayed updates [9], and hashing and simplification of kernels [10]. Such techniques can be utilized to train very large scale deep neural networks spanning several machines [11] or to efficiently utilize several GPUs on a single machine [12]. In this paper we propose a technique for distributed computing that combines data from several different sources (an illustrative sketch of one of the referenced distributed schemes follows this section).

    Secure computation continues to be a challenging problem in computer science [13]. One category of solutions to this problem involves adopting oblivious transfer protocols to perform a secure dot product over multiple entities in polynomial time [14]. While this method is secure, it is somewhat impractical for large scale datasets because of its resource requirements. A more practical approach proposed in [14] involves sharing only SIFT and HOG features instead of the actual raw data. However, as shown in [15], such feature vectors can be inverted very accurately using prior knowledge of the methods used to create them. Neural networks have been shown to be extremely robust to the addition of noise, and their denoising and reconstruction properties make it difficult to compute them securely [16]. Neural networks have also been shown to be able to recover an entire image from only a partial input [17], rendering simple obfuscation methods ineffective.
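The surveyed distributed-training line of work is easiest to see with a concrete toy example. The sketch below illustrates the parameter-averaging flavour of parallelized SGD in the spirit of [8]; the least-squares objective, the number of workers, and all names are assumptions made for illustration, and this is not the method proposed in the paper itself.

```python
# Toy parameter-averaging SGD in the spirit of the distributed training work
# surveyed above ([8]); objective, worker count and names are illustrative only.
import numpy as np

rng = np.random.default_rng(1)
w_true = rng.normal(size=5)

def local_sgd(X, y, w, lr=0.01, epochs=3):
    """Plain SGD on one worker's shard, starting from the broadcast weights w."""
    w = w.copy()
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            w -= lr * (xi @ w - yi) * xi      # gradient of 0.5 * (x.w - y)^2
    return w

# Partition a toy least-squares dataset across 4 workers.
X = rng.normal(size=(400, 5))
y = X @ w_true + 0.01 * rng.normal(size=400)
shards = np.array_split(np.arange(400), 4)

w = np.zeros(5)
for _ in range(5):
    # Each worker trains locally on its shard, then the results are averaged.
    w = np.mean([local_sgd(X[s], y[s], w) for s in shards], axis=0)

print(np.linalg.norm(w - w_true))             # should be small after a few rounds
```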
Datasets and Analysis
Samples: 70,000
The Mixed NIST (MNIST) database [34] contains handwritten digits sampled from postal codes and is a subset of a much larger dataset available from the National Institute of Standards and Technology. MNIST comprises a total of 70,000 samples, divided into 60,000 training samples and 10,000 testing samples. The original binary images were reformatted and spatially normalized to fit in a 20 × 20 bounding box.

Training samples: 50,000
The CIFAR-10 dataset is composed of 60,000 32 × 32 color images distributed over 10 different class labels. The dataset consists of 50,000 training samples and 10,000 testing images. Images are uniformly distributed over the 10 classes, with 6,000 images per class.
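The page does not state how the training splits were divided among agents for the distributed experiments; the snippet below is a hedged sketch that simply partitions the 60,000 MNIST training samples evenly and at random across a hypothetical number of agents.

```python
# A hedged sketch of partitioning a training set (e.g. MNIST's 60,000 training
# samples) across several agents. The even, random split and the agent count are
# assumptions for illustration, not the paper's actual partitioning.
import numpy as np

rng = np.random.default_rng(42)

n_train, n_agents = 60000, 3                 # MNIST training-set size; agent count is hypothetical
perm = rng.permutation(n_train)              # shuffle sample indices
agent_indices = np.array_split(perm, n_agents)

for i, idx in enumerate(agent_indices):
    print(f"agent {i}: {len(idx)} samples")  # 20,000 samples per agent in this example
```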

References
  • A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, S. Tuecke, The data grid: Towards an architecture for the distributed management and analysis of large scientific datasets, Journal of Network and Computer Applications 23 (3) (2000) 187–200.
  • J. C.-I. Chuang, M. A. Sirbu, Distributed network storage service with quality-of-service guarantees, Journal of Network and Computer Applications 23 (3) (2000) 163–185.
  • A. Krizhevsky, I. Sutskever, G. E. Hinton, Imagenet classification with deep convolutional neural networks, in: Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
  • A. Karpathy, L. Fei-Fei, Deep visual-semantic alignments for generating image descriptions, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 3128–3137.
  • K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
  • J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, M. Mao, A. Senior, P. Tucker, K. Yang, Q. V. Le, et al., Large scale distributed deep networks, in: Advances in Neural Information Processing Systems, 2012, pp. 1223–1231.
  • R. McDonald, M. Mohri, N. Silberman, D. Walker, G. S. Mann, Efficient large-scale distributed training of conditional maximum entropy models, in: Advances in Neural Information Processing Systems, 2009, pp. 1231–1239.
  • M. Zinkevich, M. Weimer, L. Li, A. J. Smola, Parallelized stochastic gradient descent, in: Advances in Neural Information Processing Systems, 2010, pp. 2595–2603.
  • J. Langford, A. J. Smola, M. Zinkevich, Slow learners are fast, Advances in Neural Information Processing Systems 22 (2009) 2331–2339.
  • Q. Shi, J. Petterson, G. Dror, J. Langford, A. L. Strehl, A. J. Smola, S. Vishwanathan, Hash kernels, in: International Conference on Artificial Intelligence and Statistics, 2009, pp. 496–503.
  • A. Agarwal, J. C. Duchi, Distributed delayed stochastic optimization, in: Advances in Neural Information Processing Systems, 2011, pp. 873–881.
  • A. Agarwal, O. Chapelle, M. Dudík, J. Langford, A reliable effective terascale linear learning system, Journal of Machine Learning Research 15 (1) (2014) 1111–1133.
  • S. K. Sood, A combined approach to ensure data security in cloud computing, Journal of Network and Computer Applications 35 (6) (2012) 1831–1838.
  • S. Avidan, M. Butman, Blind vision, in: European Conference on Computer Vision, 2006, pp. 1–13.
  • A. Dosovitskiy, T. Brox, Inverting visual representations with convolutional networks, arXiv preprint arXiv:1506.02753.
  • P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, P.-A. Manzagol, Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion, Journal of Machine Learning Research 11 (Dec) (2010) 3371–3408.
  • D. Pathak, P. Krähenbühl, J. Donahue, T. Darrell, A. A. Efros, Context encoders: Feature learning by inpainting, arXiv preprint arXiv:1604.07379.
  • J. Secretan, M. Georgiopoulos, J. Castro, A privacy preserving probabilistic neural network for horizontally partitioned databases, in: 2007 International Joint Conference on Neural Networks, 2007, pp. 1554–1559.
  • A. Chonka, Y. Xiang, W. Zhou, A. Bonti, Cloud security defence to protect cloud computing against HTTP-DoS and XML-DoS attacks, Journal of Network and Computer Applications 34 (4) (2011) 1097–1107.
  • B. Wu, J. Wu, E. B. Fernandez, M. Ilyas, S. Magliveras, Secure and efficient key management in mobile ad hoc networks, Journal of Network and Computer Applications 30 (3) (2007) 937–954.
  • M. Barni, C. Orlandi, A. Piva, A privacy-preserving protocol for neural-network-based computation, in: Proceedings of the 8th Workshop on Multimedia and Security, 2006, pp. 146–151.
  • Y. Karam, T. Baker, A. Taleb-Bendiab, Security support for intention driven elastic cloud computing, in: Computer Modeling and Simulation (EMS), 2012 Sixth UKSim/AMSS European Symposium on, IEEE, 2012, pp. 67–73.
  • S. Subashini, V. Kavitha, A survey on security issues in service delivery models of cloud computing, Journal of Network and Computer Applications 34 (1) (2011) 1–11.
  • M. Mackay, T. Baker, A. Al-Yasiri, Security-oriented cloud computing platform for critical infrastructures, Computer Law & Security Review 28 (6) (2012) 679–686.
  • T. Baker, M. Mackay, A. Shaheed, B. Aldawsari, Security-oriented cloud platform for SOA-based SCADA, in: Cluster, Cloud and Grid Computing (CCGrid), 2015 15th IEEE/ACM International Symposium on, IEEE, 2015, pp. 961–970.
  • O. Goldreich, S. Micali, A. Wigderson, How to play any mental game, in: Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing, 1987, pp. 218–229.
  • A. C.-C. Yao, How to generate and exchange secrets, in: Foundations of Computer Science, 1986, 27th Annual Symposium on, 1986, pp. 162–167.
  • Y. Zhang, S. Zhong, A privacy-preserving algorithm for distributed training of neural network ensembles, Neural Computing and Applications 22 (1) (2013) 269–282.
  • K. Chen, L. Liu, A random rotation perturbation approach to privacy preserving data classification.
  • C. Orlandi, A. Piva, M. Barni, Oblivious neural network computing via homomorphic encryption, EURASIP Journal on Information Security 2007 (2007) 18.
  • H.-C. Shin, M. R. Orton, D. J. Collins, S. J. Doran, M. O. Leach, Stacked autoencoders for unsupervised feature learning and multiple organ detection in a pilot study using 4D patient data, IEEE Transactions on Pattern Analysis and Machine Intelligence 35 (8) (2013) 1930–1943.
  • A. Coates, A. Karpathy, A. Y. Ng, Emergence of object-selective features in unsupervised feature learning, in: Advances in Neural Information Processing Systems, 2012, pp. 2681–2689.
  • J. Weston, F. Ratle, H. Mobahi, R. Collobert, Deep learning via semi-supervised embedding, in: Neural Networks: Tricks of the Trade, Springer, 2012, pp. 639–655.
  • Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel, Backpropagation applied to handwritten zip code recognition, Neural Computation 1 (4) (1989) 541–551.
  • Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, T. Darrell, Caffe: Convolutional architecture for fast feature embedding, arXiv preprint arXiv:1408.5093.
  • K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, arXiv preprint arXiv:1409.1556.
  • J. Chen, R. Monga, S. Bengio, R. Jozefowicz, Revisiting distributed synchronous SGD, arXiv preprint arXiv:1604.00981.
  • H. B. McMahan, E. Moore, D. Ramage, S. Hampson, et al., Communication-efficient learning of deep networks from decentralized data, arXiv preprint arXiv:1602.05629.
  • I. J. Goodfellow, O. Vinyals, A. M. Saxe, Qualitatively characterizing neural network optimization problems, arXiv preprint arXiv:1412.6544.
  • P. M. Granitto, P. F. Verdes, H. A. Ceccatto, Neural network ensembles: evaluation of aggregation algorithms, Artificial Intelligence 163 (2) (2005) 139–162.
  • N. Papernot, M. Abadi, U. Erlingsson, I. Goodfellow, K. Talwar, Semi-supervised knowledge transfer for deep learning from private training data, arXiv preprint arXiv:1610.05755.