Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks

NeurIPS 2020

Cited by 1 | Views 118

Abstract

We propose firefly neural architecture descent, a general framework for progressively and dynamically growing neural networks to jointly optimize the networks’ parameters and architectures. Our method works in a steepest descent fashion, which iteratively finds the best network within a functional neighborhood of the original network that...
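To make the "iteratively find the best network within a functional neighborhood, then continue training" idea more concrete, here is a small self-contained NumPy sketch. It is only an illustration under my own assumptions, not the authors' implementation: the toy data, the candidate-scoring rule, and every name in it (grow_step, n_candidates, keep, epsilon) are hypothetical, chosen to mirror the alternating train-then-grow loop the abstract describes.

```python
# Illustrative sketch only (not the authors' implementation): a one-hidden-layer
# regression MLP is alternately trained by gradient descent and widened by a
# growing step that proposes epsilon-scale candidate neurons and keeps the few
# with the largest first-order loss decrease. All names (grow_step, n_candidates,
# keep, epsilon) are assumptions made for this toy example.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 4))                        # toy regression data
y = np.sin(X).sum(axis=1, keepdims=True)

def relu(z):
    return np.maximum(z, 0.0)

def forward(W1, b1, w2, X):
    H = relu(X @ W1 + b1)                            # hidden activations
    return H @ w2, H

def mse(pred, y):
    return 0.5 * np.mean((pred - y) ** 2)

def grow_step(W1, b1, w2, n_candidates=32, keep=2, epsilon=1e-2):
    """Score random candidate neurons by |d loss / d output-weight| at weight 0,
    then append the `keep` best with epsilon-scale output weights."""
    pred, _ = forward(W1, b1, w2, X)
    residual = (pred - y) / len(X)                   # d(mse)/d(pred)
    cand_W = rng.normal(size=(X.shape[1], n_candidates))
    cand_b = rng.normal(size=n_candidates)
    cand_H = relu(X @ cand_W + cand_b)               # candidate activations
    grads = cand_H.T @ residual                      # gradient w.r.t. each zero output weight
    best = np.argsort(np.abs(grads).ravel())[-keep:]
    W1 = np.concatenate([W1, cand_W[:, best]], axis=1)
    b1 = np.concatenate([b1, cand_b[best]])
    # Epsilon-scale output weights pointed against the gradient barely change the
    # network's function but give a first-order decrease of the loss.
    w2 = np.concatenate([w2, -epsilon * np.sign(grads[best])], axis=0)
    return W1, b1, w2

# Start tiny, then alternate plain training with a growing step.
W1, b1, w2 = rng.normal(size=(4, 2)), np.zeros(2), rng.normal(size=(2, 1))
for rnd in range(5):
    for _ in range(200):                             # gradient descent on the current net
        pred, H = forward(W1, b1, w2, X)
        g = (pred - y) / len(X)
        grad_w2 = H.T @ g
        grad_H = (g @ w2.T) * (H > 0)
        W1 -= 0.1 * (X.T @ grad_H)
        b1 -= 0.1 * grad_H.sum(axis=0)
        w2 -= 0.1 * grad_w2
    W1, b1, w2 = grow_step(W1, b1, w2)
    print(f"round {rnd}: width={W1.shape[1]}, loss={mse(forward(W1, b1, w2, X)[0], y):.4f}")
```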

Introduction
  • While biological brains are developed and shaped by complex progressive growing processes, most existing artificial deep neural networks are trained under fixed network structures.
  • Dynamically growing neural networks have been proposed as a promising approach for preventing the challenging catastrophic forgetting problem in continual learning (Rusu et al., 2016; Yoon et al., 2017; Rosenfeld & Tsotsos, 2018; Li et al., 2019).
  • The splitting steepest descent method (Liu et al., 2019), however, is restricted to neuron splitting and cannot incorporate more flexible ways of growing networks, such as adding brand-new neurons and introducing new layers, as illustrated in the sketch after this list.
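The sketch below is a toy illustration, again under my own assumptions rather than the paper's code, of the two kinds of growing operations just mentioned: splitting an existing neuron versus adding a brand-new one. The helpers split_neuron and add_new_neuron are hypothetical names.

```python
# Minimal sketch, under my own assumptions (not the paper's code), of two growing
# operations on a one-hidden-layer MLP: (a) splitting an existing neuron into two
# epsilon-perturbed copies, and (b) appending a brand-new neuron whose output
# weight starts at epsilon scale. Both keep the network's function (almost)
# unchanged, so optimization can continue smoothly after growing.
import numpy as np

rng = np.random.default_rng(0)

def forward(W1, b1, w2, X):
    return np.maximum(X @ W1 + b1, 0.0) @ w2

def split_neuron(W1, b1, w2, idx, epsilon=1e-2):
    """Replace hidden unit `idx` with two copies offset by +/- epsilon*delta,
    each carrying half of its output weight (function-preserving as epsilon -> 0)."""
    delta = rng.normal(size=W1.shape[0])
    w_a, w_b = W1[:, idx] + epsilon * delta, W1[:, idx] - epsilon * delta
    W1 = np.column_stack([np.delete(W1, idx, axis=1), w_a, w_b])
    b1 = np.concatenate([np.delete(b1, idx), [b1[idx], b1[idx]]])
    w2 = np.vstack([np.delete(w2, idx, axis=0), w2[idx] / 2, w2[idx] / 2])
    return W1, b1, w2

def add_new_neuron(W1, b1, w2, epsilon=1e-2):
    """Append a freshly initialized hidden unit whose output weight is epsilon-scale,
    so the network's output moves only slightly."""
    W1 = np.concatenate([W1, rng.normal(size=(W1.shape[0], 1))], axis=1)
    b1 = np.concatenate([b1, [0.0]])
    w2 = np.vstack([w2, epsilon * rng.normal(size=(1, w2.shape[1]))])
    return W1, b1, w2

# Both operations leave the outputs of a random network (nearly) intact.
X = rng.normal(size=(8, 3))
W1, b1, w2 = rng.normal(size=(3, 4)), rng.normal(size=4), rng.normal(size=(4, 1))
before = forward(W1, b1, w2, X)
print(np.abs(forward(*split_neuron(W1, b1, w2, idx=1), X) - before).max())   # small, ~epsilon
print(np.abs(forward(*add_new_neuron(W1, b1, w2), X) - before).max())        # small, ~epsilon
```

At epsilon = 0 both expansions leave the function essentially unchanged; the job of a growing procedure is to pick the most promising such directions, which is where a steepest-descent criterion comes in.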
Highlights
  • While biological brains are developed and shaped by complex progressive growing processes, most existing artificial deep neural networks are trained under fixed network structures.
  • We conduct four sets of experiments to verify the effectiveness of firefly neural architecture descent
  • Firefly descent finds competitive but more compact networks in a relatively shorter time compared to state-of-the-art approaches
  • We present a simple but highly flexible framework for progressively growing neural networks in a principled steepest descent fashion
  • We demonstrate the effectiveness of our method on growing networks for both single tasks and continual learning problems, in which our method consistently achieves the best results.
  • We can see that firefly descent learns the smallest network that achieves the best performance among all methods
  • This work develops a new framework that can grow neural networks efficiently; it can be used broadly in applications involving neural networks to enhance their capacity and performance.
Methods
  • NAS baselines compared (see Table 1): NASNet-A (Zoph et al., 2018), ENAS (Pham et al., 2018), Random Search, and two DARTS variants (Liu et al., 2018b), against Firefly.

  • Continual learning: the authors apply their method to grow networks for continual learning (CL) and compare with two state-of-the-art methods, Compact-Pick-Grow (CPG) (Hung et al., 2019a) and Learn-to-Grow (Li et al., 2019), both of which progressively grow neural networks for learning new tasks; a toy sketch of this grow-per-task pattern is given after this list.
  • Figure 5(a) shows the average accuracy and size of models at the end of the 10 tasks learned by firefly descent, Learn-to-Grow, CPG and other CL baselines.
  • The authors can see that firefly descent learns the smallest network that achieves the best performance among all methods.
  • It is more computationally efficient than CPG when growing and picking the neurons for the new tasks.
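As a rough picture of how growing supports continual learning, the following hypothetical toy (not CPG, Learn-to-Grow, or the authors' firefly code) grows a small block of hidden units plus a task-specific head for each new task, freezes everything grown earlier, and trains only the new parameters; the class GrowingNet and all hyperparameters are assumptions.

```python
# Hypothetical toy (my own assumptions, not CPG, Learn-to-Grow, or the authors'
# firefly code): grow a block of new hidden units plus a task-specific head for
# each incoming task, freeze everything grown earlier, and train only the new
# parameters, so units used by old tasks never change.
import numpy as np

rng = np.random.default_rng(0)

class GrowingNet:
    def __init__(self, n_in, n_hidden):
        self.W = rng.normal(size=(n_in, n_hidden)) * 0.5   # shared hidden layer
        self.frozen = 0                                     # columns [0, frozen) are frozen
        self.heads = {}                                     # task id -> output weights

    def grow_for_task(self, task_id, n_new, n_out):
        self.frozen = self.W.shape[1]                       # freeze all existing units
        new_cols = rng.normal(size=(self.W.shape[0], n_new)) * 0.5
        self.W = np.concatenate([self.W, new_cols], axis=1)
        self.heads[task_id] = rng.normal(size=(self.W.shape[1], n_out)) * 0.1

    def forward(self, X, task_id):
        H = np.maximum(X @ self.W, 0.0)
        k = self.heads[task_id].shape[0]                    # old heads read only the units that existed then
        return H[:, :k] @ self.heads[task_id]

    def train_task(self, X, y, task_id, lr=0.05, steps=300):
        # The current task's head spans the full current width, so no slicing is needed here.
        for _ in range(steps):
            H = np.maximum(X @ self.W, 0.0)
            err = (H @ self.heads[task_id] - y) / len(X)    # d(mse)/d(pred)
            self.heads[task_id] -= lr * (H.T @ err)
            gH = (err @ self.heads[task_id].T) * (H > 0)
            gW = X.T @ gH
            gW[:, : self.frozen] = 0.0                      # frozen units: no updates, no forgetting
            self.W -= lr * gW

net = GrowingNet(n_in=5, n_hidden=4)
for t in range(3):                                          # three toy regression "tasks"
    Xt = rng.normal(size=(128, 5))
    yt = np.tanh(Xt @ rng.normal(size=(5, 1)) + t)
    net.grow_for_task(t, n_new=4, n_out=1)
    net.train_task(Xt, yt, t)
    print(f"task {t}: hidden width = {net.W.shape[1]}")
```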
Results
  • The authors conduct four sets of experiments to verify the effectiveness of firefly neural architecture descent.
  • The authors first demonstrate the importance of introducing additional growing operations beyond neuron splitting (Liu et al, 2019) and apply the firefly descent to both neural architecture search and continual learning problems.
  • In both applications, firefly descent finds competitive but more compact networks in a relatively shorter time compared to state-of-the-art approaches.
Conclusion
  • The authors present a simple but highly flexible framework for progressively growing neural networks in a principled steepest descent fashion.
  • The authors' framework allows them to incorporate various mechanisms for growing networks.
  • Future work can investigate various other growing methods for specific applications under the general framework.
  • Broader Impact: This work develops a new framework that can grow neural networks efficiently; it can be used broadly in applications involving neural networks to enhance their capacity and performance.
  • The authors' work does not have any negative societal impacts.
Tables
  • Table 1: Performance compared with several NAS baselines.
Related Work
  • In this section, we briefly review previous works that grow neural networks for general purposes, and then discuss existing works that apply network growing to tackle continual learning.

    Growing for general purposes. Previous works have investigated ways of transferring knowledge by expanding the network architecture. One such approach, Net2Net (Chen et al., 2016), provides growing operations for widening and deepening a network while preserving its output, so that whenever the network is applied to learn a new task, it can be initialized as a functionally equivalent but larger network with more learning capacity. Network Morphism (Wei et al., 2016) extends Net2Net to a broader concept, defining more operations that change a network's architecture while maintaining its functional representation. Although these growing operations are similar to ours, these works select which neurons to grow, and in which direction, either randomly or with simple heuristics; as a result, they cannot guarantee that each growing step reaches a better architecture. Elsken et al. (2017) address this problem by growing several neighboring networks and choosing the best one after some training and evaluation, but this requires comparing multiple candidate networks simultaneously. A minimal sketch of the widening operation appears below.
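For concreteness, here is a minimal sketch, under my own assumptions, of the function-preserving widening described above: a hidden unit is duplicated and its outgoing weights are split across the two copies, so the widened two-layer network computes exactly the same function. The helper net2wider is a hypothetical name, not code from any of the cited works.

```python
# A minimal sketch, under my own assumptions, of Net2Net-style function-preserving
# widening: duplicate a hidden unit and halve the outgoing weights of the original
# and the copy, so the widened network computes exactly the same function
# (for any activation, since both copies share identical incoming weights).
import numpy as np

rng = np.random.default_rng(0)

def net2wider(W1, b1, W2, idx):
    """Duplicate hidden unit `idx`; split its outgoing weights across the two
    copies so that the output of the two-layer net is unchanged."""
    W1_new = np.concatenate([W1, W1[:, idx:idx + 1]], axis=1)   # copy incoming weights
    b1_new = np.concatenate([b1, b1[idx:idx + 1]])              # copy bias
    W2_new = np.concatenate([W2, W2[idx:idx + 1, :]], axis=0)   # copy outgoing row...
    W2_new[idx, :] /= 2.0                                       # ...and split it in half
    W2_new[-1, :] /= 2.0
    return W1_new, b1_new, W2_new

def forward(W1, b1, W2, X):
    return np.maximum(X @ W1 + b1, 0.0) @ W2

X = rng.normal(size=(16, 3))
W1, b1, W2 = rng.normal(size=(3, 5)), rng.normal(size=5), rng.normal(size=(5, 2))
Wa, ba, Wb = net2wider(W1, b1, W2, idx=2)
# The widened network is a functional equivalent of the original one:
print(np.allclose(forward(W1, b1, W2, X), forward(Wa, ba, Wb, X)))  # True
```

Because the function is unchanged, training can resume from the widened network without a loss spike, which is exactly the "functionally equivalent but larger network" initialization described above.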
Funding
  • Funding Transparency Statement Related Funding: NSF (CPS-1739964, IIS-1724157, NRI-1925082, CAREER-1846421, SenSE2037267, EAGER-2041327), ONR (N00014-18-2243), FLI (RFP2-000), ARO (W911NF-19-2-0333), DARPA, Lockheed Martin, GM, and Bosch. Peter Stone serves as the Executive Director of Sony AI America and receives financial compensation for this work. The terms of this arrangement have been reviewed and approved by the University of Texas at Austin in accordance with its policy on objectivity in research.
References
  • Chen, Tianqi, Goodfellow, Ian, and Shlens, Jonathon. Net2net: Accelerating learning via knowledge transfer. In International Conference on Learning Representations (ICLR), 2016.
  • Elsken, Thomas, Metzen, Jan-Hendrik, and Hutter, Frank. Simple and efficient architecture search for convolutional neural networks. arXiv preprint arXiv:1711.04528, 2017.
  • Elsken, Thomas, Metzen, Jan Hendrik, and Hutter, Frank. Efficient multi-objective neural architecture search via lamarckian evolution. arXiv preprint arXiv:1804.09081, 2018.
  • Hung, Ching-Yi, Tu, Cheng-Hao, Wu, Cheng-En, Chen, Chien-Hung, Chan, Yi-Ming, and Chen, Chu-Song. Compacting, picking and growing for unforgetting continual learning. In Advances in Neural Information Processing Systems, pp. 13647–13657, 2019a.
  • Hung, Steven CY, Lee, Jia-Hong, Wan, Timmy ST, Chen, Chein-Hung, Chan, Yi-Ming, and Chen, Chu-Song. Increasingly packing multiple facial-informatics modules in a unified deep-learning model via lifelong learning. In Proceedings of the 2019 on International Conference on Multimedia Retrieval, pp. 339–343, 2019b.
  • Kirkpatrick, James, Pascanu, Razvan, Rabinowitz, Neil, Veness, Joel, Desjardins, Guillaume, Rusu, Andrei A, Milan, Kieran, Quan, John, Ramalho, Tiago, Grabska-Barwinska, Agnieszka, et al. Overcoming catastrophic forgetting in neural networks. Proceedings of the national academy of sciences, 114(13):3521–3526, 2017.
  • Li, Xilai, Zhou, Yingbo, Wu, Tianfu, Socher, Richard, and Xiong, Caiming. Learn to grow: A continual structure learning framework for overcoming catastrophic forgetting. arXiv preprint arXiv:1904.00310, 2019.
  • Li, Zhizhong and Hoiem, Derek. Learning without forgetting. IEEE transactions on pattern analysis and machine intelligence, 40(12):2935–2947, 2017.
  • Liu, Chenxi, Zoph, Barret, Neumann, Maxim, Shlens, Jonathon, Hua, Wei, Li, Li-Jia, Fei-Fei, Li, Yuille, Alan, Huang, Jonathan, and Murphy, Kevin. Progressive neural architecture search. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 19–34, 2018a.
  • Liu, Hanxiao, Simonyan, Karen, and Yang, Yiming. Darts: Differentiable architecture search. arXiv preprint arXiv:1806.09055, 2018b.
  • Liu, Qiang, Wu, Lemeng, and Wang, Dilin. Splitting steepest descent for growing neural architectures. Neural Information Processing Systems (NeurIPS), 2019.
  • Pham, Hieu, Guan, Melody Y, Zoph, Barret, Le, Quoc V, and Dean, Jeff. Efficient neural architecture search via parameter sharing. arXiv preprint arXiv:1802.03268, 2018.
  • Real, Esteban, Aggarwal, Alok, Huang, Yanping, and Le, Quoc V. Regularized evolution for image classifier architecture search. In Proceedings of the aaai conference on artificial intelligence, volume 33, pp. 4780–4789, 2019.
  • Rosenfeld, Amir and Tsotsos, John K. Incremental learning through deep adaptation. IEEE transactions on pattern analysis and machine intelligence, 2018.
  • Rusu, Andrei A, Rabinowitz, Neil C, Desjardins, Guillaume, Soyer, Hubert, Kirkpatrick, James, Kavukcuoglu, Koray, Pascanu, Razvan, and Hadsell, Raia. Progressive neural networks. arXiv preprint arXiv:1606.04671, 2016.
  • Simonyan, Karen and Zisserman, Andrew. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
  • Wang, Dilin, Li, Meng, Wu, Lemeng, Chandra, Vikas, and Liu, Qiang. Energy-aware neural architecture optimization with fast splitting steepest descent. arXiv preprint arXiv:1910.03103, 2019.
  • Wei, Tao, Wang, Changhu, Rui, Yong, and Chen, Chang Wen. Network morphism. In International Conference on Machine Learning (ICML), pp. 564–572, 2016.
  • Wen, Wei, Yan, Feng, and Li, Hai. Autogrow: Automatic layer growing in deep convolutional networks. arXiv preprint arXiv:1906.02909, 2019.
  • Wu, Lemeng, Ye, Mao, Lei, Qi, Lee, Jason D, and Liu, Qiang. Steepest descent neural architecture optimization: Escaping local optimum with signed neural splitting. arXiv preprint arXiv:2003.10392, 2020.
  • Xu, Ju and Zhu, Zhanxing. Reinforced continual learning. In Advances in Neural Information Processing Systems, pp. 899–908, 2018.
  • Yoon, Jaehong, Yang, Eunho, Lee, Jeongtae, and Hwang, Sung Ju. Lifelong learning with dynamically expandable networks. arXiv preprint arXiv:1708.01547, 2017.
  • Zoph, Barret, Vasudevan, Vijay, Shlens, Jonathon, and Le, Quoc V. Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 8697–8710, 2018.
Authors
Lemeng Wu
Bo Liu