UCB-ENAS based on Reinforcement Learning

PROCEEDINGS OF THE 2021 IEEE 16TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA 2021)(2021)

Abstract
Deep learning has achieved good results in many practical applications, but network architectures still largely depend on manual design. To liberate architectures from manual design, Neural Architecture Search (NAS) came into being. NAS is mainly divided into three parts: the search space, the search strategy, and the performance estimation strategy. Because the search space of NAS is huge, the search process becomes extremely long; a good search strategy can find a high-performance network architecture in a short time. In this paper, we study the search strategy for NAS problems and propose the UCB-ENAS algorithm based on reinforcement learning, which significantly improves search efficiency in a flexible manner. The NAS problem can be regarded as a stateless multi-armed bandit problem, so we use a long short-term memory (LSTM) network and Upper Confidence Bounds (UCB) to jointly build a controller that generates a network architecture, and then use the policy-based REINFORCE algorithm to update the controller parameters to maximize the expected reward. Controller parameters and model parameters are optimized alternately. Extensive experiments show that the proposed algorithm searches network architectures quickly and efficiently: it is faster than ENAS in search speed, and the resulting architecture outperforms the one found by DARTS (first order). For example, a perplexity of 56.54 is obtained on the PTB dataset.
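The abstract frames NAS as a stateless multi-armed bandit solved with Upper Confidence Bounds. As a point of reference for that framing (not the paper's actual controller, whose details are not given here), the classic UCB1 rule picks the arm maximizing its empirical mean reward plus an exploration bonus; the function below is a minimal sketch with hypothetical inputs `counts` (pulls per arm) and `values` (accumulated reward per arm):

```python
import math

def ucb_select(counts, values, c=math.sqrt(2)):
    """UCB1 arm selection: argmax of mean reward + exploration bonus.

    counts[i] -- number of times arm i has been pulled
    values[i] -- total reward accumulated by arm i
    c         -- exploration coefficient (sqrt(2) is the classic choice)
    """
    # Play every arm once before the confidence bound is well defined.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    scores = [v / n + c * math.sqrt(math.log(total) / n)
              for v, n in zip(values, counts)]
    return max(range(len(scores)), key=scores.__getitem__)
```

In a NAS setting of this kind, each "arm" would correspond to an architectural choice emitted by the controller, with validation performance serving as the reward; the bonus term keeps rarely tried choices from being abandoned too early.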
Keywords
deep learning, neural architecture search, reinforcement learning, long short-term memory, upper confidence bounds