Hyperscale Hardware Optimized Neural Architecture Search

Sheng Li, Garrett Andersen, Tao Chen, Liqun Cheng, Julian Grady, Da Huang, Quoc V. Le, Andrew Li, Xin Li, Yang Li, Chen Liang, Yifeng Lu, Yun Ni, Ruoming Pang, Mingxing Tan, Martin Wicke, Gang Wu, Shengqi Zhu, Parthasarathy Ranganathan, Norman P. Jouppi

ASPLOS 2023: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3 (2023)

Recent advances in machine learning have leveraged dramatic increases in computational power, a trend expected to continue in the future. This paper introduces the first Hyperscale Hardware Optimized Neural Architecture Search (H2O-NAS) to automatically design accurate and performant machine learning models tailored to the underlying hardware architecture. H2O-NAS consists of three key components: a new massively parallel "one-shot" search algorithm with intelligent weight sharing, which can scale to search spaces of O(10^280) and handle large volumes of production traffic; hardware-optimized search spaces for diverse ML models on heterogeneous hardware; and a novel two-phase hybrid performance model and a multi-objective reward function optimized for large-scale deployments. H2O-NAS has been implemented around state-of-the-art machine learning models (e.g., convolutional models, vision transformers, and deep learning recommendation models) and deployed at zettaflop scale in production. Our results demonstrate significant improvements in performance (22%–56%) and energy efficiency (17%–25%) at the same or better quality. Our solution is designed for large-scale deployment, streamlining privacy and security processes and reducing manual overhead. This facilitates a smooth and automated transition from research to production.
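The abstract does not spell out the multi-objective reward function. As a rough illustration only, hardware-aware NAS systems commonly combine model quality with a soft hardware-cost penalty, as in MnasNet-style rewards; the sketch below assumes that general form, and the function name, parameters, and constants are hypothetical rather than taken from the paper.

```python
def nas_reward(quality, latency_ms, target_latency_ms=5.0, beta=-0.07):
    """Hypothetical multi-objective NAS reward (MnasNet-style form).

    Multiplies a quality metric (e.g., accuracy) by a soft penalty
    that shrinks the reward for models slower than the latency target.
    The paper's actual two-phase hybrid performance model and reward
    are not reproduced here.
    """
    return quality * (latency_ms / target_latency_ms) ** beta


# A model exactly at the latency target keeps its full quality score;
# a slower model of equal quality is penalized, steering the search
# toward architectures that fit the hardware budget.
on_target = nas_reward(quality=0.80, latency_ms=5.0)    # 0.80
too_slow = nas_reward(quality=0.80, latency_ms=10.0)    # < 0.80
```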