DyCE: Dynamic Configurable Exiting for Deep Learning Compression and Scaling
arXiv (2024)
Abstract
Modern deep learning (DL) models require scaling and compression techniques for
effective deployment in resource-constrained environments. Most existing
techniques, such as pruning and quantization, are static: they are applied once
and do not adapt to individual inputs. Dynamic compression methods, such as
early exits, instead reduce complexity by recognizing the difficulty of each
input sample and allocating computation as needed. Despite their superior
flexibility and their potential to coexist with static methods, dynamic methods
are challenging to implement, because any change to a dynamic component affects
all subsequent processing. Moreover, most current dynamic compression designs
are monolithic and tightly integrated with their base models, which complicates
adaptation to new base models. This paper introduces DyCE, a dynamically
configurable early-exit framework that decouples design considerations from one
another and from the base model. Using this framework, exits of various types
and at various positions can be organized according to predefined
configurations, which can be switched in real time to accommodate evolving
performance-complexity requirements. We also propose techniques for generating
optimized configurations for any desired trade-off between performance and
computational complexity. This allows future researchers to focus on improving
individual exits without inadvertently compromising overall system performance.
is demonstrated through image classification tasks with deep CNNs. DyCE
significantly reduces computational complexity, by 23.5% and 25.9% for the two
tested models, with an accuracy loss of less than 0.5%, while offering
advantages over existing dynamic methods in terms of real-time configuration
and fine-grained performance tuning.
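To make the mechanism concrete, the following is a minimal, hypothetical sketch of confidence-based early exiting with switchable configurations. It is not the authors' implementation: the toy "backbone stages", "exit heads", and random weights stand in for trained CNN blocks and classifiers, and a "configuration" is simplified to a list of per-exit confidence thresholds that can be swapped at run time to re-tune the performance-complexity trade-off.

```python
import math
import random

random.seed(0)
DIM, CLASSES, STAGES = 8, 4, 3

# Stand-in backbone stages and per-stage exit heads (random weights; a real
# model would use trained CNN blocks and lightweight classifiers here).
stage_w = [[[random.gauss(0, 0.5) for _ in range(DIM)] for _ in range(DIM)]
           for _ in range(STAGES)]
exit_w = [[[random.gauss(0, 1.0) for _ in range(CLASSES)] for _ in range(DIM)]
          for _ in range(STAGES)]

def matvec(w, x):
    # w is an (in_dim x out_dim) matrix stored row-major by input dimension.
    return [sum(x[j] * w[j][k] for j in range(len(x)))
            for k in range(len(w[0]))]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def dynamic_infer(x, thresholds):
    """Run stages in order; stop at the first exit whose top-1
    confidence clears that exit's threshold."""
    for i in range(STAGES):
        x = [math.tanh(v) for v in matvec(stage_w[i], x)]  # backbone stage i
        probs = softmax(matvec(exit_w[i], x))              # exit head i
        if max(probs) >= thresholds[i]:
            return probs.index(max(probs)), i              # exited at stage i
    return probs.index(max(probs)), STAGES - 1             # final exit

# Two switchable configurations: swapping the threshold list at run time
# changes the performance/complexity operating point without retraining.
fast_cfg = [0.5, 0.5, 0.0]        # exit eagerly when moderately confident
accurate_cfg = [0.95, 0.95, 0.0]  # usually run the whole backbone

sample = [random.gauss(0, 1.0) for _ in range(DIM)]
pred, exit_used = dynamic_infer(sample, fast_cfg)
```

For the same input and weights, raising the thresholds can only delay the exit, which is why a single model can serve multiple operating points simply by switching configurations.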