Design and Develop Hardware Aware DNN for Faster Inference
Intelligent Systems and Applications (2022)
Abstract
Advanced deep learning models are now standard on many small-scale devices, making reduced inference time a pressing need. This study describes a hardware-aware pipeline that automates the customization of a Deep Neural Network, shrinking its size and reducing its inference time. At the pipeline's core, MorphNet iteratively shrinks and enlarges the network: a resource-weighted sparsifying regularizer applied to layer activations identifies and prunes inefficient neurons, after which all layers are expanded by a uniform multiplicative factor. This is followed by fusion, which folds each frozen batch-normalization layer into the preceding convolution layer. Finally, the customized DNN is retrained with a Knowledge Distillation approach to preserve model accuracy. The approach shows promising initial results on the MobileNetV1 and ResNet50 architectures.
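The fusion step described above folds a frozen batch-normalization layer into the convolution that precedes it, so the two layers become a single convolution at inference time. A minimal NumPy sketch of that folding follows; the function name `fuse_conv_bn` and the `(out_ch, in_ch, kh, kw)` weight layout are illustrative assumptions, not details from the paper:

```python
import numpy as np

def fuse_conv_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    # Fold a frozen BatchNorm into the preceding convolution.
    # BN(conv(x)) = scale * (w*x + b - mean) + beta, where
    # scale = gamma / sqrt(var + eps) is applied per output channel.
    scale = gamma / np.sqrt(var + eps)
    w_fused = w * scale[:, None, None, None]   # rescale each output filter
    b_fused = (b - mean) * scale + beta        # absorb the BN shift into the bias
    return w_fused, b_fused
```

Because the fused weights reproduce the conv-then-BN computation exactly (up to floating point), the BatchNorm layer can be dropped at inference, saving one pass over the activations per fused pair.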
Keywords
MorphNet, Fusion, Knowledge distillation