InDistill: Information flow-preserving knowledge distillation for model compression

arxiv(2022)

引用 0|浏览5
暂无评分
摘要
In this paper we introduce InDistill, a model compression approach that combines knowledge distillation and channel pruning in a unified framework for the transfer of the critical information flow paths from a heavyweight teacher to a lightweight student. Such information is typically collapsed in previous methods due to an encoding stage prior to distillation. By contrast, InDistill leverages a pruning operation applied to the teacher's intermediate layers reducing their width to the corresponding student layers' width. In that way, we force architectural alignment enabling the intermediate layers to be directly distilled without the need of an encoding stage. Additionally, a curriculum learning-based training scheme is adopted considering the distillation difficulty of each layer and the critical learning periods in which the information flow paths are created. The proposed method surpasses state-of-the-art performance on three standard benchmarks, i.e. CIFAR-10, CUB-200, and FashionMNIST by 3.08%, 14.27%, and 1% mAP, respectively, as well as on more challenging evaluation settings, i.e. ImageNet and CIFAR-100 by 1.97% and 5.65% mAP, respectively.
更多
查看译文
关键词
knowledge distillation,compression,flow-preserving
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要