Differentiable Feature Aggregation Search for Knowledge Distillation
European Conference on Computer Vision (ECCV), pp. 469–484, 2020.
Knowledge distillation has become increasingly important in model compression. It boosts the performance of a miniaturized student network with the supervision of the output distribution and feature maps from a sophisticated teacher network. Some recent works introduce multi-teacher distillation to provide more supervision to the student…
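The output-distribution supervision the abstract refers to is commonly realized as the classic distillation loss: a KL-divergence term between temperature-softened teacher and student distributions, combined with the usual cross-entropy on ground-truth labels. The sketch below is a minimal NumPy illustration of that general loss, not the specific method of this paper; the temperature `T` and mixing weight `alpha` are conventional hyperparameters assumed here.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled, numerically stable softmax
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft-target term: KL(teacher || student) on T-softened distributions
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    # Hard-target term: cross-entropy with the ground-truth labels
    p = softmax(student_logits)
    ce = -np.log(p[np.arange(len(labels)), labels] + 1e-12)
    # T^2 compensates for the soft-term gradient scaling with 1/T^2
    return np.mean(alpha * (T ** 2) * kl + (1 - alpha) * ce)
```

Feature-map supervision, the second signal mentioned in the abstract, would add further terms matching intermediate student and teacher activations; those are omitted here for brevity.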