Scale Decoupled Distillation.

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Abstract
Logit knowledge distillation attracts increasing attention due to its practicality in recent studies. However, it often suffers inferior performance compared to feature knowledge distillation. In this paper, we argue that existing logit-based methods may be sub-optimal since they only leverage the global logit output that couples multiple semantic knowledge. This may transfer ambiguous knowledge to the student and mislead its learning. To this end, we propose a simple but effective method, i.e., Scale Decoupled Distillation (SDD), for logit knowledge distillation. SDD decouples the global logit output into multiple local logit outputs and establishes distillation pipelines for them. This helps the student to mine and inherit fine-grained and unambiguous logit knowledge. Moreover, the decoupled knowledge can be further divided into consistent and complementary logit knowledge that transfers the semantic information and sample ambiguity, respectively. By increasing the weight of the complementary parts, SDD can guide the student to focus more on ambiguous samples, improving its discrimination ability. Extensive experiments on several benchmark datasets demonstrate the effectiveness of SDD for wide teacher-student pairs, especially in the fine-grained classification task. Code is available at: https://github.com/shicaiwei123/SDD-CVPR2024
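To make the decoupling idea concrete, below is a minimal PyTorch sketch of how logits pooled at several spatial scales could be distilled, with a larger weight on "complementary" cells whose teacher prediction disagrees with the global one. The function names (multi_scale_logits, sdd_loss), the scale set, and the weighting hyperparameter beta are illustrative assumptions for this sketch, not the authors' released implementation (see the linked repository for that).

```python
# Minimal sketch of scale-decoupled logit distillation (assumed interface:
# both networks expose a final feature map and a linear classifier head).
import torch
import torch.nn.functional as F

def multi_scale_logits(feature_map, classifier, scales=(1, 2, 4)):
    """Pool the feature map at several scales and classify each local cell.

    feature_map: (B, C, H, W) final convolutional features.
    classifier:  nn.Linear mapping C -> num_classes.
    Returns (B, N, num_classes), where N is the total number of local cells
    across all scales; the scale-1 cell recovers the usual global logit.
    """
    logits = []
    for s in scales:
        pooled = F.adaptive_avg_pool2d(feature_map, s)      # (B, C, s, s)
        cells = pooled.flatten(2).transpose(1, 2)           # (B, s*s, C)
        logits.append(classifier(cells))                    # (B, s*s, K)
    return torch.cat(logits, dim=1)                         # (B, N, K)

def sdd_loss(student_logits, teacher_logits, T=4.0, beta=2.0):
    """KL distillation over every local cell; cells whose teacher prediction
    disagrees with the global prediction ("complementary") get weight beta > 1.
    """
    global_pred = teacher_logits[:, 0].argmax(dim=-1, keepdim=True)  # (B, 1)
    local_pred = teacher_logits.argmax(dim=-1)                       # (B, N)
    weight = torch.full_like(local_pred, beta, dtype=torch.float)
    weight[local_pred == global_pred] = 1.0                          # consistent cells

    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="none",
    ).sum(dim=-1) * (T * T)                                          # (B, N)
    return (weight * kl).mean()
```

As a usage sketch, one would compute multi_scale_logits for both teacher and student feature maps and add sdd_loss to the ordinary cross-entropy objective; beta > 1 is what biases the student toward the ambiguous samples described in the abstract.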
Keywords
Knowledge Distillation