
Improving Relation Classification Effectiveness by Alternate Distillation.

Zhaoguo Wang, Kai Li, Yuxin Ye

Applied Intelligence (2023)

Abstract
As neural networks develop, increasingly complex and accurate relation classification models are proposed. Although they can be compressed by model compression methods at the cost of effectiveness, they remain too large to deploy on resource-constrained devices. Knowledge distillation can transfer the predictive ability of a superior model to a lightweight model, but the gap between the two models limits its effect. Because the gaps between relation classification models are large, it is difficult to select and train a suitable teacher model to guide student models when knowledge distillation is used to obtain a lightweight model. How to obtain an effective lightweight relation classification model therefore remains an active research topic. In this paper, we construct an alternate distillation framework with three modules. The weight-adaptive external distillation module weights the teacher's guidance with an adaptive weighting scheme based on cosine similarity. The progressive internal distillation module lets the model act as its own teacher to guide its own training. Finally, a combination module based on the attention mechanism combines the two distillation modules. On the SemEval-2010 Task 8 and Wiki80 datasets, we show that our approach substantially improves the relation classification effectiveness of lightweight models, transferring the predictive capability of complex models to lightweight models even when there is a significant gap between them.
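The abstract couples a standard knowledge-distillation loss with a cosine-similarity-based adaptive weight on the external (teacher) signal. A minimal PyTorch-style sketch of such an adaptively weighted distillation loss is shown below for illustration only; the function name, the mapping from cosine similarity to a weight, and the way the losses are combined are assumptions, not the authors' published implementation.

```python
import torch
import torch.nn.functional as F

def adaptive_distillation_loss(student_logits, teacher_logits, labels, temperature=4.0):
    """Illustrative sketch: weight the distillation term by the cosine similarity
    between teacher and student predictions, so a teacher whose outputs diverge
    strongly from the student contributes less to the total loss."""
    # Standard cross-entropy on the ground-truth relation labels.
    ce_loss = F.cross_entropy(student_logits, labels)

    # Soft-target distillation loss (temperature-scaled KL divergence).
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # Assumed adaptive weight: mean cosine similarity between the two logit
    # vectors, mapped from [-1, 1] to [0, 1].
    cos = F.cosine_similarity(student_logits, teacher_logits, dim=-1).mean()
    alpha = (cos + 1.0) / 2.0

    return (1.0 - alpha) * ce_loss + alpha * kd_loss
```

Under this assumed scheme, a student that already agrees with the teacher receives stronger soft-label guidance, while a large teacher-student gap shifts the emphasis back toward the hard labels.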
Keywords
Relation classification, Deep neural network, Effectiveness, Knowledge distillation