Explore, Filter and Distill: Distilled Reinforcement Learning in Recommendation

Conference on Information and Knowledge Management (2021)

Abstract
Reinforcement learning (RL) has proven effective in real-world list-wise recommendation. However, RL-based recommendation suffers from huge memory and computation costs due to its large-scale models. Knowledge distillation (KD) is an effective approach for model compression that is widely used in practice. However, RL-based models strongly rely on sufficient exploration of the enormous user-item space due to the data sparsity issue, which multiplies the challenges of applying KD to RL models. What the teacher should teach and how much the student should learn from each lesson need to be carefully designed. In this work, we propose a novel Distilled reinforcement learning framework for recommendation (DRL-Rec), which aims to improve both effectiveness and efficiency in list-wise recommendation. Specifically, we propose an Exploring and filtering module before the distillation, which decides what lessons the teacher should teach from both the teacher's and the student's perspectives. We also conduct a Confidence-guided distillation at both the output and intermediate levels with a list-wise KL divergence loss and a Hint loss, which determines how much the student should learn from each lesson. We achieve significant improvements on both offline and online evaluations in a well-known recommendation system. DRL-Rec has been deployed on WeChat Top Stories for more than six months, affecting millions of users. The source code is released at https://github.com/modriczhang/DRL-Rec.
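The abstract names two distillation signals: a list-wise KL divergence loss at the output level and a Hint loss at the intermediate level, both weighted by a confidence score. Below is a minimal PyTorch sketch of what such a confidence-guided combination could look like; it is not the authors' implementation (see the linked repository for that), and the tensor shapes, the projection layer, the temperature, and the way confidence scores are obtained are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def listwise_kl_loss(teacher_scores, student_scores, confidence, temperature=2.0):
    """Output-level distillation: KL divergence between teacher and student
    score distributions over each recommended list, weighted per example.
    teacher_scores, student_scores: [batch, list_len]; confidence: [batch]."""
    t_prob = F.softmax(teacher_scores / temperature, dim=-1)
    s_log_prob = F.log_softmax(student_scores / temperature, dim=-1)
    kl_per_list = F.kl_div(s_log_prob, t_prob, reduction="none").sum(dim=-1)
    return (confidence * kl_per_list).mean() * temperature ** 2

def hint_loss(teacher_hidden, student_hidden, proj, confidence):
    """Intermediate-level distillation: match the student's hidden state to the
    teacher's after projecting it to the teacher's hidden size (hypothetical proj)."""
    per_example = ((proj(student_hidden) - teacher_hidden) ** 2).mean(dim=-1)
    return (confidence * per_example).mean()

def confidence_guided_distill_loss(teacher_scores, student_scores,
                                   teacher_hidden, student_hidden,
                                   proj, confidence, alpha=0.5):
    """Combine both levels; alpha balances output vs. intermediate distillation."""
    kl = listwise_kl_loss(teacher_scores, student_scores, confidence)
    hint = hint_loss(teacher_hidden, student_hidden, proj, confidence)
    return alpha * kl + (1.0 - alpha) * hint

# Example with hypothetical shapes: 32 lists of 10 candidate items,
# teacher hidden size 128, student hidden size 32.
batch, list_len, d_t, d_s = 32, 10, 128, 32
proj = torch.nn.Linear(d_s, d_t)
teacher_scores = torch.randn(batch, list_len)
student_scores = torch.randn(batch, list_len, requires_grad=True)
teacher_hidden = torch.randn(batch, d_t)
student_hidden = torch.randn(batch, d_s)
confidence = torch.rand(batch)  # e.g. derived from the teacher's certainty on each list
loss = confidence_guided_distill_loss(teacher_scores, student_scores,
                                      teacher_hidden, student_hidden,
                                      proj, confidence)
loss.backward()
```

In this sketch the confidence weight scales both losses per example, so lists the teacher is more certain about contribute more to the student's update; how DRL-Rec actually computes and applies confidence is specified in the paper, not here.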