Robust Reinforcement Learning Objectives for Sequential Recommender Systems
arXiv (2023)
Abstract
Attention-based sequential recommendation methods have shown promise in
accurately capturing users' evolving interests from their past interactions.
Beyond learning better user representations, recent research has also explored
integrating reinforcement learning (RL) into these models.
By framing sequential recommendation as an RL problem with
reward signals, we can develop recommender systems that incorporate direct user
feedback in the form of rewards, enhancing personalization for users.
Nonetheless, employing RL algorithms presents challenges, including off-policy
training, expansive combinatorial action spaces, and the scarcity of datasets
with sufficient reward signals. Contemporary approaches have attempted to
combine RL and sequential modeling, incorporating contrastive-based objectives
and negative sampling strategies for training the RL component. In this work,
we further emphasize the efficacy of contrastive-based objectives paired with
augmentation to address datasets with extended horizons. Additionally, we
recognize the potential instability issues that may arise during the
application of negative sampling. These challenges primarily stem from the data
imbalance prevalent in real-world datasets, which is a common issue in offline
RL contexts. Furthermore, we introduce an enhanced methodology aimed at
addressing these challenges more effectively. Experimental results
across several real-world datasets show that our method achieves increased
robustness and state-of-the-art performance.