Reinforcement Learning to Rank with Pairwise Policy Gradient
SIGIR '20: The 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, China, July 2020, pp. 509-518.
This paper concerns reinforcement learning (RL) of document ranking models for information retrieval (IR). One branch of the RL approaches to ranking formalizes the ranking process as a Markov decision process (MDP) and determines the model parameters with policy gradient. Though preliminary success has been shown, these approaches a...