Sequential Evaluation and Generation Framework for Combinatorial Recommender System.

arXiv: Information Retrieval, (2019)


Abstract

Typical recommender systems push K items at once in the result page in the form of a feed, in which the selection and the order of the items are important for user experience. In this paper, we formalize the K-item recommendation problem as taking an unordered set of candidate items as input, and exporting an ordered list of selected item...

Introduction
  • Recommender Systems (RS) have attracted a lot of attention with the boom of information on the Internet.
  • Keywords: Recommender System, Intra-list Correlations, Diversified Ranking, Reinforcement Learning
  • The authors use a sequence decoder as the ranking policy, aiming to generate a sequence (a recommendation list) with as high an overall utility as possible.
Highlights
  • Recommender Systems (RS) have attracted a lot of attention with the boom of information on the Internet
  • It is important to consider intra-list correlations in many realistic RS for a better user experience
  • For submodular ranking, the overall utility of the greedy choice is at least (1 − 1/e) of the global optimum ([35])
  • We propose to optimize the overall utility of a sequence, on the premise that diversity and the other intra-list correlations should be reflected in this utility
  • We provide a thorough investigation of intra-list correlations in a realistic recommender system
  • The performance of Determinantal Point Process (DPP) is below the baseline, mainly because it misses the position bias, whereas in the Baidu App News Feed System (BANFS) clicks are severely influenced by position
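
The (1 − 1/e) guarantee mentioned above refers to greedy selection under a monotone submodular utility. A minimal sketch of such a greedy ranker follows; the topic-coverage utility, the item names, and the helper names `coverage` and `greedy_select` are all illustrative assumptions and are not taken from the paper.

```python
from typing import Dict, List, Set


def coverage(selected: List[str], topics: Dict[str, Set[str]]) -> int:
    """Toy submodular utility: number of distinct topics covered by the list."""
    covered: Set[str] = set()
    for item in selected:
        covered |= topics[item]
    return len(covered)


def greedy_select(topics: Dict[str, Set[str]], k: int) -> List[str]:
    """Pick k items, each time adding the item with the largest marginal gain."""
    selected: List[str] = []
    remaining = sorted(topics)  # sorted for deterministic tie-breaking
    for _ in range(min(k, len(remaining))):
        base = coverage(selected, topics)
        best = max(remaining, key=lambda c: coverage(selected + [c], topics) - base)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical candidate items and the topics each one covers.
TOPICS = {
    "a": {"sports", "finance"},
    "b": {"sports"},
    "c": {"tech", "health", "travel"},
    "d": {"finance", "tech"},
}
picked = greedy_select(TOPICS, 2)
print(picked, coverage(picked, TOPICS))
```

A utility of this kind always treats co-exposed items as substitutes, which is the assumption the paper argues does not hold in realistic feeds.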
Results
  • The authors explain the model architectures and learning metrics for the Evaluator and the Generator respectively.
  • As the authors focus mainly on the intra-list correlation, and in order to make the comparison easier, they propose several relatively simple but representative models in the following part.
  • Classic RS typically do not consider intra-list correlations at all, i.e. each item is evaluated independently based on the user information φ(u), the item information φ(caj), and the position information φ(j), which accounts for the position bias.
  • The authors further propose to use GRNN ([14]) to encode the preceding sequence a−j in order to capture the interactions between a−j and caj, followed by a two-layer MLP (fig.
  • The authors first concatenate the user descriptor with the item representation and the position embedding, and then apply a 2-layer Transformer to predict the probability of a click at each position (fig.
  • While many previous works impose strong hypotheses on the formulations, Deep DPP ([33]) has effectively combined the essence of DPP with the representation power of Deep Learning to push intra-list modeling to a new frontier.
  • The authors provide a comparison of different Generators under the proposed simulator, and also demonstrate the validity of the simulator itself by directly comparing the Evaluator with the online environment.
  • Tab. 1 shows that a−j and a+j do affect the click at the j-th position, as the Bi-GRNN and the Transformer outperform the other models on all three evaluation criteria.
  • This phenomenon further shows that the intra-list correlation is much more complicated than many previously proposed position-bias hypotheses or unordered set-wise hypotheses.
  • The authors use three different model architectures for the Generators (MLP, GRNN and SetToSeq), which are combined with different learning metrics (SL and RL) and policies (Greedy and Sampling).
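
The difference between the Greedy and Sampling policies can be illustrated with a small sequential decoder. This is only a sketch under stated assumptions: `toy_score` is a hypothetical hand-written step scorer standing in for a trained Generator, and the item names and penalty value are invented for the example.

```python
import math
import random
from typing import Callable, List, Optional, Sequence

Scorer = Callable[[Sequence[str], str], float]


def toy_score(prefix: Sequence[str], candidate: str) -> float:
    """Hypothetical step score: base quality minus a penalty for repeating the previous topic."""
    quality = {"news1": 2.0, "news2": 1.5, "video1": 1.0, "video2": 0.5}[candidate]
    # The topic is the item name without its trailing digit (illustrative convention).
    same_topic = bool(prefix) and prefix[-1][:-1] == candidate[:-1]
    return quality - (1.0 if same_topic else 0.0)


def decode(candidates: Sequence[str], k: int, score: Scorer,
           policy: str = "greedy", rng: Optional[random.Random] = None) -> List[str]:
    """Build a list of k items one position at a time, conditioning on the prefix."""
    rng = rng or random.Random(0)
    prefix: List[str] = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):
        scores = [score(prefix, c) for c in remaining]
        if policy == "greedy":
            idx = max(range(len(remaining)), key=scores.__getitem__)
        else:
            # Sampling policy: draw from a softmax over the step scores.
            top = max(scores)
            weights = [math.exp(s - top) for s in scores]
            idx = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        prefix.append(remaining.pop(idx))
    return prefix


CANDIDATES = ["news1", "news2", "video1", "video2"]
greedy_list = decode(CANDIDATES, 3, toy_score, policy="greedy")
sampled_list = decode(CANDIDATES, 3, toy_score, policy="sampling", rng=random.Random(1))
print(greedy_list, sampled_list)
```

Because each step conditions on the items already placed, the decoder can express intra-list effects that an independent per-item score cannot.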
Conclusion
  • Note that when GRNN + SL is used as both the Evaluator and the Generator, the sampling policy can be replaced with beam search.
  • Though the authors believe that RL should be more cost-efficient and more straightforward for handling the intra-list correlation, the experiments show that RL occasionally generates unexpectedly bad patterns.
  • The authors show that, compared with traditional diversified ranking algorithms, the proposed framework is capable of capturing various possible correlations as well as the position bias.
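
The generate-then-select idea behind the Evaluator-Generator framework can be sketched as follows. Everything here is an assumption made for illustration: random permutations stand in for a trained Generator, and `evaluate` is a hand-written list-level score with a position-bias term and one hypothetical positive pairwise interaction, in place of a learned Evaluator.

```python
import random
from typing import List, Sequence, Tuple

# Illustrative numbers only; none of them come from the paper.
POSITION_BIAS = [1.0, 0.7, 0.5]
CTR = {"a": 0.30, "b": 0.25, "c": 0.20, "d": 0.10}
PAIR_BONUS = {("c", "d"): 0.15}  # items assumed to be clicked together when adjacent


def evaluate(items: Sequence[str]) -> float:
    """Toy Evaluator: position-weighted click rates plus adjacent-pair interactions."""
    total = sum(POSITION_BIAS[j] * CTR[x] for j, x in enumerate(items))
    for left, right in zip(items, items[1:]):
        total += PAIR_BONUS.get((left, right), 0.0)
    return total


def sample_lists(candidates: Sequence[str], k: int, n: int,
                 rng: random.Random) -> List[List[str]]:
    """Stand-in Generator: n random ordered lists of k distinct items."""
    return [rng.sample(list(candidates), k) for _ in range(n)]


def generate_then_select(candidates: Sequence[str], k: int, n: int,
                         seed: int = 0) -> Tuple[List[str], float]:
    """Sample n candidate lists and keep the one the Evaluator scores highest."""
    rng = random.Random(seed)
    samples = sample_lists(candidates, k, n, rng)
    best = max(samples, key=evaluate)
    return best, evaluate(best)


best_20, score_20 = generate_then_select(list(CTR), k=3, n=20)
best_40, score_40 = generate_then_select(list(CTR), k=3, n=40)
print(best_20, score_20, best_40, score_40)
```

With a fixed seed, the first 20 samples of the n = 40 run are the same as the n = 20 run, so the selected score cannot decrease as n grows.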
Tables
  • Table 1: Offline comparison of different Evaluators
  • Table 2: Offline comparison of different Generators by using the Evaluators for simulation
  • Table 3: Correlation between the Evaluator predictions and the online A/B tests
  • Table 4: Comparison of different solutions in online performance
Related work
  • Diversity has been frequently investigated in the area of intra-list correlations. Recent works on diversified ranking include submodularity ([2], [34]), graph-based methods ([39]), and Determinantal Point Process (DPP) ([22], [23], [33]). Diversified ranking with predefined submodular functions typically supposes that diversity is homogeneous across topics and independent of the user. DPP and submodular ranking also suppose that co-exposed items always have a negative impact on the possible click of the others. In contrast to those propositions, realistic RS show cases that violate those rules. Our statistics reveal that some related contents are prone to be clicked together. Besides item combinations, phenomena relating user feedback to display positions have also been widely studied, e.g., click models in Information Retrieval (IR) systems, such as the Cascade Click Model ([9]) and the Dynamic Bayesian Network ([5]). It is found that the position bias is related not only to the user's personal habits, but also to the layout design etc. ([7]). Thus click models often need to be considered case by case. More complex phenomena have also been discovered, such as Serpentining ([36]), which found that users prefer discontinuous clicks when browsing the list. It is recommended that high-quality items be scattered over the entire list instead of clustered at the top positions. In contrast to those discoveries, few previous works have studied the intra-list correlation and the position bias together.
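
As a small illustration of how a click model ties clicks to position, the sketch below computes per-position click probabilities under the standard cascade assumption: the user scans from the top, clicks the first attractive item, and then stops. The function name and the attractiveness values are made up for the example.

```python
from typing import List, Sequence


def cascade_click_probs(attractiveness: Sequence[float]) -> List[float]:
    """P(click at position j) = r_j * prod_{i<j} (1 - r_i) under the cascade assumption."""
    probs: List[float] = []
    reach = 1.0  # probability that the user gets as far as the current position
    for r in attractiveness:
        probs.append(reach * r)
        reach *= 1.0 - r
    return probs


# Three equally attractive items still receive very different click probabilities.
probs = cascade_click_probs([0.5, 0.5, 0.5])
print(probs)
```

Under this model, equally attractive items get fewer clicks the lower they are shown, which is one simple form of position bias.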
Study subjects and analysis
The figure shows that the lists with higher evaluation scores indeed include more "good patterns", which also means the evaluation score is consistent with intuitive indicators.
[Figure legend. Policies: Greedy, Sampling (n = 20), Sampling (n = 40). Models: MLP + SL, GRNN + SL, GRNN + RL (Simulated Data), SetToSeq + RL (Simulated Data), SetToSeq + RL (Real Data).]

  • GRNN: Generator only, GRNN + SL (Greedy).
  • Evaluator-Generator: GRNN + SL + Sampling (n = 20) as the Generator, Bi-GRNN as the Selector.
  • SetToSeq + RL + Simulated Data: the SetToSeq model trained with Q-Learning and simulated feedback.

Reference
  • [1] Gediminas Adomavicius and Alexander Tuzhilin. Context-aware recommender systems. In Recommender systems handbook, pages 191–226.
  • [2] Yossi Azar and Iftah Gamzu. Ranking with submodular valuations. In Proceedings of the twenty-second annual ACM-SIAM symposium on Discrete Algorithms, pages 1070–1079. SIAM, 2011.
  • [3] Irwan Bello, Hieu Pham, Quoc V. Le, Mohammad Norouzi, and Samy Bengio. Neural combinatorial optimization with reinforcement learning. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Workshop Track Proceedings.
  • [4] Pedro G Campos, Fernando Díez, and Iván Cantador. Time-aware recommender systems: a comprehensive survey and analysis of existing evaluation protocols. User Modeling and User-Adapted Interaction, 24(1-2):67–119, 2014.
  • [5] Olivier Chapelle and Ya Zhang. A dynamic bayesian network click model for web search ranking. In Proceedings of the 18th international conference on World wide web, pages 1–10. ACM, 2009.
  • [6] Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, et al. Wide & deep learning for recommender systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems, pages 7–10. ACM, 2016.
  • [7] Aleksandr Chuklin, Ilya Markov, and Maarten de Rijke. Click models for web search. Synthesis Lectures on Information Concepts, Retrieval, and Services, 7(3):1–115, 2015.
  • [8] Paul Covington, Jay Adams, and Emre Sargin. Deep neural networks for youtube recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems, pages 191–19ACM, 2016.
  • [9] Nick Craswell, Onno Zoeter, Michael Taylor, and Bill Ramsey. An experimental comparison of click position-bias models. In Proceedings of the 2008 international conference on web search and data mining, pages 87–94. ACM, 2008.
  • [10] Fernando Diaz, Ryen White, Georg Buscher, and Dan Liebling. Robust models of mouse movement on dynamic web search results pages. In Proceedings of the 22nd ACM international conference on Conference on information & knowledge management, pages 1451–1460. ACM, 2013.
  • [11] Bradley B Doll, Dylan A Simon, and Nathaniel D Daw. The ubiquity of model-based reinforcement learning. Current opinion in neurobiology, 22(6):1075–1081, 2012.
  • [12] Yue Feng, Jun Xu, Yanyan Lan, Jiafeng Guo, Wei Zeng, and Xueqi Cheng. From greedy selection to exploratory decision-making: Diverse ranking with policy-value networks. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, SIGIR 2018, Ann Arbor, MI, USA, July 08-12, 2018, pages 125–134.
  • [13] Scott Fujimoto, David Meger, and Doina Precup. Off-policy deep reinforcement learning without exploration. CoRR, abs/1812.02900, 2018.
  • [14] Alex Graves, Abdel Rahman Mohamed, and Geoffrey Hinton. Speech recognition with deep recurrent neural networks. In IEEE International Conference on Acoustics, 2013.
  • [15] Xiangnan He, Hanwang Zhang, Min-Yen Kan, and Tat-Seng Chua. Fast matrix factorization for online recommendation with implicit feedback. In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, pages 549–558. ACM, 2016.
  • [16] Balázs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, and Domonkos Tikk. Session-based recommendations with recurrent neural networks. In 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings.
  • [17] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural computation, 9(8):1735–1780, 1997.
  • [18] Nan Jiang and Lihong Li. Doubly robust off-policy value evaluation for reinforcement learning. In Proceedings of the 33rd International Conference on Machine Learning, ICML 2016, New York City, NY, USA, June 19-24, 2016, pages 652–661.
  • [19] Elias B. Khalil, Hanjun Dai, Yuyu Zhang, Bistra Dilkina, and Le Song. Learning combinatorial optimization algorithms over graphs. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4-9 December 2017, Long Beach, CA, USA, pages 6351–6361, 2017.
  • [20] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.
  • [21] Yehuda Koren, Robert Bell, and Chris Volinsky. Matrix factorization techniques for recommender systems. Computer, 42(8), 2009.
  • [22] Alex Kulesza and Ben Taskar. Determinantal point processes for machine learning. CoRR, abs/1207.6083, 2012.
  • [23] Alex Kulesza, Ben Taskar, et al. Determinantal point processes for machine learning. Foundations and Trends® in Machine Learning, 5(2–3):123–286, 2012.
  • [24] Lihong Li, Shunbao Chen, Jim Kleban, and Ankur Gupta. Counterfactual estimation and optimization of click metrics in search engines: A case study. In Proceedings of the 24th International Conference on World Wide Web, pages 929–934. ACM, 2015.
  • [25] Xuezhe Ma and Eduard H. Hovy. End-to-end sequence labeling via bi-directional lstm-cnns-crf. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016, August 7-12, 2016, Berlin, Germany, Volume 1: Long Papers.
  • [26] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin A. Riedmiller. Playing atari with deep reinforcement learning. CoRR, abs/1312.5602, 2013.
  • [27] Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl. Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th international conference on World Wide Web, pages 285–295. ACM, 2001.
  • [28] Guy Shani and Asela Gunawardana. Evaluating recommendation systems. In Recommender systems handbook, pages 257–297.
  • [29] Jing-Cheng Shi, Yang Yu, Qing Da, Shi-Yong Chen, and Anxiang Zeng. Virtualtaobao: Virtualizing real-world online retail environment for reinforcement learning. CoRR, abs/1805.10000, 2018.
  • [30] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need.
  • [31] Oriol Vinyals, Meire Fortunato, and Navdeep Jaitly. Pointer networks. In Advances in Neural Information Processing Systems, pages 2692–2700, 2015.
  • [32] Chao Wang, Yiqun Liu, Meng Wang, Ke Zhou, Jian-yun Nie, and Shaoping Ma.
  • [33] Mark Wilhelm, Ajith Ramanathan, Alexander Bonomo, Sagar Jain, Ed H Chi, and Jennifer Gillenwater. Practical diversified recommendations on youtube with determinantal point processes. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pages 2165–2173. ACM, 2018.
  • [34] Yan Yan, Gaowen Liu, Sen Wang, Jian Zhang, and Kai Zheng. Graph-based clustering and ranking for diversified image search. Multimedia Systems, 23(1):41–52, 2017.
  • [35] Yisong Yue and Carlos Guestrin. Linear submodular bandits and their application to diversified retrieval. In Advances in Neural Information Processing Systems, pages 2483–2491, 2011.
  • [36] Qian Zhao, Gediminas Adomavicius, F Maxwell Harper, Martijn Willemsen, and Joseph A Konstan. Toward better interactions in recommender systems: cycling and serpentining approaches for top-n item lists. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW 2017), 2017.
  • [37] Xiangyu Zhao, Long Xia, Liang Zhang, Zhuoye Ding, Dawei Yin, and Jiliang Tang. Deep reinforcement learning for page-wise recommendations. In Proceedings of the 12th ACM Conference on Recommender Systems, RecSys 2018, Vancouver, BC, Canada, October 2-7, 2018, pages 95–103.
  • [38] Guanjie Zheng, Fuzheng Zhang, Zihan Zheng, Yang Xiang, Nicholas Jing Yuan, Xing Xie, and Zhenhui Li. Drn: A deep reinforcement learning framework for news recommendation. In Proceedings of the 2018 World Wide Web Conference on World Wide Web, pages 167–176. International World Wide Web Conferences Steering Committee, 2018.
  • [39] Xiaojin Zhu, Andrew Goldberg, Jurgen Van Gael, and David Andrzejewski. Improving diversity in ranking using absorbing random walks. In Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference, pages 97–104, 2007.
  • [40] Cai-Nicolas Ziegler, Sean M McNee, Joseph A Konstan, and Georg Lausen. Improving recommendation lists through topic diversification. In Proceedings of the 14th international conference on World Wide Web, pages 22–32. ACM, 2005.
Author
Fan Wang
Xiaomin Fang
Lihang Liu
Yaxue Chen
Jiucheng Tao
Zhiming Peng
Cihang Jin
Hao Tian