
Neural Attentive Session-based Recommendation.

CIKM, (2017): 1419-1428

Cited by: 211

Abstract

In e-commerce scenarios where user profiles are invisible, session-based recommendation is proposed to generate recommendation results from short sessions. Previous work only considers the user's sequential behavior in the current session, whereas the user's main purpose in the current session is not emphasized. In this paper, we propose…

Introduction
  • A user session starts when a user clicks a certain item; within a session, the user clicks on items of interest and spends more time viewing them.
  • Current recommendation research confronts challenges when recommendations are generated merely from such user sessions, where existing recommendation methods [1, 16, 39, 42] cannot perform well.
  • To tackle this problem, session-based recommendation [33] is proposed to predict the item that the user is probably interested in based merely on implicit feedback, i.e., user clicks, in the current session.
  • The authors argue that both the user’s sequential behavior and main purpose in the current session should be considered in session-based recommendation
Highlights
  • A user session starts when a user clicks a certain item; within a session, the user clicks on items of interest and spends more time viewing them
  • We propose a novel Neural Attentive Recommendation Machine (NARM) model to take into account both the user’s sequential behavior and main purpose in the current session, and compute recommendation scores by using a bi-linear matching scheme
  • Though a growing number of publications on session-based recommendation focus on recurrent neural network (RNN) based methods, unlike existing studies, we propose a novel neural attentive recommendation model that combines both the user’s sequential behavior and main purpose in the current session, which, to the best of our knowledge, is not considered by existing research
  • As recommender systems can only recommend a few items at a time, the actual item a user might pick should be amongst the first few items of the list
  • We have proposed the neural attentive recommendation machine (NARM) with an encoder-decoder architecture to address the session-based recommendation problem
  • We have conducted extensive experiments on two benchmark datasets and demonstrated that our approach can outperform state-of-the-art methods in terms of different evaluation metrics
  • The attention mechanism can be used to explore the importance of attributes in the current session
Methods
  • There are two traditional modeling paradigms, i.e., the general recommender and the sequential recommender.

    General recommenders are mainly based on item-to-item recommendation approaches.
  • Sarwar et al. [32] analyze different item-based recommendation generation algorithms and compare their results with basic k-nearest neighbor approaches.
  • Though these methods have proven to be effective and are widely employed, they only take into account the last click of the session, ignoring the information of the whole click sequence.
  • The authors consider that the recall metric is more important than the MRR metric in this task, and NARM adopts the bi-linear decoder in the following experiments
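As a concrete illustration of the item-to-item paradigm discussed above, here is a minimal sketch of an Item-KNN-style recommender. The co-occurrence-based cosine similarity is one common formulation of this baseline; function names and the toy data are ours, not the paper's implementation, and it illustrates the stated limitation: only the last click is used.

```python
from collections import defaultdict
from math import sqrt

def build_item_similarity(sessions):
    """Cosine similarity between items, based on session co-occurrence counts."""
    count = defaultdict(int)     # item -> number of sessions containing it
    co_count = defaultdict(int)  # (item_a, item_b) -> number of shared sessions
    for session in sessions:
        items = set(session)
        for a in items:
            count[a] += 1
            for b in items:
                if a != b:
                    co_count[(a, b)] += 1
    return {
        (a, b): c / sqrt(count[a] * count[b])
        for (a, b), c in co_count.items()
    }

def recommend(last_click, sim, n=20):
    """Rank candidate items by similarity to the last clicked item only."""
    scored = [(b, s) for (a, b), s in sim.items() if a == last_click]
    return [item for item, _ in sorted(scored, key=lambda x: -x[1])[:n]]

sim = build_item_similarity([[1, 2, 3], [2, 3, 4], [1, 3]])
recs = recommend(3, sim)
```

Because `recommend` conditions only on `last_click`, the rest of the click sequence is ignored, which is exactly the shortcoming noted above.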
Results
  • Evaluation Metrics and Experimental Setup

    4.3.1 Evaluation Metrics. As recommender systems can only recommend a few items at a time, the actual item a user might pick should be amongst the first few items of the list.
  • The authors use the following metrics to evaluate the quality of the recommendation lists.
  • Recall@N does not consider the actual rank of the item as long as it is amongst the top-N and usually correlates well with other metrics such as click-through rate (CTR) [21].
  • MRR@20: Another metric used is MRR@20 (Mean Reciprocal Rank), which is the average of the reciprocal ranks of the desired items.
  • MRR takes the rank of the item into account, which is important in settings where the order of recommendations matters
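The two metrics described above can be stated compactly in code; the following is a minimal sketch (function and variable names are ours), where a target ranked beyond position N contributes 0 to MRR@N, matching the convention that only the top-N list is shown to the user.

```python
def recall_at_n(ranked_lists, targets, n=20):
    """Fraction of cases where the target item appears in the top-n list."""
    hits = sum(1 for ranked, t in zip(ranked_lists, targets) if t in ranked[:n])
    return hits / len(targets)

def mrr_at_n(ranked_lists, targets, n=20):
    """Mean reciprocal rank; a target outside the top-n contributes 0."""
    total = 0.0
    for ranked, t in zip(ranked_lists, targets):
        if t in ranked[:n]:
            total += 1.0 / (ranked.index(t) + 1)  # ranks are 1-based
    return total / len(targets)

# Toy usage: three test cases, the third target never appears in its list.
ranked = [[5, 3, 9], [7, 1, 2], [4, 8, 6]]
targets = [3, 2, 0]
recall = recall_at_n(ranked, targets, n=3)  # 2 of 3 targets are in the top-3
mrr = mrr_at_n(ranked, targets, n=3)        # (1/2 + 1/3 + 0) / 3
```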
Conclusion
  • Based on the sequential behavior feature and the user purpose feature, the authors have applied NARM to predict a user’s click in the current session.
  • More item attributes, such as prices and categories, may enhance the performance of the method in session-based recommendation.
  • Considering both the nearest neighbor sessions and the importance of different neighbors may give new insights.
  • The attention mechanism can be used to explore the importance of attributes in the current session
Tables
  • Table1: Statistics of the datasets used in our experiments. (The avg.length means the average length of the complete dataset.)
  • Table2: The comparison of different decoders in NARM
  • Table3: Performance comparison of NARM with baseline methods over three datasets
  • Table4: Performance comparison among three versions of NARM over three datasets
  • Table5: Performance comparison among different session lengths on DIGINETICA dataset. (The baseline method is Improved GRU-Rec [40].)
Related work
  • Session-based recommendation is a typical application of recommender systems based on implicit feedback, where no explicit preferences (e.g., ratings) but only positive observations (e.g., clicks) are available [10, 23, 27]. These positive observations are usually in the form of sequential data obtained by passively tracking users’ behavior over a sequence of time. In this section, we briefly review the related work on session-based recommendation from the following two aspects, i.e., traditional methods and deep learning based methods.
Funding
  • This work is supported by the Natural Science Foundation of China (61672322, 61672324), the Natural Science Foundation of Shandong province (2016ZRE27468) and the Fundamental Research Funds of Shandong University
Study subjects and analysis
Datasets: 3 (YOOCHOOSE 1/64, YOOCHOOSE 1/4, DIGINETICA)
Methods compared: POP, S-POP, Item-KNN, BPR-MF, FPMC*, GRU-Rec, Improved GRU-Rec, NARM

4.4 Comparison among Different Decoders

We first empirically compare NARM with different decoders, i.e., the fully-connected decoder and the bi-linear similarity decoder. The results over the three datasets are shown in Table 2. Here we only report the results on 100-dimensional hidden states because we reach the same conclusions on other dimension settings.

We make the following observations from Table 2: (1) With regard to Recall@20, the performance improves when using the bi-linear similarity decoder, with improvements of around 0.65%, 0.24% and 4.74% over the three datasets respectively. (2) With regard to MRR@20, the model using the bi-linear decoder performs slightly worse on YOOCHOOSE 1/64 and 1/4, but on DIGINETICA it still clearly outperforms the model with the fully-connected decoder.

For the session-based recommendation task, as the recommender system recommends the top-20 items at once in our settings, the actual item a user might pick should be among that list of 20 items.
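The bi-linear similarity decoder compared above scores each candidate item i as S_i = emb_i^T B c, where c is the session representation and B is a learned matrix that maps item embeddings into the session-representation space. A minimal NumPy sketch of this scoring step (dimensions and names are illustrative; in the real model B is trained jointly with the encoder):

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, emb_dim, sess_dim = 1000, 50, 100

item_emb = rng.normal(size=(n_items, emb_dim))  # item embedding matrix
B = rng.normal(size=(emb_dim, sess_dim))        # learned bi-linear map
c = rng.normal(size=(sess_dim,))                # session representation

# Bi-linear score for every candidate item: S_i = emb_i^T B c
scores = item_emb @ B @ c                       # shape: (n_items,)

# A softmax over the scores gives a distribution over next items;
# the 20 highest-scoring items form the recommendation list.
probs = np.exp(scores - scores.max())
probs /= probs.sum()
top20 = np.argsort(-scores)[:20]
```

Compared with a fully-connected decoder, this scheme only needs the matrix B rather than a weight vector per item, which is one reason the paper favors it.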

Note that some items in the test set may not appear in the training set, since we trained the model only on the more recent fractions. The statistics of the three datasets (i.e., YOOCHOOSE 1/64, YOOCHOOSE 1/4 and DIGINETICA) are shown in Table 1.

4.2 Baseline Methods
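A common way to turn click sessions into training examples in this line of work, introduced with Improved GRU-Rec [40], is prefix splitting: each session [x1, ..., xn] yields the examples ([x1], x2), ([x1, x2], x3), and so on, each pairing a click prefix with the next click as the prediction target. A sketch, assuming that scheme (the item IDs are toy values):

```python
def split_session(session):
    """Expand one click session into (prefix, next-click target) examples."""
    return [(session[:i], session[i]) for i in range(1, len(session))]

# A session of four clicks yields three training examples.
examples = split_session([10, 42, 7, 19])
```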


Next we compare our NARM model with state-of-the-art methods. The results of all methods over the three datasets are shown in Table 3, and a more specific comparison between NARM and the best baseline (i.e., Improved GRU-Rec) is illustrated in Figure 5. We have the following observations from the results: (1) For the YOOCHOOSE 1/4 dataset, BPR-MF does not work when we use the average of the item factors occurring in the session to replace the user factor. (2) Overall, the three RNN-based methods consistently outperform the traditional baselines, which demonstrates that RNN-based models are good at dealing with sequential information in sessions. (3) By taking both the user’s sequential behavior and main purpose into consideration, the proposed NARM outperforms all the baselines in terms of Recall@20 over the three datasets and outperforms most of the baselines in terms of MRR@20. Taking the DIGINETICA dataset as an example, when compared with the best baseline (i.e., Improved GRU-Rec), the relative performance improvements by NARM are around 7.98% and 9.70% in terms of Recall@20 and MRR@20 respectively.

In this part, we refer to the NARM that uses the sequential behavior feature only, the NARM that uses the user purpose feature only, and the NARM that uses both features as NARM_global, NARM_local and NARM_hybrid respectively. As shown in Table 4: (1) NARM_global and NARM_local, which only use a single feature, do not perform well on the three datasets; besides, their performance is very close in terms of the two metrics. This indicates that merely considering the sequential behavior or the user purpose in the current session may not be enough to learn a good recommendation model. (2) When we take into account both the user’s sequential behavior and main purpose, NARM_hybrid performs better than NARM_global and NARM_local in terms of Recall@20 and MRR@20 on different hidden state dimensions over the three datasets. Taking the DIGINETICA dataset as an example, with the dimensionality of the hidden state set to 50, the relative performance improvements by NARM_hybrid are around 3.52% and 5.09% in terms of Recall@20 respectively.
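The three variants compared above differ only in the session feature fed to the decoder: the global feature is the encoder's last hidden state, the local feature is an attention-weighted sum of all hidden states with weights computed against the last state (equation (7) in the paper), and the hybrid concatenates the two. A NumPy sketch of this feature computation, assuming hidden states are already produced by a GRU encoder; shapes are illustrative and v, A1, A2 would be learned jointly with the encoder in the real model:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def narm_features(H, v, A1, A2):
    """H: (T, d) hidden states for one session of T clicks."""
    h_t = H[-1]  # global feature: the final hidden state
    # Attention weight of each position j with respect to the final state h_t:
    # alpha_tj = v^T sigmoid(A1 h_t + A2 h_j)
    alpha = np.array([v @ sigmoid(A1 @ h_t + A2 @ h_j) for h_j in H])
    c_local = alpha @ H  # local feature: weighted sum of hidden states
    # Hybrid feature c = [c_global; c_local], consumed by the decoder.
    return np.concatenate([h_t, c_local])

rng = np.random.default_rng(1)
d = 8
H = rng.normal(size=(5, d))            # 5-click session, d-dim hidden states
v = rng.normal(size=(d,))
A1 = rng.normal(size=(d, d))
A2 = rng.normal(size=(d, d))
c = narm_features(H, v, A1, A2)        # shape: (2 * d,)
```

NARM_global would pass only `h_t` to the decoder and NARM_local only `c_local`; the ablation in Table 4 shows that the concatenation is what pays off.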


Figures
  • Performance comparison between NARM and the best baseline (i.e., Improved GRU-Rec) over three datasets.
  • Visualization of item weights. The depth of the color corresponds to the importance of items given by equation (7). The numbers above the sessions are the session IDs. (Best viewed in color.)

References
  • [1] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6):734–749, 2005.
  • [2] D. Amodei, R. Anubhai, E. Battenberg, C. Case, J. Casper, B. Catanzaro, J. Chen, M. Chrzanowski, A. Coates, G. Diamos, et al. Deep Speech 2: end-to-end speech recognition in English and Mandarin. In Proceedings of the 33rd International Conference on Machine Learning, pages 173–182, 2016.
  • [3] S. Chen, J. L. Moore, D. Turnbull, and T. Joachims. Playlist prediction via metric embedding. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 714–722, 2012.
  • [4] J. Davidson, B. Liebald, J. Liu, P. Nandy, T. Van Vleet, U. Gargi, S. Gupta, Y. He, M. Lambert, B. Livingston, et al. The YouTube video recommendation system. In Proceedings of the 4th ACM Conference on Recommender Systems, pages 293–296, 2010.
  • [5] L. De Vine, G. Zuccon, B. Koopman, L. Sitbon, and P. Bruza. Medical semantic similarity with a neural language model. In Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, pages 1819–1822, 2014.
  • [6] A. M. Elkahky, Y. Song, and X. He. A multi-view deep learning approach for cross domain user modeling in recommendation systems. In Proceedings of the 24th International Conference on World Wide Web, pages 278–288, 2015.
  • [7] A. Graves, A.-r. Mohamed, and G. Hinton. Speech recognition with deep recurrent neural networks. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pages 6645–6649, 2013.
  • [8] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
  • [9] X. He and T.-S. Chua. Neural factorization machines for sparse predictive analytics. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 355–364, 2017.
  • [10] X. He, H. Zhang, M.-Y. Kan, and T.-S. Chua. Fast matrix factorization for online recommendation with implicit feedback. In Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 549–558, 2016.
  • [11] X. He, L. Liao, H. Zhang, L. Nie, X. Hu, and T.-S. Chua. Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web, pages 173–182, 2017.
  • [12] B. Hidasi, A. Karatzoglou, L. Baltrunas, and D. Tikk. Session-based recommendations with recurrent neural networks. In Proceedings of the 4th International Conference on Learning Representations, 2016.
  • [13] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A. R. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, and T. N. Sainath. Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Processing Magazine, 29(6):82–97, 2012.
  • [14] S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
  • [15] D. Kingma and J. Ba. Adam: a method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations, 2015.
  • [16] Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. Computer, 42(8):30–37, 2009.
  • [17] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems, pages 1097–1105, 2012.
  • [18] P. Li, Z. Wang, W. Lam, Z. Ren, and L. Bing. Salience estimation via variational auto-encoders for multi-document summarization. In Proceedings of the 31st AAAI Conference on Artificial Intelligence, pages 3497–3503, 2017.
  • [19] P. Li, Z. Wang, Z. Ren, L. Bing, and W. Lam. Neural rating regression with abstractive tips generation for recommendation. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 345–354, 2017.
  • [20] G. Linden, B. Smith, and J. York. Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Computing, 7(1):76–80, 2003.
  • [21] Q. Liu, T. Chen, J. Cai, and D. Yu. Enlister: Baidu's recommender system for the biggest Chinese Q&A website. In Proceedings of the 6th ACM Conference on Recommender Systems, pages 285–288, 2012.
  • [22] T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In Proceedings of the 26th International Conference on Neural Information Processing Systems, pages 3111–3119, 2013.
  • [23] A. Mild and T. Reutterer. An improved collaborative filtering approach for predicting cross-category purchases based on binary market basket data. Journal of Retailing and Consumer Services, 10(3):123–133, 2003.
  • [24] A. Mnih and G. Hinton. A scalable hierarchical distributed language model. In Proceedings of the 21st International Conference on Neural Information Processing Systems, pages 1081–1088, 2008.
  • [25] B. Mobasher, H. Dai, T. Luo, and M. Nakagawa. Using sequential and non-sequential patterns in predictive web usage mining tasks. In Proceedings of the IEEE International Conference on Data Mining, pages 669–672, 2002.
  • [26] P. Ren, Z. Chen, Z. Ren, F. Wei, J. Ma, and M. de Rijke. Leveraging contextual sentence relations for extractive summarization using a neural attention model. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 95–104, 2017.
  • [27] Z. Ren, S. Liang, P. Li, S. Wang, and M. de Rijke. Social collaborative viewpoint regression with explainable recommendations. In Proceedings of the 10th ACM International Conference on Web Search and Data Mining, pages 485–494, 2017.
  • [28] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme. BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence, pages 452–461, 2009.
  • [29] S. Rendle, C. Freudenthaler, and L. Schmidt-Thieme. Factorizing personalized Markov chains for next-basket recommendation. In Proceedings of the 19th International Conference on World Wide Web, pages 811–820, 2010.
  • [30] O. İrsoy and C. Cardie. Deep recursive neural networks for compositionality in language. In Proceedings of the 27th International Conference on Neural Information Processing Systems, pages 2096–2104, 2014.
  • [31] R. Salakhutdinov, A. Mnih, and G. Hinton. Restricted Boltzmann machines for collaborative filtering. In Proceedings of the 24th International Conference on Machine Learning, pages 791–798, 2007.
  • [32] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th International Conference on World Wide Web, pages 285–295, 2001.
  • [33] J. B. Schafer, J. Konstan, and J. Riedl. Recommender systems in e-commerce. In Proceedings of the 1st ACM Conference on Electronic Commerce, pages 158–166, 1999.
  • [34] S. Sedhain, A. K. Menon, S. Sanner, and L. Xie. AutoRec: autoencoders meet collaborative filtering. In Proceedings of the 24th International Conference on World Wide Web, pages 111–112, 2015.
  • [35] L. Shang, Z. Lu, and H. Li. Neural responding machine for short-text conversation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, pages 1577–1586, 2015.
  • [36] G. Shani, D. Heckerman, and R. I. Brafman. An MDP-based recommender system. Journal of Machine Learning Research, 6(1):1265–1295, 2005.
  • [37] R. Socher, C. Y. Lin, A. Y. Ng, and C. D. Manning. Parsing natural scenes and natural language with recursive neural networks. In Proceedings of the 28th International Conference on Machine Learning, pages 129–136, 2011.
  • [38] H. Song, Z. Ren, S. Liang, P. Li, J. Ma, and M. de Rijke. Summarizing answers in non-factoid community question-answering. In Proceedings of the 10th ACM International Conference on Web Search and Data Mining, pages 405–414, 2017.
  • [39] X. Su and T. M. Khoshgoftaar. A survey of collaborative filtering techniques. Advances in Artificial Intelligence, 2009.
  • [40] Y. K. Tan, X. Xu, and Y. Liu. Improved recurrent neural networks for session-based recommendations. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems, pages 17–22, 2016.
  • [41] P. Wang, J. Guo, Y. Lan, J. Xu, S. Wan, and X. Cheng. Learning hierarchical representation model for next-basket recommendation. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 403–412, 2015.
  • [42] M. Weimer, A. Karatzoglou, Q. V. Le, and A. Smola. Maximum margin matrix factorization for collaborative ranking. In Proceedings of the 20th International Conference on Neural Information Processing Systems, pages 1–8, 2007.
  • [43] Y. Wu, C. DuBois, A. X. Zheng, and M. Ester. Collaborative denoising auto-encoders for top-n recommender systems. In Proceedings of the 9th ACM International Conference on Web Search and Data Mining, pages 153–162, 2016.
  • [44] G. E. Yap, X. L. Li, and P. S. Yu. Effective next-items recommendation via personalized sequential pattern mining. In Proceedings of the 17th International Conference on Database Systems for Advanced Applications, pages 48–64, 2012.
  • [45] Y. Zhang, H. Dai, C. Xu, J. Feng, T. Wang, J. Bian, B. Wang, and T.-Y. Liu. Sequential click prediction for sponsored search with recurrent neural networks. In Proceedings of the 28th AAAI Conference on Artificial Intelligence, pages 1369–1375, 2014.
  • [46] A. Zimdars, D. M. Chickering, and C. Meek. Using temporal data for making recommendations. In Proceedings of the 17th Conference on Uncertainty in Artificial Intelligence, pages 580–588, 2001.