
# Factorizing personalized Markov chains for next-basket recommendation

WWW, pp. 811–820, 2010


Abstract

Recommender systems are an important component of many websites. Two of the most popular approaches are based on matrix factorization (MF) and Markov chains (MC). MF methods learn the general taste of a user by factorizing the matrix over observed user-item preferences. On the other hand, MC methods model sequential behavior by learning a…


Introduction

- Recommender systems are a core technology of many modern websites
- They are used, for example, to increase sales in e-commerce, click-through rates on websites, or visitor satisfaction in general.
- An obvious example is an online shop where a user buys items.
- In these applications, several items are usually bought at the same time, i.e. the authors consider a set/basket of items at one point in time.
- The target is to recommend items to the user that they might want to buy during their next visit

Highlights

- Recommender systems are a core technology of many modern websites
- Two of the most popular approaches are based on matrix factorization (MF) and Markov chains (MC)
- As the observations for estimating the transitions are usually very limited, our method factorizes the transition cube with a pairwise interaction model which is a special case of the Tucker Decomposition
- We show that our factorized personalized MC (FPMC) model subsumes both a common Markov chain and the normal matrix factorization model
- We have introduced a recommender method based on personalized Markov chains over sequential set data
- We empirically show that our model outperforms other state-of-the-art methods on sequential data
- As direct estimation over a full parametrized transition cube leads to very poor estimates, we introduce a factorization model that gives a low-rank approximation to the transition cube
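The FPMC structure highlighted above (an MF term for general taste plus an averaged MC term for sequential effects) can be sketched numerically. The following is a minimal illustrative sketch, not the authors' code: the factor matrices `V_UI`, `V_IU`, `V_IL`, `V_LI` are randomly initialized stand-ins for learned parameters, and the naming is an assumption based on the pairwise-interaction factorization described in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 5, 10, 4

# Pairwise-interaction factors (a special case of the Tucker decomposition):
# only user-item and item-(last-basket-item) interactions, no three-way term.
V_UI = rng.normal(scale=0.1, size=(n_users, k))  # user taste factors
V_IU = rng.normal(scale=0.1, size=(n_items, k))  # item factors (vs. user)
V_IL = rng.normal(scale=0.1, size=(n_items, k))  # item factors (vs. last basket)
V_LI = rng.normal(scale=0.1, size=(n_items, k))  # last-basket item factors

def fpmc_score(u, i, prev_basket):
    """MF part (general taste) + MC part averaged over the previous basket."""
    mf = V_UI[u] @ V_IU[i]
    mc = np.mean([V_IL[i] @ V_LI[l] for l in prev_basket])
    return mf + mc

def recommend(u, prev_basket, top_n=5):
    scores = [(i, fpmc_score(u, i, prev_basket)) for i in range(n_items)
              if i not in prev_basket]
    return [i for i, _ in sorted(scores, key=lambda t: -t[1])[:top_n]]

print(recommend(u=0, prev_basket={1, 3}))
```

With all factors set to zero except `V_UI`/`V_IU`, the score reduces to plain matrix factorization; with only `V_IL`/`V_LI`, it reduces to a factorized Markov chain, which is why FPMC subsumes both models.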

Results

- Figure 6 shows the quality on the sparse and dense online-shopping datasets.
- For the factorization methods, the authors run each method with factorization dimensions kU,I = kI,L ∈ {8, 16, 32, 64, 128}.
- As expected, all methods clearly outperform the most-popular baseline on both datasets and all quality measures.
- With reasonable factorization dimensions (e.g. 32), all factorization methods outperform the standard MC method.
- The factorized personalized MC (FPMC) outperforms all other methods

Conclusion

- The authors have introduced a recommender method based on personalized Markov chains over sequential set data.
- As direct estimation over a full parametrized transition cube leads to very poor estimates, the authors introduce a factorization model that gives a low-rank approximation to the transition cube.
- The advantage of this approach is that each transition is influenced by transitions of similar users, similar items, and similar transitions.
- The authors show on real-world data that FPMC outperforms MF, FMC and normal MC on both sparse and dense data


Tables

- Table1: Characteristics of the datasets in our experiments in terms of number of users, items, baskets and triples (u, i, t) where t is the sequential time of the basket. The dense dataset is a subset of the sparse one containing the 10,000 users with most purchases and the 1000 most purchased items
- Table2: Properties of the MC transition matrix estimated by the counting scheme. For the sparse dataset, only 12% of the entries of the transition matrix are non-zero and non-missing. For the dense subset, 88% are filled.

Related work

- Markov chains for recommender systems have been studied by several researchers. Zimdars et al. [10] describe a sequential recommender based on Markov chains. They investigate how to extract sequential patterns to learn the next state with a standard predictor, e.g. a decision tree. Mobasher et al. [5] use pattern mining methods to discover sequential patterns which are used for generating recommendations. Shani et al. [9] introduce a recommender based on Markov decision processes (MDP) and also an MC-based recommender. To enhance the maximum likelihood estimates (MLE) of the MC transition graphs, they describe several heuristic approaches like clustering and skipping. Instead of improving the MLE estimates with heuristics, we use a factorization model that is learned for optimal ranking rather than for transition MLE. In total, the main difference of our work from all the previous approaches is the use of personalized transition graphs, which bring together the benefits of sequential, i.e. time-aware, MCs with time-invariant user taste. Furthermore, factorizing transition probabilities and optimizing the parameters for ranking is new.

Funding

- Steffen Rendle is supported by a research fellowship of the Japan Society for the Promotion of Science (JSPS)
- This work is partially co-funded through the European Commission FP7 project MyMedia (www.mymediaproject.org) under the grant agreement no. 215006
- This work is co-funded by the European Regional Development Fund project LEFOS (www.ismll.uni-hildesheim.de) under the grant agreement no. 62700.

[Figure: F-Measure @ Top5 and Half-life utility (HLU) versus dimensionality on the sparse and dense Online-Shopping datasets, comparing SBPR-FPMC, SBPR-FMC, SBPR-MF, MC, and most popular.]

Study subjects and analysis

users: 4

An example for non-personalized MLE can be seen in figure 2. Here, the buying history for the four users of figure 1 is translated into the transitions A of eq. (4). The transition matrix can then be applied to predict which items should be recommended given the last basket.
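The counting scheme behind this MLE can be sketched as follows. The toy basket histories below are illustrative stand-ins, not the actual data of figure 1:

```python
import numpy as np

n_items = 4  # items A=0, B=1, C=2, D=3

# Toy sequential basket data: per user, an ordered list of baskets (item-id sets).
histories = [
    [{0, 1}, {2}],   # this user bought {A, B}, then {C}
    [{1}, {2, 3}],
    [{0}, {1}, {2}],
]

# Count a transition l -> i whenever item i appears in the basket
# immediately following a basket containing item l.
counts = np.zeros((n_items, n_items))
for baskets in histories:
    for prev, nxt in zip(baskets, baskets[1:]):
        for l in prev:
            for i in nxt:
                counts[l, i] += 1

# MLE of the transition matrix: normalize each row. Rows with no observed
# outgoing transitions stay zero -- the "missing values" of table 2.
row_sums = counts.sum(axis=1, keepdims=True)
A = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

# Predict: score each item by averaging transition probabilities
# from the items of the last basket.
last_basket = {0, 1}
scores = A[list(last_basket)].mean(axis=0)
```

The sparsity of `counts` on real data is exactly why direct estimation gives poor results and why the paper replaces this full parametrization with a factorized, low-rank approximation.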

users: 10

We evaluate our recommender on anonymized purchase data of an online drug store. The dataset we used is a 10-core subset, i.e. every user bought at least 10 items in total (∑_{B∈Bu} |B| ≥ 10) and, vice versa, each item was bought by at least 10 users. The statistics of the dataset can be found in table 1
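A k-core subset of this kind can be extracted by alternately dropping users and items below the threshold until the condition stabilizes. The sketch below is illustrative: it counts (user, item) purchase tuples rather than the paper's basket sums, and the sample data is made up.

```python
from collections import Counter

def k_core(purchases, k=10):
    """Iteratively filter (user, item) tuples until every remaining user has
    at least k purchases and every remaining item has at least k buyers."""
    changed = True
    while changed:
        user_counts = Counter(u for u, _ in purchases)
        item_counts = Counter(i for _, i in purchases)
        kept = [(u, i) for u, i in purchases
                if user_counts[u] >= k and item_counts[i] >= k]
        changed = len(kept) != len(purchases)  # re-check: dropping items can
        purchases = kept                       # push users below k, and vice versa
    return purchases

# Toy example with k=2: ("u3", "c") is dropped, the rest survive.
data = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "b"), ("u3", "c")]
print(k_core(data, k=2))  # -> [('u1', 'a'), ('u1', 'b'), ('u2', 'a'), ('u2', 'b')]
```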

References

- [1] J. S. Breese, D. Heckerman, and C. Kadie. Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI-98), pages 43–52, San Francisco, 1998. Morgan Kaufmann.
- [2] Y. Hu, Y. Koren, and C. Volinsky. Collaborative filtering for implicit feedback datasets. In IEEE International Conference on Data Mining (ICDM 2008), pages 263–272, 2008.
- [3] Y. Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In KDD '08: Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 426–434, New York, NY, USA, 2008. ACM.
- [4] Y. Koren. Collaborative filtering with temporal dynamics. In KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 447–456, New York, NY, USA, 2009. ACM.
- [5] B. Mobasher, H. Dai, T. Luo, and M. Nakagawa. Using sequential and non-sequential patterns in predictive web usage mining tasks. In ICDM '02: Proceedings of the 2002 IEEE International Conference on Data Mining, page 669, Washington, DC, USA, 2002. IEEE Computer Society.
- [6] R. Pan and M. Scholz. Mind the gaps: weighting the unknown in large-scale one-class collaborative filtering. In KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 667–676, New York, NY, USA, 2009. ACM.
- [7] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme. BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI 2009), 2009.
- [8] S. Rendle and L. Schmidt-Thieme. Pairwise interaction tensor factorization for personalized tag recommendation. In Proceedings of the Third ACM International Conference on Web Search and Data Mining (WSDM 2010). ACM, 2010.
- [9] G. Shani, D. Heckerman, and R. I. Brafman. An MDP-based recommender system. Journal of Machine Learning Research, 6:1265–1295, 2005.
- [10] A. Zimdars, D. M. Chickering, and C. Meek. Using temporal data for making recommendations. In UAI '01: Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence, pages 580–588, San Francisco, CA, USA, 2001. Morgan Kaufmann Publishers Inc.

Best Paper

Best Paper of WWW, 2010
