A Generic Coordinate Descent Framework for Learning from Implicit Feedback

Proceedings of the 26th International Conference on World Wide Web, pp. 1341-1350, 2017.

DOI: https://doi.org/10.1145/3038912.3052694
We have presented a general, efficient framework for learning recommender system models from implicit feedback

Abstract:

In recent years, interest in recommender research has shifted from explicit feedback towards implicit feedback data. A diversity of complex models has been proposed for a wide variety of applications. Despite this, learning from implicit feedback is still computationally challenging. So far, most work relies on stochastic gradient descent...

Introduction
  • The focus of recommender system research has shifted from explicit feedback problems such as rating prediction to implicit feedback problems.
  • Most of the signal that a user provides about her preferences is implicit.
  • Examples of implicit feedback include a user watching a video or clicking on a link.
  • Implicit feedback data is much cheaper to ...
  • © 2017 International World Wide Web Conference Committee (IW3C2), published under Creative Commons CC-BY-NC-ND 2.0 License.
  • WWW 2017, April 3–7, 2017, Perth, Australia.
Highlights
  • The focus of recommender system research has shifted from explicit feedback problems such as rating prediction to implicit feedback problems
  • In Section 5, we show how to apply iCD to a diverse set of models, including matrix factorization (MF), factorization machines (FM), and tensor factorization
  • In Section 5, we show that many common models are k-separable, including matrix factorization, feature-based approaches such as factorization machines, and higher-order tensor factorization such as Parallel Factor Analysis or Tucker decomposition
  • We have presented a general, efficient framework for learning recommender system models from implicit feedback
  • We have shown that the implicit regularizer of any k-separable model can be computed efficiently without iterating over all context-item pairs
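The last highlight can be illustrated for the simplest k-separable model, matrix factorization. The following is a minimal sketch, not the authors' implementation: it assumes the implicit regularizer is the sum of squared predicted scores over all context-item pairs (any regularization weight is omitted), and the names W, H, gramian and the toy sizes are hypothetical. It checks numerically that this sum equals an expression built from two small k x k Gram matrices, so the full set of pairs never has to be enumerated.

```python
import random


def naive_implicit_regularizer(W, H):
    """Sum of squared MF scores over ALL context-item pairs: O(|C| * |I| * k)."""
    total = 0.0
    for w in W:
        for h in H:
            score = sum(wf * hf for wf, hf in zip(w, h))
            total += score * score
    return total


def gramian(M):
    """k x k Gram matrix J[f][g] = sum over rows m of m[f] * m[g]."""
    k = len(M[0])
    J = [[0.0] * k for _ in range(k)]
    for row in M:
        for f in range(k):
            for g in range(k):
                J[f][g] += row[f] * row[g]
    return J


def fast_implicit_regularizer(W, H):
    """Same quantity via the two Gram matrices: O((|C| + |I|) * k^2)."""
    JC, JI = gramian(W), gramian(H)
    k = len(JC)
    return sum(JC[f][g] * JI[f][g] for f in range(k) for g in range(k))


# Toy embeddings (assumed sizes): 30 contexts, 50 items, k = 4 factors.
rng = random.Random(0)
k = 4
W = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(30)]
H = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(50)]

naive = naive_implicit_regularizer(W, H)
fast = fast_implicit_regularizer(W, H)
print(f"naive = {naive:.6f}, fast = {fast:.6f}")
```

The two values agree up to floating-point rounding. The naive loop touches every context-item pair, while the Gram-matrix version scales with the number of contexts plus the number of items.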
Methods
  • The main objective of the experiments is to illustrate the generality of the iCD framework.
  • The purpose of the experiments is not to compare BPR and CD on yet another dataset, but rather to demonstrate the versatility of the iCD framework and illustrate how it can serve as a building block for future research on complex recommender models.
  • As with MF, it is likely that both iCD and BPR will show strengths in different applications
Results
  • Cold-Start Recommendation (Section 6.2.1): In the cold-start recommendation scenario [2], the authors assume that a user interacts with the recommender system for the first time.
Conclusion
  • The authors have presented a general, efficient framework for learning recommender system models from implicit feedback.
  • The authors have shown that the implicit regularizer of any k-separable model can be computed efficiently without iterating over all context-item pairs.
  • The authors have provided efficient learning algorithms for these models based on the framework.
  • The authors' framework is not limited to the models discussed in the paper but is designed to serve as a general blueprint for deriving learning algorithms for recommender systems.
Related work
  • For several years, matrix factorization (MF) has been regarded as the most effective basic recommender system model. Two optimization strategies dominate the research on MF from implicit feedback data. The first is Bayesian Personalized Ranking (BPR) [13], a stochastic gradient descent (SGD) framework that contrasts pairs of consumed and non-consumed items. The second is coordinate descent (CD), also known as alternating least squares, on an element-wise loss over both the consumed and non-consumed items [5]. In terms of the loss formulation, BPR's pairwise classification loss is better suited for ranking, whereas the CD loss is better suited for numerical data. With regard to the optimization task, both techniques face the same challenge of learning over a very large number of training examples. BPR tackles this issue by sampling negative items, but it has been shown that BPR has convergence problems when the number of items is large [7, 12]. It requires more complex, non-uniform sampling strategies to deal with this problem [12, 6]. For CD-MF, on the other hand, Hu et al. [5] have derived an efficient algorithm that optimizes over the large number of non-consumed items without any additional cost. This computational trick is exact and does not involve sampling. Many authors have compared CD-MF and BPR-MF on a variety of datasets; some work reports better quality for BPR-MF [4, 17, 16, 8], whereas for other problems CD-MF works better [8, 25, 15, 22, 26]. This large body of results indicates that the advantages of CD and BPR are orthogonal and that both approaches have their merits.
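To make the BPR side of this comparison concrete, the following is a minimal sketch of pairwise SGD with uniformly sampled negative items, which is the simple sampling scheme the paragraph above says can converge poorly when the number of items is large. It is an illustration under assumptions rather than a reference implementation: the toy interaction data, the helper names (sample_triple, bpr_step), and the learning rate and regularization values are all hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sample_triple(consumed, n_items, rng):
    """Draw (user, consumed item, uniformly sampled non-consumed item)."""
    user = rng.choice(sorted(consumed))
    pos = rng.choice(sorted(consumed[user]))
    while True:
        neg = rng.randrange(n_items)
        if neg not in consumed[user]:
            return user, pos, neg


def bpr_step(W, H, user, pos, neg, lr=0.1, reg=0.01):
    """One SGD step on the pairwise loss -ln sigmoid(x_pos - x_neg) with L2 decay.

    Returns the pairwise loss measured before the update.
    """
    w, hp, hn = W[user], H[pos], H[neg]
    x = dot(w, hp) - dot(w, hn)
    g = 1.0 / (1.0 + math.exp(x))  # sigmoid(-x): slope of ln sigmoid at x
    for f in range(len(w)):
        wf, pf, nf = w[f], hp[f], hn[f]
        w[f] += lr * (g * (pf - nf) - reg * wf)
        hp[f] += lr * (g * wf - reg * pf)
        hn[f] += lr * (-g * wf - reg * nf)
    return math.log1p(math.exp(-x))


# Toy implicit feedback (assumed): each user consumed two of six items.
consumed = {0: {0, 1}, 1: {2, 3}, 2: {4, 5}}
n_items, k = 6, 3
rng = random.Random(0)
W = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in consumed]
H = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]

losses = [bpr_step(W, H, *sample_triple(consumed, n_items, rng))
          for _ in range(3000)]
early_loss = sum(losses[:200]) / 200
late_loss = sum(losses[-200:]) / 200
print(f"mean loss, first 200 steps: {early_loss:.3f}; last 200 steps: {late_loss:.3f}")
```

Each step touches only one consumed and one sampled non-consumed item, so the cost per update is independent of the catalogue size. The price is that most non-consumed items are never seen in a given step, which is where the non-uniform samplers cited above come in.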
Reference
  • [1] C. Cheng, H. Yang, M. R. Lyu, and I. King. Where You Like to Go Next: Successive Point-of-Interest Recommendation. In IJCAI, volume 13, pages 2605–2611, 2013.
  • [2] Z. Gantner, L. Drumond, C. Freudenthaler, S. Rendle, and L. Schmidt-Thieme. Learning attribute-to-feature mappings for cold-start recommendations. In 2010 IEEE International Conference on Data Mining, pages 176–185. IEEE, 2010.
  • [3] R. A. Harshman. Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multi-modal factor analysis. UCLA Working Papers in Phonetics, 16(1):84, 1970.
  • [4] R. He and J. McAuley. VBPR: Visual bayesian personalized ranking from implicit feedback. In D. Schuurmans and M. P. Wellman, editors, AAAI, pages 144–150. AAAI Press, 2016.
  • [5] Y. Hu, Y. Koren, and C. Volinsky. Collaborative filtering for implicit feedback datasets. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, ICDM '08, pages 263–272, 2008.
  • [6] B. Kanagal, A. Ahmed, S. Pandey, V. Josifovski, J. Yuan, and L. Garcia-Pueyo. Supercharging recommender systems using taxonomies for learning user purchase behavior. Proc. VLDB Endow., 5(10):956–967, June 2012.
  • [7] B. McFee, T. Bertin-Mahieux, D. P. Ellis, and G. R. Lanckriet. The million song dataset challenge. In Proceedings of the 21st International Conference on World Wide Web, WWW '12 Companion, pages 909–916, New York, NY, USA, 2012. ACM.
  • [8] X. Ning and G. Karypis. Slim: Sparse linear methods for top-n recommender systems. In 2011 IEEE 11th International Conference on Data Mining, pages 497–506. IEEE, 2011.
  • [9] W. Pan and L. Chen. GBPR: Group Preference Based Bayesian Personalized Ranking for One-Class Collaborative Filtering. In IJCAI, volume 13, pages 2691–2697, 2013.
  • [10] I. Pilaszy, D. Zibriczky, and D. Tikk. Fast als-based matrix factorization for explicit and implicit feedback datasets. In Proceedings of the fourth ACM conference on Recommender systems, pages 71–78. ACM, 2010.
  • [11] S. Rendle. Factorization machines with libfm. ACM Trans. Intell. Syst. Technol., 3(3):57:1–57:22, May 2012.
  • [12] S. Rendle and C. Freudenthaler. Improving pairwise learning for item recommendation from implicit feedback. In Proceedings of the 7th ACM International Conference on Web Search and Data Mining, WSDM '14, pages 273–282, New York, NY, USA, 2014. ACM.
  • [13] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme. BPR: Bayesian personalized ranking from implicit feedback. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, UAI '09, pages 452–461, Arlington, Virginia, United States, 2009. AUAI Press.
  • [14] S. Rendle, C. Freudenthaler, and L. Schmidt-Thieme. Factorizing personalized markov chains for next-basket recommendation. In Proceedings of the 19th International Conference on World Wide Web, WWW '10, pages 811–820. ACM, 2010.
  • [15] S. Sedhain, A. K. Menon, S. Sanner, and D. Braziunas. On the effectiveness of linear models for one-class collaborative filtering. In Proceedings of the 30th Conference on Artificial Intelligence (AAAI-16), 2016.
  • [16] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, A. Hanjalic, and N. Oliver. TFMAP: optimizing MAP for top-n context-aware recommendation. In Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval, pages 155–164. ACM, 2012.
  • [17] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic. CLiMF: learning to maximize reciprocal rank with collaborative less-is-more filtering. In Proceedings of the sixth ACM conference on Recommender systems, pages 139–146. ACM, 2012.
  • [18] E. Shmueli, A. Kagian, Y. Koren, and R. Lempel. Care to comment?: recommendations for commenting on news stories. In Proceedings of the 21st international conference on World Wide Web, pages 429–438. ACM, 2012.
  • [19] J.-T. Sun, H.-J. Zeng, H. Liu, Y. Lu, and Z. Chen. Cubesvd: A novel approach to personalized web search. In Proceedings of the 14th International Conference on World Wide Web, WWW '05, pages 382–390, New York, NY, USA, 2005. ACM.
  • [20] P. Symeonidis, A. Nanopoulos, and Y. Manolopoulos. A unified framework for providing recommendations in social tagging systems based on ternary semantic analysis. IEEE Trans. on Knowl. and Data Eng., 22(2):179–192, Feb. 2010.
  • [21] L. R. Tucker. Some mathematical notes on three-mode factor analysis. Psychometrika, 31:279–311, 1966.
  • [22] M. Volkovs and G. W. Yu. Effective latent models for binary feedback in recommender systems. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 313–3... ACM, 2015.
  • [23] H.-F. Yu, C.-J. Hsieh, S. Si, and I. Dhillon. Scalable coordinate descent approaches to parallel matrix factorization for recommender systems. In Proceedings of the 12th International Conference on Data Mining, ICDM '12, pages 765–774, 2012.
  • [24] X. Yu, X. Ren, Y. Sun, Q. Gu, B. Sturt, U. Khandelwal, B. Norick, and J. Han. Personalized entity recommendation: A heterogeneous information network approach. In Proceedings of the 7th ACM International Conference on Web Search and Data Mining, WSDM '14, pages 283–292. ACM, 2014.
  • [25] T. Zhao, J. McAuley, and I. King. Leveraging social connections to improve personalized ranking for collaborative filtering. In Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, CIKM '14, pages 261–270, New York, NY, USA, 2014. ACM.
  • [26] T. Zhao, J. McAuley, and I. King. Improving latent factor models via personalized feature projection for one class recommendation. In Proceedings of the 24th ACM International Conference on Information and Knowledge Management, pages 821–830. ACM, 2015.