Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)

KDD, pp.193-202, (2014)


Abstract

Recommendation and review sites offer a wealth of information beyond ratings. For instance, on IMDb users leave reviews, commenting on different aspects of a movie (e.g. actors, plot, visual effects), and expressing their sentiments (positive or negative) on these aspects in their reviews. This suggests that uncovering aspects and sentime...

Introduction
  • Collaborative filtering is a staple of many businesses in the internet economy. Data to build good content recommender systems essentially comes in three guises: interactions, ratings, and reviews.
  • Ratings record whether the user enjoyed the recommended item.
  • This is the traditional domain of collaborative filtering.
  • The top three background words are ‘film’, ‘story’, and ‘character’, all of which provide little information about aspects or sentiments.
  • This is not a mistake, as the word ‘nasty’ can carry a positive or a negative connotation depending on the user.
Highlights
  • Collaborative filtering is a staple of many businesses in the internet economy.
  • Data to build good content recommender systems essentially comes in three guises: interactions, ratings, and reviews.
  • Our model outperforms state-of-the-art recommender systems such as matrix factorization [15].
  • Aspect-sentiments contain sentiment words specific to aspects, e.g. “spectacular” for the “Adventure” aspect, “sharp” for the “Social” aspect, and “nasty” for the “Violence” aspect. These words show why sentiment words must be distinguished per aspect.
  • Our model is able to capture the sentiment in each aspect of a review, and predict partial scores under different aspects.
Results
  • The authors' model outperforms state-of-the-art recommender systems such as matrix factorization [15], as measured by MSE on held-out data.
  • As is common in collaborative filtering, only a tiny fraction of matrix entries are present: fewer than 0.03% of the entries in the dataset are observed.
  • Across the latent factor sizes tested, the authors' model performs best when the number of aspects is 20.
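The two quantities these results rest on, the fraction of observed matrix entries and the mean squared error on held-out ratings, can be sketched as follows. This is a minimal illustration only: the function names and the toy numbers are assumptions made here and are not taken from the paper or its data.

```python
def density(num_observed, num_users, num_items):
    """Fraction of the user-by-item rating matrix that is observed."""
    return num_observed / (num_users * num_items)


def mean_squared_error(predicted, actual):
    """Average squared gap between predicted and true held-out ratings."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Toy numbers chosen for illustration; these are NOT the IMDb statistics.
toy_density = density(num_observed=3, num_users=100, num_items=100)
toy_mse = mean_squared_error([7.0, 4.0, 9.0], [8.0, 4.0, 7.0])

print(f"density: {toy_density:.4%}")
print(f"held-out MSE: {toy_mse:.3f}")
```

In this toy case, three observed ratings in a 100 by 100 matrix give a density of 0.03%, the same order of sparsity the results describe.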
Conclusion
  • In this paper the authors propose JMARS, which provides superior recommendations by exploiting all the available data sources.
  • To this end, the authors incorporate information from both reviews and ratings.
  • User interests and movie topics can be inferred with the integrated model.
  • Future work includes capturing the hierarchical nature of movie topics and incorporating non-parametric models to increase flexibility.
  • A fast inference algorithm is needed to make the model more scalable.
Tables
  • Table 1: IMDb data set. Unigrams containing stop words or punctuation, as well as infrequent unigrams that appear fewer than five times in the corpus, are removed during pruning.
  • Table 2: Comparison of models in terms of perplexity on held-out data for different topic and latent factor sizes.
  • Table 3: Comparison of models in terms of MSE on held-out data. † and ‡ mean the result is better than the method in the previous columns at the 1% and 0.1% significance levels, measured by McNemar’s test.
  • Table 4: The learnt aspect-specific ratings and latent sentiment identified by our model for a review.
  • Table 5: Top background words from φ0 and sentiment words from φs.
  • Table 6: Top topic words from φa for three topics, measured by aggregating all θu,m across reviews. The aspect labels (adventure, violence, social) are manually assigned.
  • Table 7: Top movie-specific words from φm.
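The pruning rule described for Table 1 (drop stop words, tokens containing punctuation, and unigrams seen fewer than five times) could look roughly like the sketch below. The stop-word list, the function names, and the assumption that reviews are already tokenized are all hypothetical; the page does not say how the authors implemented this step.

```python
import string
from collections import Counter

# Hypothetical stop-word list for illustration; the paper's list is not given.
STOP_WORDS = {"the", "a", "an", "and", "of", "is", "it", "to"}


def build_vocabulary(tokenized_reviews, min_count=5):
    """Keep unigrams that are frequent, not stop words, and punctuation-free."""
    counts = Counter(tok for review in tokenized_reviews for tok in review)
    return {
        tok
        for tok, n in counts.items()
        if n >= min_count
        and tok not in STOP_WORDS
        and not any(ch in string.punctuation for ch in tok)
    }


def prune(tokenized_reviews, vocabulary):
    """Drop every token that did not survive vocabulary pruning."""
    return [[tok for tok in review if tok in vocabulary]
            for review in tokenized_reviews]


# Toy corpus: "plot" appears five times, everything else is pruned.
toy_reviews = [["the", "plot", "!", "plot"], ["plot", "plot", "plot", "rare"]]
toy_vocab = build_vocabulary(toy_reviews, min_count=5)
toy_pruned = prune(toy_reviews, toy_vocab)
```

Counting over the whole corpus before filtering matters here: the frequency threshold applies to corpus-level counts, not per-review counts.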
Related work
  • Collaborative filtering is a fertile area of research and there exists a multitude of techniques which can readily be applied to subsets of the problem that we tackle. See e.g. [18, 9] for a review. Specifically, probabilistic matrix factorization methods [15, 17] have proven successful in real-world problems [3, 8, 11, 25, 22].

    However, probabilistic matrix factorization techniques struggle to generalize to new items, i.e. they fail at the cold-start problem. Regression-based latent factor models (RLFM) [1] address this problem by incorporating observable attribute features into the latent factors. Recent research [22, 16] incorporates Latent Dirichlet Allocation (LDA) and uses the topics as features, e.g. for recommending scientific articles. In terms of ratings, [19] use a statistically more appropriate model for capturing the discrete nature of the ratings by formulating an exponential families approach.
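For readers unfamiliar with the matrix factorization baseline mentioned above, here is a minimal sketch of the idea: each rating is approximated by the inner product of a user vector and an item vector, fitted by stochastic gradient descent on a regularized squared error (which corresponds to MAP estimation under Gaussian priors). This is not the JMARS model and not the authors' code; the function names, hyperparameters, and toy ratings are assumptions for illustration. It also shows why cold start is a problem: an item with no ratings never has its vector updated.

```python
import random


def predict(U, V, u, m):
    """Predicted rating: inner product of user and item factor vectors."""
    return sum(a * b for a, b in zip(U[u], V[m]))


def train_mf(ratings, num_users, num_items, rank=2,
             lr=0.02, reg=0.01, epochs=200, seed=0):
    """Fit user/item factors to (user, item, rating) triples with SGD."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(num_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(num_items)]
    for _ in range(epochs):
        for u, m, r in ratings:
            err = r - predict(U, V, u, m)
            for k in range(rank):
                uk, vk = U[u][k], V[m][k]
                # Gradient step on squared error plus an L2 penalty.
                U[u][k] += lr * (err * vk - reg * uk)
                V[m][k] += lr * (err * uk - reg * vk)
    return U, V


def train_mse(U, V, ratings):
    """Mean squared error over the observed training triples."""
    return sum((r - predict(U, V, u, m)) ** 2 for u, m, r in ratings) / len(ratings)


# Toy ratings (user, item, rating); purely illustrative.
toy_ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
               (1, 2, 1.0), (2, 1, 2.0), (2, 2, 1.0)]
U0, V0 = train_mf(toy_ratings, 3, 3, epochs=0)
U1, V1 = train_mf(toy_ratings, 3, 3)
print(train_mse(U0, V0, toy_ratings), train_mse(U1, V1, toy_ratings))
```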
Funding
  • This research is supported by the Singapore National Research Foundation under its International Research Centre @ Singapore Funding Initiative and administered by the IDM Programme Office, Media Development Authority (MDA).
References
  • D. Agarwal and B.-C. Chen. Regression-based latent factor models. In J. Elder, F. Fogelman-Soulie, P. Flach, and M. Zaki, editors, ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 19–28. ACM, 2009.
  • A. Ahmed and E. P. Xing. Staying informed: supervised and semi-supervised multi-view topical analysis of ideological perspective. In Conference on Empirical Methods in Natural Language Processing, pages 1140–1150. ACL, 2010.
  • R. M. Bell and Y. Koren. Lessons from the Netflix prize challenge. SIGKDD Explorations, 9(2):75–79, 2007.
  • A. Z. Broder. Computational advertising and recommender systems. In P. Pu, D. G. Bridge, B. Mobasher, and F. Ricci, editors, Conference on Recommender Systems, pages 1–2. ACM, 2008.
  • J.-F. Cai, E. J. Candes, and Z. Shen. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4):1956–1982, 2010.
  • T. Griffiths and M. Steyvers. Finding scientific topics. Proceedings of the National Academy of Sciences, 101:5228–5235, 2004.
  • L. Hong, A. Ahmed, S. Gurumurthy, A. Smola, and K. Tsioutsiouliklis. Discovering geographical topics in the Twitter stream. In International Conference on World Wide Web, 2012.
  • [9] Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. IEEE Computer, 42(8):30–37, 2009.
  • [10] A. Lazaridou, I. Titov, and C. Sporleder. A bayesian model for joint unsupervised induction of sentiment, aspect and discourse representations. In Annual Meeting of the Association for Computational Linguistics, pages 1630–1639, 2013.
  • [11] H. Ma, H. Yang, M. R. Lyu, and I. King. SoRec: Social Recommendation Using Probabilistic Matrix Factorization. In Conference on Information and Knowledge Management, pages 931–940, 2008.
  • [12] J. McAuley and J. Leskovec. Hidden Factors and Hidden Topics: Understanding Rating Dimensions with Review Text. In Conference on Recommender Systems, pages 165–172, 2013.
  • [13] J. J. McAuley, J. Leskovec, and D. Jurafsky. Learning attitudes and attributes from multi-aspect reviews. In International Conference on Data Mining, pages 1020–1025, 2012.
  • [14] Q. Mei, X. Ling, M. Wondra, H. Su, and C. Zhai. Topic sentiment mixture: Modeling facets and opinions in weblogs. In International Conference on World Wide Web, pages 171–180, 2007.
  • [15] A. Mnih and R. Salakhutdinov. Probabilistic matrix factorization. In Neural Information Processing Systems Conference, pages 1257–1264, 2007.
  • [16] I. Porteous, E. Bart, and M. Welling. Multi-HDP: A non parametric Bayesian model for tensor factorization. In D. Fox and C. Gomes, editors, Conference on Artificial Intelligence, pages 1487–1490, 2008.
  • [17] R. Salakhutdinov and A. Mnih. Bayesian probabilistic matrix factorization using markov chain monte carlo. In W. Cohen, A. McCallum, and S. Roweis, editors, International Conference on Machine Learning, volume 307, pages 880–887. ACM, 2008.
  • [18] X. Su and T. M. Khoshgoftaar. A survey of collaborative filtering techniques. Advances in Artificial Intelligence, 2009 4:2, Jan. 2009.
  • [19] C. Tan, E. H. Chi, D. Huffaker, G. Kossinets, and A. J. Smola. Instant foodie: Predicting expert ratings from grassroots. In Conference on Information and Knowledge Management, 2013.
  • [20] I. Titov and R. Mcdonald. A Joint Model of Text and Aspect Ratings for Sentiment Summarization. In Annual Meeting of the Association for Computational Linguistics, pages 308–316, Columbus, Ohio, 2008.
  • [22] C. Wang and D. M. Blei. Collaborative Topic Modeling for Recommending Scientific Articles. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 448–456, 2011.
  • [23] H. Wang, Y. Lu, and C. Zhai. Latent aspect rating analysis without aspect keyword supervision. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 618–626, 2011.
  • [24] M. Weimer, A. Karatzoglou, Q. Le, and A. J. Smola. Cofi rank - maximum margin matrix factorization for collaborative ranking. In J. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20, 2008.
  • [25] S.-H. Yang, B. Long, A. Smola, H. Zha, and Z. Zheng. Collaborative competitive filtering: learning recommender using context of user choice. In W.-Y. Ma, J.-Y. Nie, R. A. Baeza-Yates, T.-S. Chua, and W. B. Croft, editors, Research and Development in Information Retrieval, pages 295–304. ACM, 2011.
  • [26] X. Zhao, J. Jiang, H. Yan, and X. Li. Jointly modeling aspects and opinions with a MaxEnt-LDA hybrid. In Conference on Empirical Methods in Natural Language Processing, pages 56–65, 2010.