For a user browsing a particular product, two useful notions of relevant recommendations include substitutes and complements: products that can be purchased instead of each other, and products that can be purchased in addition to each other
Inferring Networks of Substitutable and Complementary Products
ACM Knowledge Discovery and Data Mining (2015): 785-794
To design a useful recommender system, it is important to understand how products relate to each other. For example, while a user is browsing mobile phones, it might make sense to recommend other phones, but once they buy a phone, we might instead want to recommend batteries, cases, or chargers. In economics, these two types of recommenda...
- Recommender systems are ubiquitous in applications ranging from e-commerce to social media, video, and online news platforms.
- When a user in an online store is examining t-shirts, she should receive recommendations for similar t-shirts, or otherwise jeans, sweatshirts, and socks, rather than a movie, even though she may very well be interested in it.
- From these relationships the authors can construct a product graph, where nodes represent products, and edges represent various types of product relationships.
- LDA associates each document d in a corpus T with a K-dimensional topic distribution θd, which encodes the fraction of words in d that discuss each of the K topics.
- Each topic k has an associated word distribution, φk, which encodes the probability that a particular word is used for that topic.
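The roles of θd and φk can be illustrated with a small sketch. Under LDA, the probability of a word in document d is a mixture over topics: the sum over k of θd[k] times φk[word]. The toy vocabulary, the two topics, and the function name `doc_log_likelihood` below are assumptions made for illustration; they are not taken from the paper.

```python
import math

def doc_log_likelihood(words, theta_d, phi):
    """Log-probability of a document's words given its topic distribution.

    theta_d: list of K topic proportions for document d (sums to 1).
    phi:     list of K dicts, each mapping word -> probability under topic k.
    Assumes every word has non-zero probability under the mixture.
    """
    total = 0.0
    for w in words:
        # Marginalise over the topic that generated this word.
        p_w = sum(t * phi_k.get(w, 0.0) for t, phi_k in zip(theta_d, phi))
        total += math.log(p_w)
    return total

# Hypothetical toy model with K = 2 topics.
phi = [
    {"battery": 0.5, "charger": 0.4, "cotton": 0.1},    # an "electronics" topic
    {"battery": 0.05, "charger": 0.05, "cotton": 0.9},  # a "clothing" topic
]
theta_d = [0.8, 0.2]            # document d is mostly about electronics
words = ["battery", "charger"]

print(doc_log_likelihood(words, theta_d, phi))
```

In this toy setting the word "battery" has probability 0.8 × 0.5 + 0.2 × 0.05 = 0.41 under the document's mixture, which is what the inner sum computes.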
- While most recommender systems focus on analyzing patterns of interest in products to provide personalized recommendations [14, 30, 34, 36], another important problem is to understand relationships between products, in order to surface recommendations that are relevant to a given context [17, 35]
- Despite the importance of understanding relationships between products, there are several interesting questions that make the problem of building product graphs challenging: What are the common types of relationships we might want to discover? What data will allow us to reliably discover relationships between products? How do we model the semantics of why certain products are related? For example, the semantics of why a given t-shirt might be related to a particular pair of jeans are intricate and can only be captured by a highly flexible model.
- We propose a variant of a supervised topic model that identifies topics that are useful as features for link prediction
- Results are shown in Table 3 for each of the datasets in Table 2.
- Prediction of ‘substitute’ links is uniformly more accurate than ‘complement’ links for all methods, both in absolute and relative terms
- This matches the intuition that substitute links should be ‘easier’ to predict, as they essentially correspond to some notion of similarity, whereas the semantics of complements are more subtle
- A useful recommender system must produce recommendations that not only match the user's preferences but are also relevant to the user's current topic of interest.
- The authors have presented Sceptre, a model for predicting and understanding relationships between linked products.
- The authors have applied this to the problem of identifying substitutable and complementary products on a large collection of Amazon data, including 144 million reviews and 237 million ground-truth relationships based on browsing and co-purchasing logs.
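The idea of using topics as features for link prediction can be sketched as a logistic model over pairs of topic vectors. This is a simplified illustration, not the authors' implementation: the pairwise feature construction (a bias term plus the element-wise product of the two topic vectors), the weights `beta`, and the toy topic vectors are all assumptions.

```python
import math

def pair_features(theta_i, theta_j):
    """Bias term followed by the element-wise product of two topic vectors."""
    return [1.0] + [a * b for a, b in zip(theta_i, theta_j)]

def link_probability(theta_i, theta_j, beta):
    """Logistic link predictor: sigmoid of the weighted pairwise features."""
    z = sum(w * x for w, x in zip(beta, pair_features(theta_i, theta_j)))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical topic vectors with K = 2 (topic 0: "phones", topic 1: "clothing").
theta_phone_a = [0.9, 0.1]
theta_phone_b = [0.8, 0.2]
theta_shirt = [0.1, 0.9]

# Hypothetical learned weights: negative bias, positive weight on shared topics.
beta = [-2.0, 6.0, 6.0]

p_same = link_probability(theta_phone_a, theta_phone_b, beta)
p_diff = link_probability(theta_phone_a, theta_shirt, beta)
print(p_same, p_diff)
```

With these made-up weights, two products that share a dominant topic score above 0.5 and products with unrelated topics score below it. In a supervised topic model, the topic vectors and the weights would be fitted jointly so that the discovered topics are the ones that help predict links.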
- Table 1: Notation
- Table 2: Dataset statistics for a selection of categories on Amazon
- Table 3: Link prediction accuracy for substitute and complement links (the former are not available for the majority of Music/Movies products in our dataset). Absolute performance is shown at left, reduction in error vs. random classification at right
- Table 4: Link prediction accuracy using cold-start data (manufacturer’s and editorial descriptions)
- Table 5: A selection of topics from Electronics and Men’s Clothing along with our labels for each topic. Top 10 words/bigrams from each topic are shown after subtracting the background distribution. Capital letters denote brand names (Bamboo, Wacom, Red Wing, etc.)
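The "reduction in error vs. random classification" reported alongside absolute accuracy can be read as follows. This is a minimal sketch that assumes a balanced test set, so that random classification is 50% accurate; the function name and the 0.5 default are assumptions, and the paper's exact computation may differ.

```python
def error_reduction(accuracy, baseline_accuracy=0.5):
    """Fraction of the baseline's error that a classifier eliminates.

    Assumes baseline_accuracy < 1. A value of 0 means no better than the
    baseline; a value of 1 means no errors at all.
    """
    error = 1.0 - accuracy
    baseline_error = 1.0 - baseline_accuracy
    return 1.0 - error / baseline_error

# A hypothetical classifier that is 90% accurate removes roughly 80% of the
# error made by a random classifier on a balanced test set.
print(error_reduction(0.9))
```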
- The basic task of a recommender system is to suggest relevant items to users, based on their opinions, context, and behavior. One component of this task is that of estimating users’ ratings or rankings of products, e.g. by matrix factorization or collaborative filtering. Our goal here is related but complementary to rating estimation, as we aim to discover relations between products.
In principle the types of relationships in which we are interested can be mined from behavioral data, such as browsing and co-purchasing logs. For example, Amazon allows users to navigate between products through links such as ‘users who bought X also bought Y’ and ‘users who viewed X also viewed Y’. Such a ‘co-counting’ solution, while simple, has a few shortcomings: for example, it may produce noisy recommendations for infrequently purchased products, and it has limited ability to explain the recommendations it provides. More sophisticated solutions have been proposed that make use of browsing and co-purchasing data, but in contrast to such ‘behavioral-based’ solutions our goal is to learn the semantics of ‘what makes products related?’ in order to generate new links, adapt to different notions of relatedness, and to understand and explain the features that cause humans to consider products to be related.
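A ‘co-counting’ baseline of the kind described above can be sketched in a few lines. This is an illustrative toy, not Amazon's system: the basket data, the function names `co_counts` and `also_bought`, and the ranking rule (raw co-occurrence counts) are assumptions.

```python
from collections import Counter
from itertools import combinations

def co_counts(baskets):
    """Count how often each unordered pair of products appears in one basket."""
    counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return counts

def also_bought(product, counts, top_n=3):
    """Rank other products by how often they were co-purchased with `product`."""
    scores = Counter()
    for (a, b), c in counts.items():
        if a == product:
            scores[b] += c
        elif b == product:
            scores[a] += c
    return [p for p, _ in scores.most_common(top_n)]

# Hypothetical purchase baskets.
baskets = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["phone", "charger"],
    ["phone", "case", "screen protector"],
    ["tripod", "lens cap"],
]

counts = co_counts(baskets)
print(also_bought("phone", counts))
print(also_bought("tripod", counts))
```

The recommendation for "tripod" rests on a single basket, which illustrates why raw co-counts become noisy for infrequently purchased products, and the counts alone say nothing about why two products are related.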
- Focuses on identifying two types of links between products: substitutes and complements
- Evaluates Sceptre in terms of its accuracy at link prediction and ranking, and finds it to be significantly more accurate than alternatives
- Finds that the most useful source of information to identify substitutes and complements is the text associated with each product, from which the authors are able to uncover the key features and relationships between products, and to explain these relationships through textual signals
- R. Balasubramanyan and W. Cohen. Block-LDA: Jointly modeling entity-annotated text and entity-entity links. In SDM, 2011.
- J. Bennett and S. Lanning. The Netflix prize. In KDD Cup and Workshop, 2007.
- D. Blei and J. McAuliffe. Supervised topic models. In NIPS, 2007.
- D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. JMLR, 2003.
- D. M. Blei, T. Griffiths, M. Jordan, and J. Tenenbaum. Hierarchical topic models and the nested Chinese restaurant process. In NIPS, 2003.
- S. Brody and N. Elhadad. An unsupervised aspect-sentiment model for online reviews. In ACL, 2010.
- J. Chang and D. Blei. Relational topic models for document networks. In AIStats, 2009.
- J. Chang, J. Boyd-Graber, and D. Blei. Connections between the lines: augmenting social networks with text. In KDD, 2009.
- C. Fellbaum. WordNet: An Electronic Lexical Database. Bradford Books, 1998.
- M. Gamon, A. Aue, S. Corston-Oliver, and E. Ringger. Pulse: Mining customer opinions from free text. In IDA, 2005.
- G. Ganu, N. Elhadad, and A. Marian. Beyond the stars: Improving rating predictions using review text content. In WebDB, 2009.
- S. Jagabathula, N. Mishra, and S. Gollapudi. Shopping for products you don’t know you need. In WSDM, 2011.
- B. Kanagal, A. Ahmed, S. Pandey, V. Josifovski, J. Yuan, and L. Garcia-Pueyo. Supercharging recommender systems using taxonomies for learning user purchase behavior. VLDB, 2012.
- Y. Koren and R. Bell. Advances in collaborative filtering. In Recommender Systems Handbook. Springer, 2011.
- Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. Computer, 2009.
- A. Levi, O. Mokryn, C. Diot, and N. Taft. Finding a needle in a haystack of reviews: cold start context-based hotel recommender system. In RecSys, 2012.
- G. Linden, B. Smith, and J. York. Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 2003.
- Y. Liu, A. Niculescu-Mizil, and W. Gryc. Topic-link LDA: joint models of topic and author community. In ICML, 2009.
- Y. Liu, W. Wang, B. Lévy, F. Sun, D.-M. Yan, L. Lu, and C. Yang. On centroidal Voronoi tessellation – energy smoothness and fast computation. ACM Trans. on Graphics, 2009.
- C. D. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008.
- A. Mas-Colell, M. Whinston, and J. Green. Microeconomic Theory. Oxford University Press, 1995.
- J. McAuley and J. Leskovec. Hidden factors and hidden topics: understanding rating dimensions with review text. In RecSys, 2013.
- S. Moghaddam and M. Ester. On the design of LDA models for aspect-based opinion mining. In CIKM, 2012.
- S. Moghaddam and M. Ester. The FLDA model for aspect-based opinion mining: Addressing the cold start problem. In WWW, 2013.
- J. Nocedal. Updating quasi-newton matrices with limited storage. Mathematics of Computation, 1980.
- D. Panigrahi and S. Gollapudi. Result enrichment in commerce search using browse trails. In WSDM, 2011.
- S.-T. Park and W. Chu. Pairwise preference regression for cold-start recommendation. In RecSys, 2009.
- A. Popescu and O. Etzioni. Extracting product features and opinions from reviews. In HLT, 2005.
- A. Reyes and P. Rosso. Mining subjective knowledge from customer reviews: A specific case of irony detection. In HLT, 2011.
- A. Schein, A. Popescul, L. Ungar, and D. Pennock. Methods and metrics for cold-start recommendations. In SIGIR, 2002.
- I. Titov and R. McDonald. A joint model of text and aspect ratings for sentiment summarization. In ACL, 2008.
- I. Titov and R. McDonald. Modeling online reviews with multi-grain topic models. In WWW, 2008.
- D. Vu, A. Asuncion, D. Hunter, and P. Smyth. Dynamic egocentric models for citation networks. In ICML, 2011.
- C. Wang and D. Blei. Collaborative topic modeling for recommending scientific articles. In KDD, 2011.
- J. Zheng, X. Wu, J. Niu, and A. Bolivar. Substitutes or complements: another step forward in recommendations. In EC, 2009.
- K. Zhou, S.-H. Yang, and H. Zha. Functional matrix factorizations for cold-start recommendation. In SIGIR, 2011.