Understanding the Role of Style in E-commerce Shopping

In Proceedings of KDD 2019, pp. 3112–3120.

DOI: https://doi.org/10.1145/3292500.3330760

Abstract:

Aesthetic style is the crux of many purchasing decisions. When considering an item for purchase, buyers need to be aligned not only with the functional aspects of an item's specification (e.g. description, category, ratings) but also with its stylistic and aesthetic aspects (e.g. modern, classical, retro). Style becomes increasingly i…

Introduction
  • A buyer’s purchase preference is often shaped by a myriad of factors. These factors vary from attributes that have concrete values such as item specification, description, ratings, and cost, to more subjective characteristics such as aesthetics and style.
  • A search on Etsy for "large canvas tote bag" in the "Bags & Purses" category, filtered to "under $25" and "ships to the U.S.", still matches around 2,500 items.
  • This example illustrates that even after satisfying a number of concrete item specifications, the buyer is still confronted with a large number of relevant items, among which style is a major differentiating factor.
  • Recommender systems should therefore explicitly model style preference in order to holistically understand a buyer’s purchase preference.
Highlights
  • A buyer’s purchase preference is often shaped by a myriad of factors
  • We propose a novel method that blends the best of both worlds: we leverage domain expertise to extract latent style embeddings in a supervised manner that preserves the interpretability of the representation
  • We evaluate our style model by deploying the learned embeddings as a candidate set in a production listing-to-listing recommender system and find that the new method produces recommendations that are more similar in style
  • We discuss two applications of the style model described in the previous section: (1) to predict the style of all items on Etsy, and (2) to extract mid-level style-aware representations that are useful for downstream recommender system tasks
  • In this paper we proposed a style predictive model to better understand the influence of style on our listing inventory
  • We leverage 43 named styles provided by merchandising experts to set up a supervised, style-aware deep neural network model for predicting a listing's style
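The supervised style classifier described above can be sketched in miniature. The following is a pure-Python illustration, not the authors' implementation: the feature sizes, random weights, and simple concatenation-based fusion are toy stand-ins; only the 43-way softmax output over named styles comes from the paper.

```python
import math
import random

NUM_STYLES = 43  # named styles curated by merchandising experts

def softmax(logits):
    """Turn raw scores into a probability distribution over style classes."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_style(image_feats, text_feats, weights, biases):
    """Toy late-fusion head: concatenate the two modality feature vectors,
    apply one linear layer, and return a 43-dim probability vector."""
    fused = image_feats + text_feats  # simple list concatenation
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)

random.seed(0)
dim = 8  # toy feature size; the real model uses far larger embeddings
img = [random.random() for _ in range(dim)]
txt = [random.random() for _ in range(dim)]
W = [[random.gauss(0, 0.1) for _ in range(2 * dim)] for _ in range(NUM_STYLES)]
b = [0.0] * NUM_STYLES

probs = predict_style(img, txt, W, b)
```

The output vector sums to one, so it can be read directly as a predicted probability over the 43 style classes.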
Methods
  • Experiments and applications: The authors discuss two applications of the style model described in the previous section: (1) predicting the style of all items on Etsy, and (2) extracting mid-level style-aware representations that are useful for downstream recommender system tasks.
  • The authors formulate the style prediction problem as a supervised, multiclass classification problem, in which the end goal is to classify each item into one of the 43 style classes.
  • To perform this classification, the authors use the multi-modal neural network described in Section 3.2.3, and use the final 43-dimensional softmax output as the predicted probability class vector.
  • The authors can use the 43-dimensional output to understand the mixture of different styles present in an item, as an item can often embody the characteristics of more than one style
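Since an item can embody several styles at once, the softmax output can be read as a style mixture rather than a single label. A toy sketch of that reading follows; the style names (beyond "retro", "modern", and "classical" named in the abstract) and the probabilities are hypothetical:

```python
# Hypothetical predicted probability vector over a handful of styles.
style_probs = {
    "retro": 0.41,
    "modern": 0.27,
    "classical": 0.18,
    "boho": 0.09,
    "farmhouse": 0.05,
}

def top_styles(probs, k=2):
    """Return the k most probable styles: the item's dominant style mixture."""
    return sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]

print(top_styles(style_probs))  # the two styles this item most embodies
```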
Conclusion
  • Discussion and conclusion: Etsy is a global, community-based marketplace where people come together to buy and sell unique items.
  • In this paper the authors proposed a style predictive model to better understand the influence of style on the listing inventory.
  • With the help of these embeddings, the authors performed the first-ever large-scale analysis of how aesthetic styles impact e-commerce purchase behavior.
  • This gave them insights into how recommender and search systems can better leverage the predicted styles of a listing to match user preferences.
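Deploying the learned embeddings as a candidate set, as the highlights describe, amounts to nearest-neighbor retrieval in the embedding space. The brute-force sketch below uses made-up listing ids and tiny 2-d embeddings; a production system would use an approximate index such as FAISS [15] rather than an exhaustive scan.

```python
import math

def cos(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

# Toy catalog of style embeddings keyed by listing id (hypothetical data).
catalog = {
    "tote_1": [0.9, 0.1],
    "tote_2": [0.8, 0.3],
    "lamp_1": [0.1, 0.9],
}

def candidates(query_id, k=2):
    """Rank all other listings by style similarity to the query listing."""
    q = catalog[query_id]
    others = [(lid, cos(q, emb))
              for lid, emb in catalog.items() if lid != query_id]
    return [lid for lid, _ in
            sorted(others, key=lambda t: t[1], reverse=True)[:k]]

print(candidates("tote_1"))  # most style-similar listings first
```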
Tables
  • Table 1: Accuracy from retraining the N last layers
  • Table 2: Accuracy results from training different style classification models
  • Table 3: Cosine similarity between pairs of listings from different (or the same) shop and taxonomy
  • Table 4: Cosine similarity between pairs of listings purchased or favorited by different (or the same) users. This confirms our intuition that our style embeddings capture the stylistically similar patterns in users’ favoriting and purchasing behavior
  • Table 5: Favorite-count and purchase-count regression model estimation and testing results
Related work
  • The concept of visual style has been the subject of much discussion in past literature and touches on several different bodies of work, including visual and style-aware recommender systems, as well as models that focus on predicting styles or other related visual cues.

    Before discussing visual style, we first describe general, visually-aware systems in which learned models detect or segment objects that are present in an image or scene [3, 7, 31]. Earlier work in this field made use of hand-crafted features and focused on sliding-window classification [4, 5, 29], while more recent work leverages the power of CNNs and R-CNNs [6, 7, 8, 21, 37]. There have also been efforts to fuse image and text signals to achieve useful mid-level semantic representations and higher performance [2, 10, 18, 19, 20].

    In recent years, many recommender systems have come to leverage image-recognition methods to improve recommender systems by finding visually similar items to aid in product discovery [14, 34, 36] or to understand user visual preferences to better personalize recommended items [13, 23]. This body of work is more concerned with identifying the identity of the object present, as opposed to understanding the style of the object. Of greater relevance to our proposed work are those visual recommender systems that have been developed for the online fashion industry, where explicit clothing items (e.g. blouse, pants, hat) are “parsed” [33], and visual attributes are extracted from each item in order to perform such tasks as judging clothing compatibility and fashionability [11, 12, 16, 27].
References
  • [1] R. S. Arora and A. Elgammal. 2012. Towards automated classification of fine-art painting style: A comparative study. In ICPR.
  • [2] E. Bruni, N.-K. Tran, and M. Baroni. 2014. Multimodal distributional semantics. JAIR.
  • [3] J. Dai, K. He, and J. Sun. 2016. Instance-aware semantic segmentation via multi-task network cascades. In CVPR.
  • [4] N. Dalal and B. Triggs. 2005. Histograms of oriented gradients for human detection. In CVPR.
  • [5] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan. 2010. Object detection with discriminatively trained part-based models. TPAMI.
  • [6] R. Girshick. 2015. Fast R-CNN. In ICCV.
  • [7] R. Girshick, J. Donahue, T. Darrell, and J. Malik. 2014. Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR.
  • [8] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep Residual Learning for Image Recognition. In CVPR.
  • [9] Ruining He, Chen Fang, Zhaowen Wang, and Julian McAuley. 2016. Vista: A Visually, Socially, and Temporally-aware Model for Artistic Recommendation. In RecSys.
  • [10] Felix Hill and Anna Korhonen. 2014. Learning Abstract Concept Embeddings from Multi-Modal Data: Since You Probably Can’t See What I Mean. In EMNLP.
  • [11] W. Hsiao and K. Grauman. 2017. Learning the Latent "Look": Unsupervised Discovery of a Style-Coherent Embedding from Fashion Images. In ICCV.
  • [12] Wei-Lin Hsiao and Kristen Grauman. 2018. Creating Capsule Wardrobes from Fashion Images. In CVPR.
  • [13] Diane J. Hu, Rob Hall, and Josh Attenberg. 2014. Style in the Long Tail: Discovering Unique Interests with Latent Variable Models in Large Scale Social E-commerce. In KDD.
  • [14] Yushi Jing, David C. Liu, Dmitry Kislyuk, Andrew Zhai, Jiajing Xu, Jeff Donahue, and Sarah Tavel. 2015. Visual Search at Pinterest. In KDD.
  • [15] Jeff Johnson, Matthijs Douze, and Hervé Jégou. 2017. Billion-scale similarity search with GPUs. arXiv preprint arXiv:1702.08734.
  • [16] Wang-Cheng Kang, Chen Fang, Zhaowen Wang, and Julian McAuley. 2017. Visually-Aware Fashion Recommendation and Design with Generative Image Models. In ICDM.
  • [17] Sergey Karayev, Aaron Hertzmann, Holger Winnemoeller, Aseem Agarwala, and Trevor Darrell. 2013. Recognizing Image Style. CoRR abs/1311.3715.
  • [18] Douwe Kiela and Léon Bottou. 2014. Learning Image Embeddings using Convolutional Neural Networks for Improved Multi-Modal Semantics. In EMNLP.
  • [19] Ryan Kiros, Ruslan Salakhutdinov, and Richard Zemel. 2014. Multimodal Neural Language Models. In ICML.
  • [20] Ryan Kiros, Ruslan Salakhutdinov, and Richard S. Zemel. 2014. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models. CoRR abs/1411.2539.
  • [21] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2017. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 60, 6.
  • [22] Xin Lu, Zhe Lin, Xiaohui Shen, Radomir Mech, and James Z. Wang. 2015. Deep Multi-Patch Aggregation Network for Image Style, Aesthetics, and Quality Estimation. In ICCV.
  • [23] Julian McAuley, Christopher Targett, Qinfeng Shi, and Anton van den Hengel. 2015. Image-Based Recommendations on Styles and Substitutes. In SIGIR.
  • [24] N. Murray, L. Marchesotti, and F. Perronnin. 2012. AVA: A large-scale database for aesthetic visual analysis. In CVPR.
  • [25] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. 2017. Automatic differentiation in PyTorch. In NIPS-W.
  • [26] Qinfeng Shi, James Petterson, Gideon Dror, John Langford, Alexander J. Smola, and S. V. N. Vishwanathan. 2009. Hash Kernels for Structured Data. JMLR.
  • [27] E. Simo-Serra, S. Fidler, F. Moreno-Noguer, and R. Urtasun. 2015. Neuroaesthetics in fashion: Modeling the perception of fashionability. In CVPR, 869–877.
  • [28] Andreas Veit, Balazs Kovacs, Sean Bell, Julian McAuley, Kavita Bala, and Serge J. Belongie. 2015. Learning Visual Clothing Style with Heterogeneous Dyadic Co-occurrences. In ICCV.
  • [29] P. Viola and M. Jones. 2001. Rapid object detection using a boosted cascade of simple features. In CVPR.
  • [30] Kilian Q. Weinberger, Anirban Dasgupta, Josh Attenberg, John Langford, and Alexander J. Smola. 2009. Feature Hashing for Large Scale Multitask Learning. CoRR.
  • [31] J. Wu, Y. Yu, C. Huang, and K. Yu. 2015. Deep multiple instance learning for image classification and auto-annotation. In CVPR.
  • [32] Zhe Xu, Dacheng Tao, Ya Zhang, Junjie Wu, and Ah Chung Tsoi. 2014. Architectural Style Classification Using Multinomial Latent Logistic Regression. In ECCV, 600–615.
  • [33] K. Yamaguchi, M. H. Kiapour, L. E. Ortiz, and T. L. Berg. 2012. Parsing clothing in fashion photographs. In CVPR, 3570–3577.
  • [34] Fan Yang, Ajinkya Kale, Yury Bubnov, Leon Stein, Qiaosong Wang, M. Hadi Kiapour, and Robinson Piramuthu. 2017. Visual Search at eBay. In KDD.
  • [35] Nick Zangwill. 2019. Aesthetic Judgement. In The Stanford Encyclopedia of Philosophy.
  • [36] Andrew Zhai, Dmitry Kislyuk, Yushi Jing, Michael Feng, Eric Tzeng, Jeff Donahue, Yue Li Du, and Trevor Darrell. 2017. Visual Discovery at Pinterest. In WWW.
  • [37] Liliang Zhang, Liang Lin, Xiaodan Liang, and Kaiming He. 2016. Is Faster R-CNN doing well for pedestrian detection? In ECCV, 443–457.
  • [38] Xiaoting Zhao, Raphael Louca, Diane Hu, and Liangjie Hong. 2018. Learning Item-Interaction Embeddings for User Recommendations. CoRR.