TFNet: Multi-Semantic Feature Interaction for CTR Prediction

Feng Yu, Xueli Yu, Jie Shao, Fan Huang

SIGIR '20: The 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, China, July 2020, pp. 1885-1888.

DOI: https://doi.org/10.1145/3397271.3401304

Abstract:

CTR (Click-Through Rate) prediction plays a central role in computational advertising and recommender systems. Several kinds of methods have been proposed in this field, such as Logistic Regression (LR), Factorization Machines (FM), and deep learning based methods like Wide&Deep, Neural Factorization Machines (NFM), and others.

Introduction
  • CTR prediction plays a central role in computational advertising and recommender systems, where whether an item is recommended is decided by the estimated probability that the user will click on it.
  • In ads recommender systems, it is reasonable that the interactions of different feature pairs lie in different semantic spaces: one pair may capture the effect of the user's preference on the ad, while another represents the effect of the cost paid by the advertiser on this ad.
  • Learning such feature interactions via a simple vector product in just one semantic space is clearly insufficient.
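The contrast above can be sketched numerically: a plain inner product collapses a feature pair into one scalar, while a tensor-based interaction with m slices produces one score per semantic space. The shapes and the einsum form below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

# Illustrative sketch (not the paper's exact model): embeddings v_i, v_j
# of dimension d interact through an operating tensor T with m slices,
# one d x d bilinear form per semantic space.
rng = np.random.default_rng(0)
d, m = 8, 4                          # embedding size, number of tensor slices

v_i = rng.standard_normal(d)         # embedding of feature i (e.g. user)
v_j = rng.standard_normal(d)         # embedding of feature j (e.g. ad)
T = rng.standard_normal((m, d, d))   # operating tensor: one slice per space

single_space = float(v_i @ v_j)                      # plain vector product: 1 scalar
multi_space = np.einsum("i,kij,j->k", v_i, T, v_j)   # v_i^T T_k v_j for each slice k

print(multi_space.shape)  # (4,): one interaction score per semantic space
```

The einsum computes `v_i^T T_k v_j` for every slice `k` at once; stacking the m scores gives the model m views of the same feature pair instead of one.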
Highlights
  • CTR prediction plays a central role in computational advertising and recommender systems, where whether an item is recommended is decided by the estimated probability that the user will click on it.
  • Extensive offline and online experiments show that TFNet outperforms competitive compared methods on the standard Criteo and Avazu datasets, and achieves large improvements in revenue and click rate in online A/B tests in one of the largest Chinese App recommender systems.
  • We first perform an ablation analysis of the higher-order feature interaction part of TFNet, which consists of the higher-order interactions of the embedding vectors and the tensor-based interactive features.
  • We propose a tensor-based feature interaction model, TFNet, which can learn feature interactions in different semantic spaces.
  • Extensive offline experiments demonstrate that the TFNet model significantly outperforms existing models and achieves state-of-the-art results.
Methods
  • Several representative methods are used for empirical comparison: (i) FM [9], (ii) Wide&Deep [2], (iii) DeepFM [3], (iv) NFM [4], and (v) AFM [13].
  • To make a fair comparison, the number of parameters of the proposed TFNet model is set to be approximately equal to that of most compared models.
  • The Criteo dataset is available at https://s3-eu-west-1.amazonaws.com/criteo-labs/dac.tar.gz and the Avazu dataset at https://www.kaggle.com/c/avazu-ctr-prediction/data.
  • For the proposed TFNet model, the network structure of the two hidden layers Hl1, Hl2 is 512-512, d = 45, and m = 4 and 6 for the Criteo and Avazu datasets, respectively.
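For context on baseline (i), FM's second-order term is usually computed with the O(kd) factorization trick rather than by enumerating all pairs. A minimal sketch, checked against the brute-force pairwise definition (names and shapes here are illustrative, not from the paper):

```python
import numpy as np

# FM second-order term via the linear-time identity:
# sum_{i<j} <v_i, v_j> x_i x_j
#   = 0.5 * sum_f ( (sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2 )
def fm_second_order(x, V):
    """x: (n,) feature values; V: (n, k) latent factor matrix."""
    s1 = (V * x[:, None]).sum(axis=0)                 # sum_i v_i x_i, shape (k,)
    s2 = ((V ** 2) * (x ** 2)[:, None]).sum(axis=0)   # sum_i v_i^2 x_i^2
    return 0.5 * float((s1 ** 2 - s2).sum())

# Brute-force check against the pairwise definition.
rng = np.random.default_rng(1)
x = rng.standard_normal(5)
V = rng.standard_normal((5, 3))
brute = sum(float(V[i] @ V[j]) * x[i] * x[j]
            for i in range(5) for j in range(i + 1, 5))
print(abs(fm_second_order(x, V) - brute) < 1e-9)  # True
```

Note that every pair shares the single space spanned by V, which is exactly the limitation the multi-semantic tensor interaction in TFNet is designed to lift.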
Results
  • The authors first perform an ablation analysis of the higher-order feature interaction part of TFNet, which consists of the higher-order interactions of the embedding vectors and the tensor-based interactive features.
  • The analysis verifies whether the higher-order part can complement the tensor-based one in TFNet. Comparing TFNet– and TFNet in the bottom part of Table 1, higher-order interactions obtain an extra 0.4% relative AUC improvement, which demonstrates that tensor-based and higher-order feature interactions are mutually complementary.
  • The proposed TFNet model gains a prominent relative AUC improvement of around 2% over the compared models on both datasets.
  • This verifies the effectiveness of the proposed method.
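For readers unfamiliar with the RI-AUC figures quoted above, a small helper illustrates one common definition of relative improvement; the paper's Table 1 may normalize differently, and the numbers below are made up for illustration:

```python
# Hypothetical helper: relative AUC improvement as plain relative change,
# in percent. The AUC values used here are illustrative, not from the paper.
def relative_auc_improvement(auc_model, auc_base):
    return 100.0 * (auc_model - auc_base) / auc_base

# e.g. 0.8099 vs a baseline of 0.7940 is roughly a 2% relative improvement
print(round(relative_auc_improvement(0.8099, 0.7940), 2))  # 2.0
```

Because AUC differences between strong CTR models are small in absolute terms, relative improvements of 0.4% or 2% are considered substantial in this setting.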
Conclusion
  • In this work, the authors propose a tensor-based feature interaction model, TFNet, which can learn feature interactions in different semantic spaces.
  • The authors are also applying this model to ranking on the Chinese mainstream short-video platform WeSee (https://www.weishi.com), where the scene is a sliding-play style instead of a click one, making sample modeling more complicated and challenging.
  • It is in the offline verification stage at present, and online evaluation will be conducted later.
Tables
  • Table 1: Offline experimental results of compared methods on the Criteo and Avazu datasets. RI-AUC is the relative improvement in AUC.
  • Table 2: The impact of d (the dimension of embeddings) and m (the number of slices of the operating tensor) of the TFNet model on the Criteo and Avazu datasets.
Funding
  • ∗The first two authors contributed equally to this work. †This work is supported by National Key Research and Development Program (2018YFB1402600)
Reference
  • Patrick P. K. Chan, Xian Hu, et al. 2018. Convolutional Neural Networks based Click-Through Rate Prediction with Multiple Feature Sequences. In IJCAI.
  • Heng-Tze Cheng et al. 2016. Wide & Deep Learning for Recommender Systems. In DLRS.
  • Huifeng Guo, Ruiming Tang, et al. 2017. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction. In IJCAI.
  • Xiangnan He and Tat-Seng Chua. 2017. Neural Factorization Machines for Sparse Predictive Analytics. In SIGIR.
  • Zekun Li, Zeyu Cui, Shu Wu, Xiaoyu Zhang, and Liang Wang. 2019. Fi-GNN: Modeling Feature Interactions via Graph Neural Networks for CTR Prediction. In CIKM.
  • Qiang Liu, Shu Wu, and Liang Wang. 2015. COT: Contextual Operating Tensor for Context-Aware Recommender Systems. In AAAI.
  • Qiang Liu, Feng Yu, Shu Wu, and Liang Wang. 2015. A Convolutional Click Prediction Model. In CIKM.
  • Yanru Qu, Bohui Fang, et al. 2018. Product-Based Neural Networks for User Response Prediction over Multi-Field Categorical Data. TOIS 37, 1 (2018).
  • S. Rendle. 2012. Factorization Machines with libFM. ACM TIST 3, 57 (2012).
  • Shu Wu, Qiang Liu, Liang Wang, and Tieniu Tan. 2016. Contextual Operation for Recommender Systems. IEEE TKDE 28, 8 (2016).
  • Richard Socher et al. 2013. Recursive Deep Models for Semantic Compositionality over a Sentiment Treebank. In EMNLP.
  • Yulong Wang, Hang Su, et al. 2018. Interpret Neural Networks by Identifying Critical Data Routing Paths. In CVPR.
  • Jun Xiao, Hao Ye, et al. 2017. Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks. In IJCAI.