Predicting User Purchase in E-commerce by Comprehensive Feature Engineering and Decision Boundary Focused Under-Sampling.
RecSys (2015)
Abstract
The goal of the RecSys Challenge 2015 [2] is (1) to predict which users will end up making a purchase and, if so, (2) to predict which items they will buy, given click and purchase data provided by YOOCHOOSE. The challenge is hard because (1) the data contains no user demographic information and has many missing values, and (2) the dataset is massive, with about 33 million clicks and 1 million purchase records, and the class distribution (the ratio of non-purchased clicks to purchased clicks) is highly imbalanced. To address these problems efficiently, we propose (1) a Comprehensive Feature Engineering method (CFE), which includes imputation of missing values to compensate for the lack of information, and (2) a Decision Boundary Focused Under-Sampling method (DBFUS), which copes with the class imbalance problem and reduces learning time and memory usage. Our proposed approach obtained 54403.6 points on the final leaderboard.
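The abstract names DBFUS but does not spell out its procedure. The following is only a generic sketch of the idea the name suggests, not the authors' algorithm: assuming a preliminary model has already assigned each session a purchase score, keep every minority (purchase) row and only those majority (non-purchase) rows whose score lies closest to an assumed decision threshold. The function name `boundary_focused_undersample`, the `threshold` of 0.5, and the toy data are all hypothetical.

```python
def boundary_focused_undersample(scores, labels, n_keep, threshold=0.5):
    """Return sorted indices of all positives plus the n_keep negatives
    whose preliminary score lies closest to the threshold.

    Illustrative assumption only; the paper's DBFUS may differ.
    """
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    # Rank majority-class rows by distance to the assumed decision boundary.
    negatives.sort(key=lambda i: abs(scores[i] - threshold))
    return sorted(positives + negatives[:n_keep])


# Toy data: hypothetical preliminary scores for six sessions, two of them buyers.
scores = [0.05, 0.45, 0.90, 0.60, 0.10, 0.55]
labels = [0, 0, 1, 0, 0, 1]
kept = boundary_focused_undersample(scores, labels, n_keep=2)
```

On this toy input, the two buyers (indices 2 and 5) are kept together with the two non-buyers nearest the threshold (indices 1 and 3), so the easy negatives far from the boundary are dropped and the training set shrinks.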