Network-Based Feature Extraction Method for Fraud Detection via Label Propagation.

BigComp (2019)

Abstract
Using machine learning to detect fraud has become mainstream in the online lending industry. In the learning process, feature engineering is a key step that determines the performance of the final model. Traditional feature engineering relies mostly on the intrinsic features of the dataset, especially combinations of them. However, fraud has gradually become a group behavior. In a related network, nodes represent users and are divided into fraudulent and non-fraudulent nodes. Fraudulent nodes are sparsely linked to non-fraudulent nodes and densely linked to one another. This paper presents a method based on the Label Propagation Algorithm to extract network features from the related network. Given a related network containing known fraudulent users and the relationships between them, a personalized label propagation algorithm infers each unknown user's fraud probability, and this probability is treated as a network-based derived feature that increases the information entropy available to feature engineering. Additionally, we change the initialization of the transition probability matrix and the label distribution matrix to avoid the performance degradation of the label propagation algorithm caused by the imbalanced distribution of fraud data. In tests on real datasets, a precision of 17% in detecting fraudsters was achieved using only the network feature.
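To make the idea of a propagated fraud score concrete, the following is a minimal sketch of standard clamped label propagation on a small weighted graph. It is not the paper's personalized variant: the function name `propagate_fraud_scores`, the `prior` used to initialize unlabeled nodes, and the toy graph are all assumptions made for illustration, and the paper's modified initialization of the transition and label matrices is not reproduced here.

```python
def propagate_fraud_scores(edges, known_labels, prior=0.5, n_iter=100, tol=1e-9):
    """Clamped label propagation on an undirected weighted graph.

    edges: iterable of (u, v, weight) with positive weights (hypothetical input format)
    known_labels: dict mapping node -> 1.0 (fraud) or 0.0 (non-fraud)
    Returns a dict mapping every node that appears in edges to a score in [0, 1].
    """
    # Build a symmetric adjacency map: node -> {neighbor: weight}.
    neighbors = {}
    for u, v, w in edges:
        neighbors.setdefault(u, {})[v] = w
        neighbors.setdefault(v, {})[u] = w

    # Labeled nodes start at their label; unlabeled nodes start at the prior.
    scores = {n: known_labels.get(n, prior) for n in neighbors}

    for _ in range(n_iter):
        new_scores = {}
        max_change = 0.0
        for node, nbrs in neighbors.items():
            if node in known_labels:
                # Clamp: known labels are never overwritten.
                new_scores[node] = known_labels[node]
                continue
            # Unlabeled node: weighted average of its neighbors' current scores.
            total = sum(nbrs.values())
            value = sum(w * scores[m] for m, w in nbrs.items()) / total
            max_change = max(max_change, abs(value - scores[node]))
            new_scores[node] = value
        scores = new_scores
        if max_change < tol:
            break
    return scores


# Toy graph: "a" and "b" are known fraudsters, "d" is known non-fraudulent,
# "c" and "e" are unlabeled.
toy_edges = [("a", "b", 1.0), ("a", "c", 1.0), ("b", "c", 1.0),
             ("c", "e", 1.0), ("e", "d", 1.0)]
toy_scores = propagate_fraud_scores(toy_edges, {"a": 1.0, "b": 1.0, "d": 0.0})
```

In this toy graph, node "c" sits next to both fraudsters and ends up with a higher score than "e", which also borders a non-fraudulent node. The resulting score per user could then be appended as one extra column to a conventional feature table, which is the role the abstract assigns to the network feature.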
Keywords
Feature extraction, Data mining, Credit cards, IP networks, Neural networks, Machine learning, Industries