Downstream Task-Oriented Generative Model Selections on Synthetic Data Training for Fraud Detection Models
CoRR (2024)
Abstract
Devising procedures for downstream task-oriented generative model selection
is an unresolved problem of practical importance. Existing studies have
focused on the utility of a single family of generative models, offering
limited insight into how synthetic data practitioners should select the best
family of generative models for a synthetic training task given a specific
combination of machine learning model class and performance metric. In this
paper, we approach the downstream task-oriented generative model selection
problem in the case of training fraud detection models, and investigate best
practices under different combinations of model interpretability and model
performance constraints. Our investigation shows that, while both Neural
Network (NN)-based and Bayesian Network (BN)-based generative models perform
well on the synthetic training task under a loose model interpretability
constraint, BN-based generative models outperform NN-based ones when fraud
detection models are trained on synthetic data under a strict model
interpretability constraint. Our results provide practical guidance for
machine learning practitioners interested in replacing their real training
datasets with synthetic ones, and shed light on more general downstream
task-oriented generative model selection problems.