Realistic Data Synthesis Using Enhanced Generative Adversarial Networks

2019 IEEE Second International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), 2019

Abstract
Real data subject to privacy and confidentiality concerns are often unavailable or too expensive to obtain in terms of both time and money. In such situations, synthetic data are a good alternative. The objective of this research is to generate realistic synthetic data that people can use freely. We propose a synthetic data generation model based on boundary-seeking generative adversarial networks (BGANs), designated medical BGAN (medBGAN), and compare its performance with that of an existing method, medical GAN (medGAN). We perform the investigation on several datasets in two different domains: electronic health records (EHRs) in the medical domain and a crime dataset from the City of Los Angeles Police Department. First, we train the models and generate synthetic data with the trained models. We then analyze and compare the models' performance using two statistical methods (dimension-wise average and the Kolmogorov-Smirnov test) and two machine learning tasks (association rule mining and prediction). The comprehensive analysis in this study shows that the proposed model is more efficient than medGAN at generating realistic synthetic data.
Keywords
electronic health records,synthetic data generation,data synthesis,generative adversarial networks,boundary seeking GANs
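
The two statistical checks named in the abstract (dimension-wise average and the Kolmogorov-Smirnov test) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the randomly generated binary matrices standing in for real and synthetic records, and the choice to apply the KS statistic to the per-feature averages are all assumptions made for illustration only.

```python
import random
from bisect import bisect_right


def dimension_wise_average(rows):
    """Mean of each column (feature) over all records."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


# Hypothetical stand-ins for a real dataset and a generator's output:
# binary records with 10 features, drawn from the same feature probabilities.
random.seed(0)
probs = [random.uniform(0.05, 0.5) for _ in range(10)]
real = [[int(random.random() < p) for p in probs] for _ in range(1000)]
synthetic = [[int(random.random() < p) for p in probs] for _ in range(1000)]

real_means = dimension_wise_average(real)
synth_means = dimension_wise_average(synthetic)

# Assumption: compare the two sets of per-feature averages with the KS statistic.
ks = ks_statistic(real_means, synth_means)

for j, (r, s) in enumerate(zip(real_means, synth_means)):
    print(f"feature {j}: real={r:.3f} synthetic={s:.3f}")
print(f"KS statistic on per-feature averages: {ks:.3f}")
```

Per-feature averages that lie close together, and a small KS statistic, would suggest that the synthetic records reproduce the marginal feature frequencies of the real data; neither check says anything about correlations between features, which is presumably why the abstract also lists association rule mining and prediction.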