How to Simulate Realistic Survival Data? A Simulation Study to Compare Realistic Simulation Models
arXiv (Cornell University) (2023)
Abstract
In statistics, it is important to have realistic data sets available for a
particular context to allow an appropriate and objective method comparison. For
many use cases, benchmark data sets for method comparison are already available
online. However, in most medical applications and especially for clinical
trials in oncology, there is a lack of adequate benchmark data sets, as patient
data can be sensitive and therefore cannot be published. Simulation studies are
a potential solution. However, it is sometimes not clear which
simulation models are suitable for generating realistic data. A challenge is
that potentially unrealistic assumptions have to be made about the
distributions. Our approach is to use reconstructed benchmark data sets
as a basis for the simulations, which has the following advantages: the
actual properties are known and more realistic data can be simulated. There are
several ways to simulate realistic data from benchmark data sets. We
investigate simulation models based upon kernel density estimation, fitted
distributions, case resampling and conditional bootstrapping. In order to make
recommendations on which models are best suited for a specific survival
setting, we conducted a comparative simulation study. Since it is not possible
to provide recommendations for all possible survival settings in a single
paper, we focus on providing realistic simulation models for two-armed phase
III lung cancer studies. To this end, we reconstructed benchmark data sets from
recent studies. We used the runtime and different accuracy measures (effect
sizes and p-values) as criteria for comparison.
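
To make two of the model families named above more concrete, the following is a minimal Python sketch of case resampling and of simulation from a fitted distribution for a two-armed data set. It is an illustration under stated assumptions and not the authors' implementation: the toy arrays `time`, `event`, and `arm` are made-up placeholder values (not reconstructed trial data), the function names are hypothetical, and the fitted distribution is assumed to be exponential for both the event and the censoring times, which is only one of many possible choices.

```python
import numpy as np

# Placeholder data, invented for illustration only (not from any real trial).
time = np.array([2.1, 5.3, 7.8, 3.3, 9.4, 12.0, 4.5, 6.1, 8.9, 15.2, 11.3, 13.7])
event = np.array([1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1])  # 1 = event, 0 = censored
arm = np.array([0] * 6 + [1] * 6)                       # 0 = control, 1 = treatment


def case_resample(time, event, arm, rng):
    """Case resampling: draw patients with replacement within each arm."""
    idx = np.concatenate([
        rng.choice(np.flatnonzero(arm == a), size=int(np.sum(arm == a)), replace=True)
        for a in np.unique(arm)
    ])
    return time[idx], event[idx], arm[idx]


def fit_exponential_rate(time, event):
    """Exponential MLE under right censoring: number of events / total follow-up."""
    return event.sum() / time.sum()


def _draw_exponential(rate, n, rng):
    """Draw n exponential times; a rate of zero means the time never occurs."""
    if rate > 0:
        return rng.exponential(1.0 / rate, size=n)
    return np.full(n, np.inf)


def simulate_exponential(time, event, arm, rng):
    """Fitted-distribution model (assumed exponential), simulated per arm."""
    sim_time = np.empty_like(time, dtype=float)
    sim_event = np.empty_like(event)
    for a in np.unique(arm):
        mask = arm == a
        n = int(mask.sum())
        event_rate = fit_exponential_rate(time[mask], event[mask])
        # Censoring is fitted the same way, treating censored cases as "events".
        censor_rate = fit_exponential_rate(time[mask], 1 - event[mask])
        t_event = _draw_exponential(event_rate, n, rng)
        t_censor = _draw_exponential(censor_rate, n, rng)
        sim_time[mask] = np.minimum(t_event, t_censor)
        sim_event[mask] = (t_event <= t_censor).astype(event.dtype)
    return sim_time, sim_event, arm.copy()


rng = np.random.default_rng(0)
boot_time, boot_event, boot_arm = case_resample(time, event, arm, rng)
sim_time, sim_event, sim_arm = simulate_exponential(time, event, arm, rng)
```

Case resampling reuses observed patients unchanged, so it needs no distributional assumption but can only reproduce values already present in the benchmark data. The fitted-distribution variant produces new values, at the cost of the exponential assumption made here.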
Keywords
realistic survival data,simulation study,models