Harmonising data from different sources to conduct research using linked survey and routine datasets

Amrita Bandyopadhyay, Karen Tingay, Mario Cortina Borja, Lucy J Griffiths, Ashley Akbari, Helen Bedford, Sinead Brophy, Suzanne Walton, Carol Dezateux, Ronan Lyons

International Journal for Population Data Science (2018)

Abstract

Introduction: Harmonisation of data from different electronic health record systems enhances the scope and granularity of data available for health research. It provides more opportunities for research by improving the generalisability and effective sample size for a range of outcome metrics.

Objectives and Approach: This study describes data harmonisation for a UK longitudinal birth cohort, the Millennium Cohort Study (MCS), which was linked to routine inpatient and emergency department records and, where available, general practice and child health records for 1838 Welsh and 1431 Scottish consenting MCS participants. The datasets requiring harmonisation were, from Wales, the Patient Episode Dataset for Wales (PEDW) and the Emergency Department Data Set (EDDS) and, from Scotland, the Scottish Medical Record 01 (SMR01) and the Accident and Emergency dataset (A&E2). Heterogeneous variables were harmonised by transforming variable names, concepts and codes to improve the scope for analysis.

Results: A harmonised dataset of 2166 participants and 5747 hospital admissions was derived for cohort members who had at least one hospital inpatient or A&E record. Harmonisation involved standardising periods of data collection, identifying inconsistencies, and then mapping and bridging differences in definitions of periods of care and in levels of diagnostic and operational coding across countries and datasets.

Conclusion/Implications: Heterogeneous variables from different data sources were pooled and converted into standardised data for research, extending existing harmonisation work, including curation of a population-based, anonymously linkable longitudinal cohort. These methods are reproducible and can be used by other researchers and projects applying to use these routine data sources.
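The general idea of transforming variable names and codes into a shared form can be illustrated with a minimal Python sketch. This is not the authors' method: the source labels, the column names (`ADMIS_DT`, `DIAG_CD`, `main_condition`) and the code-cleaning rule are hypothetical and are not taken from the PEDW, EDDS, SMR01 or A&E2 specifications.

```python
# Hypothetical source-specific column names mapped to shared names.
# None of these names come from the real dataset specifications.
COLUMN_MAP = {
    "wales_inpatient": {
        "ADMIS_DT": "admission_date",
        "DIAG_CD": "diagnosis_code",
    },
    "scotland_inpatient": {
        "admission_date": "admission_date",
        "main_condition": "diagnosis_code",
    },
}


def harmonise_record(source, record):
    """Rename one record's fields to the shared names and tidy its codes."""
    harmonised = {"source": source}
    for old_name, new_name in COLUMN_MAP[source].items():
        value = record.get(old_name)
        if new_name == "diagnosis_code" and value is not None:
            # Assumed cleaning rule: drop dots, trim spaces, upper-case.
            value = value.replace(".", "").strip().upper()
        harmonised[new_name] = value
    return harmonised


def pool(records_by_source):
    """Harmonise every record and pool all sources into one list."""
    pooled = []
    for source, records in records_by_source.items():
        pooled.extend(harmonise_record(source, r) for r in records)
    return pooled


example = pool({
    "wales_inpatient": [{"ADMIS_DT": "2005-03-01", "DIAG_CD": "j45.9"}],
    "scotland_inpatient": [
        {"admission_date": "2006-07-12", "main_condition": " J459 "}
    ],
})
```

In this sketch each pooled record keeps a `source` field, so differences between countries and datasets can still be traced after pooling.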
