A Highly Accurate Ensemble Classifier for the Molecular Diagnosis of ASD at Ages 1 to 4 Years

medRxiv (2021)

Abstract
Importance: ASD diagnosis remains behavior-based, and the median age at first diagnosis remains unchanged at ~52 months, nearly 5 years after the disorder's first-trimester origin. The long delay between ASD's prenatal onset and eventual diagnosis is likely a missed opportunity. However, accurate and clinically translatable early-age diagnostic methods do not exist because of ASD's genetic and clinical heterogeneity. There is a need for early-age diagnostic biomarkers of ASD that are robust to this heterogeneity.

Objective: To develop a single blood-based molecular classifier that accurately diagnoses ASD at the age of first symptoms.

Design, Setting, and Participants: Participants were N=264 ASD, typically developing (TD), and language delayed (LD) toddlers for whom clinical, diagnostic, and leukocyte RNA data were collected. Datasets included Discovery (n=175 ASD, TD subjects), Longitudinal (n=33 ASD, TD subjects), and Replication (n=89 ASD, TD, LD subjects). We developed an ensemble of ASD classifiers by testing 42,840 models, formed by pairing each of 3,570 feature selection sets with each of 12 classification methods. Models were trained on the Discovery dataset with 5-fold cross-validation. The results were used to construct a Bayesian model averaging (BMA)-based ensemble classifier that was tested in the Discovery and Replication datasets. Data were collected from 2007 to 2012 and analyzed from August 2019 to April 2021.

Main Outcomes and Measures: Primary outcomes were (1) comparisons of the performance of the 42,840 classifier models in correctly identifying ASD vs TD and LD in the Discovery and Replication datasets; and (2) performance of the ensemble model, composed of 1,076 models weighted by Bayesian model averaging.

Results: Of the 42,840 models trained on the Discovery dataset, 1,076 had an average AUC-ROC > 0.8. These 1,076 models used 191 different feature routes and 2,764 gene features. Using weighted BMA of these features and routes, an ensemble classifier was constructed that demonstrated excellent performance in the Discovery and Replication datasets, with ASD classification AUC-ROC scores of 84% to 88%. ASD classification accuracy was comparable against LD and TD subjects and in the Longitudinal dataset. ASD toddlers with ensemble scores above and below the ASD ensemble mean had similar diagnostic and psychometric scores, but those below the ASD ensemble mean had more prenatal risk events than TD toddlers. Ensemble features include genes involved in immune/inflammation, response to cytokines, transcriptional regulation, and mitotic cell cycle, as well as the PI3K-AKT, RAS, and Wnt signaling pathways.

Conclusions and Relevance: An ensemble ASD molecular classifier has high and replicable accuracy across the spectrum of ASD clinical characteristics and across toddlers aged 1 to 4 years, and therefore has potential for clinical translation.
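The selection-and-weighting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state how the BMA weights were computed, so the sketch assumes a uniform prior over models and weights each retained model by its normalized likelihood on held-out labels. The toy labels, the three model score vectors, and all function names are hypothetical, and a single held-out set stands in for the 5-fold cross-validated average AUC-ROC.

```python
import math

# Model grid size stated in the abstract: feature selection sets x classifiers.
N_MODELS = 3570 * 12  # 42,840 candidate models

def auc_roc(labels, scores):
    """Rank-based AUC-ROC: chance that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def log_likelihood(labels, probs, eps=1e-12):
    """Bernoulli log-likelihood of the labels under predicted probabilities."""
    return sum(math.log(max(p if y == 1 else 1.0 - p, eps))
               for y, p in zip(labels, probs))

def bma_weights(log_liks):
    """Posterior model weights under an assumed uniform prior:
    a numerically stable softmax over the log-likelihoods."""
    top = max(log_liks)
    unnorm = [math.exp(ll - top) for ll in log_liks]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def ensemble_scores(model_probs, weights):
    """Weighted average of per-model probabilities for each subject."""
    n_subjects = len(model_probs[0])
    return [sum(w * probs[i] for w, probs in zip(weights, model_probs))
            for i in range(n_subjects)]

# Hypothetical toy data: 1 = ASD, 0 = non-ASD, with three candidate models.
labels = [1, 1, 1, 0, 0, 0]
candidates = {
    "model_a": [0.9, 0.8, 0.7, 0.3, 0.2, 0.1],
    "model_b": [0.6, 0.4, 0.7, 0.5, 0.3, 0.2],
    "model_c": [0.4, 0.5, 0.3, 0.6, 0.5, 0.7],
}

# Step 1: keep only models whose AUC-ROC exceeds the 0.8 threshold.
kept = {name: p for name, p in candidates.items()
        if auc_roc(labels, p) > 0.8}

# Step 2: weight the retained models and average their predictions.
weights = bma_weights([log_likelihood(labels, p) for p in kept.values()])
ens = ensemble_scores(list(kept.values()), weights)

print(sorted(kept), [round(w, 3) for w in weights], auc_roc(labels, ens))
```

In this toy setup the weights and the ensemble AUC-ROC are computed on the same labels for brevity; a real evaluation would estimate weights on training folds and report performance on separate data such as a replication set.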
Keywords
molecular diagnosis, accurate ensemble classifier, ASD