The Molecular Twin platform: a novel machine learning tool for democratization of precision cancer medicine.

Journal of Clinical Oncology (2022)

Abstract
e13546

Background: Pancreatic ductal adenocarcinoma (PDAC) is one of the most aggressive cancers. Contemporary analyses that combine a handful of molecular and clinical variables with machine learning algorithms (MLA) are unable to accurately predict therapy outcomes. Here, we use the Molecular Twin multi-omic analytical platform, which evaluates tumor and host features extracted from 10 multi-omic analytes and provides an array of MLA, including a Parsimonious Biomarker Model that can predict survival and recurrence with limited analytic burden while maintaining a high degree of fidelity.

Methods: Retrospectively collected serum and tissue samples from 74 patients with Stage I/II resectable PDAC were subjected to targeted NGS DNA sequencing, whole transcriptome RNA sequencing, paired tissue proteomics, unpaired serum proteomics, lipidomics and computational pathology. Analytes, including plasma proteins, RNA fusions, tissue proteins, plasma lipids, RNA gene expressions, CNVs, INDELs, SNVs and tumor nuclei characteristics, were processed to obtain a panel of 6363 features. A total of 1024 single-omic and multi-omic feature combinations generated from this panel served as input for 7 different types of MLA to predict binary survival (SR) and disease recurrence (DR) outcomes. The resultant 70 single-omic and 7098 multi-omic biomarker models were evaluated, using a leave-one-patient-out cross-validation strategy, for positive predictive value (PPV) and accuracy (ACC) in predicting DR and SR, and for the feature proportions learned by each ML model. By recursively eliminating features with low importance, we developed progressively parsimonious biomarker models for predicting SR and DR.

Results: Our top model was multi-omic and predicted SR with ACC = 0.85 and PPV = 0.87, and DR with ACC = 0.90 and PPV = 0.91. In predicting SR, it outperformed all models based on a single analyte type, including plasma protein, RNA fusion, tissue protein, plasma lipid, clinical, RNA gene expression, tumor nuclei characteristics, CNV, INDEL and SNV. This model contained predominantly plasma protein features. Interestingly, less accurate models contained a greater proportion of other features in addition to plasma proteins. Parsimonious feature reduction of the top model stabilized at 589 features, yielding ACC = 0.85 and PPV = 0.85, comparable to the intact model.

Conclusions: This proof-of-concept application of the Molecular Twin precision medicine platform in PDAC reveals the potential of our unique MLA to provide a novel parsimonious biomarker panel with fidelity similar to that of much larger biomarker panels. If these results are reproduced on larger datasets and across tumor types, the Molecular Twin platform would have significant potential to democratize precision cancer medicine by discovering smaller biomarker panels with the predictive performance of much larger ones, thereby reducing cost and simplifying assays.
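
The evaluation loop described in the Methods (leave-one-patient-out cross-validation, scoring by ACC and PPV, and recursive elimination of low-importance features) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the nearest-centroid classifier is a stand-in for the platform's 7 MLA types (which the abstract does not name), the importance score is a simple class-mean difference, and all function names are hypothetical. Only the cohort size of 74 is taken from the abstract.

```python
import random

def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def predict_nearest_centroid(train_X, train_y, x):
    """Toy stand-in classifier: pick the class whose centroid is closer to x."""
    c1 = centroid([r for r, lab in zip(train_X, train_y) if lab == 1])
    c0 = centroid([r for r, lab in zip(train_X, train_y) if lab == 0])
    d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
    d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
    return 1 if d1 < d0 else 0

def leave_one_patient_out(X, y):
    """Hold out each patient once, train on the rest, collect the predictions."""
    preds = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        preds.append(predict_nearest_centroid(train_X, train_y, X[i]))
    return preds

def accuracy_and_ppv(y_true, y_pred):
    """ACC = correct / total; PPV = TP / (TP + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    acc = correct / len(y_true)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return acc, ppv

def importance(X, y):
    """Toy importance: absolute difference between class means, per feature."""
    c1 = centroid([r for r, lab in zip(X, y) if lab == 1])
    c0 = centroid([r for r, lab in zip(X, y) if lab == 0])
    return [abs(a - b) for a, b in zip(c1, c0)]

def recursive_elimination(X, y, keep, drop_fraction=0.5):
    """Drop the least important features in rounds until `keep` remain.

    Simplification: importance is computed on all patients rather than
    inside each cross-validation fold, which a real analysis should avoid.
    """
    cols = list(range(len(X[0])))
    history = []  # (number of features, ACC, PPV) per round
    while True:
        sub = [[row[c] for c in cols] for row in X]
        acc, ppv = accuracy_and_ppv(y, leave_one_patient_out(sub, y))
        history.append((len(cols), acc, ppv))
        if len(cols) <= keep:
            break
        imp = importance(sub, y)
        n_next = max(keep, int(len(cols) * (1 - drop_fraction)))
        ranked = sorted(range(len(cols)), key=lambda j: imp[j], reverse=True)
        cols = sorted(cols[j] for j in ranked[:n_next])
    return cols, history

# Synthetic cohort: 74 patients, 20 features, of which the first 3 carry signal.
rng = random.Random(0)
n_patients, n_features = 74, 20
y = [i % 2 for i in range(n_patients)]
X = [[rng.gauss(0, 1) + (3.0 * label if j < 3 else 0.0)
      for j in range(n_features)] for label in y]

selected, history = recursive_elimination(X, y, keep=3)
for n, acc, ppv in history:
    print(f"{n:3d} features  ACC={acc:.2f}  PPV={ppv:.2f}")
```

Printing ACC and PPV after each elimination round mirrors the idea of watching for the point at which a reduced panel still performs comparably to the full one. The synthetic numbers say nothing about the study's actual results.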