A machine learning approach to identify groups of patients with hematological malignant disorders

COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE (2024)

Abstract
Background and Objective: Vaccination against SARS-CoV-2 in immunocompromised patients with hematologic malignancies (HM) is crucial to reduce the severity of COVID-19. Despite vaccination efforts, over a third of HM patients remain unresponsive, increasing their risk of severe breakthrough infections. This study aims to leverage machine learning's adaptability to COVID-19 dynamics, efficiently selecting patient-specific features to enhance predictions and improve healthcare strategies. Given the complex connection between COVID-19 and hematology, the focus is on interpretable machine learning that provides valuable insights to clinicians and biologists.

Methods: The study evaluated a dataset of 1166 patients with hematological diseases. The output was the achievement or non-achievement of a serological response after full COVID-19 vaccination. Various machine learning methods were applied, with the best model selected based on metrics such as the Area Under the Curve (AUC), Sensitivity, Specificity, and Matthews Correlation Coefficient (MCC). Individual SHAP values were obtained for the best model, and Principal Component Analysis (PCA) was applied to these values. The patient profiles were then analyzed within the identified clusters.

Results: Support vector machine (SVM) emerged as the best-performing model. PCA applied to SVM-derived SHAP values resulted in four perfectly separated clusters. These clusters are characterized by the proportion of patients that generate antibodies (PPGA). Cluster 1, with the second-highest PPGA (69.91%), included patients with aggressive diseases and factors contributing to increased immunodeficiency. Cluster 2 had the lowest PPGA (33.3%), but the small sample size limited conclusive findings. Cluster 3, representing the majority of the population, exhibited a high rate of antibody generation (84.39%) and a better prognosis compared to Cluster 1. Cluster 4, with a PPGA of 66.33%, included patients with B-cell non-Hodgkin's lymphoma on corticosteroid therapy.

Conclusions: The methodology successfully identified four separate patient clusters using machine learning and Explainable AI (XAI). Each cluster was then analyzed based on the percentage of HM patients who generated antibodies after COVID-19 vaccination. The study suggests the methodology's potential applicability to other diseases, highlighting the importance of interpretable ML in healthcare research and decision-making.
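
As an illustration of the quantities named in the abstract, the sketch below computes Sensitivity, Specificity, and MCC from binary predictions, and the proportion of patients that generate antibodies (PPGA) per cluster. It is not the authors' code: the function names and the toy labels are hypothetical, and it assumes the serological response is encoded as 1 (antibodies generated) or 0 (no response).

```python
import math
from collections import defaultdict


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """True positive rate: responders correctly predicted as responders."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """True negative rate: non-responders correctly predicted as such."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def ppga_by_cluster(cluster_labels, responded):
    """Percentage of patients with a serological response in each cluster."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for cluster, response in zip(cluster_labels, responded):
        totals[cluster] += 1
        positives[cluster] += response
    return {c: 100.0 * positives[c] / totals[c] for c in totals}


# Toy data, invented for illustration only (not from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)

toy_clusters = [1, 1, 2, 2, 2, 3]
toy_response = [1, 0, 1, 1, 0, 1]
toy_ppga = ppga_by_cluster(toy_clusters, toy_response)
```

In the study itself, the cluster labels would come from the PCA of the SHAP values rather than being assigned by hand as they are in this toy example.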
Keywords
Machine learning, High risk groups identification, Hematological disease, COVID-19, Serological response, SARS-CoV-2 mRNA vaccines, Explainable AI (XAI)