An algorithm to identify cases of pulmonary arterial hypertension from the electronic medical record

AMERICAN JOURNAL OF RESPIRATORY AND CRITICAL CARE MEDICINE(2022)

Abstract
Background: Study of pulmonary arterial hypertension (PAH) in claims-based (CB) cohorts may facilitate understanding of disease epidemiology; however, previous CB algorithms to identify PAH have had limited test characteristics. We hypothesized that machine learning algorithms (MLAs) could accurately identify PAH in a CB cohort. Methods: ICD-9/10 codes, CPT codes, or PAH medications were used to screen an electronic medical record (EMR) for possible PAH. A subset (Development Cohort) was manually reviewed, adjudicated as PAH or “not PAH”, and used to train and test MLAs. A second subset (Refinement Cohort) was manually reviewed and combined with the Development Cohort to make the Final Cohort, which was again divided into training and testing sets, with MLA characteristics defined on the test set. The MLA was validated using an independent EMR cohort. Results: The initial MLA was trained and tested on 194 PAH and 786 “not PAH” cases in the Development Cohort. In the Final Cohort test set, the final MLA had a sensitivity of 0.88, a specificity of 0.93, a positive predictive value of 0.89, and a negative predictive value of 0.92. Persistence and strength of PAH medication use and the CPT code for right heart catheterization were the principal MLA features. Applying the MLA to the EMR cohort using a split cohort internal validation approach, we found 265 additional non-confirmed cases of suspected PAH that exhibited typical PAH demographics, comorbidities, and hemodynamics. Conclusions: We developed and validated an MLA using only CB features that identified PAH in the EMR with strong test characteristics. When deployed across an entire EMR, the MLA identified cases with known features of PAH.
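The four test characteristics reported above follow from the standard confusion-matrix definitions. The sketch below shows how they are typically computed once model predictions are compared with adjudicated labels. It is illustrative only: the names `ConfusionCounts` and `diagnostic_metrics` are hypothetical, and the counts in the usage example are invented for demonstration, not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing predictions with adjudicated labels (hypothetical container)."""
    tp: int  # predicted PAH, adjudicated PAH
    fp: int  # predicted PAH, adjudicated "not PAH"
    tn: int  # predicted "not PAH", adjudicated "not PAH"
    fn: int  # predicted "not PAH", adjudicated PAH


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity, PPV, and NPV using the standard definitions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of true PAH cases detected
        "specificity": c.tn / (c.tn + c.fp),  # share of "not PAH" cases correctly excluded
        "ppv": c.tp / (c.tp + c.fp),          # share of positive calls that are true PAH
        "npv": c.tn / (c.tn + c.fn),          # share of negative calls that are truly "not PAH"
    }


# Invented counts, for illustration only (not the study's data).
example = ConfusionCounts(tp=8, fp=1, tn=9, fn=2)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Positive and negative predictive values depend on the proportion of true PAH cases in the evaluated set, so they would be expected to shift if the same classifier were applied to a cohort with a different case mix.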
Keywords
Pulmonary hypertension, Machine learning, Algorithm