Use of commercially available natural language processing software to identify bleeding from the medical record

Andrew L Walker, Cheri Watson, Ryan Butcher, Mark Yandell, Rashmee U Shah

medRxiv (2021)

Abstract
Background: Real-world evidence derived from the electronic medical record (EMR) is increasingly prevalent. How best to ascertain cardiovascular outcomes from EMRs is unknown. We sought to validate a commercially available natural language processing (NLP) software package for extracting bleeding events.

Methods: We included patients with atrial fibrillation and cancer seen at our cancer center from January 1, 2016 to December 31, 2019. A query set based on SNOMED CT expressions was created to represent bleeding from 11 different organ systems. We ran the query against the clinical notes and randomly selected a sample of the flagged notes for physician validation. The primary outcome was the positive predictive value (PPV) of the software for identifying bleeding events, stratified by organ system.

Results: We included 1370 patients with a mean age of 72 years (SD 1.5); 35% were female. We processed 66,130 notes; the NLP software flagged 6522 notes, from 654 unique patients, as containing possible bleeding events. Among 1269 randomly selected notes, the PPV of the software ranged from 0.921 for neurologic bleeds to 0.571 for OB/GYN bleeds. Patterns behind the false positive bleeding events identified by the software included historic bleeds, hypothetical bleeds, missed negatives, and word errors.

Conclusions: NLP may provide an alternative for population-level screening for bleeding outcomes in cardiovascular studies. Human validation is still needed, but an NLP-driven screening approach may improve efficiency.
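As an illustration of the primary outcome, PPV stratified by organ system is the fraction of NLP-flagged notes in each organ system that the physician reviewer confirmed as bleeding events. The sketch below is a minimal example under stated assumptions, not the authors' analysis code: the function name `ppv_by_group`, the input layout, and the toy data are all hypothetical, and the toy values are not taken from the study.

```python
from collections import defaultdict


def ppv_by_group(reviews):
    """Compute PPV per organ system.

    reviews: iterable of (organ_system, confirmed) pairs, one per NLP-flagged
    note that a physician reviewed. `confirmed` is True when the reviewer
    agreed that the note documents a bleeding event (a true positive) and
    False otherwise (a false positive).
    """
    flagged = defaultdict(int)    # reviewed notes flagged per organ system
    confirmed_counts = defaultdict(int)  # reviewer-confirmed notes per organ system
    for organ_system, confirmed in reviews:
        flagged[organ_system] += 1
        if confirmed:
            confirmed_counts[organ_system] += 1
    # PPV = true positives / (true positives + false positives)
    return {g: confirmed_counts[g] / flagged[g] for g in flagged}


# Toy data invented for illustration only (NOT results from the study).
toy_reviews = [
    ("neurologic", True),
    ("neurologic", True),
    ("neurologic", False),
    ("neurologic", True),
    ("gastrointestinal", True),
    ("gastrointestinal", False),
]
toy_ppv = ppv_by_group(toy_reviews)
```

Because only flagged notes are reviewed in this design, the calculation yields PPV alone; estimating sensitivity would additionally require reviewing notes the software did not flag.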
Keywords
medical record,software,processing