Extracting Social Determinants of Health from Pediatric Patient Notes Using Large Language Models: Novel Corpus and Methods
arXiv (2024)
Abstract
Social determinants of health (SDoH) play a critical role in shaping health
outcomes, particularly in pediatric populations where interventions can have
long-term implications. SDoH are frequently studied in the Electronic Health
Record (EHR), which provides a rich repository for diverse patient data. In
this work, we present a novel annotated corpus, the Pediatric Social History
Annotation Corpus (PedSHAC), and evaluate the automatic extraction of detailed
SDoH representations using fine-tuned and in-context learning methods with
Large Language Models (LLMs). PedSHAC comprises annotated social history
sections from 1,260 clinical notes obtained from pediatric patients within the
University of Washington (UW) hospital system. Employing an event-based
annotation scheme, PedSHAC captures ten distinct health determinants to
encompass living and economic stability, prior trauma, education access,
substance use history, and mental health with an overall annotator agreement of
81.9 F1. Our proposed fine-tuned LLM-based extractors achieve high performance
at 78.4 F1 for event arguments. In-context learning approaches with GPT-4
demonstrate promise for reliable SDoH extraction with limited annotated
examples, with extraction performance at 82.3 F1 for event triggers.
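
To make the reported metric concrete, the following is a minimal sketch of how an F1 score over event triggers could be computed. It is not the authors' evaluation code: the `Trigger` record, the event type labels, the example spans, and the exact-match criterion are all assumptions made for illustration, and the paper may use a different matching rule (for example, overlapping spans).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    """Hypothetical event trigger: a typed text span inside one note."""
    note_id: str
    event_type: str  # illustrative label, not taken from the PedSHAC schema
    start: int       # character offset where the trigger span begins
    end: int         # character offset where the trigger span ends


def f1_score(gold, predicted):
    """F1 under an assumed exact-match criterion (same note, type, and span)."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up annotations for a single note, for illustration only.
gold = [
    Trigger("note-1", "Living_Situation", 0, 5),
    Trigger("note-1", "Education_Access", 40, 46),
    Trigger("note-1", "Substance_Use", 90, 96),
]
predicted = [
    Trigger("note-1", "Living_Situation", 0, 5),
    Trigger("note-1", "Education_Access", 40, 46),
    Trigger("note-1", "Substance_Use", 88, 96),  # span boundary differs, so no match
]

score = f1_score(gold, predicted)
print(f"{100 * score:.1f} F1")
```

Scores such as 81.9 F1 in the abstract correspond to this fraction scaled by 100. Under a more lenient overlap-based criterion, the third prediction above would count as a match and the score would be higher.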