A Toolbox for Surfacing Health Equity Harms and Biases in Large Language Models
CoRR (2024)
Abstract
Large language models (LLMs) hold immense promise to serve complex health
information needs but also have the potential to introduce harm and exacerbate
health disparities. Reliably evaluating equity-related model failures is a
critical step toward developing systems that promote health equity. In this
work, we present resources and methodologies for surfacing biases with
potential to precipitate equity-related harms in long-form, LLM-generated
answers to medical questions and then conduct an empirical case study with
Med-PaLM 2, resulting in the largest human evaluation study in this area to
date. Our contributions include a multifactorial framework for human assessment
of biases in LLM-generated answers, and EquityMedQA, a collection of seven
newly released datasets comprising both manually curated and LLM-generated
questions enriched for adversarial queries. Both our human assessment framework
and dataset design process are grounded in an iterative participatory approach
and review of possible biases in Med-PaLM 2 answers to adversarial queries.
Through our empirical study, we find that the use of a collection of datasets
curated through a variety of methodologies, coupled with a thorough evaluation
protocol that leverages multiple assessment rubric designs and diverse rater
groups, surfaces biases that may be missed via narrower evaluation approaches.
Our experience underscores the importance of using diverse assessment
methodologies and involving raters of varying backgrounds and expertise. We
emphasize that while our framework can identify specific forms of bias, it is
not sufficient to holistically assess whether the deployment of an AI system
promotes equitable health outcomes. We hope the broader community leverages and
builds on these tools and methods towards realizing a shared goal of LLMs that
promote accessible and equitable healthcare for all.
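
As a rough, non-authoritative sketch of the evaluation recipe the abstract describes (this is not the authors' code), the snippet below pairs adversarial questions from several datasets with model-generated answers and counts an answer as flagged if any rubric and rater-group combination flags it, mirroring the finding that narrower protocols can miss biases. The rubric labels, rater groups, and the generate_answer/rate helpers are all hypothetical placeholders.

```python
# Minimal sketch of a multi-rubric, multi-rater bias evaluation, assuming
# hypothetical names throughout; NOT the authors' implementation.
from collections import defaultdict
from statistics import mean

RUBRICS = ["independent", "pairwise", "counterfactual"]    # assumed rubric designs
RATER_GROUPS = ["physician", "equity_expert", "consumer"]  # assumed rater pools

def generate_answer(question: str) -> str:
    """Stand-in for the model under evaluation (e.g., a Med-PaLM 2 query)."""
    return "stub long-form answer to: " + question

def rate(answer: str, question: str, rubric: str, group: str) -> bool:
    """Stand-in for one human rating: True if this rater flags bias."""
    return False  # real ratings would come from the human study

def evaluate(datasets: dict[str, list[str]]) -> dict[str, float]:
    """Return the fraction of answers flagged for bias, per dataset.

    An answer counts as flagged if ANY rubric x rater-group combination
    flags it; a single-rubric, single-group protocol would miss some cases.
    """
    flags = defaultdict(list)
    for name, questions in datasets.items():
        for q in questions:
            answer = generate_answer(q)
            hit = any(rate(answer, q, r, g) for r in RUBRICS for g in RATER_GROUPS)
            flags[name].append(hit)
    return {name: mean(hits) for name, hits in flags.items()}

if __name__ == "__main__":
    # Tiny demo with placeholder questions; a real run would draw on the seven
    # EquityMedQA datasets of manually curated and LLM-generated queries.
    demo = {"manually_curated": ["<adversarial question 1>"],
            "llm_generated": ["<adversarial question 2>"]}
    print(evaluate(demo))
```

Taking the union over rubrics and rater groups is one simple way to operationalize the abstract's point that diverse assessment methodologies surface more failures; per-rubric or per-group breakdowns would follow the same loop with a finer-grained tally.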