Assessment of Artificial Intelligence Performance on the Otolaryngology Residency In-Service Exam

OTO Open (2023)

Abstract
Objectives: This study seeks to determine the potential use and reliability of a large language model for answering questions in a subspecialized area of medicine, specifically practice exam questions in otolaryngology-head and neck surgery, and to assess its current efficacy for surgical trainees and learners.

Study Design and Setting: All available questions from a public, paid-access question bank were manually input into ChatGPT.

Methods: Outputs from ChatGPT were compared against the benchmark of the answers and explanations from the question bank. Questions were assessed in 2 domains: accuracy and comprehensiveness of explanations.

Results: Overall, our study demonstrates a ChatGPT correct answer rate of 53% and a correct explanation rate of 54%. We find that answer and explanation accuracy decrease as question difficulty increases.

Conclusion: Currently, artificial intelligence-driven learning platforms are not robust enough to serve as reliable medical education resources for learners in subspecialty-specific patient decision-making scenarios.
Keywords
artificial intelligence, BoardVitals, ChatGPT, in-service exams, large language models, otolaryngology residency training