Charting new AI education in gastroenterology: Cross-sectional evaluation of ChatGPT and perplexity AI in medical residency exam

Digestive and Liver Disease(2024)

引用 0|浏览9
暂无评分
摘要
Background Conversational chatbots, fueled by large language models, spark debate over their potential in education and medical career exams. There is debate in the literature about the scientific integrity of the outputs produced by these chatbots. Aims This study evaluates ChatGPT 3.5 and Perplexity AI's cross-sectional performance in responding to questions from the 2023 Italian national residency admission exam (SSM23), comparing results and chatbots' concordance with previous years SSMs. Methods Gastroenterology-related SSM23 questions were input into ChatGPT 3.5 and Perplexity AI, evaluating their performance in correct responses and total scores. This process was repeated with questions from the three preceding years. Additionally, chatbot concordance was assessed using Cohen's method. Results In SSM23, ChatGPT 3.5 outperforms Perplexity AI with 94.11% correct responses, demonstrating consistency across years. Concordance weakened in 2023 (κ=0.203, P = 0.148), but ChatGPT consistently maintains a high standard compared to Perplexity AI. Conclusion ChatGPT 3.5 and Perplexity AI exhibit promise in addressing gastroenterological queries, emphasizing potential educational roles. However, their variable performance mandates cautious use as supplementary tools alongside conventional study methods. Clear guidelines are crucial for educators to balance traditional approaches and innovative systems, enhancing educational standards.
更多
查看译文
关键词
Artificial intelligence,Chatbots,Education,Medical residency
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要