Developing Clinical Decision Support System for Sleep Staging Tasks by Formulating Explanations from Artificial Intelligence: User-Centered Design and Evaluation. (Preprint)


BACKGROUND Despite the unprecedented performance of deep learning algorithms in clinical domains, full review of algorithmic predictions by human experts remains mandatory. Under these circumstances, artificial intelligence (AI) models are primarily designed as clinical decision support systems (CDSSs). However, from the perspective of clinical practitioners, the lack of clinical interpretability and of user-centered interfaces blocks the adoption of these AI systems in practice.

OBJECTIVE The aim of this study was to develop an AI-based CDSS that assists polysomnographic technicians in reviewing AI-predicted sleep staging results. This study proposed and evaluated a CDSS that provides clinically sound explanations for AI predictions in a user-centered fashion.

METHODS User needs for the system were identified through interviews with polysomnographic technicians. User observation sessions were conducted to understand the practitioners' workflow during sleep scoring. An iterative design process was followed to ensure easy integration of the tool into clinical workflows. We then evaluated the system with polysomnographic technicians: we measured the improvement in sleep staging accuracy after they adopted the tool and qualitatively assessed how the participants perceived and used it.

RESULTS The user study revealed that, when assessing the correctness of AI predictions, technicians want explanations tied to the key electroencephalogram (EEG) patterns used for sleep staging. With such explanations, technicians could evaluate whether the AI model properly locates and uses those patterns during prediction. Based on this finding, information in the AI model that is closely related to sleep EEG patterns was formulated and visualized during the iterative design process. Furthermore, we developed a different visualization strategy for each pattern, based on how the technicians interpreted EEG recordings containing that pattern during their workflows. The evaluation results from the nine polysomnographic technicians were generally positive. Quantitatively, technicians achieved better classification performance after reviewing the AI-generated predictions with the proposed system: the Macro-F1 score improved from 60.20 to 62.71. Qualitatively, participants reported that the information provided by the tool effectively supported them, and they were able to develop notable adoption strategies for the tool.

CONCLUSIONS Our findings indicate that formulating clinical explanations for automated predictions from the information in the AI model, through a user-centered design process, is an effective strategy for developing a CDSS for sleep staging.
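
The Macro-F1 score reported in the results is the unweighted mean of the per-class F1 scores, so rare stages count as much as frequent ones. The following is a minimal sketch of that computation, not the authors' evaluation code. The five stage labels (W, N1, N2, N3, REM) and the function name `macro_f1` are assumptions made for illustration; the abstract does not state which label set or implementation was used.

```python
from typing import Sequence

# Assumed label set for illustration; the abstract does not list the stages.
STAGES = ("W", "N1", "N2", "N3", "REM")


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str],
             labels: Sequence[str] = STAGES) -> float:
    """Unweighted mean of per-class F1 scores over `labels`.

    Returns a fraction in [0, 1]; multiply by 100 for a percentage scale.
    A class with no true and no predicted epochs contributes an F1 of 0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    f1_scores = []
    for label in labels:
        # Count true positives, false positives, and false negatives per class.
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN), defined as 0 when the denominator is 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)

    return sum(f1_scores) / len(f1_scores)
```

For example, `macro_f1(["W", "W", "N2", "N2"], ["W", "N2", "N2", "N2"], labels=["W", "N2"])` averages an F1 of 2/3 for W and 4/5 for N2. Scoring each technician's labels once before and once after reviewing the AI predictions would give a before/after comparison of the kind the abstract reports.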