DEFT: a web-based system for DE-identifying Free Text data in electronic medical records using human in the loop deep learning (Preprint)

crossref(2023)

Abstract
BACKGROUND Narrative free text in Electronic Medical Records (EMRs) is valuable for secondary use, but it must first be de-identified by removing Personally Identifiable Information (PII). Manual de-identification is time-consuming and labour-intensive, and existing de-identification systems have a steep learning curve.

OBJECTIVE We sought to develop an accurate, web-based system for de-identifying free text in EMRs that can be readily adopted in real-world settings.

METHODS DEFT was designed for easy adoption and for rapid, secure de-identification at high accuracy. It provides a simple, task-focused web user interface for performing the de-identification work. An interactive learning loop powered by a state-of-the-art deep learning model is integrated into DEFT to speed up the de-identification process and improve its performance over time.

RESULTS DEFT has advantages over existing systems in its support for project management, user access control, data management, and an interactive learning process. In a real-world use case of de-identifying clinical notes extracted from one referral hospital in Sydney, Australia, DEFT achieved an F1 score of 95.07% using 600 annotated clinical notes.

CONCLUSIONS The DEFT system can be rapidly deployed for de-identifying free text in EMRs. End users with minimal technical knowledge can perform the de-identification work with only a shallow learning curve.
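The F1 score reported in the results is the harmonic mean of precision and recall. The abstract does not say whether DEFT was scored at the token level or the entity level, so the following is only a minimal sketch that assumes exact-match, span-level scoring. The `Span` representation, the `span_f1` helper, and the example annotations are all hypothetical and are not taken from the DEFT system.

```python
from typing import Set, Tuple

# Hypothetical representation of one PII annotation:
# (start offset, end offset, PII category). Not DEFT's actual data model.
Span = Tuple[int, int, str]


def span_f1(gold: Set[Span], predicted: Set[Span]) -> float:
    """Exact-match F1 between gold and predicted PII spans."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        # Covers empty inputs and the case of no overlap at all.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Made-up annotations for a single note, for illustration only.
gold = {(0, 10, "NAME"), (25, 35, "DATE"), (50, 62, "ADDRESS")}
predicted = {(0, 10, "NAME"), (25, 35, "DATE"), (70, 75, "PHONE")}

score = span_f1(gold, predicted)
print(f"F1 = {score:.4f}")
```

In this made-up example two of the three predicted spans match the gold annotations and two of the three gold spans are found, so precision and recall are both 2/3. For de-identification, recall is usually the more safety-critical of the two, since a missed span means PII remains in the released text.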