Towards Radiologist-Level Cancer Risk Assessment in CT Lung Screening Using Deep Learning

Trajanovski Stojan, Mavroeidis Dimitrios, Swisher Christine Leon, Gebre Binyam Gebrekidan, Veeling Bastiaan S., Wiemker Rafael, Klinder Tobias, Tahmasebi Amir, Regis Shawn M., Wald Christoph, McKee Brady J., Flacke Sebastian, MacMahon Heber, Pien Homer

COMPUTERIZED MEDICAL IMAGING AND GRAPHICS (2021)

Abstract

Purpose: Lung cancer is the leading cause of cancer mortality in the US, responsible for more deaths than breast, prostate, colon and pancreatic cancer combined, and large population studies have indicated that low-dose computed tomography (CT) screening of the chest can significantly reduce this death rate. Recently, the usefulness of Deep Learning (DL) models for lung cancer risk assessment has been demonstrated. However, in many cases model performance is evaluated on small or medium-sized test sets, which do not provide the strong guarantees of model generalization and stability that clinical adoption requires. In this work, our goal is to contribute towards clinical adoption by investigating a deep learning framework on larger and heterogeneous datasets while also comparing it to state-of-the-art models.
Methods: Three low-dose CT lung cancer screening datasets were used: the National Lung Screening Trial (NLST, n = 3410), Lahey Hospital and Medical Center (LHMC, n = 3154) data, and Kaggle competition data (from both stages, n = 1397 + 505), along with the University of Chicago data (UCM, a subset of NLST, annotated by radiologists, n = 132). In the first stage, our framework employs a nodule detector; in the second stage, we use both the image context around the nodules and nodule features as inputs to a neural network that estimates the malignancy risk for the entire CT scan. We trained our algorithm on a part of the NLST dataset and validated it on the other datasets. Special care was taken to ensure there was no patient overlap between the train and validation sets.
Results and conclusions: The proposed deep learning model is shown to: (a) generalize well across all three data sets, achieving AUCs between 86% and 94%, with our external test set (LHMC) being at least twice as large as those used in other works; (b) have better performance than the widely accepted PanCan Risk Model, achieving 6% and 9% better AUC scores in our two test sets; (c) have improved performance compared to the state-of-the-art represented by the winners of the Kaggle Data Science Bowl 2017 competition on lung cancer screening; (d) have comparable performance to radiologists in estimating cancer risk at a patient level.
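
The results above are reported as AUC (area under the ROC curve). As a reading aid, the following is a minimal sketch of how a patient-level AUC can be computed from per-scan risk scores, using the pairwise-ranking definition: the fraction of (cancer, non-cancer) pairs in which the cancer case receives the higher score, with ties counted as half. The function name `roc_auc` and the toy labels and scores are assumptions invented for illustration; they are not taken from the paper and do not reproduce its evaluation code or data.

```python
def roc_auc(labels, scores):
    """AUC via pairwise ranking: P(score of a positive > score of a negative).

    labels: iterable of 0/1 (1 = cancer), scores: iterable of risk scores.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, hypothetical and for illustration only (not from the paper).
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The quadratic pairwise loop is chosen for clarity; on datasets of the size described above, a rank-based or library implementation would normally be used instead.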

Keywords

Lung cancer screening, Deep learning, Low-dose computed tomography screening