CIMIL-CRC: a clinically-informed multiple instance learning framework for patient-level colorectal cancer molecular subtypes classification from H&E stained images
CoRR (2024)
Abstract
Treatment approaches for colorectal cancer (CRC) are highly dependent on the
molecular subtype, as immunotherapy has shown efficacy in cases with
microsatellite instability (MSI) but is ineffective for the microsatellite
stable (MSS) subtype. There is promising potential in utilizing deep neural
networks (DNNs) to automate the differentiation of CRC subtypes by analyzing
Hematoxylin and Eosin (H&E) stained whole-slide images (WSIs). Due to the
extensive size of WSIs, Multiple Instance Learning (MIL) techniques are
typically explored. However, existing MIL methods focus on identifying the most
representative image patches for classification, which may result in the loss
of critical information. Additionally, these methods often overlook clinically
relevant information, such as the tendency of MSI tumors to occur predominantly
in the proximal (right-sided) colon. We introduce CIMIL-CRC, a DNN
framework that: 1) solves the MSI/MSS MIL problem by efficiently combining a
pre-trained feature extraction model with principal component analysis (PCA) to
aggregate information from all patches, and 2) integrates clinical priors,
particularly the tumor location within the colon, into the model to enhance
patient-level classification accuracy. We assessed CIMIL-CRC using the mean
area under the receiver operating characteristic curve (AUROC) from a 5-fold
cross-validation setup for model development on the TCGA-CRC-DX cohort,
comparing it with a baseline patch-level classification approach, a MIL-only
approach, and a clinically-informed patch-level classification approach.
CIMIL-CRC outperformed all three (AUROC: 0.92±0.002 (95% CI 0.91-0.92) vs.
0.79±0.02 (95% CI 0.76-0.82), 0.86±0.01 (95% CI 0.85-0.88), and
0.87±0.01 (95% CI 0.86-0.88), respectively). The improvement was
statistically significant.
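
The abstract does not spell out how PCA is applied or how the tumor location enters the model, so the following is only an illustrative sketch of one possible reading, not the authors' implementation. It assumes each patient is a bag of patch feature vectors already produced by some pre-trained extractor, summarizes the whole bag by its leading principal direction(s), and appends a binary tumor-side indicator. The function names, the `tumor_is_proximal` flag, and the choice of `n_components` are all hypothetical.

```python
import numpy as np


def pca_patient_embedding(patch_features, n_components=1):
    """Summarize ALL patches of one patient with PCA (assumed reading).

    patch_features: array of shape (n_patches, feature_dim).
    Returns a vector of length n_components * feature_dim, so patients
    with different patch counts get embeddings of the same size.
    """
    X = np.asarray(patch_features, dtype=float)
    Xc = X - X.mean(axis=0, keepdims=True)        # center across patches
    # Rows of vt are the principal directions in feature space.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    comps = vt[:n_components].copy()
    # Principal directions are defined up to sign; fix it deterministically
    # by making the largest-magnitude entry of each direction positive.
    for c in comps:
        if c[np.argmax(np.abs(c))] < 0:
            c *= -1.0
    return comps.ravel()


def patient_vector(patch_features, tumor_is_proximal, n_components=1):
    """Concatenate the PCA embedding with a clinical prior (hypothetical).

    tumor_is_proximal: True if the tumor is in the proximal (right) colon.
    """
    emb = pca_patient_embedding(patch_features, n_components)
    side = 1.0 if tumor_is_proximal else 0.0
    return np.concatenate([emb, [side]])


# Usage with synthetic features standing in for real extractor output.
rng = np.random.default_rng(0)
vec_a = patient_vector(rng.normal(size=(50, 8)), tumor_is_proximal=True)
vec_b = patient_vector(rng.normal(size=(120, 8)), tumor_is_proximal=False)
```

Under these assumptions, `vec_a` and `vec_b` have the same length even though the two synthetic patients have different numbers of patches, which is what lets a standard patient-level classifier consume them.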