Leveraging Large-Scale Weakly Labeled Data for Semi-Supervised Mass Detection in Mammograms

2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021)

Abstract
Mammographic mass detection is an integral part of a computer-aided diagnosis system. Annotating a large number of mammograms at the pixel level to train a mass detection model in a fully supervised fashion is costly and time-consuming. This paper presents a novel self-training framework for semi-supervised mass detection with soft image-level labels generated from diagnosis reports by Mammo-RoBERTa, a RoBERTa-based natural language processing model fine-tuned on the fully labeled data and associated mammography reports. Starting with a fully supervised model trained on the data with pixel-level masks, the proposed framework iteratively refines the model using the entire weakly labeled dataset (image-level soft labels) in a self-training fashion. A novel sample selection strategy is proposed to identify the most informative samples for each iteration, based on the current model output and the soft labels of the weakly labeled data. A soft cross-entropy loss and a soft focal loss are also designed to serve as the image-level and pixel-level classification losses, respectively. Our experimental results show that the proposed semi-supervised framework improves mass detection accuracy over the supervised baseline and outperforms the previous state-of-the-art semi-supervised approaches that use weakly labeled data, in some cases by a large margin.
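The abstract names a soft cross-entropy loss for the image-level labels but does not give its formula. As a rough illustration only, the sketch below uses the common formulation of cross-entropy against a soft target distribution; it is an assumption, not the paper's definition. The function name `soft_cross_entropy`, the `eps` clamp, and the example probabilities are all hypothetical.

```python
import math


def soft_cross_entropy(probs, soft_targets, eps=1e-12):
    """Cross-entropy between predicted class probabilities and soft targets.

    Assumed formulation (not taken from the paper): -sum_c t_c * log(p_c),
    where t is a soft label distribution instead of a one-hot vector.
    """
    if len(probs) != len(soft_targets):
        raise ValueError("probs and soft_targets must have the same length")
    # Clamp probabilities away from zero so log() stays finite.
    return -sum(t * math.log(max(p, eps)) for p, t in zip(probs, soft_targets))


# Hypothetical soft label: a report model that is 80% confident a mass is present.
soft_label = [0.2, 0.8]        # [P(no mass), P(mass)]
agreeing_pred = [0.1, 0.9]     # detector output that agrees with the report
disagreeing_pred = [0.9, 0.1]  # detector output that contradicts the report

loss_agree = soft_cross_entropy(agreeing_pred, soft_label)
loss_disagree = soft_cross_entropy(disagreeing_pred, soft_label)
```

Under this formulation, a prediction that agrees with the soft label yields a lower loss than one that contradicts it, and a one-hot target reduces the expression to ordinary cross-entropy.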
Keywords
large-scale weakly labeled data, semi-supervised mass detection, mammograms, mammographic mass detection, computer-aided diagnosis system, mass detection model, fully supervised fashion, self-training framework, soft image-level labels, diagnosis reports, RoBERTa-based natural language processing model, fully labeled data, fully supervised model, pixel-level masks, entire weakly labeled data, image-level soft label, self-training fashion, current model output, soft labels, soft cross-entropy loss, soft focal loss, pixel-level classification loss, semi-supervised framework, mass detection accuracy, supervised baseline, previous state-of-the-art semi-supervised approaches