CxrMIM: Masked Image Modeling Pre-training Paradigm for Chest X-ray Images Analysis

Zhendong Wang, Haowen Ma, Jianwei Niu

2023 IEEE International Conference on Image Processing, ICIP (2023)

Abstract

Masked image modeling (MIM) is an effective way for Vision Transformers (ViT) to obtain better initializations and representations in natural image analysis: it performs the pretext task of reconstructing images from partial observations without any labels. Several works have adopted different masking strategies to make ViT aggregate contextual information and infer the missing content. However, chest radiographs differ conspicuously from photographic images, and conducting MIM on chest X-rays remains challenging. This paper therefore proposes cxrMIM, a specialized pre-training recipe with a masking strategy for chest radiographs based on their physiological characteristics. In cxrMIM, the out-of-lung region is analyzed first, and the lung region is then reconstructed with the help of mechanical connections and similarities of anatomy and physiology. cxrMIM helps ViT extract the commonalities of pulmonary structures and promotes better performance on downstream tasks. We conducted experiments on the ChestX-ray14 dataset, using advanced self-supervised methods (e.g., MoCo v3, MAE) for comparison. Quantitative and qualitative results indicate that cxrMIM improves the efficiency of Vision Transformers in solving the multi-label thorax disease classification problem, and that a cxrMIM-pretrained Swin-B performs comparably to state-of-the-art CNN models.
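The abstract does not spell out how the mask is built, so the following is only a minimal sketch of one way an anatomy-aware mask could be derived: patches that fall mostly inside the lungs are hidden and become reconstruction targets, while out-of-lung patches stay visible. It assumes a binary lung segmentation mask is already available; the function names `lung_patch_mask` and `split_patches`, the 16-pixel patch size, and the 0.5 coverage threshold are all hypothetical choices for illustration, not values taken from the paper.

```python
import numpy as np


def lung_patch_mask(lung_mask, patch_size=16, min_lung_fraction=0.5):
    """Return a boolean grid where True marks patches to hide (mostly lung).

    Hypothetical helper: the threshold and patch size are assumptions.
    """
    height, width = lung_mask.shape
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be a multiple of patch_size")
    rows, cols = height // patch_size, width // patch_size
    # View the mask as (rows, patch, cols, patch) blocks and average each block
    # to get the fraction of lung pixels per patch.
    blocks = lung_mask.astype(np.float32).reshape(rows, patch_size, cols, patch_size)
    lung_fraction = blocks.mean(axis=(1, 3))
    return lung_fraction >= min_lung_fraction


def split_patches(image, patch_mask, patch_size=16):
    """Split a 2-D image into flattened patches: (visible, masked targets)."""
    rows, cols = patch_mask.shape
    patches = image.reshape(rows, patch_size, cols, patch_size).transpose(0, 2, 1, 3)
    patches = patches.reshape(rows * cols, patch_size * patch_size)
    flat_mask = patch_mask.reshape(-1)
    return patches[~flat_mask], patches[flat_mask]


# Toy data: a random 64x64 "radiograph" with a square stand-in for the lungs.
rng = np.random.default_rng(0)
image = rng.random((64, 64), dtype=np.float32)
lung_mask = np.zeros((64, 64), dtype=bool)
lung_mask[16:48, 16:48] = True

patch_mask = lung_patch_mask(lung_mask)
visible, targets = split_patches(image, patch_mask)
print(patch_mask.astype(int))
print(visible.shape, targets.shape)
```

In an MAE-style setup, the `visible` patches would be fed to the encoder and the `targets` would supervise the decoder's reconstruction loss; whether cxrMIM follows exactly this split is not stated in the abstract.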

Keywords

Chest X-ray images analysis, masked image modeling, self-supervised pre-training
