Multi-image transformer for multi-focus image fusion

Signal Processing: Image Communication (2023)

Abstract
Multi-Focus Image Fusion (MFIF) is an image enhancement task that fuses images in which different regions are in focus to produce an all-in-focus image. In recent years, approaches based on Generative Adversarial Networks (GANs) have significantly improved MFIF on Convolutional Neural Network (CNN) architectures. However, although vision transformers (ViTs) have outperformed CNNs in many high- and low-level vision problems thanks to their ability to provide global connectivity, they had not been employed for MFIF before this study. Inspired by the Spatial-Temporal Transformer Network (STTN), we develop a Multi-image Transformer (MiT) for MFIF so that global connections can be modeled across multiple input images. We call the proposed transformer-based MFIF model MiT-MFIF, as it uses the developed MiT as its core component. We make several modifications to the baseline transformer so that vision transformers can be used in MFIF tasks. Comprehensive experiments on standard MFIF datasets demonstrate the effectiveness of the proposed MiT-MFIF. Moreover, unlike competing GAN-based methods, the proposed method does not require any post-processing step, while outperforming state-of-the-art MFIF methods.
Keywords
Multi-focus image fusion, Vision transformers, Defocus spread effect, Generative adversarial networks