EfficientMorph: Parameter-Efficient Transformer-Based Architecture for 3D Image Registration
CoRR (2024)
Abstract
Transformers have emerged as the state-of-the-art architecture in medical
image registration, outperforming convolutional neural networks (CNNs) by
addressing their limited receptive fields and overcoming gradient instability
in deeper models. Despite their success, transformer-based models require
substantial resources for training, including data, memory, and computational
power, which may restrict their applicability for end users with limited
resources. In particular, existing transformer-based 3D image registration
architectures face three critical gaps that challenge their efficiency and
effectiveness. Firstly, while mitigating the quadratic complexity of full
attention by focusing on local regions, window-based attention mechanisms often
fail to adequately integrate local and global information. Secondly, the
feature similarities recently observed across heads in multi-head attention
architectures indicate significant computational redundancy, suggesting that
the network's capacity could be better utilized to enhance performance.
Lastly, the granularity of tokenization, a key factor in
registration accuracy, presents a trade-off; smaller tokens improve detail
capture at the cost of higher computational complexity, increased memory
demands, and a risk of overfitting. Here, we propose EfficientMorph, a
transformer-based architecture for unsupervised 3D image registration. It
optimizes the balance between local and global attention through a plane-based
attention mechanism, reduces computational redundancy via cascaded group
attention, and captures fine details without compromising computational
efficiency, thanks to a Hi-Res tokenization strategy complemented by merging
operations. Notably, EfficientMorph sets a new benchmark for performance on the
OASIS dataset with 16-27x fewer parameters.
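
The tokenization trade-off and the quadratic cost of full attention described above can be made concrete with a little arithmetic. The sketch below is only an illustration, not EfficientMorph's implementation: the volume shape (160, 192, 224), the patch sizes 4 and 2, and the 64-token window are assumed values, and the helper names are hypothetical.

```python
from math import prod


def num_tokens(volume_shape, patch_size):
    """Count the non-overlapping cubic patches (tokens) in a 3D volume."""
    return prod(dim // patch_size for dim in volume_shape)


def full_attention_pairs(n_tokens):
    """Query-key pairs scored by full self-attention (quadratic in tokens)."""
    return n_tokens ** 2


def window_attention_pairs(n_tokens, window_tokens):
    """Query-key pairs when each token attends only within its own window."""
    return n_tokens * window_tokens


if __name__ == "__main__":
    volume = (160, 192, 224)  # assumed volume shape, for illustration only
    window = 4 ** 3           # assumed window of 4 x 4 x 4 = 64 tokens
    for patch in (4, 2):      # assumed coarse and fine patch sizes
        n = num_tokens(volume, patch)
        print(
            f"patch={patch}: tokens={n}, "
            f"full pairs={full_attention_pairs(n)}, "
            f"window pairs={window_attention_pairs(n, window)}"
        )
```

Under these assumptions, halving the patch edge multiplies the token count by 8 and the full-attention pair count by 64, while the windowed pair count grows only linearly with the number of tokens. This is the tension the abstract points to: finer tokens capture more detail but are costly, and windowing lowers the cost by restricting each token's view to a local region.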