Object Fusion Tracking for RGB-T Images via Channel Swapping and Modal Mutual Attention

IEEE Sensors Journal (2023)

Abstract
RGB-thermal (RGB-T) dual-modal imaging significantly broadens the observation dimensions of a vision system. However, effectively harnessing the inherent advantages of the different spectral bands and building fusion solutions that are tightly coupled with end tasks remains highly challenging. This article proposes a modality fusion approach for RGB-T tracking that combines channel swapping and cross-modal attention. We explore a hierarchical fusion method adapted to deep features at different abstraction levels. For low-level features, feature channels are swapped between the modalities, which introduces cross-modal information and increases the diversity of the unimodal data at low computational cost. To exploit the semantic representation of high-level deep features and the heterogeneous information in multimodal data, a fusion structure based on modal mutual attention is designed; it enhances the fused RGB-T feature representation by integrating modal self-attention and cross-modal attention. Experimental results on public datasets show that the proposed algorithm is effective and computationally efficient, achieving state-of-the-art tracking performance with real-time processing.
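The two fusion ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say which channels are swapped, how many, or how the attention outputs are combined. The function names, the `ratio` parameter, the choice to swap the leading channels, the absence of learned projections, and the residual-sum-then-concatenate fusion are all assumptions made for illustration.

```python
import numpy as np


def swap_channels(rgb, tir, ratio=0.25):
    """Exchange a fraction of channels between two (C, H, W) feature maps.

    Assumption: the first `ratio` share of channels is swapped; the paper
    may select channels differently.
    """
    if rgb.shape != tir.shape:
        raise ValueError("feature maps must have the same shape")
    k = int(rgb.shape[0] * ratio)
    rgb_out, tir_out = rgb.copy(), tir.copy()
    # Each modality receives the other's first k channels; inputs stay untouched.
    rgb_out[:k], tir_out[:k] = tir[:k], rgb[:k]
    return rgb_out, tir_out


def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attend(query, context):
    """Scaled dot-product attention over (N, D) tokens.

    Assumption: no learned query/key/value projections, to keep the sketch short.
    """
    scores = query @ context.T / np.sqrt(query.shape[-1])
    return softmax(scores) @ context


def mutual_attention(rgb_tokens, tir_tokens):
    """Combine self-attention and cross-modal attention for both modalities.

    Assumption: each branch adds its self- and cross-attended features to the
    input (residual sum), and the two branches are concatenated per token.
    """
    rgb_enh = rgb_tokens + attend(rgb_tokens, rgb_tokens) + attend(rgb_tokens, tir_tokens)
    tir_enh = tir_tokens + attend(tir_tokens, tir_tokens) + attend(tir_tokens, rgb_tokens)
    return np.concatenate([rgb_enh, tir_enh], axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    low_rgb, low_tir = rng.random((8, 4, 4)), rng.random((8, 4, 4))
    mixed_rgb, mixed_tir = swap_channels(low_rgb, low_tir)
    fused = mutual_attention(rng.random((16, 8)), rng.random((16, 8)))
    print(mixed_rgb.shape, mixed_tir.shape, fused.shape)
```

Channel swapping only copies slices, so it adds no parameters and almost no computation, which matches the abstract's motivation for using it on low-level features. The attention step is quadratic in the number of tokens, which is one plausible reason to reserve it for the smaller high-level feature maps.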
Keywords
fusion tracking, channel swapping