Learning Modality Feature Fusion Via Transformer for RGBT-tracking

INFRARED PHYSICS & TECHNOLOGY(2023)

Abstract
RGB-T tracking can be viewed as multi-view fusion tracking. In this study, we propose a transformer-based network, the Multi-Modal Mutual Propagation Tracker (MMMPT). To obtain a robust appearance model from multi-modal data, we adopt an encoder–decoder architecture to extract information. In the encoding stage, the template features of multiple frames reinforce their common features through the self-attention mechanism, yielding a time-invariant target representation. At the same time, the template features interact with multi-modal data through cross-modal propagation, resulting in a modality-invariant representation of the target. The transformer decoder transfers useful information from the template to the search areas through a similarity matrix. We evaluate the tracker on the RGBT234, GTOT, VTUAV and LasHeR datasets. Extensive experiments indicate that the proposed framework is not inferior to state-of-the-art trackers in terms of robustness and accuracy.
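
The data flow described in the abstract (self-attention across template frames, cross-modal propagation, then a decoder that uses a similarity matrix) can be illustrated with a minimal sketch. This is not the authors' implementation: the token values, the 2-D feature size, the residual additions, and the helper names `similarity`, `attend` and `add` are all assumptions made for illustration, and the learned projections, multi-head attention and normalisation that a real transformer would use are omitted.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def similarity(queries, keys):
    # Row-normalised similarity matrix: softmax(Q K^T / sqrt(d)).
    d = len(keys[0])
    return [
        softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys])
        for q in queries
    ]

def attend(queries, keys, values):
    # Scaled dot-product attention: each output is a weighted mix of values.
    weights = similarity(queries, keys)
    dim = len(values[0])
    return [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(dim)]
        for row in weights
    ]

def add(a, b):
    # Element-wise residual addition of two token lists.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Hypothetical toy tokens: one 2-D token per template frame and modality.
rgb_templates = [[1.0, 0.0], [0.9, 0.1]]
tir_templates = [[0.0, 1.0], [0.1, 0.9]]
search_tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# Encoder, step 1: self-attention across the template frames of each modality.
rgb_enc = add(rgb_templates, attend(rgb_templates, rgb_templates, rgb_templates))
tir_enc = add(tir_templates, attend(tir_templates, tir_templates, tir_templates))

# Encoder, step 2: cross-modal propagation (each modality queries the other).
rgb_fused = add(rgb_enc, attend(rgb_enc, tir_enc, tir_enc))
tir_fused = add(tir_enc, attend(tir_enc, rgb_enc, rgb_enc))

# Decoder: search tokens pull template information via a similarity matrix.
template = rgb_fused + tir_fused
sim = similarity(search_tokens, template)
search_out = add(search_tokens, attend(search_tokens, template, template))
```

Here `sim` has one row per search token and one column per template token, and each row sums to 1, so `search_out` is the search feature plus a convex combination of the fused template features.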
Keywords
RGB-T tracking, Deep learning, Transformer, Challenge-aware, Feature fusion