Siamese Visual Tracking with Residual Fusion Learning.

IEEE Access (2022)

Abstract

Multi-stage feature fusion is an effective way for deep Siamese trackers to improve tracking performance. Unfortunately, conventional fusion approaches, such as weighted averaging, are too simple to combine features with diverse characteristics. In addition, the fusion module is generally optimized jointly with the Siamese network module, which may degrade the performance of the whole tracker. In this paper, we propose SiamRFL, a novel feature fusion network for Siamese trackers that exploits the expressive capacity of residual fusion learning. Specifically, the network takes the deep-layer features as direct input to semantically distinguish the object from the background, and refines the object state with local detail patterns by exploring the shallow-layer features through a residual channel. The classification and regression features are fused separately by deploying multiple fusion units. To avoid the degradation problem, we also present an ensemble training framework for our tracker, in which different loss functions are introduced to optimize the Siamese and fusion modules individually. Compared to the baseline SiamRPN++ tracker, the proposed tracker improves the scores by $0.696\rightarrow 0.709$, $0.285\rightarrow 0.308$, $0.603\rightarrow 0.624$, $0.496\rightarrow 0.520$ and $0.517\rightarrow 0.559$ on the OTB100, VOT2019, UAV123, LaSOT and GOT10k datasets, respectively, outperforming the compared approaches by a clear margin.
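The residual fusion idea described in the abstract can be illustrated with a minimal sketch. It is not the SiamRFL architecture: it assumes flat feature vectors instead of feature maps, and it assumes the residual channel is a single linear map, whereas the abstract does not specify the form of the fusion unit. The names `residual_fusion`, `matvec` and `residual_weights` are hypothetical. The sketch only shows the general pattern: the deep-layer features pass through unchanged as the direct input, and the shallow-layer features contribute an additive correction.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(weights: Matrix, x: Vector) -> Vector:
    """Multiply a (rows x cols) weight matrix by a vector of length cols."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def residual_fusion(deep: Vector, shallow: Vector, residual_weights: Matrix) -> Vector:
    """Fuse features as deep + g(shallow).

    Assumption: g is a plain linear map (residual_weights); the real fusion
    unit in the paper may be a deeper, learned sub-network.
    """
    if len(residual_weights) != len(deep):
        raise ValueError("residual_weights needs one row per deep feature channel")
    correction = matvec(residual_weights, shallow)
    # The deep features are the direct path; the shallow features only refine them.
    return [d + c for d, c in zip(deep, correction)]


# Toy inputs (made up for illustration, not taken from the paper).
deep = [1.0, 2.0]
shallow = [0.5, -1.0, 2.0]

# With zero residual weights the unit reduces to the identity on the deep features.
zero_weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
identity_case = residual_fusion(deep, shallow, zero_weights)

# With non-zero weights the shallow features shift the deep features.
weights = [[1.0, 0.0, 0.5], [0.0, 2.0, 0.0]]
fused = residual_fusion(deep, shallow, weights)
```

One property this sketch makes visible is that a residual fusion unit with zero residual weights returns the deep features unchanged, so the residual branch only has to model the refinement contributed by the shallow features.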

Keywords

Visual tracking, Siamese network, feature fusion, residual learning, ensemble training
