High-Performance Long-Term Tracking With Meta-Updater

CVPR 2020, pp. 6297-6306


Abstract:

Long-term visual tracking has drawn increasing attention because it is much closer to practical applications than short-term tracking. Most top-ranked long-term trackers adopt offline-trained Siamese architectures and thus cannot benefit from the great progress of short-term trackers with online update. However, it is quite risky to ...

Introduction
  • The study of visual tracking has begun to shift from short-term tracking to large-scale long-term tracking, mainly for two reasons.
  • Deep-learning-based methods have dominated the short-term tracking field [30, 47, 35], from the perspective of either one-shot learning [41, 2, 15, 28, 26, 12, 53, 29] or online learning [37, 10, 8, 21, 40, 7, 49, 50, 9].
  • The risk of online update is amplified for long-term tracking, due to long-term uncertain observations.
Highlights
  • The study of visual tracking has begun to shift from short-term tracking to large-scale long-term tracking, mainly for two reasons.
  • An important reason is that online update is a double-edged sword for tracking.
  • The risk of online update is amplified for long-term tracking, due to long-term uncertain observations.
  • Our long-term tracking framework can benefit from the strength of an online-updated short-term tracker at low risk. Extensive experimental results on the VOT2018LT, VOT2019LT, OxUvALT, TLP, and LaSOT long-term benchmarks show that the proposed method outperforms state-of-the-art trackers by a large margin.
  • We introduce our meta-updater on the basis of an online tracker that outputs a response map in each frame (e.g., ECO [8], ATOM [9]).
  • A novel meta-updater is proposed that integrates geometric, discriminative, and appearance cues in a sequential manner to determine whether the tracker should be updated at the present moment. This substantially reduces the risk of online update for long-term tracking and guides the tracker's update both effectively and efficiently (a minimal sketch of this gating idea follows).
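The gating idea can be made concrete with a short sketch. The code below is not the authors' released implementation: the window length, cue dimensionality, hidden sizes, and the two stacked LSTM layers are illustrative assumptions, whereas the actual model is the cascaded LSTM whose input-output relations are listed in Table 1. What it shows is how a history of per-frame cues (box geometry, response-map confidence, appearance distance) can be mapped to a single "safe to update" probability.

    import numpy as np
    import tensorflow as tf

    TIME_STEPS = 20  # assumed length of the sliding window of recent frames
    CUE_DIM = 7      # assumed per-frame cue vector: box geometry + response confidence + appearance distance

    def build_meta_updater(time_steps=TIME_STEPS, cue_dim=CUE_DIM):
        """Toy stand-in for the meta-updater: it reads a short history of
        per-frame cues and predicts whether updating is reliable right now."""
        return tf.keras.Sequential([
            tf.keras.Input(shape=(time_steps, cue_dim)),
            tf.keras.layers.LSTM(64, return_sequences=True),  # first LSTM stage
            tf.keras.layers.LSTM(64),                          # second stacked stage
            tf.keras.layers.Dense(1, activation="sigmoid"),    # P(update is reliable)
        ])

    meta_updater = build_meta_updater()

    # One hypothetical cue window: [x, y, w, h, response_peak, response_ratio, appearance_dist] per frame.
    cues = np.random.rand(1, TIME_STEPS, CUE_DIM).astype("float32")
    update_prob = float(meta_updater(cues).numpy()[0, 0])
    should_update = update_prob > 0.5  # gate the online update of the short-term tracker

Thresholding a learned probability, rather than the raw response score of a single frame, is the key design choice: the decision is conditioned on a sequence of cues, which is exactly what makes the update gate robust to one-off noisy observations.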
Methods
  • The authors implement the tracker using TensorFlow on a PC with an Intel i9 CPU (64 GB RAM) and an NVIDIA RTX 2080Ti GPU (11 GB memory).
  • The authors evaluate the tracker on five benchmarks: VOT2018LT [23], VOT2019LT [24], OxUvALT [42], TLP [36], and LaSOT [11].
  • The authors first compare the tracker with other state-of-the-art algorithms on the VOT2018LT dataset [23], which contains 35 challenging sequences of diverse objects with a total length of 146,817 frames.
  • The accuracy evaluation on the VOT2018LT dataset [23] mainly includes tracking precision (Pr), tracking recall (Re), and tracking F-score.
  • Different trackers are ranked according to the tracking F-score.
  • The detailed definitions of Pr, Re, and F-score can be found in the official VOT2018 challenge report [23]; a simplified sketch of this computation is given after this list.
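For readers unfamiliar with the long-term protocol, the following is an unofficial, simplified sketch of how a threshold-swept precision/recall/F-score can be computed from per-frame overlaps and tracker confidences. The function name and details are assumptions for illustration; the authoritative definitions and evaluation code live in the VOT toolkit and the challenge report [23].

    import numpy as np

    def lt_pr_re_f(ious, confidences, target_visible):
        """Simplified precision/recall/F-score in the spirit of the VOT long-term
        protocol [23]; an illustrative approximation, not the official toolkit.
        ious[t]           -- overlap between prediction and ground truth in frame t
        confidences[t]    -- tracker confidence that the target is present in frame t
        target_visible[t] -- ground-truth visibility flag for frame t"""
        ious = np.asarray(ious, dtype=float)
        confidences = np.asarray(confidences, dtype=float)
        target_visible = np.asarray(target_visible, dtype=bool)

        best_f, best_pr, best_re = 0.0, 0.0, 0.0
        for tau in np.unique(confidences):            # sweep the presence threshold
            reported = confidences >= tau             # frames where the tracker claims "target present"
            pr = ious[reported].mean() if reported.any() else 0.0
            overlap_if_reported = np.where(reported, ious, 0.0)
            re = overlap_if_reported[target_visible].mean() if target_visible.any() else 0.0
            f = 2.0 * pr * re / (pr + re) if (pr + re) > 0 else 0.0
            if f > best_f:
                best_f, best_pr, best_re = f, pr, re
        return best_pr, best_re, best_f

Trackers are then ranked by the highest F-score reached over all thresholds, which rewards trackers that localize accurately whenever they report the target (Pr) and that keep reporting it whenever it is visible (Re).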
Results
  • Evaluation of iterative steps: Table 7 shows that the performance gradually improves as the iterative step k increases.
  • The authors note that the meta-updater can easily be embedded into other trackers with online learning.
  • To show this good generalization ability, the authors introduce the meta-updater into four tracking algorithms, including ATOM, ECO, RT-MDNet, and the base tracker.
  • Figure 9 shows the tracking performance of different trackers without and with the meta-updater on the LaSOT dataset, and it demonstrates that the proposed meta-updater consistently improves the tracking accuracy of different trackers.
  • The authors conclude that the meta-updater has good generalization ability and consistently improves tracking accuracy almost without sacrificing efficiency (see the wrapper sketch after this list).
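The wrapper below sketches what "embedding the meta-updater into an existing online tracker" can look like. It is a hypothetical interface: track(), update(), and the cue extraction are placeholders standing in for whatever a specific tracker (ATOM, ECO, RT-MDNet, etc.) actually exposes; only the gating logic reflects the idea being evaluated here.

    import numpy as np

    class MetaUpdatedTracker:
        """Hypothetical wrapper that gates the online update of a base tracker
        with a meta-updater; method names are illustrative, not a real API."""

        def __init__(self, base_tracker, meta_updater, threshold=0.5, window=20):
            self.tracker = base_tracker        # assumed to expose track(frame) and update(frame, box)
            self.meta_updater = meta_updater   # maps a cue history to P(update is reliable)
            self.threshold = threshold
            self.window = window
            self.cue_history = []

        def step(self, frame):
            box, response = self.tracker.track(frame)          # localization proceeds as usual
            self.cue_history.append(self._extract_cues(box, response))
            self.cue_history = self.cue_history[-self.window:]
            if self.meta_updater(self.cue_history) > self.threshold:
                self.tracker.update(frame, box)                 # update only when judged safe
            return box

        def _extract_cues(self, box, response):
            # geometric + discriminative cues; an appearance-distance term would be appended here
            return [*box, float(np.max(response))]

Because the gate only decides when to call update(), the wrapper adds almost no per-frame overhead, which is consistent with the speed comparison in Table 8.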
Conclusion
  • This work presents a novel long-term tracking framework built around the proposed meta-updater.
  • The meta-updater integrates geometric, discriminative, and appearance cues in a sequential manner to determine whether the tracker should be updated at the present moment.
  • This method substantially reduces the risk of online update for long-term tracking and guides the tracker's update both effectively and efficiently.
Tables
  • Table 1: Input-output relations of our cascaded LSTM model
  • Table 2: Comparisons of our tracker and 15 state-of-the-art methods on the VOT2018LT dataset [23]. The best three results are shown in red, blue and green colors, respectively. The trackers are ranked from top to bottom according to F-score
  • Table 3: Performance evaluation of our tracker and eight competing algorithms on the VOT2019LT dataset. The best three results are shown in red, blue and green colors, respectively. The trackers are ranked from top to bottom using the F-score measure
  • Table 4: Performance evaluation of our tracker and 13 competing algorithms on the OxUvALT dataset. The best three results are shown in red, blue and green colors, respectively. The trackers are ranked from top to bottom according to the MaxGM values
  • Table 5: Effects of different time steps for our meta-updater
  • Table 6: Effectiveness of different inputs of our meta-updater
  • Table 7: Evaluation of iterative steps for our cascaded LSTM
  • Table 8: Speed comparisons of different trackers without and with meta-updater (MU)
  • Table 9: Effectiveness of our meta-updater for different trackers
Related work
  • 2.1. Long-term Visual Tracking

    Although large-scale long-term tracking benchmarks [23, 42] only began to emerge in 2018, researchers have attached importance to the long-term tracking task for a long time (with keypoint-based [17], proposal-based [54], detector-based [22, 32], and other methods). A classical algorithm is the tracking-learning-detection (TLD) method [22], which addresses long-term tracking as a combination of a local tracker (with forward-backward optical flow) and a global re-detector (with an ensemble of weak classifiers). Following this idea, many researchers [34, 32, 42] attempt to handle the long-term tracking problem with different local trackers and global re-detectors. Among them, the local tracker and the global re-detector can also share the same powerful model [32, 26, 51, 48], equipped with a re-detection scheme (e.g., random search or sliding window). A crucial problem for these trackers is how to switch between the local tracker and the global re-detector. Usually, they use the outputs of the local tracker for self-evaluation, i.e., to determine whether the tracker has lost the target or not; this is risky because the outputs of local trackers are not always reliable and can mislead the switcher (a schematic of this local/global switching loop is sketched below). The MBMD method [51], the winner of VOT2018LT, conducts local/global switching with an additional online-updated deep classifier. This tracker exploits a SiamRPN-based network to regress the target within a local search region, or within every sliding window during re-detection. The recent SPLT method [48] utilizes the same SiamRPN-based network as [51] for tracking and re-detection, replaces the online verifier in [51] with an offline-trained matching network, and speeds up the tracker with a proposed skimming module. A curious phenomenon is that most top-ranked long-term trackers (such as MBMD [51], SPLT [48], and SiamRPN++ [26]) have not adopted excellent online-updated trackers (e.g., ECO [8], ATOM [9]) for local tracking. One underlying reason is that the risk of online update is amplified for long-term tracking, caused by long-term uncertain observations. In this work, we attempt to address this dilemma by designing a high-performance long-term tracker with a meta-updater.
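    As a rough illustration of the local-tracking/re-detection pattern shared by TLD-style methods and their descendants, the loop below shows where the risky self-evaluation step sits. All components, method names, and the confidence threshold are placeholders for this sketch, not the implementation of any particular tracker discussed above.

        def long_term_track(frames, local_tracker, global_redetector, verifier, conf_thresh=0.5):
            """Schematic local/global switching loop; the confidence score is the
            self-evaluation signal that decides when the local tracker has lost the target."""
            results, lost = [], False
            for frame in frames:
                if not lost:
                    box, score = local_tracker.track(frame)        # search around the previous location
                else:
                    candidates = global_redetector.detect(frame)   # image-wide re-detection
                    box, score = verifier.pick_best(frame, candidates)
                lost = score < conf_thresh                         # unreliable scores make this switch risky
                if not lost:
                    local_tracker.reset(frame, box)                # re-anchor the local tracker
                results.append(box if not lost else None)
            return results

    The meta-updater proposed in this paper targets a complementary failure mode of such pipelines: even when switching works, blindly updating the local tracker's model on uncertain observations degrades it over time.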
Funding
  • This work is supported in part by the National Natural Science Foundation of China under Grant Nos. 61872056, 61771088, 61725202, and U1903215, in part by the National Key R&D Program of China under Grant No. 2018AAA0102001, and in part by the Fundamental Research Funds for the Central Universities under Grant No. DUT19GJ201.
References
  • [1] Luca Bertinetto, Jack Valmadre, Stuart Golodetz, Ondrej Miksik, and Philip H. S. Torr. Staple: Complementary learners for real-time tracking. In CVPR, 2016.
  • [2] Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, and Philip H. S. Torr. Fully-convolutional siamese networks for object tracking. In ECCV Workshops, 2016.
  • [3] Goutam Bhat, Martin Danelljan, Luc Van Gool, and Radu Timofte. Learning discriminative model prediction for tracking. In ICCV, 2019.
  • [4] Kai Chen, Jiaqi Wang, Jiangmiao Pang, Yuhang Cao, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jiarui Xu, Zheng Zhang, Dazhi Cheng, Chenchen Zhu, Tianheng Cheng, Qijie Zhao, Buyu Li, Xin Lu, Rui Zhu, Yue Wu, Jifeng Dai, Jingdong Wang, Jianping Shi, Wanli Ouyang, Chen Change Loy, and Dahua Lin. MMDetection: Open MMLab detection toolbox and benchmark. arXiv preprint arXiv:1906.07155, 2019.
  • [5] Janghoon Choi, Junseok Kwon, and Kyoung Mu Lee. Deep meta learning for real-time target-aware visual tracking. In ICCV, 2019.
  • [6] Dorin Comaniciu, Visvanathan Ramesh, and Peter Meer. Kernel-based object tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(5):564–577, 2003.
  • [7] Kenan Dai, Dong Wang, Huchuan Lu, Chong Sun, and Jianhua Li. Visual tracking via adaptive spatially-regularized correlation filters. In CVPR, 2019.
  • [8] Martin Danelljan, Goutam Bhat, Fahad Shahbaz Khan, and Michael Felsberg. ECO: Efficient convolution operators for tracking. In CVPR, 2017.
  • [9] Martin Danelljan, Goutam Bhat, Fahad Shahbaz Khan, and Michael Felsberg. ATOM: Accurate tracking by overlap maximization. In CVPR, 2019.
  • [10] Martin Danelljan, Andreas Robinson, Fahad Shahbaz Khan, and Michael Felsberg. Beyond correlation filters: Learning continuous convolution operators for visual tracking. In ECCV, 2016.
  • [11] Heng Fan, Liting Lin, Fan Yang, Peng Chu, Ge Deng, Sijia Yu, Hexin Bai, Yong Xu, Chunyuan Liao, and Haibin Ling. LaSOT: A high-quality benchmark for large-scale single object tracking. In CVPR, 2019.
  • [12] Heng Fan and Haibin Ling. Siamese cascaded region proposal networks for real-time visual tracking. In CVPR, 2019.
  • [13] Hamed Kiani Galoogahi, Ashton Fagg, and Simon Lucey. Learning background-aware correlation filters for visual tracking. In ICCV, 2017.
  • [14] Alex Graves. Supervised Sequence Labelling with Recurrent Neural Networks, volume 385 of Studies in Computational Intelligence. Springer, 2012.
  • [15] Anfeng He, Chong Luo, Xinmei Tian, and Wenjun Zeng. A twofold siamese network for real-time object tracking. In CVPR, 2018.
  • [16] João F. Henriques, Rui Caseiro, Pedro Martins, and Jorge Batista. High-speed tracking with kernelized correlation filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(3):583–596, 2015.
  • [17] Zhibin Hong, Zhe Chen, Chaohui Wang, Xue Mei, Danil Prokhorov, and Dacheng Tao. MUlti-Store Tracker (MUSTer): A cognitive psychology inspired approach to object tracking. In CVPR, 2015.
  • [18] Jianglei Huang and Wengang Zhou. Re2EMA: Regularized and reinitialized exponential moving average for target model update in object tracking. In AAAI, 2019.
  • [19] Lianghua Huang, Xin Zhao, and Kaiqi Huang. GlobalTrack: A simple and strong baseline for long-term tracking. In AAAI, 2020.
  • [20] Borui Jiang, Ruixuan Luo, Jiayuan Mao, Tete Xiao, and Yuning Jiang. Acquisition of localization confidence for accurate object detection. In ECCV, 2018.
  • [21] Ilchae Jung, Jeany Son, Mooyeol Baek, and Bohyung Han. Real-time MDNet. In ECCV, 2018.
  • [22] Zdenek Kalal, Krystian Mikolajczyk, and Jiri Matas. Tracking-learning-detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(7):1409–1422, 2012.
  • [23] Matej Kristan, Ales Leonardis, Jiri Matas, Michael Felsberg, Roman Pflugfelder, Luka Cehovin Zajc, Tomas Vojir, Goutam Bhat, Alan Lukezic, Abdelrahman Eldesokey, Gustavo Fernandez, et al. The sixth visual object tracking VOT2018 challenge results. In ECCV Workshops, 2018.
  • [24] Matej Kristan, Jiri Matas, Ales Leonardis, Michael Felsberg, Roman Pflugfelder, Joni-Kristian Kamarainen, Luka Cehovin Zajc, Ondrej Drbohlav, Alan Lukezic, Amanda Berg, Abdelrahman Eldesokey, Jani Kapyla, and Gustavo Fernandez. The seventh visual object tracking VOT2019 challenge results. In ICCV Workshops, 2019.
  • [25] Hankyeol Lee, Seokeon Choi, and Changick Kim. A memory model based on the siamese network for long-term tracking. In ECCV Workshops, 2018.
  • [26] Bo Li, Wei Wu, Qiang Wang, Fangyi Zhang, Junliang Xing, and Junjie Yan. SiamRPN++: Evolution of siamese visual tracking with very deep networks. In CVPR, 2019.
  • [27] Bi Li, Wenxuan Xie, Wenjun Zeng, and Wenyu Liu. Learning to update for object tracking with recurrent meta-learner. IEEE Transactions on Image Processing, 28(7):3624–3635, 2019.
  • [28] Bo Li, Junjie Yan, Wei Wu, Zheng Zhu, and Xiaolin Hu. High performance visual tracking with siamese region proposal network. In CVPR, 2018.
  • [29] Peixia Li, Boyu Chen, Wanli Ouyang, Dong Wang, Xiaoyun Yang, and Huchuan Lu. GradNet: Gradient-guided network for visual object tracking. In ICCV, 2019.
  • [30] Peixia Li, Dong Wang, Lijun Wang, and Huchuan Lu. Deep visual tracking: Review and experimental comparison. Pattern Recognition, 76:323–338, 2018.
  • [31] Pengpeng Liang, Erik Blasch, and Haibin Ling. Encoding color information for visual tracking: Algorithms and benchmark. IEEE Transactions on Image Processing, 24(12):5630–5644, 2015.
  • [32] Alan Lukezic, Luka Cehovin Zajc, Tomas Vojir, Jiri Matas, and Matej Kristan. FCLT - A fully-correlational long-term tracker. In ACCV, 2018.
  • [33] Hao Luo, Youzhi Gu, Xingyu Liao, Shenqi Lai, and Wei Jiang. Bag of tricks and a strong baseline for deep person re-identification. In CVPR, 2019.
  • [34] Chao Ma, Xiaokang Yang, Chongyang Zhang, and Ming-Hsuan Yang. Long-term correlation tracking. In CVPR, 2015.
  • [35] Seyed Mojtaba Marvasti-Zadeh, Li Cheng, Hossein Ghanei-Yakhdan, and Shohreh Kasaei. Deep learning for visual tracking: A comprehensive survey. arXiv preprint arXiv:1912.00535, 2019.
  • [36] Abhinav Moudgil and Vineet Gandhi. Long-term visual object tracking benchmark. In ACCV, 2018.
  • [37] Hyeonseob Nam and Bohyung Han. Learning multi-domain convolutional neural networks for visual tracking. In CVPR, 2016.
  • [38] Eunbyung Park and Alexander C. Berg. Meta-tracker: Fast and robust online adaptation for visual object trackers. In ECCV, 2018.
  • [39] David A. Ross, Jongwoo Lim, Ruei-Sung Lin, and Ming-Hsuan Yang. Incremental learning for robust visual tracking. International Journal of Computer Vision, 77(1-3):125–141, 2008.
  • [40] Chong Sun, Dong Wang, Huchuan Lu, and Ming-Hsuan Yang. Correlation tracking via joint discrimination and reliability learning. In CVPR, 2018.
  • [41] Ran Tao, Efstratios Gavves, and Arnold W. M. Smeulders. Siamese instance search for tracking. In CVPR, 2016.
  • [42] Jack Valmadre, Luca Bertinetto, João F. Henriques, Ran Tao, Andrea Vedaldi, Arnold W. M. Smeulders, Philip H. S. Torr, and Efstratios Gavves. Long-term tracking in the wild: A benchmark. In ECCV, 2018.
  • [43] Dong Wang, Huchuan Lu, and Ming-Hsuan Yang. Online object tracking with sparse prototypes. IEEE Transactions on Image Processing, 22(1):314–325, 2013.
  • [44] Mengmeng Wang, Yong Liu, and Zeyi Huang. Large margin object tracking with circulant feature maps. In CVPR, 2017.
  • [45] Qiang Wang, Li Zhang, Luca Bertinetto, Weiming Hu, and Philip H. S. Torr. Fast online object tracking and segmentation: A unifying approach. In CVPR, 2019.
  • [46] Yi Wu, Jongwoo Lim, and Ming-Hsuan Yang. Object tracking benchmark. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(9):1834–1848, 2015.
  • [47] Bin Yan, Dong Wang, Huchuan Lu, and Xiaoyun Yang. Cooling-Shrinking Attack: Blinding the tracker with imperceptible noises. In CVPR, 2020.
  • [48] Bin Yan, Haojie Zhao, Dong Wang, Huchuan Lu, and Xiaoyun Yang. Skimming-Perusal Tracking: A framework for real-time and robust long-term tracking. In ICCV, 2019.
  • [49] Tianzhu Zhang, Si Liu, Changsheng Xu, Bin Liu, and Ming-Hsuan Yang. Correlation particle filter for visual tracking. IEEE Transactions on Image Processing, 27(6):2676–2687, 2018.
  • [50] Tianzhu Zhang, Changsheng Xu, and Ming-Hsuan Yang. Learning multi-task correlation particle filters for visual tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(2):365–378, 2019.
  • [51] Yunhua Zhang, Dong Wang, Lijun Wang, Jinqing Qi, and Huchuan Lu. Learning regression and verification networks for long-term visual tracking. arXiv preprint arXiv:1809.04320, 2018.
  • [52] Yunhua Zhang, Lijun Wang, Jinqing Qi, Dong Wang, Mengyang Feng, and Huchuan Lu. Structured siamese network for real-time visual tracking. In ECCV, 2018.
  • [53] Zhipeng Zhang and Houwen Peng. Deeper and wider siamese networks for real-time visual tracking. In CVPR, 2019.
  • [54] Gao Zhu, Fatih Porikli, and Hongdong Li. Beyond local search: Tracking objects everywhere with instance-specific proposals. In CVPR, 2016.
  • [55] Zheng Zhu, Qiang Wang, Bo Li, Wei Wu, Junjie Yan, and Weiming Hu. Distractor-aware siamese networks for visual object tracking. In ECCV, 2018.