
Triplet Loss In Siamese Network For Object Tracking

COMPUTER VISION - ECCV 2018, PT XIII, (2018): 472-488

Cited by 277
Abstract

Object tracking remains a critical and challenging problem with many applications in computer vision. To address this challenge, more and more researchers are applying deep learning to obtain powerful features for better tracking accuracy. In this paper, a novel triplet loss is proposed to extract expressive deep features for object tra…

Introduction
  • Object tracking, including single-object tracking [8, 9] and multi-object tracking [24, 25], remains an important problem with many applications, such as automated surveillance and vehicle navigation [34].
  • Selecting powerful features is one of the key steps to improving tracking accuracy.
  • This strategy has been widely used in many correlation filter (CF) based trackers.
  • Pre-trained deep networks are applied to extract features from raw images to improve accuracy, as in DeepSRDCF [6], CCOT [7], MCPF [36], and ECO [4].
  • Besides CF trackers, some deep learning…
Highlights
  • Object tracking, including single-object tracking [8, 9] and multi-object tracking [24, 25], remains an important problem with many applications, such as automated surveillance and vehicle navigation [34]
  • We have proposed a novel triplet loss to obtain more powerful features for object tracking by applying it to a Siamese network
  • We have shown the effectiveness of the proposed triplet loss in both theory and experiments
  • We found that when the network outputs wrong similarity scores, the triplet loss provides larger absolute gradients for feedback in back-propagation
  • We added this triplet loss to three baseline trackers based on Siamese networks for experiments
  • The results on popular tracking benchmarks show that our triplet loss can improve performance without reducing speed for these baselines
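The pairwise triplet loss described in the highlights can be sketched as follows. This is a minimal NumPy illustration of a probabilistic triplet loss over the positive and negative scores of a Siamese score map; the function name and shapes are assumptions for illustration, not the authors' code:

```python
import numpy as np

def triplet_loss(pos_scores, neg_scores):
    """Probabilistic triplet loss over a Siamese similarity score map.

    pos_scores: scores at positive (object) locations, shape (M,)
    neg_scores: scores at negative (background) locations, shape (N,)
    Each positive is paired with each negative; the loss pushes every
    positive score above every negative score.
    """
    vp = pos_scores[:, None]              # (M, 1)
    vn = neg_scores[None, :]              # (1, N)
    # matched probability p = e^vp / (e^vp + e^vn), in log space:
    # log p = -log(1 + exp(vn - vp)), computed stably with logaddexp
    log_p = -np.logaddexp(0.0, vn - vp)   # (M, N) pairwise terms
    return -log_p.mean()

# correctly ranked scores give a small loss; wrongly ranked, a large one
low = triplet_loss(np.array([2.0, 1.5]), np.array([-1.0, -0.5]))
high = triplet_loss(np.array([-1.0, -0.5]), np.array([2.0, 1.5]))
```

Averaging over all positive–negative pairs is what lets the loss exploit the many score combinations that a single logistic term over individual scores ignores.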
Method
  • Experiments on baseline trackers

    To validate the effectiveness of the triplet loss, the authors compare the baseline trackers (SiamFC [2], CFnet, and SiamImp [28]) against their different variants: SiamFC-init, CFnet2-init, SiamImp-init, SiamFC-tri, CFnet2-tri, and SiamImp-tri.
  • As shown in Fig. 3, directly training for more epochs with the logistic loss reduces the precision and AUC of most baseline trackers except CFnet2
  • This indicates that the logistic loss cannot enhance the representation power of the original networks through more training iterations.
  • The corresponding results in Fig. 3 show that the triplet loss improves performance in terms of both precision and overlap success rate for all the baseline trackers
  • It is worth mentioning that all of the variants with the triplet loss operate at almost the same high speed as their baselines
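The gradient behavior contrasted above can be illustrated numerically. Assuming the common formulations (a logistic loss log(1 + exp(-v)) per positive score, and a pairwise triplet term log(1 + exp(vn - vp))), the logistic gradient with respect to a positive score ignores the negatives entirely, while the triplet gradient grows with how wrongly the pair is ranked. The exact loss definitions here are this sketch's assumption, not a transcription of the paper:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def logistic_grad_vp(vp):
    # logistic loss log(1 + exp(-vp)) for a positive score:
    # the gradient depends on vp alone, never on the negative scores
    return -sigmoid(-vp)

def triplet_grad_vp(vp, vn):
    # pairwise triplet term log(1 + exp(vn - vp)):
    # the gradient grows as the negative outscores the positive
    return -sigmoid(vn - vp)

# a badly wrong output: background (vn) scores higher than the object (vp)
vp, vn = -1.0, 3.0
g_log = abs(logistic_grad_vp(vp))     # fixed by vp alone
g_tri = abs(triplet_grad_vp(vp, vn))  # amplified by the vn - vp gap
```

In this wrongly ranked case the triplet gradient magnitude exceeds the logistic one, matching the feedback behavior the bullets describe.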
Results
  • The authors show experimental results on several popular tracking benchmarks, including OTB-2013 [32], OTB-100 [33], and VOT-2017 [15].
  • Various comparisons on these benchmarks evaluate the proposed triplet loss, including experiments on the baselines and comparisons between the proposed trackers and other state-of-the-art trackers.
  • The variant with 2 convolutional layers (CFnet2) obtains high speed with only slightly lower performance than the best.
  • The authors' SiamFCT achieves the best EAO among all the compared trackers
  • Another variant with the triplet loss, SiamImT, ranks 4th among all the trackers
Conclusion
  • The authors have proposed a novel triplet loss to obtain more powerful features for object tracking by applying it to a Siamese network.
  • The authors found that when the network outputs wrong similarity scores, the triplet loss provides larger absolute gradients for feedback in back-propagation.
  • The authors added this triplet loss to three baseline trackers based on Siamese networks for experiments.
  • The results on popular tracking benchmarks show that the triplet loss can improve performance without reducing speed for these baselines
Tables
  • Table 1: EAO scores of the VOT-2017 real-time challenge for the improved trackers SiamFCT, CFnet2T, and SiamImT; their baselines SiamFC [2], CFnet2, and SiamImp [28]; the recent tracker PTAV [10]; and the other top 9 trackers in VOT-2017 [15]
Related Work
  • Trackers with Siamese networks: With the development of deep learning in recent years, many classical networks have been introduced into object tracking, such as Siamese networks [27], [2], [28]. Tao et al. [27] trained a Siamese network to learn a matching function in an off-line phase. In the online tracking phase, the learned matching function is applied to find the patch in a new frame most similar to the initial patch of the object in the first frame. This Siamese INstance search Tracker (SINT) performs well on OTB-2013 [32], but its speed is only 2 fps. To improve running speed, Bertinetto et al. [2] omitted the fully connected layers to reduce computation and applied only 5 convolutional layers to train an end-to-end fully-convolutional Siamese network (SiamFC) as a similarity function. The similarity function is then applied directly for online tracking without complex fine-tuning strategies. As a result, SiamFC achieves frame rates beyond real-time, nearly 86 fps with a GPU. Another related tracker, CFnet [28], treats the correlation filter as a network layer to compute the similarity between the convolutional features generated by the Siamese network. This enables the learned deep features to be tightly coupled to the correlation filter. The experimental results show that 2 convolutional layers with a CF layer in the Siamese network (CFnet2) achieve comparable performance and speed (75 fps) compared with SiamFC's 5 convolutional layers. In addition, CFnet proposes an improved Siamese network (SiamImp) by modifying the structure of some convolutional layers of SiamFC [2]. SiamImp outperforms SiamFC in tracking accuracy on OTB-2013 and OTB-100, but it operates at a lower speed, nearly 52 fps.
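The cross-correlation at the heart of SiamFC can be sketched as follows: the exemplar's embedding is slid over the search region's embedding to produce a similarity score map. This is a naive NumPy loop for clarity (real trackers implement it as a single batched convolution); the names and shapes are illustrative, not the SiamFC code:

```python
import numpy as np

def cross_correlation_score(exemplar_feat, search_feat):
    """Slide the exemplar embedding over the search embedding.

    exemplar_feat: (C, h, w) features of the target template
    search_feat:   (C, H, W) features of the larger search region
    Returns an (H-h+1, W-w+1) similarity map whose peak marks the
    predicted object location.
    """
    C, h, w = exemplar_feat.shape
    _, H, W = search_feat.shape
    score = np.empty((H - h + 1, W - w + 1))
    for i in range(score.shape[0]):
        for j in range(score.shape[1]):
            patch = search_feat[:, i:i + h, j:j + w]
            score[i, j] = np.sum(patch * exemplar_feat)
    return score

# plant the exemplar inside an otherwise empty search region:
# the score map should peak exactly where it was planted
np.random.seed(0)
z = np.random.randn(4, 6, 6)
x = np.zeros((4, 17, 17))
x[:, 5:11, 5:11] = z
peak = np.unravel_index(np.argmax(cross_correlation_score(z, x)), (12, 12))
```

Because this operation is fully convolutional, one forward pass scores every candidate location at once, which is what lets SiamFC-style trackers run beyond real-time.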
Funding
  • This work was supported in part by the Beijing Natural Science Foundation under Grant 4182056, and the Fok Ying-Tong Education Foundation for Young Teachers under Grant 141067
Study Subjects and Analysis
Pairs: 53,200
SiamFC [2] and CFnet [28]: the authors randomly sample 53,200 pairs from the ILSVRC15 dataset [22] as one training epoch and train for 10 epochs. At each epoch, 10% of the pairs are held out as a validation set. The final network used for testing is chosen from the models saved at the end of each epoch, by the minimal mean error of distance (as presented in [2]) on the validation set.
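The epoch and validation procedure above can be sketched as a generic model-selection loop. All the names and the callback interface here are hypothetical (the authors trained with MatConvNet [29], not this API):

```python
import random

def select_best_model(pairs, train_epoch, val_error,
                      num_epochs=10, val_frac=0.1):
    """Train for several epochs and keep the checkpoint with the
    lowest mean validation error, mirroring the setup described above.

    pairs:       list of (exemplar, search) training pairs
    train_epoch: callback, train_epoch(train_pairs) -> trained model
    val_error:   callback, val_error(model, val_pairs) -> mean error
    """
    best_model, best_err = None, float("inf")
    for _ in range(num_epochs):
        random.shuffle(pairs)
        n_val = int(len(pairs) * val_frac)        # hold out 10% per epoch
        val_pairs, train_pairs = pairs[:n_val], pairs[n_val:]
        model = train_epoch(train_pairs)
        err = val_error(model, val_pairs)
        if err < best_err:                        # keep the best checkpoint
            best_model, best_err = model, err
    return best_model
```

Selecting by validation error rather than simply taking the last epoch guards against the overfitting that extra training iterations can introduce.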

References
  • Bertinetto, L., Valmadre, J., Golodetz, S., Miksik, O., Torr, P.H.: Staple: Complementary learners for real-time tracking. In: IEEE CVPR. pp. 1401–1409 (2016)
  • Bertinetto, L., Valmadre, J., Henriques, J.F., Vedaldi, A., Torr, P.H.: Fully-convolutional siamese networks for object tracking. In: ECCV. pp. 850–865. Springer (2016)
  • Cheng, D., Gong, Y., Zhou, S., Wang, J., Zheng, N.: Person re-identification by multi-channel parts-based CNN with improved triplet loss function. In: IEEE CVPR. pp. 1335–1344 (2016)
  • Danelljan, M., Bhat, G., Khan, F.S., Felsberg, M.: ECO: Efficient convolution operators for tracking. In: IEEE CVPR (2017)
  • Danelljan, M., Hager, G., Khan, F., Felsberg, M.: Accurate scale estimation for robust visual tracking. In: BMVC (2014)
  • Danelljan, M., Hager, G., Khan, F.S., Felsberg, M.: Learning spatially regularized correlation filters for visual tracking. In: IEEE ICCV. pp. 4310–4318 (2015)
  • Danelljan, M., Robinson, A., Shahbaz Khan, F., Felsberg, M.: Beyond correlation filters: Learning continuous convolution operators for visual tracking. In: ECCV (2016)
  • Dong, X., Shen, J., Wang, W., Liu, Y., Shao, L., Porikli, F.: Hyperparameter optimization for tracking with continuous deep q-learning. In: IEEE CVPR. pp. 518–527 (2018)
  • Dong, X., Shen, J., Yu, D., Wang, W., Liu, J., Huang, H.: Occlusion-aware real-time object tracking. IEEE Transactions on Multimedia 19(4), 763–771 (2017)
  • Fan, H., Ling, H.: Parallel tracking and verifying: A framework for real-time and high accuracy visual tracking. In: IEEE ICCV (2017)
  • Henriques, J.F., Caseiro, R., Martins, P., Batista, J.: Exploiting the circulant structure of tracking-by-detection with kernels. In: ECCV. pp. 702–715. Springer (2012)
  • Henriques, J.F., Caseiro, R., Martins, P., Batista, J.: High-speed tracking with kernelized correlation filters. IEEE Transactions on Pattern Analysis and Machine Intelligence 37(3), 583–596 (2015)
  • Hermans, A., Beyer, L., Leibe, B.: In defense of the triplet loss for person re-identification. arXiv preprint arXiv:1703.07737 (2017)
  • Hoffer, E., Ailon, N.: Deep metric learning using triplet network. In: International Workshop on Similarity-Based Pattern Recognition. pp. 84–92. Springer (2015)
  • Kristan, M., Leonardis, A., Matas, J., Felsberg, M., Pflugfelder, R., Cehovin Zajc, L., Vojir, T., Hager, G., Lukezic, A., Eldesokey, A., Fernandez, G.: The visual object tracking VOT2017 challenge results. In: IEEE ICCV Workshops (2017)
  • Kristan, M., Leonardis, A., Matas, J., Felsberg, M., Pflugfelder, R., Cehovin Zajc, L., Vojir, T., Hager, G., Lukezic, A., Fernandez, G.: The visual object tracking VOT2016 challenge results. Springer (2016), http://www.springer.com/gp/book/9783319488806
  • Kristan, M., Matas, J., Leonardis, A., Vojir, T., Pflugfelder, R., Fernandez, G., Nebehay, G., Porikli, F., Cehovin, L.: A novel performance evaluation methodology for single-target trackers. IEEE Transactions on Pattern Analysis and Machine Intelligence 38(11), 2137–2155 (2016)
  • Kristan, M., Matas, J., Leonardis, A., Felsberg, M., Cehovin, L., et al.: The visual object tracking VOT2015 challenge results. In: ICCV Workshops (2015)
  • Li, Y., Zhu, J.: A scale adaptive kernel correlation filter tracker with feature integration. In: ECCV Workshops. pp. 254–265 (2014)
  • Nam, H., Han, B.: Learning multi-domain convolutional neural networks for visual tracking. In: IEEE CVPR (2016)
  • Ning, J., Yang, J., Jiang, S., Zhang, L., Yang, M.H.: Object tracking via dual linear structured SVM and explicit feature map. In: IEEE CVPR. pp. 4266–4274 (2016)
  • Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet large scale visual recognition challenge. International Journal of Computer Vision 115(3), 211–252 (2015)
  • Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: A unified embedding for face recognition and clustering. In: IEEE CVPR. pp. 815–823 (2015)
  • Shen, J., Liang, Z., Liu, J., Sun, H., Shao, L., Tao, D.: Multiobject tracking by submodular optimization. IEEE Transactions on Cybernetics (2018)
  • Shen, J., Yu, D., Deng, L., Dong, X.: Fast online tracking with detection refinement. IEEE Transactions on Intelligent Transportation Systems (2017)
  • Song, H.O., Xiang, Y., Jegelka, S., Savarese, S.: Deep metric learning via lifted structured feature embedding. In: IEEE CVPR. pp. 4004–4012 (2016)
  • Tao, R., Gavves, E., Smeulders, A.W.: Siamese instance search for tracking. In: IEEE CVPR. pp. 1420–1429 (2016)
  • Valmadre, J., Bertinetto, L., Henriques, J.F., Vedaldi, A., Torr, P.H.S.: End-to-end representation learning for correlation filter based tracking. In: IEEE CVPR. pp. 5000–5008 (2017)
  • Vedaldi, A., Lenc, K.: MatConvNet: Convolutional neural networks for MATLAB. In: ACM Multimedia. pp. 689–692 (2015)
  • Wang, F., Zuo, W., Lin, L., Zhang, D., Zhang, L.: Joint learning of single-image and cross-image representations for person re-identification. In: IEEE CVPR. pp. 1288–1296 (2016)
  • Weinberger, K.Q., Blitzer, J., Saul, L.K.: Distance metric learning for large margin nearest neighbor classification. In: NIPS. pp. 1473–1480 (2006)
  • Wu, Y., Lim, J., Yang, M.H.: Online object tracking: A benchmark. In: IEEE CVPR. pp. 2411–2418 (2013)
  • Wu, Y., Lim, J., Yang, M.H.: Object tracking benchmark. IEEE Transactions on Pattern Analysis and Machine Intelligence 37(9), 1834–1848 (2015)
  • Yilmaz, A., Javed, O., Shah, M.: Object tracking: A survey. ACM Computing Surveys 38(4), 13 (2006)
  • Zhang, J., Ma, S., Sclaroff, S.: MEEM: Robust tracking via multiple experts using entropy minimization. In: ECCV. pp. 188–203 (2014)
  • Zhang, T., Xu, C., Yang, M.H.: Multi-task correlation particle filter for robust object tracking. In: IEEE CVPR (2017)
  • Zhuang, B., Lin, G., Shen, C., Reid, I.: Fast training of triplet-based deep binary embedding networks. In: IEEE CVPR. pp. 5955–5964 (2016)