Cooling-Shrinking Attack: Blinding the Tracker with Imperceptible Noises

CVPR 2020, pp. 987–996

Abstract:

Adversarial attacks on CNNs aim at deceiving models into misbehaving by adding imperceptible perturbations to images. Studying such attacks helps in understanding neural networks more deeply and in improving the robustness of deep learning models. Although several works have focused on attacking image classifiers and object detectors, an effective and ef...

Introduction
  • Online single object tracking is a fundamental task in computer vision and has many important applications including intelligent surveillance, autonomous driving, human-machine interaction, to name a few.
  • Popular adversarial attack methods can be roughly divided into two categories: iterative-optimization-based and deep-network-based attacks.
  • The former [9, 21, 35] applies gradient ascent many times to maximize an adversarial objective function for deceiving deep networks and is usually time-consuming (a minimal sketch follows this list).
  • The latter [34, 31] uses large amounts of data to train an adversarial perturbation generator.
  • Adversarial attack has become a popular topic and has extended from image classification to more challenging tasks, such as object detection [35, 31] and semantic segmentation [35]
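
To make the two categories concrete, the sketch below shows a bare-bones iterative-optimization-based attack in PyTorch: each step runs a full forward/backward pass to ascend the gradient of an adversarial objective, which is why such attacks are usually time-consuming. The `model`, `adv_objective`, step size, and bounds are illustrative assumptions, not details taken from the paper or from [9, 21, 35].

```python
import torch

def iterative_attack(model, image, adv_objective, steps=10, alpha=1.0 / 255, eps=8.0 / 255):
    """Iteratively ascend the gradient of an adversarial objective (illustrative sketch).

    model          -- a frozen network being attacked
    adv_objective  -- callable mapping model outputs to the scalar the attacker maximizes
    alpha          -- per-step size; eps bounds the L_inf norm of the total perturbation
    """
    x_adv = image.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = adv_objective(model(x_adv))
        loss.backward()                     # one forward/backward pass per step -> slow
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()          # gradient ascent step
            x_adv = image + (x_adv - image).clamp(-eps, eps)   # project into the eps-ball
            x_adv = x_adv.clamp(0.0, 1.0).detach()
    return x_adv
```

Because the cost scales with the number of optimization steps per input, deep-network-based attacks that generate the perturbation in a single forward pass are typically much faster at test time.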
Highlights
  • Online single object tracking is a fundamental task in computer vision and has many important applications including intelligent surveillance, autonomous driving, human-machine interaction, to name a few
  • Adversarial attack originated from [29], which showed that state-of-the-art deep learning models can be fooled by adding small perturbations to original images
  • We present an effective and efficient adversarial attacking algorithm for deceiving single object trackers
  • The generator trained with this adversarial loss and L2 loss can deceive SiamRPN++ at a high success rate with imperceptible noises
  • We show that a discriminator is not necessary for adversarial attacks on the tracker, because the combination of the adversarial loss and the L2 loss already achieves our goal (see the sketch after this list)
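
Read together, the last two highlights amount to a simple training objective for the perturbation generator: the cooling-shrinking adversarial loss plus an L2 term that keeps the noise imperceptible, with no GAN discriminator. The sketch below illustrates that combination; the helper name, the weight `lambda_l2`, and the way `adv_loss` is obtained are assumptions for illustration, not the paper's released code.

```python
import torch.nn.functional as F

def generator_objective(adv_loss, clean, perturbed, lambda_l2=500.0):
    """Combined training objective suggested by the highlights (illustrative sketch).

    adv_loss   -- scalar cooling-shrinking adversarial loss computed on the tracker's
                  outputs for the perturbed input (assumed to be given)
    clean      -- original template or search region
    perturbed  -- generator output, i.e. the clean image plus the generated perturbation
    lambda_l2  -- illustrative trade-off weight, not a value reported in the paper
    """
    l2_loss = F.mse_loss(perturbed, clean)   # keeps the perturbation imperceptible
    return adv_loss + lambda_l2 * l2_loss    # no discriminator / GAN loss is used
```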
Methods
  • The authors implement the algorithm with the PyTorch [23] deep learning framework. The hardware platform is a PC with an Intel i9 CPU (64GB memory) and an RTX 2080Ti GPU (11GB memory).
  • VOT2018 [14] is another challenging tracking benchmark, which simultaneously measures the tracker’s accuracy and robustness
  • This benchmark includes 60 videos and ranks the trackers’ performance with the expected average overlap (EAO) rule.
  • The attacking algorithm is extremely fast (a latency-measurement sketch follows this list)
  • It takes the model less than 9 ms to transform a clean search region into an adversarial one, and less than 3 ms to transform a clean template into an adversarial one
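
The quoted latencies (under 9 ms per search region, under 3 ms per template) correspond to a single forward pass of the perturbation generator on the RTX 2080Ti. Below is a hedged sketch of how such per-call GPU latency is commonly measured in PyTorch; the `generator` and `dummy_input` are placeholders, not the authors' benchmarking code.

```python
import time
import torch

@torch.no_grad()
def mean_latency_ms(generator, dummy_input, runs=100, warmup=10):
    """Average per-call latency of a perturbation generator on a GPU (illustrative)."""
    for _ in range(warmup):
        generator(dummy_input)          # warm up CUDA kernels before timing
    torch.cuda.synchronize()            # wait for queued GPU work to finish
    start = time.perf_counter()
    for _ in range(runs):
        generator(dummy_input)
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / runs * 1000.0  # milliseconds per call
```

Synchronizing before reading the clock matters because CUDA kernels are launched asynchronously; without it, the timer would only measure kernel launch overhead.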
Conclusion
  • The authors present an effective and efficient adversarial attacking algorithm for deceiving single object trackers.
  • A novel cooling-shrinking loss is proposed to train the perturbation-generator.
  • The generator trained with this adversarial loss and L2 loss can deceive SiamRPN++ at a high success rate with imperceptible noises.
  • The authors' algorithm is quite efficient and can transform clean templates/search regions to adversarial ones in a short time interval
Tables
  • Table 1: Effect of attacking search regions. The third column gives SiamRPN++'s original results, the fourth column gives the results produced by attacking search regions, and the last column gives the performance drop.
  • Table 2: Effect of attacking the template. The third column gives SiamRPN++'s original results, the fourth column gives the results produced by attacking the template, and the last column gives the performance drop.
  • Table 3: Effect of attacking both search regions and the template.
  • Table 4: Comparison with other kinds of noises.
  • Table 5: Adversarial effect on other state-of-the-art trackers.
Related work
  • 2.1. Single Object Tracking

    Given the tracked target in the first frame, single object tracking (SOT) aims at capturing the location of the target in the subsequent frames. Different from object detection, which recognizes objects of predefined categories, the SOT task belongs to one-shot learning, requiring trackers to be capable of tracking any possible target. Efficient and robust trackers are difficult to design because of challenges such as occlusion, similar distractors, deformation, and motion blur during tracking. Recently, with the prosperity of deep learning and the introduction of large-scale object tracking datasets [7, 12], the study of SOT has undergone rapid development. Currently, state-of-the-art trackers can be divided into two categories. One is based on SiamRPN (including SiamRPN [16], DaSiamRPN [40], SiamRPN+ [39], SiamRPN++ [15], and SiamMask [30]), and the other is based on deep discriminative models (including ATOM [6] and DiMP [3]); a simplified sketch of the correlation operation shared by the SiamRPN family is given after this paragraph.
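
The SiamRPN-family trackers listed above all share one core operation: cross-correlating features of the first-frame template with features of the current search region to produce a response map whose peak indicates the target's likely location. The sketch below illustrates that operation in simplified form; the backbone, tensor shapes, and function name are assumptions and omit SiamRPN++'s actual multi-branch RPN heads.

```python
import torch
import torch.nn.functional as F

def siamese_response(backbone, template, search):
    """Correlate template features against search-region features (simplified sketch).

    backbone -- shared feature extractor applied to both inputs (the Siamese design)
    template -- target patch from the first frame, e.g. 1 x 3 x 127 x 127
    search   -- search region from the current frame, e.g. 1 x 3 x 255 x 255
    Returns a response map whose peak indicates the target's likely location.
    """
    z = backbone(template)          # e.g. 1 x C x 6 x 6 template features
    x = backbone(search)            # e.g. 1 x C x 22 x 22 search features
    # use the template features as a correlation kernel over the search features
    return F.conv2d(x, z)           # e.g. 1 x 1 x 17 x 17 response map
```

Attacks on this family therefore have two natural entry points, the template and the search region, which is exactly the split evaluated in Tables 1-3.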
Funding
  • This work is supported in part by the National Key R&D Program of China under Grant No. 2018AAA0102001, the National Natural Science Foundation of China under Grant Nos. 61725202, U1903215, 61829102, 91538201, and 61751212, and the Fundamental Research Funds for the Central Universities under Grant No. DUT19GJ201.
References
  • [1] Anish Athalye, Logan Engstrom, Andrew Ilyas, and Kevin Kwok. Synthesizing robust adversarial examples. arXiv preprint arXiv:1707.07397, 2017.
  • [2] Luca Bertinetto, Jack Valmadre, Joao F. Henriques, Andrea Vedaldi, and Philip H. S. Torr. Fully-convolutional siamese networks for object tracking. In ECCV 2016 Workshops, pages 850–865, 2016.
  • [3] Goutam Bhat, Martin Danelljan, Luc Van Gool, and Radu Timofte. Learning discriminative model prediction for tracking. In ICCV, 2019.
  • [4] Kenan Dai, Yunhua Zhang, Dong Wang, Jianhua Li, Huchuan Lu, and Xiaoyun Yang. High-performance long-term tracking with meta-updater, 2020.
  • [5] Martin Danelljan, Goutam Bhat, Fahad Shahbaz Khan, and Michael Felsberg. ECO: Efficient convolution operators for tracking. In CVPR, 2017.
  • [6] Martin Danelljan, Goutam Bhat, Fahad Shahbaz Khan, and Michael Felsberg. ATOM: Accurate tracking by overlap maximization. In CVPR, 2019.
  • [7] Heng Fan, Liting Lin, Fan Yang, Peng Chu, Ge Deng, Sijia Yu, Hexin Bai, Yong Xu, Chunyuan Liao, and Haibin Ling. LaSOT: A high-quality benchmark for large-scale single object tracking. In CVPR, 2019.
  • [8] Ruochen Fan, Fang-Lue Zhang, Min Zhang, and Ralph R. Martin. Robust tracking-by-detection using a selection and completion mechanism. Computational Visual Media, 3(3):285–294, 2017.
  • [9] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
  • [10] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, 2016.
  • [11] David Held, Sebastian Thrun, and Silvio Savarese. Learning to track at 100 FPS with deep regression networks. In ECCV, 2016.
  • [12] Lianghua Huang, Xin Zhao, and Kaiqi Huang. GOT-10k: A large high-diversity benchmark for generic object tracking in the wild. arXiv preprint arXiv:1810.11981, 2018.
  • [13] Borui Jiang, Ruixuan Luo, Jiayuan Mao, Tete Xiao, and Yuning Jiang. Acquisition of localization confidence for accurate object detection. In ECCV, 2018.
  • [14] Matej Kristan, Ales Leonardis, Jiri Matas, Michael Felsberg, Roman Pflugfelder, Luka Cehovin Zajc, Tomas Vojir, Goutam Bhat, Alan Lukezic, Abdelrahman Eldesokey, Gustavo Fernandez, et al. The sixth visual object tracking VOT2018 challenge results. In ECCV, 2018.
  • [15] Bo Li, Wei Wu, Qiang Wang, Fangyi Zhang, Junliang Xing, and Junjie Yan. SiamRPN++: Evolution of siamese visual tracking with very deep networks. In CVPR, 2019.
  • [16] Bo Li, Junjie Yan, Wei Wu, Zheng Zhu, and Xiaolin Hu. High performance visual tracking with siamese region proposal network. In CVPR, 2018.
  • [17] Junwei Li, Xiaolong Zhou, Sixian Chan, and Shengyong Chen. Object tracking using a convolutional network and a structured output SVM. Computational Visual Media, 3(4):325–335, 2017.
  • [18] Peixia Li, Dong Wang, Lijun Wang, and Huchuan Lu. Deep visual tracking: Review and experimental comparison. Pattern Recognition, 76:323–338, 2018.
  • [19] Jiajun Lu, Hussein Sibai, Evan Fabry, and David Forsyth. No need to worry about adversarial examples in object detection in autonomous vehicles. arXiv preprint arXiv:1707.03501, 2017.
  • [20] Yan Luo, Xavier Boix, Gemma Roig, Tomaso Poggio, and Qi Zhao. Foveation-based mechanisms alleviate adversarial examples. arXiv preprint arXiv:1511.06292, 2015.
  • [21] Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard. DeepFool: A simple and accurate method to fool deep neural networks. In CVPR, 2016.
  • [22] Hyeonseob Nam and Bohyung Han. Learning multi-domain convolutional neural networks for visual tracking. In CVPR, 2016.
  • [23] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. In NIPS Autodiff Workshop, 2017.
  • [24] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In NIPS, 2015.
  • [25] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional networks for biomedical image segmentation. arXiv preprint arXiv:1505.04597, 2015.
  • [26] Mahmood Sharif, Sruti Bhagavatula, Lujo Bauer, and Michael K. Reiter. Adversarial generative nets: Neural network attacks on state-of-the-art face recognition. arXiv preprint arXiv:1801.00349, 2017.
  • [27] Yibing Song, Chao Ma, Xiaohe Wu, Lijun Gong, Linchao Bao, Wangmeng Zuo, Chunhua Shen, Rynson W. H. Lau, and Ming-Hsuan Yang. VITAL: VIsual Tracking via Adversarial Learning. In CVPR, 2018.
  • [28] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In CVPR, 2015.
  • [29] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.
  • [30] Qiang Wang, Li Zhang, Luca Bertinetto, Weiming Hu, and Philip H. S. Torr. Fast online object tracking and segmentation: A unifying approach. In CVPR, 2019.
  • [31] Xingxing Wei, Siyuan Liang, Ning Chen, and Xiaochun Cao. Transferable adversarial attacks for image and video object detection. arXiv preprint arXiv:1811.12641, 2018.
  • [32] Rey Reza Wiyatno and Anqi Xu. Physical adversarial textures that fool visual object tracking. In ICCV, 2019.
  • [33] Yi Wu, Jongwoo Lim, and Ming-Hsuan Yang. Object tracking benchmark. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(9):1834–1848, 2015.
  • [34] Chaowei Xiao, Bo Li, Jun-Yan Zhu, Warren He, Mingyan Liu, and Dawn Song. Generating adversarial examples with adversarial networks. arXiv preprint arXiv:1801.02610, 2018.
  • [35] Cihang Xie, Jianyu Wang, Zhishuai Zhang, Yuyin Zhou, Lingxi Xie, and Alan Yuille. Adversarial examples for semantic segmentation and object detection. In ICCV, 2017.
  • [36] Bin Yan, Haojie Zhao, Dong Wang, Huchuan Lu, and Xiaoyun Yang. ‘Skimming-Perusal’ Tracking: A framework for real-time and robust long-term tracking. In ICCV, 2019.
  • [37] Lichao Zhang, Abel Gonzalez-Garcia, Joost van de Weijer, Martin Danelljan, and Fahad Shahbaz Khan. Learning the model update for siamese trackers. In ICCV, 2019.
  • [38] Yunhua Zhang, Lijun Wang, Jinqing Qi, Dong Wang, Mengyang Feng, and Huchuan Lu. Structured siamese network for real-time visual tracking. In ECCV, 2018.
  • [39] Zhipeng Zhang and Houwen Peng. Deeper and wider siamese networks for real-time visual tracking. In CVPR, 2019.
  • [40] Zheng Zhu, Qiang Wang, Bo Li, Wei Wu, Junjie Yan, and Weiming Hu. Distractor-aware siamese networks for visual object tracking. In ECCV, 2018.