NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection.

Computer Vision and Pattern Recognition (CVPR), 2019, pp. 7036-7045


Abstract

Current state-of-the-art convolutional architectures for object detection are manually designed. Here we aim to learn a better architecture of feature pyramid network for object detection. We adopt Neural Architecture Search and discover a new feature pyramid architecture in a novel scalable search space covering all cross-scale connectio...

Introduction
  • Learning visual feature representations is a fundamental problem in computer vision. In the past few years, great progress has been made on designing the model architecture of deep convolutional networks (ConvNets) for image classification [12, 15, 35] and object detection [21, 22].
  • Unlike image classification, which predicts class probabilities for an image, object detection faces the additional challenge of detecting and localizing multiple objects across a wide range of scales and locations.
  • To address this issue, pyramidal feature representations, which represent an image with multiscale feature layers, are commonly used by many modern object detectors [11, 23, 26].
  • The high-level features, which are semantically strong but lower resolution, are up-
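To make the multiscale idea concrete, the sketch below lists the spatial size of each pyramid level for one input resolution. It assumes that level l has stride 2^l relative to the input and that the pyramid spans levels 3 to 7. The function name `pyramid_shapes` and these defaults are illustrative assumptions, not details stated in the text above.

```python
import math


def pyramid_shapes(image_size, min_level=3, max_level=7):
    """Return {level: (height, width)} for a feature pyramid.

    Assumption: level l has stride 2**l relative to the input image.
    """
    height, width = image_size
    shapes = {}
    for level in range(min_level, max_level + 1):
        stride = 2 ** level
        # Round up so that partially covered borders still get a cell.
        shapes[level] = (math.ceil(height / stride), math.ceil(width / stride))
    return shapes


if __name__ == "__main__":
    for level, (h, w) in pyramid_shapes((640, 640)).items():
        print(f"P{level}: stride {2 ** level}, feature map {h}x{w}")
```

Coarse levels cover large objects with few cells, while fine levels keep the resolution needed for small objects, which is why detectors read predictions from several levels at once.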
Highlights
  • Learning visual feature representations is a fundamental problem in computer vision
  • Pyramidal feature representations, which represent an image with multiscale feature layers, are commonly used by many modern object detectors [11, 23, 26]
  • We aim to discover an atomic architecture that has identical input and output feature levels and can be applied repeatedly
  • In Appendix A, we show NAS-FPN can be used for anytime detection
  • We proposed to use Neural Architecture Search to further optimize the process of designing Feature Pyramid Networks for Object Detection
  • Our experiments on the COCO dataset showed that the discovered architecture, named NAS-FPN, is flexible and performant for building accurate detection models
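The property that makes a cell stackable is that its output pyramid has the same levels and shapes as its input. The sketch below illustrates only that property. `toy_cell` is a hypothetical hand-written top-down merge, not the architecture found by the search, and feature maps are plain 2D lists with a single channel. It also assumes the levels are consecutive integers.

```python
def resize_nearest(feature, out_h, out_w):
    """Resize a 2D list of floats by nearest-neighbor sampling."""
    in_h, in_w = len(feature), len(feature[0])
    return [
        [feature[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def add_maps(a, b):
    """Element-wise sum of two equally sized 2D lists."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def toy_cell(pyramid):
    """Hypothetical pyramid-to-pyramid cell (a simple top-down pass).

    Each level is summed with the upsampled output of the next coarser
    level, so the output has exactly the same levels and shapes as the input.
    """
    levels = sorted(pyramid)
    out = {levels[-1]: pyramid[levels[-1]]}
    for level in reversed(levels[:-1]):
        target = pyramid[level]
        coarser = resize_nearest(out[level + 1], len(target), len(target[0]))
        out[level] = add_maps(target, coarser)
    return out


def stack_cells(pyramid, cell, repeats):
    """Apply the same cell repeatedly; valid because shapes are preserved."""
    for _ in range(repeats):
        pyramid = cell(pyramid)
    return pyramid


def shapes(pyramid):
    return {level: (len(f), len(f[0])) for level, f in pyramid.items()}


# Three levels with sizes 8x8, 4x4, and 2x2, all filled with ones.
inputs = {
    level: [[1.0] * size for _ in range(size)]
    for level, size in [(3, 8), (4, 4), (5, 2)]
}
outputs = stack_cells(inputs, toy_cell, repeats=3)
print(shapes(outputs) == shapes(inputs))
```

Because the interface is pyramid in, pyramid out, the number of repeats becomes a single knob for trading computation against accuracy.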
Methods
  • The authors' method is based on the RetinaNet framework [23] because it is simple and efficient.
  • The RetinaNet framework has two main components: a backbone network and a feature pyramid network (FPN).
  • To discover a better FPN, the authors make use of the Neural Architecture Search framework proposed by [44].
  • Through trial and error, the controller learns to generate better architectures over time.
  • As identified by previous works [36, 44, 45], the search space plays a crucial role in the success of architecture search.
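The trial-and-error loop can be sketched in a few lines. The text above says only that a controller improves its proposals over time, so everything concrete here is an assumption made for illustration: the controller is a set of independent softmax policies (one per design decision), the update is a plain REINFORCE step with a moving-average baseline, and `toy_reward` stands in for the expensive step of training and evaluating a detector.

```python
import math
import random


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_reward(choices, target):
    """Stand-in reward: fraction of decisions that match a hidden target.

    A real search would train the sampled architecture and use its
    validation accuracy instead.
    """
    return sum(c == t for c, t in zip(choices, target)) / len(target)


def train_controller(target, num_options, steps=3000, lr=0.1, seed=0):
    """Learn one categorical policy per decision with REINFORCE."""
    rng = random.Random(seed)
    logits = [[0.0] * num_options for _ in target]
    baseline = 0.0
    for _ in range(steps):
        probs = [softmax(row) for row in logits]
        # Trial: sample one option for every decision.
        choices = [rng.choices(range(num_options), weights=p)[0] for p in probs]
        reward = toy_reward(choices, target)
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        # Error signal: raise the log-probability of choices that beat the baseline.
        for row, p, choice in zip(logits, probs, choices):
            for option in range(num_options):
                grad = (1.0 if option == choice else 0.0) - p[option]
                row[option] += lr * advantage * grad
    # Report the most likely option for each decision.
    return [max(range(num_options), key=row.__getitem__) for row in logits]


if __name__ == "__main__":
    print(train_controller(target=[2, 0, 3], num_options=4))
```

The search space enters through `num_options` and the number of decisions: the controller can only find architectures that the space is able to express, which is one reason its design matters so much.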
Results
  • In Figure 8a, the authors show that stacking the vanilla FPN architecture does not always improve performance whereas stacking NAS-FPN improves accuracy significantly.
Conclusion
  • The authors proposed to use Neural Architecture Search to further optimize the process of designing Feature Pyramid Networks for Object Detection.
  • The authors' experiments on the COCO dataset showed that the discovered architecture, named NAS-FPN, is flexible and performant for building accurate detection models.
  • Across a wide range of accuracy and speed tradeoffs, NAS-FPN produces significant improvements upon many backbone architectures.
Tables
  • Table 1: Performance of RetinaNet with NAS-FPN and other state-of-the-art detectors on the test-dev set of COCO
Reference
  • E. H. Adelson, C. H. Anderson, J. R. Bergen, P. J. Burt, and J. M. Ogden. Pyramid methods in image processing. RCA Engineer, 1984.
  • B. Baker, O. Gupta, N. Naik, and R. Raskar. Designing neural network architectures using reinforcement learning. In ICLR, 2016.
  • T. Bolukbasi, J. Wang, O. Dekel, and V. Saligrama. Adaptive neural networks for efficient inference. In ICML, 2017.
  • L.-C. Chen, M. D. Collins, Y. Zhu, G. Papandreou, B. Zoph, F. Schroff, H. Adam, and J. Shlens. Searching for efficient multi-scale architectures for dense image prediction. In NIPS, 2018.
  • D. Oñoro-Rubio, M. Niepert, and R. J. López-Sastre. Learning short-cut connections for object counting. In BMVC, 2018.
  • T. Elsken, J. H. Metzen, and F. Hutter. Neural architecture search: A survey. arXiv preprint arXiv:1808.05377, 2018.
  • C. Fu, W. Liu, A. Ranga, A. Tyagi, and A. C. Berg. DSSD: Deconvolutional single shot detector. CoRR, abs/1701.06659, 2017.
  • G. Ghiasi and C. C. Fowlkes. Laplacian pyramid reconstruction and refinement for semantic segmentation. In ECCV, 2016.
  • G. Ghiasi, T. Lin, and Q. V. Le. DropBlock: A regularization method for convolutional networks. In NIPS, 2018.
  • R. Girshick, I. Radosavovic, G. Gkioxari, P. Dollar, and K. He. Detectron. https://github.com/facebookresearch/detectron, 2018.
  • K. He, G. Gkioxari, P. Dollar, and R. Girshick. Mask R-CNN. In ICCV, 2017.
  • K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
  • G. Huang, D. Chen, T. Li, F. Wu, L. van der Maaten, and K. Weinberger. Multi-scale dense networks for resource efficient image classification. In ICLR, 2018.
  • G. Huang, D. Chen, T. Li, F. Wu, L. van der Maaten, and K. Q. Weinberger. Multi-scale dense networks for resource efficient image classification. In ICLR, 2017.
  • G. Huang, Z. Liu, and K. Q. Weinberger. Densely connected convolutional networks. In CVPR, 2017.
  • T. Kong, F. Sun, W. Huang, and H. Liu. Deep feature pyramid reconfiguration for object detection. In ECCV, 2018.
  • T. Kong, F. Sun, A. Yao, H. Liu, M. Lu, and Y. Chen. RON: Reverse connection with objectness prior networks for object detection. In CVPR, 2017.
  • H. Law and J. Deng. CornerNet: Detecting objects as paired keypoints. In ECCV, 2018.
  • C.-Y. Lee, S. Xie, P. Gallagher, Z. Zhang, and Z. Tu. Deeply-supervised nets. In AISTATS, 2015.
  • H. Li, P. Xiong, J. An, and L. Wang. Pyramid attention network for semantic segmentation. In BMVC, 2018.
  • Z. Li, C. Peng, G. Yu, X. Zhang, Y. Deng, and J. Sun. DetNet: A backbone network for object detection. In ECCV, 2018.
  • T.-Y. Lin, P. Dollar, R. B. Girshick, K. He, B. Hariharan, and S. J. Belongie. Feature pyramid networks for object detection. In CVPR, 2017.
  • T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollar. Focal loss for dense object detection. In ICCV, 2017.
  • C. Liu, B. Zoph, J. Shlens, W. Hua, L.-J. Li, L. Fei-Fei, A. Yuille, J. Huang, and K. Murphy. Progressive neural architecture search. In ECCV, 2018.
  • S. Liu, L. Qi, H. Qin, J. Shi, and J. Jia. Path aggregation network for instance segmentation. In CVPR, 2018.
  • W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg. SSD: Single shot multibox detector. In ECCV, 2016.
  • M. A. Islam, M. Rochan, N. D. B. Bruce, and Y. Wang. Gated feedback refinement network for dense image labeling. In CVPR, 2017.
  • A. Newell, K. Yang, and J. Deng. Stacked hourglass networks for human pose estimation. In ECCV, 2016.
  • E. Real, A. Aggarwal, Y. Huang, and Q. V. Le. Regularized evolution for image classifier architecture search. In AAAI, 2018.
  • J. Redmon and A. Farhadi. YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767, 2018.
  • O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention, 2015.
  • M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen. MobileNetV2: Inverted residuals and linear bottlenecks. In CVPR, 2019.
  • J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017.
  • S.-W. Kim, H.-K. Kook, J.-Y. Sun, M.-C. Kang, and S.-J. Ko. Parallel feature pyramid network for object detection. In ECCV, 2018.
  • C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. E. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In CVPR, 2015.
  • M. Tan, B. Chen, R. Pang, V. Vasudevan, and Q. V. Le. MnasNet: Platform-aware neural architecture search for mobile. arXiv preprint arXiv:1807.11626, 2018.
  • S. Teerapittayanon, B. McDanel, and H. Kung. BranchyNet: Fast inference via early exiting from deep neural networks. In ICPR, pages 2464–2469. IEEE, 2016.
  • S. Woo, S. Hwang, and I. S. Kweon. StairNet: Top-down semantic aggregation for accurate one shot detection. In WACV, 2018.
  • Y. Kim, B.-N. Kang, and D. Kim. SAN: Learning relationship between convolutional features for multi-scale object detection. In ECCV, 2018.
  • F. Yu, D. Wang, E. Shelhamer, and T. Darrell. Deep layer aggregation. In CVPR, 2018.
  • S. Zhang, L. Wen, X. Bian, Z. Lei, and S. Z. Li. Single-shot refinement neural network for object detection. In CVPR, 2018.
  • Q. Zhao, T. Sheng, Y. Wang, Z. Tang, Y. Chen, L. Cai, and H. Ling. M2Det: A single-shot object detector based on multi-level feature pyramid network. In AAAI, 2019.
  • P. Zhou, B. Ni, C. Geng, J. Hu, and Y. Xu. Scale-transferrable object detection. In CVPR, 2018.
  • B. Zoph and Q. V. Le. Neural architecture search with reinforcement learning. In ICLR, 2017.
  • B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le. Learning transferable architectures for scalable image recognition. In CVPR, 2018.