We proposed to use Neural Architecture Search to further optimize the process of designing Feature Pyramid Networks for Object Detection
NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection.
Computer Vision and Pattern Recognition, (2019): 7036-7045
Current state-of-the-art convolutional architectures for object detection are manually designed. Here we aim to learn a better architecture of feature pyramid network for object detection. We adopt Neural Architecture Search and discover a new feature pyramid architecture in a novel scalable search space covering all cross-scale connectio...
- Learning visual feature representations is a fundamental problem in computer vision. In the past few years, great progress has been made on designing the model architecture of deep convolutional networks (ConvNets) for image classification [12, 15, 35] and object detection [21, 22].
- Unlike image classification, which predicts class probabilities for an image, object detection faces the additional challenge of detecting and localizing multiple objects across a wide range of scales and locations.
- To address this issue, the pyramidal feature representations, which represent an image with multiscale feature layers, are commonly used by many modern object detectors [11, 23, 26].
- The high-level features, which are semantically strong but lower resolution, are up-
- We aim to discover an atomic architecture that has identical input and output feature levels and can be applied repeatedly.
- In Appendix A, we show that NAS-FPN can be used for anytime detection.
- Our experiments on the COCO dataset showed that the discovered architecture, named NAS-FPN, is flexible and performant for building accurate detection models.
- The authors' method is based on the RetinaNet framework because it is simple and efficient.
- The RetinaNet framework has two main components: a backbone network and a feature pyramid network (FPN).
- To discover a better FPN, the authors make use of the Neural Architecture Search framework proposed by Zoph and Le [44].
- Through trial and error, the controller learns to generate better architectures over time.
- As previous works [36, 44, 45] have identified, the search space plays a crucial role in the success of architecture search.
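The trial-and-error loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' controller: it assumes three independent categorical decisions, a made-up `toy_reward` standing in for the accuracy of a trained detector, and a plain REINFORCE update with a moving-average baseline. All names and hyperparameters here are hypothetical.

```python
import math
import random
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_reward(arch: List[int]) -> float:
    """Hypothetical reward: fraction of decisions matching an arbitrary target.
    In a real search this would be the accuracy of a trained child model."""
    target = [2, 0, 1]
    return sum(a == t for a, t in zip(arch, target)) / len(target)


def search(steps: int = 2000, lr: float = 0.1, seed: int = 0) -> List[int]:
    """Sample architectures, score them, and nudge the policy toward
    choices that scored above the running baseline (REINFORCE)."""
    rng = random.Random(seed)
    num_decisions, num_choices = 3, 3  # assumed toy search space
    logits = [[0.0] * num_choices for _ in range(num_decisions)]
    baseline = 0.0
    for _ in range(steps):
        probs = [softmax(row) for row in logits]
        arch = [rng.choices(range(num_choices), weights=p, k=1)[0] for p in probs]
        reward = toy_reward(arch)
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        for d, choice in enumerate(arch):
            for c in range(num_choices):
                # Gradient of log-probability of the sampled choice w.r.t. logit c.
                grad = (1.0 if c == choice else 0.0) - probs[d][c]
                logits[d][c] += lr * advantage * grad
    # Report the most likely choice for each decision.
    return [max(range(num_choices), key=lambda c: row[c]) for row in logits]


if __name__ == "__main__":
    print(search())
```

The design choice worth noting is that the policy only ever sees a scalar reward per sampled architecture, which is why the definition of the search space matters so much: the controller can only improve among the options the space exposes.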
- In Figure 8a, the authors show that stacking the vanilla FPN architecture does not always improve performance whereas stacking NAS-FPN improves accuracy significantly.
- Across a wide range of accuracy and speed tradeoffs, NAS-FPN produces significant improvements upon many backbone architectures.
- Footnote 2: https://github.com/tensorflow/models/tree/master/research/object_detection
- Table 1: Performance of RetinaNet with NAS-FPN and other state-of-the-art detectors on the test-dev set of COCO
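The idea of an atomic architecture with identical input and output feature levels, applied repeatedly, can be illustrated with a small sketch. It is not the discovered NAS-FPN cell: `toy_cell` is a hypothetical placeholder that simply averages all levels after resizing, feature maps are reduced to 1-D lists, and the level range 3 to 7 and base length 32 are assumptions chosen for illustration. The point is only that a cell whose output pyramid has the same levels and resolutions as its input can be stacked any number of times.

```python
from typing import Callable, Dict, List, Sequence

# level -> 1-D feature map (toy stand-in for an H x W x C tensor)
Pyramid = Dict[int, List[float]]


def make_pyramid(base_len: int = 32, levels: Sequence[int] = (3, 4, 5, 6, 7)) -> Pyramid:
    """Build a toy pyramid whose resolution halves at each coarser level."""
    return {lvl: [float(lvl)] * (base_len >> (lvl - levels[0])) for lvl in levels}


def resize(feature: List[float], target_len: int) -> List[float]:
    """Nearest-neighbour resize of a 1-D feature map to target_len entries."""
    src_len = len(feature)
    return [feature[i * src_len // target_len] for i in range(target_len)]


def toy_cell(pyramid: Pyramid) -> Pyramid:
    """Hypothetical cross-scale cell: every output level is the mean of all
    input levels, each resized to that output level's resolution."""
    out: Pyramid = {}
    for level, feature in pyramid.items():
        merged = [0.0] * len(feature)
        for other in pyramid.values():
            resized = resize(other, len(feature))
            merged = [m + r for m, r in zip(merged, resized)]
        out[level] = [m / len(pyramid) for m in merged]
    return out


def stack(cell: Callable[[Pyramid], Pyramid], pyramid: Pyramid, repeats: int) -> Pyramid:
    """Apply the same cell `repeats` times; valid because input and output
    pyramids share levels and resolutions."""
    for _ in range(repeats):
        pyramid = cell(pyramid)
    return pyramid


if __name__ == "__main__":
    inputs = make_pyramid()
    outputs = stack(toy_cell, inputs, repeats=3)
    for lvl in sorted(outputs):
        print(lvl, len(inputs[lvl]), len(outputs[lvl]))
```

Because the cell preserves the pyramid's shape, the number of repeats becomes a free knob for trading accuracy against compute.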
- E. H. Adelson, C. H. Anderson, J. R. Bergen, P. J. Burt, and J. M. Ogden. Pyramid methods in image processing. RCA Engineer, 1984.
- B. Baker, O. Gupta, N. Naik, and R. Raskar. Designing neural network architectures using reinforcement learning. In ICLR, 2016.
- T. Bolukbasi, J. Wang, O. Dekel, and V. Saligrama. Adaptive neural networks for efficient inference. In ICML, 2017.
- L.-C. Chen, M. D. Collins, Y. Zhu, G. Papandreou, B. Zoph, F. Schroff, H. Adam, and J. Shlens. Searching for efficient multi-scale architectures for dense image prediction. In NIPS, 2018.
- D. Oñoro-Rubio, M. Niepert, and R. J. López-Sastre. Learning short-cut connections for object counting. In BMVC, 2018.
- T. Elsken, J. H. Metzen, and F. Hutter. Neural architecture search: A survey. arXiv preprint arXiv:1808.05377, 2018.
- C. Fu, W. Liu, A. Ranga, A. Tyagi, and A. C. Berg. DSSD: Deconvolutional single shot detector. CoRR, abs/1701.06659, 2017.
- G. Ghiasi and C. C. Fowlkes. Laplacian pyramid reconstruction and refinement for semantic segmentation. In ECCV, 2016.
- G. Ghiasi, T. Lin, and Q. V. Le. DropBlock: A regularization method for convolutional networks. In NIPS, 2018.
- R. Girshick, I. Radosavovic, G. Gkioxari, P. Dollar, and K. He. Detectron. https://github.com/facebookresearch/detectron, 2018.
- K. He, G. Gkioxari, P. Dollar, and R. Girshick. Mask R-CNN. In ICCV, 2017.
- K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
- G. Huang, D. Chen, T. Li, F. Wu, L. van der Maaten, and K. Weinberger. Multi-scale dense networks for resource efficient image classification. In ICLR, 2018.
- G. Huang, D. Chen, T. Li, F. Wu, L. van der Maaten, and K. Q. Weinberger. Multi-scale dense networks for resource efficient image classification. In ICLR, 2017.
- G. Huang, Z. Liu, and K. Q. Weinberger. Densely connected convolutional networks. In CVPR, 2017.
- T. Kong, F. Sun, W. Huang, and H. Liu. Deep feature pyramid reconfiguration for object detection. In ECCV, 2018.
- T. Kong, F. Sun, A. Yao, H. Liu, M. Lu, and Y. Chen. RON: Reverse connection with objectness prior networks for object detection. In CVPR, 2017.
- H. Law and J. Deng. CornerNet: Detecting objects as paired keypoints. In ECCV, 2018.
- C.-Y. Lee, S. Xie, P. Gallagher, Z. Zhang, and Z. Tu. Deeply-supervised nets. In AISTATS, 2015.
- H. Li, P. Xiong, J. An, and L. Wang. Pyramid attention network for semantic segmentation. In BMVC, 2018.
- Z. Li, C. Peng, G. Yu, X. Zhang, Y. Deng, and J. Sun. DetNet: A backbone network for object detection. In ECCV, 2018.
- T.-Y. Lin, P. Dollar, R. B. Girshick, K. He, B. Hariharan, and S. J. Belongie. Feature pyramid networks for object detection. In CVPR, 2017.
- T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollar. Focal loss for dense object detection. In ICCV, 2017.
- C. Liu, B. Zoph, J. Shlens, W. Hua, L.-J. Li, L. Fei-Fei, A. Yuille, J. Huang, and K. Murphy. Progressive neural architecture search. In ECCV, 2017.
- S. Liu, L. Qi, H. Qin, J. Shi, and J. Jia. Path aggregation network for instance segmentation. In CVPR, 2018.
- W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg. SSD: Single shot multibox detector. In ECCV, 2016.
- M. A. Islam, M. Rochan, N. D. B. Bruce, and Y. Wang. Gated feedback refinement network for dense image labeling. In CVPR, 2017.
- A. Newell, K. Yang, and J. Deng. Stacked hourglass networks for human pose estimation. In ECCV, 2016.
- E. Real, A. Aggarwal, Y. Huang, and Q. V. Le. Regularized evolution for image classifier architecture search. In AAAI, 2018.
- J. Redmon and A. Farhadi. YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767, 2018.
- O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention, 2015.
- M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen. MobileNetV2: Inverted residuals and linear bottlenecks. In CVPR, 2018.
- J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017.
- S.-W. Kim, H.-K. Kook, J.-Y. Sun, M.-C. Kang, and S.-J. Ko. Parallel feature pyramid network for object detection. In ECCV, 2018.
- C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. E. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In CVPR, 2015.
- M. Tan, B. Chen, R. Pang, V. Vasudevan, and Q. V. Le. MnasNet: Platform-aware neural architecture search for mobile. arXiv preprint arXiv:1807.11626, 2018.
- S. Teerapittayanon, B. McDanel, and H. Kung. BranchyNet: Fast inference via early exiting from deep neural networks. In ICPR, pages 2464–2469. IEEE, 2016.
- S. Woo, S. Hwang, and I. S. Kweon. StairNet: Top-down semantic aggregation for accurate one shot detection. In WACV, 2018.
- Y. Kim, B.-N. Kang, and D. Kim. SAN: Learning relationship between convolutional features for multi-scale object detection. In ECCV, 2018.
- F. Yu, D. Wang, E. Shelhamer, and T. Darrell. Deep layer aggregation. In CVPR, 2018.
- S. Zhang, L. Wen, X. Bian, Z. Lei, and S. Z. Li. Single-shot refinement neural network for object detection. In CVPR, 2018.
- Q. Zhao, T. Sheng, Y. Wang, Z. Tang, Y. Chen, L. Cai, and H. Ling. M2Det: A single-shot object detector based on multi-level feature pyramid network. In AAAI, 2019.
- P. Zhou, B. Ni, C. Geng, J. Hu, and Y. Xu. Scale-transferrable object detection. In CVPR, 2018.
- B. Zoph and Q. V. Le. Neural architecture search with reinforcement learning. In ICLR, 2017.
- B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le. Learning transferable architectures for scalable image recognition. In CVPR, 2018.