Face Detection with the Faster R-CNN

IEEE International Conference on Automatic Face & Gesture Recognition (FG), 2017.

DOI: https://doi.org/10.1109/FG.2017.82
Although the Faster R-CNN is designed for generic object detection, it demonstrates impressive face detection performance when retrained on a suitable face detection training set.

Abstract:

While deep-learning-based methods for generic object detection have improved rapidly in the last two years, most approaches to face detection are still based on the R-CNN framework [11], leading to limited accuracy and processing speed. In this paper, we investigate applying the Faster R-CNN [26], which has recently demonstrated impressive...

Introduction
  • Deep convolutional neural networks (CNNs) have dominated many tasks of computer vision.
  • The latest generation, represented by the Faster R-CNN of Ren, He, Girshick, and Sun [12], demonstrates impressive results on various object detection benchmarks.
  • It is the foundational framework for the winning entry of the COCO detection challenge 2015.
  • In this report, the authors demonstrate state-of-the-art face detection results using the Faster R-CNN on two popular face detection benchmarks: the widely used Face Detection Dataset and Benchmark (FDDB) [7] and the more recent IJB-A benchmark [8].
  • The authors compare different generations of region-based CNN object detection models, and compare them to a variety of other recent high-performing detectors.
Highlights
  • Deep convolutional neural networks (CNNs) have dominated many tasks of computer vision.
  • The latest generation, represented by the Faster R-CNN of Ren, He, Girshick, and Sun [12], demonstrates impressive results on various object detection benchmarks. It is the foundational framework for the winning entry of the COCO detection challenge 2015.
  • In this report, we demonstrate state-of-the-art face detection results using the Faster R-CNN on two popular face detection benchmarks: the widely used Face Detection Dataset and Benchmark (FDDB) [7] and the more recent IJB-A benchmark [8].
  • We train the face detection model based on a pre-trained ImageNet model, VGG16 [13].
  • We have demonstrated state-of-the-art face detection performance on two benchmark datasets using the Faster R-CNN.
  • Although the Faster R-CNN is designed for generic object detection, it demonstrates impressive face detection performance when retrained on a suitable face detection training set.
Methods
  • The authors report experiments on comparisons of region proposals and on end-to-end performance of top face detectors.
  • The authors train a Faster R-CNN face detection model on the recently released WIDER face dataset [16].
  • There are 12,880 images and 159,424 faces in the training set.
  • In Fig. 1, the authors show some randomly sampled images from the WIDER dataset.
  • [Figure: region proposal comparison; legend: EdgeBox, DeepBox, Faceness, RPN; axis: Detection Rate.]
  • The authors train the face detection model based on a pre-trained ImageNet model, VGG16 [13].
  • The authors randomly sample one image per batch for training.
  • The authors run the SGD solver for 50k iterations with a base learning rate of 0.001, then for another 20k iterations with the base learning rate reduced to 0.0001.
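The two-stage learning rate schedule described above can be sketched as a simple step function. This is a minimal illustration, not the authors' training code: the function name `base_learning_rate` and its parameter names are hypothetical, and only the iteration counts (50k, then 20k more) and the two rates (0.001, 0.0001) come from the text.

```python
def base_learning_rate(iteration, step_at=50_000, total=70_000,
                       lr_initial=0.001, lr_reduced=0.0001):
    """Return the base learning rate for a 0-indexed SGD iteration.

    Hypothetical helper: the first `step_at` iterations use `lr_initial`,
    the remaining iterations up to `total` use `lr_reduced`.
    """
    if not 0 <= iteration < total:
        raise ValueError(f"iteration must be in [0, {total})")
    return lr_initial if iteration < step_at else lr_reduced


# Example: the rate drops by a factor of ten at iteration 50,000.
schedule_samples = [base_learning_rate(i) for i in (0, 49_999, 50_000, 69_999)]
```

Since one image is sampled per batch, 70k iterations in total correspond to 70k image presentations, which is several passes over the 12,880 training images.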
Conclusion
  • The authors have demonstrated state-of-the-art face detection performance on two benchmark datasets using the Faster R-CNN.
  • Because the RPN and the Fast R-CNN detector module share convolutional layers, it is possible to use a deep CNN in the RPN without extra computational burden.
  • Although the Faster R-CNN is designed for generic object detection, it demonstrates impressive face detection performance when retrained on a suitable face detection training set.
  • It may be possible to further boost its performance by considering the special patterns of human faces.
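The point about shared convolutional layers can be illustrated with a toy sketch. Everything here is a hypothetical stand-in, not a real network or the authors' implementation: `Backbone` plays the role of the shared convolutional layers, `rpn_head` the region proposal network, and `detector_head` the Fast R-CNN detector module. The only thing the sketch is meant to show is that the expensive feature computation runs once per image and both heads read the same result.

```python
class Backbone:
    """Toy stand-in for the shared convolutional layers."""

    def __init__(self):
        self.calls = 0  # counts how often features are computed

    def __call__(self, image):
        self.calls += 1
        # Hypothetical "feature map": every pixel value doubled.
        return [[2 * px for px in row] for row in image]


def rpn_head(features, threshold=4):
    """Toy proposal step: keep positions whose feature exceeds a threshold."""
    return [(r, c)
            for r, row in enumerate(features)
            for c, value in enumerate(row)
            if value > threshold]


def detector_head(features, proposals):
    """Toy detection step: score each proposal from the same feature map."""
    return [(pos, features[pos[0]][pos[1]] / 10.0) for pos in proposals]


def detect(image, backbone):
    features = backbone(image)        # computed once per image
    proposals = rpn_head(features)    # proposals reuse the features
    return detector_head(features, proposals)  # so does the detector


backbone = Backbone()
detections = detect([[1, 3], [0, 5]], backbone)
```

After the call, `backbone.calls` is 1 even though both heads consumed the features, which is the sense in which a deeper backbone adds no separate cost for the proposal stage.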
Tables
  • Table 1: Comparisons of the entire pipeline of different region-based object detection methods. (Both Faceness [15] and DeepBox [10] ...
Reference
  • J. Cheney, B. Klein, A. K. Jain, and B. F. Klare. Unconstrained face detection: State of the art baseline and challenges. In ICB, pages 229–236, 2015.
  • P. Dollar and C. L. Zitnick. Fast edge detection using structured forests. IEEE Trans. Pattern Anal. Mach. Intell., 37(8):1558–1570, 2015.
  • G. Ghiasi and C. C. Fowlkes. Occlusion coherence: Localizing occluded faces with a hierarchical deformable part model. In CVPR, pages 1899–1906, 2014.
  • [5] R. B. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR, pages 580–587, 2014.
  • [6] K. He, X. Zhang, S. Ren, and J. Sun. Spatial pyramid pooling in deep convolutional networks for visual recognition. In ECCV, pages 346–361, 2014.
  • [7] V. Jain and E. Learned-Miller. FDDB: A benchmark for face detection in unconstrained settings. Technical Report UMCS-2010-009, University of Massachusetts, Amherst, 2010.
  • [8] B. F. Klare, B. Klein, E. Taborsky, A. Blanton, J. Cheney, K. Allen, P. Grother, A. Mah, M. Burge, and A. K. Jain. Pushing the frontiers of unconstrained face detection and recognition: IARPA Janus benchmark A. In CVPR, pages 1931–1939, 2015.
  • [9] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, pages 1106–1114, 2012.
  • [10] W. Kuo, B. Hariharan, and J. Malik. DeepBox: Learning objectness with convolutional networks. In ICCV, pages 2479–2487, 2015.
  • [11] S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell., 2016.
  • [12] S. Ren, K. He, R. B. Girshick, and J. Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In NIPS, pages 91–99, 2015.
  • [13] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. CoRR, abs/1409.1556, 2014.
  • [15] S. Yang, P. Luo, C. C. Loy, and X. Tang. From facial parts responses to face detection: A deep learning approach. In ICCV, pages 3676–3684, 2015.
  • [16] S. Yang, P. Luo, C. C. Loy, and X. Tang. WIDER FACE: A face detection benchmark. In CVPR, 2016.