Pushing the frontiers of unconstrained face detection and recognition: IARPA Janus Benchmark A

Emma Taborsky
Jordan Cheney
Kristen Allen
Alan Mah

IEEE Conference on Computer Vision and Pattern Recognition, 2015.

Cited by: 453 | DOI: https://doi.org/10.1109/CVPR.2015.7298803

Abstract:

Rapid progress in unconstrained face recognition has resulted in a saturation in recognition accuracy for current benchmark datasets. While important for early progress, a chief limitation in most benchmark datasets is the use of a commodity face detector to select face imagery. The implication of this strategy is restricted variations in face pose and other confounding factors. This paper introduces the IARPA Janus Benchmark A (IJB-A), a publicly available "media in the wild" dataset containing 500 subjects with manually localized face images. Key features of the IJB-A dataset are: (i) full pose variation, (ii) joint use for face recognition and face detection benchmarking, (iii) a mix of images and videos, (iv) wider geographic variation of subjects, (v) protocols supporting both open-set identification (1:N search) and verification (1:1 comparison), (vi) an optional protocol that allows modeling of gallery subjects, and (vii) ground truth eye and nose locations.

Introduction
  • The development of accurate and scalable unconstrained face recognition algorithms is a long term goal of the biometrics and computer vision communities.
  • In order to close this gap, large annotated sets of imagery are needed that are representative of the end goals of unconstrained face recognition.
  • This will help continue to push the frontiers of unconstrained face detection and recognition, which are the primary goals of the IARPA Janus program [1].
Highlights
  • The development of accurate and scalable unconstrained face recognition algorithms is a long term goal of the biometrics and computer vision communities
  • In this paper we introduce the Intelligence Advanced Research Projects Activity (IARPA) Janus Benchmark A (IJB-A), which is publicly available for download
  • This paper has introduced the IARPA Janus Benchmark A (IJB-A) dataset
  • The dataset consists of face images and videos that were collected “in the wild”
  • The IJB-A dataset is motivated by a need to push the state of the art in unconstrained face recognition
  • It allows improvements in the face recognition tasks to proceed without being blocked by parallel research in face detection
Methods
  • A significant amount of effort is required to collect, annotate and verify such a large corpus of imagery.
  • Each subject in the data corpus was manually specified (e.g., Japanese Prime Minister Shinzo Abe); this specification procedure was performed such that the geographic origins of subjects were generally well distributed across the globe.
  • Once a subject was specified, images and videos of the subject were located by performing internet searches on Creative Commons licensed imagery.
  • The first annotation task was annotating a bounding box around all faces in an image or video frame.
  • In order to consolidate the multiple annotations into a single set of annotations, the following approach was applied.
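The summary above does not spell out the consolidation rule, but a per-coordinate median across annotators is one common way to merge redundant crowd-sourced boxes; the function name and `(x, y, w, h)` box layout below are assumptions for illustration, not the paper's exact procedure.

```python
from statistics import median

def consolidate_boxes(boxes):
    """Merge several annotator bounding boxes (x, y, w, h) for one face
    into a single box by taking the per-coordinate median. The median is
    robust to a single careless annotation, unlike the mean."""
    if not boxes:
        raise ValueError("need at least one annotation")
    xs, ys, ws, hs = zip(*boxes)
    return (median(xs), median(ys), median(ws), median(hs))

# Five hypothetical AMT annotations of the same face;
# the last one is an outlier in x:
annotations = [(10, 12, 50, 60), (11, 11, 52, 58), (9, 13, 49, 61),
               (10, 12, 51, 59), (30, 12, 50, 60)]
print(consolidate_boxes(annotations))  # -> (10, 12, 50, 60)
```

The median damps the outlier box that a mean-based merge would let shift the result.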
Results
  • A detection is counted as correct when the predicted bounding box covers at least 50% of the ground-truth bounding box.
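The criterion above can be checked in a few lines. The `(x, y, w, h)` box layout and function names below are illustrative choices, not the benchmark's evaluation code:

```python
def overlap_ratio(pred, gt):
    """Fraction of the ground-truth box covered by the predicted box.
    Boxes are (x, y, w, h) tuples."""
    px, py, pw, ph = pred
    gx, gy, gw, gh = gt
    # Intersection width/height, clamped at zero when boxes are disjoint.
    ix = max(0, min(px + pw, gx + gw) - max(px, gx))
    iy = max(0, min(py + ph, gy + gh) - max(py, gy))
    return (ix * iy) / float(gw * gh)

def is_correct_detection(pred, gt, threshold=0.5):
    """Correct when the prediction covers >= 50% of the ground truth."""
    return overlap_ratio(pred, gt) >= threshold

print(is_correct_detection((0, 0, 10, 10), (5, 5, 10, 10)))  # 25% overlap -> False
print(is_correct_detection((2, 2, 10, 10), (0, 0, 10, 10)))  # 64% overlap -> True
```

Note that this ratio is normalized by the ground-truth area, as the criterion states; intersection-over-union is the other common convention.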
Conclusion
  • This paper has introduced the IARPA Janus Benchmark A (IJB-A) dataset. The IJB-A is a joint face detection and face recognition dataset.
  • The IJB-A dataset is motivated by a need to push the state of the art in unconstrained face recognition.
  • As such, it allows improvements in the face recognition tasks to proceed without being blocked by parallel research in face detection.
  • While training on a gallery introduces potential computation bottlenecks in image enrollment, it allows for a paradigm to be explored that is similar to human recognition of "familiar" faces.
Tables
  • Table1: A comparison of key statistics of the proposed IJB-A dataset and seminal unconstrained face recognition datasets
  • Table2: Geographic distribution of subjects contained in IJB-A
  • Table3: Specific recognition accuracies required to be reported on the IJB-A dataset. Results are from the baseline GOTS algorithm, and the open source algorithm OpenBR
  • Table4: Face detection accuracies on the proposed IJB-A dataset at specified operating points. Shown are the true detect rates (TDR) at false detect rate (FDR) per image of 0.1 (one false detect every 10 images) and 0.01 (one false detect every 100 images)
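The Table 4 operating points (TDR at a fixed number of false detects per image) can be computed by thresholding detector scores; the sketch below is an illustration of that metric, not the benchmark's evaluation code, and the input layout is an assumption.

```python
def tdr_at_fdr(true_scores, false_scores, n_images, fdr_per_image):
    """True detect rate (TDR) at a fixed false detect rate (FDR) per
    image. `true_scores` are detector confidences on ground-truth faces,
    `false_scores` are confidences on spurious detections."""
    allowed = min(int(fdr_per_image * n_images), len(false_scores))
    if allowed == 0:
        return 0.0  # simplification: no false detects may be admitted
    # Threshold at the score of the last false detect we can tolerate.
    thresh = sorted(false_scores, reverse=True)[allowed - 1]
    return sum(s >= thresh for s in true_scores) / len(true_scores)

# 20 images at FDR 0.1/image => 2 false detects allowed:
print(tdr_at_fdr([0.9, 0.8, 0.4, 0.3], [0.7, 0.5, 0.2, 0.1], 20, 0.1))  # -> 0.5
```

Lower FDR operating points (e.g., 0.01 per image) force a higher threshold, which is why the TDR at 0.01 is always at most the TDR at 0.1.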
Funding
  • This research is based upon work supported by the Office of the Director of National Intelligence (ODNI) and the Intelligence Advanced Research Projects Activity (IARPA).
Study subjects and analysis
subjects with manually localized face images: 500
The implication of this strategy is restricted variations in face pose and other confounding factors. This paper introduces the IARPA Janus Benchmark A (IJB-A), a publicly available "media in the wild" dataset containing 500 subjects with manually localized face images. Key features of the IJB-A dataset are: (i) full pose variation, (ii) joint use for face recognition and face detection benchmarking, (iii) a mix of images and videos, (iv) wider geographic variation of subjects, (v) protocols supporting both open-set identification (1:N search) and verification (1:1 comparison), (vi) an optional protocol that allows modeling of gallery subjects, and (vii) ground truth eye and nose locations.

AMT workers: 5
Specific visual and written guidance was given to annotators to place the bounding box around the boundary of the head. Each image was annotated by at least five AMT workers. In order to consolidate the multiple annotations into a single set of annotations, the following approach was applied.

subjects: 500
In this paper we introduce the IARPA Janus Benchmark A (IJB-A), which is publicly available for download. The IJB-A contains images and videos of 500 subjects captured in "in the wild" environments. All labelled subjects have been manually localized with bounding boxes for face detection, as well as fiducial landmarks for the center of the two eyes (if visible) and the base of the nose.

IJB-A subjects: 500
The following specifications pertain to both the search and compare protocols. There are ten random training and testing splits, generated at the subject level using all 500 IJB-A subjects. For each split, 333 subjects are randomly sampled and placed in the training split.

subjects: 333
There are ten random training and testing splits, generated at the subject level using all 500 IJB-A subjects. For each split, 333 subjects are randomly sampled and placed in the training split. These subjects are available for algorithms to build models and learn the variations in facial appearance that are representative of the Janus challenge.

subjects: 167
These subjects are available for algorithms to build models and learn the variations in facial appearance that are representative of the Janus challenge. The remaining 167 subjects are placed in the testing split. Additional imagery may be used to train an algorithm under the strict condition that no such imagery contains the same subjects that are in the test split.
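A subject-disjoint split of this kind is straightforward to generate. The sketch below illustrates the 333/167 structure described above; the actual IJB-A splits are fixed and distributed with the dataset, so the seed and function name here are assumptions for illustration only.

```python
import random

def make_splits(subject_ids, n_splits=10, n_train=333, seed=0):
    """Generate subject-disjoint train/test splits: for each split,
    n_train subjects are randomly sampled for training and the rest
    are held out for testing. Splitting at the subject level ensures
    no identity appears in both partitions of a split."""
    rng = random.Random(seed)
    ids = list(subject_ids)
    splits = []
    for _ in range(n_splits):
        rng.shuffle(ids)
        splits.append((set(ids[:n_train]), set(ids[n_train:])))
    return splits

splits = make_splits(range(500))  # 500 placeholder subject identifiers
train, test = splits[0]
assert len(train) == 333 and len(test) == 167 and not (train & test)
```

Because the sampling is per subject rather than per image, all images and videos of a given identity land entirely on one side of each split.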

randomly selected subjects: 55
Search: The search protocol measures the accuracy of open-set and closed-set search on the gallery templates using probe templates. To prevent an algorithm from leveraging a priori knowledge that every probe subject contains a mate in the gallery [5], 55 randomly selected subjects in each split have their templates/imagery removed from the gallery set. Every probe template in a given split (regardless of whether or not the gallery contains the probe's mated templates) is to be searched against the set of gallery templates.
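The gallery pruning and ranked search described above can be sketched as follows. The data layout (`template_id -> (subject_id, feature)`), the function names, and the toy similarity are all assumptions for illustration, not the benchmark's data format or scoring API.

```python
def prune_gallery(gallery, removed_subjects):
    """Remove all templates of the chosen subjects from the gallery,
    so their probes become non-mated (open-set) searches."""
    return {tid: entry for tid, entry in gallery.items()
            if entry[0] not in removed_subjects}

def search(probe_feature, gallery, similarity):
    """Return gallery template ids ranked by similarity to the probe.
    For non-mated probes, every score should ideally fall below the
    acceptance threshold."""
    ranked = sorted(gallery.items(),
                    key=lambda kv: similarity(probe_feature, kv[1][1]),
                    reverse=True)
    return [tid for tid, _ in ranked]

# Toy example with 1-D "features" and negative-distance similarity:
g = {"t1": ("A", 0.9), "t2": ("B", 0.2), "t3": ("C", 0.5)}
open_g = prune_gallery(g, {"A"})  # subject A now has no gallery mate
sim = lambda p, f: -abs(p - f)
print(search(0.55, open_g, sim))  # -> ['t3', 't2']
```

Searching every probe against the pruned gallery is what lets the protocol measure both detection-and-identification rates (mated probes) and false alarm rates (non-mated probes).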

labelled subjects: 500
  • Examples of the faces in the IJB-A dataset. These images and video frames highlight many of the key characteristics of this publicly available dataset, including full pose variation, a mixture of images and videos, and a wide variation in imaging conditions and geographic origin.
  • The IJB-A dataset contains a mix of images and videos for 500 labelled subjects. Shown are distributions of the number of images and videos per subject.
  • Overview of the data collection and annotation process. The first step involved selecting subjects, or "persons of interest", that, in aggregate, have wide geographic distribution. For each subject, Creative Commons (CC) licensed images and videos were discovered and ingested. Using crowd-sourced labor, multiple annotations were performed for each image and video I-frame for the bounding box location of all faces. In turn, the bounding box for the person of interest was identified, and three fiducial landmarks (both eyes and nose base) were annotated. Finally, analysts inspected the data to ensure correctness.

Reference
  • [1] IARPA Janus Broad Agency Announcement, IARPA-BAA-1307.
  • [2] L. Best-Rowden, H. Han, C. Otto, B. Klare, and A. K. Jain. Unconstrained face recognition: Identifying a person of interest from a media collection. IEEE Transactions on Information Forensics and Security, 2014.
  • [3] J. R. Beveridge et al. The challenge of face recognition from digital point-and-shoot cameras. In IEEE Biometrics: Theory, Applications, and Systems, 2014.
  • [4] J. Cheney, B. Klein, A. K. Jain, and B. F. Klare. Unconstrained face detection: State of the art baseline and challenges. In IAPR Int. Conference on Biometrics, 2012.
  • [5] P. Grother and M. Ngan. Face recognition vendor test (FRVT): Performance of face identification algorithms. NIST Interagency Report 8009, 2014.
  • [6] G. B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical Report 07-49, University of Massachusetts, Amherst, October 2007.
  • [7] V. Jain and E. Learned-Miller. FDDB: A benchmark for face detection in unconstrained settings. Technical Report UM-CS-2010-009, University of Massachusetts, Amherst, 2010.
  • [8] B. Klein, K. Allen, A. Jain, and B. Klare. Limiting factors in unconstrained face recognition and detection. In IEEE Biometrics: Theory, Applications, and Systems (under review), 2015.
  • [9] J. C. Klontz and A. Jain. A case study of automated face recognition: The Boston Marathon bombing suspects. IEEE Computer, November 2013.
  • [10] J. C. Klontz, B. F. Klare, S. Klum, A. K. Jain, and M. J. Burge. Open source biometric recognition. In IEEE Biometrics: Theory, Applications, and Systems (under review), 2013.
  • [11] N. Kumar, A. C. Berg, P. N. Belhumeur, and S. K. Nayar. Attribute and simile classifiers for face verification. In IEEE International Conference on Computer Vision, pages 365–372, 2009.
  • [12] S. Liao, Z. Lei, D. Yi, and S. Z. Li. A benchmark study of large-scale unconstrained face recognition. In International Joint Conference on Biometrics (IJCB), 2014.
  • [13] M. Mathias, R. Benenson, M. Pedersoli, and L. Van Gool. Face detection without bells and whistles. In ECCV, 2014.
  • [14] V. Natu and A. J. O'Toole. The neural processing of familiar and unfamiliar faces: A review and synopsis. British Journal of Psychology, 102(4):726–747, 2011.
  • [15] A. J. O'Toole, X. An, J. Dunlop, V. Natu, and P. J. Phillips. Comparing face recognition algorithms to humans on challenging tasks. ACM Transactions on Applied Perception (TAP), 9(4):16, 2012.
  • [16] E. Taborsky, K. Allen, A. Blanton, A. K. Jain, and B. F. Klare. Annotating unconstrained face imagery: A scalable approach. In IAPR Int. Conference on Biometrics, 2014.
  • [17] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf. DeepFace: Closing the gap to human-level performance in face verification. In IEEE Computer Vision and Pattern Recognition, pages 1701–1708, 2014.
  • [18] P. Viola and M. J. Jones. Robust real-time face detection. International Journal of Computer Vision, 57(2):137–154, 2004.
  • [19] L. Wolf, T. Hassner, and I. Maoz. Face recognition in unconstrained videos with matched background similarity. In IEEE Computer Vision and Pattern Recognition, pages 529–534, 2011.