Domain Balancing: Face Recognition on Long-Tailed Domains

CVPR, pp. 5670-5678, 2020.

DOI: https://doi.org/10.1109/CVPR42600.2020.00571
We propose a Domain Balancing Margin to adaptively modify the margin according to the Domain Frequency Indicator for each class, so that the loss produced by the tail domain classes can be relatively up-weighted.

Abstract:

The long-tailed problem has been an important topic in face recognition. However, existing methods concentrate only on the long-tailed distribution of classes. In contrast, we address the long-tailed domain distribution problem, which refers to the fact that a small number of domains appear frequently while other domains far less exi…

Introduction
  • Feature descriptors are of crucial importance to the performance of face recognition, where the training and testing images are drawn from different identities and a distance metric is applied directly to the features to determine whether they belong to the same identity or not.
  • Face recognition often suffers from poor generalization, i.e., the learned features work well only on the domain matching the training set and perform poorly on unseen domains.
  • This is one of the most critical issues for face recognition in the wild, partially due to the non-negligible domain shift from the training set to the deployment environment.
Highlights
  • Taking the domains in Figure 1 as an example, we find that compact regions tend to belong to the head domains, while sparse regions tend to belong to the tail domains
  • In the loss function, we propose a Domain Balancing Margin (DBM) to adaptively modify the margin according to the Domain Frequency Indicator for each class, so that the loss produced by the tail domain classes can be relatively up-weighted
  • The reason may be that the proposed balancing strategy can efficiently mitigate the potential impact of the long-tailed domain distribution, which is ubiquitous in real-world applications
  • We investigate a novel long-tailed domain problem in real-world face recognition, which refers to few common domains and many more rare domains
  • A novel Domain Balancing mechanism is proposed to deal with this problem, which contains three components, Domain Frequency Indicator (DFI), Residual Balancing Mapping (RBM) and Domain Balancing Margin (DBM)
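The Domain Balancing Margin described in the highlights above can be illustrated with a small sketch. This is not the paper's exact formulation: the way the per-class margin scales with the Domain Frequency Indicator (here a simple product `base_margin * dfi`), and all function names, are assumptions. Only the overall idea follows the text: a cosine-margin softmax whose target-logit margin grows for tail-domain classes, so their loss is relatively up-weighted.

```python
import numpy as np

def dbm_loss(features, weights, labels, dfi, base_margin=0.35, scale=64.0):
    """Sketch of a DBM-style loss (assumed form, not the paper's formula):
    a cosine-margin softmax whose per-class margin grows with the Domain
    Frequency Indicator (DFI), up-weighting tail-domain classes."""
    # L2-normalize features and class weights, as in cosine-based softmax losses.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = f @ w.T                       # (batch, num_classes) cosine similarities
    rows = np.arange(len(labels))
    logits = scale * cos
    # Assumed margin form: base margin scaled by the DFI of the ground-truth class.
    logits[rows, labels] = scale * (cos[rows, labels] - base_margin * dfi[labels])
    # Numerically stable softmax cross-entropy.
    logits -= logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    return -np.log(p[rows, labels]).mean()
```

With this form, a larger DFI subtracts a larger margin from the target logit, so the cross-entropy of tail-domain samples rises relative to head-domain ones.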
Methods
  • Training data used by the compared methods:

    Method            Training Data
    DeepFace [30]     4M
    FaceNet [26]      200M
    DeepFR [24]       2.6M
    DeepID2+ [29]     300K
    Center Face [35]  0.7M
    Baidu [17]        1.3M
    Softmax           5M
    SphereFace [18]   5M
    CosFace [32]      5M
    ArcFace [4]       5M
    Ours              5M
  • A novel Domain Balancing mechanism is proposed to deal with this problem, which contains three components, Domain Frequency Indicator (DFI), Residual Balancing Mapping (RBM) and Domain Balancing Margin (DBM).
  • Extensive analyses and experiments on several face recognition benchmarks demonstrate that the proposed method can effectively enhance the discrimination and achieve superior accuracy.
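Table 3 indicates that the Domain Frequency Indicator is computed from the K nearest neighbors of each class. Below is a hedged sketch of one plausible form: DFI as the mean cosine distance from a class center to its K nearest fellow centers, so sparse (tail-domain) regions score higher and compact (head-domain) regions score lower. The exact formula, and the function name, are assumptions.

```python
import numpy as np

def domain_frequency_indicator(class_centers, k=3):
    """Hypothetical DFI sketch: mean cosine distance from each class center
    to its K nearest other centers. Centers in compact (head-domain) regions
    score low; isolated (tail-domain) centers score high. The paper's exact
    formula may differ; this assumes an inverse-density form."""
    c = class_centers / np.linalg.norm(class_centers, axis=1, keepdims=True)
    sim = c @ c.T                            # pairwise cosine similarities
    np.fill_diagonal(sim, -np.inf)           # exclude each center from its own neighbors
    nearest = np.sort(sim, axis=1)[:, -k:]   # K most similar neighbors
    return (1.0 - nearest).mean(axis=1)      # mean cosine distance to them
```

Under this reading, K trades off locality against smoothness of the density estimate, which is consistent with Table 3 sweeping K on LFW, CALFW and CPLFW.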
Results
  • 4.5.1 Results on LFW and LFW BLUFR

    LFW is the most widely used benchmark for unconstrained face recognition.
  • Table 4 displays the comparison of all the methods on the LFW test set.
  • The proposed method improves the performance from 99.62% to 99.78%.
  • The authors evaluate the method on the more challenging LFW BLUFR protocol.
  • The authors' approach still achieves the best results compared to the state of the art.
  • The margin-based methods attain better results than the simple softmax loss for face recognition.
  • The proposed method surpasses the best competing approach, ArcFace, by a clear margin.
  • The reason may be that the proposed balancing strategy can efficiently mitigate the potential impact of the long-tailed domain distribution, which is ubiquitous in real-world applications
Conclusion
  • The authors investigate a novel long-tailed domain problem in real-world face recognition, which refers to few common domains and many more rare domains.
Tables
  • Table1: Statistics of face datasets for training and testing. (P) and (G) indicate the probe and gallery sets respectively
  • Table2: Face verification results (%) with different strategies. (CASIA-Webface, ResNet18, RBM (w/o sg) refers to RBM without the soft gate, i.e., f (x) = 1.)
  • Table3: Performance (%) vs. K on LFW, CALFW and CPLFW datasets, where K is the number of nearest neighbor in Domain Frequency Indicator (DFI)
  • Table4: Face verification (%) on the LFW dataset. ”Training Data” indicates the size of the training data involved. ”Models” indicates the number of models used for evaluation
  • Table5: Face verification (%) on LFW BLUFR protocol
  • Table6: Face verification (%) on CALFW, CPLFW and AgeDB
  • Table7: Face identification and verification on MegaFace Challenge 1. "Rank 1" refers to the rank-1 face identification accuracy, and "Ver" refers to the face verification TAR at 10−6 FAR
Related work
  • Softmax-based Face Recognition. Deep convolutional neural networks (CNNs) [3] have achieved impressive success in face recognition. The prevailing softmax loss treats training as an N-way classification problem. Sun et al. [28] propose DeepID for face verification; during training, the feature extracted from each sample is used to compute dot products with all the class-specific weights. Wen et al. [35] propose a center loss penalizing the distances between features and their corresponding class centers. Wang et al. [31] study the effect of normalization during training and show that optimizing cosine similarity (a cosine-based softmax loss) instead of the inner product improves performance. Recently, a variety of margin-based softmax losses [18, 32, 4] have achieved state-of-the-art performance. SphereFace [18] adds an extra angular margin to attain a sharper decision boundary than the original softmax loss, concentrating the features on a sphere manifold. CosFace [32] shares a similar idea, encouraging intra-class compactness in the cosine manifold. ArcFace [4] uses an additive angular margin, to similar effect. However, these efforts only consider intra-class compactness; RegularFace [38] proposes an exclusive regularization that focuses on inter-class separability. These methods mainly aim to enlarge inter-class differences and reduce intra-class variations. Despite their excellent performance on face recognition, they rely on large, balanced datasets and often suffer performance degradation when faced with long-tailed data.
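The three margin-based losses surveyed above differ only in how they modify the target logit cos θ for the ground-truth class (the common scale factor s is omitted here for clarity): SphereFace multiplies the angle, CosFace subtracts from the cosine, and ArcFace adds to the angle. A minimal illustration of the standard formulations:

```python
import numpy as np

def modified_target_logit(cos_theta, kind, m):
    """Target-logit modifications of the margin-based softmax losses above
    (the common scale factor s is omitted for clarity)."""
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    if kind == "sphereface":
        return np.cos(m * theta)      # multiplicative angular margin: cos(m*theta)
    if kind == "cosface":
        return cos_theta - m          # additive cosine margin: cos(theta) - m
    if kind == "arcface":
        return np.cos(theta + m)      # additive angular margin: cos(theta + m)
    raise ValueError(f"unknown loss kind: {kind}")
```

Each variant lowers the target logit for the ground-truth class, forcing the network to pull features closer to their class weight before a sample is classified confidently, which is the intra-compactness effect discussed above.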
Funding
  • This work has been partially supported by the Chinese National Natural Science Foundation Projects #61876178, #61806196, #61976229, #61872367
References
  • [1] Baidu cloud vision api. http://ai.baidu.com.
  • [2] Bor-Chun Chen, Chu-Song Chen, and Winston H Hsu.
  • [3] Y Le Cun. Convolutional networks for images, speech, and time series. 1995.
  • [4] Jiankang Deng, Jia Guo, Niannan Xue, and Stefanos Zafeiriou. Arcface: Additive angular margin loss for deep face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4690–4699, 2019.
  • [5] Qi Dong, Shaogang Gong, and Xiatian Zhu. Class rectification hard mining for imbalanced deep learning. In Proceedings of the IEEE International Conference on Computer Vision, pages 1851–1860, 2017.
  • [6] Jianzhu Guo, Xiangyu Zhu, Zhen Lei, and Stan Z Li. Face synthesis for eyeglass-robust face recognition. In Chinese Conference on Biometric Recognition, pages 275–284.
  • [7] Jianzhu Guo, Xiangyu Zhu, Chenxu Zhao, Dong Cao, Zhen Lei, and Stan Z Li. Learning meta face recognition in unseen domains. arXiv preprint arXiv:2003.07733, 2020.
  • [8] Yandong Guo, Lei Zhang, Yuxiao Hu, Xiaodong He, and Jianfeng Gao. Ms-celeb-1m: A dataset and benchmark for large-scale face recognition. In European Conference on Computer Vision, pages 87–102.
  • [9] David Ha, Andrew Dai, and Quoc V Le. Hypernetworks. arXiv preprint arXiv:1609.09106, 2016.
  • [10] Haibo He and Edwardo A Garcia. Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21(9):1263–1284, 2009.
  • [11] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Identity mappings in deep residual networks. In European Conference on Computer Vision, pages 630–645.
  • [12] Chen Huang, Yining Li, Chen Change Loy, and Xiaoou Tang. Learning deep representation for imbalanced classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5375–5384, 2016.
  • [13] Gary B Huang, Marwan Mattar, Tamara Berg, and Eric Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. 2008.
  • [14] Ira Kemelmacher-Shlizerman, Steven M Seitz, Daniel Miller, and Evan Brossard. The megaface benchmark: 1 million faces for recognition at scale. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4873–4882, 2016.
  • [15] Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollar. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, pages 2980–2988, 2017.
  • [16] Hao Liu, Xiangyu Zhu, Zhen Lei, and Stan Z Li. Adaptiveface: Adaptive margin and sampling for face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 11947–11956, 2019.
  • [17] Jingtuo Liu, Yafeng Deng, Tao Bai, Zhengping Wei, and Chang Huang. Targeting ultimate accuracy: Face recognition via deep embedding. arXiv preprint arXiv:1506.07310, 2015.
  • [18] Weiyang Liu, Yandong Wen, Zhiding Yu, Ming Li, Bhiksha Raj, and Le Song. Sphereface: Deep hypersphere embedding for face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 212–220, 2017.
  • [19] Ziwei Liu, Zhongqi Miao, Xiaohang Zhan, Jiayun Wang, Boqing Gong, and Stella X Yu. Large-scale long-tailed recognition in an open world. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2537–2546, 2019.
  • [20] Laurens van der Maaten and Geoffrey Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 9(Nov):2579–2605, 2008.
  • [21] Stylianos Moschoglou, Athanasios Papaioannou, Christos Sagonas, Jiankang Deng, Irene Kotsia, and Stefanos Zafeiriou. Agedb: The first manually collected, in-the-wild age database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 51–59, 2017.
  • [22] Hong-Wei Ng and Stefan Winkler. A data-driven approach to cleaning large face datasets. In 2014 IEEE International Conference on Image Processing (ICIP), pages 343–347. IEEE, 2014.
  • [23] Hyun Oh Song, Yu Xiang, Stefanie Jegelka, and Silvio Savarese. Deep metric learning via lifted structured feature embedding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4004–4012, 2016.
  • [24] Omkar M Parkhi, Andrea Vedaldi, Andrew Zisserman, et al. Deep face recognition. In BMVC, volume 1, page 6, 2015.
  • [25] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. 2017.
  • [26] Florian Schroff, Dmitry Kalenichenko, and James Philbin. Facenet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 815–823, 2015.
  • [27] Li Shen, Zhouchen Lin, and Qingming Huang. Relay backpropagation for effective learning of deep convolutional neural networks. In European Conference on Computer Vision, pages 467–482.
  • [28] Yi Sun, Xiaogang Wang, and Xiaoou Tang. Deep learning face representation from predicting 10,000 classes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1891–1898, 2014.
  • [29] Yi Sun, Xiaogang Wang, and Xiaoou Tang. Deeply learned face representations are sparse, selective, and robust. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2892–2900, 2015.
  • [30] Y Taigman, M Yang, M Ranzato, and L Wolf. Deepface: Closing the gap to human-level performance in face verification. In IEEE Computer Vision and Pattern Recognition (CVPR), volume 5, page 6, 2014.
  • [31] Feng Wang, Xiang Xiang, Jian Cheng, and Alan Loddon Yuille. Normface: L2 hypersphere embedding for face verification. In Proceedings of the 25th ACM International Conference on Multimedia, pages 1041–1049. ACM, 2017.
  • [32] Hao Wang, Yitong Wang, Zheng Zhou, Xing Ji, Dihong Gong, Jingchao Zhou, Zhifeng Li, and Wei Liu. Cosface: Large margin cosine loss for deep face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5265–5274, 2018.
  • [33] Mei Wang, Weihong Deng, Jiani Hu, Jianteng Peng, Xunqiang Tao, and Yaohai Huang. Racial faces in-the-wild: Reducing racial bias by deep unsupervised domain adaptation. arXiv preprint arXiv:1812.00194, 2018.
  • [34] Yu-Xiong Wang, Deva Ramanan, and Martial Hebert. Learning to model the tail. In Advances in Neural Information Processing Systems, pages 7029–7039, 2017.
  • [35] Yandong Wen, Kaipeng Zhang, Zhifeng Li, and Yu Qiao. A discriminative feature learning approach for deep face recognition. In European Conference on Computer Vision, pages 499–515.
  • [36] Dong Yi, Zhen Lei, Shengcai Liao, and Stan Z Li. Learning face representation from scratch. arXiv preprint arXiv:1411.7923, 2014.
  • [37] Xiao Zhang, Zhiyuan Fang, Yandong Wen, Zhifeng Li, and Yu Qiao. Range loss for deep face recognition with long-tailed training data. In Proceedings of the IEEE International Conference on Computer Vision, pages 5409–5418, 2017.
  • [38] Kai Zhao, Jingyi Xu, and Ming-Ming Cheng. Regularface: Deep face recognition via exclusive regularization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1136–1144, 2019.
  • [39] Tianyue Zheng and Weihong Deng. Cross-pose LFW: A database for studying cross-pose face recognition in unconstrained environments. Beijing University of Posts and Telecommunications, Tech. Rep. 18-01, 2018.
  • [40] Tianyue Zheng, Weihong Deng, and Jiani Hu. Cross-age LFW: A database for studying cross-age face recognition in unconstrained environments. arXiv preprint arXiv:1708.08197, 2017.
  • [41] Q Zhong, C Li, Y Zhang, H Sun, S Yang, D Xie, and S Pu. Towards good practices for recognition & detection. In CVPR Workshops, volume 1, 2016.
  • [42] Xiangyu Zhu, Hao Liu, Zhen Lei, Hailin Shi, Fan Yang, Dong Yi, Guojun Qi, and Stan Z Li. Large-scale bisample learning on ID versus spot face recognition. International Journal of Computer Vision, 127(6-7):684–700, 2019.