A Discriminative Feature Learning Approach for Deep Face Recognition

ECCV, pp. 499-515, 2016.

Abstract:

Convolutional neural networks (CNNs) have been widely used in the computer vision community, significantly improving the state of the art. In most of the available CNNs, the softmax loss function is used as the supervision signal to train the deep model. In order to enhance the discriminative power of the deeply learned features, this paper p...

Introduction
  • Convolutional neural networks (CNNs) have achieved great success in the vision community, significantly improving the state of the art in classification problems such as object [11,12,18,28,33], scene [41,42], and action [3,16,36] recognition
  • This success mainly benefits from large-scale training data [8,26] and the end-to-end learning framework.
Highlights
  • Convolutional neural networks (CNNs) have achieved great success in the vision community, significantly improving the state of the art in classification problems such as object [11,12,18,28,33], scene [41,42], and action [3,16,36] recognition
  • We propose a new loss function, namely center loss, to efficiently enhance the discriminative power of the deeply learned features in neural networks
  • – We propose a new loss function to minimize the intra-class distances of the deep features
  • – We show that the proposed loss function is easy to implement in CNNs
  • We have proposed a new loss function, referred to as center loss
  • By combining the center loss with the softmax loss to jointly supervise the learning of CNNs, the discriminative power of the deeply learned features can be greatly enhanced for robust face recognition
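The joint supervision sketched in the highlights adds a center-loss term L_C = (1/2) Σ_i ||x_i − c_{y_i}||², weighted by a scalar λ, to the softmax loss, while the per-class centers c_j are updated with a damped per-batch rule. A minimal NumPy sketch of one such step (the function name and the default α and λ values are illustrative, not the paper's exact training setup):

```python
import numpy as np

def center_loss_step(features, labels, centers, alpha=0.5, lam=0.003):
    """Hypothetical sketch of one center-loss step.

    features: (batch, d) deep features from the CNN
    labels:   (batch,)   ground-truth class indices
    centers:  (classes, d) per-class feature centers, updated in place
    Returns lam/2 * sum_i ||x_i - c_{y_i}||^2.
    """
    diff = features - centers[labels]          # x_i - c_{y_i}
    loss = 0.5 * lam * float(np.sum(diff ** 2))
    # Move each center toward its batch samples, damped by 1/(1 + n_j)
    # in the spirit of the paper's center-update rule.
    for j in np.unique(labels):
        mask = labels == j
        delta = np.sum(centers[j] - features[mask], axis=0) / (1.0 + mask.sum())
        centers[j] -= alpha * delta
    return loss
```

In training, this scalar would be added to the softmax loss so the gradient both separates classes and pulls each feature toward its class center.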
Methods
  • Compared to model B, model C achieves better performance (99.28% vs. 99.10% on LFW and 94.9% vs. 93.8% on YTF)
  • This shows the advantage of the center loss over the contrastive loss in the designed CNNs. Finally, compared against the state-of-the-art results, the proposed model C is consistently among the top-ranked approaches on both databases, outperforming most of the existing results in Table 2.
Conclusion
  • The authors have proposed a new loss function, referred to as center loss.
  • By combining the center loss with the softmax loss to jointly supervise the learning of CNNs, the discriminative power of the deeply learned features can be greatly enhanced for robust face recognition.
  • Extensive experiments on several large-scale face benchmarks have convincingly demonstrated the effectiveness of the proposed approach
Tables
  • Table 1: The CNN architecture used in the toy example, called LeNets++. Some of the convolution layers are followed by max pooling. (5, 32)/1,2 × 2 denotes 2 cascaded convolution layers with 32 filters of size 5 × 5, where the stride and padding are 1 and 2 respectively. 2/2,0 denotes a max-pooling layer with a 2 × 2 grid, where the stride and padding are 2 and 0 respectively. LeNets++ uses the Parametric Rectified Linear Unit (PReLU) [12] as the nonlinearity
  • Table 2: Verification performance of different methods on LFW and YTF datasets
  • Table 3: Identification rates of different methods on MegaFace with 1M distractors
  • Table 4: Verification TAR of different methods at 10−6 FAR on MegaFace with 1M distractors
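To make the (k, n)/s,p layer notation of Table 1 concrete, here is a small Python sketch of the standard convolution/pooling output-size formula (the helper name is ours, and the 28 × 28 input size assumes the MNIST digits [20] of the toy example):

```python
def conv_out_size(n, kernel, stride, pad):
    """Spatial output size of a convolution or pooling layer:
    floor((n + 2*pad - kernel) / stride) + 1."""
    return (n + 2 * pad - kernel) // stride + 1

# A (5, 32)/1,2 convolution layer preserves spatial size, e.g. on a 28x28 digit:
assert conv_out_size(28, kernel=5, stride=1, pad=2) == 28
# A 2/2,0 max-pooling layer halves it:
assert conv_out_size(28, kernel=2, stride=2, pad=0) == 14
```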
Related work
  • Face recognition via deep learning has achieved a series of breakthroughs in recent years [25,27,29,30,34,37]. The idea of mapping a pair of face images to a distance starts from [6], which trains siamese networks to drive the similarity metric to be small for positive pairs and large for negative pairs. Hu et al. [13] learn nonlinear transformations that yield a discriminative deep metric with a margin between positive and negative face image pairs. These approaches require image pairs as input.

    Very recently, [31,34] supervise the learning process in CNNs with a challenging identification signal (the softmax loss function), which brings richer identity-related information to the deeply learned features. After that, a joint identification-verification supervision signal was adopted in [29,37], leading to more discriminative features. [32] enhances the supervision by adding a fully connected layer and loss functions to each convolutional layer. The effectiveness of the triplet loss has been demonstrated in [21,25,27]. With the deep embedding, the distance between an anchor and a positive is minimized, while the distance between an anchor and a negative is maximized until the margin is met. These methods achieve state-of-the-art performance on the LFW and YTF datasets.
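The pairwise and triplet objectives discussed above can be written down in a few lines. A minimal NumPy sketch (margins and function names are illustrative; the contrastive form follows the spirit of [6,10], the triplet hinge that of FaceNet [27]):

```python
import numpy as np

def contrastive_loss(x1, x2, same, margin=1.0):
    """Pull positive pairs together; push negative pairs past the margin."""
    d = np.linalg.norm(x1 - x2, axis=-1)
    return float(np.where(same, d ** 2, np.maximum(margin - d, 0.0) ** 2).mean())

def triplet_loss(anchor, positive, negative, margin=0.2):
    """The anchor-positive distance should beat the anchor-negative
    distance by at least the margin; zero loss once the margin is met."""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return float(np.maximum(d_pos - d_neg + margin, 0.0).mean())
```

Both objectives operate on pairs or triplets of samples, which is what motivates the center loss above: it constrains intra-class variation using only per-sample labels, with no pair or triplet mining.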
Funding
  • This work was funded by External Cooperation Program of BIC, Chinese Academy of Sciences (172644KYSB20160033, 172644KYSB20150019), Shenzhen Research Program (KQCX2015033117354153, JSGG20150925164740726, CXZZ20150930104115529 and JCYJ20150925163005055), Guangdong Research Program (2014B050505017 and 2015B010129013), Natural Science Foundation of Guangdong Province (2014A030313688) and the Key Laboratory of Human-Machine Intelligence-Synergy Systems through the Chinese Academy of Sciences
Reference
  • 1. FG-NET aging database (2010). http://www.fgnet.rsunit.com/
  • 2. Ahonen, T., Hadid, A., Pietikainen, M.: Face description with local binary patterns: application to face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 28(12), 2037–2041 (2006)
  • 3. Baccouche, M., Mamalet, F., Wolf, C., Garcia, C., Baskurt, A.: Sequential deep learning for human action recognition. In: Salah, A.A., Lepri, B. (eds.) HBU 2011. LNCS, vol. 7065, pp. 29–39. Springer, Heidelberg (2011). doi:10.1007/978-3-642-25446-8_4
  • 4. Chen, B.C., Chen, C.S., Hsu, W.H.: Face recognition and retrieval using cross-age reference coding with cross-age celebrity dataset. IEEE Trans. Multimedia 17(6), 804–815 (2015)
  • 5. Chen, X., Li, Q., Song, Y., Jin, X., Zhao, Q.: Supervised geodesic propagation for semantic label transfer. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7574, pp. 553–565. Springer, Heidelberg (2012). doi:10.1007/978-3-642-33712-3_40
  • 6. Chopra, S., Hadsell, R., LeCun, Y.: Learning a similarity metric discriminatively, with application to face verification. In: CVPR 2005, vol. 1, pp. 539–546. IEEE (2005)
  • 7. Cover, T.M., Hart, P.E.: Nearest neighbor pattern classification. IEEE Trans. Inf. Theor. 13(1), 21–27 (1967)
  • 8. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR 2009, pp. 248–255. IEEE (2009)
  • 9. Fukunaga, K., Narendra, P.M.: A branch and bound algorithm for computing k-nearest neighbors. IEEE Trans. Comput. 100(7), 750–753 (1975)
  • 10. Hadsell, R., Chopra, S., LeCun, Y.: Dimensionality reduction by learning an invariant mapping. In: CVPR 2006, vol. 2, pp. 1735–1742. IEEE (2006)
  • 11. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385 (2015)
  • 12. He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: ICCV, pp. 1026–1034 (2015)
  • 13. Hu, J., Lu, J., Tan, Y.P.: Discriminative deep metric learning for face verification in the wild. In: CVPR, pp. 1875–1882 (2014)
  • 14. Huang, G.B., Learned-Miller, E.: Labeled faces in the wild: updates and new reporting procedures. Technical Report 14-003, University of Massachusetts, Amherst (2014)
  • 15. Huang, G.B., Ramesh, M., Berg, T., Learned-Miller, E.: Labeled faces in the wild: a database for studying face recognition in unconstrained environments. Technical Report 07-49, University of Massachusetts, Amherst (2007)
  • 16. Ji, S., Xu, W., Yang, M., Yu, K.: 3D convolutional neural networks for human action recognition. IEEE Trans. Pattern Anal. Mach. Intell. 35(1), 221–231 (2013)
  • 17. Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T.: Caffe: convolutional architecture for fast feature embedding. In: ACM Multimedia, pp. 675–678. ACM (2014)
  • 18. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: NIPS, pp. 1097–1105 (2012)
  • 19. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
  • 20. LeCun, Y., Cortes, C., Burges, C.J.: The MNIST database of handwritten digits (1998)
  • 21. Liu, J., Deng, Y., Huang, C.: Targeting ultimate accuracy: face recognition via deep embedding. arXiv preprint arXiv:1506.07310 (2015)
  • 22. Liu, Z., Luo, P., Wang, X., Tang, X.: Deep learning face attributes in the wild. In: ICCV, pp. 3730–3738 (2015)
  • 23. Miller, D., Kemelmacher-Shlizerman, I., Seitz, S.M.: MegaFace: a million faces for recognition at scale. arXiv preprint arXiv:1505.02108 (2015)
  • 24. Ng, H.W., Winkler, S.: A data-driven approach to cleaning large face datasets. In: ICIP, pp. 343–347. IEEE (2014)
  • 25. Parkhi, O.M., Vedaldi, A., Zisserman, A.: Deep face recognition. In: BMVC, vol. 1, no. 3, p. 6 (2015)
  • 26. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015)
  • 27. Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: a unified embedding for face recognition and clustering. In: CVPR, pp. 815–823 (2015)
  • 28. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
  • 29. Sun, Y., Chen, Y., Wang, X., Tang, X.: Deep learning face representation by joint identification-verification. In: NIPS, pp. 1988–1996 (2014)
  • 30. Sun, Y., Wang, X., Tang, X.: Hybrid deep learning for face verification. In: ICCV, pp. 1489–1496 (2013)
  • 31. Sun, Y., Wang, X., Tang, X.: Deep learning face representation from predicting 10,000 classes. In: CVPR, pp. 1891–1898 (2014)
  • 32. Sun, Y., Wang, X., Tang, X.: Deeply learned face representations are sparse, selective, and robust. In: CVPR, pp. 2892–2900 (2015)
  • 33. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: CVPR, pp. 1–9 (2015)
  • 34. Taigman, Y., Yang, M., Ranzato, M., Wolf, L.: DeepFace: closing the gap to human-level performance in face verification. In: CVPR, pp. 1701–1708 (2014)
  • 35. Thomee, B., Shamma, D.A., Friedland, G., Elizalde, B., Ni, K., Poland, D., Borth, D., Li, L.J.: The new data and new challenges in multimedia research. arXiv preprint arXiv:1503.01817 (2015)
  • 36. Wang, L., Qiao, Y., Tang, X.: Action recognition with trajectory-pooled deep-convolutional descriptors. In: CVPR, pp. 4305–4314 (2015)
  • 37. Wen, Y., Li, Z., Qiao, Y.: Latent factor guided convolutional neural networks for age-invariant face recognition. In: CVPR, pp. 4893–4901 (2016)
  • 38. Wolf, L., Hassner, T., Maoz, I.: Face recognition in unconstrained videos with matched background similarity. In: CVPR, pp. 529–534. IEEE (2011)
  • 39. Yi, D., Lei, Z., Liao, S., Li, S.Z.: Learning face representation from scratch. arXiv preprint arXiv:1411.7923 (2014)
  • 40. Zhang, K., Zhang, Z., Li, Z., Qiao, Y.: Joint face detection and alignment using multi-task cascaded convolutional networks. arXiv preprint arXiv:1604.02878 (2016)
  • 41. Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Object detectors emerge in deep scene CNNs. arXiv preprint arXiv:1412.6856 (2014)
  • 42. Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., Oliva, A.: Learning deep features for scene recognition using places database. In: NIPS, pp. 487–495 (2014)