Fewer is More: A Deep Graph Metric Learning Perspective Using Fewer Proxies

NeurIPS 2020 (2020)


Abstract

Deep metric learning plays a key role in various machine learning tasks. Most of the previous works have been confined to sampling from a mini-batch, which cannot precisely characterize the global geometry of the embedding space. Although researchers have developed proxy- and classification-based methods to tackle the sampling issue, th…
Introduction
  • Deep metric learning (DML) has been extensively studied in the past decade due to its broad applications, e.g., zero-shot classification [37, 41, 36], image retrieval [35, 3], person re-identification [7, 48], and face recognition [38].
  • An embedding space with such a desired property is typically learned with metric losses, such as the contrastive loss [19, 38] and the triplet loss [8] (a minimal sketch of the latter follows this list).
  • These losses rely on pairs or triplets constructed from samples within a mini-batch; they empirically suffer from the sampling issue [7, 23], since the number of candidate pairs or triplets grows polynomially with the number of training examples.
  • Hard-mining strategies still select hard samples from only a subset of the whole training set, and thus fail to characterize the global geometry of the embedding space precisely.
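For reference, a minimal PyTorch-style sketch of the triplet loss mentioned above; the function name and the margin value are illustrative assumptions, not taken from the paper:

    import torch.nn.functional as F

    def triplet_loss(anchor, positive, negative, margin=0.2):
        # Standard triplet loss over a batch of embedding vectors: push the
        # anchor-negative distance at least `margin` beyond the anchor-positive one.
        d_ap = (anchor - positive).pow(2).sum(dim=1)  # squared anchor-positive distance
        d_an = (anchor - negative).pow(2).sum(dim=1)  # squared anchor-negative distance
        return F.relu(d_ap - d_an + margin).mean()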
Highlights
  • Deep metric learning (DML) has been extensively studied in the past decade due to its broad applications, e.g., zero-shot classification [37, 41, 36], image retrieval [35, 3], person re-identification [7, 48], and face recognition [38].
  • We adopt the K-means clustering algorithm to cluster instances, and the clustering quality is reported as Normalized Mutual Information (NMI).
  • ProxyGML can be regarded as a DML loss; all proposed modules involving proxies are removed entirely at test time.
  • We proposed a novel Proxy-based deep Graph Metric Learning (ProxyGML) approach from the perspective of graph classification, which offers new insight into deep metric learning.
  • The proposed reverse label propagation algorithm goes beyond the setting of semi-supervised learning.
  • Broader impact: who may benefit from this research? In this paper we propose a new pipeline for deep metric learning.
Methods
  • Table 2 compares ProxyGML at embedding dimensions 64, 384, and 512 against baselines including ProxyNCA64 [23], HDC384 [45], HTL512 [5], HDML512 [46], and MS512 [35], where superscripts denote embedding dimension and BN denotes Inception with batch normalization; see Table 2 for the full NMI and Recall@n numbers.
Results
  • Evaluation Metrics

    Following the standard protocol [30, 25], the authors calculate Recall@n on the image retrieval task.
  • The authors adopt the K-means clustering algorithm to cluster instances, and the clustering quality is reported as Normalized Mutual Information (NMI); both metrics are sketched after this list.
  • Both Recall@n and NMI are measured on the test set of each dataset in all experiments.
  • Note that each class of the Stanford Online Products dataset has only about 5 images on average, so the authors set N = 1 when determining k on this dataset, omit the regularizer, and increase the initial learning rate for trainable proxies from 3e−2 to 3e−1.
  • ProxyGML can be regarded as a DML loss; all proposed modules involving proxies are removed entirely at test time.
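A minimal sketch of these two evaluation metrics: Recall@n over cosine similarity and K-means clustering scored with NMI. The scikit-learn-based implementation and all names are illustrative choices, not the authors' code; labels are assumed to be a NumPy integer array:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import normalized_mutual_info_score

    def recall_at_n(embeddings, labels, ns=(1, 2, 4, 8)):
        # Recall@n: fraction of queries whose n nearest neighbors
        # (excluding the query itself) contain at least one same-class item.
        sim = embeddings @ embeddings.T        # cosine similarity for L2-normalized rows
        np.fill_diagonal(sim, -np.inf)         # never retrieve the query itself
        ranked = np.argsort(-sim, axis=1)      # neighbor indices, most similar first
        return {n: np.mean([(labels[ranked[i, :n]] == labels[i]).any()
                            for i in range(len(labels))]) for n in ns}

    def clustering_nmi(embeddings, labels):
        # Cluster test embeddings with K-means (K = number of classes), score with NMI.
        pred = KMeans(n_clusters=len(np.unique(labels))).fit_predict(embeddings)
        return normalized_mutual_info_score(labels, pred)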
Conclusion
  • The authors proposed a novel Proxy-based deep Graph Metric Learning (ProxyGML) approach from the perspective of graph classification, which offers new insight into deep metric learning.
  • By adaptively selecting the most informative proxies for different samples, ProxyGML is able to efficiently capture both global and local similarity relationships among the raw samples.
  • The proposed reverse label propagation algorithm goes beyond the setting of semi-supervised learning (classic label propagation is sketched after this list for reference).
  • It allows the neighbor relationships to be adjusted with the help of ground-truth labels, so that a discriminative metric space can be learned flexibly.
  • The experimental results on the CUB-200-2011, Cars196, and Stanford Online Products benchmarks demonstrate the superiority of ProxyGML over state-of-the-art methods.
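For intuition, a minimal sketch of the classic closed-form label propagation of Zhou et al. [47], on which the paper's reverse variant builds; this is the textbook algorithm, not the authors' exact formulation:

    import numpy as np

    def label_propagation(S, Y, alpha=0.99):
        # Closed-form label propagation (Zhou et al. [47]): F* = (I - alpha*S)^(-1) Y,
        # where S is a normalized affinity matrix and Y holds one-hot seed labels.
        # ProxyGML runs propagation "in reverse": ground-truth labels are used to
        # adjust the learned neighbor graph rather than to predict unknown labels.
        n = S.shape[0]
        return np.linalg.solve(np.eye(n) - alpha * S, Y)  # soft class scores per node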
Summary
  • Objectives:

    Given a labeled training set with C classes, the goal is to fine-tune a deep neural network to yield a more discriminative feature embedding (a generic setup is sketched below).
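A minimal sketch of this generic fine-tuning setup: a pretrained backbone with a linear embedding head and L2-normalized output. The ResNet-50 backbone and 512-d head here are illustrative assumptions; the paper itself uses Inception with batch normalization. Consistent with the highlight above, only this embedding network is needed at test time, with nothing proxy-related attached:

    import torch.nn as nn
    import torch.nn.functional as F
    import torchvision

    class EmbeddingNet(nn.Module):
        # Pretrained backbone + linear embedding head; embeddings are L2-normalized
        # so cosine similarity can be used for retrieval and clustering.
        def __init__(self, dim=512):
            super().__init__()
            backbone = torchvision.models.resnet50(weights="DEFAULT")
            backbone.fc = nn.Identity()        # expose the 2048-d pooled features
            self.backbone = backbone
            self.embed = nn.Linear(2048, dim)  # embedding head, fine-tuned end to end

        def forward(self, x):
            return F.normalize(self.embed(self.backbone(x)), dim=1)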
Tables
  • Table 1: Ablation study of the three proposed modules on Cars196; columns are #, Spos, M, Lp, NMI, and R@1.
  • Table 2: Comparison with the state-of-the-art methods. Clustering and retrieval performance are measured by NMI (%) and Recall@n (%), respectively. Superscripts denote embedding dimension. "–" means the result is not available in the original paper. Backbone networks are abbreviated: BN denotes Inception with batch normalization [10], G denotes GoogleNet [31].
  • Table3: Comparison of iteration time (training time per iteration), convergence time (training time till convergence), and maximum GPU memory consumption on the Cars196 dataset
  • Table4: Comparison with Proxy-Anchor on the Cars196 dataset. The performance of image retrieval is measured by Recall@n (%)
Related work
  • Distance-Based Deep Metric Learning. Distance-based DML directly optimizes sample margins with conventional metric losses (e.g., contrastive/triplet losses), suffering from the sampling issue [7, 23] and depending heavily on informative sample pairs for fast convergence. To seek informative pairs, Chopra et al. [2] introduced a contrastive loss that discards negative pairs whose similarities are smaller than a given threshold. A hard sample mining strategy was also proposed to find the most informative negative examples via an improved triplet loss [7]. N-Pair loss [30] and lifted structure loss [25] introduce new weighting schemes by designing a smooth weighting function to obtain more informative pairs. Alternatively, manifold proxy loss [1] is essentially an extension of N-Pair loss using proxies, and improves performance by adopting a manifold-aware distance metric with heavy backbone ensembles. Besides, ProxyNCA [23] generates a set of proxies and optimizes the distances between each raw data sample and the full set of proxies, avoiding the sampling issue (a minimal sketch follows this paragraph). Moreover, [35] introduces a multi-similarity (MS) loss with a general pair weighting strategy, which casts the sampling issue into a unified view of pair weighting by gradient analysis. Unlike the above methods, in this paper, we propose to leverage an easy-to-optimize graph-based classification loss to adjust the similarity relationships between each raw sample and fewer informative proxies.
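To make the proxy idea concrete, a minimal sketch of a ProxyNCA-style loss [23]. This softmax form is a common simplification rather than the reference implementation (the original NCA formulation excludes the positive proxy from the denominator, and scaling details are omitted), and all names are illustrative:

    import torch
    import torch.nn.functional as F

    def proxy_nca_loss(x, proxies, y):
        # x: (B, D) embeddings; proxies: (C, D), one trainable proxy per class;
        # y: (B,) integer class labels. Each sample is attracted to its class
        # proxy and repelled from all others, so no pair/triplet sampling is needed.
        x = F.normalize(x, dim=1)
        p = F.normalize(proxies, dim=1)
        logits = -torch.cdist(x, p) ** 2   # smaller distance -> larger logit
        return F.cross_entropy(logits, y)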
Funding
  • Our work was supported in part by the National Natural Science Foundation of China under Grant 62071361 and the National Key R&D Program of China under Grant 2017YFE0104100.
Study subjects and analysis
species: 200
We follow the conventional protocol [25, 40, 26] to split them into training and test parts. CUB-200-2011 [32] covers 200 species of birds with 11,788 images, where the first 100 species (5,864 images) are used for training and the remaining 100 species (5,924 images) for testing. Cars196 [14] is composed of 16,185 car images from 196 classes.

benchmark datasets: 3
For fair comparison, we report the performance of ProxyGML with embedding dimensions in {64, 384, 512}. As shown in Table 2, ProxyGML generally outperforms the state-of-the-art methods on the three benchmark datasets. Notably, ProxyGML does not consistently outperform the most competitive baselines on Stanford Online Products under all metrics.

References
  • [1] Nicolas Aziere and Sinisa Todorovic. Ensemble deep manifold similarity learning using hard proxies. In CVPR, pages 7299–7307, 2019.
  • [2] Sumit Chopra, Raia Hadsell, Yann LeCun, et al. Learning a similarity metric discriminatively, with application to face verification. In CVPR, pages 539–546, 2005.
  • [3] Cheng Deng, Xinxun Xu, Hao Wang, Muli Yang, and Dacheng Tao. Progressive cross-modal semantic network for zero-shot sketch-based image retrieval. IEEE Transactions on Image Processing, 29:8892–8902, 2020.
  • [4] Cheng Deng, Xu Yang, Feiping Nie, and Dapeng Tao. Saliency detection via a multiple self-weighted graph-based manifold ranking. IEEE Transactions on Multimedia, 22(4):885–896, 2019.
  • [5] Weifeng Ge. Deep metric learning with hierarchical triplet loss. In ECCV, pages 269–285, 2018.
  • [6] Chen Gong, Dacheng Tao, Wei Liu, Liu Liu, and Jie Yang. Label propagation via teaching-to-learn and learning-to-teach. IEEE Transactions on Neural Networks and Learning Systems, 28(6):1452–1465, 2017.
  • [7] Alexander Hermans, Lucas Beyer, and Bastian Leibe. In defense of the triplet loss for person re-identification. arXiv preprint arXiv:1703.07737, 2017.
  • [8] Elad Hoffer and Nir Ailon. Deep metric learning using triplet network. In SIMBAD, pages 84–92, 2015.
  • [9] Steven C. H. Hoi, Wei Liu, and Shih-Fu Chang. Semi-supervised distance metric learning for collaborative image retrieval and clustering. ACM Transactions on Multimedia Computing, Communications and Applications, 6(3):Article 18, 2010.
  • [10] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167, 2015.
  • [11] Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, and Ondrej Chum. Label propagation for deep semi-supervised learning. In CVPR, pages 5070–5079, 2019.
  • [12] Sungyeon Kim, Dongwon Kim, Minsu Cho, and Suha Kwak. Proxy anchor loss for deep metric learning. In CVPR, pages 3238–3247, 2020.
  • [13] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • [14] Jonathan Krause, Michael Stark, Jia Deng, and Li Fei-Fei. 3D object representations for fine-grained categorization. In ICCV Workshops, pages 554–561, 2013.
  • [15] Qimai Li, Xiao-Ming Wu, Han Liu, Xiaotong Zhang, and Zhichao Guan. Label efficient semi-supervised learning via graph filtering. In CVPR, pages 9582–9591, 2019.
  • [16] Xiaocui Li, Hongzhi Yin, Ke Zhou, and Xiaofang Zhou. Semi-supervised clustering with deep metric learning and graph embedding. World Wide Web, 23(2):781–798, 2020.
  • [17] Wei Liu, Junfeng He, and Shih-Fu Chang. Large graph construction for scalable semi-supervised learning. In ICML, pages 679–686, 2010.
  • [18] Wei Liu, Shiqian Ma, Dacheng Tao, Jianzhuang Liu, and Peng Liu. Semi-supervised sparse metric learning using alternating linearization optimization. In KDD, pages 1139–1148, 2010.
  • [19] Wei Liu, Cun Mu, Rongrong Ji, Shiqian Ma, John R. Smith, and Shih-Fu Chang. Low-rank similarity metric learning in high dimensions. In AAAI, pages 2792–2799, 2015.
  • [20] Wei Liu, Jun Wang, and Shih-Fu Chang. Robust and scalable graph-based semisupervised learning. Proceedings of the IEEE, 100(9):2624–2638, 2012.
  • [21] Weiyang Liu, Yandong Wen, Zhiding Yu, and Meng Yang. Large-margin softmax loss for convolutional neural networks. In ICML, volume 2, page 7, 2016.
  • [22] Yanbin Liu, Juho Lee, Minseop Park, Saehoon Kim, Eunho Yang, Sung Ju Hwang, and Yi Yang. Learning to propagate labels: Transductive propagation network for few-shot learning. In ICLR, 2019.
  • [23] Yair Movshovitz-Attias, Alexander Toshev, Thomas K. Leung, Sergey Ioffe, and Saurabh Singh. No fuss distance metric learning using proxies. In ICCV, pages 360–368, 2017.
  • [24] Hyun Oh Song, Stefanie Jegelka, Vivek Rathod, and Kevin Murphy. Deep metric learning via facility location. In CVPR, pages 5382–5390, 2017.
  • [25] Hyun Oh Song, Yu Xiang, Stefanie Jegelka, and Silvio Savarese. Deep metric learning via lifted structured feature embedding. In CVPR, pages 4004–4012, 2016.
  • [26] Qi Qian, Lei Shang, Baigui Sun, Juhua Hu, Hao Li, and Rong Jin. SoftTriple loss: Deep metric learning without triplet sampling. In ICCV, pages 6450–6458, 2019.
  • [27] Qi Qian, Jiasheng Tang, Hao Li, Shenghuo Zhu, and Rong Jin. Large-scale distance metric learning with uncertainty. In CVPR, pages 8542–8550, 2018.
  • [28] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, et al. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3):211–252, 2015.
  • [29] Florian Schroff, Dmitry Kalenichenko, and James Philbin. FaceNet: A unified embedding for face recognition and clustering. In CVPR, pages 815–823, 2015.
  • [30] Kihyuk Sohn. Improved deep metric learning with multi-class N-pair loss objective. In NeurIPS, pages 1857–1865, 2016.
  • [31] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In CVPR, pages 1–9, 2015.
  • [32] Catherine Wah, Steve Branson, Peter Welinder, Pietro Perona, and Serge Belongie. The Caltech-UCSD Birds-200-2011 dataset. Technical report, 2011.
  • [33] Feng Wang, Jian Cheng, Weiyang Liu, and Haijun Liu. Additive margin softmax for face verification. IEEE Signal Processing Letters, 25(7):926–930, 2018.
  • [34] Hao Wang, Yitong Wang, Zheng Zhou, Xing Ji, Dihong Gong, Jingchao Zhou, Zhifeng Li, and Wei Liu. CosFace: Large margin cosine loss for deep face recognition. In CVPR, pages 5265–5274, 2018.
  • [35] Xun Wang, Xintong Han, Weilin Huang, Dengke Dong, and Matthew R. Scott. Multi-similarity loss with general pair weighting for deep metric learning. In CVPR, pages 5022–5030, 2019.
  • [36] Kun Wei, Cheng Deng, and Xu Yang. Lifelong zero-shot learning. In IJCAI, pages 551–557, 2020.
  • [37] Kun Wei, Muli Yang, Hao Wang, Cheng Deng, and Xianglong Liu. Adversarial fine-grained composition learning for unseen attribute-object recognition. In CVPR, pages 3741–3749, 2019.
  • [38] Yandong Wen, Kaipeng Zhang, Zhifeng Li, and Yu Qiao. A discriminative feature learning approach for deep face recognition. In ECCV, pages 499–515, 2016.
  • [39] Chao-Yuan Wu, R. Manmatha, Alexander J. Smola, and Philipp Krahenbuhl. Sampling matters in deep embedding learning. In ICCV, pages 2840–2848, 2017.
  • [40] Xinyi Xu, Yanhua Yang, Cheng Deng, and Feng Zheng. Deep asymmetric metric learning via rich relationship mining. In CVPR, pages 4076–4085, 2019.
  • [41] Muli Yang, Cheng Deng, Junchi Yan, Xianglong Liu, and Dacheng Tao. Learning unseen concepts via hierarchical decomposition and composition. In CVPR, pages 10248–10256, 2020.
  • [42] Xu Yang, Cheng Deng, Tongliang Liu, and Dacheng Tao. Heterogeneous graph attention network for unsupervised multiple-target domain adaptation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020.
  • [43] Xu Yang, Cheng Deng, Xianglong Liu, and Feiping Nie. New l2,1-norm relaxation of multi-way graph cut for clustering. In AAAI, 2018.
  • [44] Zhilin Yang, William W. Cohen, and Ruslan Salakhutdinov. Revisiting semi-supervised learning with graph embeddings. arXiv preprint arXiv:1603.08861, 2016.
  • [45] Yuhui Yuan, Kuiyuan Yang, and Chao Zhang. Hard-aware deeply cascaded embedding. In ICCV, pages 814–823, 2017.
  • [46] Wenzhao Zheng, Zhaodong Chen, Jiwen Lu, and Jie Zhou. Hardness-aware deep metric learning. In CVPR, pages 72–81, 2019.
  • [47] Dengyong Zhou, Olivier Bousquet, Thomas N. Lal, Jason Weston, and Bernhard Schölkopf. Learning with local and global consistency. In NeurIPS, pages 321–328, 2004.
  • [48] Yuehua Zhu, Cheng Deng, Huanhuan Cao, and Hao Wang. Object and background disentanglement for unsupervised cross-domain person re-identification. Neurocomputing, 2020.