Positive Unlabeled Learning with Class-prior Approximation

IJCAI, pp. 2014-2021, 2020.

DOI: https://doi.org/10.24963/ijcai.2020/279

Abstract:

Positive-unlabeled (PU) learning aims to train a binary classifier from a set of labeled positive samples and other unlabeled samples. Much research has been done on this special branch of weakly supervised classification problems. Since only part of the positive class is labeled, the classical PU model trains the classifier assuming …

Introduction
  • For traditional supervised classification problems, both positive and negative labels should be known before building a suitable binary classifier.
  • In practical applications, negative labels are difficult to obtain, as in datasets where only the relevant class is known and the negative class is very large and dense [Kiryo et al., 2017]
  • In such cases, without the assistance of negative labels, positive and unlabeled (PU) learning, which tries to learn a binary classifier using only a subset of labeled positive samples together with unlabeled samples, is applied in practice.
  • Since only part of the users' preferences are provided, PU learning methods can push texts that match users' needs while filtering out irrelevant information
Highlights
  • For traditional supervised classification problems, both positive and negative labels must be known before building a suitable binary classifier
  • Without the assistance of negative labels, positive and unlabeled (PU) learning, which tries to learn a binary classifier using only a subset of labeled positive samples together with unlabeled samples, is applied in practice
  • This paper proposes a novel positive and unlabeled learning method with class-prior approximation (CAPU)
  • Different from previous analyses, we convert positive and unlabeled learning into a direct class-prior estimation and classification problem by introducing mixture proportion estimation into the loss-minimization objective
  • A gradient thresholding algorithm is utilized, based on rigorous theoretical analysis (see the sketch after this list)
  • Experimental results on both synthetic and real-world datasets clearly show that the Class-prior Approximation (CAPU) model for PU learning is superior to other state-of-the-art methods
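
For concreteness, the following is a minimal sketch of the standard unbiased PU risk [Du Plessis et al., 2014; 2015] on which such loss-minimization formulations build. Treating π as a fixed input is an assumption of this sketch; CAPU instead estimates π jointly with the classifier, and the function and variable names below are illustrative, not the paper's implementation.

    import numpy as np

    def unbiased_pu_risk(scores_pos, scores_unl, pi, loss):
        # Unbiased PU risk [Du Plessis et al., 2014; 2015]:
        #   R(f) = pi * E_p[l(f(x), +1)] + E_u[l(f(x), -1)] - pi * E_p[l(f(x), -1)],
        # which follows from (1 - pi) * E_n[l(f, -1)] = E_u[l(f, -1)] - pi * E_p[l(f, -1)].
        # scores_pos / scores_unl are classifier outputs f(x) on the labeled
        # positives and on the unlabeled set; pi is the class prior P(y = +1).
        risk_pos = np.mean(loss(scores_pos, +1))
        risk_pos_as_neg = np.mean(loss(scores_pos, -1))
        risk_unl_as_neg = np.mean(loss(scores_unl, -1))
        return pi * risk_pos + risk_unl_as_neg - pi * risk_pos_as_neg

    # Example surrogate: the logistic loss, vectorized over scores.
    logistic = lambda scores, y: np.log1p(np.exp(-y * scores))

Minimizing such a risk jointly over the classifier and an estimated prior π̂ gives the flavor of the joint class-prior estimation and classification problem described above.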
Methods
  • The proposed CAPU method is compared with representative related methods: EN [Elkan and Noto, 2008], PE [Du Plessis and Sugiyama, 2014], KM [Ramaswamy et al., 2016], and TIcE [Bekker and Davis, 2018].

    The aforementioned methods have made great efforts in estimating the true class prior.
  • Because TIcE does not provide a classification process, the authors report its accuracies by plugging the estimated prior into a benchmark PU learning method, the unbiased PU (UPU) classifier [Du Plessis et al., 2015] (see the sketch after this list).
  • UPU [Du Plessis et al., 2015] and an improved method, USMO [Sansone et al., 2018], are tested with the true value of π.
  • Code for the compared methods is available online: KM (…/∼cscott/code/kernel_MPE.zip), TIcE (…kuleuven.be/software/tice), UPU (github.com/kiryor/nnPUlearning), and USMO (github.com/emsansone/USMO).
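
As a rough illustration of this comparison protocol, the sketch below first obtains a class-prior estimate from a mixture-proportion estimator and then hands it to a PU classifier before measuring accuracy. estimate_class_prior and train_pu_classifier are hypothetical stand-ins passed in as callables, not the actual APIs of the KM/TIcE or UPU code linked above.

    import numpy as np

    def evaluate_mpe_method(estimate_class_prior, train_pu_classifier,
                            x_pos, x_unl, x_test, y_test):
        # Step 1: estimate the class prior from positives and unlabeled data
        # (e.g. with KM or TIcE; hypothetical interface).
        pi_hat = estimate_class_prior(x_pos, x_unl)
        # Step 2: train a benchmark PU classifier (e.g. UPU) using pi_hat.
        clf = train_pu_classifier(x_pos, x_unl, pi_hat)
        # Step 3: report test accuracy alongside the prior estimate, so both
        # estimation quality and classification quality can be compared.
        accuracy = float(np.mean(clf.predict(x_test) == y_test))
        return pi_hat, accuracy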
Results
  • To create PU samples from each dataset, the authors derived three different settings of positive and unlabeled samples, in which the class prior of the unlabeled set varies over {0.3, 0.5, 0.7}.
  • This procedure was repeated 20 times for each setting on each dataset, and the evaluation metrics used for performance comparison are the mean class-prior estimate π̂, the mean absolute error |π̂ − π|, and the F-score [Fang et al., 2020b] over the 20 trials (see the sketch after this list).
  • The results of all methods on the synthetic dataset are reported in Table 1
  • On this two-cluster Gaussian-distributed dataset, KM gives the closest class-prior estimate when π = 0.3 and 0.5, and the proposed CAPU method generally ranks second.
  • Under the classification evaluation, the proposed method achieves the best accuracy among all compared methods
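
The following is a minimal sketch of this evaluation protocol, assuming a fully labeled source dataset from which PU splits are drawn; the exact sampling scheme, sample sizes, and F-score computation used in the paper may differ.

    import numpy as np

    def make_pu_split(x, y, n_labeled_pos, n_unlabeled, pi, rng):
        # Draw disjoint labeled positives and an unlabeled set whose
        # positive fraction matches the target class prior pi.
        pos = rng.permutation(np.flatnonzero(y == 1))
        neg = rng.permutation(np.flatnonzero(y == -1))
        n_unl_pos = int(round(pi * n_unlabeled))
        x_labeled = x[pos[:n_labeled_pos]]
        u_idx = np.concatenate([pos[n_labeled_pos:n_labeled_pos + n_unl_pos],
                                neg[:n_unlabeled - n_unl_pos]])
        return x_labeled, x[rng.permutation(u_idx)]

    def prior_metrics(pi_hats, pi_true):
        # Mean class-prior estimate and mean absolute error over repeated
        # trials (20 per setting in the paper).
        pi_hats = np.asarray(pi_hats, dtype=float)
        return pi_hats.mean(), np.abs(pi_hats - pi_true).mean()

    # Usage: rng = np.random.default_rng(seed); repeat make_pu_split 20 times
    # for each pi in (0.3, 0.5, 0.7) and each dataset, then aggregate.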
Conclusion
  • The authors found that when estimating the class prior, the proposed method performs comparably to other methods.
  • Research based on class-prior estimation has been proposed, such as the KM and TIcE methods
  • These methods provide additional information for later analysis, but they only provide an estimate of the positive proportion.
  • There are also methods based on class-prior correction and classification, such as the benchmark EN and PE methods, which can estimate the class prior as well as classify the data
  • The authors found that these methods have unstable performance with large estimation errors, which leads to poor classification results. This paper therefore proposed a novel PU learning method with class-prior approximation.
  • Experimental results on both synthetic and real-world datasets clearly show that CAPU is superior to other state-of-the-art methods
Tables
  • Table 1: Comparative results of the various methods on the synthetic dataset when the class prior is set to 30%, 50%, and 70% of the unlabeled data. The estimates/the absolute class-prior errors and the F-scores (%) over 20 trials are reported. The best record under each π is marked in bold. “ ” indicates that the proposed method is significantly better than the corresponding method via a paired t-test
  • Table 2: Comparative results of the various methods on real-world datasets when the class prior is set to 30%, 50%, and 70% of the unlabeled data. The estimates/the absolute class-prior errors and the F-scores (%) over 20 trials are reported. The best record under each π is marked in bold. “ ” indicates that the proposed method is significantly better than the corresponding method via a paired t-test
Funding
  • This work was supported in part by the National Natural Science Foundation of China under Grants 61822113 and 62041105, the Science and Technology Major Project of Hubei Province (Next-Generation AI Technologies) under Grant 2019AEA170, the Natural Science Foundation of Hubei Province under Grant 2018CFA050, and the Fundamental Research Funds for the Central Universities under Grants 413000092 and 413000082
References
  • [Bekker and Davis, 2018] Jessa Bekker and Jesse Davis. Estimating the class prior in positive and unlabeled data through decision tree induction. In AAAI, 2018.
  • [Christoffel et al., 2016] Marthinus Christoffel, Gang Niu, and Masashi Sugiyama. Class-prior estimation for learning from positive and unlabeled data. In ACML, pages 221–236, 2016.
  • [Claesen et al., 2015] Marc Claesen, Frank De Smet, Johan A. K. Suykens, and Bart De Moor. A robust ensemble approach to learn from positive and unlabeled data using SVM base models. Neurocomputing, 160:73–84, 2015.
  • [Du Plessis and Sugiyama, 2014] Marthinus Christoffel Du Plessis and Masashi Sugiyama. Class prior estimation from positive and unlabeled data. IEICE Transactions on Information and Systems, 97(5):1358–1362, 2014.
  • [Du Plessis et al., 2014] Marthinus C. Du Plessis, Gang Niu, and Masashi Sugiyama. Analysis of learning from positive and unlabeled data. In NIPS, pages 703–711, 2014.
  • [Du Plessis et al., 2015] Marthinus Du Plessis, Gang Niu, and Masashi Sugiyama. Convex formulation for learning from positive and unlabeled data. In ICML, pages 1386–1394, 2015.
  • [Elkan and Noto, 2008] Charles Elkan and Keith Noto. Learning classifiers from only positive and unlabeled data. In Proc. SIGKDD, pages 213–220, 2008.
  • [Fang et al., 2020a] Yixiang Fang, Xin Huang, Lu Qin, Ying Zhang, Wenjie Zhang, Reynold Cheng, and Xuemin Lin. A survey of community search over big graphs. The VLDB Journal, 29(1):353–392, 2020.
  • [Fang et al., 2020b] Yixiang Fang, Yixing Yang, Wenjie Zhang, Xuemin Lin, and Xin Cao. Effective and efficient community search over large heterogeneous information networks. VLDB Endowment, 13(6):854–867, 2020.
  • [Gong et al., 2019a] Chen Gong, Tongliang Liu, Jian Yang, and Dacheng Tao. Large-margin label-calibrated support vector machines for positive and unlabeled learning. IEEE T-NNLS, 2019.
  • [Gong et al., 2019b] Chen Gong, Hong Shi, Tongliang Liu, Chuang Zhang, Jian Yang, and Dacheng Tao. Loss decomposition and centroid estimation for positive and unlabeled learning. IEEE T-PAMI, 2019.
  • [Kiryo et al., 2017] Ryuichi Kiryo, Gang Niu, Marthinus C. du Plessis, and Masashi Sugiyama. Positive-unlabeled learning with non-negative risk estimator. In NIPS, pages 1675–1685, 2017.
  • [Lee and Liu, 2003] Wee Sun Lee and Bing Liu. Learning with positive and unlabeled examples using weighted logistic regression. In ICML, volume 3, pages 448–455, 2003.
  • [Li and Liu, 2003] Xiaoli Li and Bing Liu. Learning to classify texts using positive and unlabeled data. In IJCAI, volume 3, pages 587–592, 2003.
  • [Li et al., 2010] Wenkai Li, Qinghua Guo, and Charles Elkan. A positive and unlabeled learning algorithm for one-class classification of remote-sensing data. IEEE TGRS, 49(2):717–725, 2010.
  • [Liu et al., 2002] Bing Liu, Wee Sun Lee, Philip S. Yu, and Xiaoli Li. Partially supervised classification of text documents. In ICML, volume 2, pages 387–394, 2002.
  • [Liu et al., 2003] Bing Liu, Yang Dai, Xiaoli Li, Wee Sun Lee, and Philip S. Yu. Building text classifiers using positive and unlabeled examples. In ICDM, volume 3, pages 179–188, 2003.
  • [Ramaswamy et al., 2016] Harish Ramaswamy, Clayton Scott, and Ambuj Tewari. Mixture proportion estimation via kernel embeddings of distributions. In ICML, pages 2052–2060, 2016.
  • [Sansone et al., 2018] Emanuele Sansone, Francesco G. B. De Natale, and Zhi-Hua Zhou. Efficient training for positive unlabeled learning. IEEE T-PAMI, 2018.
  • [Smola et al., 2007] Alex Smola, Arthur Gretton, Le Song, and Bernhard Schölkopf. A Hilbert space embedding for distributions. In International Conference on Algorithmic Learning Theory, pages 13–31, 2007.
  • [Wang et al., 2018] Zheng Wang, Xiang Bai, Mang Ye, and Shin'ichi Satoh. Incremental deep hidden attribute learning. In ACM Multimedia, pages 72–80, 2018.
  • [Xu et al., 2017] Yixing Xu, Chang Xu, Chao Xu, and Dacheng Tao. Multi-positive and unlabeled learning. In IJCAI, pages 3182–3188, 2017.
  • [Yang et al., 2017] Pengyi Yang, Wei Liu, and Jean Yang. Positive unlabeled learning via wrapper-based adaptive sampling. In IJCAI, pages 3273–3279, 2017.
  • [Ye et al., 2018] Mang Ye, Zheng Wang, Xiangyuan Lan, and Pong C. Yuen. Visible thermal person re-identification via dual-constrained top-ranking. In IJCAI, volume 1, page 2, 2018.
  • [Yu et al., 2002] Hwanjo Yu, Jiawei Han, and Kevin Chen-Chuan Chang. PEBL: positive example based learning for web page classification using SVM. In Proc. SIGKDD, pages 239–248, 2002.
  • [Yu et al., 2018] Xiyu Yu, Tongliang Liu, Mingming Gong, Kayhan Batmanghelich, and Dacheng Tao. An efficient and provable approach for mixture proportion estimation using linear independence assumption. In CVPR, pages 4480–4489, 2018.
  • [Zhang et al., 2019] Chuang Zhang, Dexin Ren, Tongliang Liu, Jian Yang, and Chen Gong. Positive and unlabeled learning with label disambiguation. In IJCAI, pages 4250–4256, 2019.