A Topological Filter for Learning with Label Noise

NeurIPS 2020

Abstract

Noisy labels can impair the performance of deep neural networks. To tackle this problem, in this paper, we propose a new method for filtering label noise. Unlike most existing methods relying on the posterior probability of a noisy classifier, we focus on the much richer spatial behavior of data in the latent representational space. By …

Introduction
  • Corrupted labels are ubiquitous in real world data, and can severely impair the performance of deep neural networks with strong memorization ability [30, 12, 51].
  • The major challenge is to ensure that the data selection procedure is (1) careful enough to not accumulate errors; and (2) aggressive enough to collect sufficient clean data to train a strong model
  • Existing methods under this category [27, 21, 16, 43, 31] typically select clean data based on the prediction of the noisy classifier.
  • Most of these heuristics do not have a theoretical foundation and are not guaranteed to generalize to unseen datasets or noise patterns
Highlights
  • Corrupted labels are ubiquitous in real world data, and can severely impair the performance of deep neural networks with strong memorization ability [30, 12, 51]
  • We propose a novel method named TopoFilter for learning with label noise
  • Our empirical results on different datasets demonstrate the advantages of TopoFilter in improving the robustness of deep models to label noise
  • We note that this paper only focuses on the connected components of the data in the latent representational space
Methods
  • The authors' algorithm jointly trains a neural network and collects clean data. At each epoch, clean data are collected based on their spatial topology in the latent space of the current network (a minimal sketch of this step follows the list below).
  • The compared baselines are: (1) Standard, which is the standard deep network trained on noisy datasets; (2) Forgetting [2]; (3) Bootstrap [35]; (4) Forward Correction [33]; (5) Decoupling [27]; (6) MentorNet [21]; (7) Co-teaching [16]; (8) Co-teaching+ [50]; (9) IterNLD [43]; (10) RoG [22]; (11) PENCIL [49]; (12) GCE [52]; (13) SL [44]
  • These methods are from different research directions.
  • This is further confirmed in Fig. 3(b)-(e), where the collected data pool preserves high purity during training with its size approaching the limit steadily
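To make the collection step concrete, below is a minimal sketch of the idea described in the first bullet: for each noisy class, build a k-NN graph over the penultimate-layer features and keep the largest connected component as presumed-clean data. It is an illustration under simplifying assumptions, not the authors' released implementation; the function name, the choice of `n_neighbors`, and the omission of the paper's additional outlier-filtering step are placeholders.

```python
# Sketch only: per-class largest-connected-component filtering in feature space.
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import connected_components

def collect_clean_indices(features, noisy_labels, n_neighbors=5):
    """Return indices of samples kept as presumed-clean for this epoch."""
    clean_idx = []
    for c in np.unique(noisy_labels):
        idx_c = np.where(noisy_labels == c)[0]
        if len(idx_c) <= n_neighbors:              # too few points to build a k-NN graph
            clean_idx.extend(idx_c)
            continue
        # k-NN graph over this class's penultimate-layer features
        graph = kneighbors_graph(features[idx_c], n_neighbors,
                                 mode='connectivity', include_self=False)
        graph = graph.maximum(graph.T)             # symmetrize -> undirected graph
        _, comp = connected_components(graph, directed=False)
        largest = np.argmax(np.bincount(comp))     # largest connected component
        clean_idx.extend(idx_c[comp == largest])
    return np.array(sorted(clean_idx))

# Per-epoch usage (pseudo): extract features with the current network,
# call collect_clean_indices, then train only on the returned subset.
```

Points that fall outside the largest component are simply excluded from that epoch's training set and may re-enter later as the learned representation improves, since the clean pool is re-collected at every epoch.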
Results
  • Although the posterior probabilities employed by several prior works are closely related to the penultimate-layer features used in the method, they intrinsically undergo a dimension reduction process and may lose some critical information (see the toy sketch after this list).
  • This would explain, to some degree, the superior performance of the method.
  • This is because the data in the connected components could still …
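To make the dimension-reduction point above concrete, the toy sketch below (hypothetical dimensions and random weights) shows that class posteriors are just a projection of the D-dimensional penultimate feature through the final linear layer down to C scores, so any information orthogonal to that projection is no longer recoverable from the posteriors.

```python
# Toy illustration of the dimension reduction from features to posteriors.
import numpy as np

rng = np.random.default_rng(0)
D, C = 512, 10                          # hypothetical feature dim and class count
feature = rng.normal(size=(1, D))       # penultimate-layer feature of one sample
W, b = rng.normal(size=(C, D)), np.zeros(C)

logits = feature @ W.T + b              # (1, C): 512 numbers compressed to 10
posterior = np.exp(logits) / np.exp(logits).sum()
print(feature.shape, "->", posterior.shape)   # (1, 512) -> (1, 10)
```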
Conclusion
  • The authors propose a novel method named TopoFilter for learning with label noise.
  • The authors' method leverages the topological property of the data in feature space, and jointly learns the data representation and collects the clean data during training.
  • The authors' empirical results on different datasets demonstrate the advantages of TopoFilter in improving the robustness of deep models to label noise.
  • The authors note that this paper only focuses on the connected components of the data in the latent representational space.
  • The theory has been shown to provide robust solutions to learning problems such as weakly supervised learning [19], clustering [32, 8], and graph neural networks [53]
Tables
  • Table1: Test accuracies (%) on CIFAR-10 and CIFAR-100 under different noise types and fractions. The average accuracies and standard deviations over 5 trials are reported. We perform an unpaired t-test (95% significance level) on the difference between the test accuracies, and observe that the improvement due to our method over state-of-the-art methods is statistically significant for all noise settings (a minimal sketch of this test follows the table list)
  • Table2: Classification accuracy (%) on Clothing1M test set
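For reference, the significance test described in Table 1 is an unpaired two-sample t-test on per-trial test accuracies; a minimal sketch is below. The accuracy numbers are made up for illustration only and are not results from the paper.

```python
# Sketch of an unpaired t-test on 5-trial accuracies (illustrative numbers).
from scipy import stats

topofilter = [89.4, 89.1, 89.8, 89.5, 89.2]   # hypothetical accuracies (%)
baseline   = [87.9, 88.3, 88.0, 88.1, 87.7]

t_stat, p_value = stats.ttest_ind(topofilter, baseline, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")  # p < 0.05 -> significant at the 95% level
```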
Related work
  • One representative class of methods for handling label noise aims to improve robustness by modeling the noise transition process [38, 33, 15, 18]. However, estimating the noise transition is non-trivial, and these methods generally require additional access to the true labels or depend on strong assumptions, which could be impractical. In contrast to these works, our method does not rely on noise modeling, and is thus more generic and flexible.

    A number of approaches have sought to develop noise-robust losses to help resist label corruption. One typical idea is to reduce the influence of noisy samples with carefully designed losses [35, 1, 52, 40, 44, 25, 14, 6] or regularization terms [20, 28, 23]. Closely related to this philosophy, other approaches focus on adaptively re-weighting the contributions of the noisy samples to the loss. The re-weighting functions could be pre-specified based on heuristics [5, 43] or learned automatically [21, 36, 37]. Our method is independent of the loss function, and can be combined with any of them (a minimal sketch of one such robust loss follows).
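As a concrete instance of such a noise-robust loss, here is a minimal sketch of the generalized cross entropy (GCE) loss of [52], L_q(p_y) = (1 - p_y^q)/q, which recovers cross entropy as q -> 0 and mean absolute error at q = 1; the tensor shapes and the value of q are illustrative.

```python
# Sketch of the GCE loss [52]; shapes: logits (N, C), targets (N,) integer labels.
import torch
import torch.nn.functional as F

def gce_loss(logits, targets, q=0.7):
    probs = F.softmax(logits, dim=1)                       # class posteriors
    p_y = probs.gather(1, targets.view(-1, 1)).squeeze(1)  # prob. of the given label
    return ((1.0 - p_y.pow(q)) / q).mean()

# usage: loss = gce_loss(model(x), noisy_labels)
```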
Funding
  • Zheng and Chen’s research was partially supported by NSF CCF-1855760 and IIS-1909038
  • Wu and Metaxas’s research was partially supported by NSF CCF-1733843, IIS-1703883, CNS-1747778, IIS-1763523, IIS-1849238-825536 and MURI-Z8424104-440149
  • Goswami’s research was partially supported by NSF CRII-1755791 and CCF-1910873
Reference
  • [1] Eric Arazo, Diego Ortego, Paul Albert, Noel E. O’Connor, and Kevin McGuinness. Unsupervised label noise modeling and loss correction. In ICML, 2019.
  • [2] Devansh Arpit, Stanislaw K. Jastrzebski, Nicolas Ballas, David Krueger, Emmanuel Bengio, Maxinder S. Kanwal, Tegan Maharaj, Asja Fischer, Aaron C. Courville, Yoshua Bengio, and Simon Lacoste-Julien. A closer look at memorization in deep networks. In ICML, pages 233–242, 2017.
  • [3] Dara Bahri, Heinrich Jiang, and Maya Gupta. Deep k-NN for noisy labels. arXiv preprint arXiv:2004.12289, 2020.
  • [4] Mikhail Belkin, Daniel J. Hsu, and Partha Mitra. Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate. In NeurIPS, 2018.
  • [5] Haw-Shiuan Chang, Erik Learned-Miller, and Andrew McCallum. Active bias: Training more accurate neural networks by emphasizing high variance samples. In NeurIPS, pages 1002–1012, 2017.
  • [6] Nontawat Charoenphakdee, Jongyeong Lee, and Masashi Sugiyama. On symmetric losses for learning from corrupted labels. In ICML, 2019.
  • [7] Kamalika Chaudhuri and Sanjoy Dasgupta. Rates of convergence for the cluster tree. In NeurIPS, pages 343–351, 2010.
  • [8] Frédéric Chazal, Leonidas J. Guibas, Steve Y. Oudot, and Primoz Skraba. Persistence-based clustering in Riemannian manifolds. Journal of the ACM (JACM), 60(6):1–38, 2013.
  • [9] Chao Chen, Xiuyan Ni, Qinxun Bai, and Yusu Wang. A topological regularizer for classifiers via persistent homology. In AISTATS, pages 2573–2582, 2019.
  • [10] Pengfei Chen, Benben Liao, Guangyong Chen, and Shengyu Zhang. Understanding and utilizing deep neural networks trained with noisy labels. In ICML, 2019.
  • [11] Herbert Edelsbrunner and John Harer. Computational Topology: An Introduction. American Mathematical Society, 2010.
  • [12] Benoît Frénay and Michel Verleysen. Classification in the presence of label noise: a survey. IEEE Trans. Neural Netw. Learning Syst., 25(5):845–869, 2014.
  • [13] Wei Gao, Bin-Bin Yang, and Zhi-Hua Zhou. On the resistance of nearest neighbor to random noisy labels. arXiv preprint arXiv:1607.07526, 2016.
  • [14] Aritra Ghosh, Himanshu Kumar, and P. S. Sastry. Robust loss functions under label noise for deep neural networks. In AAAI, 2017.
  • [15] Jacob Goldberger and Ehud Ben-Reuven. Training deep neural-networks using a noise adaptation layer. In ICLR, 2017.
  • [16] Bo Han, Quanming Yao, Xingrui Yu, Gang Niu, Miao Xu, Weihua Hu, Ivor W. Tsang, and Masashi Sugiyama. Co-teaching: Robust training of deep neural networks with extremely noisy labels. In NeurIPS, pages 8536–8546, 2018.
  • [17] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, pages 770–778, 2016.
  • [18] Dan Hendrycks, Mantas Mazeika, Duncan Wilson, and Kevin Gimpel. Using trusted data to train deep networks on labels corrupted by severe noise. In NeurIPS, pages 10477–10486, 2018.
  • [19] Christoph Hofer, Roland Kwitt, Marc Niethammer, and Mandar Dixit. Connectivity-optimized representation learning via persistent homology. In ICML, pages 2751–2760, 2019.
  • [20] W. Hu, Z. Li, and D. Yu. Simple and effective regularization methods for training on noisily labeled data with generalization guarantee. In ICLR, 2020.
  • [21] Lu Jiang, Zhengyuan Zhou, Thomas Leung, Li-Jia Li, and Li Fei-Fei. MentorNet: Learning data-driven curriculum for very deep neural networks on corrupted labels. In ICML, pages 2309–2318, 2018.
  • [22] Kimin Lee, Sukmin Yun, Kibok Lee, Honglak Lee, Bo Li, and Jinwoo Shin. Robust inference via generative classifiers for handling noisy labels. In ICML, 2019.
  • [23] Junnan Li, Yongkang Wong, Qi Zhao, and Mohan S. Kankanhalli. Learning to learn from noisy labeled data. In CVPR, pages 5051–5059, 2019.
  • [24] Yuncheng Li, Jianchao Yang, Yale Song, Liangliang Cao, Jiebo Luo, and Li-Jia Li. Learning from noisy labels with distillation. In ICCV, pages 1928–1936, 2017.
  • [25] Xingjun Ma, Yisen Wang, Michael E. Houle, Shuo Zhou, Sarah M. Erfani, Shu-Tao Xia, Sudanthi N. R. Wijewickrema, and James Bailey. Dimensionality-driven learning with noisy labels. In ICML, pages 3361–3370, 2018.
  • [26] Markus Maier, Matthias Hein, and Ulrike von Luxburg. Optimal construction of k-nearest-neighbor graphs for identifying noisy clusters. Theoretical Computer Science, 410(19):1749–1764, 2009.
  • [27] Eran Malach and Shai Shalev-Shwartz. Decoupling "when to update" from "how to update". In NeurIPS, pages 960–970, 2017.
  • [28] Aditya Krishna Menon, Ankit Singh Rawat, Sashank J. Reddi, and Sanjiv Kumar. Can gradient clipping mitigate label noise? In ICLR, 2020.
  • [29] Volodymyr Mnih and Geoffrey E. Hinton. Learning to label aerial images from noisy data. In ICML, 2012.
  • [30] David F. Nettleton, Albert Orriols-Puig, and Albert Fornells. A study of the effect of different types of noise on the precision of supervised learning techniques. Artificial Intelligence Review, 33(4):275–306, 2010.
  • [31] Duc Tam Nguyen, Chaithanya Kumar Mummadi, Thi Phuong Nhung Ngo, Thi Hoai Phuong Nguyen, Laura Beggel, and Thomas Brox. SELF: Learning to filter noisy labels with self-ensembling. In ICLR, 2020.
  • [32] Xiuyan Ni, Novi Quadrianto, Yusu Wang, and Chao Chen. Composing tree graphical models with persistent homology features for clustering mixed-type data. In ICML, pages 2622–2631, 2017.
  • [33] Giorgio Patrini, Alessandro Rozza, Aditya Krishna Menon, Richard Nock, and Lizhen Qu. Making deep neural networks robust to label noise: A loss correction approach. In CVPR, pages 2233–2241, 2017.
  • [34] Xingye Qiao, Jiexin Duan, and Guang Cheng. Rates of convergence for large-scale nearest neighbor classification. In NeurIPS, 2019.
  • [35] Scott Reed, Honglak Lee, Dragomir Anguelov, Christian Szegedy, Dumitru Erhan, and Andrew Rabinovich. Training deep neural networks on noisy labels with bootstrapping. In ICLR Workshop, 2014.
  • [36] Mengye Ren, Wenyuan Zeng, Bin Yang, and Raquel Urtasun. Learning to reweight examples for robust deep learning. In ICML, pages 4331–4340, 2018.
  • [37] Jun Shu, Qi Xie, Lixuan Yi, Qian Zhao, Sanping Zhou, Zongben Xu, and Deyu Meng. Meta-Weight-Net: Learning an explicit mapping for sample weighting. In NeurIPS, pages 1917–1928, 2019.
  • [38] Sainbayar Sukhbaatar, Joan Bruna, Manohar Paluri, Lubomir Bourdev, and Rob Fergus. Training convolutional networks with noisy labels. In ICLR Workshop, 2014.
  • [39] Daiki Tanaka, Daiki Ikami, Toshihiko Yamasaki, and Kiyoharu Aizawa. Joint optimization framework for learning with noisy labels. In CVPR, pages 5552–5560, 2018.
  • [40] Sunil Thulasidasan, Tanmoy Bhattacharya, Jeff Bilmes, Gopinath Chennupati, and Jamal Mohd-Yusof. Combating label noise in deep learning using abstention. In ICML, 2019.
  • [41] Arash Vahdat. Toward robustness against label noise in training deep discriminative neural networks. In NeurIPS, pages 5596–5605, 2017.
  • [42] Andreas Veit, Neil Alldrin, Gal Chechik, Ivan Krasin, Abhinav Gupta, and Serge J. Belongie. Learning from noisy large-scale datasets with minimal supervision. In CVPR, pages 6575–6583, 2017.
  • [43] Yisen Wang, Weiyang Liu, Xingjun Ma, James Bailey, Hongyuan Zha, Le Song, and Shu-Tao Xia. Iterative learning with open-set noisy labels. In CVPR, pages 8688–8696, 2018.
  • [44] Yisen Wang, Xingjun Ma, Zaiyi Chen, Yuan Luo, Jinfeng Yi, and James Bailey. Symmetric cross entropy for robust learning with noisy labels. In CVPR, pages 322–330, 2019.
  • [45] Xiang Wu, Ran He, Zhenan Sun, and Tieniu Tan. A light CNN for deep face representation with noisy labels. IEEE Trans. Information Forensics and Security, 13(11):2884–2896, 2018.
  • [46] Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, and Jianxiong Xiao. 3D ShapeNets: A deep representation for volumetric shapes. In CVPR, pages 1912–1920, 2015.
  • [47] Tong Xiao, Tian Xia, Yi Yang, Chang Huang, and Xiaogang Wang. Learning from massive noisy labeled data for image classification. In CVPR, pages 2691–2699, 2015.
  • [48] Yan Yan, Rómer Rosales, Glenn Fung, Ramanathan Subramanian, and Jennifer Dy. Learning from multiple annotators with varying expertise. Machine Learning, 95(3):291–327, 2014.
  • [49] Kun Yi and Jianxin Wu. Probabilistic end-to-end noise correction for learning with noisy labels. In CVPR, pages 7017–7025, 2019.
  • [50] Xingrui Yu, Bo Han, Jiangchao Yao, Gang Niu, Ivor W. Tsang, and Masashi Sugiyama. How does disagreement help generalization against label corruption? In ICML, 2019.
  • [51] Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals. Understanding deep learning requires rethinking generalization. In ICLR, 2017.
  • [52] Zhilu Zhang and Mert Sabuncu. Generalized cross entropy loss for training deep neural networks with noisy labels. In NeurIPS, pages 8778–8788, 2018.
  • [53] Qi Zhao, Ze Ye, Chao Chen, and Yusu Wang. Persistence enhanced graph neural network. In AISTATS, pages 2896–2906, 2020.
  • [54] Songzhu Zheng, Pengxiang Wu, Aman Goswami, Mayank Goswami, Dimitris Metaxas, and Chao Chen. Error-bounded correction of noisy labels. In ICML, 2020.
Author
Songzhu Zheng
Mayank Goswami