They are Not Completely Useless: Towards Recycling Transferable Unlabeled Data for Class-Mismatched Semi-Supervised Learning.

arXiv (2023)

Abstract
Semi-Supervised Learning (SSL) with mismatched classes addresses the setting in which the classes of interest in the limited labeled data are only a subset of the classes present in the massive unlabeled data. Classical SSL methods are therefore misled by the classes that appear only in the unlabeled data. To solve this problem, some recent methods divide the unlabeled data into useful in-distribution (ID) data and harmful out-of-distribution (OOD) data, and deliberately weaken the influence of the latter. Consequently, the potential value of the OOD data is largely overlooked. To remedy this defect, this paper proposes a "Transferable OOD data Recycling" (TOOR) method, which uses the ID data as well as the "recyclable" OOD data to enrich the information available for class-mismatched SSL. Specifically, TOOR treats the OOD data that are closely related to the ID data and the labeled data as recyclable, and employs adversarial domain adaptation to project them into the space of the ID data and the labeled data. In other words, the recyclability of an OOD datum is evaluated by its transferability, and the recyclable OOD data are transferred so that they become compatible with the distribution of the known classes of interest. As a result, TOOR extracts more information from the unlabeled data than existing methods and achieves improved performance, as demonstrated by experiments on typical benchmark datasets.
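The three-way treatment of unlabeled data described in the abstract (ID, recyclable OOD, remaining OOD) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the scores are computed or thresholded, so the function name `partition_unlabeled`, the two per-sample scores, and both thresholds are hypothetical. The adversarial domain adaptation step that would then be applied to the recyclable subset is not shown.

```python
from typing import List, Sequence, Tuple


def partition_unlabeled(
    id_scores: Sequence[float],
    transfer_scores: Sequence[float],
    id_threshold: float = 0.5,        # hypothetical cut-off, not from the paper
    transfer_threshold: float = 0.5,  # hypothetical cut-off, not from the paper
) -> Tuple[List[int], List[int], List[int]]:
    """Split unlabeled sample indices into ID, recyclable OOD, and other OOD.

    id_scores[i]       : assumed confidence that sample i belongs to a known class.
    transfer_scores[i] : assumed transferability of sample i to the known classes.
    """
    if len(id_scores) != len(transfer_scores):
        raise ValueError("id_scores and transfer_scores must have the same length")

    id_idx: List[int] = []
    recyclable_idx: List[int] = []
    discarded_idx: List[int] = []

    for i, (s_id, s_tr) in enumerate(zip(id_scores, transfer_scores)):
        if s_id >= id_threshold:
            # Treated as in-distribution: usable directly by the SSL objective.
            id_idx.append(i)
        elif s_tr >= transfer_threshold:
            # OOD but transferable: a candidate for recycling via domain adaptation.
            recyclable_idx.append(i)
        else:
            # OOD and not transferable: left out of training in this sketch.
            discarded_idx.append(i)

    return id_idx, recyclable_idx, discarded_idx


if __name__ == "__main__":
    ids, recyclable, discarded = partition_unlabeled(
        [0.9, 0.2, 0.1, 0.6], [0.1, 0.8, 0.3, 0.9]
    )
    print(ids, recyclable, discarded)
```

In this sketch a sample is checked for ID membership first, so a high transferability score only matters for samples already judged OOD, which mirrors the abstract's statement that recyclability is evaluated for OOD data.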
Keywords
unlabeled data, recycling transferable