A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks.

Advances in Neural Information Processing Systems 31 (NIPS 2018), 2018

Abstract

Detecting test samples drawn sufficiently far away from the training distribution statistically or adversarially is a fundamental requirement for deploying a good classifier in many real-world machine learning applications. However, deep neural networks with the softmax classifier are known to produce highly overconfident posterior distributions even for such abnormal samples. [...]

Introduction
  • Deep neural networks (DNNs) have achieved high accuracy on many classification tasks, e.g., speech recognition [1], object detection [9] and image classification [12].
  • The predictive uncertainty of DNNs is closely related to the problem of detecting abnormal samples that are drawn far away from in-distribution statistically or adversarially.
  • Hendrycks & Gimpel [13] proposed using the maximum value of the classifier's posterior distribution as a baseline detection method, which was later improved by processing the inputs and outputs of DNNs [21]; a minimal sketch of both scores follows this list.
  • Confidence scores based on density estimators in the feature spaces of DNNs have also been proposed to characterize such abnormal samples [7].
  • Ma et al. [22] proposed the local intrinsic dimensionality (LID) and empirically showed that the characteristics of abnormal test samples can be estimated effectively using the LID.
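As a concrete illustration of the first two detection scores, here is a minimal NumPy sketch of the baseline maximum-softmax-probability confidence [13] and a temperature-scaled variant in the spirit of ODIN [21]. The function names, the temperature value and the toy logits are illustrative placeholders, not taken from the paper, and ODIN's input pre-processing step is omitted here.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; T = 1 recovers the ordinary softmax.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)      # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def baseline_score(logits):
    # Hendrycks & Gimpel [13]: maximum value of the posterior distribution.
    return softmax(logits).max(axis=-1)

def odin_score(logits, T=1000.0):
    # ODIN [21]: maximum of the temperature-scaled ("processed") posterior.
    return softmax(logits, T=T).max(axis=-1)

logits = np.array([[4.0, 1.0, 0.5], [1.1, 1.0, 0.9]])   # toy logits for two test samples
print(baseline_score(logits))
print(odin_score(logits))
```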
Highlights
  • Deep neural networks (DNNs) have achieved high accuracy on many classification tasks, e.g., speech recognition [1], object detection [9] and image classification [12]
  • We found that our proposed method is more robust in the choice of its hyperparameters as well as against extreme scenarios, e.g., when the training dataset has some noisy, random labels or a small number of data samples
  • Given deep neural networks (DNNs) with the softmax classifier, we propose a simple yet effective method for detecting abnormal samples such as out-of-distribution (OOD) and adversarial ones
  • We show that the Mahalanobis distance-based confidence score can be utilized in class-incremental learning tasks [29]: a classifier pre-trained on base classes is progressively updated whenever a new class with corresponding samples occurs
  • We use a threshold-based detector which measures a confidence score for the test sample and classifies it as in-distribution if the score is above some threshold (a minimal sketch of this score follows this list)
  • Our method improves the area under ROC (AUROC) over local intrinsic dimensionality (LID) from 82.2% to 95.8% when detecting CW samples using ResNet trained on the CIFAR-10 dataset
  • We propose a simple yet effective method for detecting abnormal test samples including both out-of-distribution and adversarial ones
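Below is a minimal sketch, not the authors' released code, of the Mahalanobis distance-based confidence score and the threshold rule described above. It assumes the penultimate-layer features of the training set have already been extracted into a NumPy array; the function names, the ridge constant and the threshold are placeholders.

```python
import numpy as np

def fit_gaussians(features, labels, num_classes):
    """Class-conditional Gaussians with a tied (shared) covariance, i.e. the LDA assumption."""
    d = features.shape[1]
    means = np.stack([features[labels == c].mean(axis=0) for c in range(num_classes)])
    cov = np.zeros((d, d))
    for c in range(num_classes):
        diff = features[labels == c] - means[c]
        cov += diff.T @ diff
    cov /= len(features)
    precision = np.linalg.inv(cov + 1e-6 * np.eye(d))   # small ridge for numerical stability
    return means, precision

def mahalanobis_confidence(x, means, precision):
    """Confidence score: negative of the minimum Mahalanobis distance to any class mean."""
    diffs = means - x                                    # (num_classes, d)
    dists = np.einsum('cd,de,ce->c', diffs, precision, diffs)
    return -dists.min()

def is_in_distribution(x, means, precision, threshold):
    # Threshold-based detector: in-distribution iff the confidence exceeds the threshold.
    return mahalanobis_confidence(x, means, precision) >= threshold
```

Under this view, the class-incremental extension amounts to computing a mean (and updating the shared covariance) for each newly arriving class and classifying by the nearest class mean, without retraining the network.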
Methods
  • Compared detectors: the baseline [13], ODIN [21], and the proposed Mahalanobis distance-based detector, with and without the feature ensemble and input pre-processing components.
  • Evaluation metrics: TNR at 95% TPR, AUROC, detection accuracy, and AUPR (in / out); a sketch of the first two metrics follows this list.
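For reference, a small sketch of how the first two metrics can be computed from detector scores, under the convention that higher scores indicate in-distribution; the function names are placeholders and ties between scores are ignored for simplicity.

```python
import numpy as np

def tnr_at_tpr95(in_scores, out_scores):
    """TNR on out-of-distribution samples at the threshold that keeps 95% TPR on in-distribution."""
    threshold = np.percentile(in_scores, 5)     # 95% of in-distribution scores lie above it
    return float(np.mean(out_scores < threshold))

def auroc(in_scores, out_scores):
    """Area under the ROC curve, treating in-distribution as the positive class."""
    scores = np.concatenate([in_scores, out_scores])
    labels = np.concatenate([np.ones(len(in_scores)), np.zeros(len(out_scores))])
    order = np.argsort(-scores)                 # sort by decreasing score
    labels = labels[order]
    tpr = np.cumsum(labels) / labels.sum()
    fpr = np.cumsum(1 - labels) / (1 - labels).sum()
    return float(np.trapz(tpr, fpr))
```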
Results
  • The authors demonstrate the effectiveness of the proposed method using deep convolutional neural networks such as DenseNet [14] and ResNet [12] on various vision datasets: CIFAR [15], SVHN [28], ImageNet [5] and LSUN [32].
  • For the problem of detecting out-of-distribution (OOD) samples, the authors train DenseNet with 100 layers and ResNet with 34 layers for classifying CIFAR-10, CIFAR-100 and SVHN datasets.
  • The authors use a threshold-based detector that measures a confidence score for each test sample and classifies it as in-distribution if the score is above some threshold.
  • The authors consider the baseline method [13], which defines the confidence score as the maximum value of the posterior distribution, and the state-of-the-art ODIN [21], which defines it as the maximum value of the processed posterior distribution
Conclusion
  • The authors propose a simple yet effective method for detecting abnormal test samples including both out-of-distribution and adversarial ones.
  • The authors' main idea is to induce a generative classifier under the linear discriminant analysis (LDA) assumption and to define a new confidence score based on it.
  • With calibration techniques such as input pre-processing and feature ensemble, the method performs very strongly across multiple tasks: detecting out-of-distribution samples, detecting adversarial attacks and class-incremental learning (a sketch of the input pre-processing step follows this list).
  • The authors believe the approach has the potential to be applied to many other related machine learning tasks, e.g., active learning [8], ensemble learning [19] and few-shot learning [31]
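The input pre-processing calibration mentioned above nudges the test input in the direction that increases the proposed confidence score. A PyTorch-flavoured sketch, under the assumption that `model` returns penultimate-layer features and that `means` and `prec` are the class statistics (as in the earlier sketch) converted to tensors; the magnitude `eps` is a tuning hyperparameter and its value here is a placeholder.

```python
import torch

def preprocess_input(model, x, means, prec, eps=0.002):
    """Add a small perturbation that increases the Mahalanobis confidence score of the input."""
    x = x.clone().requires_grad_(True)
    feat = model(x)                                   # penultimate-layer features, shape (1, d)
    diffs = feat - means                              # (num_classes, d) after broadcasting
    dists = torch.einsum('cd,de,ce->c', diffs, prec, diffs)
    confidence = -dists.min()                         # Mahalanobis confidence score
    confidence.backward()
    # Move the input by eps in the sign of the gradient, i.e. toward higher confidence.
    return (x + eps * x.grad.sign()).detach()
```

The feature ensemble repeats the same score computation at several intermediate layers and combines the per-layer scores with a weighted sum whose weights are fitted on a small validation set (e.g., by logistic regression).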
Tables
  • Table1: Contribution of each proposed method on distinguishing in- and out-of-distribution test set data. We measure the detection performance using ResNet trained on CIFAR-10, when SVHN dataset is used as OOD. All values are percentages and the best results are indicated in bold
  • Table2: Distinguishing in- and out-of-distribution test set data for image classification under various validation setups. All values are percentages and the best results are indicated in bold
  • Table3: Comparison of AUROC (%) under various validation setups. For evaluation on unknown attack, FGSM samples denoted by “seen” are used for validation. For our method, we use both feature ensemble and input pre-processing. The best results are indicated in bold
Funding
  • This work was supported in part by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No.R0132-15-1005, Content visual browsing technology in the online and offline environments), National Research Council of Science & Technology (NST) grant by the Korea government (MSIP) (No CRC-15-05-ETRI), DARPA Explainable AI (XAI) program #313498, Sloan Research Fellowship, and Kwanjeong Educational Foundation Scholarship
Reference
  • Amodei, Dario, Ananthanarayanan, Sundaram, Anubhai, Rishita, Bai, Jingliang, Battenberg, Eric, Case, Carl, Casper, Jared, Catanzaro, Bryan, Cheng, Qiang, Chen, Guoliang, et al. Deep Speech 2: End-to-end speech recognition in English and Mandarin. In ICML, 2016.
  • Amodei, Dario, Olah, Chris, Steinhardt, Jacob, Christiano, Paul, Schulman, John, and Mane, Dan. Concrete problems in AI safety. arXiv preprint arXiv:1606.06565, 2016.
  • Carlini, Nicholas and Wagner, David. Adversarial examples are not easily detected: Bypassing ten detection methods. In ACM Workshop on AISec, 2017.
  • Chrabaszcz, Patryk, Loshchilov, Ilya, and Hutter, Frank. A downsampled variant of ImageNet as an alternative to the CIFAR datasets. arXiv preprint arXiv:1707.08819, 2017.
  • Deng, Jia, Dong, Wei, Socher, Richard, Li, Li-Jia, Li, Kai, and Fei-Fei, Li. ImageNet: A large-scale hierarchical image database. In CVPR, 2009.
  • Evtimov, Ivan, Eykholt, Kevin, Fernandes, Earlence, Kohno, Tadayoshi, Li, Bo, Prakash, Atul, Rahmati, Amir, and Song, Dawn. Robust physical-world attacks on machine learning models. In CVPR, 2018.
  • Feinman, Reuben, Curtin, Ryan R., Shintre, Saurabh, and Gardner, Andrew B. Detecting adversarial samples from artifacts. arXiv preprint arXiv:1703.00410, 2017.
  • Gal, Yarin, Islam, Riashat, and Ghahramani, Zoubin. Deep Bayesian active learning with image data. In ICML, 2017.
  • Girshick, Ross. Fast R-CNN. In ICCV, 2015.
  • Goodfellow, Ian J., Shlens, Jonathon, and Szegedy, Christian. Explaining and harnessing adversarial examples. In ICLR, 2015.
  • Guo, Chuan, Rana, Mayank, Cisse, Moustapha, and van der Maaten, Laurens. Countering adversarial images using input transformations. arXiv preprint arXiv:1711.00117, 2017.
  • He, Kaiming, Zhang, Xiangyu, Ren, Shaoqing, and Sun, Jian. Deep residual learning for image recognition. In CVPR, 2016.
  • Hendrycks, Dan and Gimpel, Kevin. A baseline for detecting misclassified and out-of-distribution examples in neural networks. In ICLR, 2017.
  • Huang, Gao and Liu, Zhuang. Densely connected convolutional networks. In CVPR, 2017.
  • Krizhevsky, Alex and Hinton, Geoffrey. Learning multiple layers of features from tiny images. Technical report, 2009.
  • Kurakin, Alexey, Goodfellow, Ian, and Bengio, Samy. Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533, 2016.
  • Lasserre, Julia A., Bishop, Christopher M., and Minka, Thomas P. Principled hybrids of generative and discriminative models. In CVPR, 2006.
  • Lee, Kibok, Lee, Kimin, Min, Kyle, Zhang, Yuting, Shin, Jinwoo, and Lee, Honglak. Hierarchical novelty detection for visual object recognition. In CVPR, 2018.
  • Lee, Kimin, Hwang, Changho, Park, KyoungSoo, and Shin, Jinwoo. Confident multiple choice learning. In ICML, 2017.
  • Lee, Kimin, Lee, Honglak, Lee, Kibok, and Shin, Jinwoo. Training confidence-calibrated classifiers for detecting out-of-distribution samples. In ICLR, 2018.
  • Liang, Shiyu, Li, Yixuan, and Srikant, R. Principled detection of out-of-distribution examples in neural networks. In ICLR, 2018.
  • Ma, Xingjun, Li, Bo, Wang, Yisen, Erfani, Sarah M., Wijewickrema, Sudanthi, Houle, Michael E., Schoenebeck, Grant, Song, Dawn, and Bailey, James. Characterizing adversarial subspaces using local intrinsic dimensionality. In ICLR, 2018.
  • Maaten, Laurens van der and Hinton, Geoffrey. Visualizing data using t-SNE. Journal of Machine Learning Research, 2008.
  • McCloskey, Michael and Cohen, Neal J. Catastrophic interference in connectionist networks: The sequential learning problem. In Psychology of Learning and Motivation. Elsevier, 1989.
  • Mensink, Thomas, Verbeek, Jakob, Perronnin, Florent, and Csurka, Gabriela. Distance-based image classification: Generalizing to new classes at near-zero cost. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013.
  • Moosavi-Dezfooli, Seyed-Mohsen, Fawzi, Alhussein, and Frossard, Pascal. DeepFool: a simple and accurate method to fool deep neural networks. In CVPR, 2016.
  • Murphy, Kevin P. Machine Learning: A Probabilistic Perspective. MIT Press, 2012.
  • Netzer, Yuval, Wang, Tao, Coates, Adam, Bissacco, Alessandro, Wu, Bo, and Ng, Andrew Y. Reading digits in natural images with unsupervised feature learning. In NIPS Workshop, 2011.
  • Rebuffi, Sylvestre-Alvise and Kolesnikov, Alexander. iCaRL: Incremental classifier and representation learning. In CVPR, 2017.
  • Sharif, Mahmood, Bhagavatula, Sruti, Bauer, Lujo, and Reiter, Michael K. Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition. In ACM SIGSAC, 2016.
  • Vinyals, Oriol, Blundell, Charles, Lillicrap, Tim, Wierstra, Daan, et al. Matching networks for one shot learning. In NIPS, 2016.
  • Yu, Fisher, Seff, Ari, Zhang, Yinda, Song, Shuran, Funkhouser, Thomas, and Xiao, Jianxiong. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365, 2015.