Deep Verifier Networks: Verification of Deep Discriminative Models with Deep Generative Models

AAAI, pp.7002-7010, (2021)

Cited by: 22

Abstract

AI Safety is a major concern in many deep learning applications such as autonomous driving. Given a trained deep learning model, an important natural problem is how to reliably verify the model's prediction. In this paper, we propose a novel framework --- deep verifier networks (DVN) to verify the inputs and outputs of deep discriminative models.

Introduction
  • Deep learning models provide state-of-the-art performance in various applications such as image classification, caption generation, sequence modeling and machine translation.
  • Such performance is based on the assumption that the training and testing data are sampled from a similar distribution [9].
  • On out-of-distribution (OOD) samples, deep learning models can fail silently by producing high confidence in their incorrect predictions even for completely unrecognizable or irrelevant inputs [1].
Highlights
  • Deep learning models provide state-of-the-art performance in various applications such as image classification, caption generation, sequence modeling and machine translation
  • We propose to verify the predictions of deep discriminative models by using deep generative models that try to generate the input given the prediction of the discriminative model
  • We propose to enhance the performance of anomaly detection by verifying predictions of deep discriminative models using deep generative models
  • We propose our model, Deep Verifier Networks (DVN), which is based on conditional variational auto-encoders with disentanglement (see the sketch after this list)
  • We show our model is able to achieve state-of-the-art performance on benchmark OOD detection and adversarial example detection tasks
  • We believe our method would provide some useful insight for artificial intelligence (AI) safety systems
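The verification idea summarized above can be pictured with a minimal PyTorch sketch, assuming a conditional VAE whose decoder reconstructs the input given the classifier's predicted label; the architecture, dimensions, and threshold below are illustrative assumptions, not the authors' exact model.

```python
# Minimal sketch (not the authors' exact architecture): a conditional VAE that
# tries to reconstruct x given the classifier's predicted label y. A low ELBO
# (poor reconstruction) flags the prediction as unverified / possibly OOD.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VerifierCVAE(nn.Module):
    def __init__(self, x_dim=3 * 32 * 32, y_dim=10, z_dim=100, h_dim=512):
        super().__init__()
        # Encoder q(z | x): the paper disentangles z from y, so the encoder
        # here sees only x (an assumption consistent with the text above).
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)
        self.logvar = nn.Linear(h_dim, z_dim)
        # Decoder p(x | z, y): conditioned on the predicted label.
        self.dec = nn.Sequential(
            nn.Linear(z_dim + y_dim, h_dim), nn.ReLU(),
            nn.Linear(h_dim, x_dim), nn.Sigmoid())

    def forward(self, x, y_onehot):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        x_hat = self.dec(torch.cat([z, y_onehot], dim=1))
        return x_hat, mu, logvar

def elbo_score(model, x, y_onehot):
    """Per-sample ELBO (higher = better verified), used as the acceptance score."""
    x_hat, mu, logvar = model(x, y_onehot)
    rec = -F.binary_cross_entropy(x_hat, x, reduction="none").sum(dim=1)
    kl = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(dim=1)
    return rec - kl

# Usage: accept the classifier's prediction only if the ELBO clears a threshold
# chosen on in-distribution validation data (the value below is illustrative).
# x = images.view(images.size(0), -1); y_onehot = F.one_hot(preds, 10).float()
# verified = elbo_score(verifier, x, y_onehot) > -2000.0
```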
Methods
  • Each image in the caption datasets (Oxford-102 and CUB-200) has 10 descriptions provided by [28]
  • For these two datasets, the authors use 80% of the samples to train the captioner and the remaining 20% for testing, in a cross-validation manner.
  • A character-level CNN-RNN model [28] is used for the text embedding, which produces a 1,024-dimensional vector given the description; this vector is then projected to a 128-dimensional code c.
  • The input of the discriminator is the concatenation of z and c, which results in a 228-dimensional vector, as illustrated in the sketch below.
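As a quick check of the dimensions quoted above (1,024-dimensional text embedding, 128-dimensional code c, 228-dimensional discriminator input), here is a hedged sketch; the 100-dimensional z and the single linear projection layer are assumptions made only so the numbers add up.

```python
# Illustrative sketch of the dimensions quoted above (layer names are hypothetical):
# a 1,024-dim text embedding is projected to a 128-dim code c, then concatenated
# with a 100-dim latent z to form the 228-dim discriminator input.
import torch
import torch.nn as nn

text_emb = torch.randn(8, 1024)        # CNN-RNN sentence embedding [28]
z = torch.randn(8, 100)                # latent code (100-dim assumed so 128 + 100 = 228)

project = nn.Linear(1024, 128)         # text embedding -> code c
c = project(text_emb)

disc_input = torch.cat([z, c], dim=1)  # concatenation fed to the discriminator
assert disc_input.shape == (8, 228)
```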
Results
  • The authors demonstrate the effectiveness of the DVN on several classification benchmarks and show its potential for the image captioning task.
  • The detection accuracy metric corresponds to the maximum classification probability over all possible thresholds; a sketch of this metric follows this list.
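The detection-accuracy metric referenced above is commonly computed as the best balanced in/out classification accuracy over a sweep of score thresholds; the NumPy sketch below follows that common definition and is not lifted from the authors' code.

```python
# Sketch of the detection-accuracy metric: sweep a threshold over the verifier
# scores and take the best balanced accuracy at separating in-distribution from
# OOD samples (equal priors assumed, as is standard in this literature).
import numpy as np

def detection_accuracy(scores_in, scores_out):
    """scores_in/scores_out: 1-D arrays of verifier scores (higher = in-distribution)."""
    thresholds = np.unique(np.concatenate([scores_in, scores_out]))
    best = 0.0
    for t in thresholds:
        tpr = np.mean(scores_in >= t)   # in-distribution samples kept
        tnr = np.mean(scores_out < t)   # OOD samples rejected
        best = max(best, 0.5 * (tpr + tnr))
    return best

# Example: detection_accuracy(np.random.randn(1000) + 2.0, np.random.randn(1000))
```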
Conclusion
  • Conclusion and Future Works

    In this paper, the authors propose to enhance the performance of anomaly detection by verifying predictions of deep discriminative models using deep generative models.
  • The authors propose the model Deep Verifier Networks (DVN) which is based on conditional variational auto-encoders with disentanglement.
  • The authors show the model is able to achieve state-of-the-art performance on benchmark OOD detection and adversarial example detection tasks.
  • Ordinary image classifiers such as DenseNet have perfect accuracy for in-distribution queries, but their behaviors are undefined on adversarial queries.
  • A robust image classifier sacrifices some accuracy for robustness to adversarial examples.
  • If the prediction does not pass the verification, the authors switch to the robust image classifier; a sketch of this fallback logic follows this list.
  • The authors believe the method would provide some useful insight for AI safety systems
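The accept-or-fall-back behavior described in the conclusion can be sketched as below; classifier, robust_classifier, verifier, and the threshold tau are hypothetical placeholders standing in for the trained models, not the authors' actual API.

```python
# Sketch of the fallback logic described above: use the accurate (but brittle)
# classifier when the verifier accepts its prediction, otherwise fall back to
# the robust classifier. All names and the threshold tau are placeholders.
import torch
import torch.nn.functional as F

def predict_with_verification(x, classifier, robust_classifier, verifier, tau):
    logits = classifier(x)
    preds = logits.argmax(dim=1)
    y_onehot = F.one_hot(preds, num_classes=logits.size(1)).float()
    # elbo_score is the acceptance score from the earlier CVAE sketch.
    score = elbo_score(verifier, x.view(x.size(0), -1), y_onehot)
    verified = score > tau
    robust_preds = robust_classifier(x).argmax(dim=1)
    return torch.where(verified, preds, robust_preds)
```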
Tables
  • Table1: OOD verification results of image classification under different validation setups. All metrics are percentages and the best results are bolded. The ResNet for SUF [20] and ours is ResNet-34 [12], while ODIN [21] uses the more powerful Wide ResNet-40 with width 4 [35]
  • Table2: Test error rate of classification on CIFAR-10/100 using
  • Table3: The performance of DVN w/o disentanglement of y from z with a ResNet backbone, using CIFAR-10/SVHN as in-distribution/OOD, respectively
  • Table4: The performance of DVN w/o replacing p(z) with p∗(z)
  • Table5: Comparison of AUROC (%) under different validation setups. The best results are bolded
  • Table6: OOD verification results of image caption under different validation setups. We use CUB-200, LSUN and COCO as the OOD of Oxford-102, while using Oxford-102, LSUN and COCO as OOD of CUB-200
Related work
  • Detecting OOD samples in low-dimensional spaces using density estimation, nearest-neighbor, and clustering analysis has been well studied [26]. However, these methods are usually unreliable in high-dimensional spaces, e.g., images [21].

    OOD detection with deep neural networks has been developed more recently. [13] found that pre-trained DNNs produce a higher maximum softmax probability for in-distribution examples than for anomalous ones. Building on this work, [21] showed that the maximum softmax probability becomes more separable between in-distribution and out-of-distribution samples when adversarial perturbations are used for pre-processing in the training stage. [7] augmented the classifier with a confidence estimation branch and adjusted the softmax distribution using the predicted confidence score during training. [19] trained a classifier jointly with a GAN, with an additional objective that the classifier produce low confidence on generated samples. [14] proposed to use a large set of real images, rather than generated OOD samples, to train the detector. [31] applied a margin entropy loss over the softmax output, in which a part of the training data is labeled as OOD and the partition of in-distribution and OOD data is varied to train an ensemble of classifiers. These improvements over [13] all require re-training the model with different modifications.
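For reference, the maximum-softmax-probability baseline of [13] discussed above amounts to a few lines; the sketch below is a generic illustration, not the original implementation, and the classifier and threshold names are placeholders.

```python
# Sketch of the maximum-softmax-probability baseline of [13]: in-distribution
# inputs tend to receive higher max softmax scores than OOD inputs, so the
# score itself can be thresholded as a detector.
import torch
import torch.nn.functional as F

def msp_score(classifier, x):
    """Maximum softmax probability; higher suggests an in-distribution input."""
    with torch.no_grad():
        logits = classifier(x)
    return F.softmax(logits, dim=1).max(dim=1).values

# is_in_distribution = msp_score(model, images) > tau   # tau picked on validation data
```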

    [20, 27] instead explore the DNN's feature space rather than the output posterior distribution, which makes them applicable to pre-trained softmax neural networks. [20] obtained class-conditional Gaussian distributions using Gaussian discriminant analysis, and the confidence score is defined as the Mahalanobis distance between a sample and the closest class-conditional Gaussian. By modeling each class of in-distribution samples independently, it showed remarkable results for OOD and adversarial attack detection. Note, however, that its reported best performance also requires input pre-processing and model changes. Besides, [21, 31, 20] need OOD examples for hyper-parameter validation and require two forward passes and one backward pass at test time. Another limitation of the aforementioned methods is that they only target classifiers with a softmax output.
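Similarly, the Mahalanobis-distance confidence score of [20] described above can be sketched on pre-extracted penultimate-layer features; this sketch ignores the input pre-processing and layer-ensembling details of [20] and uses assumed helper names.

```python
# Sketch of the Mahalanobis confidence score of [20]: fit one Gaussian per class
# (shared covariance) on penultimate-layer features, then score a test feature
# by its distance to the closest class mean.
import numpy as np

def fit_class_gaussians(features, labels, n_classes):
    means = np.stack([features[labels == c].mean(axis=0) for c in range(n_classes)])
    centered = features - means[labels]                 # shared (tied) covariance
    cov = centered.T @ centered / len(features)
    precision = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))
    return means, precision

def mahalanobis_confidence(f, means, precision):
    """Negative squared distance to the nearest class mean (higher = more in-distribution)."""
    diffs = means - f                                   # (n_classes, feature_dim)
    d2 = np.einsum('cd,de,ce->c', diffs, precision, diffs)
    return -d2.min()
```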
Study subjects and analysis
Samples: 10,000
The LSUN (crop) and LSUN (resize) sets are created in a downsampling manner similar to the TinyImageNet datasets. The Uniform noise and Gaussian noise datasets contain 10,000 samples each, generated by drawing each pixel of a 32×32 RGB image i.i.d. from a uniform distribution on [0, 1] or from a Gaussian distribution with mean 0.5 and variance 1 [21]. A sketch of this noise generation follows.
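A hedged sketch of how such noise OOD sets can be generated; clipping the Gaussian images to the valid [0, 1] range is an assumption, as the text does not specify it.

```python
# Sketch of the synthetic-noise OOD sets described above: 10,000 32x32 RGB images
# drawn i.i.d. per pixel from U[0, 1] or from N(0.5, 1).
import numpy as np

rng = np.random.default_rng(0)
uniform_noise = rng.uniform(0.0, 1.0, size=(10_000, 32, 32, 3)).astype(np.float32)
gaussian_noise = rng.normal(loc=0.5, scale=1.0, size=(10_000, 32, 32, 3))
gaussian_noise = np.clip(gaussian_noise, 0.0, 1.0).astype(np.float32)  # clip assumed
```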

Bird species: 200 (11,788 images)
Oxford-102 contains 8,189 images of 102 flower classes. CUB-200 contains 200 bird species with 11,788 images.

Reference
  • [1] D. Amodei, C. Olah, J. Steinhardt, P. Christiano, J. Schulman, and D. Mane. Concrete problems in AI safety. arXiv preprint arXiv:1606.06565, 2016.
  • [2] S. R. Bowman, L. Vilnis, O. Vinyals, A. M. Dai, R. Jozefowicz, and S. Bengio. Generating sentences from a continuous space. arXiv preprint arXiv:1511.06349, 2015.
  • [3] Y. Burda, R. Grosse, and R. Salakhutdinov. Importance weighted autoencoders. arXiv preprint arXiv:1509.00519, 2015.
  • [4] N. Carlini and D. Wagner. Adversarial examples are not easily detected: Bypassing ten detection methods. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pages 3–14. ACM, 2017.
  • [5] H. Choi, E. Jang, and A. A. Alemi. WAIC, but why? Generative ensembles for robust anomaly detection. arXiv preprint arXiv:1810.01392, 2018.
  • [6] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255, 2009.
  • [7] T. DeVries and G. W. Taylor. Learning confidence for out-of-distribution detection in neural networks. 2018.
  • [8] R. Feinman, R. R. Curtin, S. Shintre, and A. B. Gardner. Detecting adversarial samples from artifacts. arXiv preprint arXiv:1703.00410, 2017.
  • [9] I. Goodfellow, Y. Bengio, and A. Courville. Deep Learning. MIT Press, 2016.
  • [10] I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
  • [11] C. Guo, G. Pleiss, Y. Sun, and K. Q. Weinberger. On calibration of modern neural networks. In Proceedings of the 34th International Conference on Machine Learning, pages 1321–1330. JMLR.org, 2017.
  • [12] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
  • [13] D. Hendrycks and K. Gimpel. A baseline for detecting misclassified and out-of-distribution examples in neural networks. ICLR, 2017.
  • [14] D. Hendrycks, M. Mazeika, and T. Dietterich. Deep anomaly detection with outlier exposure. ICLR, 2019.
  • [15] R. D. Hjelm, A. Fedorov, S. Lavoie-Marchildon, K. Grewal, A. Trischler, and Y. Bengio. Learning deep representations by mutual information estimation and maximization. arXiv preprint arXiv:1808.06670, 2018.
  • [16] G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4700–4708, 2017.
  • [17] A. Krizhevsky and G. Hinton. Learning multiple layers of features from tiny images. Technical report, Citeseer, 2009.
  • [18] A. Kurakin, I. Goodfellow, and S. Bengio. Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533, 2016.
  • [19] K. Lee, H. Lee, K. Lee, and J. Shin. Training confidence-calibrated classifiers for detecting out-of-distribution samples. ICLR, 2018.
  • [20] K. Lee, K. Lee, H. Lee, and J. Shin. A simple unified framework for detecting out-of-distribution samples and adversarial attacks. NIPS, 2018.
  • [21] S. Liang, Y. Li, and R. Srikant. Enhancing the reliability of out-of-distribution image detection in neural networks. ICLR, 2018.
  • [22] X. Ma, B. Li, Y. Wang, S. M. Erfani, S. Wijewickrema, G. Schoenebeck, D. Song, M. E. Houle, and J. Bailey. Characterizing adversarial subspaces using local intrinsic dimensionality. arXiv preprint arXiv:1801.02613, 2018.
  • [23] S.-M. Moosavi-Dezfooli, A. Fawzi, and P. Frossard. DeepFool: A simple and accurate method to fool deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2574–2582, 2016.
  • [24] E. Nalisnick, A. Matsukawa, Y. W. Teh, D. Gorur, and B. Lakshminarayanan. Do deep generative models know what they don't know? arXiv preprint arXiv:1810.09136, 2018.
  • [25] Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, and A. Y. Ng. Reading digits in natural images with unsupervised feature learning. 2011.
  • [26] M. A. F. Pimentel, D. A. Clifton, C. Lei, and L. Tarassenko. A review of novelty detection. Signal Processing, 99(6):215–249, 2014.
  • [27] I. M. Quintanilha, R. d. M. E. F., J. L., M. D., and L. O. Nunes. Detecting out-of-distribution samples using low-order deep features statistics. OpenReview, 2018.
  • [28] S. Reed, Z. Akata, H. Lee, and B. Schiele. Learning deep representations of fine-grained visual descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 49–58, 2016.
  • [29] S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, and H. Lee. Generative adversarial text to image synthesis. arXiv preprint arXiv:1605.05396, 2016.
  • [30] Y. Song, R. Shu, N. Kushman, and S. Ermon. Constructing unrestricted adversarial examples with generative models. In Advances in Neural Information Processing Systems, pages 8312–8323, 2018.
  • [31] A. Vyas, N. Jammalamadaka, X. Zhu, D. Das, and T. L. Willke. Out-of-distribution detection using an ensemble of self-supervised leave-out classifiers. ECCV, 2018.
  • [32] C. Xiao, J.-Y. Zhu, B. Li, W. He, M. Liu, and D. Song. Spatially transformed adversarial examples. arXiv preprint arXiv:1801.02612, 2018.
  • [33] K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. arXiv preprint arXiv:1502.03044, 2015.
  • [34] F. Yu, A. Seff, Y. Zhang, S. Song, T. Funkhouser, and J. Xiao. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365, 2015.
  • [35] S. Zagoruyko and N. Komodakis. Wide residual networks. arXiv preprint arXiv:1605.07146, 2016.