Visualizing Deep Neural Network Decisions: Prediction Difference Analysis

Luisa M. Zintgraf

International Conference on Learning Representations (ICLR), 2017. arXiv:1702.04595.

Abstract:

This article presents the prediction difference analysis method for visualizing the response of a deep neural network to a specific input. When classifying images, the method highlights areas in a given input image that provide evidence for or against a certain class. It overcomes several shortcomings of previous methods and provides great…

Introduction
  • Over the last few years, deep neural networks (DNNs) have emerged as the method of choice for perceptual tasks such as speech recognition and image classification.
  • A DNN is a highly complex non-linear function, which makes it hard to understand how a particular classification comes about.
  • This lack of transparency is a significant impediment to the adoption of deep learning in areas of industry, government and healthcare where the cost of errors is high.
  • In section 4 the authors provide several demonstrations of the technique for deep convolutional neural networks (DCNNs) trained on ImageNet data, and further show how the method can be applied when classifying MRI brain scans of HIV patients with neurodegenerative disease.
Highlights
  • Over the last few years, deep neural networks (DNNs) have emerged as the method of choice for perceptual tasks such as speech recognition and image classification
  • In section 4 we provide several demonstrations of our technique for deep convolutional neural networks (DCNNs) trained on ImageNet data, and further show how the method can be applied when classifying MRI brain scans of HIV patients with neurodegenerative disease.
  • We presented a new method for visualizing deep neural networks that improves on previous methods by using a more powerful conditional, multivariate model
  • The signed information offers new insights, both for research on the networks and for their acceptance and usability in domains like healthcare.
  • We have presented several ways in which the visualization method can be put into use for analyzing how DCNNs make decisions.
Methods
  • The authors illustrate how the proposed visualization method can be applied: on the ImageNet dataset of natural images when using DCNNs, and on a medical imaging dataset of MRI scans when using a logistic regression classifier.
  • For marginal sampling the authors always use the empirical distribution, i.e., the authors replace a feature with samples taken directly from other images, at the same location.
  • For conditional sampling the authors use a multivariate normal distribution.
  • For both sampling methods the authors use 10 samples to estimate p(c|x\i).
  • Note that all images are best viewed digitally and in color.
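The sampling procedure described above can be sketched as follows. Here `prob_c` and `sample_feature` are hypothetical interfaces standing in for the trained classifier and the empirical sampler; the relevance score is the weight-of-evidence (log-odds) difference that prediction difference analysis assigns to each feature.

```python
import numpy as np

def prediction_difference(x, prob_c, sample_feature, num_samples=10):
    """Relevance of each feature for class c via marginal sampling.

    prob_c(x)         -> scalar p(c|x)  (hypothetical classifier interface)
    sample_feature(i) -> replacement value for feature i, drawn from the
                         empirical distribution at that location
    """
    def log_odds(p):
        return np.log2(p / (1.0 - p))

    p_full = prob_c(x)
    relevance = np.zeros(len(x))
    for i in range(len(x)):
        probs = []
        for _ in range(num_samples):
            x_mod = x.copy()
            x_mod[i] = sample_feature(i)     # simulate "feature i unknown"
            probs.append(prob_c(x_mod))
        p_without = np.mean(probs)           # estimate of p(c | x\i)
        # positive score: feature i provided evidence FOR class c
        relevance[i] = log_odds(p_full) - log_odds(p_without)
    return relevance
```

A positive entry marks evidence for the class, a negative one evidence against it, which is the signed information referred to in the highlights.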
Results
  • The authors trained an L2-regularized Logistic Regression classifier on a subset of the MRI slices and on a balanced version of the dataset to achieve an accuracy of 69.3% in a 10-fold cross-validation test.
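That training setup can be sketched roughly as below; since the AMC MRI data are not public, the features, labels, shapes, and hyperparameters here are synthetic, illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for flattened MRI-slice voxel features; all
# shapes, labels, and hyperparameters below are illustrative only.
rng = np.random.default_rng(0)
X = rng.normal(size=(140, 500))    # 140 subjects x 500 voxel features
y = rng.integers(0, 2, size=140)   # balanced patient / control labels

clf = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
scores = cross_val_score(clf, X, y, cv=10)   # 10-fold cross-validation
print(f"mean CV accuracy: {scores.mean():.3f}")
```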
Conclusion
  • The authors presented a new method for visualizing deep neural networks that improves on previous methods by using a more powerful conditional, multivariate model.
  • While the method requires significant computational resources, real-time 3D visualization is possible when visualizations are pre-computed.
  • With further optimization and powerful GPUs, pre-computation time can be reduced considerably further.
  • The authors have presented several ways in which the visualization method can be put into use for analyzing how DCNNs make decisions.
Related work
  • Broadly speaking, two approaches to understanding DCNNs through visualization have been investigated in the literature: find an input image that maximally activates a given unit or class score, to visualize what the network is looking for (Erhan et al., 2009; Simonyan et al., 2013; Yosinski et al., 2015), or visualize how the network responds to a specific input image in order to explain a particular classification made by the network. The latter is the subject of this paper.

    One such instance-specific method is the class saliency visualization proposed by Simonyan et al. (2013), who measure how sensitive the classification score is to small changes in pixel values by computing the partial derivative of the class score with respect to the input features using standard backpropagation. They also show that there is a close connection to using deconvolutional networks for visualization, proposed by Zeiler & Fergus (2014). Other methods include that of Shrikumar et al. (2016), who compare the activation of a unit when a specific input is fed forward through the net to a reference activation for that unit. Zhou et al. (2016) and Bach et al. (2015) also generate interesting visualization results for individual inputs, but neither is as closely related to our method as the two papers mentioned above. The idea of our method is similar to another analysis by Zeiler & Fergus (2014): they estimate the importance of input pixels by visualizing the probability of the (correct) class as a function of a gray patch occluding parts of the image. In this paper, we take a more rigorous approach to both removing information from the image and evaluating the effect of doing so.
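The gray-patch occlusion analysis of Zeiler & Fergus (2014) that this paper refines can be sketched as follows, with `prob_c` a hypothetical function returning the class probability for an image:

```python
import numpy as np

def occlusion_map(image, prob_c, patch=8, gray=0.5):
    """Slide a gray patch over the image and record how much the class
    probability drops (occlusion analysis in the style of Zeiler & Fergus).
    """
    h, w = image.shape[:2]
    heat = np.zeros((h // patch, w // patch))
    baseline = prob_c(image)
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = gray
            # a large drop means the occluded region was important
            heat[i // patch, j // patch] = baseline - prob_c(occluded)
    return heat
```

Replacing a region with a fixed gray value conflates "information removed" with "gray evidence added"; prediction difference analysis instead samples plausible replacement values, which is the more rigorous removal referred to above.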
Funding
  • This work was supported by an AWS in Education Grant award.
  • We thank Facebook and Google for financial support, and our reviewers for their time and valuable, constructive feedback. This work was also in part supported by: Innoviris, the Brussels Institute for Research and Innovation, Brussels, Belgium; the Nuts-OHRA Foundation (grant no. 1003-026), Amsterdam, The Netherlands; and The Netherlands Organization for Health Research and Development (ZonMW) together with AIDS Fonds (grant nos. 300020007 and 2009063).
  • Additional unrestricted scientific grants were received from Gilead Sciences, ViiV Healthcare, Janssen Pharmaceutica N.V., Bristol-Myers Squibb, Boehringer Ingelheim, and Merck & Co. We thank Barbara Elsenga, Jane Berkel, Sandra Moll, Maja Totté, and Marjolein Martens for running the AGEhIV study program and capturing our data with such care and passion.
Study subjects and analysis
subjects: 121
FA is sensitive to microstructural damage and therefore expected to be, on average, decreased in patients. Subjects were scanned on two 3.0 Tesla scanner systems, 121 subjects on a Philips Intera system and 39 on a Philips Ingenia system. Patients and controls were evenly distributed

samples: 10
For conditional sampling we use a multivariate normal distribution. For both sampling methods we use 10 samples to estimate p(c|x\i) (since no significant difference was observed with more samples). Note that all images are best viewed digitally and in color.
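A minimal sketch of such a conditional sampler, assuming a mean `mu` and covariance `cov` estimated from training images (this uses the standard Gaussian conditioning formulas, not the paper's exact implementation):

```python
import numpy as np

def conditional_gaussian_sample(x, i, mu, cov, rng):
    """Draw feature i from a multivariate normal conditioned on all
    remaining features of x (standard Gaussian conditioning)."""
    rest = np.delete(np.arange(len(x)), i)
    s_ir = cov[i, rest]                              # cross-covariances
    s_rr_inv = np.linalg.inv(cov[np.ix_(rest, rest)])
    cond_mean = mu[i] + s_ir @ s_rr_inv @ (x[rest] - mu[rest])
    cond_var = cov[i, i] - s_ir @ s_rr_inv @ s_ir
    return rng.normal(cond_mean, np.sqrt(max(cond_var, 0.0)))
```

Conditioning on the surrounding features yields replacement values that are consistent with the rest of the image, which is what makes the conditional sampler more powerful than marginal sampling.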

Our implementation is available at github.com/lmzintgraf/DeepVis-PredDiff

samples: 10
We used the publicly available pre-trained models that were implemented using the deep learning framework Caffe (Jia et al., 2014). Analyzing one image took on average 20, 30, and 70 minutes for the classifiers AlexNet, GoogLeNet, and VGG respectively (using the GPU implementation of Caffe and mini-batches, with the standard settings of 10 samples and a window size of k = 10). The results shown here are chosen from among a small set of images in order to show a range of behavior of the algorithm.

HIV patients: 100
The dataset used here is referred to as the COBRA dataset. It contains 3D MRIs from 100 HIV patients and 70 healthy individuals, included at the Academic Medical Center (AMC) in Amsterdam, The Netherlands. For these subjects, diffusion-weighted MRI data were acquired.

samples: 70
FA images were spatially normalized to standard space (Andersson et al., 2007), resulting in volumes with 91 × 109 × 91 = 902,629 voxels. We trained an L2-regularized Logistic Regression classifier on a subset of the MRI slices (slices 29-40 along the first axis) and on a balanced version of the dataset (by taking the first 70 samples of the HIV class) to achieve an accuracy of 69.3% in a 10-fold cross-validation test. Analyzing one image took around half an hour (on a CPU, with k = 3 and l = 7; see Algorithm 1).

Reference
  • Jesper LR Andersson, Mark Jenkinson, and Stephen Smith. Non-linear optimisation. FMRIB technical report TR07JA, University of Oxford FMRIB Centre, Oxford, UK, 2007.
  • Sebastian Bach, Alexander Binder, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller, and Wojciech Samek. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS one, 10(7):e0130140, 2015.
  • Christine Ecker, Andre Marquand, Janaina Mourão-Miranda, Patrick Johnston, Eileen M Daly, Michael J Brammer, Stefanos Maltezos, Clodagh M Murphy, Dene Robertson, Steven C Williams, et al. Describing the brain in autism in five dimensions—magnetic resonance imaging-assisted diagnosis of autism spectrum disorder using a multiparameter classification approach. The Journal of Neuroscience, 30(32):10612–10623, 2010.
  • Dumitru Erhan, Yoshua Bengio, Aaron Courville, and Pascal Vincent. Visualizing higher-layer features of a deep network. Dept. IRO, Université de Montréal, Tech. Rep, 4323, 2009.
  • Bilwaj Gaonkar and Christos Davatzikos. Analytic estimation of statistical significance maps for support vector machine based multi-variate image analysis and classification. NeuroImage, 78:270–283, 2013.
  • Stefan Haufe, Frank Meinecke, Kai Görgen, Sven Dähne, John-Dylan Haynes, Benjamin Blankertz, and Felix Bießmann. On the interpretation of weight vectors of linear models in multivariate neuroimaging. Neuroimage, 87:96–110, 2014.
  • Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014.
  • Stefan Klöppel, Cynthia M Stonnington, Carlton Chu, Bogdan Draganski, Rachael I Scahill, Jonathan D Rohrer, Nick C Fox, Clifford R Jack, John Ashburner, and Richard SJ Frackowiak. Automatic classification of MR scans in Alzheimer's disease. Brain, 131(3):681–689, 2008.
  • Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pp. 1097–1105, 2012.
  • Janaina Mourao-Miranda, Arun LW Bokde, Christine Born, Harald Hampel, and Martin Stetter. Classifying brain states and determining the discriminating activation patterns: Support vector machine on functional mri data. NeuroImage, 28(4):980–995, 2005.
  • Marko Robnik-Šikonja and Igor Kononenko. Explaining classifications for individual instances. Knowledge and Data Engineering, IEEE Transactions on, 20(5):589–600, 2008.
  • Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252, 2015. doi: 10.1007/s11263-015-0816-y.
  • Shayan Shahand, Ammar Benabdelkader, Mohammad Mahdi Jaghoori, Mostapha al Mourabit, Jordi Huguet, Matthan WA Caan, Antoine HC Kampen, and Sílvia D Olabarriaga. A data-centric neuroscience gateway: design, implementation, and experiences. Concurrency and Computation: Practice and Experience, 27(2): 489–506, 2015.
  • Avanti Shrikumar, Peyton Greenside, Anna Shcherbina, and Anshul Kundaje. Not just a black box: Learning important features through propagating activation differences. arXiv preprint arXiv:1605.01713, 2016.
  • Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
  • Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034, 2013.
  • Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9, 2015.
  • Ze Wang, Anna R Childress, Jiongjiong Wang, and John A Detre. Support vector machine learning-based fmri data group analysis. NeuroImage, 36(4):1139–1151, 2007.
  • Jason Yosinski, Jeff Clune, Anh Nguyen, Thomas Fuchs, and Hod Lipson. Understanding neural networks through deep visualization. arXiv preprint arXiv:1506.06579, 2015.
  • Matthew D Zeiler and Rob Fergus. Visualizing and understanding convolutional networks. In Computer Vision – ECCV 2014, pp. 818–833. Springer, 2014.
  • Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2921–2929, 2016.