Explaining Naive Bayes and Other Linear Classifiers with Polynomial Time and Delay

NeurIPS 2020.

This paper presents a log-linear algorithm for computing a smallest PI-explanation of linear classifiers.

Abstract:

Recent work proposed the computation of so-called PI-explanations of Naive Bayes Classifiers (NBCs). PI-explanations are subset-minimal sets of feature-value pairs that are sufficient for the prediction, and have been computed with state-of-the-art exact algorithms that are worst-case exponential in time and space. In contrast, we show ...

Introduction
  • Approaches proposed in recent years for computing explanations of Machine Learning (ML) models can be broadly characterized as heuristic or non-heuristic.
  • Another line of work is exemplified by Anchor [25], and targets the computation of a set of feature-value pairs associated with a given instance as a way of explaining the prediction.
Highlights
  • Approaches proposed in recent years for computing explanations of Machine Learning (ML) models can be broadly characterized as heuristic or non-heuristic.
  • We describe a polynomial-delay algorithm for the enumeration of explanations of XLCs.
  • This paper presents a log-linear algorithm for computing a smallest PI-explanation of linear classifiers.
  • The paper shows that PI-explanations for linear classifiers can be enumerated with polynomial delay.
Results
  • One concrete example [21] yields a polynomial-time algorithm for computing a smallest PI-explanation of an XLC.
  • Function OneExplanation(Vs, Flip, ∆, ΦR, Idx, Xpl). Inputs: Vs, the values of the instance being explained; Flip, an array reference of decision steps; ∆, the sorted δj values; ΦR, the explanation threshold; Idx, an index into ∆; Xpl, a set reference of explanation literals. Outputs: the updated ΦR and the updated Idx.
  • In the concrete case of NBCs, if the goal is to compute a single explanation, the algorithm detailed here is exponentially more efficient than earlier work [29].
  • A smallest PI-explanation can be computed in log-linear time by sorting the δi values and picking the first k literals that ensure the prediction (see the greedy sketch after this list).
  • Function EnterValidState(Vs, Flip, ∆, ΦR, Idx, Xpl). Inputs: Vs, the values of the instance being explained; Flip, an array reference of decision steps; ∆, the sorted δj values; ΦR, the explanation threshold; Idx, an index into ∆; Xpl, a set reference of explanation literals. Outputs: the updated ΦR and the updated Idx.
  • Given the definition of the δi constants for real-valued features, and the associated literals in the case of a no-change constraint, explanations can be computed using the restricted knapsack formulation above.
  • The experimental evaluation was divided into three parts: (1) evaluating the raw performance of XPXLC, (2) comparing it with the state-of-the-art compilation approach STEP [29,30], and (3) using complete enumeration of PI-explanations to assess the quality of the explanations produced by the well-known heuristic explainers Anchor [25] and SHAP [15].
  • To compare the relative performance of XPXLC and STEP, the authors apply a one-hot encoding (OHE) to the categorical features, retrain the Naive Bayes classifiers, and run both tools on the OHE instances, targeting the complete enumeration of explanations.
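The greedy, log-linear computation referenced above can be sketched compactly. The following is a minimal illustration, not the paper's XPXLC implementation: it assumes a linear classifier that predicts the positive class iff w·x + b > 0 over box-bounded real-valued features, and the function name, variable names, and threshold-at-zero convention are assumptions made only for this example.

```python
import numpy as np

def smallest_pi_explanation(w, b, v, lb, ub):
    """Greedy log-linear sketch of a smallest PI-explanation for a linear
    classifier that predicts the positive class iff  w . x + b > 0.

    v       : instance being explained (assumed predicted positive)
    lb, ub  : per-feature lower/upper bounds of the feature domains
    Returns the indices of a smallest set of features whose values are
    sufficient to keep the prediction positive.
    """
    w, v, lb, ub = map(np.asarray, (w, v, lb, ub))
    score = float(w @ v + b)
    assert score > 0, "sketch assumes the instance is predicted positive"

    # delta[i]: worst-case drop of the score if feature i is left free,
    # i.e. an adversary moves x_i to the end of its domain that hurts most.
    worst = np.where(w > 0, lb, ub)
    delta = w * v - w * worst            # always >= 0

    # Fix the features with the largest delta first, until the total slack
    # of the remaining free features can no longer flip the prediction.
    order = np.argsort(-delta)           # the only O(n log n) step
    slack = float(delta.sum())
    explanation = []
    for i in order:
        if slack < score:                # free features cannot flip it anymore
            break
        explanation.append(int(i))
        slack -= float(delta[i])
    return explanation

# Toy usage: 3 features in [0, 1], instance v = (1, 1, 0), w = (4, 1, -2), b = -1.
# score(v) = 4 > 0; fixing feature 0 alone already guarantees the prediction.
print(smallest_pi_explanation([4, 1, -2], -1, [1, 1, 0], [0, 0, 0], [1, 1, 1]))  # -> [0]
```

Here δi measures how much the score can drop when feature i is left free; fixing the largest δi first until the remaining slack is below the score margin yields a smallest sufficient set, which matches the sort-then-pick description above.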
Conclusion
  • Exhaustive enumeration provides a distribution of how many times feature-value pairs appear in explanations, and hence which ones are likely to be more relevant for the given prediction (a small counting sketch follows this list).
  • The results in the paper apply to NBCs, and so should be contrasted with earlier work [29], which proposes a worst-case exponential time and space solution for computing PI-explanations of NBCs. A natural line of research is to investigate extensions of XLCs that admit polynomial time algorithms for computing PI-explanations.
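As a small, hedged illustration of turning a complete enumeration into such a distribution (the explanations below are made up, and the helper is not part of the paper's tooling), a frequency count over the enumerated explanations is enough:

```python
from collections import Counter

def explanation_relevance(explanations):
    """Given the complete enumeration of PI-explanations of one prediction
    (each explanation as an iterable of feature indices), return how often
    each feature occurs, most frequent first."""
    counts = Counter(f for xpl in explanations for f in set(xpl))
    return counts.most_common()

# Made-up enumeration for a 4-feature instance: feature 2 occurs in every
# explanation, feature 3 in none.
print(explanation_relevance([{0, 2}, {1, 2}]))  # -> [(2, 2), (0, 1), (1, 1)]
```

Features that occur in every (or most) PI-explanations are, in this reading, the ones the prediction can least do without.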
References
  • 1. S. Anjomshoae, A. Najjar, D. Calvaresi, and K. Främling. Explainable agents and robots: Results from a systematic literature review. In AAMAS, pages 1078–1088, 2019.
  • 2. D. Barber. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2012.
  • 3. D. A. Cohen. Tractable decision for a constraint language implies tractable search. Constraints, 9(3):219–229, 2004.
  • 4. G. B. Dantzig. Discrete-variable extremum problems. Operations Research, 5(2):266–288, 1957.
  • 5. A. Darwiche. Three modern roles for logic in AI. CoRR, abs/2004.08599, 2020.
  • 6. R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. John Wiley & Sons, 1973.
  • 7. N. Friedman, D. Geiger, and M. Goldszmidt. Bayesian network classifiers. Mach. Learn., 29(2-3):131–163, 1997.
  • 8. R. Guidotti, A. Monreale, S. Ruggieri, F. Turini, F. Giannotti, and D. Pedreschi. A survey of methods for explaining black box models. ACM Comput. Surv., 51(5):93:1–93:42, 2019.
  • 9. A. Ignatiev, N. Narodytska, and J. Marques-Silva. Abduction-based explanations for machine learning models. In AAAI, pages 1511–1519, 2019.
  • 10. D. S. Johnson, C. H. Papadimitriou, and M. Yannakakis. On generating all maximal independent sets. Inf. Process. Lett., 27(3):119–123, 1988.
  • 11. Kaggle Machine Learning Community. https://www.kaggle.com/.
  • 12. H. Kellerer, U. Pferschy, and D. Pisinger. Knapsack Problems. Springer, 2004.
  • 13. R. Kohavi. Scaling up the accuracy of naive-Bayes classifiers: A decision-tree hybrid. In KDD, pages 202–207, 1996.
  • 14. E. L. Lawler, J. K. Lenstra, and A. H. G. Rinnooy Kan. Generating all maximal independent sets: NP-hardness and polynomial-time algorithms. SIAM J. Comput., 9(3):558–565, 1980.
  • 15. S. M. Lundberg and S. Lee. A unified approach to interpreting model predictions. In NeurIPS, pages 4765–4774, 2017.
  • 16. C. D. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008.
  • 17. T. Miller. "But why?" Understanding explainable artificial intelligence. ACM Crossroads, 25(3):20–25, 2019.
  • 18. T. Miller. Explanation in artificial intelligence: Insights from the social sciences. Artif. Intell., 267:1–38, 2019.
  • 19. B. D. Mittelstadt, C. Russell, and S. Wachter. Explaining explanations in AI. In FAT, pages 279–288, 2019.
  • 20. S. T. Mueller, R. R. Hoffman, W. J. Clancey, A. Emrey, and G. Klein. Explanation in human-AI systems: A literature meta-review, synopsis of key ideas and publications, and bibliography for explainable AI. CoRR, abs/1902.01876, 2019.
  • 21. C. H. Papadimitriou and K. Steiglitz. Combinatorial Optimization: Algorithms and Complexity. Prentice-Hall, 1982.
  • 22. J. D. Park. Using weighted MAX-SAT engines to solve MPE. In AAAI, pages 682–687, 2002.
  • 23. Penn Machine Learning Benchmarks. https://github.com/EpistasisLab/penn-ml-benchmarks.
  • 24. M. T. Ribeiro, S. Singh, and C. Guestrin. "Why should I trust you?": Explaining the predictions of any classifier. In KDD, pages 1135–1144, 2016.
  • 25. M. T. Ribeiro, S. Singh, and C. Guestrin. Anchors: High-precision model-agnostic explanations. In AAAI, pages 1527–1535, 2018.
  • 26. W. Samek, G. Montavon, A. Vedaldi, L. K. Hansen, and K. Müller, editors. Explainable AI: Interpreting, Explaining and Visualizing Deep Learning. Springer, 2019.
  • 27. W. Samek and K. Müller. Towards explainable artificial intelligence. In Samek et al. [26], pages 5–22.
  • 28. scikit-learn: Machine Learning in Python. https://scikit-learn.org/.
  • 29. A. Shih, A. Choi, and A. Darwiche. A symbolic approach to explaining Bayesian network classifiers. In IJCAI, pages 5103–5111, 2018.
  • 30. A. Shih, A. Choi, and A. Darwiche. Compiling Bayesian network classifiers into decision graphs. In AAAI, pages 7966–7974, 2019.
  • 31. UCI Machine Learning Repository. https://archive.ics.uci.edu/ml.
  • 32. Automated Reasoning Group, UCLA. http://reasoning.cs.ucla.edu/xai/.
  • 33. F. Xu, H. Uszkoreit, Y. Du, W. Fan, D. Zhao, and J. Zhu. Explainable AI: A brief survey on history, research areas, approaches and challenges. In NLPCC, pages 563–574, 2019.