SHAP@k: Efficient and Probably Approximately Correct (PAC) Identification of Top-K Features

Sanjay Kariyappa, Leonidas Tsepenekas, Freddy Lécué, Daniele Magazzeni

AAAI 2024 (2024)

Abstract

The SHAP framework provides a principled method to explain the predictions of a model by computing feature importance. Motivated by applications in finance, we introduce the Top-k Identification Problem (TkIP) and its ordered variant (TkIP-O), where the objective is to identify the subset (or, for TkIP-O, the ordered subset) of k features corresponding to the highest SHAP values with PAC guarantees. While any sampling-based method that estimates SHAP values (such as KernelSHAP and SamplingSHAP) can be trivially adapted to solve TkIP, doing so is highly sample-inefficient. Instead, we leverage the connection between SHAP values and multi-armed bandits (MAB) to show that both TkIP and TkIP-O can be reduced to variants of problems in the MAB literature. This reduction allows us to use insights from the MAB literature to develop sample-efficient variants of KernelSHAP and SamplingSHAP. We propose KernelSHAP@k and SamplingSHAP@k for solving TkIP, along with KernelSHAP-O and SamplingSHAP-O for solving the ordering problem in TkIP-O. We perform extensive experiments using several credit-related datasets to show that our methods offer significant improvements of up to 40× in sample efficiency and 39× in runtime.
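
To make the top-k idea concrete, the following is a minimal sketch and not the paper's KernelSHAP@k or SamplingSHAP@k. It assumes a generic permutation-sampling SHAP estimator and a simple stopping rule: keep sampling until the confidence intervals of the k features with the highest estimated SHAP values no longer overlap those of the remaining features (up to a slack `eps`). The function names, the normal-approximation interval with `z_score`, the batch size, and the toy model are all illustrative assumptions, and the sketch makes no claim to reproduce the PAC guarantees described in the abstract.

```python
import math
import random
from statistics import mean, stdev


def marginal_contributions(f, x, baseline, rng):
    """Draw one random feature ordering and return each feature's marginal contribution."""
    d = len(x)
    order = list(range(d))
    rng.shuffle(order)
    z = list(baseline)          # start from the baseline input
    prev = f(z)
    contrib = [0.0] * d
    for i in order:
        z[i] = x[i]             # switch feature i from baseline to its actual value
        cur = f(z)
        contrib[i] = cur - prev
        prev = cur
    return contrib


def top_k_features(f, x, baseline, k, eps=0.0, z_score=3.0,
                   batch=20, max_samples=5000, seed=0):
    """Sample until the top-k set is separated from the rest (hypothetical stopping rule)."""
    rng = random.Random(seed)
    d = len(x)
    samples = [[] for _ in range(d)]
    n = 0
    top = []
    while n < max_samples:
        for _ in range(batch):
            c = marginal_contributions(f, x, baseline, rng)
            for i in range(d):
                samples[i].append(c[i])
        n += batch
        means = [mean(s) for s in samples]
        # Half-width of an approximate (CLT-based) confidence interval per feature.
        half = [z_score * stdev(s) / math.sqrt(len(s)) for s in samples]
        ranked = sorted(range(d), key=lambda i: means[i], reverse=True)
        top, rest = ranked[:k], ranked[k:]
        if not rest:
            break
        lowest_top = min(means[i] - half[i] for i in top)
        highest_rest = max(means[i] + half[i] for i in rest)
        if lowest_top >= highest_rest - eps:
            break               # top-k set is separated; stop sampling early
    return set(top), n


# Toy model (an assumption for illustration): linear terms plus one interaction.
def toy_model(z):
    return 5 * z[0] + 3 * z[1] + 1 * z[2] + 0.5 * z[0] * z[1]


top, n_used = top_k_features(toy_model, [1, 1, 1, 1], [0, 0, 0, 0], k=2)
print(top, n_used)
```

The point of the sketch is that sampling stops as soon as the top-k set is resolved, rather than continuing until every individual SHAP estimate is accurate. In the toy model the two leading features are well separated from the others, so the loop exits after the first batch.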

Keywords

ML: Transparent, Interpretable, Explainable ML
