U-Trustworthy Models. Reliability, Competence, and Confidence in Decision-Making
CoRR (2024)
Abstract
With growing concerns regarding bias and discrimination in predictive models,
the AI community has increasingly focused on assessing AI system
trustworthiness. Conventionally, trustworthy AI literature relies on the
probabilistic framework and calibration as prerequisites for trustworthiness.
In this work, we depart from this viewpoint by proposing a novel trust
framework inspired by the philosophy literature on trust. We present a precise
mathematical definition of trustworthiness, termed
𝒰-trustworthiness, specifically tailored for a subset of tasks
aimed at maximizing a utility function. We argue that a model's
𝒰-trustworthiness is contingent upon its ability to maximize Bayes
utility within this task subset. Our first set of results challenges the
probabilistic framework by demonstrating that it can favor less trustworthy
models and thereby yield misleading trustworthiness assessments. Within the
context of 𝒰-trustworthiness, we prove that
properly-ranked models are inherently 𝒰-trustworthy. Furthermore,
we advocate for the adoption of the AUC metric as the preferred measure of
trustworthiness. Backed by both theoretical guarantees and experimental
validation, AUC enables robust evaluation of trustworthiness, thereby improving
model selection and hyperparameter tuning toward more trustworthy outcomes.
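The abstract advocates AUC as the preferred measure of trustworthiness for properly-ranked models. As a minimal illustrative sketch (not the paper's code; the function and model names below are hypothetical), AUC can be computed directly from its pairwise-ranking definition, the probability that a randomly chosen positive instance is scored above a randomly chosen negative one, and then used to choose between candidate models:

```python
def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative
    (the Mann-Whitney formulation, equivalent to the area under the ROC curve)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count pairwise wins; a tie between a positive and a negative counts as 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model selection by AUC: prefer the model whose scores
# rank the positive instances above the negative ones.
labels = [0, 0, 1, 1]
model_a = [0.1, 0.4, 0.35, 0.8]   # ranks positives mostly above negatives
model_b = [0.9, 0.2, 0.1, 0.3]    # misranks a negative above both positives
best_name, best_scores = max([("A", model_a), ("B", model_b)],
                             key=lambda m: auc(m[1], labels))
print(best_name, auc(best_scores, labels))
```

Because AUC depends only on the ordering of scores, not their calibration, this kind of selection rewards properly-ranked models, which is exactly the property the paper ties to 𝒰-trustworthiness.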