Interpreting Predictive Probabilities: Model Confidence or Human Label Variation?
Conference of the European Chapter of the Association for Computational Linguistics(2024)
摘要
With the rise of increasingly powerful and user-facing NLP systems, there is
growing interest in assessing whether they have a good representation of
uncertainty by evaluating the quality of their predictive distribution over
outcomes. We identify two main perspectives that drive starkly different
evaluation protocols. The first treats predictive probability as an indication
of model confidence; the second as an indication of human label variation. We
discuss their merits and limitations, and take the position that both are
crucial for trustworthy and fair NLP systems, but that exploiting a single
predictive distribution is limiting. We recommend tools and highlight exciting
directions towards models with disentangled representations of uncertainty
about predictions and uncertainty about human labels.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要