The Curious Case of Hallucinatory Unanswerability: Finding Truths in the Hidden States of Over-Confident Large Language Models

CoRR (2023)

Abstract
Large language models (LLMs) have been shown to possess impressive capabilities, while also raising crucial concerns about the faithfulness of their responses. A primary issue arising in this context is the management of unanswerable queries by LLMs, which often results in hallucinatory behavior due to overconfidence. In this paper, we explore the behavior of LLMs when presented with unanswerable queries. We ask: do models represent the fact that the question is unanswerable when generating a hallucinatory answer? Our results show strong indications that such models encode the answerability of an input query, with the representation of the first decoded token often being a strong indicator. These findings shed new light on the spatial organization within the latent representations of LLMs, unveiling previously unexplored facets of these models. Moreover, they pave the way for the development of improved decoding techniques with better adherence to factual generation, particularly in scenarios where query unanswerability is a concern.
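The abstract describes reading a signal of query answerability out of the model's hidden states, with the representation feeding the first decoded token serving as a strong indicator. Below is a minimal sketch of such a probe, assuming a HuggingFace causal LM and a logistic-regression classifier; the model name "gpt2", the helper names, and the data format are illustrative assumptions, not the paper's exact setup.

```python
# Sketch: probe the hidden state at the position of the first decoded token
# to predict whether the input query is answerable. Model choice, prompt
# format, and the logistic-regression probe are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

model_name = "gpt2"  # placeholder; the paper studies larger instruction-tuned LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
model.eval()

def first_token_hidden_state(prompt: str) -> torch.Tensor:
    """Last-layer hidden state at the final input position, i.e. the
    representation used to generate the first output token."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.hidden_states[-1][0, -1, :]

def train_answerability_probe(queries, labels):
    """Fit a linear probe on first-token representations.
    queries: list of prompt strings; labels: 1 = answerable, 0 = unanswerable
    (hypothetical labeled data)."""
    feats = torch.stack([first_token_hidden_state(q) for q in queries]).numpy()
    return LogisticRegression(max_iter=1000).fit(feats, labels)
```

A probe of this kind only tests whether answerability is linearly decodable from the representation; it does not by itself change the model's decoding behavior.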
Keywords
hallucinatory unanswerability, large language models, language models, large language, over-confident