Self-Assessment Tests are Unreliable Measures of LLM Personality
CoRR(2023)
摘要
As large language models (LLM) evolve in their capabilities, various recent
studies have tried to quantify their behavior using psychological tools created
to study human behavior. One such example is the measurement of "personality"
of LLMs using self-assessment personality tests developed to measure human
personality. Yet almost none of these works verify the applicability of these
tests on LLMs. In this paper, we analyze the reliability of LLM personality
scores obtained from self-assessment personality tests using two simple
experiments. We first introduce the property of prompt sensitivity, where three
semantically equivalent prompts representing three intuitive ways of
administering self-assessment tests on LLMs are used to measure the personality
of the same LLM. We find that all three prompts lead to very different
personality scores, a difference that is statistically significant for all
traits in a large majority of scenarios. We then introduce the property of
option-order symmetry for personality measurement of LLMs. Since most of the
self-assessment tests exist in the form of multiple choice question (MCQ)
questions, we argue that the scores should also be robust to not just the
prompt template but also the order in which the options are presented. This
test unsurprisingly reveals that the self-assessment test scores are not robust
to the order of the options. These simple tests, done on ChatGPT and three
Llama2 models of different sizes, show that self-assessment personality tests
created for humans are unreliable measures of personality in LLMs.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要