Editing Personality for Large Language Models
arxiv(2023)
摘要
This paper introduces an innovative task focused on editing the personality
traits of Large Language Models (LLMs). This task seeks to adjust the models'
responses to opinion-related questions on specified topics since an
individual's personality often manifests in the form of their expressed
opinions, thereby showcasing different personality traits. Specifically, we
construct a new benchmark dataset PersonalityEdit to address this task. Drawing
on the theory in Social Psychology, we isolate three representative traits,
namely Neuroticism, Extraversion, and Agreeableness, as the foundation for our
benchmark. We then gather data using GPT-4, generating responses that not only
align with a specified topic but also embody the targeted personality trait. We
conduct comprehensive experiments involving various baselines and discuss the
representation of personality behavior in LLMs. Our intriguing findings uncover
potential challenges of the proposed task, illustrating several remaining
issues. We anticipate that our work can provide the NLP community with
insights. Code and datasets are available at
https://github.com/zjunlp/EasyEdit.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要