AntEval: Evaluation of Social Interaction Competencies in LLM-Driven Agents
CoRR (2024)
Abstract
Large Language Models (LLMs) have demonstrated their ability to replicate
human behaviors across a wide range of scenarios. However, their capability in
handling complex, multi-character social interactions has yet to be fully
explored, primarily due to the absence of robust, quantitative evaluation
methods. This gap has slowed the development of agents proficient in more
nuanced interactions beyond simple exchanges, for example, small talk. To
address this challenge, we introduce the Multi-Agent Interaction Evaluation
Framework (AntEval), encompassing a novel interaction framework and evaluation
methods. The interaction framework aims to foster a complex interaction
environment that bolsters information exchange and intention expression within
social interactions. Furthermore, we introduce evaluation methods, including
two metrics: Information Exchanging Precision (IEP) and Interaction
Expressiveness Gap (IEG), designed for the quantitative and objective
assessment of agents' interaction competencies. Our findings highlight the
utility of these evaluation methods and reveal significant potential for
improving LLMs' ability to drive agents that interact with human-like
naturalness and intricacy.
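The abstract does not specify how Information Exchanging Precision (IEP) is computed. As an illustrative sketch only, a precision-style metric over the information items an agent conveys might look like the following; the function name and set-matching logic are assumptions, not the paper's actual definition:

```python
def information_exchange_precision(conveyed, reference):
    """Hypothetical precision-style score: fraction of information items
    an agent conveyed that also appear in the reference (ground-truth) set.
    This is an illustrative sketch, not the paper's IEP definition."""
    if not conveyed:
        return 0.0  # no items conveyed: define precision as zero
    reference_set = set(reference)
    matched = sum(1 for item in conveyed if item in reference_set)
    return matched / len(conveyed)


# Example: agent conveyed two items, one of which is in the reference set.
score = information_exchange_precision(["name", "hobby"], ["name", "age"])
```

Any real implementation would need the paper's actual notion of an "information item" and its matching criterion, which the abstract leaves unspecified.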