HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent
CoRR(2024)
摘要
Recent advancements in Large Language Models (LLMs) have been reshaping
Natural Language Processing (NLP) task in several domains. Their use in the
field of Human Resources (HR) has still room for expansions and could be
beneficial for several time consuming tasks. Examples such as time-off
submissions, medical claims filing, and access requests are noteworthy, but
they are by no means the sole instances. However, the aforementioned
developments must grapple with the pivotal challenge of constructing a
high-quality training dataset. On one hand, most conversation datasets are
solving problems for customers not employees. On the other hand, gathering
conversations with HR could raise privacy concerns. To solve it, we introduce
HR-Multiwoz, a fully-labeled dataset of 550 conversations spanning 10 HR
domains to evaluate LLM Agent. Our work has the following contributions: (1) It
is the first labeled open-sourced conversation dataset in the HR domain for NLP
research. (2) It provides a detailed recipe for the data generation procedure
along with data analysis and human evaluations. The data generation pipeline is
transferable and can be easily adapted for labeled conversation data generation
in other domains. (3) The proposed data-collection pipeline is mostly based on
LLMs with minimal human involvement for annotation, which is time and
cost-efficient.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要