OpenPI2.0: An Improved Dataset for Entity Tracking in Texts
arXiv (Cornell University) (2023)
Abstract
Much text describes a changing world (e.g., procedures, stories, newswires),
and understanding such text requires tracking how entities change. An earlier
dataset, OpenPI, provided crowdsourced annotations of entity state changes in
text. However, a major limitation was that those annotations were free-form and
did not identify salient changes, hampering model evaluation. To overcome these
limitations, we present an improved dataset, OpenPI2.0, where entities and
attributes are fully canonicalized and additional entity salience annotations
are added. In our fairer evaluation setting, we find that current
state-of-the-art language models are far from competent. We also show that
using state changes of salient entities as a chain-of-thought prompt improves
downstream performance on tasks such as question answering and classical
planning, outperforming the setting that involves all related entities
indiscriminately. We offer OpenPI2.0 for the continued development of models
that can understand the dynamics of entities in text.
Keywords
entity tracking, texts, improved dataset