Carpe Diem: on the Evaluation of World Knowledge in Lifelong Language Models
NAACL-HLT (2024)
Abstract
The dynamic nature of knowledge in an ever-changing world presents challenges for language models trained on static data; a model deployed in the real world often needs not only to acquire new knowledge but also to overwrite outdated information with updated facts. To study the ability of language models to handle these time-dependent dynamics in human language, we introduce a novel task, EvolvingQA, a temporally evolving question-answering benchmark designed for training and evaluating LMs on an evolving Wikipedia database. The construction of EvolvingQA is automated with our pipeline using large language models. We uncover that existing continual learning baselines struggle to update and remove outdated knowledge. Our analysis suggests that models fail to rectify knowledge due to small weight gradients. In addition, we show that language models particularly struggle to reflect changes in numerical or temporal information. Our work aims to model the dynamic nature of real-world information, suggesting faithful evaluation of the evolution-adaptability of language models.