Investigating interoperable event corpora: limitations of reusability of resources and portability of models

LANGUAGE RESOURCES AND EVALUATION(2023)

引用 0|浏览13
暂无评分
摘要
Studies on the applicability of heterogeneous semantically interoperable corpora are rare. We investigate to what extent reusability (both of systems and of annotations) is entailed by corpora whose interoperability is based on compliance to standards. In particular, we look at event detection in English texts, supported by the ISO-TimeML annotation scheme. We run two sets of experiments using a common neural network architecture and extensively evaluate our results on both in-distribution and out-of-distribution settings. In all experimental settings, systems obtain state-of-the-art results on the in-distribution data and underperform out-of-distribution ones, setting limits to the benefits of semantically interoperable corpora. By means of a detailed error analysis, we show that while being compliant to a standard guarantees semantic interoperability, this becomes only a necessary condition for reusability, with factors such as differences in the quality of the annotations having a much stronger impact.
更多
查看译文
关键词
Event detection,Semantic interoperability,Reusability of data,Portability of systems,Standards
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要