Revisiting a Pain in the Neck: Semantic Phrase Processing Benchmark for Language Models
arXiv (2024)
Abstract
We introduce LexBench, a comprehensive evaluation suite designed to test
language models (LMs) on ten semantic phrase processing tasks. Unlike prior
studies, it is the first to propose a framework that models, from a comparative
perspective, the general semantic phrase (i.e., lexical collocation)
and three fine-grained semantic phrase types: idiomatic expressions, noun
compounds, and verbal constructions. Using LexBench, we assess the
performance of 15 LMs across model architectures and parameter scales in
classification, extraction, and interpretation tasks. Through the experiments,
we first validate the scaling law and find that, as expected, larger models
outperform smaller ones in most tasks. Second, we investigate
further through a scaling analysis of semantic relation categorization and find that
few-shot LMs still lag behind vanilla fine-tuned models on this task. Third,
through human evaluation, we find that the performance of strong models is
comparable to the human level regarding semantic phrase processing. Our
benchmarking findings can serve future research aiming to improve the generic
capability of LMs on semantic phrase comprehension. Our source code and data
are available at https://github.com/jacklanda/LexBench