Investigating grammatical abstraction in language models using few-shot learning of novel noun gender
Conference of the European Chapter of the Association for Computational Linguistics (2024)
Abstract
Humans can learn a new word and infer its grammatical properties from very
few examples. They have an abstract notion of linguistic properties like
grammatical gender and agreement rules that can be applied to novel syntactic
contexts and words. Drawing inspiration from psycholinguistics, we conduct a
noun learning experiment to assess whether an LSTM and a decoder-only
transformer can achieve human-like abstraction of grammatical gender in French.
Language models were tasked with learning the gender of a novel noun embedding
from a few examples in one grammatical agreement context and predicting
agreement in another, unseen context. We find that both language models
effectively generalise novel noun gender from one to two learning examples and
apply the learnt gender across agreement contexts, albeit with a bias for the
masculine gender category. Importantly, the few-shot updates were only applied
to the embedding layers, demonstrating that models encode sufficient gender
information within the word embedding space. While the generalisation behaviour
of models suggests that they represent grammatical gender as an abstract
category, like humans, further work is needed to explore the details of how
exactly this is implemented. For a comparative perspective with human
behaviour, we conducted an analogous one-shot novel noun gender learning
experiment, which revealed that native French speakers, like language models,
also exhibited a masculine gender bias and were not excellent one-shot learners
either.
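The abstract notes that few-shot updates were applied only to the embedding layers, so the novel noun's gender must be recoverable from its learnt embedding alone. The following is a minimal sketch of that idea, not the paper's actual setup: a toy frozen model with a new vocabulary slot whose embedding row is the only parameter updated by the "learning examples". All sizes, indices, and the gender-classifier head are illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a language model: an embedding table plus a frozen
# head that scores masculine (0) vs feminine (1) agreement.
# Vocabulary size, dimensionality, and the head are assumptions.
vocab_size, dim = 10, 16
emb = nn.Embedding(vocab_size, dim)
head = nn.Linear(dim, 2)

# Freeze the rest of the model; only embeddings may change.
for p in head.parameters():
    p.requires_grad = False

novel_id = 9                 # slot reserved for the novel noun
target = torch.tensor([1])   # suppose the examples mark it feminine

opt = torch.optim.SGD([emb.weight], lr=0.5)
for _ in range(100):  # a few "learning example" gradient updates
    opt.zero_grad()
    logits = head(emb(torch.tensor([novel_id])))
    loss = nn.functional.cross_entropy(logits, target)
    loss.backward()
    # Zero gradients for every row except the novel noun's, so only
    # its embedding moves (mimicking embedding-only few-shot updates).
    mask = torch.zeros_like(emb.weight)
    mask[novel_id] = 1.0
    emb.weight.grad *= mask
    opt.step()

pred = head(emb(torch.tensor([novel_id]))).argmax().item()
print(pred)
```

After the updates, the frozen head reads the learnt gender off the new embedding, illustrating that the gender information can live entirely in the word embedding space.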