Modular Adaptation of Multilingual Encoders to Written Swiss German Dialect
CoRR(2024)
摘要
Creating neural text encoders for written Swiss German is challenging due to
a dearth of training data combined with dialectal variation. In this paper, we
build on several existing multilingual encoders and adapt them to Swiss German
using continued pre-training. Evaluation on three diverse downstream tasks
shows that simply adding a Swiss German adapter to a modular encoder achieves
97.5
task of retrieving Swiss German sentences given Standard German queries,
adapting a character-level model is more effective than the other adaptation
strategies. We release our code and the models trained for our experiments at
https://github.com/ZurichNLP/swiss-german-text-encoders
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要