Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation
arXiv (2024)
Abstract
Large-scale multilingual Pretrained Language Models (mPLMs) yield impressive performance on cross-lingual tasks, yet significant performance disparities exist across different languages within the same mPLM. Previous studies endeavored to narrow these disparities by supervised fine-tuning of the mPLMs with multilingual data. However, obtaining labeled multilingual data is time-consuming, and fine-tuning an mPLM with limited labeled multilingual data merely encapsulates the knowledge specific to the labeled data. Therefore, we introduce ALSACE, which leverages the knowledge learned by the well-performing languages to guide under-performing ones within the same mPLM, eliminating the need for additional labeled multilingual data. Experiments show that ALSACE effectively mitigates language-level performance disparity across various mPLMs while achieving competitive performance on different multilingual NLU tasks, ranging from full-resource to limited-resource settings. The code for our approach is available at https://github.com/pkunlp-icler/ALSACE.
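The sketch below illustrates the cross-lingual self-distillation idea the abstract describes: on unlabeled parallel text, the same mPLM's predictions for a well-performing (teacher) language are used as soft targets for an under-performing (student) language. The model name, the language pair, the choice of English as teacher, and the plain KL-divergence loss are illustrative assumptions, not the ALSACE paper's exact teacher-selection procedure or training recipe.

```python
# Minimal sketch of cross-lingual self-distillation on one unlabeled parallel pair.
# Hypothetical details: the checkpoint, the en->sw pair, and the unweighted KL loss
# are assumptions for illustration; ALSACE's actual teacher selection and loss
# formulation are described in the paper.
import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "xlm-roberta-base"  # any mPLM with a task head would do
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Unlabeled parallel inputs: teacher-language (English) and student-language (Swahili).
teacher_inputs = tokenizer("The movie was surprisingly good.", return_tensors="pt")
student_inputs = tokenizer("Filamu ilikuwa nzuri kwa kushangaza.", return_tensors="pt")

model.train()
# Teacher predictions come from the same model (self-distillation),
# so no gradients flow through the teacher-language forward pass.
with torch.no_grad():
    teacher_logits = model(**teacher_inputs).logits

student_logits = model(**student_inputs).logits

# KL divergence pulls the student-language output distribution toward the
# teacher-language distribution, transferring knowledge without any labels.
loss = F.kl_div(
    F.log_softmax(student_logits, dim=-1),
    F.softmax(teacher_logits, dim=-1),
    reduction="batchmean",
)
loss.backward()
optimizer.step()
```

In practice this step would be repeated over a corpus of parallel or translated sentences, with the teacher language chosen per task according to the paper's teacher language selection criterion.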