Fine-Tuning Large Language Models to Translate: Will a Touch of Noisy Data in Misaligned Languages Suffice?
arXiv (2024)
Abstract
Traditionally, success in multilingual machine translation can be attributed
to three key factors in training data: large volume, diverse translation
directions, and high quality. In the current practice of fine-tuning large
language models (LLMs) for translation, we revisit the importance of all these
factors. We find that LLMs display strong translation capability after being
fine-tuned on as few as 32 training instances, and that fine-tuning on a single
translation direction effectively enables LLMs to translate in multiple
directions. However, the choice of direction is critical: fine-tuning LLMs with
English on the target side can lead to task misinterpretation, which hinders
translations into non-English languages. A similar problem arises when noise is
introduced into the target side of parallel data, especially when the target
language is well-represented in the LLM's pre-training. In contrast, noise in
an under-represented language has a less pronounced effect. Our findings
suggest that attaining successful alignment hinges on teaching the model to
maintain a "superficial" focus, thereby avoiding the learning of erroneous
biases beyond translation.