Cross-lingual Transfer Learning with Data Selection for Large-Scale Spoken Language Understanding

EMNLP/IJCNLP (1), 2019

Abstract
A typical cross-lingual transfer learning approach for boosting model performance on a resource-poor language is to pre-train the model on all available supervised data from another, resource-rich language. However, in large-scale systems this leads to long training times and high computational requirements. In addition, characteristic differences between the source and target languages raise the natural question of whether source-language data selection can improve knowledge transfer. In this paper, we address this question and propose a simple but effective language-model-based source-language data selection method for cross-lingual transfer learning in large-scale spoken language understanding. The experimental results show that with data selection i) the amount of source data, and hence the training time, is reduced significantly, and ii) model performance is improved.
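The abstract describes the selection criterion only at a high level. A common instantiation of LM-based data selection is to train a language model on text resembling the target data and keep only the source sentences it scores most highly. The sketch below illustrates this idea; the add-one-smoothed bigram LM, the function names, and the keep_ratio parameter are illustrative assumptions, not the paper's published implementation.

```python
# Hypothetical sketch of LM-based source-language data selection.
# Source sentences are scored by a simple bigram LM trained on
# target-like text; only the highest-scoring fraction is kept
# for cross-lingual pre-training.
import math
from collections import Counter

def train_bigram_lm(sentences):
    """Collect unigram/bigram counts for an add-one-smoothed bigram LM."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def avg_log_prob(sentence, unigrams, bigrams):
    """Length-normalized log-probability, so long sentences are not penalized."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    vocab = len(unigrams)
    lp = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Add-one smoothing keeps unseen bigrams from scoring -inf.
        lp += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab))
    return lp / (len(tokens) - 1)

def select_source_data(source_sentences, selection_lm_corpus, keep_ratio=0.5):
    """Keep the keep_ratio fraction of source sentences the LM scores highest."""
    unigrams, bigrams = train_bigram_lm(selection_lm_corpus)
    ranked = sorted(source_sentences,
                    key=lambda s: avg_log_prob(s, unigrams, bigrams),
                    reverse=True)
    return ranked[:max(1, int(len(ranked) * keep_ratio))]
```

Under this reading, keep_ratio is the single knob trading off pre-training cost against coverage: a smaller retained subset shortens training directly, and the abstract's results suggest that a well-chosen, more target-like subset can transfer better than the full source corpus.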