Improving English-Assamese Neural Machine Translation Using Transliteration-Based Approach

Smart Innovation, Systems and Technologies (2023)

Abstract
Natural language translation is a well-defined task in language technology that narrows the communication gap among people of diverse linguistic backgrounds. Although neural machine translation (NMT) attains remarkable translation performance, it requires an adequate amount of training data, which is difficult to obtain for low-resource language pairs. NMT also handles the rare-word problem, i.e., the translation of low-frequency words, at the subword level, but it remains weak on highly inflected languages. In this work, we explore NMT on the low-resource English-Assamese language pair with a proposed transliteration approach in the data preprocessing step. In this approach, the source language is transliterated into the target-language script, which yields a smaller subword vocabulary shared by the source and target languages. Moreover, embeddings pre-trained on monolingual data of the transliterated source and the target languages are used in the training process. With our approach, NMT significantly improves translation performance for both English-to-Assamese and Assamese-to-English translation and obtains state-of-the-art results.
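The vocabulary effect described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the character table `TOY_MAP`, the helper names, and the one-line sample corpus are all hypothetical, and a real transliterator would be far more elaborate than a character-by-character lookup. The sketch only shows that once the source side is written in the target script, the joint symbol inventory from which subwords are built becomes smaller.

```python
# Toy illustration only: TOY_MAP is a hypothetical character table,
# not the transliteration scheme used in the paper.
TOY_MAP = {"k": "ক", "m": "ম", "l": "ল", "n": "ন", "a": "া"}


def transliterate(text, table):
    """Replace each character found in the table; keep all others unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


def char_vocab(sentences):
    """Collect the set of non-whitespace characters used in the sentences."""
    return {ch for s in sentences for ch in s if not ch.isspace()}


# Illustrative one-line "corpora" (assumed data, not taken from the paper).
source = ["kamal nam"]
target = ["কমল নাম"]

# Joint character inventory when the two sides use different scripts.
joint_before = char_vocab(source) | char_vocab(target)

# Joint character inventory after writing the source in the target script.
source_translit = [transliterate(s, TOY_MAP) for s in source]
joint_after = char_vocab(source_translit) | char_vocab(target)

print(len(joint_before), len(joint_after))
```

In this toy setting the joint inventory halves because every transliterated source character already occurs on the target side. How much a real subword vocabulary shrinks depends on the corpus and on the segmentation method, neither of which is specified here.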
Keywords
machine translation, English-Assamese, transliteration-based