A Voyage on Neural Machine Translation for Indic Languages

Procedia Computer Science (2023)

Abstract
With the advent of deep learning, Machine Translation (MT) moved from Statistical Machine Translation (SMT), which dominated the field for a few decades, to Neural Machine Translation (NMT) architectures. NMT has gradually made its way into Indian MT research, with many works covering various language pairs. Numerous NMT architectures circulate in the international and national research communities, and many claim to be state of the art. Although NMT for Indic languages (ILNMT) gives better results for language pairs with large speaker populations, translation quality remains low because of a lack of substantial resources. Automated machine translation models are unavailable for some less widely spoken Indic languages such as Kashmiri and Dogri. Hence, there is increasing research demand to address the challenges of developing usable MT models even when only minuscule training data is available. Based on corpus availability, languages are categorized into High Resource Languages (HRLs), Low Resource Languages (LRLs), and Zero Resource Languages (ZRLs), and many Indic languages fall into each of these categories. This literature survey aims to serve as a collective source of information on the predominant ILNMT architectures, the toolkits available for building NMT models, and the various pre-trained language models needed by researchers contributing to the ILNMT research community. The survey covers ILNMT architectures for different Indic languages, e.g., Hindi and Tamil (HRLs), Kannada and Marathi (LRLs), and Sinhala and Nepali (ZRLs). A few language-specific survey papers on ILNMT exist, and this is one of the first surveys to gather all of this information under one canopy.
Keywords
neural machine translation, Indic languages, voyage