Multi-Task Projected Embedding For Igbo

TEXT, SPEECH, AND DIALOGUE (TSD 2018)(2018)

引用 1|浏览28
暂无评分
摘要
NLP research on low resource African languages is often impeded by the unavailability of basic resources: tools, techniques, annotated corpora, and datasets. Besides the lack of funding for the manual development of these resources, building from scratch will amount to the reinvention of the wheel. Therefore, adapting existing techniques and models from well-resourced languages is often an attractive option. One of the most generally applied NLP models is word embeddings. Embedding models often require large amounts of data to train which are not available for most African languages. In this work, we adopt an alignment based projection method to transfer trained English embeddings to the Igbo language. Various English embedding models were projected and evaluated on the odd-word, analogy and word-similarity tasks intrinsically, and also on the diacritic restoration task. Our results show that the projected embeddings performed very well across these tasks.
更多
查看译文
关键词
Low-resource, Igbo, Diacritics, Embedding models, Transfer learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要