A Basic Language Resource Kit Implementation for the Igbo NLP Project

ACM Transactions on Asian and Low-Resource Language Information Processing (2018)

Abstract
Igbo, an African language with around 32 million speakers worldwide, is one of the many languages that have few or none of the language processing resources needed for advanced language technology applications. In this article, we describe how we created an initial set of resources for Igbo, including an electronic text corpus, a part-of-speech (POS) tagset, and a POS-tagged subcorpus. We discuss how the texts were gathered and preprocessed and how the POS-tagged corpus was developed. We also discuss some of the problems encountered during corpus and tagset development and the solutions we arrived at.
Keywords
language, NLP, resource, implementation