Minority Language Twitter: Part-of-Speech Tagging and Analysis of Irish Tweets
NUT@IJCNLP(2015)
Abstract
Noisy user-generated text poses problems for natural language processing.
In this paper, we show that this statement also holds true for the Irish
language. Irish is regarded as a low-resourced language, with limited
annotated corpora available to NLP researchers and linguists to fully
analyse the linguistic patterns in language use in social media. We
contribute to recent advances in this area of research by reporting on the
development of a part-of-speech annotation scheme and an annotated corpus for
Irish-language tweets. We also report state-of-the-art tagging results from
training and testing three existing POS taggers on our new dataset.