Evaluating automation strategies in language documentation

HLT '09: Proceedings of the NAACL HLT 2009 Workshop on Active Learning for Natural Language Processing (2009)

Abstract
This paper presents pilot work integrating machine labeling and active learning with human annotation of data for the language documentation task of creating interlinearized glossed text (IGT) for the Mayan language Uspanteko. The practical goal is to produce a fully annotated corpus that is as accurate as possible given limited time for manual annotation. We describe ongoing pilot studies that examine the influence of three main factors on reducing the time spent annotating IGT: suggestions from a machine labeler, sample selection methods, and annotator expertise.
Keywords
Mayan language, human annotation, language documentation task, limited time, machine labeler, manual annotation, ongoing pilot study, pilot work, active learning, annotated corpus, automation strategy