Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections.

HLT '11: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1 (2011)

Abstract
We describe a novel approach for inducing unsupervised part-of-speech taggers for languages that have no labeled training data, but have translated text in a resource-rich language. Our method does not assume any knowledge about the target language (in particular no tagging dictionary is assumed), making it applicable to a wide array of resource-poor languages. We use graph-based label propagation for cross-lingual knowledge transfer and use the projected labels as features in an unsupervised model (Berg-Kirkpatrick et al., 2010). Across eight European languages, our approach results in an average absolute improvement of 10.4% over a state-of-the-art baseline, and 16.7% over vanilla hidden Markov models induced with the Expectation Maximization algorithm.
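The abstract names graph-based label propagation but does not spell out the graph construction or the objective. As a rough illustration of the general idea only, the sketch below runs a generic iterative label propagation on a toy graph: labeled source-language vertices keep their tag distributions fixed, and each unlabeled target-language vertex repeatedly takes the weighted average of its neighbors' distributions. The function name `propagate_labels`, the vertex names, the edge weights, and the two-tag set are all hypothetical, and this is not claimed to be the authors' formulation.

```python
from collections import defaultdict


def propagate_labels(edges, seeds, tags, iterations=50):
    """Generic label propagation (illustrative, not the paper's objective).

    edges: list of (vertex_u, vertex_v, weight) for an undirected graph
    seeds: {vertex: {tag: probability}} for labeled vertices (kept fixed)
    tags:  list of tag names
    """
    neighbors = defaultdict(list)
    for u, v, w in edges:
        neighbors[u].append((v, w))
        neighbors[v].append((u, w))

    # Unlabeled vertices start from a uniform distribution over tags.
    uniform = {t: 1.0 / len(tags) for t in tags}
    dist = {v: dict(seeds.get(v, uniform)) for v in neighbors}

    for _ in range(iterations):
        updated = {}
        for v in neighbors:
            if v in seeds:
                # Seed vertices are clamped to their known distribution.
                updated[v] = dict(seeds[v])
                continue
            total = sum(w for _, w in neighbors[v])
            # Weighted average of the neighbors' current distributions.
            updated[v] = {
                t: sum(w * dist[u][t] for u, w in neighbors[v]) / total
                for t in tags
            }
        dist = updated
    return dist


# Toy data (assumed): two labeled English vertices, three unlabeled German ones.
tags = ["NOUN", "VERB"]
seeds = {
    "en:dog": {"NOUN": 1.0, "VERB": 0.0},
    "en:runs": {"NOUN": 0.0, "VERB": 1.0},
}
edges = [
    ("en:dog", "de:Hund", 1.0),    # cross-lingual (alignment-like) edge
    ("en:runs", "de:rennt", 1.0),  # cross-lingual (alignment-like) edge
    ("de:Hund", "de:Katze", 0.8),  # monolingual similarity edge
    ("de:rennt", "de:Katze", 0.1), # weak monolingual similarity edge
]

tagged = propagate_labels(edges, seeds, tags)
best_tag = {v: max(d, key=d.get) for v, d in tagged.items()}
```

In this toy setup, `de:Katze` has no direct cross-lingual edge and receives its distribution only through its monolingual neighbors, which is the kind of transfer a graph makes possible. According to the abstract, the projected labels are then used as features in an unsupervised model rather than as final tags.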
Keywords
European language, approach result, cross-lingual knowledge transfer, novel approach, resource-poor language, resource-rich language, target language, unsupervised model, unsupervised part-of-speech taggers, Expectation Maximization algorithm, Unsupervised part-of-speech, bilingual graph-based projection