A Bayesian Mixture Model for PoS Induction Using Multiple Features.

Christos Christodoulopoulos, Sharon Goldwater, Mark Steedman

Empirical Methods in Natural Language Processing (2011)

Abstract

In this paper we present a fully unsupervised syntactic class induction system formulated as a Bayesian multinomial mixture model, where each word type is constrained to belong to a single class. By using a mixture model rather than a sequence model (e.g., an HMM), we are able to easily add multiple kinds of features, including those at both the type level (morphology features) and the token level (context and alignment features, the latter from parallel corpora). Using only context features, our system yields results comparable to the state of the art and far better than a similar model without the one-class-per-type constraint. Using the additional features provides added benefit, and our final system outperforms the best published results on most of the 25 corpora tested.

Keywords

PoS induction, Bayesian mixture model, multiple features
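
To make the one-class-per-type idea in the abstract concrete, the following is a minimal sketch of a type-level multinomial mixture. It is not the authors' implementation. It assumes symmetric Dirichlet priors (the hypothetical hyperparameters `alpha` and `beta`), a single kind of feature (context counts), and collapsed Gibbs sampling as the inference method; the abstract specifies none of these. All function and variable names are illustrative.

```python
import math
import random


def log_dm_predictive(class_counts, class_total, feats, beta, n_features):
    """Log posterior-predictive of one type's feature counts under a class,
    using a Dirichlet-multinomial with symmetric prior `beta`.
    The multinomial coefficient is omitted because it is the same for
    every class and cancels when the scores are normalised."""
    lp = 0.0
    n_new = 0
    for f, c in feats.items():
        base = class_counts.get(f, 0) + beta
        lp += math.lgamma(base + c) - math.lgamma(base)
        n_new += c
    denom = class_total + beta * n_features
    lp -= math.lgamma(denom + n_new) - math.lgamma(denom)
    return lp


def gibbs_type_clusters(type_feats, n_classes, n_features,
                        alpha=1.0, beta=0.1, iters=50, seed=0):
    """Assign every word type to exactly one class (assumed sampler).
    `type_feats` maps a word type to a dict of feature -> count."""
    rng = random.Random(seed)
    types = sorted(type_feats)
    z = {w: rng.randrange(n_classes) for w in types}
    size = [0] * n_classes                    # number of types per class
    counts = [dict() for _ in range(n_classes)]  # feature counts per class
    totals = [0] * n_classes                  # total feature count per class

    def update(w, k, sign):
        # Add (sign=+1) or remove (sign=-1) all of w's counts from class k.
        size[k] += sign
        for f, c in type_feats[w].items():
            counts[k][f] = counts[k].get(f, 0) + sign * c
            totals[k] += sign * c

    for w in types:
        update(w, z[w], +1)

    for _ in range(iters):
        for w in types:
            update(w, z[w], -1)
            logp = [
                math.log(size[k] + alpha)
                + log_dm_predictive(counts[k], totals[k],
                                    type_feats[w], beta, n_features)
                for k in range(n_classes)
            ]
            m = max(logp)
            weights = [math.exp(lp - m) for lp in logp]
            z[w] = rng.choices(range(n_classes), weights=weights)[0]
            update(w, z[w], +1)
    return z


# Toy data (invented for illustration): counts of neighbouring words.
toy = {
    "the": {"R:dog": 3, "R:cat": 2},
    "a":   {"R:dog": 2, "R:cat": 3},
    "dog": {"L:the": 3, "L:a": 2},
    "cat": {"L:the": 2, "L:a": 3},
}
assignment = gibbs_type_clusters(toy, n_classes=2, n_features=4)
```

Because the class variable is attached to the word type rather than to each token, all of a type's feature counts move together when it is resampled. Under the same assumptions, further feature kinds (for example morphology) could be added as extra terms in the per-class score, which is the flexibility the abstract attributes to using a mixture model instead of a sequence model.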
