Constituency Parse Reranking for Morphologically Rich Languages
ACTA POLYTECHNICA HUNGARICA (2015)
Abstract
In this article we introduce a constituent parsing system that can achieve state-of-the-art results on morphologically rich languages. Our system consists of a Probabilistic Context-Free Grammar (PCFG) parsing step and an n-best reranking step. We compare two methods for handling lexical sparsity in the PCFG parser. In the n-best reranking step, a discriminative reranker extracts a large number of features from the n-best parses of the PCFG parser and selects the best tree among them. We introduce three feature templates that extend the standard feature set of rerankers. We propose to extract features from Brown clustering, which is a context-based clustering over words, and we analyze the effect of dependency-based and morphology-based feature templates. The effects of these techniques are evaluated on datasets of eight morphologically rich languages.
Keywords
syntactic parsing, constituent parsing, morphologically rich languages, lexical sparsity, Brown clustering
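The n-best reranking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature extractor, the feature names (`pcfg_logprob`, `label=...`), the weights, and the toy parses are all hypothetical, and a plain linear scoring function is assumed. It only shows the general idea of scoring each candidate tree with features and selecting the highest-scoring one.

```python
from typing import Dict, List, Tuple

# Hypothetical candidate representation: (bracketed tree, PCFG log-probability).
Candidate = Tuple[str, float]


def extract_features(tree: str, logprob: float) -> Dict[str, float]:
    """Toy feature extractor (an assumption, not the paper's feature set):
    the base parser's log-probability plus a count for each bracket label."""
    feats: Dict[str, float] = {"pcfg_logprob": logprob}
    for token in tree.replace(")", " ").split():
        if token.startswith("("):
            key = "label=" + token[1:]
            feats[key] = feats.get(key, 0.0) + 1.0
    return feats


def rerank(candidates: List[Candidate], weights: Dict[str, float]) -> Candidate:
    """Return the candidate with the highest linear score w . f(tree)."""
    def score(candidate: Candidate) -> float:
        feats = extract_features(*candidate)
        return sum(weights.get(name, 0.0) * value for name, value in feats.items())
    return max(candidates, key=score)


# Two made-up parses; the second has the higher PCFG log-probability.
nbest: List[Candidate] = [
    ("(S (NP the dog) (VP barks))", -4.0),
    ("(S (NP the) (NP dog) (VP barks))", -3.5),
]

# Hand-set illustrative weights; a real reranker would learn these from data.
weights = {"pcfg_logprob": 1.0, "label=NP": -1.0}

best = rerank(nbest, weights)
print(best[0])
```

With these illustrative weights the penalty on extra NP nodes outweighs the small log-probability gap, so the reranker selects the first tree even though the base parser ranked the second one higher. Cluster-based, dependency-based, or morphology-based features would enter the same way, as additional keys in the feature dictionary.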