Proceedings of the Workshop on Parsing German

PaGe '08: Proceedings of the Workshop on Parsing German (2008)

Abstract
Welcome to the ACL Workshop on Parsing German, the first of what we hope will be a long and fruitful series of workshops on this topic. German possesses an interesting set of configurational properties on the syntactic level which make it far less flexible with respect to word order than other free word order languages. Analyses of these properties, which have formed a part of the traditional syntax of German since the early 19th century, only re-entered the mainstream of generative linguistics research within the last twenty years or so. In computational linguistics, however, their realization has varied quite widely: "topological fields" in HPSG-style analyses, multiple parse trees, special constraints on liberation in constraint-based dependency-style analyses, various hybrid "deep/shallow" approaches, and agnostic parameter estimation over graphs. This variation can also acutely be felt in the annotation of German treebanks. Many corpora have historically elected to annotate only a few of the different senses of the term "constituent" inherent to German syntax, resulting in standards that make German appear either more like English or more like Czech. The aim of this workshop was to provide a forum for theoretical discussion as well as a shared task, based on the TIGER and TueBa-D/Z German treebanks, for these various approaches to make their case on empirical grounds. This combination we believe to be essential to balancing the considerations of what structure merits learning versus the ease with which it can be learned. Both treebanks are annotated collections of German newspaper text on similar topics. They are annotated with POS, morphology, phrase structure, and grammatical functions. TueBa-D/Z additionally uses topological fields to describe fundamental word order restrictions in German clauses. 
The treebanks differ significantly in their annotation schemes, however: while TIGER relies on crossing branches to describe long-distance relationships, TueBa-D/Z uses pure tree structures with designated labels for long-distance relationships. Additionally, the annotation in TIGER is flat on the phrasal level, while TueBa-D/Z annotates phrasal structure more hierarchically. A report on the results of this year's shared task can be found in the final paper of these proceedings.
Keywords
TueBa-D/Z German treebanks, topological field, shared task, long distance relationship, German newspaper text, annotation scheme, free word order language, German clause, German treebanks, German syntax