#hardtoparse: POS Tagging and Parsing the Twitterverse

Jennifer Foster, Özlem Çetinoglu, Joachim Wagner, Joseph Le Roux, Stephen Hogan, Joakim Nivre, Deirdre Hogan, Josef van Genabith

AAAIWS'11-05: Proceedings of the 5th AAAI Conference on Analyzing Microtext (2011)


Abstract

We evaluate the statistical dependency parser, Malt, on a new dataset of sentences taken from tweets. We use a version of Malt which is trained on gold standard phrase structure Wall Street Journal (WSJ) trees converted to Stanford labelled dependencies. We observe a drastic drop in performance moving from our in-domain WSJ test set to the new Twitter dataset, much of which has to do with the propagation of part-of-speech tagging errors. Retraining Malt on dependency trees produced by a state-of-the-art phrase structure parser, which has itself been self-trained on Twitter material, results in a significant improvement. We analyse this improvement by examining in detail the effect of the retraining on individual dependency types.
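
The per-dependency-type analysis mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the evaluation metrics, so the use of unlabelled and labelled attachment scores and per-label recall here is an assumption; the function names and the toy three-token sentence are hypothetical and are not taken from the paper or from Malt.

```python
from collections import Counter


def attachment_scores(gold, pred):
    """Return (UAS, LAS) for one analysis.

    gold, pred: lists of (head_index, dependency_label), one entry per token.
    These metrics are an assumption; the abstract does not specify them.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty analysis")
    n = len(gold)
    # Unlabelled: only the head has to match.
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # Labelled: head and dependency label both have to match.
    las_hits = sum(g == p for g, p in zip(gold, pred))
    return uas_hits / n, las_hits / n


def per_type_recall(gold, pred):
    """Fraction of gold dependencies of each type recovered with the correct head and label."""
    totals = Counter(label for _, label in gold)
    hits = Counter(g[1] for g, p in zip(gold, pred) if g == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical toy sentence "I love parsing" (head index 0 is the root).
gold_toy = [(2, "nsubj"), (0, "root"), (2, "dobj")]
pred_toy = [(2, "nsubj"), (0, "root"), (2, "nsubj")]  # correct head, wrong label on token 3

uas, las = attachment_scores(gold_toy, pred_toy)
recall = per_type_recall(gold_toy, pred_toy)
```

Comparing such per-label figures before and after retraining is one way to see which dependency types account for an overall change in score.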
