
Accurate Unlexicalized Parsing

Dan Klein and Christopher D. Manning

ACL 2003, pages 423–430


Abstract

We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down false independence assumptions latent in a vanilla treebank grammar. Indeed, its performance of 86.36% (LP/LR F1) is better than that of early lexicalized PCFG models, and surprisingly close to the current state of the art.

Introduction
  • Several results have brought into question how large a role lexicalization plays in such parsers. Johnson (1998) showed that the performance of an unlexicalized PCFG over the Penn treebank could be improved enormously by annotating each node with its parent category (a minimal sketch of this transform follows this list).
  • Because no such strong baseline had been provided, the community tended to greatly overestimate the beneficial effect of lexicalization in probabilistic parsing, rather than looking critically at where lexicalized probabilities are both needed to make the right decision and available in the training data.
  • This result affirms the value of linguistic analysis for feature discovery.
  • The authors see this investigation as only one part of the foundation for state-of-the-art parsing, which employs both lexical and structural conditioning.
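Concretely, Johnson's parent-annotation transform relabels each nonterminal with the category of its parent, so that, for example, an NP under S (a subject) is distinguished from an NP under VP (an object), letting rule probabilities depend on external context. Below is a minimal illustrative sketch, not the paper's code; the nested-list tree representation and the example tree are hypothetical.

```python
# Minimal sketch of parent annotation (Johnson 1998): relabel each
# nonterminal with its parent's category, e.g. NP -> NP^S. Trees are
# hypothetical nested lists of the form [label, child, child, ...].

def parent_annotate(tree, parent_label=None):
    """Return a copy of `tree` with every nonterminal label suffixed
    by ^PARENT. Preterminals (tags over raw words) are left as-is."""
    label, children = tree[0], tree[1:]
    if all(isinstance(c, str) for c in children):   # preterminal node
        return [label] + list(children)
    new_label = label if parent_label is None else f"{label}^{parent_label}"
    return [new_label] + [parent_annotate(c, label) for c in children]

# Example: (S (NP (DT the) (NN dog)) (VP (VBD barked)))
t = ["S", ["NP", ["DT", "the"], ["NN", "dog"]], ["VP", ["VBD", "barked"]]]
print(parent_annotate(t))
# ['S', ['NP^S', ['DT', 'the'], ['NN', 'dog']], ['VP^S', ['VBD', 'barked']]]
```

After this transform, the treebank grammar is read off the annotated trees in the usual way, so NP^S and NP^VP accumulate separate rule statistics.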
Highlights
  • Several results have brought into question how large a role lexicalization plays in such parsers. Johnson (1998) showed that the performance of an unlexicalized probabilistic context-free grammar (PCFG) over the Penn treebank could be improved enormously by annotating each node with its parent category.
  • We show that the parsing performance that can be achieved by an unlexicalized PCFG is far higher than has previously been demonstrated, and is much higher than community wisdom has thought possible.
  • We present linguistically motivated annotations which do much to close the gap between a vanilla PCFG and state-of-the-art lexicalized models.
  • We construct an unlexicalized PCFG which outperforms the lexicalized PCFGs of Magerman (1995) and Collins (1996) (though not more recent models, such as Charniak (1997) or Collins (1999)). One benefit of this result is a much-strengthened lower bound on the capacity of an unlexicalized PCFG.
  • We have shown that, surprisingly, the maximum-likelihood estimate of a compact unlexicalized PCFG can parse on par with early lexicalized parsers.
Methods
  • To facilitate comparison with previous work, the authors trained the models on sections 2–21 of the WSJ section of the Penn treebank.
  • The authors used the first 20 files (393 sentences) of section 22 as a development set.
  • All of section 23 was used as a test set for the final model.
  • Given a set of transformed trees, the authors viewed the local trees as grammar rewrite rules in the standard way and used maximum-likelihood estimates for rule probabilities.
  • To parse the grammar, the authors used a simple array-based Java implementation of a generalized CKY parser, which, for the final best model, was able to exhaustively parse all sentences in section 23 in 1 GB of memory, taking approximately 3 seconds for average-length sentences (a sketch of both steps follows this list).
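Both steps are straightforward to sketch: maximum-likelihood rule probabilities are just normalized counts of local trees, and CKY fills a chart bottom-up over spans. The illustration below assumes a binarized grammar with a separate lexicon and uses dictionary-based charts for clarity; the paper's actual implementation was an array-based generalized CKY parser in Java, and the toy grammar, counts, and sentence here are made up.

```python
from collections import defaultdict
from math import log

def mle_rule_probs(rule_counts):
    """Maximum-likelihood estimates: P(LHS -> RHS) = count / count(LHS)."""
    lhs_totals = defaultdict(float)
    for (lhs, rhs), c in rule_counts.items():
        lhs_totals[lhs] += c
    return {(lhs, rhs): c / lhs_totals[lhs]
            for (lhs, rhs), c in rule_counts.items()}

def cky(words, lexicon, binary_rules):
    """Simple CKY over a binarized PCFG. chart[i][j] maps each category
    to the best log-probability of deriving words[i:j]."""
    n = len(words)
    chart = [[defaultdict(lambda: float("-inf")) for _ in range(n + 1)]
             for _ in range(n + 1)]
    for i, w in enumerate(words):                      # preterminal layer
        for tag, p in lexicon.get(w, {}).items():
            chart[i][i + 1][tag] = log(p)
    for span in range(2, n + 1):                       # longer spans
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):                  # split point
                for (lhs, (b, c)), p in binary_rules.items():
                    score = log(p) + chart[i][k][b] + chart[k][j][c]
                    if score > chart[i][j][lhs]:
                        chart[i][j][lhs] = score
    return chart[0][n]

# Hypothetical toy grammar, estimated from made-up treebank counts:
counts = {("S", ("NP", "VP")): 10, ("NP", ("DT", "NN")): 8,
          ("NP", ("NP", "PP")): 2, ("VP", ("VBD", "NP")): 10,
          ("PP", ("IN", "NP")): 5}
probs = mle_rule_probs(counts)
lexicon = {"the": {"DT": 1.0}, "dog": {"NN": 1.0},
           "saw": {"VBD": 1.0}, "cat": {"NN": 1.0}}
print(cky("the dog saw the cat".split(), lexicon, probs))
# best log-probabilities for the whole span; 'S' = log(0.8 * 0.8) ≈ -0.446
```

The dictionary-based chart above favors readability; the array-based layout the paper describes avoids hashing and is what makes exhaustive parsing of section 23 feasible in 1 GB at roughly 3 seconds per average-length sentence.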
Results
  • The authors took the final model and used it to parse section 23 of the treebank. Figure 8 shows the results.
  • The test set F1 is 86.32% for sentences of ≤ 40 words, already higher than that of early lexicalized models, though lower than that of state-of-the-art parsers (a sketch of the F1 computation follows this list).
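For reference, the LP/LR F1 quoted here is the harmonic mean of labeled precision and labeled recall over matched labeled brackets (the standard PARSEVAL measures). A minimal sketch, with hypothetical bracket sets:

```python
def parseval_f1(gold, guess):
    """Labeled precision/recall/F1 over (category, start, end) brackets.
    `gold` and `guess` are sets of labeled spans for one sentence."""
    matched = len(gold & guess)
    lp = matched / len(guess) if guess else 0.0   # labeled precision
    lr = matched / len(gold) if gold else 0.0     # labeled recall
    f1 = 2 * lp * lr / (lp + lr) if lp + lr else 0.0
    return lp, lr, f1

# Hypothetical brackets for a 5-word sentence:
gold = {("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)}
guess = {("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)}
print(parseval_f1(gold, guess))  # (0.75, 0.75, 0.75)
```

Note that published scores aggregate matched and total bracket counts over the whole test set before computing LP and LR, rather than averaging per-sentence values.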
Conclusion
  • The advantages of unlexicalized grammars are clear enough: they are easy to estimate, easy to parse with, and time- and space-efficient.
  • The authors have shown that, surprisingly, the maximum-likelihood estimate of a compact unlexicalized PCFG can parse on par with early lexicalized parsers.
  • The authors have shown ways to improve parsing, some of which are easier than lexicalization and others orthogonal to it, and which could presumably be used to benefit lexicalized parsers as well.
Funding
  • This paper is based on work supported in part by the National Science Foundation under Grant No. IIS-0085896, and in part by an IBM Faculty Partnership Award to the second author.
References
  • James K. Baker. 1979. Trainable grammars for speech recognition. In D. H. Klatt and J. J. Wolf, editors, Speech Communication Papers for the 97th Meeting of the Acoustical Society of America, pages 547–550.
  • Taylor L. Booth and Richard A. Thomson. 1973. Applying probability measures to abstract languages. IEEE Transactions on Computers, C-22:442–450.
  • Sharon A. Caraballo and Eugene Charniak. 1998. New figures of merit for best-first probabilistic chart parsing. Computational Linguistics, 24:275–298.
  • Eugene Charniak, Sharon Goldwater, and Mark Johnson. 1998. Edge-based best-first chart parsing. In Proceedings of the Sixth Workshop on Very Large Corpora, pages 127–133.
  • Eugene Charniak. 1996. Tree-bank grammars. In Proceedings of the 13th National Conference on Artificial Intelligence, pages 1031–1036.
  • Eugene Charniak. 1997. Statistical parsing with a context-free grammar and word statistics. In Proceedings of the 14th National Conference on Artificial Intelligence, pages 598–603.
  • Eugene Charniak. 2000. A maximum-entropy-inspired parser. In NAACL 1, pages 132–139.
  • Eugene Charniak. 2001. Immediate-head parsing for language models. In ACL 39.
  • Noam Chomsky. 1965. Aspects of the Theory of Syntax. MIT Press, Cambridge, MA.
  • Michael John Collins. 1996. A new statistical parser based on bigram lexical dependencies. In ACL 34, pages 184–191.
  • Michael Collins. 1999. Head-Driven Statistical Models for Natural Language Parsing. Ph.D. thesis, University of Pennsylvania.
  • Jason Eisner and Giorgio Satta. 1999. Efficient parsing for bilexical context-free grammars and head-automaton grammars. In ACL 37, pages 457–464.
  • Marilyn Ford, Joan Bresnan, and Ronald M. Kaplan. 1982. A competence-based theory of syntactic closure. In Joan Bresnan, editor, The Mental Representation of Grammatical Relations, pages 727–796. MIT Press, Cambridge, MA.
  • Daniel Gildea. 2001. Corpus variation and parser performance. In Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing (EMNLP).
  • Donald Hindle and Mats Rooth. 1993. Structural ambiguity and lexical relations. Computational Linguistics, 19(1):103–120.
  • Mark Johnson. 1998. PCFG models of linguistic tree representations. Computational Linguistics, 24:613–632.
  • Dan Klein and Christopher D. Manning. 2001. Parsing with treebank grammars: Empirical bounds, theoretical models, and the structure of the Penn treebank. In ACL 39/EACL 10.
  • David M. Magerman. 1995. Statistical decision-tree models for parsing. In ACL 33, pages 276–283.
  • Andrew Radford. 1988. Transformational Grammar. Cambridge University Press, Cambridge.
  • Dana Ron, Yoram Singer, and Naftali Tishby. 1994. The power of amnesia. In Advances in Neural Information Processing Systems, volume 6, pages 176–183. Morgan Kaufmann.