Immediate-head parsing for language models

ACL, pp. 124-131 (2001)

Cited 397 | Viewed 286 | EI

Abstract

We present two language models based upon an "immediate-head" parser --- our name for a parser that conditions all events below a constituent c upon the head of c. While all of the most accurate statistical parsers are of the immediate-head variety, no previous grammatical language model uses this technology. The perplexity for both of these models significantly improves upon both the trigram model baseline and the best previous grammar-based language model.
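As a reminder of the evaluation metric used throughout, the sketch below computes perplexity from a model's per-word conditional probabilities. It is a minimal illustration rather than code from the paper; the `word_prob` callable, the toy uniform model, and the example sentence are invented for the example.

```python
import math

def perplexity(sentences, word_prob):
    """Perplexity = 2 ** (average negative log2 probability per word).

    `word_prob(word, context)` is any callable returning P(word | context)
    under the model being evaluated.
    """
    total_log_prob = 0.0
    total_words = 0
    for sentence in sentences:
        context = []
        for word in sentence:
            total_log_prob += math.log2(word_prob(word, tuple(context)))
            context.append(word)
            total_words += 1
    cross_entropy = -total_log_prob / total_words  # bits per word
    return 2 ** cross_entropy

# A uniform model over a 10,000-word vocabulary has perplexity exactly 10,000.
uniform = lambda word, context: 1.0 / 10000
print(perplexity([["the", "dow", "rose", "n", "points"]], uniform))  # -> 10000.0
```

Lower perplexity means the model assigns higher probability to the test text, which is why the perplexity reductions reported below count as improvements.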

Introduction
  • All of the most accurate statistical parsers [1,3,6,7,12,14] are lexicalized in that they condition probabilities on the lexical content of the sentences being parsed.
  • The author would like to thank the members of the Brown Laboratory for Linguistic Information Processing (BLLIP) and Brian Roark, who gave very useful tips on conducting this research.
  • It is the experience of the statistical parsing community that immediate-head parsers are the most accurate parsers the authors can design.
Highlights
  • One interesting fact about the immediate-trihead model is that, of the 3761 sentences in the test corpus, on 2934, or about 75%, the grammar model assigns a higher probability to the sentence than does the trigram model.
  • One might well ask what went “wrong” with the remaining 25%: why should the grammar model ever get beaten? Three possible reasons come to mind: (1) sparse data, (2) the two models capture different facts about the distribution of words, and (3) parse errors that hurt the grammar model badly.
  • We ask this question because what we should do to improve performance of our grammar-based language models depends critically on which of these explanations is correct: if (1), we should collect more data; if (2), we should just live with the tandem grammar-trigram models (a sketch of such a combination follows this list); and if (3), we should create better parsers.
  • We suggest that improvement of the underlying parser should significantly improve the model’s perplexity and that even in the near term there is a lot of potential for improvement in immediate-head language models.
  • We have presented two grammar-based language models, both of which significantly improve upon both the trigram model baseline for the task and the best previous grammar-based language model.
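The “tandem grammar-trigram models” mentioned above refer to running the two models side by side and combining their predictions. One simple way to combine two language models is a linear interpolation of their conditional word probabilities, sketched below; the mixture weight `lam`, the callable interfaces, and the toy example are assumptions for illustration, not necessarily the combination scheme used in the paper.

```python
def tandem_prob(word, context, grammar_prob, trigram_prob, lam=0.5):
    """Linear interpolation of two language models' word predictions.

    `grammar_prob` and `trigram_prob` are callables returning P(word | context)
    under each component model; `lam` is an assumed mixture weight that would
    normally be tuned on held-out data.
    """
    return lam * grammar_prob(word, context) + (1.0 - lam) * trigram_prob(word, context)

# Toy example with two hand-made predictors over a two-word vocabulary:
g = lambda w, c: {"rose": 0.4, "fell": 0.6}[w]
t = lambda w, c: {"rose": 0.2, "fell": 0.8}[w]
print(tandem_prob("rose", (), g, t))  # -> 0.3
```

Because each component's probabilities sum to one over the vocabulary, so does the mixture, and perplexity for the combined model is computed exactly as for either component.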
Methods
  • The parser as described in the previous section was trained and tested on the data used in the previously described grammar-based language modeling research [4,15].
  • As seen in Table 2, the immediate-bihead model, with a perplexity of 144.98, outperforms both previous models, even though they use trigrams of words in their probability estimates (a sketch of how a parser’s parse probabilities yield a string probability follows this list).
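For a parser to serve as a language model at all, the probability of a word string is obtained by summing the joint probability of the string and its parses, P(s) = Σ_π P(s, π). The sketch below approximates that sum over an n-best list of parses; the `parser.nbest` interface and the stub parser are hypothetical, and the log-sum-exp step is ordinary numerical hygiene rather than a detail from the paper.

```python
import math

def sentence_log_prob(sentence, parser, n=50):
    """Approximate log P(s) = log of the sum over parses of P(s, parse).

    `parser.nbest(sentence, n)` is a hypothetical interface returning up to n
    (parse, log_joint_prob) pairs; summing their probabilities lower-bounds
    the true string probability.
    """
    log_probs = [lp for _, lp in parser.nbest(sentence, n)]
    m = max(log_probs)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(lp - m) for lp in log_probs))

class StubParser:
    """Stand-in returning two fake parses per sentence, for demonstration only."""
    def nbest(self, sentence, n):
        return [("parse-a", -12.0), ("parse-b", -14.0)][:n]

print(sentence_log_prob(["the", "dow", "rose"], StubParser()))  # ~ -11.87
```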
Results
  • The authors suggest that improvement of the underlying parser should significantly improve the model’s perplexity and that even in the near term there is a lot of potential for improvement in immediate-head language models.
  • Labeled recall: 83.7%, 84.9%, 79.0% (see the precision/recall comparison in Table 4).
  • The authors ask this question because what they should do to improve performance of the grammar-based language models depends critically on which of these explanations is correct: if (1), they should collect more data; if (2), they should just live with the tandem grammar-trigram models; and if (3), they should create better parsers (the per-sentence comparison used to probe these explanations is sketched after this list).
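The diagnostic behind Table 4 is simple bookkeeping: partition the test sentences by which model assigned the higher probability, then look at the parser's accuracy on each group. The sketch below assumes per-sentence log probabilities and labeled-recall scores have already been computed and stored in dictionaries; the function and variable names are invented for this sketch.

```python
def table4_style_diagnostic(sentence_ids, grammar_logprob, trigram_logprob, labeled_recall):
    """Group sentences by which model scores them higher, then report the
    parser's average labeled recall on each group.

    All three dictionaries map a sentence id to precomputed values.
    """
    groups = {"grammar_better": [], "trigram_better": []}
    for sid in sentence_ids:
        key = ("grammar_better"
               if grammar_logprob[sid] > trigram_logprob[sid]
               else "trigram_better")
        groups[key].append(sid)
    return {
        name: sum(labeled_recall[sid] for sid in ids) / len(ids)
        for name, ids in groups.items()
        if ids
    }

# Toy usage with three sentences:
print(table4_style_diagnostic(
    [0, 1, 2],
    grammar_logprob={0: -50.0, 1: -42.0, 2: -61.0},
    trigram_logprob={0: -48.0, 1: -45.0, 2: -63.0},
    labeled_recall={0: 0.79, 1: 0.85, 2: 0.84},
))  # -> approximately {'grammar_better': 0.845, 'trigram_better': 0.79}
```

If parse quality is markedly worse on the sentences where the trigram model wins, that supports explanation (3) above.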
Conclusion
  • One interesting fact about the immediate-trihead model is that, of the 3761 sentences in the test corpus, on 2934, or about 75%, the grammar model assigns a higher probability to the sentence than does the trigram model.
  • 1. The grammar model may simply suffer from sparse data, in which case collecting more training data would help.
  • 2. The grammar model and the trigram model capture different facts about the distribution of words in the language, and for some set of sentences one distribution will perform better than the other.
  • 3. The grammar model is, in some sense, always better than the trigram model, but if the parser bungles the parse, then the grammar model is impacted very badly; obviously the trigram model has no such Achilles’ heel.
  • Note that if the authors were dealing with standard Penn Tree-bank Wall-Street-Journal text, asking for better parsers would be easier said than done.
  • The text in question has been “speechified” by removing punctuation and capitalization, and “simplified” by allowing only a fixed vocabulary of 10,000 words and by replacing all digits and symbols with the symbol “N” (a sketch of this preprocessing follows this list).
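The preprocessing described in the last bullet is easy to reproduce. The sketch below applies the same steps (lowercasing, stripping punctuation, mapping digit and symbol tokens to “N”, and restricting to a fixed vocabulary via an unknown-word token); the `vocab` argument, the `<unk>` spelling, and the regular expressions are assumptions, not details of the actual corpus preparation.

```python
import re

def speechify(sentence, vocab, unk="<unk>"):
    """'Speechify' and 'simplify' a sentence as described above: lowercase,
    drop punctuation, replace digit/symbol tokens with 'N', and map
    out-of-vocabulary words to an unknown-word token.

    `vocab` stands in for the fixed 10,000-word vocabulary.
    """
    words = []
    for token in sentence.split():
        token = token.lower()
        if re.fullmatch(r"[\d.,%$/-]+", token):   # digit/symbol tokens -> N
            words.append("N")
            continue
        token = re.sub(r"[^\w'-]", "", token)     # strip punctuation characters
        if not token:
            continue
        words.append(token if token in vocab else unk)
    return " ".join(words)

# Toy vocabulary for illustration:
vocab = {"the", "dow", "rose", "points"}
print(speechify("The Dow rose 35.2 points.", vocab))  # -> "the dow rose N points"
```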
Tables
  • Table 1: Perplexity results for two previous grammar-based language models
  • Table 2: Perplexity results for the immediate-bihead model
  • Table 3: Perplexity results for the immediate-trihead model
  • Table 4: Precision/recall for sentences on which the trigram/grammar models performed best
Funding
  • This research was supported in part by NSF grant LIS SBR 9720368 and by NSF grant IIS0095940.
References
  • [1] BOD, R. What is the minimal set of fragments that achieves maximal parse accuracy? In Proceedings of the Association for Computational Linguistics, 2001.
  • [2] CHARNIAK, E. Tree-bank grammars. In Proceedings of the Thirteenth National Conference on Artificial Intelligence. AAAI Press/MIT Press, Menlo Park, 1996, 1031–1036.
  • [3] CHARNIAK, E. A maximum-entropy-inspired parser. In Proceedings of the 2000 Conference of the North American Chapter of the Association for Computational Linguistics. ACL, New Brunswick, NJ, 2000.
  • [4] CHELBA, C. AND JELINEK, F. Exploiting syntactic structure for language modeling. In Proceedings of COLING-ACL 98. ACL, New Brunswick, NJ, 1998, 225–231.
  • [5] CHI, Z. AND GEMAN, S. Estimation of probabilistic context-free grammars. Computational Linguistics 24(2) (1998), 299–306.
  • [6] COLLINS, M. J. Three generative lexicalized models for statistical parsing. In Proceedings of the 35th Annual Meeting of the ACL, 1997, 16–23.
  • [7] COLLINS, M. J. Head-Driven Statistical Models for Natural Language Parsing. Ph.D. dissertation, University of Pennsylvania, 1999.
  • [8] COLLINS, M. J. Discriminative reranking for natural language parsing. In Proceedings of the International Conference on Machine Learning (ICML 2000), 2000.
  • [9] GODDEAU, D. Using probabilistic shift-reduce parsing in speech recognition systems. In Proceedings of the 2nd International Conference on Spoken Language Processing, 1992, 321–324.
  • [10] GOODMAN, J. Putting it all together: language model combination. In ICASSP-2000, 2000.
  • [11] LAUER, M. Corpus statistics meet the noun compound: some empirical results. In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, 1995, 47–55.
  • [12] MAGERMAN, D. M. Statistical decision-tree models for parsing. In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, 1995, 276–283.
  • [13] MARCUS, M. P., SANTORINI, B. AND MARCINKIEWICZ, M. A. Building a large annotated corpus of English: the Penn treebank. Computational Linguistics 19 (1993), 313–330.
  • [14] RATNAPARKHI, A. Learning to parse natural language with maximum entropy models. Machine Learning 34(1/2/3) (1999), 151–176.
  • [16] STOLCKE, A. An efficient probabilistic context-free parsing algorithm that computes prefix probabilities. Computational Linguistics 21 (1995), 165–202.
  • [17] STOLCKE, A. AND SEGAL, J. Precise n-gram probabilities from stochastic context-free grammars. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, 1994, 74–79.
Best Paper
  • Winner of the ACL Best Paper Award, 2001