
End-to-End Learning of Semantic Role Labeling Using Recurrent Neural Networks

Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th ..., (2015): 1127-1137

Abstract

Semantic role labeling (SRL) is one of the basic natural language processing (NLP) problems. To date, most successful SRL systems have been built on top of some form of parsing results (Koomen et al., 2005; Palmer et al., 2010; Pradhan et al., 2013), where pre-defined feature templates over the syntactic structure are used. The att...

Introduction
  • Semantic role labeling (SRL) is a form of shallow semantic parsing whose goal is to discover the predicate-argument structure of each predicate in a given input sentence.
  • For each target verb, all constituents in the sentence that fill a semantic role of that verb have to be recognized.
  • The system determines whether a constituent is an argument of the predicate and, if so, which semantic role it fills (Palmer et al., 2010), as illustrated below.
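
As a concrete illustration, here is a minimal, hypothetical example of the predicate-argument structure SRL recovers; the sentence and labels are illustrative, not taken from the paper. The "IOB" scheme marks the beginning (B) and inside (I) of each argument span:

    # Hypothetical SRL output for the predicate "sold" in a toy sentence.
    # A0 = agent (seller), V = predicate, A1 = thing sold, AM-TMP = time.
    sentence = ["The", "company", "sold", "its", "shares", "yesterday"]
    labels   = ["B-A0", "I-A0", "B-V", "B-A1", "I-A1", "B-AM-TMP"]

    for token, label in zip(sentence, labels):
        print(f"{token:10s} {label}")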
Highlights
  • Semantic role labeling (SRL) is a form of shallow semantic parsing whose goal is to discover the predicate-argument structure of each predicate in a given input sentence
  • Semantic role labeling is useful as an intermediate step in a wide range of natural language processing (NLP) tasks, such as information extraction (Bastianelli et al., 2013), automatic document categorization (Persson et al., 2009) and question answering (Dan and Lapata, 2007; Surdeanu et al., 2003; Moschitti et al., 2003)
  • We propose an end-to-end system using a deep bidirectional long short-term memory (DB-LSTM) model to address the above difficulties (see the model sketch after this list)
  • We propose an end-to-end system based on a recurrent topology
  • We investigate semantic role labeling, a traditional natural language processing problem, with a DB-LSTM network
  • With more sophisticated network designs and training techniques based on LSTM, such as attempts to integrate the parse-tree concept into the LSTM framework (Tai et al., 2015), we believe better performance can be achieved
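
To make the model concrete, below is a minimal sketch of a deep bidirectional LSTM tagger in PyTorch. It is an illustrative approximation, not the authors' exact architecture: the paper stacks single-direction LSTM layers in alternating directions and decodes with a CRF-style output layer, whereas this sketch uses stacked bidirectional layers and per-token label scores; all names and sizes are assumptions.

    import torch
    import torch.nn as nn

    class BiLSTMTagger(nn.Module):
        """Deep bidirectional LSTM tagger (illustrative approximation)."""
        def __init__(self, vocab_size, num_labels, emb_dim=32, hidden=64, depth=4):
            super().__init__()
            # Two token-level inputs, following the paper's feature set in
            # spirit: the word itself and a binary predicate mark (the
            # predicate-context and region-mark features are omitted here).
            self.word_emb = nn.Embedding(vocab_size, emb_dim)
            self.mark_emb = nn.Embedding(2, emb_dim)
            self.lstm = nn.LSTM(2 * emb_dim, hidden, num_layers=depth,
                                bidirectional=True, batch_first=True)
            self.out = nn.Linear(2 * hidden, num_labels)

        def forward(self, words, pred_mark):
            x = torch.cat([self.word_emb(words), self.mark_emb(pred_mark)], dim=-1)
            h, _ = self.lstm(x)   # (batch, seq, 2 * hidden)
            return self.out(h)    # per-token label scores

    # Toy usage with hypothetical ids: one 6-token sentence, predicate at index 2.
    model = BiLSTMTagger(vocab_size=100, num_labels=10)
    words = torch.tensor([[1, 2, 3, 4, 5, 6]])
    mark = torch.tensor([[0, 0, 1, 0, 0, 0]])
    scores = model(words, mark)   # shape: (1, 6, 10)
    print(scores.shape)

In the paper, the per-token scores are combined with learned label-transition scores (a CRF-style output) so that decoded IOB sequences are well-formed.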
Methods
  • The authors mainly evaluated and analyzed the system on the commonly used CoNLL-2005 shared-task data set, and the conclusions are validated on the CoNLL-2012 shared task.

    The CoNLL-2005 data set takes sections 2-21 of the Wall Street Journal (WSJ) data as the training set, and section 24 as the development set.
  • The test set consists of section 23 of WSJ concatenated with 3 sections from the Brown corpus (Carreras and Màrquez, 2005).
  • The description and separation of the train, development and test data sets can be found in (Pradhan et al., 2013).
  • In this part, the authors analyze the performance of two different networks, the CNN and the LSTM.
  • To understand the contribution of each modeling decision, the authors started from a simple model and added more units step by step
Results
  • The authors' model achieves an F1 score of 81.07 on the CoNLL-2005 shared task and 81.27 on the CoNLL-2012 shared task, both outperforming previous systems based on parsing results and feature engineering, which rely heavily on expert linguistic knowledge (a sketch of the span-based F1 metric follows)
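
For reference, the F1 reported above is the harmonic mean of precision and recall over predicted argument spans. Here is a minimal sketch, assuming the standard CoNLL convention that an argument counts as correct only when both its span and its role label match the gold annotation; the example spans are hypothetical:

    def span_f1(gold: set, pred: set) -> float:
        # Arguments are (start, end, role) triples; exact match is required.
        correct = len(gold & pred)
        p = correct / len(pred) if pred else 0.0
        r = correct / len(gold) if gold else 0.0
        return 2 * p * r / (p + r) if p + r else 0.0

    gold = {(0, 1, "A0"), (3, 4, "A1"), (5, 5, "AM-TMP")}
    pred = {(0, 1, "A0"), (3, 4, "A1"), (5, 5, "AM-LOC")}
    print(f"{span_f1(gold, pred):.4f}")  # 0.6667: two of three arguments correct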
Conclusion
  • The authors investigate SRL, a traditional NLP problem, with a DB-LSTM network
  • With this model, the authors are able to bypass the traditional steps of extracting intermediate NLP features such as POS tags and syntactic parses, and to avoid hand-engineering feature templates.
  • The model shows a strong ability to learn semantic rules without over-fitting, even on a limited training set.
  • It outperforms the convolutional method, even when the latter uses a large context length.
  • With more sophisticated network designs and training techniques based on LSTM, such as attempts to integrate the parse-tree concept into the LSTM framework (Tai et al., 2015), the authors believe better performance can be achieved
Tables
  • Table1: An example sequence with 4 input features: argument, predicate, predicate context (context length 3), and region mark. The "IOB" tagging scheme is used (Collobert et al., 2011); see the feature-construction sketch after this list
  • Table2: F1 of CNN method on development set and test set of CoNLL-2005 data set
  • Table3: F1 with LSTM method on development set and test set of CoNLL-2005 data set and CoNLL-2012 data set. Emb: the type of embedding. d: the number of LSTM layers. ctx-p: predicate context length. mr: region mark feature. h: hidden layer size
  • Table4: Comparison with previous methods
  • Table5: F1 on each subset and class (CoNLL-2005). (Classes with low statistics are removed.)
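
To illustrate how the four per-token features of Table 1 can be assembled, here is a minimal sketch. The windowing convention is an assumption (centered on the predicate and clamped at sentence boundaries); the paper's exact convention may differ, and the sentence is illustrative:

    def build_features(tokens, pred_idx, ctx_len=3):
        # Predicate context: a window of ctx_len tokens around the predicate,
        # clamped at the sentence boundaries (assumed convention).
        half = ctx_len // 2
        lo = max(0, pred_idx - half)
        hi = min(len(tokens), pred_idx + half + 1)
        ctx = " ".join(tokens[lo:hi])
        rows = []
        for i, tok in enumerate(tokens):
            rows.append({
                "argument": tok,                # the word itself
                "predicate": tokens[pred_idx],  # the target predicate
                "ctx-p": ctx,                   # predicate context window
                "mark": int(lo <= i < hi),      # region mark: inside the window?
            })
        return rows

    for row in build_features(["The", "company", "sold", "its", "shares"], 2):
        print(row)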
Related work
  • SRL has been approached in two major ways. The first follows the traditional approach widely used in basic NLP problems: a linear classifier is employed with feature templates, and most effort focuses on extracting the feature templates that best describe the text properties of the training corpus. One of the most important features comes from syntactic parsing, although syntactic parsing is itself considered a difficult problem; thus system combination appears to be the general solution.
Reference
  • Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton. 2013. Speech recognition with deep recurrent neural networks. In IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013.
  • Emanuele Bastianelli, Giuseppe Castellucci, Danilo Croce, and Roberto Basili. 2013. Textual inference and meaning representation in human robot interaction. In Proceedings of the Joint Symposium on Semantic Processing: Textual Inference and Structures in Corpora, pages 65–69.
  • Yoshua Bengio, Patrice Simard, and Paolo Frasconi. 1994. Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2):157–166.
  • Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Janvin. 2003. A neural probabilistic language model. Journal of Machine Learning Research, 3:1137–1155.
  • Yoshua Bengio, Holger Schwenk, Jean-Sébastien Senécal, Frédéric Morin, and Jean-Luc Gauvain. 2006. Neural probabilistic language models. In Innovations in Machine Learning, volume 194 of Studies in Fuzziness and Soft Computing, pages 137–186. Springer Berlin Heidelberg.
  • Xavier Carreras and Lluís Màrquez. 2005. Introduction to the CoNLL-2005 shared task: Semantic role labeling. In Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005), pages 152–164, Ann Arbor, Michigan. Association for Computational Linguistics.
  • Eugene Charniak and Mark Johnson. 2005. Coarse-to-fine n-best parsing and MaxEnt discriminative reranking. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, ACL '05, pages 173–180, Stroudsburg, PA, USA. Association for Computational Linguistics.
  • Eugene Charniak. 2000. A maximum-entropy-inspired parser. In Proceedings of the 1st North American Chapter of the Association for Computational Linguistics Conference, NAACL 2000, pages 132–139, Stroudsburg, PA, USA. Association for Computational Linguistics.
  • Michael Collins. 2003. Head-driven statistical models for natural language parsing. Computational Linguistics, 29(4):589–637.
  • Ronan Collobert and Jason Weston. 2008. A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th International Conference on Machine Learning, ICML '08, pages 160–167, New York, NY, USA. ACM.
  • Ronan Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. 2011. Natural language processing (almost) from scratch. Journal of Machine Learning Research, 12:2493–2537.
  • Shen Dan and Mirella Lapata. 2007. Using semantic roles to improve question answering. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL).
  • Alex Graves, Marcus Liwicki, Santiago Fernández, Roman Bertolami, Horst Bunke, and Jürgen Schmidhuber. 2009. A novel connectionist system for unconstrained handwriting recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(5):855–868.
  • Alex Graves, Greg Wayne, and Ivo Danihelka. 2014. Neural Turing machines. arXiv:1410.5401.
  • Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation, 9(8):1735–1780.
  • Yoon Kim. 2014. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pages 1746–1751.
  • Peter Koomen, Vasin Punyakanok, Dan Roth, and Wen-tau Yih. 2005. Generalized inference with multiple semantic role labeling systems. In Proceedings of the 9th Conference on Computational Natural Language Learning, CoNLL '05, pages 181–184, Stroudsburg, PA, USA. Association for Computational Linguistics.
  • John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the 18th International Conference on Machine Learning, ICML '01, pages 282–289, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.
  • Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324.
  • Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems.
  • Andriy Mnih and Koray Kavukcuoglu. 2013. Learning word embeddings efficiently with noise-contrastive estimation. In Advances in Neural Information Processing Systems, pages 2265–2273.
  • Alessandro Moschitti, Paul Morarescu, and Sanda M. Harabagiu. 2003. Open domain information extraction via automatic semantic labeling. In FLAIRS Conference '03, pages 397–401.
  • Martha Palmer, Daniel Gildea, and Nianwen Xue. 2010. Semantic Role Labeling. Synthesis Lectures on Human Language Technologies. Morgan and Claypool.
  • Jacob Persson, Richard Johansson, and Pierre Nugues. 2009. Text categorization using predicate-argument structures. In Proceedings of NODALIDA, pages 142–149.
  • Sameer Pradhan, Kadri Hacioglu, Wayne Ward, James H. Martin, and Daniel Jurafsky. 2005. Semantic role chunking combining complementary syntactic views. In Proceedings of the 9th Conference on Computational Natural Language Learning, CoNLL '05, pages 217–220, Stroudsburg, PA, USA. Association for Computational Linguistics.
  • Sameer Pradhan, Alessandro Moschitti, Nianwen Xue, Hwee Tou Ng, Anders Björkelund, Olga Uryupina, Yuchen Zhang, and Zhi Zhong. 2013. Towards robust linguistic analysis using OntoNotes. In Proceedings of the Seventeenth Conference on Computational Natural Language Learning, pages 143–152, Sofia, Bulgaria. Association for Computational Linguistics.
  • Vasin Punyakanok, Dan Roth, and Wen-tau Yih. 2008. The importance of syntactic parsing and inference in semantic role labeling. Computational Linguistics, 34(2).
  • M. Schuster and K. K. Paliwal. 1997. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45:2673–2681.
  • Mihai Surdeanu, Sanda Harabagiu, John Williams, and Paul Aarseth. 2003. Using predicate-argument structures for information extraction. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, ACL '03, pages 8–15, Stroudsburg, PA, USA. Association for Computational Linguistics.
  • Mihai Surdeanu, Lluís Màrquez, Xavier Carreras, and Pere R. Comas. 2007. Combination strategies for semantic role labeling. Journal of Artificial Intelligence Research, 29:105–151.
  • Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems.
  • Oscar Täckström, Kuzman Ganchev, and Dipanjan Das. 2015. Efficient inference and structured learning for semantic role labeling. Transactions of the Association for Computational Linguistics, 3:29–41.
  • Kai Sheng Tai, Richard Socher, and Christopher D. Manning. 2015. Improved semantic representations from tree-structured long short-term memory networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, ACL '15, Stroudsburg, PA, USA. Association for Computational Linguistics.
  • Kristina Toutanova, Aria Haghighi, and Christopher D. Manning. 2008. A global joint model for semantic role labeling. Computational Linguistics, 34:161–191.
  • Oriol Vinyals, Lukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, and Geoffrey Hinton. 2014. Grammar as a foreign language. arXiv:1412.7449.
  • Jason Weston, Sumit Chopra, and Antoine Bordes. 2014. Memory networks. arXiv:1410.3916.
  • Jason Weston, Antoine Bordes, Sumit Chopra, and Tomas Mikolov. 2015. Towards AI-complete question answering: A set of prerequisite toy tasks. arXiv:1502.05698.
  • Mo Yu, Matthew Gormley, and Mark Dredze. 2014. Factor-based compositional embedding models. In Advances in Neural Information Processing Systems Workshop on Learning Semantics.
  • Xiang Zhang and Yann LeCun. 2015. Text understanding from scratch. arXiv:1502.01710.