TL;DR
We propose attentive graph convolutional networks (A-GCN) for combinatory categorial grammar supertagging, with the graph built from chunks extracted from a lexicon

Supertagging Combinatory Categorial Grammar with Attentive Graph Convolutional Networks

EMNLP 2020, pp. 6037–6044


Abstract

Supertagging is conventionally regarded as an important task for combinatory categorial grammar (CCG) parsing, where effective modeling of contextual information is highly important to this task. However, existing studies have made limited efforts to leverage contextual features except for applying powerful encoders (e.g., bi-LSTM). In th…

Introduction
  • Combinatory categorial grammar (CCG) is a lexicalized grammatical formalism, where the lexical categories of the words in a sentence provide informative syntactic and semantic knowledge for text understanding.
  • CCG parses often provide useful information for many downstream natural language processing (NLP) tasks such as logical reasoning (Yoshikawa et al., 2018) and semantic parsing (Beschke, 2019).
  • Graph convolutional networks (GCNs) have been demonstrated to be an effective approach to modeling such contextual information between words in many NLP tasks (Marcheggiani and Titov, 2017; Huang and Carley, 2019; De Cao et al., 2019; Huang et al., 2019); the authors want to determine whether this approach can help CCG supertagging.
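To make the GCN idea above concrete: each word aggregates the features of its graph neighbors through a shared linear transform. The following is a minimal numpy sketch of one GCN layer with mean aggregation, not the paper's implementation; `gcn_layer` and all shapes here are illustrative assumptions.

```python
import numpy as np

def gcn_layer(H, A, W):
    """One GCN layer: every word updates its representation by
    aggregating the degree-normalized features of its neighbors.
    H: (n, d) word vectors, A: (n, n) adjacency, W: (d, d_out) weights."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)    # row-wise degree
    return np.maximum((A_hat / deg) @ H @ W, 0.0)  # mean-aggregate + ReLU

# Toy 4-word sentence with edges between adjacent words (a chain graph).
rng = np.random.default_rng(0)
H = rng.random((4, 3))
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W = rng.random((3, 3))
print(gcn_layer(H, A, W).shape)  # (4, 3)
```

Stacking such layers lets information from words several edges away reach each position, which is how a GCN captures contextual features beyond what the sequential encoder provides.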
Highlights
  • Combinatory categorial grammar (CCG) is a lexicalized grammatical formalism, where the lexical categories of the words in a sentence provide informative syntactic and semantic knowledge for text understanding
  • To explore the effectiveness of our approach, we run CCG taggers with and without A-GCN, and try two ways to construct the graph: one is a fully connected GCN where edges are built between every pair of words; the other is our proposed approach with the chunk-based graph
  • Experimental results on supertagging accuracy (TAG) and labeled F-scores (LF) for parsing on the development set of CCGbank are reported in Table 3, with the number of trainable parameters of all models presented
  • We propose attentive GCN (A-GCN) for CCG supertagging, with its graph built from chunks extracted from a lexicon
  • We use two types of edges for the graph, namely, in-chunk and cross-chunk edges for word pairs within and across chunks, respectively, and propose an attention mechanism to distinguish the important word pairs according to their contribution to CCG supertagging
  • Experimental results and the ablation study on the English CCGbank demonstrate the effectiveness of our approach to CCG supertagging, where state-of-the-art performance is obtained on both CCG supertagging and parsing
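The two edge types and the attention mechanism described above can be sketched as follows. This is a hypothetical minimal illustration, not the paper's exact model: dot-product attention scores are normalized over each word's neighbors, and in-chunk versus cross-chunk edges get separate weight matrices (`W_in`, `W_cross` are assumed names).

```python
import numpy as np

def a_gcn_layer(H, E_in, E_cross, W_in, W_cross):
    """Attentive GCN sketch: attention weights distinguish word pairs,
    and the two edge types use type-specific transforms.
    H: (n, d) word vectors; E_in, E_cross: (n, n) 0/1 edge matrices."""
    mask = (E_in + E_cross) > 0                 # connected word pairs only
    scores = np.where(mask, H @ H.T, -1e9)      # attention logits
    scores -= scores.max(axis=1, keepdims=True)
    alpha = np.exp(scores)
    alpha /= alpha.sum(axis=1, keepdims=True)   # softmax over neighbors
    # type-specific message passing, then ReLU
    msg = (alpha * E_in) @ H @ W_in + (alpha * E_cross) @ H @ W_cross
    return np.maximum(msg, 0.0)

# Toy sentence of 4 words in two chunks: [w0 w1] [w2 w3].
rng = np.random.default_rng(1)
H = rng.random((4, 3))
E_in = np.array([[0, 1, 0, 0],      # in-chunk edges
                 [1, 0, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)
E_cross = np.array([[0, 0, 0, 0],   # cross-chunk edge w1 - w2
                    [0, 0, 1, 0],
                    [0, 1, 0, 0],
                    [0, 0, 0, 0]], dtype=float)
W_in, W_cross = rng.random((3, 3)), rng.random((3, 3))
print(a_gcn_layer(H, E_in, E_cross, W_in, W_cross).shape)  # (4, 3)
```

The attention weights `alpha` are what let the model down-weight connected word pairs that contribute little to predicting a supertag, rather than averaging all neighbors uniformly as a plain GCN would.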
Method
  • 3.1 Settings

    The authors run experiments on the English CCGbank (Hockenmaier and Steedman, 2007) and follow Clark and Curran (2007) to split it into train/dev/test sets, whose statistics are reported in Table 1.
  • The authors follow previous studies (Lewis and Steedman, 2014a; Lewis et al., 2016; Yoshikawa et al., 2017) to evaluate the model on both the tagging accuracy of the most frequent 425 supertags and the labeled F-scores (LF) of the dependencies converted from CCG parses.
  • For other hyper-parameter settings, the authors test their values as shown in Table 2 when training the models.
  • With the best hyper-parameters, the best performance is achieved with warm-up rate 0.1, batch size 16, and learning rate 1e-5
Results
  • To explore the effectiveness of the approach, the authors run CCG taggers with and without A-GCN, and try two ways to construct the graph: one is a fully connected GCN where edges are built between every pair of words; the other is the proposed approach with the chunk-based graph.
  • Stanojević and Steedman (2019) performed CCG parsing directly without the supertagging step, whereas the rest all did supertagging first.
  • Regardless of this difference, the approach performs the best on CCGbank in both supertagging accuracy and parsing LF
Conclusion
  • The authors propose A-GCN for CCG supertagging, with its graph built from chunks extracted from a lexicon.
  • The authors use two types of edges for the graph, namely, in-chunk and cross-chunk edges for word pairs within and across chunks, respectively, and propose an attention mechanism to distinguish the important word pairs according to their contribution to CCG supertagging.
  • Further analysis is performed to investigate using different types of edges, which reveals their quality and confirms the necessity of introducing attention to GCN for CCG supertagging
Tables
  • Table 1: The train/dev/test splits of English CCGbank and the statistics of sentences and words in them
  • Table 2: The list of hyper-parameters tested in our experiments. We run all models with the combination of those hyper-parameters and use the one achieving the highest supertagging results in our final experiments
  • Table 3: Results (supertagging accuracy and labeled F-scores) of different models with BERT-Large encoder on the development set of CCGbank. “PARM” is the number of trainable parameters in the models; “Full” uses the fully connected graph and “Chunk” uses the graph built based on chunks
  • Table 4: Comparison of our models with uncased BERT encoder and previous studies on the test set of CCGbank. Models with “†” use the EasyCCG parser to generate CCG parse trees from the predicted supertags
  • Table 5: Experimental results of models with uncased BERT-Large encoder on the test set of CCGbank, where the in-chunk, cross-chunk edges or the attention mechanism in our A-GCN module is ablated
References
  • Sebastian Beschke. 2019. Exploring graph-algebraic CCG combinators for syntactic-semantic AMR parsing. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 112–121, Varna, Bulgaria.
  • Kevin Clark, Minh-Thang Luong, Christopher D Manning, and Quoc V Le. 2018. Semi-supervised Sequence Modeling with Cross-view Training. arXiv preprint arXiv:1809.08370.
  • Stephen Clark and James R. Curran. 2007. Wide-Coverage Efficient Statistical Parsing with CCG and Log-Linear Models. Computational Linguistics, 33(4):493–552.
  • Nicola De Cao, Wilker Aziz, and Ivan Titov. 2019. Question Answering by Reasoning Across Documents with Graph Convolutional Networks. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 2306–2317, Minneapolis, Minnesota.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186.
  • Kilian Evang, Lasha Abzianidze, and Johan Bos. 2019. CCGweb: a New Annotation Tool and a First Quadrilingual CCG Treebank. In Proceedings of the 13th Linguistic Annotation Workshop, pages 37–42, Florence, Italy.
  • Julia Hockenmaier and Mark Steedman. 2007. CCGbank: A corpus of CCG derivations and dependency structures extracted from the Penn treebank. Computational Linguistics, 33(3):355–396.
  • Binxuan Huang and Kathleen Carley. 2019. Syntax-Aware Aspect Level Sentiment Classification with Graph Attention Networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5469–5477, Hong Kong, China.
  • Chang-Ning Huang and Yan Song. 2015. Chinese CCGbank Construction from Tsinghua Chinese Treebank. Journal of Chinese Linguistics Monograph Series, (25):274–311.
  • Lianzhe Huang, Dehong Ma, Sujian Li, Xiaodong Zhang, and Houfeng Wang. 2019. Text Level Graph Neural Network for Text Classification. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3444–3450, Hong Kong, China.
  • Shonosuke Ishiwatari, Jingtao Yao, Shujie Liu, Mu Li, Ming Zhou, Naoki Yoshinaga, Masaru Kitsuregawa, and Weijia Jia. 2017. Chunk-based Decoder for Neural Machine Translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1901–1912, Vancouver, Canada.
  • Jonathan K. Kummerfeld, Jessika Roesner, Tim Dawborn, James Haggerty, James R. Curran, and Stephen Clark. 2010. Faster Parsing by Supertagger Adaptation. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 345–355.
  • Mike Lewis, Kenton Lee, and Luke Zettlemoyer. 2016. LSTM CCG Parsing. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 221–231, San Diego, California.
  • Mike Lewis and Mark Steedman. 2014a. A* CCG parsing with a supertag-factored model. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 990– 1000, Doha, Qatar.
  • Mike Lewis and Mark Steedman. 2014b. Improved CCG Parsing with Semi-supervised Supertagging. Transactions of the Association for Computational Linguistics, 2:327–338.
  • Diego Marcheggiani and Ivan Titov. 2017. Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1506–1515, Copenhagen, Denmark.
  • Yan Song, Chang-Ning Huang, and Chunyu Kit. 2012. Construction of Chinese CCGbank. Journal of Chinese Information Processing, (3):2.
  • Yan Song, Chunyu Kit, and Xiao Chen. 2009. Transliteration of Name Entity via Improved Statistical Translation on Character Sequences. In Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009), Suntec, Singapore.
  • Yan Song and Fei Xia. 2012. Using a Goodness Measurement for Domain Adaptation: A Case Study on Chinese Word Segmentation. In LREC, pages 3853– 3860.
  • Miloš Stanojević and Mark Steedman. 2019. CCG parsing algorithm with incremental tree rotation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 228–239, Minneapolis, Minnesota.
  • Yuanhe Tian, Yan Song, Xiang Ao, Fei Xia, Xiaojun Quan, Tong Zhang, and Yonggang Wang. 2020a. Joint Chinese Word Segmentation and Partof-speech Tagging via Two-way Attentions of Autoanalyzed Knowledge. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 8286–8296, Online.
  • Yuanhe Tian, Yan Song, Fei Xia, and Tong Zhang. 2020b. Improving Constituency Parsing with Span Attention. In Findings of the 2020 Conference on Empirical Methods in Natural Language Processing.