Incremental Event Detection via Knowledge Consolidation Networks

EMNLP 2020, pp. 707–717


Abstract

Conventional approaches to event detection usually require a fixed set of pre-defined event types. Such a requirement is often challenged in real-world applications, as new events continually occur. Due to huge computation cost and storage budget, it is infeasible to store all previous data and re-train the model with all previous data and …

Introduction
  • Event detection (ED) is an important information extraction task that aims to identify event triggers and classify them into specific types (Ahn, 2006; Chen et al., 2015).
  • The authors consider a more realistic incremental learning setting (Ring, 1994; Thrun, 1998), in which a learning system learns from class-incremental data streams where examples of different event classes arrive at different times, as shown in Figure 1 (a protocol sketch follows this list)
  • In such scenarios, it is often infeasible to combine the new data with all previous data and re-train the model, due to issues such as huge computation cost, storage budget, and data privacy (McMahan et al., 2017)
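
To make the setting concrete, here is a minimal Python sketch of the class-incremental protocol described above. It is illustrative only, not the authors' code: the event-class splits are example ACE types, and the training step is a placeholder comment.

```python
# A minimal, illustrative sketch of the class-incremental protocol:
# event classes arrive in disjoint batches, the model is updated only on
# the newly arrived data, and evaluation always covers every class
# observed so far.

from typing import List

class_stream: List[List[str]] = [
    ["Attack", "Transport"],    # classes arriving at time step 1
    ["Die", "Meet"],            # classes arriving at time step 2
    ["Arrest-Jail", "Sue"],     # classes arriving at time step 3
]

observed: List[str] = []
for step, new_classes in enumerate(class_stream, start=1):
    observed.extend(new_classes)
    # In a real system the detector would be fine-tuned here on the data
    # of `new_classes` only; earlier data is assumed unavailable (cost,
    # storage, privacy), which is what causes catastrophic forgetting.
    print(f"step {step}: train on {new_classes}; test on all of {observed}")
```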
Highlights
  • Event detection (ED) is an important information extraction task that aims to identify event triggers and classify them into specific types (Ahn, 2006; Chen et al., 2015)
  • We consider a more realistic incremental learning setting (Ring, 1994; Thrun, 1998), in which a learning system learns from class-incremental data streams where examples of different event classes arrive at different times, as shown in Figure 1
  • We propose a Knowledge Consolidation Network (KCN) for incremental event detection, which is illustrated in Figure 3
  • Every time the model finishes training on the data of the new classes, we report the F1 score on the test data of all observed classes
  • We introduce incremental learning into event detection and propose a knowledge consolidation network to preserve previously learned knowledge
  • Experimental results show that our method outperforms previous state-of-the-art models, achieving improvements of 19% and 13.4% in whole F1 score on the Automatic Content Extraction (ACE) benchmark and the TAC KBP benchmark, respectively
  • To mitigate the adverse effect of class imbalance, we propose hierarchical distillation, which learns previous knowledge from the original model
Methods
  • The authors propose a Knowledge Consolidation Network (KCN) for incremental event detection, which is illustrated in Figure 3.
  • The model consists of three components: 1) Trigger Extractor, 2) Prototype Enhanced Retrospection, and 3) Hierarchical Distillation.
  • The hierarchical distillation transfers previous knowledge from the original model to the current model (a generic sketch follows this list).
  • The authors detail each of these three components.
  • The gap between these incremental learning methods and the UpperBound model indicates that incremental event detection is a very challenging task
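
The hierarchical distillation bullet can be made concrete with a short PyTorch sketch combining a feature-level term and a prediction-level term, matching the "FD"/"PD" split in Table 2. The specific losses, temperature, and weighting below are generic knowledge-distillation choices, not necessarily the paper's exact formulation.

```python
# A hedged sketch of a two-level ("hierarchical") distillation loss.
import torch
import torch.nn.functional as F

def hierarchical_distillation(feat_new: torch.Tensor,
                              feat_old: torch.Tensor,
                              logits_new: torch.Tensor,
                              logits_old: torch.Tensor,
                              n_old_classes: int,
                              temperature: float = 2.0) -> torch.Tensor:
    """feat_*: (batch, hidden) features; logits_*: (batch, n_classes)
    scores from the current model and a frozen copy of the old model."""
    # Feature-level distillation: keep the new encoder's representations
    # close to the old encoder's (here: mean cosine distance).
    fd = (1.0 - F.cosine_similarity(feat_new, feat_old, dim=-1)).mean()

    # Prediction-level distillation: match softened output distributions
    # over the previously learned classes only.
    p_old = F.softmax(logits_old[:, :n_old_classes] / temperature, dim=-1)
    log_p_new = F.log_softmax(logits_new[:, :n_old_classes] / temperature,
                              dim=-1)
    pd = F.kl_div(log_p_new, p_old, reduction="batchmean") * temperature ** 2

    return fd + pd  # combined term, added to the usual task loss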
Results
  • Evaluation Metrics and Implementation Details

    For conventional event detection, Precision (P), Recall (R), and F1 score are used as evaluation metrics.

    For incremental event detection, every time the model finishes training on the data of the new classes, the authors report the F1 score on the test data of all observed classes.
  • After time step k, the result is denoted as F1_k.
  • These results can be plotted as a curve; a minimal sketch of this evaluation loop follows the list.
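
As a concrete illustration of this protocol, the sketch below computes a micro-averaged F1 over all observed classes after each time step using scikit-learn. The toy labels and the exclusion of a "NONE" (non-trigger) class are illustrative assumptions, not the paper's scorer.

```python
# Compute F1_k after each time step over all observed classes.
from sklearn.metrics import f1_score

def f1_after_step(gold: list, pred: list) -> float:
    # micro F1 over trigger predictions; "NONE" (no trigger) is excluded
    labels = sorted({l for l in gold if l != "NONE"})
    return f1_score(gold, pred, labels=labels, average="micro")

# toy usage: the test set grows as more classes are observed
f1_curve = []
f1_curve.append(f1_after_step(["Attack", "NONE"], ["Attack", "NONE"]))
f1_curve.append(f1_after_step(["Attack", "Die", "NONE"],
                              ["Attack", "Attack", "NONE"]))
print(f1_curve)  # the sequence F1_1, F1_2, ... plotted as a curve
```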
Conclusion
  • The effect of the number of reserved samples

    Reserving a few samples has proven very helpful for maintaining performance on old classes (Rebuffi et al., 2017; Wang et al., 2019a); a generic selection sketch follows this list.
  • During training, the authors sample some reserved examples of the Die event type from their approach and from EMR, conducting a case study to qualitatively analyze the effects of the method, as shown in Figure 5.
  • Experimental results demonstrate that the model outperforms previous state-of-the-art methods
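
A common way to pick reserved samples, in the spirit of the prototype-based selection ablated in Table 2, is to keep the examples whose features lie closest to the class mean (the prototype). The sketch below illustrates this idea under that assumption; it is a stand-in, not the authors' exact procedure.

```python
# Prototype-based reservation: keep the m examples nearest the class mean.
import torch

def select_reserved(features: torch.Tensor, m: int) -> torch.Tensor:
    """features: (n, d) encoder outputs for one class's training examples.
    Returns indices of the m examples nearest to the class prototype."""
    prototype = features.mean(dim=0, keepdim=True)        # (1, d)
    dist = torch.cdist(features, prototype).squeeze(1)    # (n,) distances
    return torch.argsort(dist)[:m]                        # m closest first

# toy usage: keep 20 of 100 examples of one event type
feats = torch.randn(100, 768)
reserved_idx = select_reserved(feats, m=20)
```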
Objectives
  • The authors treat every token in a sentence as a trigger candidate and aim to classify each candidate into pre-defined event classes; a minimal sketch follows.
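
A minimal sketch of this objective: a per-token classification head that scores each token over the observed event classes plus a NONE class. The encoder features, hidden size, and class count below are placeholders, not the paper's architecture.

```python
# Every token is a trigger candidate, classified into event types + NONE.
import torch
import torch.nn as nn

class TriggerClassifier(nn.Module):
    def __init__(self, hidden: int = 768, n_classes: int = 34):
        super().__init__()
        # n_classes = observed event types + 1 for NONE (non-trigger)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, token_feats: torch.Tensor) -> torch.Tensor:
        # token_feats: (batch, seq_len, hidden) contextual token features
        return self.head(token_feats)  # (batch, seq_len, n_classes)

# toy usage: each token gets a score for every class
clf = TriggerClassifier()
logits = clf(torch.randn(2, 16, 768))
pred = logits.argmax(dim=-1)  # per-token event-type prediction
```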
Tables
  • Table1: The average F1 (%) on all observed classes (“Avg” column), and whole F1 (%) on the whole testing data (“Whole” column) after the last time step
  • Table2: Ablation studies by removing the main components, where “w/o” indicates without. “PS”, “FD”, “PD” and “HD” refer to “prototype-based selection”, “feature-level distillation”, “prediction-level distillation” and “hierarchical distillation”, respectively. Note that “HD” is the combination of “FD” and “PD”
  • Table3: The effect of the number of reserved samples. We compare our method KCN with the replay-based method EMR on the ACE benchmark
Funding
  • This work is supported by the National Key R&D Program of China (No. 2017YFB1002101), the National Natural Science Foundation of China (No. 61533018, No. 61976211, No. 61806201), and the Key Research Program of the Chinese Academy of Sciences (Grant No. …).
  • This work is also supported by the CCF-Tencent Open Research Fund, a grant from Ant Group, and an independent research project of the National Laboratory of Pattern Recognition.
Reference
  • David Ahn. 2006. The stages of event extraction. In Proceedings of the Workshop on Annotating and Reasoning about Time and Events, pages 1–8.
  • Rahaf Aljundi, Francesca Babiloni, Mohamed Elhoseiny, Marcus Rohrbach, and Tinne Tuytelaars. 2018. Memory aware synapses: Learning what (not) to forget. In Proceedings of the European Conference on Computer Vision, pages 139–154.
  • Francisco M Castro, Manuel J Marín-Jiménez, Nicolás Guil, Cordelia Schmid, and Karteek Alahari. 2018. End-to-end incremental learning. In Proceedings of the European Conference on Computer Vision, pages 233–248.
  • Gert Cauwenberghs and Tomaso Poggio. 2001. Incremental and decremental support vector machine learning. In Advances in Neural Information Processing Systems, pages 409–415.
  • Yubo Chen, Liheng Xu, Kang Liu, Daojian Zeng, and Jun Zhao. 2015. Event extraction via dynamic multi-pooling convolutional neural networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 167–176.
  • Yubo Chen, Hang Yang, Kang Liu, Jun Zhao, and Yantao Jia. 2018. Collective event detection via a hierarchical and bias tagging networks with gated multi-level attention mechanisms. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1267–1276.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1, pages 4171–4186.
  • Ning Ding, Ziran Li, Zhiyuan Liu, Haitao Zheng, and Zibo Lin. 2019. Event detection with trigger-aware lattice neural network. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, pages 347–356.
  • Robert M French. 1999. Catastrophic forgetting in connectionist networks. Trends in Cognitive Sciences, 3(4):128–135.
  • Xu Han, Yi Dai, Tianyu Gao, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, and Jie Zhou. 2020. Continual relation learning via episodic memory activation and reconsolidation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 6429–6440.
  • Haibo He and Edwardo A Garcia. 2009. Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21(9):1263–1284.
  • Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2015. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531.
  • Yu Hong, Jianfeng Zhang, Bin Ma, Jianmin Yao, Guodong Zhou, and Qiaoming Zhu. 2011. Using cross-entity inference to improve event extraction. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Volume 1, pages 1127–1136.
  • Saihui Hou, Xinyu Pan, Chen Change Loy, Zilei Wang, and Dahua Lin. 2018. Lifelong learning via progressive distillation and retrospection. In Proceedings of the European Conference on Computer Vision, pages 437–452.
  • Saihui Hou, Xinyu Pan, Chen Change Loy, Zilei Wang, and Dahua Lin. 2019. Learning a unified classifier incrementally via rebalancing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 831–839.
  • Heng Ji and Ralph Grishman. 2008. Refining event extraction through cross-document inference. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics, pages 254–262.
  • James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, et al. 2017. Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences, 114(13):3521–3526.
  • Ilja Kuzborskij, Francesco Orabona, and Barbara Caputo. 2013. From N to N+1: Multiclass transfer incremental learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3358–3365.
  • Qi Li and Heng Ji. 2014. Incremental joint extraction of entity mentions and relations. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Volume 1, pages 402–412.
  • Qi Li, Heng Ji, and Liang Huang. 2013. Joint event extraction via structured prediction with global features. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Volume 1, pages 73–82.
  • Zhizhong Li and Derek Hoiem. 2017. Learning without forgetting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(12):2935–2947.
  • Shasha Liao and Ralph Grishman. 2010. Using document level cross-event inference to improve event extraction. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 789–797.
  • Jian Liu, Yubo Chen, Kang Liu, and Jun Zhao. 2019. Neural cross-lingual event detection with minimal parallel resources. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing.
  • Shaobo Liu, Rui Cheng, Xiaoming Yu, and Xueqi Cheng. 2018a. Exploiting contextual information via dynamic memory network for event detection. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1030–1035.
  • Shulin Liu, Yubo Chen, Kang Liu, and Jun Zhao. 2017. Exploiting argument information to improve event detection via supervised attention mechanisms. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Volume 1, pages 1789–1798.
  • Wei Liu, Gang Hua, and John R Smith. 2014. Unsupervised one-class learning for automatic outlier removal. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3826–3833.
  • Xiao Liu, Zhunchen Luo, and He-Yan Huang. 2018b. Jointly multiple events extraction via attention-based graph information aggregation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1247–1256.
  • Yaojie Lu, Hongyu Lin, Xianpei Han, and Le Sun. 2019. Distilling discrimination and generalization knowledge for event detection via delta-representation learning. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4366–4376.
  • Michael McCloskey and Neal J Cohen. 1989. Catastrophic interference in connectionist networks: The sequential learning problem. In Psychology of Learning and Motivation, volume 24, pages 109–165. Elsevier.
  • Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. 2017. Communication-efficient learning of deep networks from decentralized data. In Artificial Intelligence and Statistics, pages 1273–1282.
  • Thien Huu Nguyen, Kyunghyun Cho, and Ralph Grishman. 2016. Joint event extraction via recurrent neural networks. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 300–309.
  • Thien Huu Nguyen and Ralph Grishman. 2015. Event detection and domain adaptation with convolutional neural networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Volume 2, pages 365–371.
  • Sylvestre-Alvise Rebuffi, Alexander Kolesnikov, Georg Sperl, and Christoph H Lampert. 2017. iCaRL: Incremental classifier and representation learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2001–2010.
  • Mark Bishop Ring. 1994. Continual learning in reinforcement environments.
  • Jake Snell, Kevin Swersky, and Richard Zemel. 2017. Prototypical networks for few-shot learning. In Advances in Neural Information Processing Systems, pages 4077–4087.
  • Sebastian Thrun. 1998. Lifelong learning algorithms. In Learning to Learn, pages 181–209. Springer.
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, pages 5998–6008.
  • Hong Wang, Wenhan Xiong, Mo Yu, Xiaoxiao Guo, Shiyu Chang, and William Yang Wang. 2019a. Sentence embedding alignment for lifelong relation extraction. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1, pages 796–806.
  • Xiaozhi Wang, Xu Han, Zhiyuan Liu, Maosong Sun, and Peng Li. 2019b. Adversarial training for weakly supervised event detection. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1, pages 998–1008.
  • Yue Wu, Yinpeng Chen, Lijuan Wang, Yuancheng Ye, Zicheng Liu, Yandong Guo, and Yun Fu. 2019. Large scale incremental learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 374–382.
  • Hong-Ming Yang, Xu-Yao Zhang, Fei Yin, and Cheng-Lin Liu. 2018. Robust classification with convolutional prototype learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3474–3482.
  • Sen Yang, Dawei Feng, Linbo Qiao, Zhigang Kan, and Dongsheng Li. 2019. Exploring pre-trained language models for event extraction and generation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5284–5294.
  • Friedemann Zenke, Ben Poole, and Surya Ganguli. 2017. Continual learning through synaptic intelligence. In Proceedings of the 34th International Conference on Machine Learning, Volume 70, pages 3987–3995.
Author
Pengfei Cao
Taifeng Wang