Analogous Process Structure Induction for Sub-event Sequence Prediction
EMNLP 2020, pp. 1541-1550 (2020)
Computational and cognitive studies of event understanding suggest that identifying, comprehending, and predicting events depend on having structured representations of a sequence of events and on conceptualizing (abstracting) its components into (soft) event categories. Thus, knowledge about a known process such as “buying a car” can be ...
- Understanding events has long been a challenging task in NLP, to which many efforts have been devoted by the community.
- Examples include predicting the next event given an observed event sequence (Radinsky et al., 2012) and identifying the effect of a biological process on the entities involved (Berant et al., 2014).
- These tasks mostly focus on predicting related events in a procedure based on their statistical correlations in previously observed text.
- Simply selecting the most frequently co-occurring event can already offer acceptable performance on the event prediction task (Granroth-Wilding and Clark, 2016).
- We conduct intrinsic and extrinsic evaluations to show that Analogous Process Structure Induction (APSI) can generate meaningful sub-event sequences for unseen processes, which can help predict the missing events
- Similar to ROUGE, which evaluates generation quality based on N-gram token occurrence, we measure what percentage of the sub-events and time-ordered sub-event pairs in the induced sequence are covered by the human-provided references
- In the rest of the intrinsic evaluation, we present more detailed analysis based on the advanced setting and a case study to help better understand the performance of APSI
- Our APSI framework is motivated by the notion of analogous processes, and attempts to transfer knowledge from familiar processes to a new one
- The intrinsic evaluation demonstrates the effectiveness of APSI and the quality of the predicted sub-event sequences
- The authors compare with the following baseline methods: Sequence to sequence (Seq2seq): One intuitive solution to the sub-event sequence prediction task would be modeling it as a sequence to sequence problem, where the process is treated as the input and the sub-event sequence the output.
- For each process or sub-event, the authors leverage pre-trained word embeddings (i.e., GloVe-6b-300d (Pennington et al., 2014)) or language models (i.e., RoBERTa-base (Liu et al., 2019)) as the representation; the resulting models are denoted Seq2seq (GloVe) and Seq2seq (RoBERTa).
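- Top one similar process: for a new process, the sub-event sequence of the most similar observed process can be used directly as the prediction.
- Depending on how process similarity is measured, the authors denote these baselines as Top one similar process (Jaccard), (GloVe), and (RoBERTa), respectively.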
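The "top one similar process" baseline can be illustrated with a minimal sketch. It assumes token-level Jaccard similarity over whitespace-split process names; the function names `jaccard` and `top_one_similar` and the toy `observed` dictionary are hypothetical and are not taken from the paper or its code.

```python
from typing import Dict, List


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard coefficient between two process names."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def top_one_similar(query: str, observed: Dict[str, List[str]]) -> List[str]:
    """Return the sub-event sequence of the observed process most similar to `query`."""
    best = max(observed, key=lambda name: jaccard(query, name))
    return observed[best]


# Toy data (illustrative only, not from the WikiHow dataset).
observed = {
    "buy a car": ["search for a car", "apply for a loan", "pay"],
    "bake a cake": ["mix ingredients", "preheat oven", "bake"],
}
prediction = top_one_similar("buy a house", observed)
print(prediction)
```

Swapping `jaccard` for a cosine similarity over GloVe or RoBERTa representations would presumably give the other two variants, although the exact similarity computation used by the authors is not specified here.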
- The authors conduct intrinsic and extrinsic evaluations to show that APSI can generate meaningful sub-event sequences for unseen processes, which can help predict the missing events.
- Dataset: The authors collect process graphs from the WikiHow website (Koupaee and Wang, 2018).
- In WikiHow, each process is associated with a sequence of temporally ordered human-created steps.
- Motivated by the ROUGE score (Lin, 2004), the authors propose an event-based ROUGE (E-ROUGE) to evaluate the quality of the predicted sub-event sequence.
- Similar to ROUGE, which evaluates generation quality based on N-gram token occurrence, the authors measure what percentage of the sub-events and time-ordered sub-event pairs in the induced sequence are covered by the human-provided references.
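The coverage idea behind this evaluation can be sketched as follows. This is a simplified illustration, not the authors' E-ROUGE implementation: it assumes exact string matching between sub-events and treats every pair of sub-events in which one precedes the other as a "time-ordered pair". The names `ordered_pairs` and `event_coverage` are hypothetical.

```python
from itertools import combinations
from typing import List, Set, Tuple


def ordered_pairs(seq: List[str]) -> Set[Tuple[str, str]]:
    """All (earlier, later) sub-event pairs in a sequence."""
    return set(combinations(seq, 2))


def event_coverage(predicted: List[str], references: List[List[str]]) -> Tuple[float, float]:
    """Fraction of predicted sub-events, and of predicted ordered pairs,
    that also occur in at least one reference sequence (exact match)."""
    ref_events = {event for ref in references for event in ref}
    ref_pairs = {pair for ref in references for pair in ordered_pairs(ref)}
    pred_events = set(predicted)
    pred_pairs = ordered_pairs(predicted)
    single = len(pred_events & ref_events) / len(pred_events) if pred_events else 0.0
    pair = len(pred_pairs & ref_pairs) / len(pred_pairs) if pred_pairs else 0.0
    return single, pair


# Toy example (illustrative only): the prediction has the right sub-events
# but places "pay" before "get loan".
predicted = ["search car", "pay", "get loan"]
references = [["search car", "get loan", "pay"]]
single_score, pair_score = event_coverage(predicted, references)
print(single_score, pair_score)
```

In this toy case every predicted sub-event appears in the reference, but only two of the three ordered pairs do, so the pair-level score penalizes the ordering mistake while the sub-event-level score does not.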
- The authors try to understand events vertically by viewing them as processes and predicting their sub-event sequences.
- The authors' APSI framework is motivated by the notion of analogous processes, and attempts to transfer knowledge from familiar processes to a new one.
- The intrinsic evaluation demonstrates the effectiveness of APSI and the quality of the predicted sub-event sequences.
- The extrinsic evaluation shows that, even with a naive application method, the process knowledge can help better predict missing events
- Table 1: Intrinsic evaluation results of the induced process structures. On average, there are 1.7 human-generated sub-event sequences as references for each test process. The best-performing models are marked in bold.
- Table 2: Performance of different merging methods.
- Table 3: Results on the event prediction task. † and ‡ indicate statistical significance over the baseline with p-values smaller than 0.01 and 0.001, respectively.
- Given the importance of events in understanding human language (e.g., commonsense knowledge (Zhang et al., 2020a)), many efforts have been devoted to defining, representing, and understanding events. For example, VerbNet (Schuler, 2005) created a verb lexicon to represent the semantic relations among verbs, and FrameNet (Baker et al., 1998) proposed representing event semantics with schemas, each of which has one predicate and several arguments. Apart from the structure of events, understanding events by predicting the relations among them has also become a popular research topic (e.g., TimeBank (Pustejovsky et al., 2003) for temporal relations and Event2Mind (Rashkin et al., 2018) for causal relations).
- Different from these horizontal relations between events, this paper proposes to understand events vertically by treating each event as a process and trying to understand what is happening (i.e., the sub-events) inside the target event. Such knowledge is also referred to as event schemata (Zacks and Tversky, 2001) and has been shown to be crucial for how humans understand events (Abbott et al., 1985).
- One line of related work in the NLP community extracts super-sub event relations from textual corpora (Hovy et al., 2013; Glavas et al., 2014). This work differs in that it tries to understand events by directly generating the sub-event sequences rather than extracting such information from text. Another line of related work is narrative schema prediction (Chambers and Jurafsky, 2008), which also assumes that event schemata can help understand events. However, that research uses the overall process implicitly to help predict future events, whereas this work tries to understand events by explicitly modeling the relation between processes and their sub-event sequences.
- This research is supported by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via IARPA Contract No. 201919051600006 under the BETTER Program, and by contract FA8750-19-2-1004 with the US Defense Advanced Research Projects Agency (DARPA).
- This paper is also partially supported by the Early Career Scheme (ECS, No. 26206717), the General Research Fund (GRF, No. 16211520), and the Research Impact Fund (RIF, No. R6020-19) from the Research Grants Council (RGC) of Hong Kong.
- Valerie Abbott, John B Black, and Edward E Smith. 1985. The representation of scripts in memory. Journal of memory and language, pages 179–199.
- Collin F. Baker, Charles J. Fillmore, and John B. Lowe. 1998. The Berkeley FrameNet Project. In Proceedings of COLING-ACL 1998, pages 86–90.
- Jonathan Berant, Vivek Srikumar, Pei-Chun Chen, Abby Vander Linden, Brittany Harding, Brad Huang, Peter Clark, and Christopher D. Manning. 2014. Modeling biological processes for reading comprehension. In Proceedings of EMNLP 2014.
- Alexander Budanitsky and Graeme Hirst. 2006. Evaluating wordnet-based measures of lexical semantic relatedness. Comput. Linguistics, 32(1):13–47.
- Nathanael Chambers and Daniel Jurafsky. 2008. Unsupervised learning of narrative event chains. In Proceedings of ACL 2008, pages 789–797.
- Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT 2019, pages 4171–4186.
- Goran Glavas, Jan Snajder, Marie-Francine Moens, and Parisa Kordjamshidi. 2014. Hieve: A corpus for extracting event hierarchies from news stories. In Proceedings of LREC 2014, pages 3678–3683.
- Mark Granroth-Wilding and Stephen Clark. 2016. What happens next? event prediction using a compositional neural network model. In Proceedings of AAAI 2016, pages 2727–2733.
- Eduard H. Hovy, Teruko Mitamura, Felisa Verdejo, Jun Araki, and Andrew Philpot. 2013. Events are not simple: Identity, non-identity, and quasi-identity. In Proceedings of EVENTS@NAACL-HLT 2013, pages 21–28.
- Richard M. Karp. 1972. Reducibility among combinatorial problems. In Proceedings of a symposium on the Complexity of Computer Computations 1972, pages 85–103.
- Mahnaz Koupaee and William Yang Wang. 2018. Wikihow: A large scale text summarization dataset. CoRR, abs/1810.09305.
- Chin-Yew Lin. 2004. ROUGE: A package for automatic evaluation of summaries. In Proceedings of Text Summarization Branches Out 2004, pages 74–81.
- Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. Roberta: A robustly optimized BERT pretraining approach. CoRR, abs/1907.11692.
- Songjian Lu and Xinghua Lu. 2014. An exact algorithm for the weighed mutually exclusive maximum set cover problem. CoRR, abs/1401.6385.
- George A Miller. 1998. WordNet: An electronic lexical database. MIT press.
- Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. Glove: Global vectors for word representation. In Proceedings of EMNLP 2014, pages 1532–1543.
- James Pustejovsky, Patrick Hanks, Roser Sauri, Andrew See, Robert Gaizauskas, Andrea Setzer, Dragomir Radev, Beth Sundheim, David Day, Lisa Ferro, et al. 2003. The timebank corpus. In Corpus linguistics, page 40.
- Kira Radinsky, Sagie Davidovich, and Shaul Markovitch. 2012. Learning causality for news events prediction. In Proceedings of the 21st World Wide Web Conference 2012, WWW 2012, Lyon, France, April 16-20, 2012, pages 909–918.
- Hannah Rashkin, Maarten Sap, Emily Allaway, Noah A. Smith, and Yejin Choi. 2018. Event2mind: Commonsense inference on events, intents, and reactions. In Proceedings of ACL 2018, pages 463–473.
- DE Rumelhart. 1975. Notes on a schema for stories language, thought, and culture. Representation and understanding, pages 211–236.
- Roger C Schank and Robert P Abelson. 1977.
- Karin Kipper Schuler. 2005. Verbnet: A broad-coverage, comprehensive verb lexicon.
- Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Proceedings of NeurIPS 2014, pages 3104–3112.
- Jeffrey M Zacks and Barbara Tversky. 2001. Event structure in perception and conception. Psychological bulletin, 127(1):3.
- Rowan Zellers, Ari Holtzman, Yonatan Bisk, Ali Farhadi, and Yejin Choi. 2019. Hellaswag: Can a machine really finish your sentence? In Proceedings of ACL 2019, pages 4791–4800.
- Hongming Zhang, Daniel Khashabi, Yangqiu Song, and Dan Roth. 2020a. Transomcs: From linguistic graphs to commonsense knowledge. In Proceedings of IJCAI 2020, pages 4004–4010.
- Hongming Zhang, Xin Liu, Haojie Pan, Yangqiu Song, and Cane Wing-Ki Leung. 2020b. ASER: A large-scale eventuality knowledge graph. In Proceedings of WWW 2020, pages 201–211.