Keywords-Guided Abstractive Sentence Summarization
AAAI Conference on Artificial Intelligence, 2020.
Abstract:
We study the problem of generating a summary for a given sentence. Existing research on abstractive sentence summarization ignores the fact that keywords in the input sentence provide significant clues to valuable content, and that humans tend to write summaries covering these keywords. In this paper, we propose an abstractive sentence summarization method guided by the keywords in the input sentence.
Introduction
- Sentence summarization is a task that creates a condensed version of a long sentence.
- Abstractive summarization is much closer to the way humans write summaries, but it is more challenging.
- Observation. Input sentence: France and Germany called on world leaders Monday to take rapid action to press for the closure of Ukraine's Chernobyl nuclear plant, site of the world's worst ever nuclear disaster.
- Step 1, extracting keywords: world leaders, closure, Chernobyl.
- Step 2, generating a summary guided by the keywords: World leaders called for action on Chernobyl closure (see the pipeline sketch below).
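As a rough illustration of this two-step process, here is a minimal Python sketch. The keyword extractor below is a crude frequency-and-stop-word heuristic standing in for the paper's learned extractor, and `generate_summary` is only a placeholder for the keyword-guided decoder; both function names are hypothetical, not the authors' code.

```python
# Minimal sketch of the two-step, keywords-guided pipeline.
# The extractor is a crude heuristic; the actual model learns keyword extraction.
import re
from collections import Counter

STOP_WORDS = {"and", "on", "to", "for", "the", "of", "a", "an", "s"}

def extract_keywords(sentence, k=4):
    """Step 1 (hypothetical helper): pick salient content words."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    content = [t for t in tokens if t not in STOP_WORDS and len(t) > 2]
    return [word for word, _ in Counter(content).most_common(k)]

def generate_summary(sentence, keywords):
    """Step 2 (placeholder): decode a summary guided by the keywords."""
    raise NotImplementedError("seq2seq decoder with dual attention goes here")

src = ("France and Germany called on world leaders Monday to take rapid action "
       "to press for the closure of Ukraine's Chernobyl nuclear plant, site of "
       "the world's worst ever nuclear disaster.")
print(extract_keywords(src))  # prints a crude frequency-based keyword guess
```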
Highlights
- Sentence summarization is a task that creates a condensed version of a long sentence
- We propose an abstractive sentence summarization method guided by the keywords in the original sentence
- This paper addresses the sentence summarization task, namely, how to transform a sentence into a short-length summary
- We propose a dual-copy mechanism to copy the words from the input sentence and the keywords
- Experimental results on a standard dataset verify the effectiveness of keywords for the sentence summarization task
- The results are shown in Table 2. With this oracle setting, ROUGE score improvements are more than 20% over the seq2seq model
- Oracle testing with ground-truth keywords leads to an absolute 20% ROUGE-2 improvement over the baseline, indicating that keyword extraction is a promising future direction for the sentence summarization task
Methods
- ABS. Rush, Chopra, and Weston (2015) use an attentive CNN encoder and a neural network language model decoder to summarize a sentence.
- SEASS. Zhou et al. (2017) present a selective encoding model to control the information flow from the encoder to the decoder (a minimal sketch of the selective gate follows this list).
- PG. See, Liu, and Manning (2017) introduce a hybrid pointer-generator model that can copy words from the source sentence via pointing.
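For concreteness, here is a minimal PyTorch sketch of the selective-gate idea behind SEASS: a gate computed from each encoder state and a whole-sentence representation filters the encoder outputs before decoding. The gate parameterization and the mean-pooled sentence vector are illustrative simplifications, not the exact equations of Zhou et al. (2017).

```python
# Sketch of a SEASS-style selective encoding gate (illustrative dimensions).
import torch
import torch.nn as nn

class SelectiveGate(nn.Module):
    def __init__(self, hidden):
        super().__init__()
        self.w_h = nn.Linear(hidden, hidden, bias=False)  # per-token term
        self.w_s = nn.Linear(hidden, hidden, bias=True)   # sentence-level term

    def forward(self, enc_states, sent_repr):
        # enc_states: (batch, seq_len, hidden); sent_repr: (batch, hidden)
        gate = torch.sigmoid(self.w_h(enc_states) + self.w_s(sent_repr).unsqueeze(1))
        return enc_states * gate  # "selected" second-level representations

enc = torch.randn(2, 10, 256)          # toy encoder outputs
sent = enc.mean(dim=1)                 # crude stand-in for the sentence vector
selected = SelectiveGate(256)(enc, sent)
print(selected.shape)                  # torch.Size([2, 10, 256])
```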
Results
- Table 1 shows that the proposed models perform better than the models without keyword guidance.
- Among the models with different selective mechanisms, the results improve steadily as more keyword guidance signals are added, from Self-Selective to Co-Selective.
- The models with Hierarchical fusion exhibit advantages over those with the other fusions.
- The model with Co-Selective encoding, Hierarchical fusion decoding, and DualCopy obtains the highest ROUGE scores, outperforming S2S-Sentence by an absolute 2.05% in ROUGE-1 (a ROUGE scoring sketch follows this list).
- S2S-Keywords, which takes only the keywords as input, degrades performance, showing that the information missing from the keywords is also necessary
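The ROUGE-1/2/L comparisons in Table 1 come from the official ROUGE script, but they can be reproduced in spirit with the `rouge-score` Python package. The snippet below, with made-up reference and system strings, is only a scoring sketch, not the paper's evaluation setup.

```python
# Sketch: scoring a generated summary against the reference with ROUGE-1/2/L,
# assuming the `rouge-score` package (pip install rouge-score) is installed.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
reference = "world leaders called for action on chernobyl closure"
prediction = "world leaders urge rapid action on chernobyl closure"
scores = scorer.score(reference, prediction)
for name, s in scores.items():
    print(f"{name}: F1 = {s.fmeasure:.4f}")
```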
Conclusion
- To get better insights into the model, the authors conduct further analysis on (1) upper bound performance, (2) multi-task learning, (3) fine-tuning, and (4) selective encoding mechanism.
Upper Bound Performance
- The authors explore the upper bound performance of the keyword-guided sentence summarization model.
- They do this by directly using the ground-truth keywords for both training and testing.
- Oracle testing with ground-truth keywords leads to an absolute 20% ROUGE-2 improvement over the baseline, indicating that keyword extraction is a promising future direction for the sentence summarization task (a sketch of constructing such oracle keywords follows).
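A common way to build such oracle keywords, and our reading of what the upper-bound experiment uses, is to take the source-sentence words that also appear in the reference summary. The stop-word handling and function name below are assumptions for illustration, not the paper's exact procedure.

```python
# Sketch: oracle ("ground-truth") keywords as the overlap between the source
# sentence and the reference summary. Stop-word filtering is an assumption.
STOP_WORDS = {"the", "a", "an", "on", "of", "to", "for", "and", "in", "'s"}

def oracle_keywords(source, reference):
    ref_vocab = {w.lower() for w in reference.split()} - STOP_WORDS
    seen, keywords = set(), []
    for w in source.split():          # keep source order, drop duplicates
        lw = w.lower()
        if lw in ref_vocab and lw not in seen:
            seen.add(lw)
            keywords.append(w)
    return keywords

src = ("France and Germany called on world leaders Monday to take rapid action "
       "to press for the closure of Ukraine 's Chernobyl nuclear plant")
ref = "world leaders called for action on chernobyl closure"
print(oracle_keywords(src, ref))
# ['called', 'world', 'leaders', 'action', 'closure', 'Chernobyl']
```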
Tables
- Table 1: Main results (%). Concat, Gated, and Hier denote Concatenation, Gated, and Hierarchical fusion, respectively. Our Co-Selective+Hier+DualPG model performs significantly better than the other baselines at the 95% confidence level of the ROUGE script
- Table 2: Upper bound performance with the ground-truth keywords used for both training and testing
- Table 3: Heat maps for our model with co-selective and self-selective encoding
- Table 4: Comparison of MTL and two-stage learning for keyword extraction. Accuracy is over all words, and the F1 score is over the keywords
- Table 5: Manual evaluation
Related work
- Abstractive Text Summarization
Seq2seq models are the dominant framework for abstractive text summarization. Rush, Chopra, and Weston (2015) are the first to apply the seq2seq model to abstractive sentence summarization; they propose an attentive CNN encoder and a neural network language model (Bengio et al. 2003) decoder. Chopra, Auli, and Rush (2016) and Nallapati et al. (2016) further extend the RNN-based summarization model. Gu et al. (2016) and Zeng et al. (2016) incorporate copy mechanisms into the seq2seq framework.
[Figure: model architecture showing the keyword extractor, the sentence and keyword attention distributions, and the dual-copy mixture over the vocabulary distribution.]
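The architecture in the figure combines a vocabulary distribution with two copy distributions, one over the source sentence and one over the keywords. A minimal sketch of such a dual-copy mixture, in the spirit of the pointer-generator of See, Liu, and Manning (2017), is shown below; the three-way mixture gate and its parameterization are illustrative assumptions rather than the paper's exact equations.

```python
# Sketch of a dual-copy output distribution: mix the generator's vocabulary
# distribution with copy distributions over the source tokens and the keywords.
import torch
import torch.nn.functional as F

def dual_copy_distribution(vocab_logits, sent_attn, key_attn,
                           sent_ids, key_ids, mix_logits):
    """
    vocab_logits: (batch, vocab_size)  generator scores
    sent_attn:    (batch, src_len)     attention over source tokens
    key_attn:     (batch, key_len)     attention over keyword tokens
    sent_ids:     (batch, src_len)     vocabulary ids of source tokens
    key_ids:      (batch, key_len)     vocabulary ids of keyword tokens
    mix_logits:   (batch, 3)           scores for [generate, copy-source, copy-keywords]
    """
    p_gen, p_src, p_key = F.softmax(mix_logits, dim=-1).unbind(dim=-1)
    dist = p_gen.unsqueeze(1) * F.softmax(vocab_logits, dim=-1)
    # scatter the copy probabilities back onto the shared vocabulary
    dist = dist.scatter_add(1, sent_ids, p_src.unsqueeze(1) * sent_attn)
    dist = dist.scatter_add(1, key_ids, p_key.unsqueeze(1) * key_attn)
    return dist  # (batch, vocab_size), sums to 1 per example

V, B, S, K = 50, 2, 7, 3
out = dual_copy_distribution(
    torch.randn(B, V),
    F.softmax(torch.randn(B, S), dim=-1),
    F.softmax(torch.randn(B, K), dim=-1),
    torch.randint(0, V, (B, S)),
    torch.randint(0, V, (B, K)),
    torch.randn(B, 3),
)
print(out.sum(dim=-1))  # ~1.0 for each example
```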
Funding
- The research work described in this paper has been supported by the National Key Research and Development Program of China under Grant No. 2017YFC0820700
Selective Encoding Mechanism
- Following Zhou et al. (2017), the authors visualize the selective gate values with the salience heat maps shown in Table 3. For the model with the co-selective gate, the important words are selected correctly with the aid of the keywords, while the output of the model with the self-selective gate mismatches the reference summary because of inaccurate selective values.
References
- Bahdanau, D.; Cho, K.; and Bengio, Y. 2015. Neural machine translation by jointly learning to align and translate. In Proceedings of ICLR.
- Bengio, Y.; Ducharme, R.; Vincent, P.; and Janvin, C. 2003. A neural probabilistic language model. JMLR 3:1137–1155.
- Cao, Z.; Wei, F.; Li, W.; and Li, S. 2018. Faithful to the original: Fact aware neural abstractive summarization. In Proceedings of AAAI.
- Chen, Q.; Zhu, X.; Ling, Z.; Wei, S.; and Jiang, H. 2016. Distraction-based neural networks for modeling documents. In Proceedings of IJCAI, 2754–2760.
- Cheng, J., and Lapata, M. 2016. Neural summarization by extracting sentences and words. In Proceedings of ACL.
- Chopra, S.; Auli, M.; and Rush, A. M. 2016. Abstractive sentence summarization with attentive recurrent neural networks. In Proceedings of NAACL, 93–98.
- Clarke, J. 2008. Global inference for sentence compression: an integer linear programming approach. Ph.D. Dissertation, University of Edinburgh, UK.
- Dong, Y.; Shen, Y.; Crawford, E.; van Hoof, H.; and Cheung, J. C. K. 2018. Banditsum: Extractive summarization as a contextual bandit. In Proceedings of EMNLP, 3739–3748.
- Gehrmann, S.; Deng, Y.; and Rush, A. 2018. Bottom-up abstractive summarization. In Proceedings of EMNLP.
- Gu, J.; Lu, Z.; Li, H.; and Li, V. O. 2016. Incorporating copying mechanism in sequence-to-sequence learning. In Proceedings of ACL, 1631–1640.
- Gulcehre, C.; Ahn, S.; Nallapati, R.; Zhou, B.; and Bengio, Y. 2016. Pointing the unknown words. In Proceedings of ACL, 140– 149.
- Huang, Z.; Xu, W.; and Yu, K. 2015. Bidirectional LSTM-CRF models for sequence tagging. arXiv:1508.01991.
- Jadhav, A., and Rajan, V. 2018. Extractive summarization with SWAP-NET: Sentences and words from alternating pointer networks. In Proceedings of ACL.
- Lebanoff, L.; Song, K.; and Liu, F. 2018. Adapting the neural encoder-decoder framework from single to multi-document summarization. In Proceedings of EMNLP.
- Li, C.; Xu, W.; Li, S.; and Gao, S. 2018a. Guiding generation for abstractive text summarization based on key information guide network. In Proceedings of NAACL.
- Li, H.; Zhu, J.; Zhang, J.; and Zong, C. 2018b. Ensure the correctness of the summary: Incorporate entailment knowledge into abstractive sentence summarization. In Proceedings of COLING, 1430–1441.
- Lin, C.-Y. 2004. Rouge: A package for automatic evaluation of summaries. In Text Summarization Branches Out.
- Mihalcea, R., and Tarau, P. 2004. Textrank: Bringing order into text. In Proceedings of EMNLP.
- Nallapati, R.; Zhou, B.; dos Santos, C.; Gulcehre, C.; and Xiang, B. 2016. Abstractive text summarization using sequence-to-sequence rnns and beyond. In Proceedings of CoNLL, 280–290.
- Narayan, S.; Cohen, S. B.; and Lapata, M. 2018. Don’t give me the details, just the summary! topic-aware convolutional neural networks for extreme summarization. In Proceedings of EMNLP.
- Pascanu, R.; Mikolov, T.; and Bengio, Y. 2013. On the difficulty of training recurrent neural networks. In Proceedings of ICML, 1310–1318.
- Pasunuru, R.; Guo, H.; and Bansal, M. 2017. Towards improving abstractive summarization via entailment generation. In Proceedings of the Workshop on New Frontiers in Summarization.
- Rush, A. M.; Chopra, S.; and Weston, J. 2015. A neural attention model for abstractive sentence summarization. In Proceedings of EMNLP, 379–389.
- Saggion, H., and Lapalme, G. 2002. Generating indicative-informative summaries with SumUM. Computational Linguistics 28(4):497–526.
- See, A.; Liu, P. J.; and Manning, C. D. 2017. Get to the point: Summarization with pointer-generator networks. In Proceedings of ACL, 1073–1083.
- Srivastava, N.; Hinton, G. E.; Krizhevsky, A.; Sutskever, I.; and Salakhutdinov, R. 2014. Dropout: a simple way to prevent neural networks from overfitting. JMLR.
- Takase, S.; Suzuki, J.; Okazaki, N.; Hirao, T.; and Nagata, M. 2016. Neural headline generation on abstract meaning representation. In Proceedings of EMNLP.
- Tan, J.; Wan, X.; and Xiao, J. 2017. Abstractive document summarization with a graph-based attentional neural model. In Proceedings of ACL, 1171–1181.
- Vinyals, O.; Fortunato, M.; and Jaitly, N. 2015. Pointer networks. In Proceedings of NeurIPS, 2692–2700.
- Wan, X.; Yang, J.; and Xiao, J. 2007. Towards an iterative reinforcement approach for simultaneous document summarization and keyword extraction. In Proceedings of ACL.
- Wang, L., and Cardie, C. 2013. Domain-independent abstract generation for focused meeting summarization. In Proceedings of ACL.
- Zeng, W.; Luo, W.; Fidler, S.; and Urtasun, R. 2016. Efficient summarization with read-again and copy mechanism. arXiv:1611.03382.
- Zhang, X.; Lapata, M.; Wei, F.; and Zhou, M. 2018. Neural latent extractive document summarization. In Proceedings of EMNLP.
- Zhang, Y.; Zincir-Heywood, A. N.; and Milios, E. E. 2004. World wide web site summarization. Web Intelligence and Agent Systems 2(1):39–53.
- Zhou, Q.; Yang, N.; Wei, F.; and Zhou, M. 2017. Selective encoding for abstractive sentence summarization. In Proceedings of ACL, 1095–1104.
- Zhu, J.; Wang, Q.; Wang, Y.; Zhou, Y.; Zhang, J.; Wang, S.; and Zong, C. 2019. NCLS: Neural cross-lingual summarization. In Proceedings of EMNLP, 3045–3055.