Keywords-Guided Abstractive Sentence Summarization

National Conference on Artificial Intelligence (AAAI), 2020.


Abstract:

We study the problem of generating a summary for a given sentence. Existing research on abstractive sentence summarization ignores that keywords in the input sentence provide significant clues to valuable content, and that humans tend to write summaries covering these keywords. In this paper, we propose an abstractive sentence summarization ...

Introduction
  • Sentence summarization is a task that creates a condensed version of a long sentence.
  • Abstractive summarization is much closer to the way humans write summaries, though it is more challenging.
  • Observation. Input sentence: France and Germany called on world leaders Monday to take rapid action to press for the closure of Ukraine's Chernobyl nuclear plant, site of the world's worst-ever nuclear disaster.
  • Step 1. Extracting keywords: world leaders, closure, Chernobyl.
  • Step 2. Generating a summary guided by the keywords: World leaders called for action on Chernobyl closure (a sketch of step 1 follows this list).
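To make step 1 concrete, below is a minimal, self-contained sketch of graph-based keyword extraction in the spirit of TextRank (Mihalcea and Tarau 2004). The paper's actual extractor, tokenization, stopword list, and ranking details are not given in this digest, so every name and parameter here is illustrative.

```python
# Minimal TextRank-style keyword extraction (illustrative only; the paper's
# actual extractor, tokenization, and hyperparameters may differ).
import networkx as nx

STOPWORDS = {"and", "on", "to", "for", "of", "the"}  # toy stopword list

def extract_keywords(sentence, window=3, top_k=4):
    tokens = [t.lower().strip(",.'") for t in sentence.split()]
    candidates = [t for t in tokens if t and t not in STOPWORDS]

    # Build a co-occurrence graph: words appearing within `window` of each other.
    graph = nx.Graph()
    for i, word in enumerate(candidates):
        for other in candidates[i + 1:i + window]:
            if word != other:
                graph.add_edge(word, other)

    # Rank nodes with PageRank and keep the top-k as keywords.
    scores = nx.pagerank(graph)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

src = ("France and Germany called on world leaders Monday to take rapid action "
       "to press for the closure of Ukraine's Chernobyl nuclear plant, site of "
       "the world's worst ever nuclear disaster.")
print(extract_keywords(src))  # a handful of content words, e.g. 'chernobyl', 'closure'
```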
Highlights
  • Sentence summarization is a task that creates a condensed version of a long sentence
  • We propose an abstractive sentence summarization method guided by the keywords in the original sentence
  • This paper addresses the sentence summarization task, namely, how to transform a sentence into a short-length summary
  • We propose a dual-copy mechanism to copy the words from the input sentence and the keywords
  • Experimental results on a standard dataset verify the effectiveness of keywords for the sentence summarization task
  • The results are shown in Table 2. With this oracle setting, ROUGE score improvements are more than 20% over the seq2seq model
  • Oracle testing with ground-truth keywords leads to an absolute 20% ROUGE-2 improvement over the baseline, indicating a promising future direction based on keyword extraction for the sentence summarization task
Methods
  • ABS. Rush, Chopra, and Weston (2015) use an attentive CNN encoder and a neural network language model decoder to summarize a sentence.
  • SEASS. Zhou et al. (2017) present a selective encoding model to control the information flow from the encoder to the decoder.
  • PG. See, Liu, and Manning (2017) introduce a hybrid pointer-generator model that can copy words from the source sentence via pointing (the dual-copy sketch after this list builds on the same idea).
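The paper's dual-copy mechanism extends this pointer-generator idea to two copy sources, the input sentence and the extracted keywords. The sketch below shows one way such a three-way mixture over an extended vocabulary could look; the gate parameterization, names, and shapes are assumptions for illustration, not the paper's exact formulation.

```python
# Illustrative dual-copy output distribution (not the paper's exact equations).
# Assumed inputs per decoding step:
#   p_vocab   : generation distribution over a fixed vocabulary, shape [V]
#   attn_sent : attention over source-sentence tokens, shape [n_src]
#   attn_key  : attention over keyword tokens, shape [n_key]
#   src_ids, key_ids : token ids of source / keyword positions in the extended vocab
#   gates     : mixture weights (generate, copy-src, copy-key), assumed to sum to 1
import numpy as np

def dual_copy_distribution(p_vocab, attn_sent, attn_key, src_ids, key_ids,
                           gates, extended_vocab_size):
    p_gen, p_copy_src, p_copy_key = gates

    p_final = np.zeros(extended_vocab_size)
    p_final[:len(p_vocab)] += p_gen * p_vocab

    # Copy from the source sentence: scatter-add attention mass onto token ids.
    np.add.at(p_final, src_ids, p_copy_src * attn_sent)
    # Copy from the keywords: a second, independent copy source.
    np.add.at(p_final, key_ids, p_copy_key * attn_key)
    return p_final  # sums to 1 if p_vocab, attn_sent, attn_key each sum to 1

# Toy usage
p = dual_copy_distribution(
    p_vocab=np.array([0.7, 0.2, 0.1]),
    attn_sent=np.array([0.5, 0.5]), attn_key=np.array([1.0]),
    src_ids=np.array([1, 3]), key_ids=np.array([3]),
    gates=(0.6, 0.3, 0.1), extended_vocab_size=4)
print(p, p.sum())
```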
Results
  • Table 1 shows that the proposed models perform better than the models without keyword guidance.
  • Among the models with different selective mechanisms, the experimental results improve steadily as more keyword guidance signals are added into the models, from Self-Selective to Co-Selective.
  • The models with Hierarchical fusion exhibit advantages over those with other fusions.
  • The model with Co-Selective encoding, Hierarchical fusion decoding, and DualCopy obtains the highest ROUGE score, outperforming S2S-Sentence by an absolute 2.05% ROUGE-1 score (see Table 1 for the full R-1/R-2/R-L comparison; a ROUGE scoring sketch follows this list).
  • S2S-Keywords, which takes only the keywords as input, degrades performance, showing that the information missing from the keywords is still necessary
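ROUGE-1/2/L are the metrics behind these comparisons. The paper's numbers come from the official ROUGE script (Lin 2004); the snippet below uses the Python rouge-score package only as a convenient stand-in to show what is being measured.

```python
# Quick ROUGE check with the `rouge-score` package (pip install rouge-score).
# The paper reports scores from the official ROUGE script; this is only a stand-in.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
reference = "world leaders called for action on chernobyl closure"
candidate = "world leaders urged to act on chernobyl closure"

for name, score in scorer.score(reference, candidate).items():
    print(f"{name}: F1 = {score.fmeasure:.4f}")
```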
Conclusion
  • To get better insights into the model, the authors conduct further analysis on (1) upper bound performance, (2) multi-task learning, (3) fine-tuning, and (4) the selective encoding mechanism.

    Upper Bound Performance

    The authors explore the upper bound performance of the keyword-guided sentence summarization model.
  • They do this by directly using the ground-truth keywords for both training and testing (a sketch of constructing such oracle keywords follows this list).
  • Oracle testing with ground-truth keywords leads to an absolute 20% ROUGE-2 improvement over the baseline, indicating a promising future direction based on keyword extraction for the sentence summarization task
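How the ground-truth keywords for this oracle setting are defined is not spelled out in this digest; a common and plausible choice is the overlapping content words between the source sentence and the reference summary, which is what the sketch below assumes.

```python
# Illustrative construction of "oracle" keywords for the upper-bound experiment,
# assuming ground-truth keywords = non-stopword tokens shared by source and reference.
STOPWORDS = {"the", "a", "an", "of", "on", "for", "to", "and", "in"}

def oracle_keywords(source, reference):
    src_tokens = [t.lower().strip(",.") for t in source.split()]
    ref_tokens = {t.lower().strip(",.") for t in reference.split()}
    # Keep source order; keep only content words that also appear in the reference.
    return [t for t in src_tokens if t in ref_tokens and t not in STOPWORDS]

src = ("france and germany called on world leaders monday to take rapid action to "
       "press for the closure of ukraine 's chernobyl nuclear plant")
ref = "world leaders called for action on chernobyl closure"
print(oracle_keywords(src, ref))
# ['called', 'world', 'leaders', 'action', 'closure', 'chernobyl']
```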
Tables
  • Table 1: Main results (%). Concat, Gated, and Hier denote Concatenation, Gated, and Hierarchical Fusion, respectively. Our Co-Selective+Hier+DualPG model performs significantly better than the other baselines at the 95% confidence interval of the ROUGE script
  • Table 2: Upper bound performance with the ground-truth keywords used for both training and testing
  • Table 3: Heat maps for our model with co-selective and self-selective encoding
  • Table 4: Comparison of MTL and two-stage learning for keyword extraction. Accuracy is over all words; the F1 score is over the keywords
  • Table 5: Manual evaluation
Related work
  • Abstractive Text Summarization

    The seq2seq model is the dominant framework for abstractive text summarization. Rush, Chopra, and Weston (2015) are the first to apply the seq2seq model to abstractive sentence summarization; they propose an attentive CNN encoder and a neural network language model (Bengio et al. 2003) decoder. Chopra, Auli, and Rush (2016) and Nallapati et al. (2016) further extend the RNN-based summarization model. Gu et al. (2016) and Zeng et al. (2016) incorporate copy mechanisms into the seq2seq framework.

    [Figure: model architecture showing the keyword extractor over the encoder states, dual attention over the sentence and the keywords, and the dual-copy mechanism feeding the output vocabulary distribution; an illustrative sketch of such a keyword tagger follows below]
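The architecture sketched in the figure includes a keyword extractor operating over the encoder states. This digest does not give its exact form; the reference to BiLSTM-CRF sequence tagging (Huang, Xu, and Yu 2015) and the keyword-extraction accuracy/F1 reported in Table 4 suggest a word-level tagging formulation, so the module below is an illustrative BiLSTM binary tagger with assumed dimensions, not the authors' exact design.

```python
# Illustrative keyword extractor: a BiLSTM that tags each source word as
# keyword / non-keyword (a simplification; no CRF layer, assumed dimensions).
import torch
import torch.nn as nn

class KeywordTagger(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, 2)  # keyword vs. not

    def forward(self, token_ids):
        # token_ids: [batch, seq_len] -> per-word keyword logits [batch, seq_len, 2]
        hidden, _ = self.bilstm(self.embed(token_ids))
        return self.classifier(hidden)

# Toy usage: tag a batch of two 5-token "sentences".
model = KeywordTagger(vocab_size=1000)
logits = model(torch.randint(0, 1000, (2, 5)))
is_keyword = logits.argmax(dim=-1)  # [2, 5] binary keyword decisions
print(is_keyword.shape)
```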
Funding
  • The research work described in this paper has been supported by the National Key Research and Development Program of China under Grant No. 2017YFC0820700
Selective Encoding Mechanism
  • Following Zhou et al. (2017), the authors visualize the selective gate values with salience heat maps, shown in Table 3. For the model with the co-selective gate, the important words are selected correctly with the aid of the keywords, while the output of the model with the self-selective gate mismatches the reference summary because of inaccurate selective gate values (an illustrative gate of this form is sketched below).
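For reference, the selective gate of Zhou et al. (2017) filters each encoder state with a sigmoid gate computed from that state and a whole-sentence representation; a co-selective variant would presumably also condition on a keyword representation. The sketch below is an illustrative gate of that form, with assumed shapes and parameter names, not the paper's exact equations.

```python
# Illustrative (co-)selective gate: filter encoder states h_i with a sigmoid gate
# conditioned on a sentence vector s and a keyword vector k.
# Shapes and parameterization are assumptions, not the paper's exact formulation.
import torch
import torch.nn as nn

class CoSelectiveGate(nn.Module):
    def __init__(self, hidden_dim):
        super().__init__()
        self.w_h = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.w_s = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.w_k = nn.Linear(hidden_dim, hidden_dim, bias=True)

    def forward(self, enc_states, sent_vec, key_vec):
        # enc_states: [batch, seq, hid]; sent_vec, key_vec: [batch, hid]
        gate = torch.sigmoid(self.w_h(enc_states)
                             + self.w_s(sent_vec).unsqueeze(1)
                             + self.w_k(key_vec).unsqueeze(1))
        return enc_states * gate  # element-wise filtered encoder states

gate = CoSelectiveGate(hidden_dim=8)
filtered = gate(torch.randn(2, 5, 8), torch.randn(2, 8), torch.randn(2, 8))
print(filtered.shape)  # torch.Size([2, 5, 8])
```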
References
  • Bahdanau, D.; Cho, K.; and Bengio, Y. 2015. Neural machine translation by jointly learning to align and translate. In Proceedings of ICLR.
  • Bengio, Y.; Ducharme, R.; Vincent, P.; and Janvin, C. 2003. A neural probabilistic language model. JMLR 3:1137–1155.
  • Cao, Z.; Wei, F.; Li, W.; and Li, S. 2018. Faithful to the original: Fact aware neural abstractive summarization. In Proceedings of AAAI.
  • Chen, Q.; Zhu, X.; Ling, Z.; Wei, S.; and Jiang, H. 2016. Distraction-based neural networks for modeling documents. In Proceedings of IJCAI, 2754–2760.
  • Cheng, J., and Lapata, M. 2016. Neural summarization by extracting sentences and words. In Proceedings of ACL.
  • Chopra, S.; Auli, M.; and Rush, A. M. 2016. Abstractive sentence summarization with attentive recurrent neural networks. In Proceedings of NAACL, 93–98.
  • Clarke, J. 2008. Global inference for sentence compression: an integer linear programming approach. Ph.D. Dissertation, University of Edinburgh, UK.
  • Dong, Y.; Shen, Y.; Crawford, E.; van Hoof, H.; and Cheung, J. C. K. 2018. BanditSum: Extractive summarization as a contextual bandit. In Proceedings of EMNLP, 3739–3748.
  • Gehrmann, S.; Deng, Y.; and Rush, A. 2018. Bottom-up abstractive summarization. In Proceedings of EMNLP.
  • Gu, J.; Lu, Z.; Li, H.; and Li, V. O. 2016. Incorporating copying mechanism in sequence-to-sequence learning. In Proceedings of ACL, 1631–1640.
  • Gulcehre, C.; Ahn, S.; Nallapati, R.; Zhou, B.; and Bengio, Y. 2016. Pointing the unknown words. In Proceedings of ACL, 140–149.
  • Huang, Z.; Xu, W.; and Yu, K. 2015. Bidirectional LSTM-CRF models for sequence tagging. arXiv:1508.01991.
  • Jadhav, A., and Rajan, V. 2018. Extractive summarization with SWAP-NET: Sentences and words from alternating pointer networks. In Proceedings of ACL.
  • Lebanoff, L.; Song, K.; and Liu, F. 2018. Adapting the neural encoder-decoder framework from single to multi-document summarization. In Proceedings of EMNLP.
  • Li, C.; Xu, W.; Li, S.; and Gao, S. 2018a. Guiding generation for abstractive text summarization based on key information guide network. In Proceedings of NAACL.
  • Li, H.; Zhu, J.; Zhang, J.; and Zong, C. 2018b. Ensure the correctness of the summary: Incorporate entailment knowledge into abstractive sentence summarization. In Proceedings of COLING, 1430–1441.
  • Lin, C.-Y. 2004. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out.
  • Mihalcea, R., and Tarau, P. 2004. TextRank: Bringing order into text. In Proceedings of EMNLP.
  • Nallapati, R.; Zhou, B.; dos Santos, C.; Gulcehre, C.; and Xiang, B. 2016. Abstractive text summarization using sequence-to-sequence RNNs and beyond. In Proceedings of CoNLL, 280–290.
  • Narayan, S.; Cohen, S. B.; and Lapata, M. 2018. Don't give me the details, just the summary! Topic-aware convolutional neural networks for extreme summarization. In Proceedings of EMNLP.
  • Pascanu, R.; Mikolov, T.; and Bengio, Y. 2013. On the difficulty of training recurrent neural networks. In Proceedings of ICML, 1310–1318.
  • Pasunuru, R.; Guo, H.; and Bansal, M. 2017. Towards improving abstractive summarization via entailment generation. In Proceedings of the Workshop on New Frontiers in Summarization.
  • Rush, A. M.; Chopra, S.; and Weston, J. 2015. A neural attention model for abstractive sentence summarization. In Proceedings of EMNLP, 379–389.
  • Saggion, H., and Lapalme, G. 2002. Generating indicative-informative summaries with SumUM. Computational Linguistics 28(4):497–526.
  • See, A.; Liu, P. J.; and Manning, C. D. 2017. Get to the point: Summarization with pointer-generator networks. In Proceedings of ACL, 1073–1083.
  • Srivastava, N.; Hinton, G. E.; Krizhevsky, A.; Sutskever, I.; and Salakhutdinov, R. 2014. Dropout: A simple way to prevent neural networks from overfitting. JMLR.
  • Takase, S.; Suzuki, J.; Okazaki, N.; Hirao, T.; and Nagata, M. 2016. Neural headline generation on abstract meaning representation. In Proceedings of EMNLP.
  • Tan, J.; Wan, X.; and Xiao, J. 2017. Abstractive document summarization with a graph-based attentional neural model. In Proceedings of ACL, 1171–1181.
  • Vinyals, O.; Fortunato, M.; and Jaitly, N. 2015. Pointer networks. In NeurIPS, 2692–2700.
  • Wan, X.; Yang, J.; and Xiao, J. 2007. Towards an iterative reinforcement approach for simultaneous document summarization and keyword extraction. In Proceedings of ACL.
  • Wang, L., and Cardie, C. 2013. Domain-independent abstract generation for focused meeting summarization. In Proceedings of ACL.
  • Zeng, W.; Luo, W.; Fidler, S.; and Urtasun, R. 2016. Efficient summarization with read-again and copy mechanism. arXiv:1611.03382.
  • Zhang, X.; Lapata, M.; Wei, F.; and Zhou, M. 2018. Neural latent extractive document summarization. In Proceedings of EMNLP.
  • Zhang, Y.; Zincir-Heywood, A. N.; and Milios, E. E. 2004. World wide web site summarization. Web Intelligence and Agent Systems 2(1):39–53.
  • Zhou, Q.; Yang, N.; Wei, F.; and Zhou, M. 2017. Selective encoding for abstractive sentence summarization. In Proceedings of ACL, 1095–1104.
  • Zhu, J.; Wang, Q.; Wang, Y.; Zhou, Y.; Zhang, J.; Wang, S.; and Zong, C. 2019. NCLS: Neural cross-lingual summarization. In Proceedings of EMNLP, 3045–3055.