Unsupervised Summarization for Chat Logs with Topic-Oriented Ranking and Context-Aware Auto-Encoders


Abstract:

Automatic chat summarization can help people quickly grasp important information from numerous chat messages. Unlike conventional documents, chat logs usually have fragmented and evolving topics. In addition, these logs contain many elliptical and interrogative sentences, which make chat summarization highly context-dependent. […]

Introduction
  • The goal of text summarization is to generate a succinct summary while retaining a document’s essential information.
  • Despite the considerable research on similar dialogue forms, such as meetings and telephone records (Zechner 2001; Gurevych and Strube 2004; Gillick et al. 2009; Shang et al. 2018), chat summarization has its own characteristics.
  • Compared with other dialogue forms, chat logs are generally pure text without audio or transcription information; they tend to be much shorter and more unstructured, and to contain more spelling mistakes, hyperlinks, and acronyms (Uthus and Aha 2011; Koto 2016).
Highlights
  • The goal of text summarization is to generate a succinct summary while retaining a document’s essential information
  • To tackle the topic shift problem in chat logs and the information integrity problem of individual utterances, in this work we introduce a novel unsupervised neural framework called RankAE that benefits from both extractive and abstractive paradigms (an illustrative selection sketch follows this list)
  • RankAE outperforms the comparison methods on all metrics, which validates the effectiveness of the topic-oriented ranking strategy for chat summarization
  • Compared to RankAE (Ext.), the full framework with the denoising auto-encoder (DAE) generator improves the results by a large margin (+2.53, +1.72, +3.22 on ROUGE-1/2/L)
  • Beyond the extractive paradigm, our model is capable of integrating context information and generating summaries that are more relevant to original chat logs
  • We propose a novel unsupervised framework for chat summarization
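
As a rough illustration of the ranking idea referenced above, the following is a minimal, hypothetical sketch of centrality-plus-diversity selection over utterance embeddings. It is not the paper's exact formulation: the MMR-style scoring, the window-based locality, and all names (`select_utterances`, `lambda_div`) are illustrative assumptions.

```python
import numpy as np

def select_utterances(emb: np.ndarray, k: int, window: int = 2,
                      lambda_div: float = 0.5) -> list:
    """Illustrative centrality-plus-diversity selection (not the paper's
    exact formulation). `emb` is an (n, d) matrix of utterance embeddings."""
    # Cosine similarity matrix between all utterances.
    norm = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-8)
    sim = norm @ norm.T
    n = sim.shape[0]

    # Local centrality: average similarity to neighbors within a window,
    # reflecting that chat topics are local and evolve over time.
    centrality = np.array([
        sim[i, max(0, i - window):min(n, i + window + 1)].mean()
        for i in range(n)
    ])

    # Greedy MMR-style selection: trade off centrality against redundancy
    # with already-selected utterances (topic diversity).
    selected: list = []
    while len(selected) < min(k, n):
        best, best_score = -1, -np.inf
        for i in range(n):
            if i in selected:
                continue
            redundancy = max((sim[i, j] for j in selected), default=0.0)
            score = lambda_div * centrality[i] - (1 - lambda_div) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return sorted(selected)
```
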
Methods
  • The authors compare against several methods for chat summarization, all designed for unsupervised scenarios.
  • Lead (Nallapati et al. 2017) extracts the first several sentences of a document as the summary, representing the lower bound of extractive methods.
  • Oracle (Nallapati et al. 2017) uses a greedy algorithm to select the best-performing sentences against the gold summary, representing the upper bound of extractive methods (both are sketched after this list).
  • The compared methods are: Lead and Oracle (baselines); TextRank, PacSum, MMR, and RankAE (Ext.), each with tf-idf and BERT utterance representations; MeanSum and SummAE (with and without the critic), each with RNN and Transformer (TRF) decoders; RankAE - BERT; and the full RankAE.
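
For concreteness, here is a minimal sketch of the two bounds, assuming the open-source `rouge-score` package. The greedy Oracle follows the standard recipe: repeatedly add the sentence that most improves ROUGE against the gold summary. The ROUGE variant and `k` are illustrative choices, not values taken from the paper.

```python
from rouge_score import rouge_scorer  # pip install rouge-score (assumed available)

scorer = rouge_scorer.RougeScorer(["rouge1"], use_stemmer=True)

def lead(sentences, k=3):
    """Lead baseline: the first k sentences form the summary."""
    return sentences[:k]

def oracle(sentences, gold, k=3):
    """Greedy Oracle: repeatedly add the sentence that most improves
    ROUGE-1 F1 against the gold summary (upper bound for extraction)."""
    selected, best_f1 = [], 0.0
    while len(selected) < k:
        best_i = None
        for i, s in enumerate(sentences):
            if i in selected:
                continue
            candidate = " ".join(sentences[j] for j in sorted(selected + [i]))
            f1 = scorer.score(gold, candidate)["rouge1"].fmeasure
            if f1 > best_f1:
                best_f1, best_i = f1, i
        if best_i is None:  # no remaining sentence improves the score
            break
        selected.append(best_i)
    return [sentences[i] for i in sorted(selected)]
```
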
Results
  • In Results and Analysis, the authors show the results of RankAE and other unsupervised methods for chat summarization (Table 1).
  • The second group consists of extractive methods, where the authors experiment with two utterance representations (tf-idf and BERT) to compute the score matrix M (sketched after this list).
  • Compared to RankAE (Ext.), the full framework with the DAE generator improves the results by a large margin (+2.53, +1.72, +3.22 on ROUGE-1/2/L).
  • This shows that, beyond the extractive paradigm, the model is capable of integrating context information and generating summaries that are more relevant to the original chat logs.
  • The results of RankAE differ from all other methods with statistical significance under a Wilcoxon signed-rank test (p < 0.05), which verifies that RankAE benefits from both extractive and abstractive paradigms.
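
Below is a minimal sketch of how such a score matrix M can be computed from tf-idf representations (a BERT encoder could simply replace the vectorizer), followed by a Wilcoxon signed-rank check over paired per-case scores in the style of the significance test above. The utterances and score arrays are placeholders, not the paper's data.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from scipy.stats import wilcoxon

utterances = [
    "hi, is this phone still in stock?",
    "yes, it ships tomorrow",
    "great, does it come with a charger?",
    "it includes a charger and a cable",
]

# tf-idf utterance representations; a BERT encoder could be swapped in here.
vectors = TfidfVectorizer().fit_transform(utterances)
M = cosine_similarity(vectors)  # score matrix M: M[i, j] = sim(u_i, u_j)
print(np.round(M, 2))

# Wilcoxon signed-rank test over paired per-case scores (placeholder data),
# as used to compare two systems at the p < 0.05 level.
rankae_scores   = np.array([0.31, 0.28, 0.35, 0.40, 0.33])
baseline_scores = np.array([0.27, 0.26, 0.30, 0.38, 0.29])
stat, p = wilcoxon(rankae_scores, baseline_scores)
print(f"p-value: {p:.3f}")
```
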
Conclusion
  • The authors propose a novel unsupervised framework for chat summarization.
  • A topic-oriented ranking strategy is designed to pick out utterances based on local centrality and topic diversity, while a denoising auto-encoder captures context information and discards nonessential content to produce succinct summaries.
  • A future direction is topic variation within an utterance, for which a more fine-grained, word-level ranking strategy can be explored.
  • The authors could also consider other ways of adding noise, e.g., deletion and repetition, to introduce more kinds of noise (sketched below).
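
As a rough illustration of the noise types mentioned in the last bullet, here are token-level deletion and repetition corruptions for DAE training. This is an assumed, simplified form of noising, not the paper's actual procedure; all names are illustrative.

```python
import random

def add_deletion_noise(tokens, p=0.1, rng=random):
    """Randomly drop tokens; the DAE must reconstruct the clean utterance."""
    kept = [t for t in tokens if rng.random() > p]
    return kept if kept else tokens[:1]  # never return an empty sequence

def add_repetition_noise(tokens, p=0.1, rng=random):
    """Randomly repeat tokens, simulating the redundancy common in chats."""
    out = []
    for t in tokens:
        out.append(t)
        if rng.random() < p:
            out.append(t)
    return out

noisy = add_repetition_noise(add_deletion_noise("does it ship today".split()))
print(noisy)  # e.g. ['does', 'it', 'it', 'ship', 'today']
```
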
Tables
  • Table 1: Results on the E-commerce chat log dataset. Methods are categorized into three groups: baseline, extractive, and abstractive methods. TRF denotes the Transformer.
  • Table 2: Ablation study. The first part includes variants of the extractor in RankAE; the second part shows the results under different settings of the DAE. L-Rto. denotes the length ratio between system summaries and gold references; c denotes the window size.
  • Table 3: Human evaluation results on relevance and succinctness. The score represents the percentage of times each method is chosen as better in pairwise comparisons.
  • Table 4: An example of chat summarization with RankAE. Text in red represents nonessential or redundant content in the chat segment, which RankAE excludes to produce a more concise summary.
Funding
  • This work was partially funded by the China National Key R&D Program (No. 2018YFC0831105), the National Natural Science Foundation of China (No. 61751201, 62076069, 61976056), the Shanghai Municipal Science and Technology Major Project (No. 2018SHZDZX01), and Science and Technology Commission of Shanghai Municipality Grants (No. 18DZ1201000, 17JC1420200).
  • This work was also supported by Alibaba Group through the Alibaba Innovative Research Program.
Study subjects and analysis
Cases: 100
It shows that although RankAE integrates more context information, it is still capable of excluding irrelevant and redundant content and generating short summaries while maintaining high performance. Since automatic metrics like ROUGE and BLEU may not suitably represent the content to be evaluated, the authors randomly sample 100 cases from the test set and invite volunteers to evaluate the summaries. The process of human evaluation is designed similarly to Narayan et al. (2018).

Volunteers: 3
Specifically, volunteers are presented with one chat and two summaries produced by two different systems, and are asked to decide which summary is better along two dimensions: relevance (which summary captures more information relevant to the original chat?) and succinctness (which summary contains less redundant content?). To minimize inter-annotator noise, judgments are collected from three volunteers for each comparison, and the order of summaries and chats is randomized for each volunteer (an aggregation sketch follows).
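
Given three judgments per comparison, the percentages reported in Table 3 can be obtained by majority-voting each pairwise comparison and counting how often each system wins. A minimal sketch with placeholder judgments (not the paper's actual data) follows.

```python
from collections import Counter

# Each item: (system_a, system_b, votes), one vote per volunteer.
# Placeholder judgments for illustration only.
comparisons = [
    ("RankAE", "PacSum", ["RankAE", "RankAE", "PacSum"]),
    ("RankAE", "MeanSum", ["RankAE", "MeanSum", "RankAE"]),
    ("PacSum", "MeanSum", ["PacSum", "PacSum", "PacSum"]),
]

wins, appearances = Counter(), Counter()
for sys_a, sys_b, votes in comparisons:
    majority = Counter(votes).most_common(1)[0][0]  # majority of 3 judges
    wins[majority] += 1
    appearances[sys_a] += 1
    appearances[sys_b] += 1

for system in appearances:
    pct = 100.0 * wins[system] / appearances[system]
    print(f"{system}: chosen as better in {pct:.0f}% of its comparisons")
```
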

Reference
  • Arguello, J.; and Rose, C. 2006. Topic-segmentation of dialogue. In Proceedings of the Analyzing Conversations in Text and Speech, 42–49.
  • Carbonell, J.; and Goldstein, J. 1998. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 335–336.
  • Chu, E.; and Liu, P. 2019. MeanSum: A Neural Model for Unsupervised Multi-Document Abstractive Summarization. In International Conference on Machine Learning, 1223–1232.
  • Cui, Y.; Che, W.; Liu, T.; Qin, B.; Yang, Z.; Wang, S.; and Hu, G. 2019. Pre-training with whole word masking for Chinese BERT. arXiv preprint arXiv:1906.08101.
  • Devlin, J.; Chang, M.-W.; Lee, K.; and Toutanova, K. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 4171–4186.
  • Erkan, G.; and Radev, D. R. 2004. LexRank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence Research 22: 457–479.
  • Fevry, T.; and Phang, J. 2018. Unsupervised Sentence Compression using Denoising Auto-Encoders. In Proceedings of the 22nd Conference on Computational Natural Language Learning, 413–422.
  • Gillick, D.; Riedhammer, K.; Favre, B.; and Hakkani-Tur, D. 2009. A global optimization framework for meeting summarization. In 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, 4769–4772. IEEE.
  • Gurevych, I.; and Strube, M. 2004. Semantic similarity applied to spoken dialogue summarization. In Proceedings of the 20th International Conference on Computational Linguistics, 764. Association for Computational Linguistics.
  • Harabagiu, S.; and Lacatusu, F. 2005. Topic themes for multi-document summarization. In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 202–209.
  • He, Z.; Chen, C.; Bu, J.; Wang, C.; Zhang, L.; Cai, D.; and He, X. 2012. Document summarization based on data reconstruction. In Twenty-Sixth AAAI Conference on Artificial Intelligence.
  • Kingma, D. P.; and Ba, J. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
  • Koto, F. 2016. A publicly available Indonesian corpora for automatic abstractive and extractive chat summarization. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), 801–805.
  • Lin, C.-Y. 2004. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out.
  • Liu, C.; Wang, P.; Xu, J.; Li, Z.; and Ye, J. 2019a. Automatic Dialogue Summary Generation for Customer Service. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.
  • Liu, H.; Yu, H.; Deng, Z.-H.; et al. 2015. Multi-document summarization based on two-level sparse representation model. In Twenty-Ninth AAAI Conference on Artificial Intelligence.
  • Liu, P. J.; Chung, Y.-A.; Ren, J.; et al. 2019b. SummAE: Zero-shot abstractive text summarization using length-agnostic auto-encoders. arXiv preprint arXiv:1910.00998.
  • Liu, Y.; and Lapata, M. 2019. Text Summarization with Pretrained Encoders. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 3721–3731.
  • McDonald, R. 2007. A study of global inference algorithms in multi-document summarization. In European Conference on Information Retrieval. Springer.
  • Mehdad, Y.; Carenini, G.; Ng, R.; et al. 2014. Abstractive summarization of spoken and written conversations based on phrasal queries. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1220–1230.
  • Mehdad, Y.; Carenini, G.; Tompa, F.; and Ng, R. 2013. Abstractive meeting summarization with entailment and fusion. In Proceedings of the 14th European Workshop on Natural Language Generation, 136–146.
  • Miao, Y.; and Blunsom, P. 2016. Language as a Latent Variable: Discrete Generative Models for Sentence Compression. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 319–328.
  • Mihalcea, R.; and Tarau, P. 2004. TextRank: Bringing order into text. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 404–411.
  • Mikolov, T.; Sutskever, I.; Chen, K.; Corrado, G. S.; and Dean, J. 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, 3111–3119.
  • Murray, G.; and Carenini, G. 2008. Summarizing spoken and written conversations. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, 773–782.
  • Nallapati, R.; Zhai, F.; Zhou, B.; et al. 2017. SummaRuNNer: A recurrent neural network based sequence model for extractive summarization of documents. In Thirty-First AAAI Conference on Artificial Intelligence.
  • Narayan, S.; Cohen, S. B.; Lapata, M.; et al. 2018. Don't Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 1797–1807.
  • Nenkova, A.; and Vanderwende, L. 2005. The impact of frequency on summarization. Microsoft Research, Redmond, Washington, Tech. Rep. 101.
  • Nikolov, N. I.; Pfeiffer, M.; Hahnloser, R. H.; et al. 2018. Data-driven Summarization of Scientific Articles. In Proceedings of the 7th International Workshop on Mining Scientific Publications, LREC 2018.
  • Papineni, K.; Roukos, S.; Ward, T.; and Zhu, W.-J. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 311–318.
  • Passonneau, R. J.; and Litman, D. J. 1993. Intention-based segmentation: Human reliability and correlation with linguistic cues. In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, 148–155.
  • Radev, D. R.; Jing, H.; Stys, M.; and Tam, D. 2004. Centroid-based summarization of multiple documents. Information Processing & Management 40(6): 919–938.
  • Rambow, O.; Shrestha, L.; Chen, J.; and Lauridsen, C. 2004. Summarizing email threads. In Proceedings of HLT-NAACL 2004: Short Papers, 105–108.
  • Rossiello, G.; Basile, P.; Semeraro, G.; et al. 2017. Centroid-based text summarization through compositionality of word embeddings. In Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres, 12–21.
  • Schuster, M.; and Paliwal, K. K. 1997. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing 45(11): 2673–2681.
  • See, A.; Liu, P. J.; Manning, C. D.; et al. 2017. Get To The Point: Summarization with Pointer-Generator Networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 1073–1083.
  • Shang, G.; Ding, W.; Zhang, Z.; Tixier, A.; Meladianos, P.; Vazirgiannis, M.; and Lorre, J.-P. 2018. Unsupervised Abstractive Meeting Summarization with Multi-Sentence Compression and Budgeted Submodular Maximization. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 664–674.
  • Sood, A.; Mohamed, T. P.; Varma, V.; et al. 2013. Topic-focused summarization of chat conversations. In European Conference on Information Retrieval, 800–803. Springer.
  • Sood, A.; Thanvir, M.; Vasudeva, V.; et al. 2012. Summarizing Online Conversations: A Machine Learning Approach. In 24th International Conference on Computational Linguistics (COLING 2012).
  • Uthus, D. C.; and Aha, D. W. 2011. Plans toward automated chat summarization. In Proceedings of the Workshop on Automatic Summarization for Different Genres, Media, and Languages, 1–7.
  • Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, Ł.; and Polosukhin, I. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, 5998–6008.
  • Vincent, P.; Larochelle, H.; Bengio, Y.; and Manzagol, P.-A. 2008. Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th International Conference on Machine Learning, 1096–1103.
  • Xie, S.; Liu, Y.; Lin, H.; et al. 2008. Evaluating the effectiveness of features and sampling in extractive meeting summarization. In 2008 IEEE Spoken Language Technology Workshop, 157–160. IEEE.
  • Zechner, K. 2001. Automatic generation of concise summaries of spoken dialogues in unrestricted domains. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 199–207.
  • Zheng, H.; and Lapata, M. 2019. Sentence Centrality Revisited for Unsupervised Summarization. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 6236–6247.
  • Zhou, L.; and Hovy, E. 2005. Digesting virtual "geek" culture: The summarization of technical internet relay chats. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), 298–305.