To What Extent Does Content Selection Affect Surface Realization In The Context Of Headline Generation?

Computer Speech and Language (2021)

Abstract
Headline generation is the task of condensing the most important information of a news article into a single short sentence. It is normally addressed with summarization techniques, ideally combining extractive and abstractive methods with sentence compression or fusion. Although Natural Language Generation (NLG) techniques have not been directly exploited for headline generation, they may provide better mechanisms than summarization techniques for paraphrasing the information in a text. This paper therefore analyzes and evaluates the effectiveness of NLG techniques for generating headlines. In NLG, content selection and surface realization are equally important: there is no point in generating text without knowing the topic. Starting from this premise, we take HanaNLG, a hybrid surface realization approach, as a basis and analyze the effect on the generated text when different content selection strategies are integrated at the macroplanning stage. The experiments show that, despite not using any sophisticated summarization method, the proposed approach provided the following benefits: i) it generated a coherent, linguistically structured headline; ii) in terms of the content of the generated headline, it obtained results on standard datasets (i.e., DUC 2003 and DUC 2004) that were comparable to several competitive systems; and iii) the headlines generated by the whole approach (PLM-HanaNLG) were preferred by human assessors over those generated by the best performing system in DUC 2003. (C) 2020 Elsevier Ltd. All rights reserved.
Keywords
Natural language generation, Headline generation, Positional language models, Factored language models, Content selection, Abstractive summarization