Learning to Prioritize: Precision-Driven Sentence Filtering for Long Text Summarization.

Alex Mei, Anisha Kabir, Rukmini Bapat, John Judge, Tony Sun, William Yang Wang

(2022)

Abstract

Neural text summarization has shown great potential in recent years. However, current state-of-the-art summarization models are limited by their maximum input length, posing a challenge to summarizing longer texts comprehensively. As part of a layered summarization architecture, we introduce PURETEXT, a simple yet effective pre-processing layer that removes low-quality sentences in articles to improve existing summarization models. When evaluated on popular datasets like WikiHow and Reddit TIFU, we show up to 3.84 and 8.57 point ROUGE-1 absolute improvement on the full test set and the long article subset, respectively, for state-of-the-art summarization models such as BERTSUM and BART. Our approach provides downstream models with higher-quality sentences for summarization, improving overall model performance, especially on long text articles.

Keywords

text summarization, machine learning, natural language processing
