Learning to Prioritize: Precision-Driven Sentence Filtering for Long Text Summarization.

user-621f4a59e554229dc92c1403(2022)

Abstract
Neural text summarization has shown great potential in recent years. However, current state-of-the-art summarization models are limited by their maximum input length, which makes it difficult to summarize longer texts comprehensively. As part of a layered summarization architecture, we introduce PURETEXT, a simple yet effective pre-processing layer that removes low-quality sentences from articles to improve existing summarization models. When evaluated on popular datasets such as WikiHow and Reddit TIFU, we show absolute ROUGE-1 improvements of up to 3.84 and 8.57 points on the full test set and the long-article subset, respectively, for state-of-the-art summarization models such as BERTSUM and BART. Our approach provides downstream models with higher-quality sentences for summarization, improving overall model performance, especially on long articles.
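
The abstract does not specify how sentences are scored, so the following is only a minimal sketch of the general idea of a sentence-filtering pre-processing layer, not the PURETEXT method itself. The function `filter_sentences`, the `threshold` parameter, and the `toy_score` heuristic are all hypothetical names introduced for illustration; in practice the scorer would presumably be a learned model.

```python
from typing import Callable, List


def filter_sentences(
    sentences: List[str],
    score: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Keep sentences whose score meets the threshold, preserving order."""
    return [s for s in sentences if score(s) >= threshold]


def toy_score(sentence: str) -> float:
    # Hypothetical stand-in for a learned quality scorer (assumption):
    # sentences with more words score higher, capped at 1.0.
    return min(len(sentence.split()) / 10.0, 1.0)


article = [
    "Lol.",
    "First, unplug the router and wait thirty seconds before plugging it back in.",
    "Anyway.",
    "Then check that the status light turns solid green before reconnecting devices.",
]

# The filtered sentences would then be passed to a downstream summarizer.
kept = filter_sentences(article, toy_score, threshold=0.5)
print(kept)
```

With this toy scorer, the two one-word sentences fall below the threshold and are dropped, so the downstream model's limited input budget is spent on the two instructional sentences.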
Keywords
text summarization, machine learning, natural language processing