A probabilistic approach for extractive summarization based on clustering cum graph ranking method

Amreen Ahmad, Tanvir Ahmad, Sarfaraz Masood, Mohd. Khizir Siddiqui, Basma Abd El-Rahiem, Pawel Plawiak, Fahad Alblehai

IEEE Access (2024)

Abstract
Online information has grown tremendously in the age of the Internet. As a result, the need has arisen to extract relevant content from the plethora of available information. Researchers widely use automatic text summarization techniques to extract useful and relevant information from large volumes of text. Automatically generated summaries often suffer from limited diversity and information coverage. Earlier researchers have used graph-based approaches for ranking and optimization. This work introduces a probabilistic approach named ClusRank for summary extraction, comprising a two-stage sentence selection model that first clusters and then ranks sentences. The first stage clusters sentences using a proposed overlapping clustering algorithm on the weighted network; the second stage selects salient sentences using the introduced probabilistic approach. In the analysis of real-world networks, detecting community structure is essential because it provides strategic insights that help decision-makers make well-informed choices. Furthermore, methodologically rigorous community detection algorithms are required because discontinuous, overlapping, and nested community patterns occur in such networks. In this work, an algorithm is presented for detecting overlapping communities based on the concept of rough sets and granular information on links. The sentence selection algorithm, based on the budgeted maximum coverage approach, supports the assumption that larger subtopics in a document are more important than smaller ones. The performance of the proposed probabilistic ClusRank is validated on the DUC 2001, DUC 2002, DUC 2004, and DUC 2006 data sets.
Keywords
Automatic text summarization, clustering, graph ranking, diversity, information coverage
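
The abstract mentions sentence selection based on budgeted maximum coverage, with larger subtopics counting for more than smaller ones. The following is a minimal sketch of the standard greedy heuristic for that problem, not the authors' ClusRank algorithm, whose details are not given here. All names (`greedy_budgeted_coverage`, `sentence_topics`, `sentence_cost`, `topic_weight`) are hypothetical, and the sketch assumes that each sentence has already been assigned to one or more subtopic clusters, that a subtopic's weight is its cluster size, and that a sentence's cost is its word count.

```python
from typing import Dict, List, Set


def greedy_budgeted_coverage(
    sentence_topics: Dict[int, Set[str]],
    sentence_cost: Dict[int, int],
    topic_weight: Dict[str, float],
    budget: int,
) -> List[int]:
    """Greedy heuristic for budgeted maximum coverage (illustrative only).

    sentence_topics: sentence id -> subtopics (clusters) it belongs to.
    sentence_cost:   sentence id -> positive cost, e.g. word count.
    topic_weight:    subtopic -> importance, e.g. cluster size (assumption).
    budget:          maximum total cost of the summary.
    """
    selected: List[int] = []
    covered: Set[str] = set()
    remaining = budget
    candidates = set(sentence_topics)

    while candidates:
        best_id, best_ratio = None, 0.0
        # Sorted iteration keeps tie-breaking deterministic.
        for sid in sorted(candidates):
            cost = sentence_cost[sid]
            if cost > remaining:
                continue  # does not fit in the remaining budget
            # Marginal gain: weight of subtopics not yet covered.
            gain = sum(topic_weight[t] for t in sentence_topics[sid] - covered)
            ratio = gain / cost
            if ratio > best_ratio:
                best_id, best_ratio = sid, ratio
        if best_id is None:
            break  # nothing fits, or nothing adds new coverage
        selected.append(best_id)
        covered |= sentence_topics[best_id]
        remaining -= sentence_cost[best_id]
        candidates.remove(best_id)
    return selected


# Toy input: three subtopics with cluster sizes 5, 3 and 1.
topics = {0: {"A"}, 1: {"A", "B"}, 2: {"C"}, 3: {"B"}}
costs = {0: 10, 1: 12, 2: 5, 3: 4}
weights = {"A": 5.0, "B": 3.0, "C": 1.0}
summary_ids = greedy_budgeted_coverage(topics, costs, weights, budget=20)
```

On this toy input the heuristic picks sentences 3, 0 and 2 in that order, covering all three subtopics at a total cost of 19. Because sentences may belong to several subtopics, the sketch is compatible with overlapping clusters, and skipping already covered subtopics is what pushes the selection toward diversity.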