Clustering Documents using the 3-Gram Graph Representation Model

Panhellenic Conference on Informatics (2014)

Abstract
In this paper we present an innovative document clustering method that uses the 3-gram graph representation model and reduces the problem of document clustering to graph partitioning. For the latter we employ the kernel k-means algorithm. We evaluated the proposed method on the Reuters-21578 test collection and compared its results with those of the Latent Dirichlet Allocation (LDA) algorithm. The results are encouraging: the 3-gram graph method achieves much better Recall and F1 score than LDA, but worse Precision. Further changes that are expected to improve the results are identified.
Keywords
algorithms, database applications, graph comparison, graph partitioning, graph theory, information search and retrieval, n-gram graph, text clustering
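
The pipeline named in the abstract (3-gram graphs, then kernel k-means on pairwise graph similarities) can be illustrated with a minimal sketch. The abstract does not give the construction details, so everything specific here is an assumption made for illustration: character-level 3-grams, a co-occurrence window of 3, undirected weighted edges, cosine similarity over edge weights as the kernel, and the function names `ngram_graph`, `graph_similarity`, and `kernel_kmeans`. It is not the authors' implementation.

```python
from collections import Counter
from math import sqrt

def ngram_graph(text, n=3, window=3):
    """Build a character n-gram graph as a Counter of weighted edges.

    Assumption: each n-gram is linked to the next `window` n-grams,
    and edges are undirected (stored as sorted pairs).
    """
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    edges = Counter()
    for i, g in enumerate(grams):
        for h in grams[i + 1:i + 1 + window]:
            edges[tuple(sorted((g, h)))] += 1
    return edges

def graph_similarity(a, b):
    """Cosine similarity of two edge-weight maps (an assumed kernel)."""
    dot = sum(a[e] * b[e] for e in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def kernel_kmeans(K, labels, k, iters=20):
    """Kernel k-means on a precomputed kernel matrix K (list of lists).

    `labels` is the initial assignment; the result depends on it.
    """
    n = len(K)
    for _ in range(iters):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        new = []
        for i in range(n):
            best, best_d = labels[i], float("inf")
            for c, members in enumerate(clusters):
                if not members:
                    continue  # skip empty clusters
                m = len(members)
                # Squared feature-space distance from point i to the cluster mean.
                d = (K[i][i]
                     - 2 * sum(K[i][j] for j in members) / m
                     + sum(K[j][l] for j in members for l in members) / (m * m))
                if d < best_d:
                    best, best_d = c, d
            new.append(best)
        if new == labels:
            break  # assignments are stable
        labels = new
    return labels

# Toy documents (invented for illustration, not taken from Reuters-21578).
docs = [
    "oil prices rose sharply",
    "oil prices fell sharply",
    "wheat harvest was strong",
    "wheat harvest was weak",
]
graphs = [ngram_graph(d) for d in docs]
K = [[graph_similarity(g, h) for h in graphs] for g in graphs]
print(kernel_kmeans(K, [0, 1, 1, 1], k=2))
```

Cosine similarity is used here because it yields a positive semidefinite kernel, which keeps the feature-space distances in `kernel_kmeans` non-negative; other n-gram graph similarity measures may not have that property. As with ordinary k-means, the outcome depends on the initial labels, so several restarts would normally be compared.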