A Short Text Spectrum Clustering Method Based on Frequent Itemsets

IEEE International Conference Computer and Communications (2018)

Short text data, which contain a lot of useful information, are easy to find on a wide variety of self-media platforms and social communication tools. It is therefore important to use data mining technology to acquire knowledge from these data automatically. However, short texts have few words, short length, and little information each, which leads to feature sparseness and high dimensionality, so the accuracy of general text mining algorithms is usually unacceptable. This paper proposes a spectral clustering method for short text analysis based on frequent itemsets. First, the method cuts the short text data and compresses them using frequent itemsets. Second, a spectral clustering algorithm based on the Nyström sampling method performs clustering and topic extraction on the frequent itemsets. The method effectively addresses the feature sparseness and high dimensionality of short text data. Experiments show that the method has better accuracy and performance than existing algorithms when clustering short texts, and yields good clustering results.
short text, frequent items, spectral clustering, Nyström sampling, session cutting
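
The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no details, so everything below is an assumption made for illustration, including the brute-force itemset counting, the cosine affinity, the fixed landmark indices (random sampling is the more usual choice), the simple non-orthogonalized Nyström extension, the small k-means helper, and all function names. The text-cutting step is not shown; the documents are assumed to be tokenized already.

```python
from collections import Counter
from itertools import combinations

import numpy as np


def frequent_itemsets(docs, min_support=2, max_size=2):
    """Brute-force count of word sets (size <= max_size) per document."""
    counts = Counter()
    for words in docs:
        uniq = sorted(set(words))
        for size in range(1, max_size + 1):
            counts.update(combinations(uniq, size))
    return sorted(s for s, c in counts.items() if c >= min_support)


def encode(docs, itemsets):
    """Binary matrix: entry (i, j) is 1 if doc i contains all words of itemset j."""
    X = np.zeros((len(docs), len(itemsets)))
    for i, words in enumerate(docs):
        ws = set(words)
        for j, items in enumerate(itemsets):
            if ws.issuperset(items):
                X[i, j] = 1.0
    return X


def nystrom_embedding(X, landmarks, k):
    """Approximate the top-k eigenvectors of the normalized cosine affinity
    using only the affinities between all points and the landmark points."""
    Xn = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    C = Xn @ Xn[landmarks].T                     # n x m block
    A = C[landmarks]                             # m x m landmark block
    d = C @ (np.linalg.pinv(A) @ C.sum(axis=0))  # approximate degrees
    d = np.maximum(d, 1e-12)
    Cn = C / np.sqrt(d)[:, None] / np.sqrt(d[landmarks])[None, :]
    vals, vecs = np.linalg.eigh(Cn[landmarks])
    top = np.argsort(vals)[-k:]                  # assumes these are non-zero
    V = Cn @ vecs[:, top] / vals[top]            # Nyström extension
    return V / np.maximum(np.linalg.norm(V, axis=1, keepdims=True), 1e-12)


def kmeans(E, k, iters=20):
    """Small Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [E[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((E - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(E[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iters):
        d2 = ((E[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = E[labels == j].mean(axis=0)
    return labels


# Toy data (made up for this sketch): two topics with disjoint vocabularies.
docs = [
    ["cheap", "flight", "ticket"],
    ["cheap", "flight", "deal"],
    ["flight", "ticket", "deal"],
    ["python", "code", "bug"],
    ["python", "code", "error"],
    ["code", "bug", "error"],
]
itemsets = frequent_itemsets(docs)
X = encode(docs, itemsets)
embedding = nystrom_embedding(X, landmarks=[0, 1, 3, 4], k=2)
labels = kmeans(embedding, k=2)
print(labels)
```

On this toy input the first three documents receive one label and the last three receive the other. The point of the Nyström step is that only the n-by-m block of affinities to the m landmarks is computed, rather than the full n-by-n matrix, which is what makes spectral clustering affordable on large collections of short texts.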