
A Knowledge-Based Feature Selection Method for Text Categorization

semanticscholar(2006)

Abstract
A major difficulty in text categorization is the high dimensionality of the original feature space, so feature selection plays an important role. Automatic feature selection methods such as document frequency thresholding (DF), information gain (IG), and mutual information (MI) are commonly applied in text categorization, and many existing experiments show that IG is one of the most effective. In this paper, a method is proposed to measure an attribute’s importance based on Rough Set theory. According to Rough Set theory, knowledge about a universe of objects may be defined as classifications based on certain properties of the objects; that is, Rough Set theory assumes that knowledge is an ability to partition objects. We quantify this ability to partition objects, call the resulting amount knowledge quantity, and then put forward a knowledge-based feature selection method called KG. Experimental results on the NewsGroup and OHSUMED corpora show that KG performs much better than MI and DF, and even better than IG.
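
To make the baselines named in the abstract concrete, the following minimal Python sketch scores a term by document frequency (DF) and by information gain (IG) on a tiny made-up corpus. The toy documents and all function names are assumptions introduced here for illustration. The last function, `distinguishable_pairs`, is only one possible way to put a number on "the ability to partition objects" (it counts pairs of objects that fall into different blocks of a partition); the abstract does not state the paper's definition of knowledge quantity, so this should not be read as the KG method itself.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def document_frequency(term, docs):
    """DF: number of documents whose token set contains the term."""
    return sum(1 for tokens, _ in docs if term in tokens)


def information_gain(term, docs):
    """IG of a binary feature (term present / term absent) w.r.t. the class."""
    labels = [label for _, label in docs]
    present = [label for tokens, label in docs if term in tokens]
    absent = [label for tokens, label in docs if term not in tokens]
    n = len(docs)
    conditional = sum(
        len(part) / n * entropy(part) for part in (present, absent) if part
    )
    return entropy(labels) - conditional


def distinguishable_pairs(block_sizes):
    """Hypothetical partition measure (an assumption, not the paper's formula):
    the number of object pairs that lie in different blocks of a partition."""
    total = sum(block_sizes)
    return (total * total - sum(s * s for s in block_sizes)) // 2


# Toy corpus (invented for illustration): (set of tokens, class label).
docs = [
    ({"ball", "team"}, "sport"),
    ({"ball", "goal"}, "sport"),
    ({"vote", "team"}, "politics"),
    ({"vote", "law"}, "politics"),
]

for term in ("ball", "team"):
    print(term, document_frequency(term, docs), information_gain(term, docs))
```

In this toy corpus both terms occur in two documents, so DF cannot tell them apart, while IG separates them: "ball" splits the classes perfectly and "team" carries no class information. Under the pair-counting assumption, a partition that puts every object in one block distinguishes no pairs, and finer partitions distinguish more.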