Classification for Text Data from the Power System Based on Improving Naive Bayes
IEEE PES Asia-Pacific Power and Energy Engineering Conference (2020)
Abstract
After years of power system operation, a large amount of text data has accumulated, and analyzing it, violation data in particular, has become increasingly important. In this context, this paper introduces a novel classification method, Improving Naive Bayes based on Improving Term Frequency-Inverse Document Frequency (ITF-IDF), which aims to categorize the text and reduce the cost of manual analysis. The violation data are classified into personal behavior, instrument, security activities, supervision, and two-ticket data. To increase classification accuracy, the proposed method improves the weighting used in Naive Bayes, namely ITF-IDF. In the experimental studies, the Improving Naive Bayes classifier is evaluated on spam message test data (a binary classification task) and on violation data from the power system (a multi-class task), and is compared with classifiers based on conventional Naive Bayes, Logistic Regression, and the Support Vector Machine (SVM). The results demonstrate that the proposed method performs better than the other methods.
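The abstract does not state the ITF-IDF formula, so the following is only a minimal sketch of the general idea it describes: a multinomial Naive Bayes classifier whose per-class term statistics are accumulated from TF-IDF weights rather than raw counts. It uses standard smoothed TF-IDF as a stand-in for the paper's improved weighting, and the names `tfidf_weights`, `WeightedNaiveBayes`, and the toy spam/ham messages are hypothetical, chosen for illustration only.

```python
import math
from collections import Counter, defaultdict


def tfidf_weights(docs):
    """Return one {term: weight} dict per tokenized document.

    Standard smoothed TF-IDF, used here as a stand-in for the paper's ITF-IDF.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF keeps a positive weight for terms found in every document.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    weighted = []
    for doc in docs:
        counts = Counter(doc)
        length = max(len(doc), 1)  # guard against empty documents
        weighted.append({t: (c / length) * idf[t] for t, c in counts.items()})
    return weighted


class WeightedNaiveBayes:
    """Multinomial Naive Bayes trained on TF-IDF weights instead of raw counts."""

    def fit(self, docs, labels, alpha=1.0):
        self.alpha = alpha
        self.vocab = {t for doc in docs for t in doc}
        self.class_totals = defaultdict(float)
        self.term_totals = defaultdict(lambda: defaultdict(float))
        for weights, label in zip(tfidf_weights(docs), labels):
            for term, value in weights.items():
                self.term_totals[label][term] += value
                self.class_totals[label] += value
        label_counts = Counter(labels)
        self.log_prior = {y: math.log(c / len(labels)) for y, c in label_counts.items()}
        return self

    def predict(self, doc):
        best_label, best_score = None, -math.inf
        for label, prior in self.log_prior.items():
            # Additive (Laplace-style) smoothing over the vocabulary.
            denom = self.class_totals[label] + self.alpha * len(self.vocab)
            score = prior
            for term in doc:
                if term in self.vocab:  # terms unseen in training are skipped
                    weight = self.term_totals[label].get(term, 0.0)
                    score += math.log((weight + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy binary example (made-up messages, not data from the paper).
train_docs = [
    "win a free prize now".split(),
    "free cash prize claim now".split(),
    "meeting at noon tomorrow".split(),
    "see you at the meeting".split(),
]
train_labels = ["spam", "spam", "ham", "ham"]

clf = WeightedNaiveBayes().fit(train_docs, train_labels)
print(clf.predict("free prize".split()))
print(clf.predict("meeting tomorrow".split()))
```

The same class handles the multi-class case without changes, since `fit` builds one set of term totals per distinct label; replacing `tfidf_weights` with a different weighting function is the only change needed to try another scheme.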
Keywords
power system, classification, text data, Bayes