Using Morphological and Semantic Features for the Quality Assessment of Russian Wikipedia.

Communications in Computer and Information Science(2017)

Abstract
The assessment of the quality and credibility of Wikipedia articles is becoming increasingly important. We propose using morphological and semantic features to estimate the quality of Russian-language Wikipedia articles. We identified over 150 linguistic features and divided them into four groups. Within these groups, we considered features of encyclopedic style, readability, and subjectivism of the article's text. Using Random Forest as the classification algorithm, we show the most important linguistic features that affect the quality of Russian Wikipedia articles. We compare the classification results of the four linguistic feature groups separately. We achieved an F-measure of 89.75%.
Keywords
Quality assessment of texts, Morphological and semantic features, Russian Wikipedia articles, Random forests classification, Encyclopedic, Readability, Subjectivism