Authorship Attribution of The Golden Lotus Based on Text Classification Methods

Proceedings of the 2019 3rd International Conference on Innovation in Artificial Intelligence(2019)

引用 8|浏览0
暂无评分
摘要
In this paper, we explore the authorship attribution of The Golden Lotus using the traditional machine learning method of text classification. There are four candidate authors: Shizhen Wang, Wei Xu, Kaixian Li and Zhideng Wang. We choose The Golden Lotus's poems and four candidate authors' poems as data set. According to the characteristics of Chinese ancient poem, we choose Chinese character, rhyme, genre and overlapped word as features. We use six supervised machine learning algorithms, including Logistic Regression, Random Forests, Decision Tree and Naive Bayes, SVM and KNN classifiers respectively for text binary classification and multi-classification. According to two experiments results, the style of writing of Wei Xu's poems is the most similar to that of The Golden Lotus. It is proved that among four authors, Wei Xu most likely be the author of The Golden Lotus.
更多
查看译文
关键词
Authorship attribution, The Golden Lotus, machine learning, text classification
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要