A Comparative Study of Bandwidth Choice in Kernel Density Estimation for Naive Bayesian Classification

ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS(2009)

Abstract
Kernel density estimation (KDE) is an important method in nonparametric learning. While KDE has been studied extensively with respect to the accuracy of distribution estimation, it has received far less attention in the context of classification. This paper studies nine bandwidth selection schemes for kernel density estimation in naive Bayesian classification, using 52 machine learning benchmark datasets. The contributions of this paper are threefold. First, it shows that some commonly used and very sophisticated bandwidth selection schemes do not perform well in naive Bayes. Surprisingly, some very simple bandwidth selection schemes perform statistically significantly better. Second, it shows that kernel density estimation can achieve statistically significantly better classification performance than a commonly used discretization method in naive Bayes, but only when an appropriate bandwidth selection scheme is applied. Third, it reports bandwidth distribution patterns for the investigated bandwidth selection schemes.
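As an illustration of the setting described in the abstract, the following is a minimal sketch of a naive Bayes classifier whose per-attribute class-conditional densities are Gaussian kernel density estimates. The bandwidth rule used here, the normal reference rule h = 1.06 * sigma * n^(-1/5), is an assumption chosen as an example of a simple scheme; the abstract does not name the nine schemes studied, so this may or may not be one of them. The names `rule_of_thumb_bandwidth`, `kde_log_density`, and `KDENaiveBayes` are hypothetical and do not come from the paper.

```python
import numpy as np


def rule_of_thumb_bandwidth(x):
    """Normal reference rule: h = 1.06 * sigma * n^(-1/5) (assumed example scheme)."""
    x = np.asarray(x, dtype=float)
    sigma = x.std(ddof=1)
    h = 1.06 * sigma * len(x) ** (-1 / 5)
    # Floor the bandwidth so a constant attribute does not cause division by zero.
    return max(h, 1e-9)


def kde_log_density(x_train, x_query, h):
    """Log of a 1-D Gaussian kernel density estimate at each query point."""
    x_train = np.asarray(x_train, dtype=float)
    x_query = np.atleast_1d(np.asarray(x_query, dtype=float))
    z = (x_query[:, None] - x_train[None, :]) / h
    log_kernel = -0.5 * z**2 - 0.5 * np.log(2 * np.pi) - np.log(h)
    # Average the kernels in log space (log-mean-exp) for numerical stability.
    m = log_kernel.max(axis=1, keepdims=True)
    return m[:, 0] + np.log(np.exp(log_kernel - m).mean(axis=1))


class KDENaiveBayes:
    """Naive Bayes with one KDE per (class, attribute) pair; numeric attributes only."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        self.log_prior_, self.train_, self.h_ = {}, {}, {}
        for c in self.classes_:
            Xc = X[y == c]
            self.log_prior_[c] = np.log(len(Xc) / len(X))
            self.train_[c] = Xc
            self.h_[c] = [rule_of_thumb_bandwidth(Xc[:, j]) for j in range(X.shape[1])]
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        scores = []
        for c in self.classes_:
            # Naive independence assumption: sum per-attribute log densities.
            s = np.full(len(X), self.log_prior_[c])
            for j in range(X.shape[1]):
                s = s + kde_log_density(self.train_[c][:, j], X[:, j], self.h_[c][j])
            scores.append(s)
        return self.classes_[np.argmax(np.vstack(scores), axis=0)]
```

Swapping in a different bandwidth selector only requires replacing `rule_of_thumb_bandwidth`, which is the kind of comparison the abstract describes.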
Keywords
simple bandwidth selection scheme, naive bayesian classification context, bandwidth choice, naive bayesian classification, appropriate bandwidth selection scheme, sophisticated bandwidth selection scheme, comparative study, bandwidth selection scheme, better classification performance, distribution estimation, bandwidth distribution pattern, kernel density estimation, naive bayes, machine learning, bayesian classification, kernel density estimate