On finding the optimal parameters for probability estimation with m-estimate.
CompSysTech (2020)
Abstract
The estimation of probabilities from empirical data samples is a crucial part of many machine learning and knowledge discovery research projects and applications. In addition to simple probability estimation with relative frequency, more elaborate probability estimation methods have been proposed and applied in practice (e.g. Laplace's rule, m-estimate, Piegat's estimate). In this paper we analyze the role of the parameter m in the m-estimate. In most practical applications of the m-estimate, m was set to 2 or, in more complex settings, determined with a cross-validation procedure. In this study we evaluate the impact of various values of m on the absolute error of the m-estimate within a carefully designed experimental framework. The results of our analysis suggest that the optimal value of m depends not only on the size of the observed sample but also, to a much greater extent, on the difference between the hypothetical p_a used in the m-estimate and the authentic p of the sample.
Keywords
probability estimation, optimal parameters, m-estimate
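
For readers unfamiliar with the estimator named in the abstract, the following is a minimal sketch of the m-estimate in its standard form, (s + m * p_a) / (n + m), where s is the number of successes in n trials. The function names, the sample (s = 4, n = 10), and the value true_p = 0.3 are hypothetical choices for illustration and are not taken from the paper; the paper's own experimental framework is not reproduced here.

```python
def m_estimate(successes: int, n: int, m: float, p_a: float) -> float:
    """Standard m-estimate of a probability: (s + m * p_a) / (n + m).

    m = 0 gives the relative frequency s / n; for a binary event,
    m = 2 with p_a = 0.5 gives Laplace's rule (s + 1) / (n + 2).
    """
    if n + m <= 0:
        raise ValueError("n + m must be positive")
    return (successes + m * p_a) / (n + m)


def absolute_error(estimate: float, true_p: float) -> float:
    """Absolute difference between an estimate and the true probability."""
    return abs(estimate - true_p)


if __name__ == "__main__":
    # Hypothetical illustration values (assumptions, not data from the paper).
    successes, n = 4, 10
    true_p = 0.3

    for p_a in (0.3, 0.5, 0.9):      # hypothetical prior guesses
        for m in (0, 2, 10):         # candidate values of m
            est = m_estimate(successes, n, m, p_a)
            err = absolute_error(est, true_p)
            print(f"p_a={p_a:.1f}  m={m:>2}  estimate={est:.4f}  abs_error={err:.4f}")
```

In this single hypothetical sample, increasing m reduces the error when p_a equals the true p and increases it when p_a is far from p, which is in line with the dependence on the gap between p_a and p that the abstract describes. One sample cannot establish an optimal m; that would require averaging the error over many samples.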