On Tail Decay Rate Estimation of Loss Function Distributions

Journal of Machine Learning Research (2024)

Abstract
The study of loss-function distributions is critical to characterizing a model's behaviour on a given machine-learning problem. While model quality is commonly measured by the average loss assessed on a testing set, this quantity does not ascertain the existence of the mean of the loss distribution. Conversely, the existence of a distribution's statistical moments can be verified by examining the thickness of its tails. Cross-validation schemes determine a family of testing loss distributions conditioned on the training sets. By marginalizing across training sets, we can recover the overall (marginal) loss distribution, whose tail shape we aim to estimate. Small sample sizes diminish the reliability and efficiency of classical tail-estimation methods like Peaks-Over-Threshold, and we demonstrate that this effect is particularly pronounced when estimating tails of marginal distributions composed of conditional distributions with substantial tail-location variability. We mitigate this problem by utilizing a result we prove: under certain conditions, the marginal distribution's tail-shape parameter is the maximum tail-shape parameter across the conditional distributions underlying the marginal. We label the resulting approach 'cross-tail estimation (CTE)'. We test CTE in a series of experiments on simulated and real data, showing improved robustness and quality of tail estimation compared to classical approaches.
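The idea of taking the maximum tail-shape estimate across conditional distributions can be illustrated with a minimal sketch. Several parts of it are assumptions rather than the paper's procedure: the Hill estimator is swapped in for a Peaks-Over-Threshold fit to keep the example dependency-free, the conditional loss distributions are simulated as exact Pareto samples, and all names (`hill_estimator`, `pareto_sample`, `tail_fraction`, the shapes and scales) are hypothetical. The conditions under which the maximum result holds are not checked here.

```python
import math
import random


def hill_estimator(sample, k):
    """Hill estimate of a positive tail-shape parameter from the k largest values."""
    top = sorted(sample, reverse=True)[: k + 1]
    threshold = top[k]  # the (k + 1)-th largest value acts as the threshold
    return sum(math.log(x / threshold) for x in top[:k]) / k


def pareto_sample(rng, shape, scale, n):
    """Draw n values with survival function (x / scale) ** (-1 / shape) for x >= scale."""
    return [scale * (1.0 - rng.random()) ** (-shape) for _ in range(n)]


rng = random.Random(0)

# Hypothetical setup: three "training sets", each inducing its own conditional
# loss distribution with a different tail shape and tail location (scale).
true_shapes = [0.5, 0.3, 0.2]
scales = [1.0, 5.0, 25.0]
n_per_split = 2000
tail_fraction = 0.1  # assumed share of each sample treated as tail

conditional_samples = [
    pareto_sample(rng, shape, scale, n_per_split)
    for shape, scale in zip(true_shapes, scales)
]

# Cross-tail idea: estimate each conditional tail separately, then take the maximum.
per_split = [
    hill_estimator(sample, int(tail_fraction * len(sample)))
    for sample in conditional_samples
]
cte_estimate = max(per_split)

# Baseline: pool every sample and estimate the marginal tail directly.
pooled = [x for sample in conditional_samples for x in sample]
pooled_estimate = hill_estimator(pooled, int(tail_fraction * len(pooled)))

print("per-split estimates:", per_split)
print("max over splits:", cte_estimate)
print("pooled estimate:", pooled_estimate)
```

In this toy setup the heaviest-tailed conditional distribution has the smallest scale, so the largest pooled observations mix contributions from all three distributions. Comparing the two printed estimates against the largest true shape (0.5 here) gives a rough feel for why tail-location variability can matter for a pooled fit.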
Keywords
Extreme Value Theory, Tail Modelling, Peaks-Over-Threshold, Cross-Tail Estimation, Model Ranking