Learning Populations of Parameters
Advances in Neural Information Processing Systems 30 (NIPS 2017), 2017
Abstract
Consider the following estimation problem: there are n entities, each with an unknown parameter p_i ∈ [0,1], and we observe n independent random variables, X_1,…,X_n, with X_i ∼ Binomial(t, p_i). How accurately can one recover the "histogram" (i.e., the cumulative distribution function) of the p_i's? While the empirical estimates X_i/t would recover the histogram to earth mover distance Θ(1/√t) (equivalently, ℓ_1 distance between the CDFs), we show that, provided n is sufficiently large, we can achieve error O(1/t), which is information-theoretically optimal. We also extend our results to the multi-dimensional parameter case, capturing settings where each member of the population has multiple associated parameters. Beyond the theoretical results, we demonstrate that the recovery algorithm performs well in practice on a variety of datasets, providing illuminating insights into several domains, including politics, sports analytics, and variation in the gender ratio of offspring.
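The setup and the empirical baseline can be made concrete with a small simulation. The sketch below is illustrative only: it draws a hypothetical population of p_i uniformly from [0,1] (an assumption, not a choice made in the abstract), samples X_i ∼ Binomial(t, p_i), and measures the earth mover distance between the true p_i's and the empirical estimates X_i/t. It does not implement the paper's recovery algorithm; the function names are invented for this example.

```python
import random


def simulate_counts(ps, t, rng):
    """Draw X_i ~ Binomial(t, p_i) as the sum of t Bernoulli(p_i) trials."""
    return [sum(rng.random() < p for _ in range(t)) for p in ps]


def earth_mover_distance(a, b):
    """Earth mover (Wasserstein-1) distance between two equal-size samples.

    For two empirical distributions on the line with the same number of
    points, this is the mean absolute difference of the sorted values,
    which equals the l1 distance between the two empirical CDFs.
    """
    if len(a) != len(b):
        raise ValueError("samples must have the same size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


rng = random.Random(0)           # fixed seed so the run is reproducible
n, t = 20000, 10                 # population size and trials per entity

# Hypothetical population: p_i uniform on [0, 1] (an assumption for illustration).
ps = [rng.random() for _ in range(n)]
xs = simulate_counts(ps, t, rng)

# Empirical baseline: estimate each p_i by X_i / t.
empirical = [x / t for x in xs]

err = earth_mover_distance(ps, empirical)
print(f"EMD between true and empirical histograms (t={t}): {err:.4f}")
```

Rerunning with larger t shows how the baseline's error shrinks as each entity is observed more often; comparing against an O(1/t) method would require the recovery algorithm itself, which the abstract does not specify.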