Optimally Learning Populations of Parameters.

Neural Information Processing Systems (2017)

Abstract
Consider the following fundamental estimation problem: there are $n$ entities, each with an unknown parameter $p_i \in [0,1]$, and we observe $n$ independent random variables, $X_1,\ldots,X_n$, with $X_i \sim$ Binomial$(t, p_i)$. How accurately can one recover the ``histogram'' (i.e. cumulative density function) of the $p_i$s? While the empirical estimates would recover the histogram to earth mover distance $\Theta(\frac{1}{\sqrt{t}})$ (equivalently, $\ell_1$ distance between the CDFs), we show that, provided $n$ is sufficiently large, we can achieve error $O(\frac{1}{t})$ which is information theoretically optimal. We also extend our results to the multi-dimensional parameter case, capturing settings where each member of the population has multiple associated parameters. Beyond the theoretical results, we demonstrate that the recovery algorithm performs well in practice on a variety of datasets, providing illuminating insights into several domains, including politics, and sports analytics.
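To make the baseline in the abstract concrete, the following is a minimal simulation sketch of the empirical estimator $X_i/t$ only; it is not the paper's recovery algorithm, which the abstract does not specify. The function names (`simulate_counts`, `emd_1d`, `empirical_histogram_error`) and the choice of setting every $p_i = 0.5$ are assumptions made purely for illustration. The earth mover distance is computed using the standard fact that, for two equal-size empirical distributions on the line, it equals the mean absolute difference of the sorted values.

```python
import random


def simulate_counts(ps, t, rng):
    """Draw X_i ~ Binomial(t, p_i), each as a sum of t Bernoulli(p_i) trials."""
    return [sum(rng.random() < p for _ in range(t)) for p in ps]


def emd_1d(a, b):
    """Earth mover distance between two equal-size empirical distributions
    on the real line: the mean absolute difference of the sorted values."""
    if len(a) != len(b):
        raise ValueError("samples must have the same size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def empirical_histogram_error(ps, t, seed=0):
    """EMD between the true p_i values and the empirical estimates X_i / t."""
    rng = random.Random(seed)
    xs = simulate_counts(ps, t, rng)
    estimates = [x / t for x in xs]
    return emd_1d(ps, estimates)


if __name__ == "__main__":
    # Illustrative assumption: every entity has the same parameter 0.5,
    # so the true histogram is a point mass and all error is sampling noise.
    ps = [0.5] * 2000
    for t in (10, 40, 160):
        print(t, empirical_histogram_error(ps, t))
```

In this setup, quadrupling $t$ should roughly halve the printed error, which is consistent with the $\Theta(\frac{1}{\sqrt{t}})$ rate that the abstract attributes to the empirical estimates.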