Clusters with random size: maximum likelihood versus weighted estimation

Statistica Sinica (2018)

Abstract
The analysis of hierarchical data that take the form of clusters with random size has received considerable attention. The focus here is on samples that are very large in terms of the number of clusters and/or members per cluster, on the one hand, and on very small samples (e.g., when studying rare diseases), on the other. Whereas maximum likelihood inference is straightforward in medium to large samples, for the sample sizes considered here it may be prohibitive. We propose sample-splitting (Molenberghs, Verbeke and Iddi (2011)) as a way to replace iterative optimization of a likelihood that admits no analytical solution with closed-form calculations. We use pseudo-likelihood (Molenberghs et al. (2014)), which amounts to computing weighted averages over the solutions obtained for each occurring cluster size. The statistical properties of this approach therefore need to be investigated, especially because the minimal sufficient statistics involved are incomplete. The operational characteristics were studied using simulations. Simulations were also used to compare the proposed method with existing techniques developed to circumvent difficulties with unequal cluster sizes, such as multiple imputation. The proposed non-iterative methods reduce computation time substantially; at the same time, the method is the most precise among the competitors considered. The findings are illustrated using data from a developmental toxicity study, where clusters are formed of fetuses within litters.
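The split-sample strategy sketched in the abstract can be illustrated with a toy example (Python; not the authors' code): clusters are partitioned by size, a closed-form estimate is computed within each size-homogeneous subsample, and the estimates are then combined by a weighted average. The common-mean model and the observation-count weights below are simplifying assumptions for illustration only; the paper derives the actual pseudo-likelihood weights.

```python
# Minimal sketch of the split-by-cluster-size idea, under assumptions:
# a common mean mu is estimated per size-homogeneous subsample in closed
# form, then the per-size estimates are combined with (assumed) weights
# proportional to the number of observations contributed.
import numpy as np

rng = np.random.default_rng(0)

# Simulate clusters with random size: y_ij = mu + b_i + e_ij
mu_true, sd_b, sd_e = 2.0, 1.0, 0.5
clusters = []
for _ in range(200):
    n_i = rng.integers(2, 6)          # random cluster size (2..5)
    b_i = rng.normal(0.0, sd_b)       # cluster-level random effect
    clusters.append(mu_true + b_i + rng.normal(0.0, sd_e, size=n_i))

# Split the sample by cluster size; within each subsample all clusters
# share one size, so the mean estimate is available in closed form.
by_size = {}
for y in clusters:
    by_size.setdefault(len(y), []).append(y)

estimates, weights = [], []
for n_i, group in by_size.items():
    data = np.concatenate(group)
    estimates.append(data.mean())     # closed form, no iteration
    weights.append(data.size)         # assumed weight: observation count

mu_hat = np.average(estimates, weights=weights)
print(f"combined estimate of mu: {mu_hat:.3f} (true value {mu_true})")
```

No iterative optimizer is invoked at any point, which is the source of the computational gain the abstract reports.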
Keywords
Likelihood inference, pseudo-likelihood, unequal cluster size