Correcting Bias in Allele Frequency Estimates Due to an Observation Threshold: A Markov Chain Analysis.

Toni I Gossmann, David Waxman

Genome Biology and Evolution (2022)

Abstract
There are many problems in biology and related disciplines involving stochasticity, where a signal can only be detected when it lies above a threshold level, while signals lying below threshold are simply not detected. A consequence is that the detected signal is conditioned to lie above threshold, and is not representative of the actual signal. In this work, we present some general results for the conditioning that occurs due to the existence of such an observational threshold. We show that this conditioning is relevant, for example, to gene-frequency trajectories, where many loci in the genome are simultaneously measured in a given generation. Such a threshold can lead to severe biases of allele frequency estimates under purifying selection. In the analysis presented, within the context of Markov chains such as the Wright-Fisher model, we address two key questions: (1) "What is a natural measure of the strength of the conditioning associated with an observation threshold?" (2) "What is a principled way to correct for the effects of the conditioning?". We answer the first question in terms of a proportion. Starting with a large number of trajectories, the relevant quantity is the proportion of these trajectories that are above threshold at a later time and hence are detected. The smaller the value of this proportion, the stronger the effects of conditioning. We provide an approximate analytical answer to the second question, that corrects the bias produced by an observation threshold, and performs to reasonable accuracy in the Wright-Fisher model for biologically plausible parameter values.
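The proportion described in the abstract can be illustrated with a small exact Markov chain calculation. The sketch below is not the authors' method or their correction formula. It assumes a haploid Wright-Fisher model with relative fitness 1 + s, treats a frequency as detected when it is at or above the threshold, and uses hypothetical function names and parameter values chosen only for illustration. It computes the proportion of trajectories above threshold at a later time, and compares the true mean allele frequency with the mean over detected trajectories only.

```python
from math import comb


def wf_transition_matrix(n, s):
    """Row i: distribution of next-generation allele counts given i copies now.

    Assumption: haploid Wright-Fisher model, population size n,
    focal allele with relative fitness 1 + s (s < 0 is purifying selection).
    """
    matrix = []
    for i in range(n + 1):
        x = i / n
        # Frequency after selection, before binomial sampling (drift).
        p = x * (1 + s) / (1 + s * x)
        row = [comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(n + 1)]
        matrix.append(row)
    return matrix


def propagate(dist, matrix, generations):
    """Advance a probability distribution over allele counts by some generations."""
    size = len(dist)
    for _ in range(generations):
        dist = [
            sum(dist[i] * matrix[i][j] for i in range(size))
            for j in range(size)
        ]
    return dist


def threshold_summary(n, s, start_count, generations, threshold):
    """Return (proportion detected, true mean, mean of detected trajectories)."""
    dist = [0.0] * (n + 1)
    dist[start_count] = 1.0  # all trajectories start at the same count
    dist = propagate(dist, wf_transition_matrix(n, s), generations)

    freqs = [j / n for j in range(n + 1)]
    # Assumption: "detected" means frequency at or above the threshold.
    detected = sum(p for x, p in zip(freqs, dist) if x >= threshold)
    true_mean = sum(x * p for x, p in zip(freqs, dist))
    observed_mean = (
        sum(x * p for x, p in zip(freqs, dist) if x >= threshold) / detected
    )
    return detected, true_mean, observed_mean


# Illustrative (hypothetical) parameter values.
detected, true_mean, observed_mean = threshold_summary(
    n=50, s=-0.05, start_count=10, generations=20, threshold=0.1
)
print("proportion detected:", detected)
print("true mean frequency:", true_mean)
print("mean frequency of detected trajectories:", observed_mean)
```

The mean over detected trajectories exceeds the true mean, and the gap widens as the detected proportion shrinks, which is the sense in which a smaller proportion signals stronger conditioning.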
Keywords
conditioned observations, missing values, random genetic drift, Wright-Fisher model, population genetics theory, stochastic population dynamics