An analytical derivation of the distribution of distances between heterozygous sites in diploid species to efficiently infer demographic history

bioRxiv (Cold Spring Harbor Laboratory) (2023)

Abstract
Heterozygous sites are not uniformly distributed along a diploid genome. Rather, their density varies as a result of recombination events, and their local density reflects the time to the last common ancestor of the maternal and paternal copies of a genomic region. The distribution of the density of heterozygous sites therefore carries information about the history of the population size. Despite previous efforts, an exact derivation of the distribution of heterozygous sites is still lacking. As a consequence, the estimation of population size variation is difficult and requires several simplifying assumptions. Using a novel theoretical framework, we are able to derive an analytical formula for the distribution of distances between heterozygous sites. Our theory can account for arbitrary demographic histories, including bottlenecks. In the case of a constant population size the distribution follows a simple function and exhibits a power-law tail proportional to r^α with α = −3, where r is the distance between heterozygous sites. This prediction is accurately validated when considering heterozygous sites in individuals of African descent. Other populations migrated out of Africa and underwent at least one bottleneck which left a distinct mark on their interval distribution between heterozygous sites, i.e., an overrepresentation of intervals between 10 and 100 kbp in length. Our analytical theory for non-constant population sizes reproduces this behavior and can be used to study historical changes in population size with high accuracy. The simplicity of our approach facilitates the analysis of demographic histories for diploid species, requiring only a single unphased genome.

Competing Interest Statement
The authors have declared no competing interest.
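
To make the quantity in the abstract concrete, the following minimal sketch computes distances between consecutive heterozygous sites and estimates a power-law tail exponent from them. It is not the authors' derivation or software: the function names, the example positions, and the cutoff `r_min` are hypothetical, and the estimator is the standard continuous maximum-likelihood (Hill-type) estimator for a power-law tail, swapped in here for illustration only.

```python
import math


def inter_het_distances(positions):
    """Distances (in bp) between consecutive heterozygous sites on one chromosome."""
    ordered = sorted(positions)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def tail_exponent(distances, r_min):
    """Estimate alpha for a tail density proportional to r**alpha, r >= r_min.

    Assumption: continuous power-law maximum-likelihood estimator, not the
    method of the paper. The returned value is negative, e.g. close to -3
    for the constant-population-size tail quoted in the abstract.
    """
    tail = [r for r in distances if r >= r_min]
    log_sum = sum(math.log(r / r_min) for r in tail)
    if log_sum <= 0:
        raise ValueError("need at least one distance strictly above r_min")
    return -(1.0 + len(tail) / log_sum)


# Hypothetical heterozygous-site coordinates, for illustration only.
het_positions = [1200, 1500, 9800, 10050, 40200]
print(inter_het_distances(het_positions))  # [300, 8300, 250, 30150]
```

On real data the estimate depends on the choice of `r_min`, which would need to be set above the range where the distribution departs from a pure power law. Since the abstract reports extra intervals between 10 and 100 kbp for bottlenecked populations, a single tail exponent is at best a rough summary there.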
Keywords
diploid species, heterozygous sites