An impossibility result for phylogeny reconstruction from k-mer counts

arXiv (2022)

Abstract
We consider phylogeny estimation under a two-state model of sequence evolution by site substitution on a tree. In the asymptotic regime where the sequence lengths tend to infinity, we show that for any fixed k no statistically consistent phylogeny estimation is possible from k-mer counts over the full leaf sequences alone. Formally, we establish that the joint distributions of k-mer counts over the entire leaf sequences on two distinct trees have total variation distance bounded away from 1 as the sequence length tends to infinity. Our impossibility result implies that statistical consistency requires more sophisticated use of k-mer count information, such as block techniques developed in previous theoretical work.
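As a rough illustration of the summary statistic the abstract refers to, the sketch below counts k-mers over a full two-state sequence. It rests on assumptions not stated in the abstract: k-mers are taken as overlapping windows, the alphabet is written as {0, 1}, and the function name `kmer_counts` and the two leaf strings are hypothetical toy values, not data from the paper.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq: str, k: int) -> dict[str, int]:
    """Count every length-k substring of a two-state sequence.

    Assumption: k-mers are overlapping windows over the whole sequence.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Report all 2**k possible k-mers over {0, 1}, including those with count 0.
    return {"".join(p): counts["".join(p)] for p in product("01", repeat=k)}


# Hypothetical leaf sequences, made up for illustration only.
leaf_a = "0110100110"
leaf_b = "0110110010"

print(kmer_counts(leaf_a, 2))
print(kmer_counts(leaf_b, 2))
```

The count vectors discard the position of each k-mer along the sequence, so they no longer record which sites of different leaves are aligned with each other. This is the kind of full-sequence summary the impossibility result concerns.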
Keywords
Phylogenetics, k-mer, statistical consistency, information-theoretic bounds, Markov models on trees