谷歌浏览器插件
订阅小程序
在清言上使用

Dissecting Factors Underlying Phylogenetic Uncertainty Using Machine Learning Models

bioRxiv (Cold Spring Harbor Laboratory)(2023)

引用 0|浏览11
暂无评分
摘要
Phylogenetic inference can be influenced by both underlying biological processes and methodological factors. While biological processes can be modeled, these models frequently make the assumption that methodological factors do not significantly influence the outcome of phylogenomic analyses. Depending on their severity, methodological factors can introduce inconsistency and uncertainty into the inference process. Although search protocols have been proposed to mitigate these issues, many solutions tend to treat factors independently or assume a linear relationship among them. In this study, we capitalize on the increasing size of phylogenetic datasets, using them to train machine learning models. This approach transcends the linearity assumption, accommodating complex non-linear relationships among features. We examined two phylogenomic datasets for teleost fishes: a newly generated dataset for protacanthopterygians (salmonids, galaxiids, marine smelts, and allies), and a reanalysis of a dataset for carangarians (flatfishes and allies). Upon testing five supervised machine learning models, we found that all outperformed the linear model (p < 0.05), with the deep neural network showing the best fit for both empirical datasets tested. Feature importance analyses indicated that influential factors were specific to individual datasets. The insights obtained have the potential to significantly enhance decision-making in phylogenetic analyses, assisting, for example, in the choice of suitable DNA sequence models and data transformation methods. This study can serve as a baseline for future endeavors aiming to capture non-linear interactions of features in phylogenomic datasets using machine learning and complement existing tools for phylogenetic analyses. ### Competing Interest Statement The authors have declared no competing interest.
更多
查看译文
关键词
phylogenetic uncertainty,machine learning,models
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要