pLM-BLAST - distant homology detection based on direct comparison of sequence representations from protein language models

bioRxiv (2022)

Abstract
Homology detection by sequence comparison is a typical first step in the study of protein function and evolution. Here, we describe a new homology detection tool, pLM-BLAST, that uses a modified Smith-Waterman algorithm for unsupervised comparison of single-sequence representations obtained from a protein language model (such as ProtT5) trained on millions of sequences. In our benchmarks, pLM-BLAST has shown the ability to detect homology between highly divergent proteins, demonstrating its applicability to tasks such as protein classification, domain annotation, and function prediction.

Availability and Implementation: pLM-BLAST is available as a web server in the MPI Bioinformatics Toolkit (https://toolkit.tuebingen.mpg.de/tools/pLM-BLAST), where it can be used to search precomputed databases. It is also available as a standalone tool to build custom databases and run batch searches (https://github.com/labstructbioinf/pLM-BLAST).

Competing Interest Statement: The authors have declared no competing interest.
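
The idea in the abstract, local alignment computed directly on per-residue embeddings, can be illustrated with a minimal sketch. This is not the pLM-BLAST implementation and does not reproduce the authors' modifications to Smith-Waterman. It is a textbook local alignment in which the substitution score is the cosine similarity between embedding vectors. The function names, the `gap` penalty, and the `shift` offset are assumptions made for illustration, and random arrays stand in for real language-model embeddings such as those from ProtT5.

```python
import numpy as np


def cosine_similarity_matrix(emb_a, emb_b):
    """Pairwise cosine similarity between two (length, dim) embedding arrays."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    return a @ b.T


def local_alignment_score(sim, gap=0.5, shift=0.5):
    """Plain Smith-Waterman over a similarity matrix (illustrative only).

    `shift` is subtracted from every similarity so that weakly similar
    residue pairs score negatively; `gap` is a linear gap penalty.
    Both values are assumptions, not parameters taken from pLM-BLAST.
    Returns the best local score and the (i, j) cell where it ends.
    """
    n, m = sim.shape
    h = np.zeros((n + 1, m + 1))
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = h[i - 1, j - 1] + sim[i - 1, j - 1] - shift
            h[i, j] = max(0.0, match, h[i - 1, j] - gap, h[i, j - 1] - gap)
    end = np.unravel_index(h.argmax(), h.shape)
    return float(h.max()), (int(end[0]), int(end[1]))


# Stand-in embeddings: random vectors with a shared six-residue segment.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(12, 16))
emb_b = rng.normal(size=(10, 16))
emb_b[2:8] = emb_a[4:10]

score, end = local_alignment_score(cosine_similarity_matrix(emb_a, emb_b))
print(score, end)
```

In practice the embeddings would come from the protein language model rather than a random generator, and a traceback step would be needed to recover the aligned residues and not only the score.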