Automatic Diverse Subset Selection From Enzyme Families by Solving the Maximum Diversity Problem

2022 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2022

Abstract
Enzymes are increasingly exploited in various industries for their potential as biocatalysts. Expanding the portfolio of available and useful biocatalysts depends on the reliable annotation of enzyme catalytic function. However, the required quality of such annotation can only be confidently guaranteed through experimental characterisation in the laboratory. The selection of catalytically diverse enzyme panels for experimental characterisation is therefore an important step for shedding light on the currently unannotated proteins in enzyme families. Because current selection methods lack efficiency and scalability and are not systematic, we present a novel approach for the automatic selection of subsets from enzyme families. A tabu search algorithm solving the maximum diversity problem for sequence identity was designed and implemented, and applied to three diverse enzyme families. We show that this approach automatically selects panels of enzymes that contain high richness and relative abundance of the known catalytic functions, and that it outperforms other methods such as k-medoids.
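To illustrate the general idea, the following is a minimal Python sketch of a swap-based tabu search for the maximum diversity problem: choose m items out of n so that the sum of pairwise distances within the chosen subset is as large as possible. It is not the authors' implementation. The function names, the tabu tenure, the iteration budget, and the aspiration rule are all assumptions made for illustration, and the distance matrix is assumed to be symmetric (for sequences, one plausible choice would be 1 minus pairwise sequence identity).

```python
import random
from itertools import combinations


def subset_diversity(dist, subset):
    """Sum of pairwise distances between all members of the subset."""
    return sum(dist[i][j] for i, j in combinations(sorted(subset), 2))


def tabu_search_mdp(dist, m, iterations=200, tenure=3, seed=0):
    """Illustrative swap-based tabu search for the maximum diversity problem.

    dist: symmetric n x n matrix of pairwise distances (assumption).
    m: number of items to select.
    iterations, tenure, seed: hypothetical tuning parameters.
    Returns (sorted list of selected indices, diversity score).
    """
    rng = random.Random(seed)
    n = len(dist)
    current = set(rng.sample(range(n), m))
    current_score = subset_diversity(dist, current)
    best, best_score = set(current), current_score
    tabu_until = [0] * n  # item i is tabu while tabu_until[i] > iteration

    for it in range(1, iterations + 1):
        best_move, best_move_score = None, None
        for out in current:
            for inn in range(n):
                if inn in current:
                    continue
                # Score change from swapping `out` for `inn`.
                others = [k for k in current if k != out]
                delta = sum(dist[inn][k] - dist[out][k] for k in others)
                cand = current_score + delta
                is_tabu = tabu_until[inn] > it or tabu_until[out] > it
                # Aspiration: a tabu move is allowed only if it beats the best score so far.
                if is_tabu and cand <= best_score:
                    continue
                if best_move_score is None or cand > best_move_score:
                    best_move, best_move_score = (out, inn), cand
        if best_move is None:
            continue  # every move is tabu this iteration; wait for tenures to expire
        out, inn = best_move
        current.remove(out)
        current.add(inn)
        current_score = best_move_score
        tabu_until[out] = it + tenure
        tabu_until[inn] = it + tenure
        if current_score > best_score:
            best, best_score = set(current), current_score

    return sorted(best), best_score


# Toy usage: six points on a line, distance = absolute difference.
positions = [0, 1, 2, 10, 11, 20]
toy_dist = [[abs(a - b) for b in positions] for a in positions]
selected, score = tabu_search_mdp(toy_dist, m=3)
print(selected, score)
```

In the toy example the distances are integers rather than sequence identities, chosen only so that the result is easy to check by hand: for three points on a line, the sum of pairwise distances is twice the range, so any best subset must include both end points.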
Keywords
biocatalysts, enzyme panels, maximum diversity problem