Automatic Diverse Subset Selection From Enzyme Families by Solving the Maximum Diversity Problem

Christian Atallah, Katherine James, Zhen Ou, James Skelton, David Markham, James Finnigan, Simon Charnock, Anil Wipat

2022 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB) (2022)

Abstract

Enzymes are being increasingly exploited in various industries for their potential as biocatalysts. Increasing the portfolio of available and useful biocatalysts depends on the reliable annotation of enzyme catalytic function. However, the required quality of such annotation can only be confidently guaranteed through experimental characterisation in the laboratory. The selection of catalytically diverse enzyme panels for experimental characterisation is therefore an important step for shedding light on the currently unannotated proteins in enzyme families. As current selection methods lack efficiency and scalability, and are non-systematic, we present a novel approach for the automatic selection of subsets from enzyme families. A tabu search algorithm solving the maximum diversity problem for sequence identity was designed and implemented, and applied to three diverse enzyme families. We show that this approach automatically selects panels of enzymes that contain high richness and relative abundance of the known catalytic functions, and outperforms other methods such as k-medoids.
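
The idea of a tabu search for the maximum diversity problem can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes a symmetric matrix of pairwise distances (for example, 1 minus the pairwise sequence identity), and the function names, the swap neighbourhood, the tabu tenure and the iteration count are all hypothetical choices made for illustration.

```python
import random
from itertools import combinations


def subset_diversity(dist, subset):
    """Sum of pairwise distances among the selected items."""
    return sum(dist[i][j] for i, j in combinations(subset, 2))


def tabu_search_mdp(dist, m, iterations=200, tenure=5, seed=0):
    """Pick m of n items so that the summed pairwise distance is large.

    Illustrative swap-move tabu search; `tenure` and `iterations` are
    assumed values, not parameters taken from the paper.
    """
    rng = random.Random(seed)
    n = len(dist)
    selected = set(rng.sample(range(n), m))
    score = subset_diversity(dist, selected)
    best, best_score = set(selected), score
    tabu_until = [0] * n  # iteration up to which an item may not be moved

    for it in range(1, iterations + 1):
        move, move_score = None, float("-inf")
        for out in selected:
            for inn in range(n):
                if inn in selected:
                    continue
                # Change in diversity when `out` is swapped for `inn`.
                delta = sum(dist[inn][k] - dist[out][k]
                            for k in selected if k != out)
                cand = score + delta
                is_tabu = tabu_until[out] > it or tabu_until[inn] > it
                # Aspiration: a tabu move is allowed only if it beats the best.
                if is_tabu and cand <= best_score:
                    continue
                if cand > move_score:
                    move, move_score = (out, inn), cand
        if move is None:
            continue  # every move is tabu; wait for tenures to expire
        out, inn = move
        selected.remove(out)
        selected.add(inn)
        score = move_score
        tabu_until[out] = tabu_until[inn] = it + tenure
        if score > best_score:
            best, best_score = set(selected), score
    return sorted(best), best_score
```

Calling `tabu_search_mdp(dist, m)` returns the indices of the selected panel together with its diversity score. Accepting the best non-tabu swap even when it lowers the score is what lets the search leave local optima, which a plain greedy swap heuristic cannot do.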

Keywords

biocatalysts, enzyme panels, maximum diversity problem
