A Sub-Sequence Based Approach to Protein Function Prediction via Multi-Attention Based Multi-Aspect Network

IEEE/ACM Transactions on Computational Biology and Bioinformatics (2023)

Abstract
Inferring protein function(s) via protein sub-sequence classification is often obstructed by the lack of knowledge about the function(s) of sub-sequences within the protein sequence. To address this, we develop a novel “multi-aspect” paradigm that performs sub-sequence classification efficiently by utilizing information from the parent sequence. The aspects are: (i) Multi-label: independent labelling of each sub-sequence with more than one function of the parent sequence, and (ii) Label-relevance: scoring the parent functions to highlight how relevant each function is to the sub-sequence. The multi-aspect paradigm is used to propose the “Multi-Attention Based Multi-Aspect Network” for classifying protein sub-sequences, where multi-attention is a novel approach to processing sub-sequences at the word level. Next, the proposed Global-ProtEnc method is a sub-sequence based approach to encoding protein sequences for the protein function prediction task, which is finally used to develop an ensemble method, Global-ProtEnc-Plus. Evaluations of both the Global-ProtEnc and the Global-ProtEnc-Plus methods on the benchmark CAFA3 dataset delivered outstanding performance. Compared to the state-of-the-art DeepGOPlus, the improvements in $F_{max}$ with Global-ProtEnc-Plus are +6.50 percent for the biological process and +1.90 percent for the cellular component.
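For readers unfamiliar with the $F_{max}$ metric reported above, the following is a minimal sketch of the protein-centric $F_{max}$ as it is commonly defined in CAFA-style evaluations: the maximum, over score thresholds, of the harmonic mean of average precision and average recall. It is not taken from the paper; the function name `f_max`, the dictionary-based inputs, and the 100-step threshold grid are assumptions made for illustration, and ontology-aware details such as propagating annotations to ancestor terms are omitted.

```python
from typing import Dict, Set


def f_max(
    y_true: Dict[str, Set[str]],
    y_score: Dict[str, Dict[str, float]],
    steps: int = 100,  # assumed threshold grid: 0.01, 0.02, ..., 1.00
) -> float:
    """Protein-centric F_max over a grid of score thresholds.

    y_true maps each protein ID to its set of true function terms.
    y_score maps each protein ID to {term: predicted score in [0, 1]}.
    """
    best = 0.0
    for i in range(1, steps + 1):
        t = i / steps
        precisions = []   # only proteins with at least one prediction >= t
        recall_sum = 0.0  # accumulated over all proteins
        for protein, true_terms in y_true.items():
            scores = y_score.get(protein, {})
            predicted = {term for term, s in scores.items() if s >= t}
            tp = len(predicted & true_terms)
            if predicted:
                precisions.append(tp / len(predicted))
            if true_terms:
                recall_sum += tp / len(true_terms)
        if not precisions:
            continue  # no protein has a prediction at this threshold
        pr = sum(precisions) / len(precisions)
        rc = recall_sum / len(y_true)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


# Toy usage with hypothetical term names "A", "B", "C".
truth = {"P1": {"A", "B"}}
preds = {"P1": {"A": 0.9, "C": 0.8}}
print(f_max(truth, preds))
```

In this toy case the best threshold keeps only term "A", giving precision 1 and recall 0.5. Precision is averaged only over proteins that receive at least one prediction at the threshold, while recall is averaged over all proteins, which is why the two averages use different denominators.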
Keywords
Multi-attention mechanism,protein sub-sequence,multi-attention based multi-aspect network