A Sub-Sequence Based Approach to Protein Function Prediction via Multi-Attention Based Multi-Aspect Network
IEEE/ACM Transactions on Computational Biology and Bioinformatics(2023)
Abstract
Inferring protein function(s) via protein sub-sequence classification is often obstructed by a lack of knowledge about the function(s) of the sub-sequences within a protein sequence. In this regard, we develop a novel “multi-aspect” paradigm that performs sub-sequence classification efficiently by utilizing the information of the parent sequence. The aspects are: (i) Multi-label: independent labelling of sub-sequences with more than one function of the parent sequence, and (ii) Label-relevance: scoring the parent functions to highlight how relevant each function is to the sub-sequence. The multi-aspect paradigm is used to propose the “Multi-Attention Based Multi-Aspect Network” for classifying protein sub-sequences, where multi-attention is a novel approach to processing sub-sequences at the word level. Next, the proposed Global-ProtEnc method is a sub-sequence based approach to encoding protein sequences for the protein function prediction task, which is finally used to develop an ensemble method, Global-ProtEnc-Plus. Evaluations of both the Global-ProtEnc and the Global-ProtEnc-Plus methods on the benchmark CAFA3 dataset delivered outstanding performance. Compared to the state-of-the-art DeepGOPlus, the improvements in $F_{max}$ with Global-ProtEnc-Plus are +6.50 percent for the biological process and +1.90 percent for the cellular component.
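For readers unfamiliar with the metric, $F_{max}$ can be illustrated with a minimal sketch of the protein-centric definition commonly used in CAFA evaluations: the maximum, over score thresholds, of the harmonic mean of averaged precision and recall. This is not the authors' evaluation code. The function name `f_max`, the dictionary-based input layout, and the 0.01 threshold step are assumptions made for illustration, and the sketch does not propagate annotations through the ontology as a full evaluation would.

```python
def f_max(predictions, truths, thresholds=None):
    """Protein-centric F_max (illustrative sketch, not the paper's code).

    predictions: dict mapping protein id -> dict of {term: score in [0, 1]}
    truths:      dict mapping protein id -> non-empty set of true terms
    """
    if thresholds is None:
        # Assumed threshold grid: 0.01, 0.02, ..., 1.00
        thresholds = [i / 100 for i in range(1, 101)]

    best = 0.0
    for t in thresholds:
        precisions = []   # one entry per protein with at least one prediction at t
        recall_sum = 0.0  # accumulated over all proteins in `truths`
        for protein, true_terms in truths.items():
            scores = predictions.get(protein, {})
            predicted = {term for term, s in scores.items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recall_sum += hits / len(true_terms)

        if not precisions:
            continue  # no protein has a prediction at this threshold
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(truths)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best
```

Precision is averaged only over proteins that receive at least one prediction at the threshold, while recall is averaged over all proteins, which is why the two sums are tracked separately.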
Keywords
Multi-attention mechanism, protein sub-sequence, multi-attention based multi-aspect network