DeepAdd: Protein function prediction from k-mer embedding and additional features.

Computational biology and chemistry(2020)

Abstract
With the application of new high-throughput sequencing technology, a large number of protein sequences are becoming available. Determining the functional characteristics of these proteins experimentally is expensive and time-consuming. Furthermore, at the organismal level, such experimental functional analyses can be conducted only for a few selected model organisms. Computational function prediction methods can be used to fill this gap. Protein functions are classified by the Gene Ontology (GO), which contains more than 40,000 classes in three domains: Molecular Function (MF), Biological Process (BP), and Cellular Component (CC). Additionally, since a protein can have many functions, function prediction is a multi-label, multi-class problem. We developed a new method to predict protein function from sequence. To this end, a natural language model is used to generate word embeddings of the sequence, from which features are learned by deep learning; additional features are used to locate every protein. Our method uses the dependencies between GO classes as background information to construct the deep learning model. We evaluate our method using the standards established by the Critical Assessment of Functional Annotation (CAFA) and observe a noticeable improvement over several algorithms, including FFPred, DeepGO, GoFDR, and other methods compared on the CAFA3 datasets.
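
As a rough illustration of two ideas mentioned in the abstract, the sketch below splits a protein sequence into overlapping k-mers (the "words" a language model would embed) and expands a set of GO annotations to include all ancestor classes, which is one common way to respect dependencies between GO classes in a multi-label target. This is not the authors' implementation: the value k = 3, the function names `kmer_tokens` and `propagate_to_ancestors`, the example sequence, and the toy hierarchy `TOY_PARENTS` are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def kmer_tokens(sequence: str, k: int = 3) -> List[str]:
    """Split a protein sequence into overlapping k-mers (stride 1).

    k = 3 is an assumed value; the abstract does not state which k is used.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def propagate_to_ancestors(labels: Set[str],
                           parents: Dict[str, Set[str]]) -> Set[str]:
    """Return the labels plus every ancestor reachable via `parents`.

    If a protein is annotated with a class, it is also treated as
    annotated with all of that class's ancestors in the hierarchy.
    """
    closed: Set[str] = set()
    stack = list(labels)
    while stack:
        term = stack.pop()
        if term in closed:
            continue
        closed.add(term)
        stack.extend(parents.get(term, ()))
    return closed


# Hypothetical toy hierarchy (child -> parents), not real GO data.
TOY_PARENTS: Dict[str, Set[str]] = {
    "protein_binding": {"binding"},
    "binding": {"molecular_function"},
}

tokens = kmer_tokens("MKTAYIAK", k=3)  # made-up example sequence
targets = propagate_to_ancestors({"protein_binding"}, TOY_PARENTS)
print(tokens)
print(sorted(targets))
```

In a full pipeline, the k-mer tokens would be mapped to embedding vectors and the propagated label set would become a binary target vector over GO classes; both steps are omitted here.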