EVlncRNA-Dpred: improved prediction of experimentally validated lncRNAs by deep learning

BRIEFINGS IN BIOINFORMATICS (2023)

Abstract
Long non-coding RNAs (lncRNAs) play essential roles in nearly every biological process and disease. Many algorithms have been developed to distinguish lncRNAs from mRNAs in transcriptomic data, facilitating the discovery of more than 600 000 lncRNAs. However, only a tiny fraction (<1%) of lncRNA transcripts (~4000) have been further validated by low-throughput experiments (EVlncRNAs). Given the cost and labor-intensive nature of experimental validation, computational tools are needed to prioritize potentially functional lncRNAs, because many lncRNAs from high-throughput sequencing (HTlncRNAs) could result from transcriptional noise. Here, we employed deep learning algorithms to separate EVlncRNAs from HTlncRNAs and mRNAs. To overcome the challenge of small datasets, we employed a three-layer deep-learning neural network (DNN) with a K-mer feature as the input and a small convolutional neural network (CNN) with one-hot encoding as the input. Three separate models were trained for human (h), mouse (m) and plant (p), respectively. The final concatenated models (EVlncRNA-Dpred (h), EVlncRNA-Dpred (m) and EVlncRNA-Dpred (p)) provided substantial improvement over a previous model based on support-vector machines (EVlncRNA-pred). For example, EVlncRNA-Dpred (h) achieved 0.896 for the area under the receiver-operating characteristic curve, compared with 0.582 given by the sequence-based EVlncRNA-pred model. The models developed here should be useful for screening lncRNA transcripts for experimental validation. EVlncRNA-Dpred is available as a web server at https://www.sdklab-biophysics-dzu.net/EV1naNA-Dprediindex.html, and the data and source code are freely available along with the web server.
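The two input representations named in the abstract, a K-mer feature vector for the DNN and a one-hot matrix for the CNN, can be illustrated with a minimal sketch. This is not the authors' code: the function names `kmer_frequencies` and `one_hot`, the default `k=3`, the handling of ambiguous bases, and the fixed-length padding scheme are all assumptions made for illustration, since the abstract does not specify them.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: RNA is handled as its DNA alphabet (U mapped to T)


def kmer_frequencies(seq, k=3):
    """Return normalized k-mer frequencies in fixed lexicographic order.

    The vector has 4**k entries. Windows containing characters outside
    ALPHABET (for example N) are skipped; this is an illustrative choice.
    """
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def one_hot(seq, length):
    """Encode seq as a length x 4 matrix of 0/1 values.

    Sequences longer than `length` are truncated; shorter ones are padded
    with all-zero rows. Unknown bases also become all-zero rows.
    """
    seq = seq.upper().replace("U", "T")
    rows = []
    for i in range(length):
        base = seq[i] if i < len(seq) else None
        rows.append([1 if base == b else 0 for b in ALPHABET])
    return rows
```

Calling `kmer_frequencies(transcript, k=3)` yields a 64-dimensional vector suitable as input to a fully connected network, while `one_hot(transcript, length)` yields a fixed-size matrix suitable for a convolutional network. How the two branches are concatenated in EVlncRNA-Dpred is described only at a high level in the abstract and is not reproduced here.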
Keywords
experimentally validated lncRNAs,deep learning,prediction