DPProm: A Two-Layer Predictor for Identifying Promoters and Their Types on Phage Genome Using Deep Learning

IEEE Journal of Biomedical and Health Informatics (2022)

Abstract
As the number of sequenced phage genomes grows, new bioinformatics methods for phage genome annotation are urgently needed. A promoter is a DNA region that plays an important role in the transcriptional regulation of genes. In the post-genomic era, the availability of data makes it possible to build robust computational models for promoter identification. In this work, we introduce DPProm, a two-layer model composed of DPProm-1L and DPProm-2L, to predict promoters and their types in phages. The first layer, DPProm-1L, is a dual-channel deep neural network ensemble that fuses multi-view features (a sequence feature and a handcrafted feature) to decide whether a DNA sequence is a promoter or a non-promoter. The sequence feature is extracted with a convolutional neural network (CNN), and the handcrafted feature combines free energy, GC content, cumulative skew, and Z curve features. The second layer, DPProm-2L, is a CNN-based model trained to predict the promoter type (host or phage). To enable prediction on whole genomes, DPProm is combined with a novel sequence data processing workflow that contains sliding window and sequence merging modules. Experimental results show that DPProm outperforms state-of-the-art methods and effectively decreases the false positive rate in whole-genome prediction. Furthermore, we provide a user-friendly web server at http://bioinfo.ahu.edu.cn/DPProm . We expect that DPProm can serve as a useful tool for identifying promoters and their types.
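As a rough illustration of the handcrafted features and the whole-genome workflow named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It computes GC content, a per-base cumulative GC skew, and the standard Z curve coordinates, then cuts a genome into sliding windows and merges overlapping positive windows. All function names, the per-base skew definition, the window width and step, and the half-open interval convention are assumptions made for this example; the abstract does not specify them, and the free energy feature is omitted because it depends on a parameter table not given here.

```python
from typing import List, Tuple


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cumulative_gc_skew(seq: str) -> List[int]:
    """Running sum with +1 per G and -1 per C (assumed per-base definition)."""
    total = 0
    skew = []
    for base in seq.upper():
        if base == "G":
            total += 1
        elif base == "C":
            total -= 1
        skew.append(total)
    return skew


def z_curve(seq: str) -> List[Tuple[int, int, int]]:
    """Standard Z curve coordinates after each base:
    x = (A+G)-(C+T), y = (A+C)-(G+T), z = (A+T)-(G+C), using cumulative counts.
    """
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    points = []
    for base in seq.upper():
        if base in counts:  # ambiguous bases such as N leave the counts unchanged
            counts[base] += 1
        a, c, g, t = counts["A"], counts["C"], counts["G"], counts["T"]
        points.append(((a + g) - (c + t), (a + c) - (g + t), (a + t) - (g + c)))
    return points


def sliding_windows(genome: str, width: int, step: int) -> List[Tuple[int, str]]:
    """Return (start, subsequence) pairs for every full-width window."""
    return [
        (start, genome[start:start + width])
        for start in range(0, len(genome) - width + 1, step)
    ]


def merge_hits(hits: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge overlapping or touching half-open intervals [start, end)."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(hits):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

In a whole-genome setting, each window produced by `sliding_windows` would be scored by a classifier, and the windows predicted as promoters would be passed to `merge_hits` as `(start, start + width)` intervals so that adjacent positive windows are reported as one candidate region.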
Keywords
Deep learning, multi-view features, phage, whole genome