Novel tools for the prediction of promoters in plants and bacteria

Faktori eksperimental'noi evolucii organizmiv (2017)

Abstract
© SHAHMURADOV I.A. The promoter is a chromosomal region that determines where transcription of a particular DNA region is initiated. Promoter recognition is important for defining the transcription units responsible for specific pathways and gene regulation. Initiation of transcription is a dynamic partnership between RNA polymerase (RNAP) and the promoter. In the nuclear genomes of eukaryotic organisms, transcription is carried out by multiple types of RNA polymerase. In particular, all protein-coding genes and most noncoding RNA genes, as well as DNA regions of unknown function, are transcribed by RNA polymerase II. Some 30–50 % of all known promoters contain a TATA-box located 40 to 18 bp upstream of the transcription start site (TSS). However, promoters of many large groups of genes (e.g. housekeeping genes) lack the TATA-box; these are referred to as TATA-less promoters [1–3].
In contrast to eukaryotes, bacteria have a single form of the RNAP core enzyme [4]. However, this RNAP alone cannot recognize and bind promoters to initiate transcription. Different σ factors are required, which temporarily bind the RNAP core enzyme and determine the RNAP-promoter binding specificity and the TSS, depending on nutritional or environmental conditions or developmental stage [5, 6]. Bacterial σ factors are classified into two families with distinct structure and function, termed σ70 and σ54 in Escherichia coli. While most bacteria possess multiple members of the σ70 family, they contain a single representative of the σ54 family, which is involved in nitrogen metabolism. Cyanobacteria lack any σ54-like factors [5, 7, 8].
Owing to advanced experimental technologies, great progress has been made in the analysis of gene regulatory sequences [9–11]. However, detailed experimental exploration of transcripts is still quite expensive and difficult.
Therefore, in addition to experimental efforts, accurate computational identification of putative promoter regions remains an important task in genomics and post-genomics studies. Over the last decade, various promoter prediction programs have been developed. Recent studies indicate that there is often no single TSS, but rather a whole transcription start region (TSR) with multiple TSSs [12, 13]. However, for genome annotation projects, predicting TSRs that span several hundred nucleotides (from 250 up to 1000) is less useful for identifying a gene start point; for such tasks, finding the TSSs is more informative. To date, various computer programs for predicting plant promoters have been developed [14–18]. In particular, we previously developed the TSSP-TCM program, which showed quite high accuracy of TSS prediction on test sequences with experimentally validated TSSs: 87.5 % and 84 % for TATA and TATA-less promoters, respectively.
The first attempts to predict bacterial promoters used position weight matrices (PWMs), relying on the conservation of the -35 and -10 elements for σ70, combined with the distribution of the distance between them [19, 20]. Later, more accurate bacterial promoter prediction tools were developed [21–28]. Despite these efforts, all these tools tend to produce many false positives or show poor sensitivity, particularly when applied to long sequences or whole genomes. Another restriction is that they are limited to the prediction of σ70 promoters in the model organism E. coli and can very rarely be extended to other bacterial species. Therefore, novel, more accurate and efficient tools are required for the computational recognition of different classes of promoters across a broader taxonomic scope. In this paper, two new computer tools are briefly described: TSSPlant, for the prediction of plant RNA polymerase II promoters, and bTSSfinder, for predicting TSSs in E. coli and three cyanobacterial species.
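To make the position weight matrix idea mentioned above more concrete, the following Python sketch scores a sequence for a -35 box and a -10 box separated by a variable spacer. It is only an illustration under stated assumptions and is not the algorithm of TSSPlant, bTSSfinder, or any of the cited tools: the aligned training sites are toy variants of the textbook consensus hexamers TTGACA and TATAAT, and the function names, the 15 to 19 bp spacer range, the preferred spacer of 17 bp, and the linear spacer penalty are all hypothetical choices made for the example.

```python
import math

BASES = "ACGT"

# Toy aligned sites (assumption): small variants of the textbook
# sigma-70 consensus hexamers, not real training data.
SITES_35 = ["TTGACA", "TTGACT", "TTTACA", "TTGATA", "CTGACA"]
SITES_10 = ["TATAAT", "TATGAT", "TAAAAT", "TACAAT", "TATACT"]


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM (one dict per column) from equal-length sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def score(pwm, window):
    """Sum the per-column log-odds for a window (uppercase A/C/G/T only)."""
    return sum(col[base] for col, base in zip(pwm, window))


def scan(seq, pwm35, pwm10, spacers=range(15, 20), optimal_spacer=17,
         spacer_penalty=0.5):
    """Return (best_score, start of -35 box, spacer length) or None.

    The combined score is the -35 score plus the -10 score minus a
    hypothetical linear penalty for deviating from the preferred spacer.
    """
    n35, n10 = len(pwm35), len(pwm10)
    best = None
    for i in range(len(seq) - n35 + 1):
        s35 = score(pwm35, seq[i:i + n35])
        for sp in spacers:
            j = i + n35 + sp  # start of the candidate -10 box
            if j + n10 > len(seq):
                break
            total = (s35 + score(pwm10, seq[j:j + n10])
                     - spacer_penalty * abs(sp - optimal_spacer))
            if best is None or total > best[0]:
                best = (total, i, sp)
    return best


pwm35 = build_pwm(SITES_35)
pwm10 = build_pwm(SITES_10)

# Synthetic test sequence: a consensus -35 box at index 6, a 17 bp GC
# spacer, then a consensus -10 box.
seq = "GCGCGC" + "TTGACA" + "GC" * 8 + "G" + "TATAAT" + "GCGCGCA"
best = scan(seq, pwm35, pwm10)
print(best)
```

A scan of this kind reports the best-scoring placement in any sequence, promoter or not, so in practice a score threshold is needed. Choosing that threshold trades false positives against sensitivity, which is the limitation of such tools noted above for long sequences and whole genomes.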