The Majority of Total Nuclear-Encoded Non-Ribosomal RNA in Human Normal and Tumor Cells Is "Dark Matter" Unannotated RNA

Cancer Research (2011)

Abstract
The transcriptional output of the human genome is far more complex than predicted by the current set of protein-coding annotations, and most of the novel RNAs being produced appear not to encode proteins. This has transformed our understanding of genome complexity in recent years and suggested new paradigms of genome regulation. However, the fraction of the genome used to produce cellular RNAs whose function we do not understand and, even more so, the relative mass of those RNAs in a cell remain controversial. RNA from normal human liver and brain, the K562 leukemia cell line, and 6 paired Ewing primary and metastatic tumors was converted into cDNA using random hexamers and sequenced using single-molecule sequencing (SMS). No amplification, ligation, or size selection was used, thus minimizing methodological biases. PolyA+ RNA, total RNA, and total RNA depleted of ribosomal RNA were studied. The SMS reads were aligned to the complete human genome, and uniquely mapping reads from human tissue sources were further filtered to exclude sequences aligning to rDNA sequences, to the mitochondrial genome, and to genomic repeats annotated as rRNA by the RepeatMasker program. After filtering, the remaining informative reads were used for subsequent analyses, including comparison to known annotations defined by the exons of UCSC Genes. This investigation makes the following key observations.
1. We show clearly that the so-called "dark matter RNAs", which represent mostly non-coding RNA, not only exist in human cells but can comprise the majority of total non-ribosomal, non-mitochondrial RNA. In fact, we estimate that half to two-thirds of all such RNAs in a human cell are non-coding.
2. We show a significant loss of this complexity if only polyA+ RNA is profiled. In this respect, most, if not all, contemporary RNA-seq papers continue to focus on this type of RNA and thus report significantly skewed results in terms of the true complexity of human RNA.
3. We show the presence of a large number of very long (100s of kbs), abundant intergenic transcribed regions located in areas of the genome that are devoid of protein-coding annotations. We show evidence that these very long and likely non-coding RNA transcripts are expressed during normal development, silenced in adult tissues, and then re-activated during cancer progression.
Our understanding of the repertoire of human RNAs remains far from complete, and almost all RNA-seq studies have missed this complexity because of the limited view obtained when using only the polyA+ RNA fraction. Moreover, many novel genomic regions give rise to RNAs that are differentially expressed in different tumor types and also in primary vs. metastatic tumors derived from the same patient. This raises the tantalizing possibility that a great number of hitherto uncharacterized RNAs are involved in tumorigenesis and that they could be used as both diagnostic and, potentially, therapeutic targets.

Citation Format: {Authors}. {Abstract title} [abstract]. In: Proceedings of the 102nd Annual Meeting of the American Association for Cancer Research; 2011 Apr 2-6; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2011;71(8 Suppl):Abstract nr 1177. doi:10.1158/1538-7445.AM2011-1177