Dissecting chronic myeloid leukaemia overlapping transcriptome with TIF-Seq2

bioRxiv (2020)

Abstract
Eukaryotic transcriptomes are complex, involving thousands of overlapping transcripts. The interleaved nature of the transcriptome limits our ability to identify regulatory regions and, in some cases, can lead to misinterpretation of gene expression. To improve the understanding of the overlapping transcriptome, we have developed an optimized method, TIF-Seq2, that simultaneously sequences the 5’ and 3’ ends of individual RNA molecules at single-nucleotide resolution. We investigated the transcriptome of a well-characterized human cell line (K562) and identified thousands of unannotated transcript isoforms. By focusing on transcripts that are challenging to investigate with RNA-Seq, we accurately defined the boundaries of lowly expressed unannotated transcripts and read-through transcripts putatively encoding fusion genes. We validated our results by targeted long-read sequencing and standard RNA-Seq of chronic myeloid leukaemia patient samples. Taking advantage of TIF-Seq2, we explored transcriptional regulation among overlapping units and investigated their crosstalk. We show that most overlapping upstream transcripts use poly(A) sites within the first 2 kb of the downstream transcription unit. Our work shows that, by pairing the 5’ and 3’ ends of each RNA, TIF-Seq2 can improve the annotation of complex genomes, facilitate accurate assignment of promoters to genes, and readily identify transcriptionally fused genes.
Keywords
transcription complexity, overlapping transcriptome, read-through transcript, fusion gene, full-length isoform