Sequencing of individual barcoded cDNAs on Pacific Biosciences and Oxford Nanopore reveals platform-specific error patterns

bioRxiv (Cold Spring Harbor Laboratory)(2022)

Abstract
Long-read transcriptomics requires understanding the error sources inherent to each sequencing technology. Current approaches cannot compare methods for an individual RNA molecule. Here, we present a novel platform comparison method that combines barcoding strategies and long-read sequencing to sequence cDNA copies representing an individual RNA molecule on both Pacific Biosciences (PacBio) and Oxford Nanopore (ONT). We compared these long-read pairs in terms of sequence content and splicing structure. Although individual read pairs show high similarity, we found differences in (i) aligned length, (ii) transcription start site (TSS) assignment, (iii) polyA-site assignment, and (iv) exon-intron structures. Overall, 25% of read pairs disagreed on either the TSS, the polyA site, or a splice site. Intron-chain disagreement typically arises from alignment errors at microexons and complicated splice sites. Our single-molecule technology comparison revealed that inconsistencies are often caused by sequencing-error-induced inaccurate ONT alignments, especially to downstream GTNNGT donor motifs. However, annotation-disagreeing upstream shifts at NAGNAG acceptors in ONT reads are often confirmed by PacBio and are thus likely real. In both barcoded and non-barcoded ONT reads, we found that intron number and the proximity of other GT/AG dinucleotides predict inconsistency with the annotation better than read quality alone. We summarized these findings in an annotation-based algorithm for spliced alignment correction that improves subsequent transcript construction with ONT reads.
Keywords
Pacific Biosciences, cDNAs, Oxford Nanopore, platform-specific
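
The read-pair comparison described in the abstract (agreement on TSS, polyA site, and intron chain) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `introns` and `compare_pair`, the exon-block representation (sorted, 0-based, half-open intervals on one chromosome), and the `end_tol` tolerance parameter are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: one aligned read as sorted, 0-based,
# half-open exon blocks (start, end) on a single chromosome.
Exons = List[Tuple[int, int]]


def introns(exons: Exons) -> List[Tuple[int, int]]:
    """Return the intron chain implied by consecutive exon blocks."""
    return [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]


def compare_pair(pacbio: Exons, ont: Exons,
                 strand: str = "+", end_tol: int = 0) -> Dict[str, bool]:
    """Report whether two reads of the same molecule agree on
    TSS, polyA site, and intron chain (True means agreement).

    end_tol is an assumed tolerance in bases for the two read ends;
    splice sites are compared exactly.
    """
    left_pb, right_pb = pacbio[0][0], pacbio[-1][1]
    left_ont, right_ont = ont[0][0], ont[-1][1]
    if strand == "+":
        tss_pb, polya_pb = left_pb, right_pb
        tss_ont, polya_ont = left_ont, right_ont
    else:
        # On the minus strand the transcript starts at the rightmost coordinate.
        tss_pb, polya_pb = right_pb, left_pb
        tss_ont, polya_ont = right_ont, left_ont
    return {
        "tss": abs(tss_pb - tss_ont) <= end_tol,
        "polya": abs(polya_pb - polya_ont) <= end_tol,
        "intron_chain": introns(pacbio) == introns(ont),
    }


# Toy example: the ONT read places the last acceptor 4 bases downstream.
pb_read = [(100, 200), (300, 400), (500, 650)]
ont_read = [(100, 200), (300, 400), (504, 650)]
result = compare_pair(pb_read, ont_read)
print(result)
```

In this toy pair the two reads agree on both ends but differ in the intron chain, which is the kind of splice-site disagreement the abstract attributes mainly to ONT alignment errors.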