Detection and Quantification of 5moU RNA Modification from Direct RNA Sequencing Data

Jiayi Li, Feiyang Sun, Kunyang He,Lin Zhang,Jia Meng,Daiyun Huang,Yuxin Zhang

CURRENT GENOMICS(2024)

引用 0|浏览1
暂无评分
摘要
Background Chemically modified therapeutic mRNAs have gained momentum recently. In addition to commonly used modifications (e.g., pseudouridine), 5moU is considered a promising substitution for uridine in therapeutic mRNAs. Accurate identification of 5-methoxyuridine (5moU) would be crucial for the study and quality control of relevant in vitro-transcribed (IVT) mRNAs. However, current methods exhibit deficiencies in providing quantitative methodologies for detecting such modification. Utilizing the capabilities of Oxford nanopore direct RNA sequencing, in this study, we present NanoML-5moU, a machine-learning framework designed specifically for the read-level detection and quantification of 5moU modification for IVT data.Method Nanopore direct RNA sequencing data from both 5moU-modified and unmodified control samples were collected. Subsequently, a comprehensive analysis and modeling of signal event characteristics (mean, median current intensities, standard deviations, and dwell times) were performed. Furthermore, classical machine learning algorithms, notably the Support Vector Machine (SVM), Random Forest (RF), and XGBoost were employed to discern 5moU modifications within NNUNN (where N represents A, C, U, or G) 5-mers.Result Notably, the signal event attributes pertaining to each constituent base of the NNUNN 5-mers, in conjunction with the utilization of the XGBoost algorithm, exhibited remarkable performance levels (with a maximum AUROC of 0.9567 in the "AGTTC" reference 5-mer dataset and a minimum AUROC of 0.8113 in the "TGTGC" reference 5-mer dataset). This accomplishment markedly exceeded the efficacy of the prevailing background error comparison model (ELIGOs AUC 0.751 for site-level prediction). The model's performance was further validated through a series of curated datasets, which featured customized modification ratios designed to emulate broader data patterns, demonstrating its general applicability in quality control of IVT mRNA vaccines. The NanoML-5moU framework is publicly available on GitHub (https://github.com/JiayiLi21/NanoML-5moU).Conclusion NanoML-5moU enables accurate read-level profiling of 5moU modification with nanopore direct RNA-sequencing, which is a powerful tool specialized in unveiling signal patterns in in vitro-transcribed (IVT) mRNAs.
更多
查看译文
关键词
Oxford nanopore sequencing,RNA modification,machine learning,5-methoxyuridine,read-level modification profiling
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要