Next-Generation Sequencing Markup Language (NGSML): A Medium for the Representation and Exchange of NGS Data

IEEE/ACM Transactions on Computational Biology and Bioinformatics (2023)

Abstract
With the increasing demand for low-cost, high-throughput sequencing of large genomes, next-generation sequencing (NGS) technology has developed rapidly. NGS is used not only in basic scientific research but also in clinical diagnostics and healthcare. Numerous software systems and tools have been developed to analyze NGS data, and various data formats have been produced to accommodate different sequencing equipment providers and analytical software. However, moving data between these tools remains a major challenge for researchers. A generic format shared by most of the software and tools in the NGS field would make data interoperability and sharing easier. In this paper, we define a general XML-based NGS markup language (NGSML) format for the representation and exchange of NGS data. We also developed a user-friendly GUI tool, NGSMLEditor, for presenting, creating, editing, and converting NGSML files. With NGSML, various types of NGS data can be saved in one unified format. Compared with unstructured plain-text files, a structured data format based on XML technology addresses the incompatibility of the various NGS data formats. The NGSML specifications are freely available from http://www.sysbio.org.cn/NGSML . NGSMLEditor is open source under the GNU GPL and can be downloaded from the same website.
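To make the contrast between plain-text and structured XML representations concrete, the following is a minimal sketch in Python that wraps FASTQ records in XML elements. The element and attribute names (`reads`, `read`, `sequence`, `quality`, `encoding`) and the function `fastq_to_xml` are hypothetical, chosen only for illustration. They are not taken from the NGSML specification, which should be consulted for the actual schema.

```python
import xml.etree.ElementTree as ET


def fastq_to_xml(fastq_text):
    """Convert FASTQ text into an XML tree.

    NOTE: the tag names used here are illustrative assumptions,
    not the real NGSML schema.
    """
    lines = [ln.strip() for ln in fastq_text.strip().splitlines()]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain 4 lines per record")

    root = ET.Element("reads")
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        # A FASTQ record starts with '@' and separates the sequence
        # from the quality string with a line starting with '+'.
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch at line {i + 1}")

        read = ET.SubElement(root, "read", id=header[1:])
        ET.SubElement(read, "sequence").text = seq
        # The Phred+33 encoding label is an assumption about the input.
        ET.SubElement(read, "quality", encoding="phred33").text = qual
    return root


if __name__ == "__main__":
    example = "@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nFFFF\n"
    tree = fastq_to_xml(example)
    print(ET.tostring(tree, encoding="unicode"))
```

Once the data is in XML, any standard parser can query individual fields (for example, `tree.find("read/sequence").text`) without format-specific line counting, which is the kind of interoperability benefit the abstract attributes to a structured format.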
Keywords
Next-generation sequencing, extensible markup language, data format, NGSML