De novo genome assembly and transcriptome analysis for the drought and salt resistant Solanum sitiens
Bioinformatics (2020)
Abstract
Solanum sitiens is a self-incompatible wild relative of tomato, characterised by salt and drought resistance traits, with the potential to contribute to crop improvement in cultivated tomato. This species has a distinct morphology, classification and ecotype compared to other stress-resistant wild tomato relatives such as S. pennellii and S. chilense. Therefore, the availability of a high-quality reference genome for S. sitiens will facilitate the genetic and molecular understanding of salt and drought resistance. Here, we present a de novo genome and transcriptome assembly for S. sitiens (Accession LA1974). A hybrid assembly strategy was followed using Illumina short reads (∼159X coverage) and PacBio long reads (∼44X coverage), generating a total of ∼262 Gbp of DNA sequence; in addition, ∼2,670 Gbp of BioNano data was obtained. A reference genome of 1,245 Mbp, arranged in 1,481 scaffolds with an N50 of 1,826 Mbp, was generated. Genome completeness was estimated at 95% using Benchmarking Universal Single-Copy Orthologs (BUSCO) and the K-mer Analysis Toolkit (KAT); this is within the range of current high-quality reference genomes for other tomato wild relatives. Additionally, we identified three large inversions compared to S. lycopersicum, containing several drought-resistance-related genes, such as beta-amylase 1 and YUCCA7.
In addition, ∼63 Gbp of RNA-Seq data were generated to support the prediction of 31,164 genes from the assembly and to build a de novo transcriptome assembly. Some of the protein clusters unique to S. sitiens were associated with genes involved in drought and salt resistance, including GLO1 and FQR1.
This first reference genome for S. sitiens will provide a valuable resource to progress QTL studies to the gene level, and will assist molecular breeding to improve crop production in water-limited environments.
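The abstract reports assembly contiguity as a scaffold N50. For readers unfamiliar with the statistic, the following is a minimal sketch of the usual definition: the length of the scaffold at which the cumulative length, taken from the longest scaffold downward, first reaches half of the total assembly length. The function name `n50` and the toy scaffold lengths are hypothetical and are not taken from the S. sitiens assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    Scaffolds are sorted from longest to shortest and their lengths are
    accumulated; the N50 is the length of the scaffold at which the
    running sum first reaches half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths for illustration only (not real data).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

By this definition an N50 can never exceed the length of the longest scaffold, and so can never exceed the total assembly size, which makes it a quick sanity check when reading assembly statistics.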
### Competing Interest Statement
The authors have declared no competing interest.
### Glossary
ABA
: Abscisic Acid
bp
: basepair
bwa
: Burrows-Wheeler Aligner
BUSCO
: Benchmarking Universal Single-Copy Orthologs
FR
: Forward Reverse
Gbp
: Gigabasepair
GC
: Guanine-Cytosine
GO
: Gene Ontology
IPS
: InterProScan
KAT
: K-mer Analysis Toolkit
Kbp
: Kilobasepair
Mbp
: Megabasepair
PacBio
: Pacific Biosciences
PE
: Paired-End
QTL
: Quantitative Trait Locus
SMRT
: Single Molecule, Real-Time
TAIR
: The Arabidopsis Information Resource
TPM
: Transcripts Per Million
WGS
: Whole Genome Sequencing