SNP Mining by Genome Resequencing of 30 Apple Varieties in Shandong Province

Naibin Duan, Yumin Ma, Kun Wang, Xiaomu Wang, Kun Xie, Jing Bai, Yongyi Yang, Yanyan Pu, Yongchao Gong

Molecular Plant Breeding (2020)

Abstract
In this article, we carried out genome resequencing and SNP mining for cultivated apples in Shandong Province to support rapid identification of apple varieties, germplasm evaluation, and utilization. Genomic DNA was extracted immediately from the leaves of each sample, and paired-end Illumina genomic libraries were prepared and sequenced on an Illumina HiSeq 4000 platform following the manufacturer's instructions. Resequencing of the 31 apple genomes generated a total of 363 Gb of high-quality cleaned sequence, with an average of 12.5 Gb per accession, representing approximately 15.9x coverage of the apple genome. This data volume fully meets the needs of downstream analysis and SNP mining. As the nucleotide mismatch parameter was raised from 1 to 12, the mapping rate gradually increased to saturation. The mismatch parameter was highly significantly correlated (p < 0.0001) with both the total mapping rate and the mapping rate of paired-end data, and univariate fourth-order equations (regression coefficient r > 0.99) were fitted. As the mismatch rate increases, the accuracy of mapping decreases, while genome coverage and the accuracy of heterozygous sites gradually increase. Two algorithms were used for SNP mining in this study. The intersection of their results was then taken, using 'chromosome + site information' as the eigenvalue, to obtain a highly reliable single nucleotide variant dataset. A total of 374 404 SNP loci were detected; on average, one variant was identified per 1 896 bp. Sanger sequencing verification showed an accuracy of 98.1%. Annotation analysis shows that among the 373 763 SNPs, 25 047 (6.7%) are located in gene coding regions, 143 269 (38.27%) are located in intergenic regions, and 179 426 (47.92%) are located within 2 kb upstream or downstream of the corresponding genes. Among the coding-region SNPs, 13 422 are non-synonymous and 11 625 are synonymous, a non-synonymous to synonymous ratio of 1.15:1. Population clustering based on the filtered 4DTV sites and constructed with the neighbor-joining algorithm is in line with the classification trend of cultivated apples in Shandong Province.
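The intersection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the two callers are not named in the abstract, and the `Snp` record, `site_key`, and `intersect_calls` are hypothetical names introduced here. The sketch assumes each call set is already parsed into records and, following the abstract, matches sites on chromosome and position only; whether the authors also required matching alleles is not stated.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Snp(NamedTuple):
    """Hypothetical minimal SNP record (not from the paper)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def site_key(snp: Snp) -> Tuple[str, int]:
    # 'chromosome + site information' used as the matching key.
    return (snp.chrom, snp.pos)


def intersect_calls(calls_a: Iterable[Snp], calls_b: Iterable[Snp]) -> List[Snp]:
    """Keep records from calls_a whose (chrom, pos) also appears in calls_b."""
    keys_b = {site_key(s) for s in calls_b}
    return [s for s in calls_a if site_key(s) in keys_b]


if __name__ == "__main__":
    # Toy call sets standing in for the output of two SNP-calling algorithms.
    caller_a = [Snp("Chr01", 100, "A", "G"), Snp("Chr01", 250, "C", "T"),
                Snp("Chr02", 100, "G", "A")]
    caller_b = [Snp("Chr01", 100, "A", "G"), Snp("Chr02", 100, "G", "A"),
                Snp("Chr03", 5, "T", "C")]
    for snp in intersect_calls(caller_a, caller_b):
        print(snp.chrom, snp.pos, snp.ref, snp.alt)
```

Building a set of keys from one call set keeps the lookup cost roughly linear in the total number of records, which matters at the scale of hundreds of thousands of loci. Note that position is part of the key together with the chromosome, so the same coordinate on different chromosomes is never treated as a match.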
Keywords
Phylogenetic Analysis