Single molecule long-read real-time amplicon-based sequencing of CYP2D6: a proof-of-concept with hybrid haplotypes

bioRxiv (Cold Spring Harbor Laboratory) (2022)

Abstract
CYP2D6 is a widely expressed human xenobiotic-metabolizing enzyme, best known for its role in the hepatic phase I cytochrome P450 enzyme system, where it metabolizes ∼20% of medications. It is also expressed in other organs including the brain, where its potential role in physiology and in mental health traits and disorders is under further investigation. Owing to the presence of homologous pseudogenes in the CYP2D locus and transposable repeat elements in the intergenic regions, the gene encoding the CYP2D6 enzyme, CYP2D6, is one of the most hypervariable known human genes, with more than 165 core haplotypes. Haplotypes include structural variants, a subtype of which, known as hybrid haplotypes or fusion genes, comprises part of CYP2D6 and part of its adjacent pseudogene, CYP2D7. The fusion genes are particularly challenging to identify. High fidelity (HiFi) single molecule real-time (SMRT) long-read sequencing can cover whole CYP2D6 haplotypes in a single continuous sequence, and is therefore ideal for structural variant detection. In addition, it is highly accurate and suitable for novel haplotype identification, which is necessary because new CYP2D6 haplotypes are continuously being discovered, and many more likely remain to be identified in relatively understudied populations such as Indigenous Peoples. The aim of the present work was to develop an efficient and accurate HiFi SMRT amplicon-based method capable of detecting the full range of CYP2D6 haplotypes, including fusion genes. We report proof-of-concept for 24 amplicons, including three positive controls, aligned to fusion gene haplotypes, with prior cross-validation data. Amplicons with CYP2D7-D6 fusion genes, including positive controls, aligned to the predicted *13 subhaplotypes (*13F, *13A2) with 100% accuracy, with the exception of one that aligned at 99.9%. The *68 amplicon aligned at 100% and above 99.9% to the CYP2D6*68 partial sequences EU5300606 and JF307779, respectively. The best alignments for the remaining CYP2D6-2D7 fusion genes were ≥99.7% (to three significant figures). The lower percentage alignment for CYP2D6-2D7 fusion genes may reflect imperfect PCR optimization and/or the possibility that some of our haplotypes are not yet in public databases. Further work on these is in progress. Moreover, we have adapted this method for non-hybrid haplotypes. This technique could therefore suffice for the characterization of the full range of CYP2D6 haplotypes. The method that we have developed could be extended to other complex loci and to other species in a multiplexed high-throughput assay.
Keywords
hybrid haplotypes, sequencing, single molecule, long-read, real-time, amplicon-based, proof-of-concept