谷歌浏览器插件
订阅小程序
在清言上使用

Machine Learning Prediction of Non-Coding Variant Impact in Human Retinal Cis-Regulatory Elements

TRANSLATIONAL VISION SCIENCE & TECHNOLOGY(2022)

引用 4|浏览19
暂无评分
摘要
Purpose: Prior studies have demonstrated the significance of specific cis-regulatory variants in retinal disease; however, determining the functional impact of regulatory variants remains a major challenge. In this study, we utilized a machine learning approach, trained on epigenomic data from the adult human retina, to systematically quantify the predicted impact of cis-regulatory variants. Methods: We used human retinal DNA accessibility data (ATAC-seq) to determine a set of 18.9k high-confidence, putative cis-regulatory elements. Eighty percent of these elements were used to train a machine learning model utilizing a gapped k-mer support vector machine-based approach. In silico saturation mutagenesis and variant scoring was applied to predict the functional impact of all potential single nucleotide variants within cis-regulatory elements. Impact scores were tested in a 20% hold-out dataset and compared to allele population frequency, phylogenetic conservation, transcription factor (TF) binding motifs, and existing massively parallel reporter assay data. Results: We generated a model that distinguishes between human retinal regulatory elements and negative test sequences with 95% accuracy. Among a hold-out test set of 3.7k human retinal CREs, all possible single nucleotide variants were scored. Variants with negative impact scores correlated with higher phylogenetic conservation of the reference allele, disruption of predicted TF binding motifs, and massively parallel reporter expression. Conclusions: We demonstrated the utility of human retinal epigenomic data to train a machine learning model for the purpose of predicting the impact of non-coding regulatory sequence variants. Our model accurately scored sequences and predicted putative transcription factor binding motifs. This approach has the potential to expedite the characterization of pathogenic non-coding sequence variants in the context of unexplained retinal disease.
更多
查看译文
关键词
Pathogenicity Prediction
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要