
Statistical Methods for Identifying Sequence Motifs Affecting Point Mutations.

GENETICS (2017)

Abstract
Mutation processes differ between types of point mutation, genomic locations, cells, and biological species. For some point mutations, specific neighboring bases are known to be mechanistically influential. Beyond these cases, numerous questions remain unresolved, including: what are the sequence motifs that affect point mutations? How large are the motifs? Are they strand symmetric? And, do they vary between samples? We present new log-linear models that allow explicit examination of these questions, along with sequence logo style visualization to enable identifying specific motifs. We demonstrate the performance of these methods by analyzing mutation processes in human germline and malignant melanoma. We recapitulate the known CpG effect, and identify novel motifs, including a highly significant motif associated with A -> G mutations. We show that major effects of neighbors on germline mutation lie within ±2 positions of the mutating base. Models are also presented for contrasting the entire mutation spectra (the distribution of the different point mutations). We show the spectra vary significantly between autosomes and X-chromosome, with a difference in T -> C transition dominating. Analyses of malignant melanoma confirmed reported characteristic features of this cancer, including statistically significant strand asymmetry, and markedly different neighboring influences. The methods we present are made freely available as a Python library at https://bitbucket.org/pycogent3/mutationmotif.
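As a rough illustration of the log-linear idea mentioned in the abstract, the sketch below tests whether the base at one neighboring position is independent of mutation status. For a two-way table, the deviance of the independence log-linear model has the closed form G = 2 Σ O ln(O/E), which is what the code computes. This is not the API of the mutationmotif library: the function names `g_statistic` and `chi2_sf_df3` and the counts are hypothetical, chosen only to make the example self-contained.

```python
import math


def g_statistic(table):
    """Deviance (G) of the independence log-linear model for a two-way table.

    `table` maps a row label to a dict of {column label: count}.
    Returns (G, degrees of freedom).
    """
    row_tot = {r: sum(cols.values()) for r, cols in table.items()}
    col_tot = {}
    for cols in table.values():
        for c, n in cols.items():
            col_tot[c] = col_tot.get(c, 0) + n
    total = sum(row_tot.values())

    g = 0.0
    for r, cols in table.items():
        for c, obs in cols.items():
            # Fitted count under independence: row total * column total / N.
            expected = row_tot[r] * col_tot[c] / total
            if obs > 0:  # a zero cell contributes nothing to G
                g += 2.0 * obs * math.log(obs / expected)
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return g, df


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


# Hypothetical counts (NOT data from the paper): base observed at one
# flanking position, for mutated sites versus matched unmutated reference sites.
counts = {
    "mutated":   {"A": 120, "C": 310, "G": 95, "T": 105},
    "reference": {"A": 160, "C": 150, "G": 155, "T": 165},
}

g, df = g_statistic(counts)   # df is 3 for a 2 x 4 table
p_value = chi2_sf_df3(g)      # valid here only because df == 3
print(f"G = {g:.2f}, df = {df}, p = {p_value:.3g}")
```

A large G relative to the chi-square reference distribution suggests that the neighboring base is associated with mutation status at that position. Extending this to several positions at once, or to contrasts between strands or samples, would require fitting higher-order log-linear models rather than this single two-way test.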
Keywords
context dependent mutation, germline mutation, somatic mutation, sequence motif analysis, mutation spectrum, bioinformatics, 5-methyl-cytosine, log-linear model