Phase & Power in Genomic Harmonic Analysis

2022 IEEE 52nd International Symposium on Multiple-Valued Logic (ISMVL), 2022

Abstract
Genomic Fourier series are produced by applying the Fourier transform to encoded genomic sequences. Previous work has shown that the Fourier coefficients of genomic signals provide useful information for common bioinformatic operations, including sequence alignment (MAFFT) [1], phylogeny construction [2], coding region differentiation [3], and more [4]. In this paper, I apply the phase and power spectra to clustering tasks, both separately and together. I show that the phase spectra, when computed and treated in a way similar to how the power spectra were treated in previous work, provide some consistency in the resulting genomic sequence clusters (in terms of the location of submission of the original sequences). Clusters are visualized using t-SNE and UMAP, as well as a filtering technique followed by PCA biplots. These visualizations show well-differentiated regional groups, but not consistent global distances between groups. In an experiment where distances were formed as a weighted composition of the phase and power spectra, dendrograms produced from distances weighted more heavily toward the phase appeared to have lower internal distances within the produced clusters. In conclusion, I offer thoughts on the utility of the phase spectra over, and in addition to, the power spectra for clustering sequences, along with notes on the current and future directions of this project.
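To make the quantities in the abstract concrete, the following is a minimal sketch of how phase and power spectra might be computed from an encoded sequence and blended into a single distance. The abstract does not specify the encoding, the distance, or the weighting scheme, so everything here is an assumption for illustration: the four-channel binary indicator encoding, the Euclidean norm, the restriction to equal-length sequences, and the names `indicator_encode`, `spectra`, `composite_distance`, and `phase_weight` are all hypothetical and are not taken from the paper.

```python
import numpy as np

BASES = "ACGT"


def indicator_encode(seq):
    """Assumed encoding: one binary indicator channel per nucleotide."""
    seq = seq.upper()
    return np.array([[1.0 if ch == b else 0.0 for ch in seq] for b in BASES])


def spectra(seq):
    """Return (power, phase) arrays, each of shape (4, len(seq))."""
    coeffs = np.fft.fft(indicator_encode(seq), axis=1)
    return np.abs(coeffs) ** 2, np.angle(coeffs)


def composite_distance(seq_a, seq_b, phase_weight=0.5):
    """Hypothetical weighted blend of phase and power distances.

    Assumes both sequences have the same length. phase_weight=1.0 uses
    only the phase spectra; phase_weight=0.0 uses only the power spectra.
    """
    pow_a, ph_a = spectra(seq_a)
    pow_b, ph_b = spectra(seq_b)
    d_power = np.linalg.norm(pow_a - pow_b)
    # Wrap phase differences into (-pi, pi] before taking the norm.
    d_phase = np.linalg.norm(np.angle(np.exp(1j * (ph_a - ph_b))))
    return phase_weight * d_phase + (1.0 - phase_weight) * d_power
```

A pairwise matrix of such distances could then be passed to a hierarchical clustering routine to produce dendrograms for different values of `phase_weight`. Note that the phase of a near-zero Fourier coefficient is numerically unstable, so a practical version would likely need to mask or down-weight those frequencies.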
Keywords
Genetic, Genomic, Fourier Transform, Power Spectra, Phase Spectra, Multiple-Valued Logic, Categorical Data, Sequence Analysis, Signal Processing