Frequent Patterns Mining in DNA Sequence

Na Deng, Xu Chen, Desheng Li, Caiquan Xiong

IEEE Access (2019)


Abstract

DNA sequences, a common type of biological sequence, carry important information. Discovering frequent patterns in DNA sequences can help in studying the evolution, function, and variation of genes, and the findings are significant for genetic and mutation analysis, for identifying the causes of diseases, and for treating them. Traditional frequent pattern discovery methods need to scan DNA sequences multiple times. To overcome this shortcoming, this article proposes a new method for discovering frequent patterns in DNA sequences, based on a two-level nested hash table data structure and set operations. All frequent patterns and their positions can be found by scanning the DNA sequences only once. Experimental results show that the method correctly recognizes all frequent patterns in DNA sequences together with their locations. The method can also be applied to discover frequent patterns in RNA, protein, or other biological sequences.
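The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a two-level nested hash table combined with set operations could find frequent patterns in a single scan. The layout (outer key: pattern length; inner key: pattern; value: set of start positions), the function name `mine_frequent_patterns`, and the parameters `min_support` and `max_len` are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def mine_frequent_patterns(seq, min_support, max_len):
    """Hypothetical sketch: nested dict {length: {pattern: set(start positions)}}."""
    # Single scan of the sequence: record every 0-based position of each base.
    base_positions = defaultdict(set)
    for i, base in enumerate(seq):
        base_positions[base].add(i)

    # Level-1 entries: bases that occur at least min_support times.
    table = defaultdict(dict)
    table[1] = {b: pos for b, pos in base_positions.items()
                if len(pos) >= min_support}

    # Grow patterns with set operations only; the sequence is not rescanned.
    for k in range(1, max_len):
        for pattern, starts in table[k].items():
            for base, pos in base_positions.items():
                # pattern + base starts at s if pattern starts at s
                # and base occurs at position s + k.
                shifted = {q - k for q in pos}
                extended = starts & shifted
                if len(extended) >= min_support:
                    table[k + 1][pattern + base] = extended
        if not table[k + 1]:
            break  # no frequent pattern of length k + 1, so none longer

    # Drop empty levels before returning.
    return {k: v for k, v in table.items() if v}
```

For example, under these assumptions `mine_frequent_patterns("ACGTACGTAC", 2, 3)` reports the pattern `ACG` at start positions `{0, 4}`. Extending only patterns that are already frequent is sufficient here because every prefix of a frequent pattern is itself frequent.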


Keywords

Big data, biological information, data mining, DNA sequence, frequent pattern, hash table
