A hybrid MPI/OpenMP parallel implementation of NSGA-II for finding patterns in protein sequences

David L. González-Álvarez, Miguel A. Vega-Rodríguez, Álvaro Rubio-Largo

The Journal of Supercomputing (2016)

Abstract

Since the late 1970s, when the first DNA-based genome was sequenced, the field of biology has experienced significant growth in the amount of data that needs to be processed. It long ago became impractical to analyze all this information manually, creating a great need for new techniques, algorithms, and strategies to facilitate this work. Within the vast world of bioinformatics, we focus on proteomics and, more specifically, on the discovery of small, repeated, common patterns in sets of protein sequences that may represent some biological functionality. When a large number of sequences is analyzed, the problem exhibits non-deterministic polynomial time complexity, which implies that it could benefit from the combination of high-performance computing and computational intelligence techniques. In this paper, we address the discovery of repeated common patterns as a multiobjective optimization problem by means of a hybrid MPI/OpenMP approach that parallelizes a well-known multiobjective metaheuristic, the fast non-dominated sorting genetic algorithm (NSGA-II). Our main objective is to combine the benefits of the shared-memory and distributed-memory programming paradigms to discover patterns accurately and efficiently. Experiments conducted on six different datasets, comparisons with other well-known biological tools, and the obtained speed-ups and efficiencies show that our approach achieves strong performance in terms of both parallel and biological results.

Keywords

Parallel computing, Evolutionary computation, Multiobjective optimization, Bioinformatics, Proteins
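
The abstract names the fast non-dominated sorting step of NSGA-II without describing it. As a rough illustration only, the following is a minimal sequential sketch of the textbook procedure in Python. It is not the authors' implementation and does not attempt their MPI/OpenMP parallelization. The function names, the objective tuples, and the assumption that every objective is maximized are all hypothetical choices made for this example.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` Pareto-dominates `b`.

    Assumption for this sketch: every objective is maximized.
    """
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def fast_non_dominated_sort(points: List[Sequence[float]]) -> List[List[int]]:
    """Group solution indices into Pareto fronts (front 0 is the best)."""
    n = len(points)
    dominated_sets: List[List[int]] = [[] for _ in range(n)]  # solutions each one dominates
    domination_count = [0] * n  # how many solutions dominate each one
    fronts: List[List[int]] = [[]]

    # Pairwise comparison: record who dominates whom.
    for p in range(n):
        for q in range(n):
            if p == q:
                continue
            if dominates(points[p], points[q]):
                dominated_sets[p].append(q)
            elif dominates(points[q], points[p]):
                domination_count[p] += 1
        if domination_count[p] == 0:
            fronts[0].append(p)  # nobody dominates p

    # Peel off fronts one at a time.
    i = 0
    while fronts[i]:
        next_front: List[int] = []
        for p in fronts[i]:
            for q in dominated_sets[p]:
                domination_count[q] -= 1
                if domination_count[q] == 0:
                    next_front.append(q)
        fronts.append(next_front)
        i += 1

    return fronts[:-1]  # drop the trailing empty front


if __name__ == "__main__":
    # Hypothetical two-objective scores for five candidate solutions.
    candidates = [(3, 1), (1, 3), (2, 2), (1, 1), (0, 0)]
    print(fast_non_dominated_sort(candidates))
```

In this sketch the first three candidates are mutually non-dominated and form the first front; the remaining two fall into successively worse fronts.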
