
ReSLB: Load Balanced Workflow for Distributed Deep Learning Mass Spectrometry Database

2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys) (2022)

Abstract
Proteomics data analysis pipelines based on the shotgun method require efficient data processing. Parallel algorithms for mass spectrometry database search face rapidly expanding database sizes but scale poorly. Heterogeneous database search algorithms based on deep learning are an effective way to address this problem, yet a distributed parallel database search algorithm based on deep learning is still lacking. This paper analyzes the computational load of database searching using a neural network and designs ReSLB, a workflow for the restricted search of mass spectrometry data based on a GPU cluster and a neural network scoring algorithm. This work aims to ensure the high scalability of future deep learning-based distributed mass spectrometry databases. In a performance estimation on 256 GPUs, ReSLB achieves a load imbalance of less than 30% and a parallel efficiency of 60%. Compared with the state of the art, the time cost is reduced by 75%, and the parallel efficiency from 1 to 256 GPUs is 3.6x higher.
Keywords
Proteomics, Mass Spectrometry Database, Deep Learning-Based Scoring Algorithm, GPU Cluster, Load Balance