Deep learning architecture optimization with metaheuristic algorithms for predicting BRCA1 / BRCA2 pathogenicity NGS analysis

Research Square (2022)

Abstract

BRCA1 and BRCA2 are tumor suppressor genes involved in a considerable number of biological processes that regulate the cell replication cycle. A mutation in either of these two genes carries a significant probability of causing cancer. Within the platform, we have set up a machine learning algorithm based on random forest to predict pathogenicity in colorectal, melanoma, lung, and glioma cancer, but this algorithm has shown its limits on more complex genes such as BRCA1 and BRCA2. To help biologists classify tumors, we decided to develop a deep learning algorithm.

The question that arises when constructing a neural network is how many hidden layers and neurons to use. While the number of inputs and outputs is defined by the problem to be solved, the number of hidden layers and neurons is difficult to define because there is no pre-established rule. The number of hidden layers, and the number of neurons in each layer, influence the predictive performance of the network. Several methods exist for finding the optimal architecture, such as grid search or empirical equations, but all of these techniques can be very time-consuming. In this paper, we present the two packages that we have developed, one based on a genetic algorithm (GA) and one based on particle swarm optimization (PSO), to optimize the parameters of the neural network for predicting the pathogenicity of the BRCA1 and BRCA2 genes, and we compare the results obtained by the two algorithms. We trained the deep learning models on datasets collected from our NGS analysis of the BRCA1 and BRCA2 genes. This represents a collection of 11,875 BRCA1 and BRCA2 variants (BRCA1 benign 2,632; BRCA1 pathogenic 2,660; BRCA2 benign 3,446; BRCA2 pathogenic 3,137).
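As an illustration of the architecture search described above, the following is a minimal sketch of a particle swarm search over two integers, the number of hidden layers and the number of neurons. It is not the authors' package: the function names, the search bounds, the swarm settings, and in particular `toy_fitness` are assumptions made for this example. In a real run, the fitness function would train a network with the given architecture and return its validation score.

```python
import random


def pso_architecture_search(fitness, bounds, n_particles=12, n_iters=30,
                            inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximise fitness(*integers) with a basic global-best PSO.

    bounds is a list of (low, high) integer ranges, one per dimension.
    All default settings are illustrative, not taken from the paper.
    """
    rng = random.Random(seed)
    lo = [b[0] for b in bounds]
    hi = [b[1] for b in bounds]
    dim = len(bounds)

    def decode(pos):
        # Round continuous coordinates to integers and clip them to the bounds.
        return tuple(int(min(max(round(x), l), h))
                     for x, l, h in zip(pos, lo, hi))

    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(l, h) for l, h in zip(lo, hi)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_score = [fitness(*decode(p)) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to personal and global bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo[d]), hi[d])
            score = fitness(*decode(pos[i]))
            if score > pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], score
                if score > gbest_score:
                    gbest, gbest_score = pos[i][:], score
    return decode(gbest), gbest_score


def toy_fitness(n_layers, n_neurons):
    # Hypothetical stand-in for "train the network, return validation score".
    # It simply peaks at 4 layers and 64 neurons so the search has a target.
    return -((n_layers - 4) ** 2) - ((n_neurons - 64) / 16) ** 2


if __name__ == "__main__":
    best, score = pso_architecture_search(toy_fitness, [(1, 10), (8, 512)])
    print("best (layers, neurons):", best, "fitness:", score)
```

Because each fitness evaluation would correspond to one full training run, the swarm size and iteration count directly set the cost of the search; a grid search over the same ranges would need one training run per grid point.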
Our preliminary results show that PSO provided the best architecture, in terms of hidden layers and number of neurons, compared to grid search and GA. The optimal architecture found by the PSO algorithm is composed of 6 hidden layers with 275 hidden nodes, and it reaches an accuracy of 0.98, a precision of 0.99, a recall of 0.98, and a specificity of 0.99.
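For reference, the four reported metrics can be computed from the counts of a binary confusion matrix. The sketch below assumes that "pathogenic" is treated as the positive class, which the abstract does not state, and the counts in the usage example are made up for illustration; they are not the paper's data.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts.

    tp/fn: positive (assumed: pathogenic) variants predicted right/wrong.
    tn/fp: negative (assumed: benign) variants predicted right/wrong.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),      # share of predicted positives that are correct
        "recall": tp / (tp + fn),         # share of true positives that are found
        "specificity": tn / (tn + fp),    # share of true negatives that are found
    }


if __name__ == "__main__":
    # Illustrative counts only.
    for name, value in binary_metrics(tp=90, fp=10, tn=80, fn=20).items():
        print(f"{name}: {value:.3f}")
```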
Keywords
BRCA2, BRCA1, deep learning, metaheuristic algorithms, architecture optimization