Optimised Pre-Processing of Raman Spectra for Colorectal Cancer Detection using High-Performance Computing

Applied Spectroscopy (2022)

Abstract
Spectral pre-processing is an essential step in data analysis for biomedical diagnostic applications of Raman spectroscopy, allowing the removal of undesirable spectral contributions that could mask biological information used for diagnosis. However, because pre-processing is specific to a given sample type and the number of potential pre-processing combinations is vast, optimising pre-processing by manual "trial and error" is often time-intensive, with no guarantee that the chosen method is optimal for the sample type. Here we present the use of high-performance computing (HPC) to trial over 2.4 million pre-processing permutations, demonstrating the optimisation of the pre-processing of human serum Raman spectra for colorectal cancer detection. The effect of varying pre-processing order, using extended multiplicative scatter correction, spectral smoothing, baseline correction, binning and normalization, was considered. Permutations were assessed on their ability to detect patients with disease using a random forest (RF) algorithm trained with 102 patients (510 spectra) and independently tested with a set of 439 patients (1317 spectra) in a primary care patient cohort. Optimising via HPC improves diagnostic performance, with sensitivity increasing by 14.6%, specificity by 6.9%, positive predictive value by 3.4%, and negative predictive value by 2.4% compared with a standard pre-processing optimisation. The final values of these metrics are very important for diagnostic adoption, and once a diagnostic demonstrates good accuracy, optimisations of this kind can make a significant difference to the roll-out of a test and to demonstrating advantages over existing tests. From the HPC permutations, recommendations for appropriate parameter constraints for conducting a more basic pre-processing optimisation are also detailed, helping model development for researchers without access to HPC.
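
To illustrate how a search space of this kind grows, the following is a minimal Python sketch, not the authors' implementation. The step names follow the abstract, but the parameter grid (`PARAM_GRID`) and the helpers `enumerate_pipelines` and `diagnostic_metrics` are hypothetical and chosen only for illustration. It enumerates every ordering of the pre-processing steps combined with every parameter choice, and shows how the four reported metrics are computed from confusion-matrix counts.

```python
from itertools import permutations, product

# Hypothetical parameter grid. Step names come from the abstract; the
# option values are illustrative assumptions, not the paper's settings.
PARAM_GRID = {
    "emsc": [{"poly_order": 2}, {"poly_order": 4}],
    "smoothing": [{"window": 5}, {"window": 9}, {"window": 13}],
    "baseline": [{"poly_order": 3}, {"poly_order": 5}],
    "binning": [{"width": 2}, {"width": 4}],
    "normalization": [{"kind": "vector"}, {"kind": "minmax"}],
}


def enumerate_pipelines(grid):
    """Yield every step ordering combined with every parameter choice.

    Each pipeline is a list of (step_name, params) pairs in execution order.
    """
    for order in permutations(grid):
        # One parameter set per step, taken in the order of this permutation.
        for choice in product(*(grid[step] for step in order)):
            yield list(zip(order, choice))


def diagnostic_metrics(tp, fp, tn, fn):
    """Compute the four metrics named in the abstract from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


if __name__ == "__main__":
    n_pipelines = sum(1 for _ in enumerate_pipelines(PARAM_GRID))
    print(n_pipelines)  # 5! orderings x (2*3*2*2*2) parameter sets = 5760
    print(diagnostic_metrics(tp=8, fp=2, tn=18, fn=2))
```

Even this small grid gives 5760 candidate pipelines, which suggests why a finer grid over the same five steps can reach millions of permutations. In a real search, each pipeline would be applied to the spectra and scored with a classifier such as a random forest on an independent test set.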
Keywords
Raman spectroscopy, high-performance computing, cancer, biospectroscopy, machine learning, pre-processing, optimisation