Exploring the Performance of Linear and Nonlinear Models of Time-of-Flight Secondary Ion Mass Spectrometry Spectra.

Analytical Chemistry (2024)

Abstract
Multivariate statistical tools and machine learning (ML) techniques can deconvolute hyperspectral data and control the disparity between the number of samples and features in materials science. Nevertheless, the importance of generating sufficient high-quality sample replicates in training data cannot be overlooked, as it fundamentally affects the performance of ML models. Here, we present a quantitative analysis of time-of-flight secondary ion mass spectrometry (ToF-SIMS) spectra of a simple microarray system of two food dyes using partial least-squares (PLS, linear) and random forest (RF, nonlinear) algorithms. This microarray was generated by a high-throughput sample preparation and analysis workflow for fast and efficient acquisition of quality and reproducible spectra via ToF-SIMS. We drew insights from the bias-variance trade-off, investigated the performances of PLS and RF regression models as a function of training data size, and inferred the amount of data needed to construct accurate and reliable regression models. In addition, we found that the spectral concatenation of positive and negative ToF-SIMS spectra improved the model performances. This study provides an empirical basis for future design of high-throughput microarrays and multicomponent systems, for the purpose of analysis with ToF-SIMS and ML.