Multivariate pattern analysis: a method and software to reveal, quantify, and visualize predictive association patterns in multicollinear data

BMC Bioinformatics(2024)

引用 0|浏览5
暂无评分
摘要
Background Strongly multicollinear covariates, such as those typically represented in metabolomics applications, represent a challenge for multivariate regression analysis. These challenges are commonly circumvented by reducing the number of covariates to a subset of linearly independent variables, but this strategy may lead to loss of resolution and thus produce models with poorer interpretative potential. The aim of this work was to implement and illustrate a method, multivariate pattern analysis (MVPA), which can handle multivariate covariates without compromising resolution or model quality. Results MVPA has been implemented in an open-source R package of the same name, mvpa. To facilitate the usage and interpretation of complex association patterns, mvpa has also been integrated into an R shiny app, mvpaShiny, which can be accessed on www.mvpashiny.org . MVPA utilizes a general projection algorithm that embraces a diversity of possible models. The method handles multicollinear and even linear dependent covariates. MVPA separates the variance in the data into orthogonal parts within the frame of a single joint model: one part describing the relations between covariates, outcome, and explanatory variables and another part describing the “net” predictive association pattern between outcome and explanatory variables. These patterns are visualized and interpreted in variance plots and plots for pattern analysis and ranking according to variable importance. Adjustment for a linear dependent covariate is performed in three steps. First, partial least squares regression with repeated Monte Carlo resampling is used to determine the number of predictive PLS components for a model relating the covariate to the outcome. Second, postprocessing of this PLS model by target projection provided a single component expressing the predictive association pattern between the outcome and the covariate. Third, the outcome and the explanatory variables were adjusted for the covariate by using the target score in the projection algorithm to obtain “net” data. We illustrate the main features of MVPA by investigating the partial mediation of a linearly dependent metabolomics descriptor on the association pattern between a measure of insulin resistance and lifestyle-related factors. Conclusions Our method and implementation in R extend the range of possible analyses and visualizations that can be performed for complex multivariate data structures. The R packages are available on github.com/liningtonlab/mvpa and github.com/liningtonlab/mvpaShiny.
更多
查看译文
关键词
Multivariate pattern analysis,Multicollinear covariates,Net association patterns,Latent variable projection,Covariate projection,Target projection
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要