Preprocessing peptide sequences for multivariate sequence-property analysis

Per Andersson, Michael Sjostrom, Torbjorn Lundstedt

Chemometrics and Intelligent Laboratory Systems (1998)

Abstract

The increasing number of peptide sequences of different lengths, available from synthesised peptide libraries and sequenced proteins, is potentially valuable for evaluating structure-activity relationships. However, in order to apply multivariate classification or Quantitative Structure-Activity Relationship (QSAR) analyses to such sequences, it is necessary to have a preprocessing method that translates them into a uniform set of variables. By describing each amino acid by principal properties (z-scales) and then calculating auto cross covariances (ACCs) for each sequence, a new uniform matrix is generated, i.e., each sequence is described by a vector of equal length. The ACC approach has been used before for classification of peptides, but here a QSAR analysis based on 20 peptide sequences of different lengths is presented. The results show that it is possible to obtain a predictive multivariate QSAR model (R²Y(cum) = 86.2%, Q²(cum) = 60.3%) based on the ACC preprocessing method, together with Orthogonal Signal Correction (OSC) and Partial Least Squares (PLS). The model generated was further validated by permutation tests and found to be valid. The new variables generated by ACCs can also be interpreted, i.e., used to identify important features in the original sequences. © 1998 Elsevier Science B.V. All rights reserved.
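
The ACC preprocessing step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact formula, so this assumes one common formulation: for each lag, each pair of descriptor scales (j, k) is multiplied across positions i and i + lag, summed, and divided by the number of position pairs. The function name `acc_vector`, the `max_lag` parameter, and the descriptor table are hypothetical; the numeric values are toy placeholders, not the published z-scales.

```python
# Toy 3-component amino acid descriptors for illustration only.
# These are NOT the published z-scale values.
TOY_DESCRIPTORS = {
    "A": (0.1, -1.0, 0.5),
    "G": (2.0, -4.0, 0.3),
    "K": (2.5, 1.0, -3.0),
    "L": (-4.0, -1.0, -1.0),
}


def acc_vector(sequence, descriptors, max_lag):
    """Return a fixed-length auto cross covariance vector for one peptide.

    Assumed formulation (not taken from the paper):
        ACC[j, k, lag] = sum_i z[i][j] * z[i + lag][k] / (n - lag)
    The output length is d * d * max_lag, independent of peptide length n.
    """
    z = [descriptors[aa] for aa in sequence]  # n rows of d descriptor values
    n = len(z)
    d = len(z[0])
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must be at least 1 and below the sequence length")

    features = []
    for lag in range(1, max_lag + 1):
        for j in range(d):          # descriptor scale at position i
            for k in range(d):      # descriptor scale at position i + lag
                total = sum(z[i][j] * z[i + lag][k] for i in range(n - lag))
                features.append(total / (n - lag))
    return features


# Peptides of different lengths map to vectors of the same length.
short_vec = acc_vector("AGKL", TOY_DESCRIPTORS, max_lag=2)
long_vec = acc_vector("AGKLLKA", TOY_DESCRIPTORS, max_lag=2)
print(len(short_vec), len(long_vec))
```

Under this formulation the largest usable lag is bounded by the shortest peptide in the data set, since every sequence needs at least one position pair per lag. The resulting rows can then be stacked into the uniform matrix that a method such as PLS expects.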

Keywords

peptide sequences, peptide libraries, z-scales, auto cross covariances, QSAR, PLS, OSC
