Chemometrics and Intelligent Laboratory Systems 68 (2003) 29–40
www.elsevier.com/locate/chemolab

The refinement of PLS models by iterative weighting of predictor variables and objects

Michele Forina a,*, Chiara Casolino a, Eva M. Almansa b

a Dip. Chimica e Tecnologie Farmaceutiche ed Alimentari, University of Genova, Via Brigata Salerno (ponte) s/n, Genoa, Italy
b Dep. Quimica Analitica, University of Granada, Spain

Abstract

The flexibility of the PLS algorithm can be used to assign suitable weights to predictors, to objects, or to both predictors and objects. Weights of predictors are obtained from the regression coefficients and the standard deviations; weights of objects are obtained from the prediction residuals. By iterative weighting, the regression models are refined and a steady state is attained, in which useless predictors and anomalous objects are cancelled and a very economical model is obtained. The predictive ability and stability of this final model are better than those of the original model with all the available predictors and objects.
© 2003 Elsevier B.V. All rights reserved.

Keywords: Multivariate regression; Outliers; Robust regression; Partial least squares

1. Introduction

Regression techniques are widely used for multivariate chemical calibration and for the study of the relationships between biological activity and molecular structure. Frequently, many predictors are useless, and they worsen the predictive ability of the regression model. Sometimes some objects are anomalous (outliers), e.g. because of a large error in the use of the reference analytical technique in the case of multivariate calibration. For this reason, many techniques have been developed to eliminate the outliers or the useless predictors.

* Corresponding author. Tel.: +39-10-353-2630; fax: +39-10-353-2684. E-mail address: [email protected] (M. Forina).

The techniques for the elimination of useless predictors can be classified into three categories.

(a) Subset selection, e.g. by means of Genetic Algorithms (GA) [1].

(b) Factor-wise selection (also known as dimension-wise selection). Factor-wise techniques work on the single factors (principal components, latent variables) of the regression technique. Martens and Naes [2] suggested replacing the small PLS weights in each latent variable with zero, so that the corresponding predictors are cancelled from that latent variable but can still be used in one or more of the following ones. Frank [3] improved this procedure in the technique called Intermediate Least Squares (ILS). Different strategies were used by Kettaneh-Wold et al. [4], Lindgren et al. [5] and Forina et al. [6].

0169-7439/03/$ - see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/S0169-7439(03)00085-6


(c) Model-wise elimination. The regression model is developed with all the predictors. The useless predictors are eliminated on the basis of the value of their regression coefficients b in the regression model:

y = b_0 + b_1 x_1 + ... + b_v x_v + ... + b_V x_V     (1)

A first procedure is based on the elimination of predictors with small regression coefficients in the regression model computed with autoscaled predictors [7]. Iterative stepwise elimination (ISE) [8] is based on the importance of the predictors, defined as:

z_v = |b_v| s_v / Σ(v=1..V) |b_v| s_v     (2)

where s_v is the standard deviation of predictor v. In each elimination cycle, the predictor with the minimum importance is eliminated, and the model is computed again with the remaining predictors. The final model is the one with the maximum predictive ability. The Martens Uncertainty Test (MUT) [9] is based on the standard deviation of the regression coefficients b_v, computed from their values in the leave-one-out cycles of cross-validation. The predictors for which the hypothesis b_v = 0 is accepted at the 5% significance level are eliminated. Uninformative Variable Elimination (UVE-PLS) [10] adds to the original predictors an equal number of random predictors with very small values (range of about 10^-10), so that their influence on the regression coefficients of the original predictors is negligible. The standard deviation of the regression coefficients, s_bv, is obtained from the variation of the coefficients b by leave-one-out jack-knifing. The reliability of each predictor v, c_v, is obtained as:

c_v = b_v / s_bv     (3)
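As a concrete illustration (not code from the paper), the two per-predictor statistics just defined can be computed with a few lines of numpy. The function names and the way the leave-one-out coefficients are supplied are assumptions of this sketch.

```python
import numpy as np

def importance(b, s):
    # Eq. (2): z_v = |b_v| s_v / sum over v of |b_v| s_v
    z = np.abs(b) * s
    return z / z.sum()

def reliability(b_loo):
    # Eq. (3): c_v = b_v / s_bv, with b_v and s_bv estimated here as the mean and
    # standard deviation of the regression coefficients over the leave-one-out
    # (jack-knife) models, one row of b_loo per left-out object.
    return b_loo.mean(axis=0) / b_loo.std(axis=0, ddof=1)
```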

The maximum of the absolute values of the coefficients c_v of the added artificial predictors is the cut-off value for the elimination of non-informative original predictors. There are some variants of UVE (e.g. α-UVE, where the cut-off value is the α% quantile of the coefficients c_v of the added artificial predictors).

We recently presented [11] a new procedure, Iterative Predictor Weighting PLS (IPW-PLS), for model-wise elimination of useless predictors, based on the cyclic repetition of the PLS algorithm: in each cycle the predictors are multiplied by their current importance, 1 in the first cycle and then the importance computed at the end of the previous cycle by means of Eq. (2).

Many strategies have been used to eliminate anomalous objects. The prediction error can be used (but only in the calibration step) to identify the Y-outliers, characterised by an anomalous value of the response. The distance from the inner space of the significant latent variables and the leverage can be used to identify X-outliers, whose anomaly lies in the values of the predictors. Wakeling and Macfie [12] described a robust PLS procedure, RPLS, based on the iteration of the PLS algorithm with the objects weighted by a weight dependent upon the size of the regression residual. The median r̃ of the absolute values of the residuals |r_i| is computed, and the object square weights are obtained as:

u_i = [1 - (r_i / (k r̃))^2]^2   for |r_i| < k r̃
u_i = 0                         for |r_i| > k r̃     (4)

where k is a "sensitivity factor" that increases or decreases the threshold at which a weight of zero is assigned to a particular object. The weighted regression coefficient of the straight line through the origin, for the regression of a vector y on a vector x, is:

b = (x^T Ω y) / (x^T Ω x)     (5)

where Ω is the diagonal matrix of the object square weights. Eq. (5) can be rewritten as:

b = (x')^T y' / ((x')^T x')

where x' = Ω^(1/2) x and y' = Ω^(1/2) y,


so that Ω^(1/2) can be interpreted as a set of weights for both the predictors and the response. Cummins and Andrews [13], in their Iteratively Reweighted PLS (IRPLS), use the cross-validated residuals instead of the fitting residuals apparently used by Wakeling and Macfie [12]. The leave-one-out prediction residuals can be obtained from the fitting residuals and the leverage, and they can be very large for high-leverage outliers, so the use of prediction residuals seems very important. Moreover, Cummins and Andrews tried several weight functions besides Eq. (4), including the Huber weights:

u_i = 1   for |r_i| < k r̃
u_i = 0   for |r_i| > k r̃     (6)

and other weights previously used in iteratively reweighted ordinary least squares (IRLS). Finally, Cummins and Andrews use, instead of the median of the absolute values of the residuals, the median of the absolute deviations of the residuals from their median.

The objective of this work is to evaluate the possibility of developing a robust PLS algorithm with the weights applied both to the predictors, as in IPW-PLS, and to the objects, as in RPLS.

2. Theory

The PLS-1 (one response variable) algorithm is based on the marginal regression through the origin of each predictor on the response (frequently after autoscaling, almost always after centring). The vector of slopes (the PLS weights) w (w^T = y^T X / y^T y) is normalised (w_new = w_old / ||w_old||) to obtain the direction cosines, which identify the first PLS latent variable. The influence of outliers on the weights diminishes when the corresponding x vector and the response are multiplied by a small object weight. The weight of a given predictor depends on its correlation coefficient with the response and on its magnitude, so that it is a function of the pre-treatment. A useless predictor always has a correlation coefficient slightly different from zero, so that it influences the latent variables; it has a small regression coefficient in the regression Eq. (1) and, consequently, a small importance. By reducing its magnitude, through multiplication by its small importance, its contribution to the model decreases.

With the weights (importances of the predictors and object weights), the algorithm Iterative Predictors and Objects Weighting PLS (IPOW-PLS) becomes:

[a] Set Z = I/V (I: identity matrix, V: number of predictors); set Ω = I (identity matrix);
[b] Set X: original matrix of predictors, y: original vector of the response;
[c] Scale (autoscale, centre) predictors and response;
[d] Multiply the matrix of predictors by the diagonal matrices of the object weights (square roots) and of the importances:

X_w ← Ω^(1/2) X Z

Multiply the vector of the response by the matrix of the square roots of the object weights:

y_w ← Ω^(1/2) y

For the first latent variable (a = index of latent variable = 1): X_wa = X_w, y_wa = y_w.

Repeat steps [e]–[k] of ordinary PLS for each latent variable a:

[e] w_wa = y_wa^T X_wa / (y_wa^T y_wa)
[f] w_wa,new = w_wa,old / ||w_wa,old||
[g] t_wa = X_wa w_wa
[h] c_a = t_wa^T y_wa / (t_wa^T t_wa)
[i] p_wa = t_wa^T X_wa / (t_wa^T t_wa)
[j] X_w,a+1 = X_wa - t_wa p_wa^T
[k] y_w,a+1 = y_wa - c_a t_wa

The prediction errors r_ia are stored for each latent variable. End repeat (go to step [e] to compute the next PLS latent variable with the residuals of the response and of the predictors). The optimum complexity A of the model is obtained by predictive optimisation, by repetition of steps [a]–[k] with the usual cross-validation procedure (leave-one-out or C cancellation groups).

[l] Compute Z with the significant number of PLS components (the regression coefficients b are referred to the original predictors).

[m] If required, delete the predictors with importance less than a cut-off value. Update the diagonal matrix Z.
[n] Compute the object weights from the prediction residuals r_iA. In the first implementation of the algorithm, we used:

u_i = 1                      for |r_i| < 2r̃
u_i = [2 - |r_i|/(2r̃)]^2     for 2r̃ < |r_i| < 4r̃     (7)
u_i = 0                      for |r_i| > 4r̃

and

u_i^(1/2) ← u_i^(1/2) N / Σ(i=1..N) u_i^(1/2)     (8)

where N is the number of objects. Go to step [b] for the next IPOW cycle.

The weights computed by Eq. (7) are very close to those computed by means of Eq. (4) of RPLS when the constant k is 4, as shown in Fig. 1. Eq. (8) modifies the weights so that their sum is equal to the number of objects in the training set, so that some objects have weight >1.

Fig. 1. Weights of objects as a function of the ratio between the absolute residual and the median of the absolute residuals. (A) Weights according to Wakeling and Macfie [12]. (B) Weights from Eq. (7).
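A minimal numpy sketch of the cycle above may help make the weighting concrete. It is not the authors' code: the number of latent variables is fixed instead of being optimised by cross-validation, fitting residuals replace the leave-one-out prediction residuals of step [n], the Eq. (8) rescaling is omitted, and names such as ipow_pls, n_cycles and cut_off are illustrative.

```python
import numpy as np

def pls1_weighted(Xw, yw, n_lv):
    """Ordinary PLS-1 (steps [e]-[k]) on already weighted, centred data.
    Returns the regression coefficients referred to the columns of Xw."""
    X, y = Xw.copy(), yw.copy()
    W, P, C = [], [], []
    for _ in range(n_lv):
        w = X.T @ y / (y @ y)            # [e] marginal slopes (PLS weights)
        w = w / np.linalg.norm(w)        # [f] normalisation to direction cosines
        t = X @ w                        # [g] scores
        c = t @ y / (t @ t)              # [h] inner regression coefficient
        p = X.T @ t / (t @ t)            # [i] loadings
        X = X - np.outer(t, p)           # [j] deflate the predictors
        y = y - c * t                    # [k] deflate the response
        W.append(w); P.append(p); C.append(c)
    W, P, C = np.array(W).T, np.array(P).T, np.array(C)
    return W @ np.linalg.solve(P.T @ W, C)   # coefficients for the weighted predictors

def object_weights(residuals):
    """Eq. (7): object weights from the residuals and their median r~."""
    r_med = np.median(np.abs(residuals))
    a = np.abs(residuals) / (2 * r_med)
    u = np.clip(2 - a, 0, 1) ** 2        # [2 - |r|/(2 r~)]^2 in the middle band, 0 beyond 4 r~
    u[a <= 1] = 1.0                      # weight 1 below 2 r~
    return u

def ipow_pls(X, y, n_lv=2, n_cycles=10, cut_off=1e-4):
    n, V = X.shape
    z = np.full(V, 1.0 / V)              # [a] initial importances
    u = np.ones(n)                       # [a] initial object weights
    keep = np.ones(V, dtype=bool)
    for _ in range(n_cycles):
        xm, ym = X.mean(axis=0), y.mean()            # [c] centring only, for simplicity
        Xc, yc = X - xm, y - ym
        Xw = np.sqrt(u)[:, None] * Xc * z            # [d] object and predictor weighting
        yw = np.sqrt(u) * yc
        b_w = pls1_weighted(Xw, yw, n_lv)
        b = b_w * z                                  # coefficients for the original (centred) predictors
        s = Xc.std(axis=0, ddof=1)
        z = np.abs(b) * s
        z = z / z.sum()                              # [l] Eq. (2) importances
        keep &= z > cut_off                          # [m] delete useless predictors
        z = np.where(keep, z, 0.0)
        resid = yc - Xc @ b                          # fitting residuals (the paper uses prediction residuals)
        u = object_weights(resid)                    # [n] Eq. (7)
    return b, z, u
```

On data with the structure of ART1 (Table 1, below), one would expect such a cycle to drive the importances of the noise predictors and the weight of the anomalous object towards zero within a few cycles, in line with the behaviour the authors report in Section 4.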

3. Data

Four data sets have been used to check the performance of IPOW-PLS. The data were always column centred.

Data sets ART, reported in Table 1, are two artificial data sets with four predictors and 10 objects. The response was generated from the equation y = 20 X1 + 8 X2 + error, with the predictors drawn from a rectangular distribution (0–1) and the error from a normal distribution (0, 0.1). The value of the response for object 7 was modified from the original value 23.375 to obtain an outlier: to 33.375 for ART1 and to 24.375 for ART2.

The response variable of data set MOISTURE is the moisture measured on 60 samples of soy flour. The 19 predictors are the NIR absorbances from a filter instrument. These data have been described [14] and are available [15].

The data in TWOINST have also been described [14]. In the original paper, two spectrophotometers were used to record the NIR spectra of 60 soy flour samples (the same samples as MOISTURE). From the original 700 predictors, 175 predictors were selected, one every four.

Table 1
Data sets ART

Object   Y (ART1)   Y (ART2)   X1      X2      X3      X4      Error (ART1)   Error (ART2)
1        18.263     18.263     0.705   0.533   0.579   0.289   -0.100         -0.100
2        15.694     15.694     0.774   0.014   0.760   0.814    0.104          0.104
3         4.146      4.146     0.045   0.414   0.862   0.790   -0.064         -0.064
4        26.119     26.119     0.961   0.871   0.056   0.949   -0.068         -0.068
5        16.598     16.598     0.524   0.767   0.053   0.592   -0.016         -0.016
6        10.824     10.824     0.298   0.622   0.647   0.263   -0.111         -0.111
7        33.375     24.375     0.829   0.824   0.589   0.986   10.205          1.205
8        10.095     10.095     0.226   0.695   0.980   0.243    0.016          0.016
9        10.149     10.149     0.106   0.999   0.676   0.015    0.037          0.037
10        2.595      2.595     0.100   0.103   0.798   0.284   -0.228         -0.228
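For reference, a data set with the same structure as ART can be simulated in a few lines (a sketch under the distributions stated above; the random values will of course differ from those in Table 1):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10
X = rng.uniform(0.0, 1.0, size=(n, 4))              # rectangular (0-1) predictors
y = 20 * X[:, 0] + 8 * X[:, 1] + rng.normal(0.0, 0.1, size=n)
y_art1 = y.copy()
y_art1[6] += 10.0                                    # object 7 becomes a Y-outlier (ART1-style shift)
y_art2 = y.copy()
y_art2[6] += 1.0                                     # milder anomaly (ART2-style shift)
```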


Fig. 2. Construction of the increased data matrix. Each object is described by a row vector of N+M predictors: the N (or M) absorbances of the instrument on which the spectrum was recorded, and as many zeroes (M or N) as there are absorbances of the other instrument. In the example the same 60 samples were analysed with both instruments, so that the increased data matrix has 120 rows (objects) and 350 columns.

To obtain a unified regression model [16] for the two instruments, the data matrices of the two instruments were combined into an increased data matrix, as shown in Fig. 2.
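The construction of the increased matrix of Fig. 2 can be sketched as follows (an assumed helper, not the authors' code); X1 and X2 hold the spectra of the same samples on the two instruments, and the corresponding response vector is simply stacked twice:

```python
import numpy as np

def increased_matrix(X1, X2):
    n1, p1 = X1.shape
    n2, p2 = X2.shape
    top = np.hstack([X1, np.zeros((n1, p2))])     # instrument 1 block, zero-padded on the right
    bottom = np.hstack([np.zeros((n2, p1)), X2])  # instrument 2 block, zero-padded on the left
    return np.vstack([top, bottom])               # (n1 + n2) x (p1 + p2)

# Example with the dimensions quoted in Fig. 2: 60 samples, 175 wavelengths per instrument.
rng = np.random.default_rng(0)
X1 = rng.random((60, 175))
X2 = rng.random((60, 175))
X_inc = increased_matrix(X1, X2)                  # 120 x 350
# the response vector would be stacked twice: y_inc = np.concatenate([y, y])
```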

4. Results and discussion

4.1. Data sets ART

Figs. 3–5 and Tables 2–4 show some results obtained with the data sets ART. The starting model has regression coefficients 23.46, 13.66, 6.80 and 5.00. The four predictors have about the same standard deviation (0.31–0.35), so


that it seems that the noisy predictors X3 and X4 have a non-negligible importance in the regression model. Fig. 3 has the same appearance as the typical plots of IPW-PLS [11]. The importance of the uninformative predictors diminishes gradually and, finally, the two predictors are cancelled. The importance of the retained predictors increases and then stabilizes. The general behavior is the same as that observed with the IPW algorithm [11]. In the case of the data sets ART, the retained predictors are X1 and X2, as expected, and their final regression coefficients are 19.913 and 8.289 (ART1) or 19.912 and 8.287 (ART2).

The PLS characteristic plot is reported in Fig. 4 (left) as the prediction (leave-one-out) mean error as a function of the number of latent variables. Due to the small weight (cycle 2) of the anomalous object 7, and then (after cycle 3) to its elimination from the training set, the C.V. mean error (cross-validated prediction error, the mean of the absolute values of the prediction errors) decreases noticeably, and the PLS model becomes simpler. The evolution of the curves in Fig. 4 (left) is also due to the elimination of the useless predictors. In the right part of Fig. 4, the PLS characteristic plot obtained without the anomalous object 7 is shown. The optimum prediction error is 0.173 in the first cycle, obtained with four latent variables; it decreases to 0.073, obtained with two latent variables, in the second cycle; it is almost stable after three cycles, and after nine cycles it is constant (0.020).

Fig. 3. Data sets ART—importance of the predictors in the IPOW cycles (ART1: left, ART2: right).


Fig. 4. Data set ART1. (Left) Leave-one-out mean prediction error in the first three and in the last IPOW cycles. (Right) Without object 7.

The simplicity of the PLS model, more than the better prediction due to the elimination of the outliers, is the true advantage of IPOW-PLS. Fig. 5 shows the degree of fitting and prediction obtained with IPOW after 10 cycles, compared with that in the first cycle (usual PLS), without the elimination of the anomalous object 7.

Table 2 Data sets ART—object weights before IPOW cycles

The elimination of object 2, as a second outlier, has little effect on the model parameters. (In the case of the data sets ART, after the elimination of the true outlier there are only nine objects. The probability that the absolute value of the residual of one of nine objects from the regression line is more than four times the median is


Table 3 Data sets ART—importance of predictors before IPOW cycles

rather large, about 14%, so that the elimination of a second object is not surprising.) However, the small decrease of the prediction error due to the elimination of this second object seems to indicate a small degree of overfitting, frequently observed with IPOW in the case of a small number of objects.

Table 4
Data sets ART—absolute prediction errors after IPOW cycles

            Cycle
Object      1        2        3        4        5        6        7

ART1
1           0.610    1.183    0.134    0.088    0.031    0.035    0.035
2           3.330    0.656    0.334    0.383    0.365    0.382    0.382
3           6.285    0.981    0.188    0.021    0.042    0.049    0.049
4           5.688    3.376    0.261    0.161    0.038    0.050    0.050
5           2.673    1.969    0.206    0.014    0.032    0.027    0.027
6           0.271    1.008    0.035    0.067    0.057    0.057    0.057
7          10.086   10.710   10.003   10.212   10.260   10.253   10.253
8           3.244    0.373    0.023    0.087    0.063    0.061    0.061
9           0.102    0.793    0.192    0.153    0.062    0.076    0.076
10          5.252    0.838    0.153    0.247    0.089    0.089    0.089
Median      3.287    0.995    0.188    0.088    0.050    0.053    0.053

ART2
1           0.176    0.268    0.114    0.048    0.028    0.035    0.035
2           0.079    0.197    0.377    0.381    0.347    0.381    0.382
3           0.766    0.001    0.052    0.008    0.038    0.048    0.049
4           0.764    0.569    0.161    0.121    0.023    0.049    0.050
5           0.486    0.187    0.002    0.006    0.038    0.027    0.027
6           0.019    0.174    0.063    0.089    0.058    0.057    0.057
7           1.086    1.232    1.183    1.212    1.267    1.254    1.253
8           0.345    0.002    0.090    0.043    0.064    0.062    0.061
9           0.140    0.015    0.152    0.017    0.045    0.075    0.076
10          0.433    0.212    0.234    0.186    0.087    0.089    0.089
Median      0.389    0.192    0.114    0.048    0.042    0.053    0.054

4.2. Data set MOISTURE

Some results with data set MOISTURE are shown in Figs. 6–10. Fig. 6 shows the regression coefficients of the predictors, their standard deviations and the resulting importance computed by means of Eq. (2). Fig. 7 shows, on the right, the PLS characteristic plot when the two outliers (shown in Fig. 8) are cancelled from the training set. So it is also possible here to distinguish the effect of the elimination of the outliers from that of the elimination of the useless predictors, both on the prediction error and on the simplicity of the PLS model. The effect of the elimination of the outliers can be evaluated by comparing the first cycle in the two parts of Fig. 7; the effect of the elimination of the useless predictors is seen in the right part of Fig. 7.

Fig. 8 shows both the computed and the predicted response after 10 cycles. The lines parallel to the central line with intercept 0 and slope 1 (computed or predicted = measured) are the two internal lines |r_i| = 2r̃ and the two external lines |r_i| = 4r̃. The samples whose predicted value is within the two internal lines had weight 1 in the final regression model. For the two outliers, only the predicted value is reported in Fig. 8 (because the two objects were not used in the development of the final model). After eight cycles, IPOW retains only six predictors, and the model remains stable in the following cycles.


Fig. 5. Data set ART1—fitting and prediction in the first cycle (left), as in ordinary PLS, and after ten IPOW cycles (right). The index of the object is reported (only once when computed and predicted values overlap).

In Table 5, the results are compared with those of PLS and of some important techniques for the elimination of useless predictors, applied to the reduced data set with 58 objects obtained after the elimination of the two outliers. The elimination of useless predictors increases the predictive ability of the models.

The two outliers are surely Y-outliers. The samples were analysed in different laboratories, in different towns, with different instruments, always with the same outliers. Y-outliers can be a consequence of the storage conditions of the two samples or of an unusual error in the determination of moisture with the reference technique.

Fig. 6. Data set MOISTURE. Regression coefficients in the first IPOW cycle, standard deviation of the predictors, and their importance computed from Eq. (2).


Fig. 7. Data set MOISTURE. (Left) Leave-one-out mean prediction error in the first two and in the last IPOW cycles. (Right) Without objects 39 and 40.

The leverage of the two outliers is not very important in their elimination: ranking the objects according to decreasing leverage, the two objects are in positions 18 and 29. Fig. 9 shows the Centner reliability of the real predictors compared with that of the added noisy variables used to eliminate the useless predictors. Here we used UVE-PLS with 200 artificial noisy predictors (because of the small number of predictors in the data set). UVE-PLS retains only four predictors. Its standard error of prediction (SEP, really the standard deviation of the prediction error, frequently indicated as the Root Mean Square Error of Prediction, RMSEP) is not significantly worse. In many cases, UVE-PLS retains a few more predictors than IPOW and the standard deviation of the prediction error is slightly smaller. Moreover, the results of IPOW and UVE depend on some parameters (the number of C.V. cancellation groups, the criterion used to identify the number of significant components) and, in the case of UVE, on the randomisation seed, the number of artificial predictors and the variant, such as α-UVE. For example, with only 19 noisy predictors, or with a different randomisation seed, predictor 7 (which in Fig. 9 shows a reliability just below the cut-off value) was also accepted, with about the same value of SEP (0.898).

Fig. 10 shows the range of the regression coefficients obtained in the leave-one-out validation cycles. The elimination method indicated as "RANGE" in Table 5 can be considered a variant of the Martens Uncertainty Test: when the interval of a regression coefficient contains zero, the predictor is eliminated. The result is rather bad: only six useless predictors are eliminated and the SEP is about the same as in the case of PLS. MUT, with the 95% confidence interval of the regression coefficients as estimated from the leave-one-out cycles, eliminated eleven predictors, but the prediction ability is worse than that of IPOW and UVE. These results were obtained with home-made software.
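The UVE-PLS procedure used for this comparison can be sketched as follows (an illustrative outline, not the software used by the authors; fit_pls is a hypothetical callable that returns the PLS regression coefficients for all the columns of the matrix it receives):

```python
import numpy as np

def uve_mask(X, y, fit_pls, n_artificial=200, scale=1e-10, seed=None):
    """Append tiny random predictors, estimate Eq. (3) reliabilities by leave-one-out
    jack-knifing, and keep only the original predictors whose absolute reliability
    exceeds the maximum absolute reliability of the artificial ones."""
    rng = np.random.default_rng(seed)
    n, V = X.shape
    Xa = np.hstack([X, scale * rng.standard_normal((n, n_artificial))])
    b_loo = np.array([fit_pls(np.delete(Xa, i, axis=0), np.delete(y, i))
                      for i in range(n)])                 # leave-one-out coefficients
    c = b_loo.mean(axis=0) / b_loo.std(axis=0, ddof=1)    # Eq. (3) reliability
    cut = np.abs(c[V:]).max()                             # cut-off from the artificial predictors
    return np.abs(c[:V]) > cut                            # True for the predictors to keep
```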

Fig. 8. Data set MOISTURE—fitting and prediction after 10 IPOW cycles. The vertical arrows show the two identified outliers.

Table 5
Data set MOISTURE

Method   Objects   Number of predictors   Retained predictors               SEP
IPOW     60        6                      1,7,10,13,15,18                   0.8706
PLS      58        19                     -                                 0.9510
UVE      58        4                      8,13,15,18                        0.8877
RANGE    58        13                     1,3,5,6,7,8,9,10,12,13,15,16,19   0.9459
MUT      58        8                      5,7,8,9,10,13,15,19               0.9320

Results of IPOW compared with those of PLS and of other techniques for the elimination of useless predictors, used after the elimination of the two outliers.
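As a side note on the error measure reported in Table 5, the two closely related summaries mentioned above can be written as follows (a sketch; whether SEP is computed with or without mean-correction is the authors' choice and is only assumed here):

```python
import numpy as np

def sep(y_true, y_pred):
    e = y_true - y_pred
    return e.std(ddof=1)                 # standard deviation of the prediction errors

def rmsep(y_true, y_pred):
    e = y_true - y_pred
    return np.sqrt(np.mean(e ** 2))      # root mean square error of prediction
```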


Fig. 9. Data set MOISTURE. Reliability (from Eq. (3)) of the 19 original predictors and of 200 artificial noisy predictors generated by UVE-PLS. The range of the reliability shown by two horizontal lines corresponds to the maximum absolute value of the reliability of the noisy predictors.

The results with the Unscrambler [9] implementation of MUT are different (SEP 0.947 with six predictors), probably due to a different (larger) probability cut-off in the estimation of the confidence interval. Because the details are not available in Unscrambler, the difference between the results cannot be interpreted.

4.3. Data set TWOINST

Figs. 11–15 show some results obtained with data set TWOINST. The PC plot in Fig. 11 shows clearly that the most important source of information in the X-block is associated with the difference between the two instruments. The second component, instead, is related to the within-instrument variance, i.e. to the response. The latent variables of PLS copy the clustered structure of the principal components, but the information in the first latent variable is more correlated with the response, although with a large effect of the difference between the instruments. So the prediction error with the first latent variable is very large (Fig. 12), of the order of the before-regression error. Then the prediction error decreases, and from the fourth to the tenth latent variable it is almost constant.

Six outliers, shown in Fig. 13, are identified during the IPOW cycles. Four of them (39, 99, 40 and 100) correspond to objects 39 and 40 of data set MOISTURE, so these two objects must be considered Y-outliers, since their anomaly is independent of the instrument.

Fig. 10. Data set MOISTURE—range of the regression coefficients from leave-one-out validation. In the case of the predictors shown with an arrow, the range includes the value 0.

Fig. 11. Data set TWOINST (centred data). (Top) Principal component plot of the X-block. (Bottom) Score plot of the first two PLS latent variables.


Fig. 12. Data set TWOINST—Leave-one-out mean prediction error in the first two and in the last IPOW cycles.

Fig. 14. Data set TWOINST—importance of the predictors in the IPOW cycles.

The anomalous object 45 is an outlier only for one of the two instruments of TWOINST, so that the anomaly can be due to a bad spectrum or to the deterioration of the sample. Object 75 is just at the limit of the interval with boundary |r_i| = 4r̃; depending on the settings of PLS, it is frequently retained in the training set with a small weight. The effect of the elimination of the five outliers seems rather small in Fig. 12. PLS uses the first component mainly to explain the difference between the two instruments, so the plots in Fig. 12 can be compared with those in Fig. 7 (left) after a translation of one component towards the left.

The residual difference (e.g. in Fig. 12, cycle 1 starts with a mean prediction error of 1.2 with two latent variables, compared with 1.3 for cycle 1 in Fig. 7) is due to the different quality of the instruments.

Fig. 14 shows the characteristic plot of IPOW. The importances are stable after 12 cycles. Only 6–10 predictors are retained, depending on the number of C.V. groups and on the criterion used to select the number of significant latent variables (minimum SEP, first minimum, Haaland–Thomas F-statistics [17], Osten statistics [18], Van der Voet test [19]). Fig. 15 shows the position of the 11 predictors retained in one or more combinations of C.V. groups and criterion.

Fig. 13. Data set TWOINST—fitting and prediction after the IPOW cycles. Object indexes and horizontal arrows indicate the outliers.

Fig. 15. Spectra of one of the samples on the two instruments. The asterisks indicate the wavelengths retained.


Six predictors in the interval from 100 to 130 were always selected. They must be considered the essential predictors in the data set, practically the same for the two instruments.

5. Conclusions

The iterative re-weighting of both the predictors and the objects in the PLS algorithm can improve the calibration model, reducing the prediction error and increasing the simplicity of the model. Although the algorithm has been applied only to a limited number of data sets, it seems that a steady state is reached after 10–15 iterations, with the elimination of the outliers and of the useless predictors. IPOW retains the property of IPW-PLS [11] (where the weights were applied only to the predictors, without the possibility of eliminating outliers) of producing a very economical regression model, with a minimum number of predictors, perhaps with prediction performance a bit worse than that achieved by means of UVE-PLS [10] after the elimination of the outliers with a separate technique. The use of other object weights and the adjustment of some other settable parameters may improve the performance of IPOW-PLS. However, we think that, even in its present form, the algorithm can be useful for a better exploitation of PLS regression.

Acknowledgements

This study was developed with funds from the University of Genova and from CNR (National Research Council of Italy).

References

[1] C.B. Lucasius, G. Kateman, Genetic algorithm for large-scale optimization in chemometrics: an application, Trends Anal. Chem. 10 (1991) 254–281.
[2] H. Martens, T. Naes, Multivariate Calibration, Wiley, Chichester, 1989.

[3] I.E. Frank, Intermediate least squares regression method, Chemom. Intell. Lab. Syst. 1 (1987) 233–242.
[4] N. Kettaneh-Wold, J.F. MacGregor, S. Wold, Multivariate design of process experiments (M-DOPE), Chemom. Intell. Lab. Syst. 23 (1994) 39–50.
[5] F. Lindgren, P. Geladi, S. Rannar, S. Wold, Interactive Variable Selection (IVS) for PLS. Part 1: theory and algorithms, J. Chemom. 8 (1994) 349.
[6] M. Forina, G. Drava, C. De La Pezuela, Automatic selection of predictors in PLS by means of statistical tests on the correlation coefficient of the marginal least-squares regressions, VI CAC (Chemometrics in Analytical Chemistry Conference), Tarragona (Spain), June 25–29, Abstract Book, PII-29.
[7] A. Garrido Frenich, D. Jouan-Rimbaud, D.L. Massart, S. Kuttatharmmakul, M. Martinez Galera, J.L. Martinez Vidal, Wavelength selection method for multicomponent spectrophotometric determination using partial least squares, Analyst 120 (1995) 2787–2792.
[8] R. Boggia, M. Forina, P. Fossa, L. Mosti, Chemometric study and validation strategies in the structure-activity relationships of a new class of cardiotonic agents, Quant. Struct.-Act. Relatsh. (QSAR) 16 (1997) 201–213.
[9] The Unscrambler, Camo ASA, Oslo.
[10] V. Centner, D.L. Massart, O.E. de Noord, S. de Jong, B.M. Vandeginste, C. Sterna, Elimination of uninformative variables for multivariate calibration, Anal. Chem. 68 (1996) 3851–3858.
[11] M. Forina, C. Casolino, C. Pizarro Millan, Iterative Predictor Weighting PLS (IPW): a technique for the elimination of useless predictors in regression problems, J. Chemom. 13 (1999) 165–184.
[12] I.N. Wakeling, H.J.H. Macfie, A robust PLS procedure, J. Chemom. 6 (1992) 189–198.
[13] D.J. Cummins, C.W. Andrews, Iteratively reweighted partial least squares: a performance analysis by Monte Carlo simulation, J. Chemom. 9 (1995) 489–507.
[14] M. Forina, G. Drava, et al., Transfer of calibration function in near-infrared spectroscopy, Chemom. Intell. Lab. Syst. 27 (1995) 189–203.
[15] M. Forina, S. Lanteri, C. Armanino, Q-PARVUS Release 3.0, an extendable package of programs for explorative data analysis, classification and regression analysis, Dip. Chimica e Tecnologie Farmaceutiche, University of Genova, freely available at http://parvus.unige.it.
[16] M. Forina, C. Casolino, Joint PLS regression model for two instruments, 8th International Conference on Near-Infrared Spectroscopy, Essen, September 15–19, 1997.
[17] E. Thomas, D. Haaland, Anal. Chem. 62 (1990) 1091.
[18] D. Osten, J. Chemom. 2 (1988) 39.
[19] H. van der Voet, Chemom. Intell. Lab. Syst. 25 (1994) 313–323.