www.elsevier.com/locate/talanta
Received 6 January 2000; received in revised form 1 March 2000; accepted 3 March 2000
Abstract
A multivariate standardization procedure was used to extend the lifetime of a multivariate partial least squares
(PLS) calibration model for determining chromium in tanning sewage. The Kennard/Stone algorithm was used to
select the transfer samples and the F-test was used to decide whether slope/bias correction (SBC) or piecewise direct
standardization (PDS) had to be applied. Special attention was paid to the transfer samples since the process can be
invalidated if samples are selected which behave anomalously. The results of the F-test were extremely sensitive to
heterogeneity in the transfer set. In these cases, it should be taken as an interpretation tool. © 2000 Elsevier Science
B.V. All rights reserved.
0039-9140/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved.
PII: S0039-9140(00)00366-0
F. Sales et al. / Talanta 52 (2000) 329–336
reference methods. One way of extending the use of multivariate models is to apply standardization procedures, which reduce the experimental work needed to update the model if there is a drift in the predictions [5]. Several standardization strategies have been successful in solving these analytical problems [6–8].

One of these strategies is to transfer the newly measured data, where the experimental conditions in which the calibration model is developed are considered as the reference, and the samples analysed after the change (second experimental conditions) are corrected in the direction of the model conditions [9]. For this, a set of representative samples in the initial experimental conditions is selected by the algorithms e.g. Kennard/Stone [10] or hat matrix calculation [11]. These samples have to be analysed under the new experimental conditions and a transfer function must be established to correct the data from the new samples to be predicted with the initial model. Under the new experimental conditions, the samples are expected to behave like the transfer samples [12]. It is, therefore, important to detect possible outliers in the transfer samples, since standardizations could be wrong if they are included.

The transfer technique used depends on how great the change among the experimental conditions is. Tests done before the standardization have been used to decide the most suitable technique for solving the problem [13].

This paper studies the validity over time of a multivariate calibration model for quantifying chromium in residual samples from tanning treatments. The model was not used continuously and this means that its validation was only tested when the system was used. The model was built with UV–Vis spectra. The Kennard/Stone algorithm was used to select the samples which, in a first step, were used to evaluate whether there had been changes in experimental conditions. First, a diagnostic F-test was applied before the transfer to determine the best correction technique to use. The transfer step was then applied. Possible irregular tendencies of the selected samples under the second experimental conditions were also considered. Slope/bias correction (SBC) [14] and piecewise direct standardization (PDS) [15] were used to correct the deviated predictions and evaluate the information obtained from the diagnostic F-test. The number of transfer samples was also varied.

2. Theoretical background

2.1. Multivariate calibration

The prediction step can be formulated in inverse multivariate calibration models [16] by Eq. (1):

$$c_{\mathrm{unk}} = \mathbf{r}_{\mathrm{unk}}^{T}\,\mathbf{b} \qquad (1)$$

where the concentration of an analyte in an unknown sample $c_{\mathrm{unk}}$ is predicted by multiplying the transposed response of the sample $\mathbf{r}_{\mathrm{unk}}^{T}$ (one sample by m variables) by the matrix of the coefficients of the model $\mathbf{b}$ (m variables by one analyte).

The model is constructed with both the matrix of responses $\mathbf{R}$ (n samples by m variables) and the matrix of concentrations $\mathbf{c}$ (n samples by one analyte) of the calibration samples as follows:

$$\mathbf{b} = \mathbf{R}^{+}\mathbf{c} \qquad (2)$$

and $\mathbf{R}^{+}$ is the pseudoinverse of the centered matrix $\mathbf{R}$. The calculation of $\mathbf{R}^{+}$ characterizes the multivariate calibration method. For instance, PLS tries to extract the most useful information from $\mathbf{R}$ while at the same time considering the concentration [17].

Accuracy in quantification is assessed by the error of prediction [18], for example, as the mean squared error of prediction (MSEP):

$$\mathrm{MSEP} = \frac{\sum_{i=1}^{n} (\hat{c}_i - c_i)^2}{n} \qquad (3)$$

or, in percentage terms, as the relative root mean square error of prediction (RRMSEP):

$$\mathrm{RRMSEP} = \frac{100}{\bar{c}}\sqrt{\mathrm{MSEP}} \qquad (4)$$

In these expressions, $\hat{c}_i$ is the predicted concentration, $c_i$ is the real concentration, $\bar{c}$ is the mean of the real concentrations and n is the number of samples predicted.
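As a minimal sketch of Eqs. (3) and (4), the two error measures can be computed as shown below. The function names and the example concentrations are hypothetical and chosen only for illustration; they are not data from this study.

```python
import math


def msep(predicted, actual):
    """Mean squared error of prediction, Eq. (3)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def rrmsep(predicted, actual):
    """Relative root mean square error of prediction in percent, Eq. (4)."""
    error = msep(predicted, actual)          # also validates the inputs
    mean_actual = sum(actual) / len(actual)  # mean of the real concentrations
    return 100.0 / mean_actual * math.sqrt(error)


# Hypothetical concentrations, for illustration only (not data from the paper).
actual = [10.0, 20.0, 30.0]
predicted = [11.0, 19.0, 31.0]

print(msep(predicted, actual))    # every prediction is off by 1 unit
print(rrmsep(predicted, actual))  # error relative to the mean concentration
```

Because RRMSEP divides by the mean of the real concentrations, it is only meaningful when that mean is clearly different from zero.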
Calibration models can be validated with an internal validation, e.g. cross-validation and, therefore, prediction errors are called MSEPCV and RRMSEPCV, respectively [17]. Calibration samples can be used again as an external set to validate calibration models after a period of time. Although they are not completely independent, their prediction shows whether models cope with the possible changes in the experimental conditions [15]. Obviously, it is essential to have samples that are stable over time, otherwise this methodology could lead to incorrect results. Sometimes, the samples used are not stable, so other strategies must be applied [9,12].

Finally, the joint confidence interval test for the slope and the intercept (JCIT) can detect potential bias in the predictions, i.e. systematic error in the predictions [19].

2.2. Standardization

2.2.1. Selecting transfer samples

The first step is to select the transfer samples, in this case, by the Kennard/Stone sequential algorithm [20]. The selection tries to spread the samples uniformly along the multivariate space defined by the whole set of samples. The Kennard/Stone algorithm selects one sample at a time depending on its Euclidean distance from the other samples. If $\mathbf{r}_i$ is the response of a sample i, the Euclidean distance, $d_{12}$, between two samples is:

$$d_{12} = \sqrt{\sum_{j=1}^{m} (r_{1j} - r_{2j})^2} \qquad (5)$$

The algorithm begins by selecting the two samples with the maximum Euclidean distance, i.e. the furthest from each other.

The third sample is the furthest from these two. To find this sample, the distance between each sample and the two selected ones is calculated. In each case, the smallest distance is kept. Of this group of smallest distances, the sample which has the greatest distance is selected.

The fourth sample is the furthest from the three selected ones, and so on. Each new sample is chosen by means of the procedure explained above. This is repeated until the number of required samples is complete, but the algorithm does not indicate what the best number is. This depends on the data studied, although subsets of samples which form stable structures i.e. groups of points regularly distributed, can be a criterion to use [10].

2.2.2. Standardization algorithms

2.2.2.1. SBC [21]. This is a simple univariate standardization technique which establishes a linear regression between $c_{T1}$ and $c_{T2}$. These are the concentrations predicted by the initial model (Eq. (1)) with the responses of the transfer samples (small T) in the first and the second experimental conditions, respectively:

$$c_{T1} = \mathrm{bias} + \mathrm{slope}\; c_{T2} \qquad (6)$$

Both the slope and the intercept (so-called bias here) of the regression are applied to obtain the standardized concentration $(c_{2,\mathrm{unk}})_{\mathrm{std}}$ as follows:

$$(c_{2,\mathrm{unk}})_{\mathrm{std}} = \mathrm{bias} + \mathrm{slope}\; c_{2,\mathrm{unk}} \qquad (7)$$

2.2.2.2. PDS [22]. This is a multivariate standardization technique in which the response of each variable in the first experimental conditions $r_{1i}$ is related to the response of a group of variables $[r_{2,i-j}, r_{2,i+j}]$ in the second experimental conditions. By extending this to all the transfer samples, the relation can be written as:

$$\mathbf{r}_{1i} = \mathbf{R}_{2i}\,\mathbf{b}_i + b_{0i} \qquad (8)$$

where a multivariate regression, e.g. PCR can calculate the regression coefficients $\mathbf{b}_i$ and the offset term $b_{0i}$. The number of columns in matrix $\mathbf{R}_{2i}$ is called the window size, i.e. the number of variables from second experimental conditions involved in the relation.

When a local model is applied for each variable, vectors $\mathbf{b}_i$ are grouped in a diagonal matrix $\mathbf{F}$ and values $b_{0i}$ are grouped in vector $\mathbf{b}_0^{T}$. This matrix and this vector are used to correct $\mathbf{r}_{2,\mathrm{unk}}^{T}$, a sample measured in the new experimental conditions, to the standardized $(\mathbf{r}_{2,\mathrm{unk}}^{T})_{\mathrm{std}}$:

$$(\mathbf{r}_{2,\mathrm{unk}}^{T})_{\mathrm{std}} = \mathbf{r}_{2,\mathrm{unk}}^{T}\,\mathbf{F} + \mathbf{b}_0^{T} \qquad (9)$$
which can be predicted with the initial model by Eq. (1).

Matlab [24] home-made functions and Wise's Toolbox [25] were used for all the computations.
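The PDS correction of Eqs. (8) and (9) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the Matlab code used in this work: the local regression is ordinary least squares solved through the normal equations rather than PCR, the per-variable coefficients are stored in a list instead of being assembled into the matrix F, and all function names and data are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: more transfer samples or a smaller window needed")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_pds(r1, r2, half_window):
    """Fit one local model per variable, Eq. (8).

    r1, r2: transfer-sample responses (lists of rows) in the first and
    second experimental conditions. Assumption: ordinary least squares
    replaces the PCR step mentioned in the text.
    """
    n_vars = len(r1[0])
    models = []
    for i in range(n_vars):
        lo = max(0, i - half_window)
        hi = min(n_vars, i + half_window + 1)
        # Design matrix: a leading 1 for the offset b0i, then the window columns.
        x = [[1.0] + row[lo:hi] for row in r2]
        y = [row[i] for row in r1]
        p = len(x[0])
        xtx = [[sum(xr[j] * xr[k] for xr in x) for k in range(p)] for j in range(p)]
        xty = [sum(xr[j] * yv for xr, yv in zip(x, y)) for j in range(p)]
        coef = solve_linear(xtx, xty)
        models.append((lo, hi, coef[0], coef[1:]))
    return models


def apply_pds(r2_unk, models):
    """Standardize one spectrum from the second conditions, Eq. (9)."""
    return [b0 + sum(v * b for v, b in zip(r2_unk[lo:hi], coefs))
            for lo, hi, b0, coefs in models]


# Hypothetical transfer set: six samples, four variables (not data from the paper).
r2 = [
    [1.0, 2.0, 4.0, 3.0],
    [2.0, 1.0, 3.0, 5.0],
    [3.0, 4.0, 1.0, 2.0],
    [4.0, 3.0, 5.0, 1.0],
    [5.0, 5.0, 2.0, 4.0],
    [2.0, 6.0, 3.0, 1.0],
]
# Simulated first-conditions responses: a gain of 2 and an offset of 1.
r1 = [[2.0 * v + 1.0 for v in row] for row in r2]

models = fit_pds(r1, r2, half_window=1)
standardized = apply_pds([1.5, 2.5, 3.5, 4.5], models)
print(standardized)
```

With a window of three variables plus an offset, each local model has up to four parameters, so the sketch needs more than four transfer samples; this is one way to see why PDS depends more strongly on the number of transfer samples than SBC does.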
the coefficients of the model do not contain so much noise.

The scores for the samples of the calibration set were plotted and used to check for outliers. Neither outliers nor sub-models were found. The accuracy of the models, evaluated by a cross-validated relative error of prediction (RRMSEPCV), is less than 5%, which is a very good value for this type of determination. Trueness was assessed by linear regression. The joint test for the slope and intercept showed that there were no systematic errors at a 95% confidence level.

Table 1 (S1 selection) shows the samples selected when the Kennard/Stone method was used on the 26 initial samples. The whole set of initial samples, the Kennard/Stone selected samples and the Kennard/Stone selected samples after their analysis in the second experimental conditions are plotted together in the plane of the two first principal components on Fig. 2 (left), where over 99% of the total variance of the data is retained. There is a displacement in all 10 samples. This can also be seen by the unusual differences in the spectra, which are larger than the normal random noise. The change in the scores suggests a deviation in the predictions, as detected when the subset of Day2 samples is predicted with the initial model. The prediction error rises to 16%, which suggests the need for the standardization procedure.

However, displacement is not regular for sample 18, whose new score, due to an experimental error, is located much further from the others. As this transfer sample is an outlier, it may considerably influence the results. If sample 18 is
Fig. 1. (a) Prediction errors versus number of factors for the calibration model. (b) Coefficients of the model for four, five and six
factors.
Table 1
Transfer samples obtained in each selection

Selection   1st   2nd   3rd   4th   5th   6th   7th   8th   9th   10th
S1          16    18    6     23    20    22    19    7     1     21
S2          16    13    6     23    20    22    19    7     1     21
S3          13    16    26    23    14    17    22    7     1     25
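Selections such as those in Table 1 come from the Kennard/Stone max-min procedure of Section 2.2.1. A minimal sketch of that procedure is given below; the function names and the two-variable responses are hypothetical and are not the spectra used in this study.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two response vectors, Eq. (5)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kennard_stone(responses, n_select):
    """Return the indices of n_select samples, in order of selection."""
    n = len(responses)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    dist = [[euclidean(responses[i], responses[j]) for j in range(n)]
            for i in range(n)]
    # Start with the two samples that are furthest from each other.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    while len(selected) < n_select:
        remaining = [i for i in range(n) if i not in selected]
        # For each candidate keep its smallest distance to the selected set,
        # then take the candidate whose smallest distance is the greatest.
        best = max(remaining, key=lambda i: min(dist[i][s] for s in selected))
        selected.append(best)
    return selected


# Hypothetical two-variable responses, for illustration only.
responses = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0], [10.0, 0.0], [5.0, 4.0]]
print(kennard_stone(responses, 4))
```

The returned indices are zero-based and listed in selection order, which mirrors the 1st to 10th columns of Table 1. Ties between candidates are resolved here by taking the lowest index, a choice the algorithm description does not prescribe.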
Fig. 2. Scores of the set of samples analysed using the initial experimental conditions (crosses) and the first subset of 10 samples
measured using the second experimental conditions (circles) according to procedure S1 (left) and S3 (right). Only the sample subsets
are labelled.
Table 2
Results of the diagnostic F-test for each selection of samples (a)

                              Standardization samples
                              3        4       5       6       7       8       9        10
Calculated F-value for S1     387.6    21.8    25.7    26.9*   23.9*   19.5*   38.0*    37.5*
Calculated F-value for S2     17.79    5.83    5.46    7.16    7.27    6.79    26.28*   25.70*
Calculated F-value for S3     4.28     3.89    3.43    3.14    4.26    4.23    18.14*   18.76*
Critical F (0.99, t−2, t−2)   4052.2   99.0    29.4    16.0    11.0    8.5     7.0      6.0

(a) Figures marked with an asterisk (bold in the original) indicate that the calculated F-values are higher than the critical F-values.
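The comparison behind Table 2 is simple arithmetic and can be checked with the short sketch below, which transcribes the tabulated values and lists, for each selection, the numbers of transfer samples at which the calculated F exceeds the critical F. The variable and function names are hypothetical; the F-values themselves are taken from the table, not recomputed.

```python
# Values transcribed from Table 2; one entry per number of transfer samples.
n_samples = [3, 4, 5, 6, 7, 8, 9, 10]
calculated = {
    "S1": [387.6, 21.8, 25.7, 26.9, 23.9, 19.5, 38.0, 37.5],
    "S2": [17.79, 5.83, 5.46, 7.16, 7.27, 6.79, 26.28, 25.70],
    "S3": [4.28, 3.89, 3.43, 3.14, 4.26, 4.23, 18.14, 18.76],
}
critical = [4052.2, 99.0, 29.4, 16.0, 11.0, 8.5, 7.0, 6.0]


def exceeds_critical(f_values, f_critical):
    """Numbers of transfer samples whose calculated F is above the critical F."""
    return [n for n, f, fc in zip(n_samples, f_values, f_critical) if f > fc]


for name, values in calculated.items():
    print(name, exceeds_critical(values, critical))
```

For S2 and S3 only the nine- and ten-sample subsets exceed the critical value, which agrees with the statement in the discussion that the F-test recommended PDS in those cases.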
Both SBC and PDS were used for the standardization of the Day2 data set. The transfer samples varied from 3 to 10 and selections S1, S2 and S3 were used (Fig. 3).

S1 does not standardize Day2 data effectively. Neither SBC nor PDS removed the bias in the predictions at the usual 95% level of significance (1 − α = 0.95) due to the presence of sample 18 which behaves as an outlier. However, if SBC is applied, when the number of transfer samples increases the relative influence of sample 18 decreases and results are better, although PDS was diagnosed by the F-test. The influence of sample 18 is maintained with the PDS technique even when the number of transfer samples increases.

As expected, both S2 and S3 provide very acceptable standardization results, regardless of the selection used since there are no noticeable differences when either standardization technique is applied. Taking into account the simplicity of the technique, and the level of predictions, the best standardizations with S2 and S3 could be found with SBC for five transfer samples. Also, based on the number of new transfer samples to remeasure, S2 is better than S3. A subset of five samples for S2 needs to collect only one sample more than the initial S1, i.e. sample 13, while for S3 selection 3 new samples should be measured: samples 13, 26 and 14 (Table 1).

The SBC standardization notably improves the prediction of the calibration samples in the second experimental conditions. It reduces the MSEP and removes the bias in the predictions of the Day2 data without standardization. The number of transfer samples is not critical for this technique since any variation in the results is minimal. Moreover, the best PDS outputs do not significantly improve the SBC results. SBC is therefore better because it is simpler and requires fewer transfer samples to reach similar results. Also, PDS results depend more on the number of samples than SBC does, particularly when there are very few transfer samples. Therefore, the F-test results, which recommended PDS when nine and 10 transfer samples were used with S2 and S3 (Table 2), should not be interpreted as a conclusion but as an indication. If there is any doubt, it is better to apply the simplest algorithm.

Fig. 4 compares the residuals for the best case of five transfer samples for S2 with those from the initial and the uncorrected samples. Correction is particularly effective for the more deviated predictions of the Day2 second experimental conditions, while the other samples are slightly corrected although in some cases prediction is not improved.
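The SBC correction of Eqs. (6) and (7), favoured above for its simplicity, amounts to a univariate least-squares fit on the predicted concentrations of the transfer samples. A minimal sketch follows; the function names and the five example concentrations are hypothetical and are not the values obtained in this work.

```python
def fit_sbc(c_t1, c_t2):
    """Least-squares fit of c_T1 = bias + slope * c_T2, Eq. (6).

    c_t1, c_t2: concentrations predicted by the initial model for the
    transfer samples in the first and second experimental conditions.
    """
    n = len(c_t1)
    if n != len(c_t2) or n < 2:
        raise ValueError("need at least two paired transfer samples")
    mean1 = sum(c_t1) / n
    mean2 = sum(c_t2) / n
    sxx = sum((x - mean2) ** 2 for x in c_t2)
    if sxx == 0.0:
        raise ValueError("second-condition predictions are all identical")
    sxy = sum((x - mean2) * (y - mean1) for x, y in zip(c_t2, c_t1))
    slope = sxy / sxx
    bias = mean1 - slope * mean2
    return bias, slope


def apply_sbc(c2_unk, bias, slope):
    """Standardized concentration of a new sample, Eq. (7)."""
    return bias + slope * c2_unk


# Hypothetical predictions for five transfer samples (not data from the paper).
c_t2 = [10.0, 20.0, 30.0, 40.0, 50.0]   # second experimental conditions
c_t1 = [10.0, 18.0, 26.0, 34.0, 42.0]   # first experimental conditions

bias, slope = fit_sbc(c_t1, c_t2)
print(bias, slope)
print(apply_sbc(25.0, bias, slope))
```

Only two parameters are estimated, so a handful of transfer samples is enough for a stable fit. An outlying transfer sample, such as sample 18 in selection S1, still pulls the fitted line, although its relative weight falls as more samples are added.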
5. Conclusions
Acknowledgements
References