
Chemometrics and Intelligent Laboratory Systems 50 (2000) 75–82
www.elsevier.com/locate/chemometrics
NIR calibration in non-linear systems: different PLS approaches and artificial neural networks

M. Blanco *, J. Coello, H. Iturriaga, S. Maspoch, J. Pagès

Departament de Química, Unitat de Química Analítica, Facultat de Ciències, Universitat Autònoma de Barcelona, E-08193 Bellaterra, Barcelona, Spain
Received 3 February 1999; accepted 15 July 1999

Abstract
The frequent non-linearity of the calibration models used in near infrared reflectance spectroscopy (NIRRS) is the main source of large errors in analyte determinations with this technique. Non-linearity in this type of system arises from factors such as the multiplicative effect of differences in particle size among samples or an intrinsically non-linear absorbance-concentration relationship resulting from interactions between components, hydrogen bonding, etc. In this work, calibration methods including partial least-squares (PLS) regression, linear-quadratic PLS (LQ-PLS), quadratic PLS (QPLS) and artificial neural networks (ANNs) were used in conjunction with the NIRRS technique to determine the moisture content of acrylic fibres, the wide variability in linear density of which results in differential multiplicative effects among samples. Based on the results, PC-ANN is the best choice for the intended application. However, the joint use of an effective spectral pretreatment and computational methods such as PLS and LQ-PLS, the optimization of which is much less labour-intensive, provides comparable results. Standard normal variate (SNV) was found to be the best of the spectral pretreatments compared with a view to reducing the non-linearity introduced by scattering. The subsequent application of PLS provides accurate results with linear systems (absorption band at 1450 nm). A non-linear calibration model must be applied instead, however, if the system concerned is intrinsically non-linear. Under these conditions, the three methods tested for this purpose (LQ-PLS, QPLS and ANN) provide comparable results. © 2000 Elsevier Science B.V. All rights reserved.
Keywords: Non-linearity; Artificial neural networks; Quadratic PLS; NIR; SNV

1. Introduction
Near infrared (NIR) spectra typically consist of broad, weak, non-specific, extensively overlapped bands. These characteristics hindered expansion of the NIR technique until multivariate calibration methods became widely available and accepted. Multiple linear regression (MLR) [1], principal component regression (PCR) [2] and partial least-squares regression (PLSR) [3,4] are the most widely used choices among such methods. These calibration approaches assume a linear relationship between the measured parameters for the sample and the intensity of its absorption bands; with PCR and PLSR, small deviations from linearity are also acceptable as they can readily be suppressed by including additional principal components in the calibration model. In the presence of substantial non-linearity (e.g., that arising from scattered light or from intrinsic non-linearity in the absorption bands), the models tend to give large prediction errors and call for mathematical processing of spectra or the use of alternative calibration procedures to correct non-linearity [5].

* Corresponding author. Tel.: +34-935-81367; fax: +34-935812379; e-mail: iqan8@blues.uab.es
0169-7439/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved.
PII: S0169-7439(99)00048-9

In diffuse reflectance NIR spectroscopy (NIRRS), the multiplicative effect of changes in the effective light path, which arise from differences in particle size among pulverulent or granulated samples, causes a non-constant shift in spectra (i.e., non-linearity). These effects, which complicate calibration models, can be minimized or avoided altogether by using wavelength selection [6] or mathematical pretreatments such as spectrum derivation [7,8], the standard normal variate (SNV) [9,10] and multiplicative scattering correction (MSC) [11]. Dhanoa et al. [12] showed MSC and SNV to be linearly related when applied over the same wavelength range.
Other calibration methods are insensitive to the effects of non-linearity. Such is the case with the model developed by Gnanadesikan [13], which expands the X matrix with the squares of the variables; the projection of the PLS components on a surface in the expanded space corresponds to that of the original X matrix in a quadratic space, which allows one to correct quadratic non-linearity. Although this system performs quite well, it entails using a large number of variables and its results are difficult to interpret. This type of regression, referred to as linear-quadratic partial least squares (LQ-PLS) regression, preserves a linear internal relationship between the scores of the X and Y matrices.
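The expansion of the X matrix can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the implementation used in this work: the PLS step is a plain NIPALS PLS1 routine for a single response, and the function names (`pls1_fit`, `pls1_predict`, `expand_with_squares`, `lq_pls_fit`, `lq_pls_predict`) are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Plain NIPALS PLS1 on mean-centred data (single response)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f                    # weight vector: covariance with y
        w /= np.linalg.norm(w)
        t = E @ w                      # X scores for this component
        tt = t @ t
        p = E.T @ t / tt               # X loadings
        c = f @ t / tt                 # linear inner relation (y on t)
        E = E - np.outer(t, p)         # deflate X
        f = f - c * t                  # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)   # regression vector in X space
    return b, x_mean, y_mean

def pls1_predict(model, X):
    b, x_mean, y_mean = model
    return (X - x_mean) @ b + y_mean

def expand_with_squares(X):
    """Append the squared variables to the original ones: [X, X**2]."""
    return np.hstack([X, X ** 2])

def lq_pls_fit(X, y, n_components):
    """LQ-PLS sketch: ordinary (linear) PLS on the expanded matrix."""
    return pls1_fit(expand_with_squares(X), y, n_components)

def lq_pls_predict(model, X):
    return pls1_predict(model, expand_with_squares(X))
```

The inner relation between scores stays linear; only the number of X variables doubles, which is why more components may be needed and interpretation becomes harder.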
There are other, intrinsically non-linear calibration models such as quadratic PLS (QPLS) [14,15] and artificial neural networks (ANNs) [16,17].
QPLS uses a non-linear internal relation between the scores of matrices X and Y. Thus, the relation established with a second-order polynomial is as follows:
$$u_a = C_{0a} + C_{1a} t_a + C_{2a} t_a^2 + h_a$$
where a is the model dimensionality, t_a and u_a are the scores of X and Y for that dimension, C_{0a}, C_{1a} and C_{2a} are the polynomial coefficients and h_a is the residual.
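For a single dimension a, the second-order inner relation amounts to a least-squares polynomial fit of the Y scores on the X scores. The sketch below shows only that step, assuming the scores are already available; a full QPLS algorithm also updates the weights iteratively, which is not shown here. The function name `fit_inner_relation` is hypothetical.

```python
import numpy as np

def fit_inner_relation(t, u):
    """Fit u = c0 + c1*t + c2*t**2 by least squares.

    Returns the coefficients (c0, c1, c2) and the residual vector h.
    """
    c2, c1, c0 = np.polyfit(t, u, deg=2)   # polyfit returns highest power first
    h = u - (c0 + c1 * t + c2 * t ** 2)
    return (c0, c1, c2), h
```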
ANNs are also among the most widely used mathematical algorithms for overcoming non-linearity. The networks are straightforward mathematical descriptions of what is currently known about the physical structure and mechanism of biological learning and knowledge [16]. The ANN used in this work was a multilayer perceptron network with error back-propagation as the training scheme and the generalized delta rule for weight adjustment. The topology of this network type affords the use of a variable number of layers. Each layer can contain one or more neurons or nodes, which can act with a linear or non-linear transfer function. The input layer contains as many neurons as there are variables to be handled; the output layer, as many as there are parameters to be determined. Between the input and output layers, a variable number of hidden layers can be inserted, each also containing a variable number of neurons. The number of data values used for training must exceed the number of weights determined in the network; this entails using a large number of samples for calibration if the number of input variables is also large. This is a frequent problem with data recorded at several wavelengths; it is usually addressed by subjecting spectra to principal component analysis (PCA), computing the scores for the principal components (PCs) that describe the body of spectra and using the scores as ANN input.
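The compression step can be illustrated with a short PCA sketch based on the singular value decomposition. It is a generic formulation, not the routine used in this work, and the names `pca_scores` and `project` are hypothetical. New spectra are projected with the mean and loadings obtained from the calibration spectra.

```python
import numpy as np

def pca_scores(X, n_components):
    """PCA of mean-centred spectra via SVD.

    Returns scores, loadings, the column mean and the fraction of
    variance explained by each retained component.
    """
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return scores, loadings, mean, explained

def project(X_new, loadings, mean):
    """Scores of new spectra on the calibration loadings (the ANN input)."""
    return (X_new - mean) @ loadings
```

With spectra recorded every 2 nm, a 1380–1520 nm window holds 71 variables; keeping, say, three scores reduces the ANN input layer from 71 neurons to 3.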
The most usual way of describing the architecture of a network is by using the notation (i, h1, h2, o), where i is the number of nodes or neurons in the input layer; h1 and h2 are the numbers of neurons in the two hidden layers; and o is the number of neurons in the output layer. The number of neurons in the input and hidden layers, the number of hidden layers and the transfer function should be optimized.
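The requirement that training data outnumber the weights can be checked directly from the architecture notation. The helper below is a hypothetical sketch (`count_weights` is not part of any package used here), and counting one bias term per neuron is an assumption; whether biases are included depends on the software.

```python
def count_weights(layers, bias=True):
    """Adjustable weights in a fully connected feed-forward network.

    layers: tuple such as (i, h, o) or (i, h1, h2, o).
    bias:   if True, add one bias term per non-input neuron (assumption).
    """
    total = 0
    for n_in, n_out in zip(layers[:-1], layers[1:]):
        total += n_in * n_out + (n_out if bias else 0)
    return total

# A (3, 3, 1) network fed with three PC scores:
print(count_weights((3, 3, 1)))     # 16
# The same hidden layer fed with 71 raw wavelengths:
print(count_weights((71, 3, 1)))    # 220
```

The comparison shows why raw spectra are impractical as input when only a few dozen calibration samples are available.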
As noted earlier, non-linearity in the signal-analyte concentration relation may arise from the scattering derived from a high variability in particle size or from intrinsic non-linearity in the absorption band of interest. The NIR spectrum for water exhibits five bands with maxima at 1940, 1450, 1190, 970 and 760 nm; the position of these bands, however, can be shifted by temperature changes or hydrogen bonding interactions with sample components. The bands at 1450, 970 and 760 nm correspond to the first, second and third OH stretching overtones; those at 1940 and 1190 nm are OH stretching and bending combination bands [18]. The bands at 1190, 970 and 760 nm are too weak to be of use to quantify water. On the other hand, those at 1450 and 1940 nm allow relatively low moisture contents in complex mixtures to be accurately determined (one should always bear in mind, however, the intrinsically non-linear nature of the band at 1940 nm, which is a result of interactions with the sample matrix).
In this work, three different calibration methods (LQ-PLS, QPLS and ANNs), and as many types of spectral pretreatments (derivatives, SNV and MSC), were tested with a view to establishing models of a high predictive capacity for the determination of moisture in acrylic fibre samples using the NIR spectroscopic technique. This type of sample exhibits two different sources of non-linearity, namely: differences in linear density among samples, which were studied by our group in a previous work [19], and water absorption, which affects the band at 1940 nm preferentially.


Fig. 1. Water recovery by dried acrylic fibres.

2. Experimental
2.1. Samples
The samples used were cut threads of acrylic fibre consisting of a 90:10 (w/w) acrylonitrile-vinyl acetate copolymer. The linear density of the fibres studied ranged from 1.0 to 18.0 dtex (1 dtex = 1 mg per 10 m of fibre).
The amount of residual moisture in acrylic fibres typically ranges from 0.8% to 1.2%; the lower the linear density of the sample, the higher its specific surface area and the higher its moisture content as a result. In order to expand the natural moisture content range, some samples were stored in a wet environment and others in a dry one for variable lengths of time.
Thus, samples with a high moisture content were obtained by allowing fibres to stand in a closed glass container housing a small, water-filled beaker for a variable time between 20 min and 4 days. On the other hand, samples with a low moisture content were obtained by placement in a desiccator containing magnesium perchlorate over a period of 30 min to 3 days. The water content range thus spanned was 0.12%–3.58%.
The moisture content in each sample was determined by drying at 100 °C in an oven following recording of the NIR spectrum.
Dried acrylic fibres regain part of the water lost; also, wet fibres release most of the water absorbed. As can be seen from the band at 1910 nm in Fig. 1, samples rapidly recovered most of the moisture they had previously lost (however, the second band in the spectrum underwent no similar changes as it was due to the second overtone of the ester group in the copolymer sample matrix). The water sorption-desorption process was so rapid that the reference value was difficult to establish (particularly for those samples that were exposed to a wet or dry environment for a long time, as they lost the moisture absorbed or regained that lost during the time their spectra were recorded). Despite the care exercised to avoid this problem, the reference values used for calibration and prediction were not highly accurate.
2.2. Apparatus and software
The experimental assembly used comprised a
NIRSystems 6500 near infrared spectrophotometer
equipped with a reflectance detector and a spinning
module for recording spectra.
The following software was used:
(a) Near Infrared Spectral Analysis Software (NSAS) v. 3.52, from NIRSystems, which allows recording of spectra and their processing (averaging, derivation).
(b) Unscrambler v. 6.1, from CAMO, for PLSR and identification of the spectral regions of interest.
(c) Neural-UNSC v. 1.02, also from CAMO, for construction of the ANN models.
(d) Matlab 5.1, from The MathWorks, for computation of the QPLS models.

2.3. Recording of NIR spectra
Spectra were recorded by pressing a small amount of sample (ca. 2 g) inside the cuvette, which was then placed in the spectrophotometer's spinning module. Each spectrum was the average of 32 scans obtained at 2 nm intervals over the wavelength range 1100–2500 nm. A reference spectrum (porcelain plate) was recorded prior to each sample spectrum.
2.4. Data processing
The calibration models tested were constructed by
using PLSR and LQ-PLS as linear models, and QPLS
and ANNs as non-linear calibration models.
Models were developed by using different spectral pretreatments and the working wavelength ranges were 1380–1520 nm and 1840–2120 nm, corresponding to both absorption bands of water.
Second derivatives were calculated for all the spectra using NSAS v. 3.52 with a segment of 10 and a gap of zero.
SNV treatment was applied to the absorbance spectra in both wavelength ranges; in this way, variability arising from absorption in other regions of the NIR spectrum was avoided.
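SNV centres and scales each spectrum individually, using only the points of that spectrum within the chosen wavelength window. A minimal sketch follows; the function name `snv` is hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption about the exact convention.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: row-wise centring and scaling.

    spectra: 2-D array, one spectrum per row, already restricted to the
    working wavelength range.
    """
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)   # n - 1 denominator (assumption)
    return (spectra - mean) / std
```

Any spectrum of the form a·x + b with a > 0 maps onto the same SNV-transformed spectrum as x, which is why the treatment removes additive and multiplicative scatter differences between samples.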
MSC was only applied in the working range 1380–1520 nm. An average spectrum for the calibration set in the spectral region from 1100 to 1300 nm was calculated (in this region there is no absorption of the analyte) and a regression of each spectrum of the calibration set on the average spectrum was calculated. The slope and intercept were used to correct additive and multiplicative effects of scattering in the working range 1380–1520 nm.
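The MSC variant described above, with the regression done in an analyte-free region and the correction applied in the working range, can be sketched as follows. This is an illustrative reading of the procedure, not the original code; `msc_reference` and `msc_correct` are hypothetical names, and the masks are boolean arrays over the wavelength axis.

```python
import numpy as np

def msc_reference(cal_spectra, ref_mask):
    """Average calibration spectrum in the reference (analyte-free) region."""
    return cal_spectra[:, ref_mask].mean(axis=0)

def msc_correct(spectra, ref_mean, ref_mask, work_mask):
    """Regress each spectrum on the average spectrum in the reference region,
    then use the slope and intercept to correct the working range."""
    corrected = np.empty((spectra.shape[0], int(work_mask.sum())))
    for k, x in enumerate(spectra):
        slope, intercept = np.polyfit(ref_mean, x[ref_mask], deg=1)
        corrected[k] = (x[work_mask] - intercept) / slope
    return corrected

# Wavelength axis as recorded: 1100-2500 nm at 2 nm intervals
wl = np.arange(1100, 2502, 2)
ref_mask = (wl >= 1100) & (wl <= 1300)
work_mask = (wl >= 1380) & (wl <= 1520)
```

The approach relies on the scatter effect being the same in the reference and working regions, so it needs an analyte-free region close to the band of interest.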
PLS, LQ-PLS and QPLS models were constructed by cross-validation. Because the number of wavelengths used was too large for direct use as input layer in an ANN, variables were compressed by principal component analysis (PCA) in order to identify the principal components (PCs) best describing the data matrix; the scores of such PCs were used as input variables for the ANN; the network leading to the lowest root mean square error of prediction for the validation set,
$$\mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{n} \left(C_i^{\mathrm{NIR}} - C_i^{\mathrm{REF}}\right)^2}{n}}$$
was adopted as optimal.


The predictive capacity of the PLS, LQ-PLS, QPLS and PC-ANN models tested was compared in terms of the relative standard error for both the calibration and the validation sets,
$$\%\mathrm{RSE} = \sqrt{\frac{\sum_{i=1}^{n} \left(C_i^{\mathrm{NIR}} - C_i^{\mathrm{REF}}\right)^2}{\sum_{i=1}^{n} \left(C_i^{\mathrm{REF}}\right)^2}} \times 100$$
(%RSEC and %RSEV, respectively), where n is the number of samples included in the calibration or the validation set, $C^{\mathrm{REF}}$ is the water concentration in the sample as measured by the reference method and $C^{\mathrm{NIR}}$ is the concentration as calculated by PLS, LQ-PLS, QPLS or PC-ANN from the NIR spectrum.
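Both figures of merit are simple to compute from paired predicted and reference values. The sketch below assumes the usual definitions (root mean squared difference for RMSEP; squared differences relative to squared reference values for %RSE); the function names `rmsep` and `rse_percent` are hypothetical.

```python
import numpy as np

def rmsep(c_nir, c_ref):
    """Root mean square error of prediction."""
    c_nir, c_ref = np.asarray(c_nir, float), np.asarray(c_ref, float)
    return np.sqrt(np.mean((c_nir - c_ref) ** 2))

def rse_percent(c_nir, c_ref):
    """Relative standard error (%), normalized by the reference values."""
    c_nir, c_ref = np.asarray(c_nir, float), np.asarray(c_ref, float)
    return 100.0 * np.sqrt(np.sum((c_nir - c_ref) ** 2) / np.sum(c_ref ** 2))
```

Because the denominator of %RSE is the sum of squared reference values, a given absolute error produces a larger %RSE when the reference contents are low.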

3. Results and discussion


A calibration set consisting of 24 samples was used to establish the model, and a validation set comprising 43 samples was used to validate the PLSR, LQ-PLS, QPLS and PC-ANN models. In addition, a further 20 samples were used as a test set to build the PC-ANN models.
Samples were distributed between the sets in such a way as to ensure variability in both linear density and moisture content, and also that each subset would be representative of the entire set. Fig. 2 shows the variation of linear density with moisture content for the samples in the calibration and validation sets. Based on the graph, calibration and validation samples were selected in order to minimize correlation between both parameters.
Fig. 3 shows the NIR spectrum for four samples
with an identical moisture content but different linear
density. Note the strong shift between spectra due to
scattering.

Fig. 2. Linear density vs. percent moisture plot for the calibration and validation samples.

Fig. 4. Dependence of the NIR spectra on moisture content.

Fig. 4 shows the NIR spectrum for four samples with different moisture contents but the same linear density (1.7 dtex). As can be seen, the intensity of the water absorption bands at 1450 and 1920 nm varied markedly with the moisture content. The strong spectral absorption of water in these two regions led us to construct individual calibration models based on the two intervals spanned by such bands in order to be able to compare their performance in both zones. The wavelength ranges thus chosen were 1380–1520 nm for the former band and 1840–2120 nm for the latter.
The study was undertaken with a twofold purpose, namely: to determine the effect of the mathematical pretreatment of spectra on the linear calibration models and to construct the non-linear calibration model leading to the best (most accurate) possible predictions.

3.1. Spectral pretreatment


Direct application of PLS to absorbance spectra resulted in high %RSE values owing to strong scattering in the two selected spectral ranges, both with calibration and with validation samples (see Table 1). Using the second derivative to correct the effect of scattering led to %RSE values similar to those obtained in the absorbance mode (Table 1); in fact, the effect was of the multiplicative type, so it could not be corrected by derivation.
On the other hand, after SNV treatment, the first PLS component accounted for 97.6% of the variance in the concentration matrix over the wavelength range 1380–1520 nm and for 98.3% of that in the 1840–2120 nm range; also, the model was much simpler than the previous one and RSEV decreased to 7.3% with one PLS component for the lower range and to 9.3% with two components for the higher.
The SNV treatment ensured subsequent accurate analysis of the samples, whichever their linear density (the contribution of which to the NIR spectrum was minimized by the treatment). Although the errors might seem somewhat high, one should take into account that the samples contained very little moisture; thus, small differences between the NIR-PLS and reference values resulted in markedly increased %RSEV values (for example, RMSEP was only 0.100 for the model involving SNV treatment and the 1380–1520 nm wavelength range).

Fig. 3. Dependence of the NIR spectra on linear density.

Table 1
Figures of merit for the different PLS models constructed

Wavelength range (nm)   Spectral mode       PLS components   RSEC (%)   RSEV (%)
1380–1520               absorbance          3                10.9       14.9
1840–2120               absorbance          3                10.9       13.3
1380–1520               second derivative   2                11.0       14.2
1840–2120               second derivative   3                11.6       15.0
1380–1520               SNV                 1                6.8        7.3
1840–2120               SNV                 2                6.4        9.3
1380–1520               MSC                 1                6.8        9.1

Fig. 5A and B show the calibration sample residuals obtained by using PLS models based on the 1380–1520 and 1840–2120 nm ranges, respectively, with SNV and one PLS component. As can be seen, the PLS model for the 1840–2120 nm region exhibited non-linearity; however, using a second PLS component suppressed it (Fig. 5C).
The MSC treatment was applied over the range 1100–1300 nm (where water absorbs negligibly) so as to correct the multiplicative effect of differences in linear density in the region from 1380 to 1520 nm. Although, as can be seen in Table 1, the optimum number of PLS components was 1, the resulting RSEV exceeded that obtained with SNV. The MSC treatment was not tested over the range 1840–2120 nm because no nearby spectral region for correcting scattering was available.
3.2. Application of LQ-PLS and the non-linear regression models (QPLS and PC-ANN)
Both expansion of the X matrix with the squares of the variables (LQ-PLS) and application of the intrinsically non-linear models (QPLS and PC-ANN) to the original data were tested.
As mentioned above, the neural networks examined were of the feed-forward type, in which parameter estimation is based on the back-propagation algorithm. These networks transfer the information held by the neurons in the input layer to one or several hidden layers; subsequently, the information contained in the hidden layers is combined via non-linear functions (a sigmoidal type was used in this work) in order to obtain the output data (i.e., the target parameter or parameters). This type of algorithm is highly suitable as it lends itself readily to supervised learning (viz. to learning from data with known responses and to using the acquired knowledge to predict the answers for other problems). The use of a non-linear transfer function allows one to model non-linear relationships between the analytical signal and the analyte or physical parameter of interest.
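A feed-forward network with one sigmoidal hidden layer, trained by back-propagation, can be sketched compactly. This is a generic illustration and not the Neural-UNSC implementation: the linear output neuron, the plain gradient-descent update without momentum, the learning rate and the names `init_network`, `forward` and `train` are all assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_network(n_in, n_hidden, n_out, rng):
    """Small random weights and zero biases for an (i, h, o) network."""
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, n_out)), "b2": np.zeros(n_out),
    }

def forward(net, X):
    H = sigmoid(X @ net["W1"] + net["b1"])    # hidden layer, sigmoidal transfer
    return H, H @ net["W2"] + net["b2"]       # linear output neuron (assumption)

def train(net, X, Y, lr=0.1, epochs=500):
    """Full-batch gradient descent on the mean squared error."""
    history = []
    for _ in range(epochs):
        H, out = forward(net, X)
        err = out - Y
        history.append(float(np.mean(err ** 2)))
        d_out = 2.0 * err / err.size                        # dMSE/d(output)
        d_hid = (d_out @ net["W2"].T) * H * (1.0 - H)       # error propagated back
        net["W2"] -= lr * H.T @ d_out
        net["b2"] -= lr * d_out.sum(axis=0)
        net["W1"] -= lr * X.T @ d_hid
        net["b1"] -= lr * d_hid.sum(axis=0)
    return history
```

In the PC-ANN setting, X would hold the PCA scores of the calibration spectra and Y the reference moisture contents; the validation error then guides the choice of the number of hidden neurons.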

Fig. 5. Residuals for the calibration samples as obtained in the SNV mode, using one PLS component over the ranges 1380–1520 nm (A) and 1840–2120 nm (B), and two components in the region 1840–2120 nm (C).

In order to be able to compare the results with those provided by the above-described PLS models, the same calibration and validation samples were used. Although non-linearity was clearly apparent only in the 1840–2120 nm region (Fig. 5B), the models were applied to both wavelength ranges.
For each of the wavelength regions and spectral modes studied, a PCA was performed to compress the data and the scores of the significant components were used as input data to the ANN. In all cases, the number of nodes in the first hidden layer was increased in sequence (from 1 to 6). Next, a second hidden layer was tested, but it did not significantly improve the results, so it was omitted from the final architecture.
Tables 2 and 3 show the results obtained in the
absorbance and SNV modes, respectively.
For absorbance spectra, in the 1380–1520 nm region, QPLS uses fewer components in the model than linear PLS but does not improve the results. The application of LQ-PLS slightly improves %RSEC and %RSEV with respect to the values obtained by PLS, but at the cost of increasing the number of PLS components; this indicates that non-linearities are not well modelled. Similar results were also obtained in the 1840–2120 nm region: QPLS uses one more component than linear PLS but RSEV is only slightly smaller; LQ-PLS gives a lower RSEV than both the linear PLS models and the model built in the range 1380–1520 nm. This indicates that the two water bands exhibit non-linearity in different ways. Only PC-ANN can model the non-linearity effect and significantly improves %RSEC and %RSEV in both wavelength ranges.
The results obtained by using second-derivative spectra for both wavelength ranges in these calibration models were no better than those found in the absorbance mode, so they are not reported; neither are those provided by MSC, as this treatment led to much greater %RSE values than SNV.

Table 2
Results obtained in the absorbance mode

Wavelength range (nm)   Model     PLS components   RSEC (%)   RSEV (%)
1380–1520               LQ-PLS    5                4.6        9.5
                        QPLS      2                11.2       15.2
                        PC-ANN    (3,3,1) a        4.7        8.0
1840–2120               LQ-PLS    3                4.7        8.8
                        QPLS      4                9.9        13.0
                        PC-ANN    (3,3,1) a        4.9        7.7

a Net architecture (i, h, o); i: neurons in the input layer; h: neurons in the hidden layer; o: neurons in the output layer.

Table 3
Results obtained with SNV treatment

Wavelength range (nm)   Model     PLS components   RSEC (%)   RSEV (%)
1380–1520               LQ-PLS    5                5.2        7.5
                        QPLS      2                6.5        7.3
                        PC-ANN    (1,5,1) a        6.4        6.7
1840–2120               LQ-PLS    3                5.6        7.1
                        QPLS      3                7.7        7.5
                        PC-ANN    (3,3,1) a        5.7        7.1

a Net architecture (i, h, o); i: neurons in the input layer; h: neurons in the hidden layer; o: neurons in the output layer.

The models obtained following the SNV treatment and using non-linear techniques (Table 3) were similar to those obtained with absorbance spectra but the results (%RSEC and %RSEV) were substantially better. In the range 1380–1520 nm, QPLS gives a simpler model than LQ-PLS, but the results did not improve on those obtained using linear PLS. In the other water band (1840–2120 nm), QPLS and LQ-PLS may correct the non-linearity, with lower errors than linear PLS. Again, PC-ANN gave slightly better results overall.
The results indicate that non-linearity in the wavelength range 1380–1520 nm is mainly due to light scattering and is corrected by SNV treatment; thus, linear PLS in this range produces a very simple model (one component) and a low RSEV. Non-linearity in the other water band, 1840–2120 nm, may be produced, besides light scattering, by interaction of water with the fibres, as it is a combination band, and is not corrected by SNV; in this case, non-linear regression models improve the results obtained by linear PLS. PC-ANN is able to model non-linearities in all cases.

4. Conclusions
Linear calibration models are poorly predictive when applied to systems subject to scattering-related non-linearity; the results are not significantly better if alternative linear or non-linear models such as LQ-PLS and QPLS, respectively, are used. Only with ANNs can models that result in low %RSE values be constructed under these circumstances.
However, the strong non-linearity of the systems can be partially suppressed by the SNV treatment, which thus allows construction of PLS models of much higher simplicity and predictive capacity. The use of SNV also raises the predictive capacity of the other three methods studied (LQ-PLS, QPLS and PC-ANN) and provides comparable results both in the wavelength range where the absorbance is linearly dependent on the concentration and in that where it is not.
Acknowledgements
This research was conducted in cooperation with the firm Courtaulds España, which supplied technical and practical information about their production and analysis methods. The work was performed within the framework of Project PB96-1180, funded by Spain's Dirección General de Investigación Científica y Técnica (DGICyT).


References
[1] M. Blanco, J. Coello, H. Iturriaga, S. Maspoch, E. Bertran, Analyst 119 (1994) 1779–1785.
[2] K. Esbensen, P. Geladi, S. Wold, Chem. Intell. Lab. Syst. 2 (1987) 37–52.
[3] D.M. Haaland, E.V. Thomas, Anal. Chem. 60 (1988) 1193–1202.
[4] P. Geladi, B.R. Kowalski, Anal. Chim. Acta 185 (1986) 1–17.
[5] T. Naes, T. Isaksson, NIR News 5 (1994) 4–11.
[6] H. Swierenga, P.J. de Groot, A.P. de Weijer, M.W.J. Derksen, L.M.C. Buydens, Chem. Intell. Lab. Syst. 41 (1998) 237–248.
[7] W. Windig, D.A. Stephenson, Anal. Chem. 64 (1992) 2735–2742.
[8] A.M.C. Davies, NIR News 4 (4) (1993) 10–11.
[9] M. Blanco, J. Coello, H. Iturriaga, S. Maspoch, C. de la Pezuela, Appl. Spectrosc. 51 (2) (1997) 240–246.
[10] R.J. Barnes, M.S. Dhanoa, S.J. Lister, Appl. Spectrosc. 43 (5) (1989) 772–777.
[11] T. Isaksson, T. Naes, Appl. Spectrosc. 42 (7) (1988) 1273–1284.
[12] M.S. Dhanoa, S.J. Lister, R. Sanderson, R.J. Barnes, J. Near Infrared Spectrosc. 2 (1994) 43–47.
[13] R. Gnanadesikan, Methods for Statistical Data Analysis of Multivariate Calibration, Wiley, New York, 1988.
[14] S. Wold, Chem. Intell. Lab. Syst. 14 (1992) 71–84.
[15] S. Wold, N. Kettaneh-Wold, B. Skagerberg, Chem. Intell. Lab. Syst. 7 (1989) 53–65.
[16] J. Zupan, J. Gasteiger, Anal. Chim. Acta 248 (1991) 1–30.
[17] B.J. Wythoff, Chem. Intell. Lab. Syst. 18 (1993) 115–155.
[18] B.G. Osborne, T. Fearn, Near Infrared Spectroscopy in Food Analysis, Wiley, New York, 1988.
[19] M. Blanco, J. Coello, H. Iturriaga, S. Maspoch, J. Pagès, Anal. Chim. Acta 384 (1999) 207–214.
