Data preparation
An odour sensor has a response characteristic that approximates to that of a
first-order instrument, i.e. a simple RC network. It is common to ignore the
transitory response because it is determined by the physical time to deliver
the odour to the sensor (i.e. transfer
function of the odour delivery system)
and the dynamics of the odour source.
In some applications it is possible to use an automated or robotic headspace injection system, and then the variations in the delivery system are sufficiently small to permit analysis of the transient signals.
The choice of the data pre-processing algorithm has been shown to affect significantly the performance of PARC (pattern recognition) methods. For example, in MOS odour sensors, various sensor response parameters x_ij have been used by workers, see Table 1. It is our view that the exact choice of parameter should depend upon the underlying sensor principle and the nature of the interfering signals. It has been shown that the fractional change in conductance, (G_odour − G_air)/G_air, helps to linearise the sensor output (with concentration) and to reduce its temperature sensitivity, thus improving the performance of chemometric [5] and neural network [6] techniques. The concentration-dependence of the odour sensor can be removed in linear sensors by normalising the sensor parameter according to:
$$x'_{ij} = \frac{x_{ij}}{\left[\sum_{i=1}^{n} x_{ij}^{2}\right]^{1/2}} \qquad (1)$$
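As an illustration, this preprocessing step can be sketched in C++ (the language of our simulator); the function and variable names here are illustrative rather than taken from the simulator itself:

#include <cmath>
#include <vector>

// Fractional change in conductance for one sensor:
// x = (G_odour - G_air) / G_air, which linearises the response
// with concentration and reduces temperature sensitivity.
double fractionalChange(double gOdour, double gAir)
{
    return (gOdour - gAir) / gAir;
}

// Normalise a sensor-response vector to unit Euclidean length
// (equation 1), removing the concentration dependence of a
// linear sensor and bounding each component in [0, 1] for
// non-negative responses.
void normalise(std::vector<double>& x)
{
    double sumSq = 0.0;
    for (double v : x) sumSq += v * v;
    const double norm = std::sqrt(sumSq);
    if (norm > 0.0)
        for (double& v : x) v /= norm;
}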
Feed-forward multilayer perceptron
An important class of neural network is
the multilayer feed-forward network
which typically consists of a set of
sensory units (note that in an electronic
nose there is a source or input node for
each odour sensor) that constitutes the input layer, one or more hidden layers of computation nodes, and an output layer. Each node forms a weighted sum v_k of its p inputs. Next the weighted sum is transformed by a non-linear activation function, usually a logistic or sigmoidal function, where the output is y_k:
$$y_k = \frac{1}{1 + \exp(-v_k)}, \qquad v_k = \sum_{l=0}^{p} w_{kl} x_l$$
[Table 1. Sensor response parameters x_ij: type (e.g. log-fractional) and formula]
$$\Delta w_{kl} = \eta \, \delta_k x_l$$
where the constant η is called the learning rate parameter (generally set to 1.0) and δ_k is the local gradient, given by the product of the output error with the gradient of the activation function (in most implementations an additional 'momentum' term is added to improve the learning process). The process is repeated for a number of steps until the network error has converged to a suitably small value. The network error is commonly defined as the sum of squared errors in the output layer, e.g. $E = \sum_k (d_k - y_k)^2$, where d_k is the desired output of node k.
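A minimal C++ sketch of one such delta-rule update for a single output node, with the momentum term omitted (the data layout is an assumption for illustration):

#include <cmath>
#include <cstddef>
#include <vector>

// Logistic activation: y = 1 / (1 + exp(-v)).
double sigmoid(double v) { return 1.0 / (1.0 + std::exp(-v)); }

// One delta-rule update for output node k:
//   v_k = sum_l w_kl x_l,  y_k = f(v_k),
//   delta_k = (d_k - y_k) f'(v_k),  w_kl += eta * delta_k * x_l.
void updateOutputNode(std::vector<double>& w,       // weights w_kl (w[0] = bias)
                      const std::vector<double>& x, // inputs x_l (x[0] = 1)
                      double d,                     // desired output d_k
                      double eta)                   // learning rate
{
    double v = 0.0;
    for (std::size_t l = 0; l < w.size(); ++l) v += w[l] * x[l];
    const double y = sigmoid(v);
    const double delta = (d - y) * y * (1.0 - y);   // local gradient
    for (std::size_t l = 0; l < w.size(); ++l) w[l] += eta * delta * x[l];
}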
[Figure: feed-forward multilayer perceptron, showing the input layer, first and second hidden layers, and the output layer]
ART1 algorithm
[Figure: ART1 network, showing the attentional subsystem (fields F1 and F2) and the orienting subsystem]

The network is characterised by the following parameters:

M : number of nodes on F1
N : number of nodes on F2
A1 : network parameter (A1 > 0)
C1 : network parameter (C1 > 0)
D1 : network parameter (D1 > 0)
B1 : network parameter (max(D1, 1) < B1 < D1 + 1)
L : network parameter (L > 1)
ρ : attentional vigilance parameter (0 < ρ < 1)

Bottom-up (F1 → F2): when F1 is active, each F2 node j receives the net input

$$T_j = \sum_{i=1}^{M} z_{ij} s_i \qquad (12)$$

and the node J with the largest net input is chosen: $T_J = \max\{T_k : k = 1, \ldots, N\}$. The bottom-up LTM traces z_ij obey

$$\frac{dz_{ij}}{dt} = \begin{cases} K\left[(1 - z_{ij})L - z_{ij}(|S| - 1)\right] & \text{if } i \in S \text{ and } f(x_j) = 1 \\ -K z_{ij} |S| & \text{if } i \notin S \text{ and } f(x_j) = 1 \\ 0 & \text{if } f(x_j) = 0 \end{cases} \qquad (13)$$

Top-down (F2 → F1): the top-down LTM traces z_ji obey

$$\frac{dz_{ji}}{dt} = \begin{cases} -z_{ji} + 1 & \text{if } i \in S \text{ and } f(x_j) = 1 \\ -z_{ji} & \text{if } i \notin S \text{ and } f(x_j) = 1 \\ 0 & \text{if } f(x_j) = 0 \end{cases} \qquad (14)$$

where S is the set of F1 nodes that remain active when the top-down template V(J) is read out. In fast learning the traces reach their asymptotic values:

$$z_{ji} = \begin{cases} 1 & \text{if } i \in S \\ 0 & \text{if } i \notin S \end{cases} \qquad (15)$$

$$z_{ij} = \begin{cases} \dfrac{L}{L - 1 + |S|} & \text{if } i \in S \\ 0 & \text{if } i \notin S \end{cases} \qquad (16)$$

The orienting subsystem resets the chosen node J whenever the match between the binary input I and the template V(J) fails the vigilance test, i.e. whenever

$$\frac{|I \cap V(J)|}{|I|} < \rho \qquad (17)$$
Once a node is reset, it remains inactive for the duration of the trial. The ART1 algorithm is described in Appendix A.
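The vigilance test and fast-learning update for the winning node J (equations 15 to 17) can be sketched in C++ as follows; the data layout is illustrative and not that of the original simulator:

#include <cstddef>
#include <vector>

// Apply the vigilance test (equation 17) to the winning node J and,
// on resonance, move its LTM traces to their fast-learning
// asymptotes (equations 15 and 16). Returns false if J is reset.
bool resonateOrReset(const std::vector<int>& input, // binary input I
                     std::vector<double>& zUp,      // bottom-up traces z_iJ
                     std::vector<int>& zDown,       // top-down template V(J)
                     double rho,                    // vigilance parameter
                     double L)                      // network parameter (L > 1)
{
    int inputNorm = 0, matchNorm = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        inputNorm += input[i];
        matchNorm += input[i] & zDown[i];           // |I ∩ V(J)|
    }
    if (inputNorm == 0 ||
        static_cast<double>(matchNorm) / inputNorm < rho)
        return false;                               // mismatch: reset node J

    // Resonance: S = I ∩ V(J); update the LTM traces.
    const int sNorm = matchNorm;                    // |S|
    for (std::size_t i = 0; i < input.size(); ++i) {
        const int inS = input[i] & zDown[i];
        zDown[i] = inS;                                   // equation 15
        zUp[i]   = inS ? L / (L - 1.0 + sNorm) : 0.0;     // equation 16
    }
    return true;
}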
ART2 algorithm
ART2 is a class of neural networks that can self-organise recognition categories for arbitrary sequences of analogue or binary inputs. Figure 4 shows the system studied here. ART2 contains all the features of the ART1 network: (i) an attentional subsystem, which contains an input representation field F1 and a category representation field F2; and (ii) an orienting subsystem, which resets the active F2 category whenever the match between the input and the category template fails the vigilance test.
[Figure 4: ART2 network, showing the attentional subsystem and the orienting subsystem]
SMART2
The semi-supervised adaptive resonance theory network (SMART2) is a modified version of ART2. It has all the properties of ART2 but is also capable of supervised learning. This is achieved by including two design principles. Firstly, in design principle 1 (DP1), when a training pattern is applied to the network, only categories of the same class are allowed to compete for it. This significantly reduces the problem of classifying similar training patterns from different classes to the same category. Thus, the network is much less likely to develop multiple-class categories. Secondly, in design principle 2 (DP2), high vigilance is used only for patterns which are difficult to classify. This overcomes a shortcoming of ART2, namely that the attentional vigilance parameter is not dynamically adjusted. In other words, if a high vigilance value is needed to distinguish patterns of certain classes, it will be retained and may result in an excessive sensitivity, and so an unnecessary expansion of the number of categories associated with patterns of a particular class. Hence, two new rules are added to the orienting subsystem. The first is to reset all F2 nodes with class tags other than the class of the current training pattern. The second is to track the misclassified training patterns during the test cycle and then to increase the vigilance level for just these patterns in the next training cycle. The only change to the attentional subsystem is in the category representation layer F2. SMART2 uses class information during its training to inhibit nodes associated with classes other than that of the current training pattern. As each new category is formed, it is tagged with the class of the training pattern that formed it. When a new pattern is applied during training, only those categories whose tag matches its class are allowed to compete for it.
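The two new rules can be sketched in C++ as follows (illustrative data structures only; an uncommitted node is marked here with the tag -1):

#include <cstddef>
#include <vector>

// DP1: before the F2 competition, reset every committed category
// whose class tag differs from the class of the current training
// pattern, so only same-class categories (and uncommitted nodes)
// may compete for it.
void resetOtherClasses(const std::vector<int>& classTag, // tag per F2 node (-1 = uncommitted)
                       std::vector<bool>& reset,         // reset flag per F2 node
                       int patternClass)
{
    for (std::size_t j = 0; j < classTag.size(); ++j)
        if (classTag[j] != -1 && classTag[j] != patternClass)
            reset[j] = true;
}

// DP2: use high vigilance only for the patterns that were
// misclassified in the last testing cycle; all other patterns
// keep the low base vigilance.
double vigilanceFor(bool misclassifiedLastCycle, double baseRho, double highRho)
{
    return misclassifiedLastCycle ? highRho : baseRho;
}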
Table 2. The 12 commercial gas sensors in the array

Sensor No   Type
1           TGS 812
2           TGS 815
3           TGS 711
4           TGS 882
5           TGS 813
6           TGS 816
7           TGS 813
8           TGS 816
9           TGS 814
10          TGS 817
11          TGS 824
12          TGS 825
Experimental procedures
Data were collected from an electronic nose which consisted of an array of 12 commercial metal oxide semiconducting gas sensors (Figaro Engineering Inc., Japan). Table 2 is a list of the 12 sensors.
The response x from the sensors was defined as the (steady-state) fractional change in conductance, as this was found to reduce errors. The data were also normalised according to equation 1 to set the range of x to [0, 1].
Two electronic nose data-sets were analysed. The first set consisted of simple odours: 5 ppm in air of methanol, ethanol, butan-1-ol, propan-2-ol, and 2-methyl-1-butanol. The process was repeated to provide 8 identical samples of each of the 5 classes, making a set of 40 input vectors [11]. The second set consisted of the headspace of three different roasted coffees. Two of the coffees were of the same roasting level but different blends, while the third was of a different roast level but the same blend as one of the first two coffees. This was a much more difficult problem to solve because the headspace of coffee forms a complex odour. The procedure was repeated to produce 30 samples of the first coffee, 29 samples of the second and 30 samples of the third, giving a total of 89 input vectors [12].
The ART algorithms were written in
C++ so that the software simulator could
be run under UNIX or DOS.
Results
The ART1 simulator was applied to the alcohol data-set using a one-fold validation procedure [13]. The input vectors were converted into their binary equivalent using a Gray code technique to ensure that the Hamming distance between successive numbers was one. In these tests the vigilance parameter ρ was set to 0.65.
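Such a conversion can be sketched in C++ as follows; the quantisation to a fixed number of bits is an assumption, as only the Gray coding itself is described above:

#include <cstdint>

// Gray-code an unsigned integer so that successive values differ
// in exactly one bit (Hamming distance of one).
std::uint32_t toGray(std::uint32_t n) { return n ^ (n >> 1); }

// Quantise a normalised response x in [0, 1] to 'bits' bits
// (bits < 32) and return its Gray-coded binary equivalent.
std::uint32_t grayEncode(double x, int bits)
{
    const std::uint32_t maxVal = (1u << bits) - 1u;
    const std::uint32_t q = static_cast<std::uint32_t>(x * maxVal + 0.5);
    return toGray(q);
}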
Table 3 summarises the performance of the ART1 network as applied to classify the alcohol samples.
The classification rate of 50% is rather disappointing and is believed to be due to the loss of crucial features in the conversion of the analogue data into its binary equivalent.
Table 3. Performance of the ART1 network on the alcohol data-set

Sample No.   No. of categories in training   No. misclassified in testing
1            12                              3
2            12                              3
3            11                              1
4            13                              2
5            10                              1
6            12                              1
7            13                              2
8            12
Table 4. Performance of the ART2 network on the alcohol and coffee data-sets

             ALCOHOL                                      COFFEE
Sample No.   Categories in   Misclassified   Categories in   Misclassified
             training        in testing      training        in testing
1            12              0               35              6
2            12              0               35              4
3            12              0               36              5
4            12              0               35              3
5            12              0               31              6
6            12                              34              4
7            12                              34              4
8            12                              34              3
9                                            35              4
10                                           32              4
Table 5. Performance of ART2 network on the alcohol data-set before and after the ethanol input vectors are introduced

                      Categories established      Categories classified
                      in training                 in testing
Odour                 Before      After           Before      After
Methanol              0,6         0,6             0
Butan-1-ol            1,4         1,4
Propan-2-ol           2,5,7       2,5,7           2           2
2-Methyl-1-butanol    3,8         3,8             3           3
Ethanol               none        9,10,11         None        9
Table 6. Classification accuracy and number of categories formed on the two data-sets

             Accuracy (%)           No. of categories
Algorithm    Alcohol    Coffee      Alcohol    Coffee
ART2         92.5       52          12         35
SMART2-I     97.5       75          12         55
SMART2-II    97.5       60          6          38
SMART2-III   97.5       80          6          44
Appendix A. ART1 algorithm

START
Set vigilance ρ;
FOR (each training pattern)
WHILE (a training pattern is applied to network) DO
F0 and F1 processing;
Apply outputs of F1 to bottom-up adaptive filter;
F2 processing (choose the winning node J or add a new node);
Send top-down expectation from node J through the top-down adaptive filter;
IF (degree of match between bottom-up inputs and top-down expectation > ρ)
THEN Adjust the LTM trace to make the bottom-up and top-down adaptive filters look more like the training pattern;
IF (the number of weight adjustments is exceeded) remove the pattern from the network;
ELSE Activate the orienting subsystem to reset the Jth node in F2 (i.e. inhibit it from competing again while the current training pattern remains on the network);
END
Appendix B. SMART2 algorithm

START
1. Set ρ = low vigilance (usually ρ = 0.0);
2. FOR (each training pattern)
WHILE (a training pattern is applied to network) DO
IF (the network misclassified the pattern on the last testing cycle) THEN increase vigilance (usually to ρ = 0.9999);
F0 and F1 processing;
Apply outputs of F1 to bottom-up adaptive filter; Reset all F2 nodes that are associated with a class that differs from the class of the current training pattern;
F2 processing (choose the winning node J or add a new node);
Send top-down expectation from node J through the top-down adaptive filter;
IF (degree of match between bottom-up inputs and top-down expectation > ρ)
THEN
Adjust the LTM trace to make the bottom-up and top-down filters look more like the training pattern;
IF (Jth node is an uncommitted node) tag it with the class of the current training pattern;
IF (the number of weight adjustments is exceeded) remove the pattern from the network;
ELSE
Activate the orienting subsystem to reset the Jth node in F2 (i.e. inhibit it from competing again while the current training pattern remains on the network);
3. FOR (each training pattern)
Apply each training pattern to the new version of the network, allowing any category to win. See if the class of the pattern is the same as the tag of the category it mapped to.
4. Repeat Step 1 to Step 3 until all patterns are classified correctly in Step 3.
END