Fig. 2. The mother wavelet used in our study. This is the first member of the Daubechies family of wavelets, also known as Haar.

Fig. 3. A 25-kHz Sallen-Key bandpass filter used in our study.

optimal features for neural network training. This issue is discussed further in Section IV.

As mentioned before, the main advantage of using the wavelet transform is to reduce the number of inputs to the neural network. This is accomplished by the downsampling operation performed in wavelet analysis, which causes the number of wavelet coefficients to be reduced by a factor of two at each level of approximation or detail. Since this decomposition is hierarchically organized, we can search for an appropriate collection of these coefficients that remain distinct across fault classes. The selected coefficients are further preprocessed by PCA and normalization to prepare the inputs to the neural network. The coefficient selection process will be discussed in Section IV.

B. PCA and Data Normalization

Principal component analysis (PCA) is a preprocessing technique which can significantly reduce the complexity of the neural networks employed in fault classification problems. PCA achieves this goal by further reducing the dimensionality of the input space after wavelet analysis while preserving as much of the relevant information as possible for fault classification. Even in cases where the input space is not high-dimensional, PCA can be a very useful preprocessing technique for selecting optimal features that minimize classification errors. In summary, the aim of PCA is to map vectors x in a d-dimensional input space (x_1, ..., x_d) onto vectors z in an M-dimensional space (z_1, ..., z_M) in such a way that the loss of essential information and data variation needed for classification is minimized while keeping M < d. To apply PCA to a typical input-space vector x, we first write x as a linear combination of a set of d orthonormal vectors u_i in the form

    x = Σ_{i=1}^{d} z_i u_i.    (3)

Data normalization is also needed in cases in which two or more input features may differ by several orders of magnitude. These large variations in feature sizes can dominate more important but smaller trends in the data and should be removed through normalization. Our studies have shown that normalizing feature vectors selected by PCA to have zero mean and unit standard deviation can make the neural network training phase much more efficient.

The neural network selected for this study is a multilayer feedforward neural network trained by backpropagation [9]. Our work shows that through preprocessing of the output signals from an analog circuit, we can select an optimal number of features (inputs) for the neural network. This consequently minimizes the size of the neural network, reducing its training time and improving its performance. The wavelet coefficients not selected as features to train the neural network are discarded, since they are irrelevant to distinguishing among fault classes.

III. SAMPLE CIRCUITS AND FAULTS

The two circuits studied in our work are the same as those in [1] and are shown in Figs. 3 and 5. The first circuit is the Sallen-Key bandpass filter [10]. The nominal values for the components, which result in a center frequency of 25 kHz, are shown in the figure. The resistors and capacitors are assumed to have tolerances of 5% and 10%, respectively.^1 The primary motivation for selecting this filter and its associated faults described later in this section is to compare our results with those in [1]. The same circuit and faults are considered in this reference, with the impulse response sampled and fed directly to a neural network without any preprocessing. Comparison of the two case studies demonstrates the importance of preprocessing in neural network based diagnostic systems, which leads to neural networks having simpler architectures and improved performance.

The impulse responses of the circuit in Fig. 3, with R3, C2, R2, and C1 varying within their tolerances, belong to the no-fault class (NF).
Fig. 4. Linear plot (value versus index) of the first element of levels 1-5 approximation coefficients for the circuit shown in Fig. 3. Fault classes in each plot are in the order C1*, C1+, C2*, C2+, NF, R2*, R2+, R3*, and R3+ and are separated from each other by vertical lines. Within a fault class, 50 typical values are shown.

that the output generated by such a pulse is adequate to classify all faults considered here.

The second circuit studied in our work and in [1] is more complicated and is shown in Fig. 5. This is a two-stage four-op-amp biquad low-pass filter [10]. The work in [1] uses two ambiguity groups of seven and eight faulty components. The first group contains C1*, C3*, C4*, R16*, R19*, R21+, and R22*, and the second group consists of C2*, R17+, R3*, R4+, R6*, R7+, R8*, and R9*. The faulty component values used in our work and in [1] are shown in Table I, where * and + imply significantly higher and lower than nominal values. These two ambiguity groups are defined in [1] based on the degree of similarity between circuit outputs belonging to different fault classes. The measure used in this reference to obtain the degree of similarity is the K1 criterion defined by

    K1 = R_xy / (R_xx + R_yy - R_xy)    (6)

where

    R_xx = (1/T) ∫_0^T x(t) x(t) dt    (7)

and

    R_xy = (1/T) ∫_0^T x(t) y(t) dt.    (8)

In these equations, x(t) and y(t) are the signals whose correlation is under investigation and T is the period. If this measure indicates high correlation between outputs, the associated faults are placed in the same ambiguity group. Through our proposed preprocessing techniques, we show that it is possible to select an optimal number of features that leads to successful classification of the individual faults presented in Table I, except for one ambiguity group containing R6*, R7+, and R9*. The reason these faults belong to the same ambiguity group is that they produce very similar outputs, as discussed in Section V.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: ANALOG AND DIGITAL SIGNAL PROCESSING, VOL. 47, NO. 2, FEBRUARY 2000, p. 155

Fig. 5. The two-stage four-op-amp biquad low-pass filter used in our study. All resistors are in ohms.

IV. WAVELET COEFFICIENT (FEATURE) SELECTION

As indicated before, we have used the Haar wavelet function in this work since it gives the most distinct features across fault classes. This becomes clear during the training phase, when the network must meet a reasonable error goal to warrant good generalization. Among the many wavelets we examined, only the Haar function achieved our prespecified mean-square error goal of 0.01. The wavelet properties that can provide insight into the appropriateness of this function are support and regularity. Support is a measure of the duration of a wavelet in the time domain, and regularity is related to its ability to correlate to smooth functions [11]. Since the Haar function has compact support and a regularity of zero because of its discontinuous nature, it is very well suited to extract features from signals characterized by short durations and swift variations. When an input pulse is applied to our circuits, it is the localized behavior of the output signal at its onset that carries the distinct features required to classify faults. This localized, short-duration behavior at the onset of the output signal makes the Haar wavelet appropriate for this application.^2

^2 To demonstrate the significance of the localized behavior of the outputs in this application, we have examined these signals, which typically last 1 ms. Extracting features from the first 0.2 ms of the output signals is sufficient to train the neural network and achieve a mean-square error goal of 0.01. Training the neural network on features extracted from the last 0.8 ms of the output signals leads to a mean-square error which is several orders of magnitude higher. This proves that the onset of the output signal has the necessary information to distinguish among fault classes, which is best analyzed by the Haar function.

The general guidelines for selecting wavelet coefficients as features to train the neural network are as follows:

1) Depending on the nature of the signal, one needs to go up to sufficiently high levels of approximation and detail to expose the low- and high-frequency features of the signal, respectively. One can then use principal component analysis to further reduce the number of features and/or enhance their distinctiveness across fault classes.

2) The approximation and detail coefficients reflect the low- and high-frequency contents of a signal, respectively. The low-frequency contents usually give a signal its basic structure, while the high-frequency contents provide its details. As a result, the main features of a signal are usually captured in the approximation coefficients.

Using these guidelines, we have selected the first coefficient of approximation levels 1-5, associated with the impulse response of the filters, as our features. Training the neural network on these features meets our prespecified error goal of 0.01, implying that they are distinct across fault classes. It is possible to demonstrate graphically the role these features play in distinguishing among fault classes. Consider the NF and eight faulty classes associated with the sample circuit in Fig. 3. For each of the nine classes, 50 impulse responses are generated by varying components within their tolerances as described in Section III. The selected approximation coefficients associated with these impulse responses are shown in Fig. 4. In this figure, each plot corresponds to a single coefficient, with the x-axis ranging from 1 to 450 to cover nine intervals of size 50, separated by vertical lines, corresponding to each class. Fault classes in each plot are presented in the order C1*, C1+, C2*, C2+, NF, R2*, R2+, R3*, and R3+. We can examine Fig. 4 to determine how fault classes are distinguished by features. For instance, feature 1 can separate NF, R3*, and R3+, while NF and R2* can be distinguished by features 2 or 3. The effectiveness of these five features in classifying the faults associated with the sample circuits will be discussed in the next section. It is important to note that feature selection is a critical and intricate task in analog fault diagnosis or any other neural network based system. However, this task needs to be carried out only once in a given application.

V. RESULTS

In this section, we compare the size and performance of our neural networks with those in [1] to show the significance of the proposed preprocessing techniques. To perform a diagnosis of the faults described in Section III for the Sallen-Key bandpass filter, the work presented in [1] requires a three-layer backpropagation neural network. This network has 49 inputs, 10 first-layer, and 10 second-layer neurons, resulting in a total of about 700 adjustable parameters. During the training phase, an error function of these parameters must be minimized to obtain the optimal weight and bias values. Their trained network was able to properly classify 95% of the test patterns. Our backpropagation neural network has two layers, with 5 inputs, 7 neurons in layer 1, and 1 neuron in the output layer. The total number of adjustable parameters in our network is about 45, with the reduction in the number of weights and biases directly translating to shorter training time and better performance. Our trained network is capable of 100% correct classification of our test data.

The real advantage of preprocessing becomes evident when applied to the more complex low-pass filter shown in Fig. 5. As mentioned before, the work in [1] assigns all faults to one of two ambiguity groups. The size of the neural network is not specified in their work, and they achieve a 100% correct classification of the test data as belonging to one of the two ambiguity groups. We have successfully classified all the faults in Table I except three (R6*, R7+, R9*), which are placed in one ambiguity group. The backpropagation neural network trained for this problem has 5 inputs, 20 neurons in hidden layer 1, and 1 neuron in the output layer. This neural network correctly classifies 99.3% of the test data. Due to the effective preprocessing of the low-pass filter's output, the neural network architecture used for this complicated circuit is very simple, implying fast and efficient training and superior performance.^3

^3 It is important to note that the work in [1] is a black-box approach to fault diagnosis and does not analyze the circuit characteristics, except for a simple signal similarity metric. This is intended to move the test/diagnosis objective close to the goal of a built-in self-test module. However, in light of the advantages of the proposed preprocessing techniques, it is beneficial to add a preprocessing unit to this module.

As mentioned before, the reason for R6*, R7+, and R9* belonging to the same ambiguity group is that they create very similar outputs. This is evident from an analysis of the low-pass filter, which gives

    dv_o1/dt = (R7 / (R6 R9 C2)) v1.    (9)

For these three fault classes, the expression in (9) remains relatively unchanged for the faulty component values shown in Table I. As a result,
these three faults must be placed in the same ambiguity group if the circuit output is the only measure to determine fault classes. Training the neural network on more than one node voltage or using testability analysis as a preprocessor [12] can further resolve the ambiguity groups. For instance, if the training data for the low-pass filter includes features associated with the v2 node voltage, our neural network can classify the R9* fault correctly and the ambiguity group reduces to R6* and R7+.

TABLE I
FAULT CLASSES USED FOR THE TWO-STAGE FOUR-OP-AMP BIQUAD CIRCUIT. THE NOMINAL AND FAULTY VALUES ARE ALSO SPECIFIED

REFERENCES

[1] R. Spina and S. Upadhyaya, "Linear circuit fault diagnosis using neuromorphic analyzers," IEEE Trans. Circuits Syst. II, vol. 44, pp. 188-196, 1997.
[2] H. Spence, "Automatic analog fault simulation," in Proc. AUTOTESTCON'96 Conf., pp. 17-22.
[3] R.-W. Liu, Testing and Diagnosis of Analog Circuits and Systems. New York: Van Nostrand, 1991.
[4] Selected Papers on Analog Fault Diagnosis. New York: IEEE Press, 1987.
[5] J. W. Bandler and A. Salama, "Fault diagnosis of analog circuits," in Proc. IEEE, 1985, pp. 1279-1325.
[6] M. Catelani and M. Gori, "On the application of neural networks to fault diagnosis of electronic analog circuits," Measurement, vol. 17, pp. 73-80, 1996.
[7] G. Strang and T. Nguyen, Wavelets and Filter Banks. Cambridge, MA: Wellesley-Cambridge Press, 1996.
[8] H. W. Lee, Wavelet Gri-Minace Filter for Rotation-Invariant Pattern Recognition. Philadelphia, PA: SPIE, 1994, vol. 2762, pp. 343-352.
[9] M. H. Hassoun, Fundamentals of Artificial Neural Networks. Cambridge, MA: MIT Press, 1995.
[10] M. V. Valkenburg, Analog Filter Design. New York: Oxford Univ. Press, 1982.
[11] B. B. Hubbard, The World According to Wavelets. Boston, MA: Peters, 1996.
[12] G. J. Hamink, B. W. Meijer, and H. G. Kerkhoff, "Testability analysis of analog systems," IEEE Trans. Computer-Aided Design, vol. 9, pp. 573-583, June 1990.
[13] C. M. Bishop, Neural Networks for Pattern Recognition. New York: Oxford Univ. Press, 1995.
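For readers who wish to reproduce the similarity measure of Section III, the K1 criterion of (6)-(8) can be sketched numerically as below. This is a discrete-time approximation of the correlation integrals; `k1_criterion` is our own illustrative helper, not code from the paper.

```python
import numpy as np

def k1_criterion(x, y, dt=1.0):
    """Discrete approximation of K1 = Rxy / (Rxx + Ryy - Rxy),
    where Rxx, Ryy, Rxy are the time-averaged auto- and
    cross-correlation integrals of eqs. (7) and (8)."""
    T = len(x) * dt
    rxx = np.sum(x * x) * dt / T
    ryy = np.sum(y * y) * dt / T
    rxy = np.sum(x * y) * dt / T
    return rxy / (rxx + ryy - rxy)
```

As a sanity check, K1 equals 1 for identical signals and is near 0 for signals that are orthogonal over the observation window, which matches its use as a similarity measure for grouping faults with near-identical outputs.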
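The network sizes quoted in Section V can be checked with a simple parameter count for a fully connected feedforward network. The helper below is ours; the exact totals depend on how biases are tallied and, for the network of [1], on the number of output neurons, which the excerpt does not state (we assume nine outputs, one per fault class of the Sallen-Key circuit, purely for illustration).

```python
def mlp_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected feedforward network:
    each layer contributes (fan_in + 1) * fan_out parameters,
    e.g. layer_sizes=[5, 7, 1] for 5 inputs, 7 hidden, 1 output."""
    return sum((fan_in + 1) * fan_out
               for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]))
```

With these assumptions, our [5, 7, 1] network gives 50 parameters, consistent with the paper's "about 45," while a [49, 10, 10, 9] network gives 709, consistent with "about 700."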