
Mechanical Systems and Signal Processing (1997) 11(1), 149–167

GEAR CRACK DETECTION BY ADAPTIVE AMPLITUDE AND PHASE DEMODULATION
D. B, M. T, H. O  A. R
Centre de Recherche en Automatique de Nancy, CNRS URA 821,
Université Henri Poincaré, Nancy 1, B.P. 239, 54506 Vandoeuvre Cedex, France

(Received January 1996, accepted September 1996)

A new method for gear crack detection is presented. It consists of the coupling of
adaptive demodulation with an abrupt change detector. The adaptive algorithm is intended
to account for the slow variations of the signal. Two methods for this adaptive
demodulation are proposed: an RLS approach based on a linear version of the signal model
and an LMS approach which directly estimates the physical parameters (amplitude and
phase) for each harmonic considered. Their respective advantages and disadvantages are
discussed and the superiority over the Hilbert transform approach is shown. The crack
detection is formulated as an abrupt change detection problem solved by the sequential
monitoring of the prediction error. Its effectiveness is demonstrated through the application
of the demodulation/detection algorithm to a gearbox under reliability tests.
© 1997 Academic Press Limited

1. INTRODUCTION
Perfect toothed-gear vibration signals (identical teeth with constant spacing) are periodic
with period $T_{eng} = 1/f_{eng}$, where $f_{eng}$ is the meshing frequency (product of the number of teeth
and the shaft rotation frequency $f_r$). Due to non-linearities in the meshing process, the
spectrum of the signal contains not only the fundamental, but also harmonics of the
meshing frequency [1]. Nevertheless, real gears are not perfect. Non-constant tooth spacing
results in a contact point not strictly on the pitch circle and may cause frequency
modulations around the nominal value $f_{eng}$, while irregularities of the tooth contact surfaces
cause mechanical load variations which show up in the vibration signal as amplitude
modulations [3]. Such amplitude and phase modulations are also observed with wear
(increase) or grinding (decrease).
A localised fault, such as a gear crack, also involves these two kinds of modulation, but
while the first kind of fault leads to slow variations, a gear crack results in abrupt variations
of the amplitude and phase at the wheel-cycle scale. This transitory event is $1/f_r$-periodic,
where $f_r$ is the rotation frequency of the faulty wheel. This periodic non-stationarity may
be seen as a cyclostationarity [4], depending on the observation scale.
The model corresponding to these phenomena is given by:
$x(t) = \sum_{k=1}^{m} a_k(t)\,\cos\bigl(2\pi k f_{eng}\, t + \varphi_k(t)\bigr) + e(t)$   (1a)
where $m$ is the number of harmonics considered and $e(t)$ is an additive white noise. The
complex signal $y(t)$ associated with $x(t)$ is the following:
$y(t) = \sum_{k=1}^{m} a_k(t)\, e^{\,j(2\pi k f_{eng}\, t + \varphi_k(t))} + e(t)$   (1b)
where, in this case, $e(t)$ stands for a complex additive white noise, and $a_k(t)$, $\varphi_k(t)$ are the
amplitude and phase modulation laws. These laws are $1/f_r$-periodic, but at the scale of one
wheel rotation they have to be considered as random (see [2, 4]). As a consequence, the
adequate framework for studying such phenomena is the class of random amplitude and
phase modulation processes.
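As an illustration, a discrete record obeying model (1a) can be simulated with a few lines of NumPy; the modulation laws, harmonic count, meshing frequency and noise level below are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def gear_signal(n_samples, f_eng=0.12, m=3, noise_std=0.05, seed=0):
    """Synthesise a discrete signal following model (1a): a sum of m meshing
    harmonics with slowly varying amplitude and phase modulation laws a_k[n],
    phi_k[n], plus additive white noise.  All numerical values are illustrative."""
    rng = np.random.default_rng(seed)
    n = np.arange(n_samples)
    x = np.zeros(n_samples)
    for k in range(1, m + 1):
        # slow (narrow-band) modulation laws standing in for wear/misalignment effects
        a_k = 1.0 / k + 0.1 * np.cos(2 * np.pi * 0.001 * n + rng.uniform(0, 2 * np.pi))
        phi_k = 0.2 * np.sin(2 * np.pi * 0.0015 * n + rng.uniform(0, 2 * np.pi))
        x += a_k * np.cos(2 * np.pi * k * f_eng * n + phi_k)
    return x + noise_std * rng.standard_normal(n_samples)

x = gear_signal(4096)   # one synthetic record, sampling frequency normalised to 1
```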
The detection of gear cracks has motivated many works [5, 6, 7]. Among the numerous
possible approaches, one of the most promising is to monitor the amplitude and phase
modulations. To do this, the classical approach is based upon the Hilbert transform of
the real bandpass filtered signal, where instantaneous amplitude and phase are estimated
as the instantaneous amplitude and phase of the analytic signal and the detection is
performed by comparison of these instantaneous values to an adequate threshold.
The approach proposed here is quite different. Indeed, non-stationarities expressed in
model (1a, b) through time-varying amplitude and phase modulations are of two kinds:
slow non-stationarities induced by distributed faults (wear, shaft misalignment); and
abrupt non-stationarities representative of localised faults (cracks). In this framework, an
adaptive algorithm is used to estimate the slowly time-varying amplitude and phase. A
sequential abrupt change detection algorithm is then coupled to the adaptive algorithm
to decide whether a crack is present or not.
The organisation of the paper is as follows. In the next section, we present results about
random amplitude and phase modulations. More particularly, we focus on the equivalent
spectral bandwidth of such processes. In Section 3, two different approaches for the
adaptive demodulation are presented. They are based respectively on a recursive least
squares and a least mean squares algorithm, each depending on a different discrete version
of model (1a). The choice of the design variables is addressed and the results of Section
2 are used to discuss the adequacy of the different possible methods, including the Hilbert
transform. Then, in Section 4, the question of crack detection is discussed. It is treated
as an abrupt change detection problem which is solved by the sequential monitoring of
the prediction error provided by the adaptive algorithm. Finally, in Section 5, the good
performance achieved with the proposed demodulation/detection algorithm is
demonstrated on a real process, consisting of a car gearbox under reliability tests. Its
application to optimal sensor location is then considered.

2. SPECTRAL BANDWIDTH OF RANDOM AMPLITUDE AND PHASE MODULATIONS


The aim of this section is to present some important results about the influence of the
spectral bandwidths of the random modulation laws on the spectral bandwidth of the
modulated process. For detailed derivation of these results see Appendix A (random
amplitude modulation) and Appendix B (random phase modulation).

2.1. RANDOM AMPLITUDE MODULATION


Let {a(t) } be a random process, which is assumed to have zero mean. We consider the
new random process defined by:
$y(t) = (a_0 + \alpha\, a(t))\, e^{\,j(\omega_0 t + \varphi_0)}$   (2)
where $a_0$, $\alpha$ and $\omega_0$ are real constants, and $\varphi_0$ is the initial phase, uniformly distributed over the
interval $[0, 2\pi[$.
This is a simplified version of model (1b) because it includes only one component and
has constant phase. The autocorrelation function (ACF) of {y(t) } may be expressed as a
function of the ACF of {a(t) }:
$R_y(\tau) = [a_0^2 + \alpha^2 R_a(\tau)]\, e^{\,j\omega_0\tau}$   (3)
   151
The power spectral density (PSD) is then obtained as the Fourier transform of the ACF:
$|Y(\omega)|^2 = a_0^2\,\delta(\omega - \omega_0) + \alpha^2\, |A(\omega - \omega_0)|^2, \qquad |A(\omega)|^2 = \mathrm{FT}(R_a(\tau))$   (4)
The equivalent spectral bandwidth is defined as:
$\Delta\omega_y^2 = \dfrac{\displaystyle\int_{-\infty}^{+\infty} \omega^2\, |Y(\omega)|^2\, d\omega}{\displaystyle\int_{-\infty}^{+\infty} |Y(\omega)|^2\, d\omega}$   (5)
where, for notational simplification, the mean frequency is taken equal to 0 (this is always
the case for real modulation laws). Elementary calculations lead to:
$\Delta\omega_y = \Delta\omega_a$   (6)
The spectral bandwidth of {y(t) } is the same as that of {a(t) } and it does not depend on the
amplitude modulation rate. The latter only acts on the power of the random process.
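Result (6) can be checked numerically. The sketch below is only an illustration under assumed settings (an arbitrary band-limited modulation law and carrier): it estimates the equivalent bandwidth of eq. (5) from averaged periodograms and applies it to the modulated term of eq. (2) for several modulation rates; the estimate stays essentially equal to the bandwidth of {a(t)} whatever the value of α.

```python
import numpy as np

def eq_bandwidth(x, nfft=1024):
    """Equivalent spectral bandwidth of eq. (5): square root of the second moment
    of the PSD about its mean frequency, the PSD being estimated by averaging
    periodograms of successive segments."""
    segs = x[: (len(x) // nfft) * nfft].reshape(-1, nfft)
    psd = np.mean(np.abs(np.fft.fft(segs, axis=1)) ** 2, axis=0)
    w = 2 * np.pi * np.fft.fftfreq(nfft)            # normalised angular frequency
    w_mean = np.sum(w * psd) / np.sum(psd)
    return np.sqrt(np.sum((w - w_mean) ** 2 * psd) / np.sum(psd))

rng = np.random.default_rng(1)
n = 2 ** 16
a = np.convolve(rng.standard_normal(n), np.ones(32) / 32, mode="same")  # band-limited law a[n]
a -= a.mean()
w0 = 2 * np.pi * 0.25
print("modulation law a:", eq_bandwidth(a))
for alpha in (0.1, 0.5, 2.0):
    y_mod = alpha * a * np.exp(1j * w0 * np.arange(n))   # modulated term of eq. (2)
    print(f"alpha = {alpha}:", eq_bandwidth(y_mod))       # ~ independent of alpha
```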

2.2. RANDOM PHASE MODULATION


Let $\{\varphi(t)\}$ be a random process, which is assumed to have zero mean. We consider the
new random process defined by:
$y(t) = A\, e^{\,j(\omega_0 t + \alpha\varphi(t) + \varphi_0)}$   (7)
where $A$, $\alpha$ (modulation rate) and $\omega_0$ (carrier frequency) are real constants, and $\varphi_0$ is the initial
phase, uniformly distributed over the interval $[0, 2\pi[$. Once again, this is a simplified version
of model (1b) because it includes only one component and has constant amplitude ($m = 1$,
$a_1(t) \equiv A$).
The ACF of $\{y(t)\}$ may be derived explicitly if the probability density function of $\{\varphi(t)\}$
is Gaussian. In this case, it is given by:
$R_y(\tau) = A^2\, e^{-\alpha^2}\, e^{\alpha^2 R_\varphi(\tau)}\, e^{\,j\omega_0\tau}$   (8)
and the PSD is:
$|Y(\omega)|^2 = A^2\, e^{-\alpha^2}\, |Z(\omega)|^2 \otimes \delta(\omega - \omega_0) = A^2\, e^{-\alpha^2}\, |Z(\omega - \omega_0)|^2$   (9)
$|Z(\omega)|^2 = \mathrm{FT}\bigl(e^{\alpha^2 R_\varphi(\tau)}\bigr) = \int_{-\infty}^{+\infty} e^{\alpha^2 R_\varphi(\tau)}\, e^{-j\omega\tau}\, d\tau$

Under the supplementary assumption that the ACF of $\{\varphi(t)\}$ is also Gaussian, the equivalent
spectral bandwidth of $\{y(t)\}$ is simply given by:
$\Delta\omega_y^2 = \alpha^2\, \Delta\omega_\varphi^2, \qquad \Delta\omega_y = \alpha\, \Delta\omega_\varphi$   (10)
The interpretation of this result is that, in the case of a gaussian ACF, the spectral
bandwidth increases linearly with the modulation rate. However, the latter does not change
the power of the random process {y(t) }.
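Result (10) can be illustrated in the same way. The sketch below builds a Gaussian phase process with an (approximately) Gaussian ACF by Gaussian filtering of white noise and shows that the measured bandwidth of the modulated process grows roughly in proportion to the modulation rate α. It reuses the eq_bandwidth helper from the sketch of Section 2.1, and all numerical values are illustrative.

```python
import numpy as np

# assumes eq_bandwidth() from the Section 2.1 sketch is in scope
rng = np.random.default_rng(2)
n = 2 ** 16
# Gaussian phase process with an approximately Gaussian ACF: white noise
# filtered by a Gaussian impulse response, then normalised to unit variance.
h = np.exp(-0.5 * (np.arange(-64, 65) / 16.0) ** 2)
phi = np.convolve(rng.standard_normal(n), h, mode="same")
phi /= phi.std()
w0 = 2 * np.pi * 0.25
for alpha in (0.5, 1.0, 2.0):
    y = np.exp(1j * (w0 * np.arange(n) + alpha * phi))   # model (7) with A = 1
    print(f"alpha = {alpha}:", eq_bandwidth(y))          # grows roughly linearly with alpha
```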
2.3. DISCUSSION
The extension of the previous results to simultaneous amplitude and phase modulations
seems to be tractable, but has not been treated yet. However, according to the previous
results, an important, but not often mentioned, limitation of the Hilbert transform
approach may be pointed out. Indeed, it makes use of only a single filter, whereas two filters
with different bandwidths could be used to obtain more accurate estimates (one for the
amplitude and one for the phase). We will return to this point in Section 3.3.

3. AN ADAPTIVE APPROACH TO AMPLITUDE AND PHASE DEMODULATION


From now on, we will only consider the discrete version of model (1a), that is:
$x_n = \sum_{k=1}^{m} a_{k,n}\,\cos(2\pi k f_{eng}\, n + \varphi_{k,n}) + e_n$   (11)

where, for notational simplicity, it is assumed that the sampling frequency equals 1, that is to
say that $f_{eng}$ is the normalised meshing frequency. As pointed out before, two types of
non-stationarities are included in model (11): slow non-stationarities are intended to
account for the so-called distributed faults, while abrupt non-stationarities can account for
localised faults, say gear cracks. In this section, we focus on the adaptive estimation of
the slowly varying amplitude and phase. Model (11) may be exploited in two ways. A first
possible approach is to work with the linear version of (11) for which classical recursive
least squares (RLS) techniques are available. A second possible approach directly estimates
the parameters of model (11), namely amplitudes and phases. For this non-linear
estimation, a stochastic gradient (or least mean squares) method is suggested.

3.1. THE RLS APPROACH


The previous model may be written in the Fourier basis as:
$x_n = H_n^T \theta_n + e_n$
$H_n = [\cos(2\pi f_{eng} n)\ \ \sin(2\pi f_{eng} n)\ \ \cdots\ \ \cos(2\pi m f_{eng} n)\ \ \sin(2\pi m f_{eng} n)]^T$   (12)
$\theta_n = [a_{1,n}\cos\varphi_{1,n}\ \ a_{1,n}\sin\varphi_{1,n}\ \ \cdots\ \ a_{m,n}\cos\varphi_{m,n}\ \ a_{m,n}\sin\varphi_{m,n}]^T$
where $\theta_n$ is the parameter vector to be estimated. The model being linear with respect to the
parameter, the estimation may be achieved by the RLS algorithm with forgetting factor [8]:

$K_n = P_{n-1} H_n\,(\lambda + H_n^T P_{n-1} H_n)^{-1}$
$e_n = x_n - H_n^T \hat{\theta}_{n-1}$
$\hat{\theta}_n = \hat{\theta}_{n-1} + K_n e_n$   (13)
$P_n = \lambda^{-1}\,(I - K_n H_n^T)\, P_{n-1}$

The forgetting factor $\lambda$ is introduced to enable the tracking of the time-varying parameters.
Its choice will be discussed in Section 3.3. The main advantage of this approach lies in
the linearity of the model used, while its main drawback is that it does not give direct
access to the physical parameters, namely the amplitudes and phases of the different
components of the signal under study. In particular, unwrapping techniques have to be
employed to obtain the phase, which can make the analysis more difficult.
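A minimal implementation of the RLS demodulation of eqs. (12)-(13) could look as follows; the initialisation constants are illustrative assumptions, and the amplitude/phase recovery indicated at the end follows the parameterisation of (12).

```python
import numpy as np

def rls_demodulation(x, f_eng, m=3, lam=0.95, p0=100.0):
    """RLS with forgetting factor (eq. (13)) applied to the Fourier-basis model (12).
    Returns the trajectory of the 2m linear parameters and the prediction error."""
    theta = np.zeros(2 * m)
    P = p0 * np.eye(2 * m)                      # large initial covariance (illustrative)
    theta_hist = np.zeros((len(x), 2 * m))
    err = np.zeros(len(x))
    k = np.arange(1, m + 1)
    for n in range(len(x)):
        # regressor H_n built from the known meshing harmonics
        H = np.empty(2 * m)
        H[0::2] = np.cos(2 * np.pi * k * f_eng * n)
        H[1::2] = np.sin(2 * np.pi * k * f_eng * n)
        err[n] = x[n] - H @ theta               # a priori prediction error
        K = P @ H / (lam + H @ P @ H)           # gain
        theta = theta + K * err[n]
        P = (P - np.outer(K, H @ P)) / lam      # covariance propagation
        theta_hist[n] = theta
    return theta_hist, err

# amplitude of harmonic k: np.hypot(theta_hist[:, 2*(k-1)], theta_hist[:, 2*(k-1)+1]);
# the phase follows from np.arctan2 of the same pair (up to the sign convention of (12)).
```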
   153
3.2. THE LMS APPROACH
To overcome the aforementioned limitations, we may perform the direct estimation
of the amplitudes and phases. Indeed, the parameter vector:
$\theta_n = [a_{1,n}\ \ \varphi_{1,n}\ \ \cdots\ \ a_{m,n}\ \ \varphi_{m,n}]^T$   (14)
may be estimated as the one which minimises the LMS criterion defined by:
$J(\theta_n) = E\{e_n^2\}, \qquad e_n = x_n - \sum_{k=1}^{m} a_{k,n-1}\cos(2\pi k f_{eng}\, n + \varphi_{k,n-1})$   (15)
$\hat{\theta}_n = \arg\min_{\theta_n} J(\theta_n)$

The criterion being non-linear with respect to the parameter, the application of the RLS
technique is not possible and non-linear optimisation techniques have to be used.
Differentiating the criterion $J$ with respect to $\theta$ gives:
$\dfrac{\partial J(\theta_n)}{\partial \theta_n} = \dfrac{\partial}{\partial \theta_n}\bigl(E\{e_n^2\}\bigr) = -2\, E\{e_n \nabla_n\}$
$\nabla_n = \bigl[\cos(2\pi f_{eng} n + \varphi_{1,n-1})\ \ -a_{1,n-1}\sin(2\pi f_{eng} n + \varphi_{1,n-1})\ \ \cdots\ \ \cos(2\pi m f_{eng} n + \varphi_{m,n-1})\ \ -a_{m,n-1}\sin(2\pi m f_{eng} n + \varphi_{m,n-1})\bigr]^T$
Estimating the gradient $E\{e_n \nabla_n\}$ by its instantaneous value leads to the stochastic gradient
(or LMS) algorithm:
$\hat{\theta}_n = \hat{\theta}_{n-1} + \gamma G\, e_n \hat{\nabla}_n$   (16)
$e_n = x_n - \sum_{k=1}^{m} \hat{a}_{k,n-1}\cos(2\pi k f_{eng}\, n + \hat{\varphi}_{k,n-1})$
$\hat{\nabla}_n = \bigl[\cos(2\pi f_{eng} n + \hat{\varphi}_{1,n-1})\ \ -\hat{a}_{1,n-1}\sin(2\pi f_{eng} n + \hat{\varphi}_{1,n-1})\ \ \cdots\ \ \cos(2\pi m f_{eng} n + \hat{\varphi}_{m,n-1})\ \ -\hat{a}_{m,n-1}\sin(2\pi m f_{eng} n + \hat{\varphi}_{m,n-1})\bigr]^T$
The adaptation gain matrix $\gamma G$ is the design variable which enables the tracking behaviour
of the adaptive algorithm to be controlled. Its choice will be discussed in Section 3.3.
It should be noted that the criterion has an infinite number of minima. As an example,
in the case of a monocomponent carrier, the set of parameters minimising the criterion may
be expressed as:
$U = \left\{ \begin{bmatrix} a \\ \varphi + 2k\pi \end{bmatrix},\ k \in \mathbb{Z} \right\} \cup \left\{ \begin{bmatrix} -a \\ \varphi + (2k+1)\pi \end{bmatrix},\ k \in \mathbb{Z} \right\}$
The error performance surface corresponding to the LMS criterion has the shape shown
in Fig. 1. Depending on the initial value of the parameter, each local minimum may be
reached by the adaptive algorithm. Moreover, the adaptation gain should be chosen small
enough to ensure the stability of the algorithm; in other words, a small gain value ensures
that the algorithm remains in the same local minimum.
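A corresponding sketch of the LMS update (16) is given below. It uses a simplified diagonal gain in the spirit of eq. (17) (one scalar gain for all amplitudes and one for all phases); the gain values and initialisation are illustrative assumptions and would have to be tuned as discussed in Section 3.3.

```python
import numpy as np

def lms_demodulation(x, f_eng, m=3, gain_amp=0.02, gain_phase=0.02):
    """Stochastic-gradient (LMS) demodulation of eq. (16): amplitudes a_k and
    phases phi_k of model (11) are adapted directly, with independent gains
    for the amplitude and phase parameters."""
    a = np.ones(m)                      # start away from the a = 0 saddle point
    phi = np.zeros(m)
    a_hist = np.zeros((len(x), m))
    phi_hist = np.zeros((len(x), m))
    err = np.zeros(len(x))
    k = np.arange(1, m + 1)
    for n in range(len(x)):
        arg = 2 * np.pi * k * f_eng * n + phi
        err[n] = x[n] - np.sum(a * np.cos(arg))      # prediction error
        grad_a = np.cos(arg)                         # d(prediction)/d(a_k)
        grad_phi = -a * np.sin(arg)                  # d(prediction)/d(phi_k)
        a = a + gain_amp * err[n] * grad_a
        phi = phi + gain_phase * err[n] * grad_phi
        a_hist[n], phi_hist[n] = a, phi
    return a_hist, phi_hist, err
```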

3.3. CHOICE OF THE DESIGN VARIABLES


In this section, we are interested in the influence of the design variables of the two
adaptive demodulation algorithms, that is the forgetting factor for the RLS method and
the adaptation gain for the LMS method. Basically, the problem addressed is the following:
given a model structure, an adaptive algorithm and a description of how the true
parameter of the model changes, one needs to derive the optimal design variables,
resulting in the optimal tracking algorithm.
This has motivated many works and a comprehensive treatment may be found in [9], while
an in-depth study is presented in [10]. There are different ways to evaluate the tracking
performance of an adaptive algorithm. For example, [10] proposes to study the expectation
of the norm of the parameter estimation error, while [9] uses the variance of the prediction
error. In fact, all of these measures are closely related to one another and they can always
be decomposed into a bias term and a variance term. An increase (respectively a decrease) of
the adaptation gain ($1 - \lambda$ for the RLS, $\gamma$ for the LMS) results in a decrease (respectively an
increase) of the bias and an increase (respectively a decrease) of the variance.
Such a trade-off has also to be solved for the Hilbert transform approach for
amplitude/phase demodulation. Indeed, in order to apply it, we first need to bandpass filter
the real signal and it is clear that the filter bandwidth has a strong influence on the accuracy
of the estimations. Basically, a large bandwidth will result in an increase of the variance
of the estimated amplitude and phase while a small bandwidth leads to biased estimates.
This clearly shows that the trade-off which has to be solved to determine the optimal
design variables for adaptive demodulation is the same as the one involved in determining the
optimal filter bandwidth for the Hilbert transform demodulation approach.
Concerning the matrix $G$ of the LMS algorithm, its choice influences the search
direction. The most common choice is $G = I$ (identity matrix). However, in the case
considered here, it is more appropriate to choose:
$G = \mathrm{diag}\,(\gamma_{1,\mathrm{amp}}\ \ \gamma_{1,\mathrm{phase}}\ \ \cdots\ \ \gamma_{m,\mathrm{amp}}\ \ \gamma_{m,\mathrm{phase}})$   (17)

Figure 1. Error performance surface of the LMS criterion J(θ).


   155
T 1
Comparison of the demodulation approaches
Hilbert transform RLS LMS
Accuracy –only one filter –automatic learning of –independent control of
bandwidth the gain matrix the adaptation gain
–no direct access to the –no direct access to the –direct access to the
physical parameters physical parameters physical parameter (no
(unwrapping) (unwrapping) unwrapping)
Rank 3 Rank 2 Rank 1
Choice simplicity four choices two design variables 2m + 1 design variables
of the design –number of harmonics –number of harmonics –number of harmonics
variables –filter type –forgetting factor –2m adaptation gain
–filter order L
–filter bandwidth
Rank 2 Rank 1 Rank 3
Complexity O(mN)
O(m(LN + N ln (N))) O(m 3N) (diagonal gain matrix)
O(m 2N)
(general gain matrix)
Rank 3 Rank 2 Rank 1
Sequential No Yes Yes
processing Rank 3 Rank 2 Rank 1

which corresponds to independent variations of the amplitudes and phases. This gain
matrix does not provide the optimal search direction as defined in [10], but practical results
achieved with it are very satisfactory. These diagonal elements may be fixed by successive
trials. The access to the non-diagonal elements is more complicated because they
describe the dependency between the variations of each parameter. The individual control
of the adaptation gain for each parameter is the main advantage of the LMS approach.
Indeed, as pointed out before, acting on the adaptation gain is equivalent to varying the
filter bandwidth. As a consequence, having an independent control on each gain is
equivalent to choosing filters with bandwidths well-adapted to the variations of each
parameter. This is certainly the most efficient approach because, according to the results
of Section 2, amplitude and phase modulations result in different spectral bandwidths of
the modulated processes.

Figure 2. Gearbox.

Figure 3. Flowchart of the algorithm.
The RLS algorithm does not offer such a possibility. However, it should be noted that
the recursive propagation of the matrix Pn may be interpreted as an effective procedure
to automatically determine the optimal adaptation matrix gain. Even so, it does not ensure
the optimal search direction since it corresponds to a particular model of the parameter
variations (see [10] for details). The main limitation of the RLS approach is that it does
not enable a direct control of the physical parameter variations. Thus, both of the proposed
methods overcome the limitation of the Hilbert transform method mentioned in Section 2.3.
As a conclusion, Table 1 summarises the respective advantages and disadvantages of the
three different approaches (LMS, RLS, Hilbert), in terms of accuracy of estimation, choice
simplicity of the design variables, numerical complexity and possibility of a sequential data
processing for real time applications. Note that it is possible to further reduce the
numerical complexity of recursive algorithms using faster versions, but this will always be
done to the detriment of their numerical stability.
   157
4. SEQUENTIAL DETECTION OF GEAR CRACKS
The crack detection problem may be seen as an abrupt change detection problem which
can be stated as the following hypotheses testing problem:
$H_0:\ \theta_n = \hat{\theta}_n$   ($\theta_n$: actual parameter, $\hat{\theta}_n$: estimated parameter)
$H_1:\ \exists\, r \in \mathbb{N}$ such that $\theta_n = \hat{\theta}_n$ for $n < r$ and $\theta_n = \hat{\theta}_n + \delta\theta$ for $n \ge r$
H0 hypothesis simply states that the actual and estimated parameters are (approximately)
equal, while H1 hypothesis states that an abrupt change has occurred at the (unknown)
time r. A classical approach to solve this test is the sequential monitoring of the prediction
error (residues) variance. According to the classification proposed in [11], this is a
one-model approach, because it does not use an estimation of the parameter under H1
hypothesis as in the generalised likelihood ratio test. It also has similarities with the method
proposed in [6], the main difference being the way the residues are obtained (subtraction
of the measured and averaged signals). Assuming that, under H0 , the prediction error is
normally distributed with variance $\sigma^2$, the statistic $T_n = e_n^2/\sigma^2$ is $\chi^2$-distributed with one
degree of freedom. It follows immediately that:
$E_{H_0}(T_n - 1) = 0$
Moreover, it may be shown [12] that:
$E_{H_1}(T_n - 1 \mid H_n) = \text{entropy under } H_1 - \text{entropy under } H_0 + \text{Kullback information}$
As a consequence, a change will result in an increase of the prediction error energy if, and
only if, $E_{H_1}(T_n - 1 \mid H_n) > 0$. The Kullback information is always positive. Thus, a sufficient
condition is that the entropy of the stochastic process under H1 is greater than or equal
to the entropy under H0 . Under the simplifying assumption that the parameters are
constant before and after change, a sufficient condition is that the variance of the additive
noise does not decrease after change.
Considering the statistic $T_n - 1$ as normally distributed with zero mean, the previous
abrupt change detection problem is equivalent to the sequential detection of positive jumps
in a Gaussian sequence. This may be done in an optimal manner by the Page–Hinkley test
(see [11] for example), which results in the following stopping rule:
$n_a = \inf\{\,n : g_n \ge s\,\}$   (17)
$g_n = \max\left(g_{n-1} + T_n - 1 - \dfrac{\nu}{2},\ 0\right)$   (18)

The stopping time $n_a$ is the first time at which the statistic $g_n$ exceeds the threshold
$s$. The latter has to be chosen in order to achieve the desired performance. The parameter
$\nu$ represents the minimum jump magnitude which has to be detected. Finally, the change
time may be estimated as the last time at which $g_n$ was equal to 0.
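As an illustration, the stopping rule (17)-(18) applied to the prediction error of the adaptive algorithm can be coded as below; the fault-free residual variance, the minimum jump ν and the threshold s are assumptions to be set from the data.

```python
def page_hinkley(err, sigma2, nu=0.1, threshold=20.0):
    """Page-Hinkley test of eqs. (17)-(18) on T_n = e_n^2 / sigma^2.
    Returns (alarm time or None, estimated change time)."""
    g, last_zero = 0.0, 0
    for n, e in enumerate(err):
        T = e * e / sigma2                      # chi-squared statistic with 1 dof
        g = max(g + (T - 1.0) - nu / 2.0, 0.0)  # cumulative decision function g_n
        if g == 0.0:
            last_zero = n                       # last time g_n was zero
        if g >= threshold:
            return n, last_zero                 # alarm time n_a, estimated change time
    return None, last_zero
```

In practice, sigma2 would be estimated from a fault-free record, as discussed in Section 5.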
To end this section, we discuss some advantages of the proposed approach over the
Hilbert Transform approach. In the latter, the decision is made directly on the estimated
amplitude and phase. However, as pointed out in Section 3, these estimates may be biased.
This is particularly true in the case of localised faults due to their large bandwidths. As
a consequence, decisions based upon this approach may lead to erroneous results. For that
reason, we don’t work directly with the estimated parameters provided by the adaptive
algorithm. On the contrary, the proposed approach tries to measure a distance between
a reference (time-varying) model and the actual one by looking at the statistical properties
of the residual signal. The coupling of one of the adaptive algorithms with the abrupt
change detection method is also very important because it results in a robust detection
method, in the sense that it is insensitive to the slow variations of amplitude and phase.
Finally, another advantage of this two-step method lies in the fact that it is a sequential
procedure which enables a real-time implementation.

Figure 4. Averaged signal without (left) and with (right) fault (first gear, first sensor position).

Figure 5. Results of adaptive demodulation.
   159

Figure 6. Results of detection.

5. APPLICATION TO GEARBOX FAULT DETECTION


5.1. EXPERIMENTAL SET-UP
The car gearbox considered is driven by an asynchronous motor at 75 Hz. A second motor
brakes the differential output with a torque of 80 N m. The vibrations of the whole
box are recorded with piezo-electric accelerometers located at 12 selected positions on the
casing. Signals are low-pass filtered at 11.2 kHz and sampled with a fixed rate of 22.5 kHz.
The angular position of the driving shaft is recorded by means of a pulse generator.
The gearbox itself is a complex mechanical device. All five gears are mounted on two
shafts (Fig. 2). The five driving wheels are fixed on the driving shaft, while the
corresponding driven wheels may be kept fixed or not by using the clutch fork mechanism.
The shafts are supported by three ball bearings and one roller bearing. The whole
mechanism is assembled in a casing containing the lubricant. The output shaft drives the
output gear which transmits motion to the differential. Only helical gears are used. The

Figure 7. Determination of the optimal forgetting factor.


160 .   .
T 2
Variance of the prediction error
Gear
1 0.0471 0.0021 0.0135 0.0143 0.0284
2 0.0397 0.0045 0.0079 0.0226 0.0504
3 0.0362 0.0227 0.0055 0.0142 0.0247
4 0.0260 0.0237 0.0062 0.0131 0.0261
5 0.0264 0.0379 0.0094 0.0097 0.0173
6 0.0280 0.0332 0.0062 0.0505 0.0512
7 0.0339 0.0355 0.0039 0.0373 0.0290
8 0.0463 0.0238 0.0046 0.0218 0.0429
9 0.0064 0.0049 0.0096 0.0236 0.0348
10 0.0014 0.0029 0.0068 0.0126 0.0192
11 0.0052 0.0027 0.0029 0.0377 0.0592
12 0.0051 0.0071 0.0110 0.0252 0.0511
For gears 1 to 5, the optimal sensor locations are respectively: 10, 1, 11, 5, 5.

vibration signal recorded on the casing represents the response of the mechanical structure
to forces introduced on the one hand by gear meshing and bearings and on the other hand
by the circulation of the lubricant. Faults were induced successively on each wheel of the
driven shaft.

5.2. SYNCHRONOUS AVERAGING


It is shown in [15] that the vibration measured on the casing is its response to an
excitation which may be decomposed into a deterministic periodic excitation and a purely
random one. Thus, the measured vibration signal is the superposition of a deterministic
periodic signal and of a purely random signal (which is the response of the structure to
a stochastic excitation). The periodic part of the signal may be extracted by synchronous
averaging. Not only does it improve the signal-to-noise ratio of the signal, but it also makes
model (1) more accurate and is helpful in determining the damaged wheel. Depending
on the averaging period, that is the rotation period of the wheel considered, the
synchronous averaging may be seen as a selective filter (a comb of Dirac impulses spaced
$f_r$ apart in the frequency domain). As a consequence, the averaged signal only includes
vibrations induced by the monitored wheel. Thus, the first step of the method is to perform
the synchronous averaging with respect to the driving and driven wheels. The two resulting
signals are then processed by the proposed method. The demodulation stage is performed
using the RLS approach of Section 3.1 because its design parameters are easy to choose.
The flowchart of the algorithm is shown in Fig. 3.
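A minimal sketch of the synchronous averaging step is given below; it assumes the signal has already been resampled so that each revolution of the monitored wheel contains a fixed, integer number of samples (in the test rig, this is what the shaft pulse generator is used for). The averaged record would then be passed to the RLS demodulation and Page-Hinkley sketches given earlier, reproducing the chain of Fig. 3.

```python
import numpy as np

def synchronous_average(x, samples_per_rev):
    """Time-synchronous average over the rotation period of the monitored wheel:
    complete revolutions are stacked and averaged, which keeps only the
    components periodic with that wheel and attenuates everything else."""
    n_rev = len(x) // samples_per_rev
    return x[: n_rev * samples_per_rev].reshape(n_rev, samples_per_rev).mean(axis=0)
```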

5.3. RESULTS OF THE ADAPTIVE DEMODULATION


Figure 4 shows the averaged vibration signal corresponding to the first gear recorded
at the first sensor location. The averaging is performed synchronously to the driven wheel.
The left part shows a faultless signal, while the right signal corresponds to a gear with a
crack on the driven wheel. Amplitude and phase estimation has been performed with m = 3
harmonics and a forgetting factor λ = 0.95. It appears that a non-faulty gear vibration
signal only includes slow amplitude and phase modulation, while a crack results in abrupt
parameter variations. It should be noted that the results provided by the adaptive
algorithm cannot be used to analyse the amplitude and phase modulation over the
transient part of the signal. This is because the adaptive algorithm cannot converge in this
part. For this purpose, a time-frequency analysis is more convenient [14, 15].
   161
Concerning the choice of the parameter m (number of harmonics) and l (forgetting
factor), we can make the following comments.
1. The number of harmonics may be estimated by the AIC criterion [16] or other
order estimation methods (see [17] for example). However, successive trials
have shown that this choice was not crucial, provided that the number of
harmonics was chosen large enough. In the sequel, the number of harmonics
considered is fixed to three.
2. The forgetting factor is determined experimentally on fault-free vibration
signals. It is chosen so that the mean square error is minimised. This approach
will be adopted in Section 5.5.
5.4. DETECTION RESULTS
Figure 6 shows the results achieved by the proposed method on two different faulty
signals. In the left figures, the crack clearly appears while in the right figures, it is less
visible. For these trials, the minimum jump magnitude was taken equal to 0.1. It appears
that the detection performs well in both cases.
5.5. APPLICATION TO OPTIMAL SENSOR LOCATION
The adaptive demodulation detection algorithm has been used to determine the optimal
sensor location. At first, this has been done from an estimation point of view. The criterion
is: for each gear and sensor location, we determine the optimal forgetting factor, defined as
the one which minimises the variance of the prediction error (Fig. 7). The optimal sensor
location then corresponds to the smallest error variance (Table 2). This is done on fault-free
signals.
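An illustrative version of this selection procedure, using the rls_demodulation sketch of Section 3.1, is shown below; the grid of forgetting factors and the data layout (one fault-free averaged signal per sensor location) are assumptions.

```python
import numpy as np

def best_sensor(signals, f_eng, m=3, lambdas=np.arange(0.90, 1.00, 0.01)):
    """For each sensor, keep the forgetting factor minimising the prediction-error
    variance on a fault-free record, then return the sensor with the smallest
    minimum variance.  `signals` maps a sensor index to its averaged fault-free signal."""
    results = {}
    for sensor, x in signals.items():
        var = [np.var(rls_demodulation(x, f_eng, m, lam)[1]) for lam in lambdas]
        best_idx = int(np.argmin(var))
        results[sensor] = (float(lambdas[best_idx]), var[best_idx])
    best = min(results, key=lambda s: results[s][1])
    return best, results
```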
We have repeated the study, but from the detection point of view. For each sensor
location and each gear, we have determined the maximum value of the detection statistic
gn computed with the optimal forgetting factor (see previous case). For each gear, the
optimal sensor location corresponds to the highest maximum (Fig. 8). This time, the
different experiments are performed on faulty signals. It should be noted that (except for
gear 1) the optimal sensor locations are the same. From a practical point of view, this is
of prime importance because the optimal sensor locations may be obtained under
fault-free conditions.

6. CONCLUSIONS
A new method for gear crack detection is proposed. The approach used consists of the
coupling of adaptive demodulation with an abrupt change detector. The adaptive
algorithm is intended to account for the slow variations of the signal. Two methods for
this adaptive demodulation are proposed: an RLS approach based on a linear version of
the signal model and an LMS approach which directly estimates the physical parameters
(amplitude and phase) for each harmonic considered. Their respective advantages and
disadvantages are discussed and the superiority over the Hilbert transform approach is
shown. The crack detection is formulated as an abrupt change detection problem solved
by the sequential monitoring of the prediction error.
The application of the demodulation/detection algorithm to a gearbox under reliability
tests demonstrates its effectiveness.
The proposed algorithm exhibits good properties on the following points:
– the coupling of the two stages ensures good robustness against slow amplitude and phase variations which are not representative of crack faults;
– the design variables are easy to choose (RLS approach);
– the numerical complexity is lower than that of the Hilbert transform approach;
– the algorithm is sequential, allowing its real-time implementation.

Figure 8. Maximum of the detection statistic for the 12 sensor locations. The optimal sensor locations for gears 1 to 5 are respectively: 2, 1, 11, 5 and 5.
   163
At the present time, the algorithm is implemented on a DSP board and will be part of a
monitoring tool for gearbox reliability tests.

REFERENCES
1. R. B. Randall 1981 Journal of Mechanical Design 1, 259–268. A new method of modelling gear faults.
2. W. D. Mark 1978 Journal of the Acoustical Society of America 63, 1409–1429. Analysis of the vibratory excitation of gear systems: basic theory.
3. C. F and P. P 1992 Proceedings of Colloque Progrès Récents des Méthodes de Surveillance Acoustiques et Vibratoires, 639–649. Surveillance et diagnostic des engrenages.
4. C. Capdessus, M. Sidahmed and J. L. Lacoume 1995 Proceedings of the 2nd International Symposium on Acoustical and Vibratory Surveillance Methods and Diagnostic Techniques 1, 391–401. Apport de la théorie des processus cyclostationnaires à l'analyse et au diagnostic des engrenages.
5. P. D. McFadden 1986 Journal of Vibration, Acoustics, Stress and Reliability in Design 108, 165–170. Detecting fatigue cracks in gears by amplitude and phase demodulation of the meshing vibrations.
6. P. D. McFadden and J. D. Smith 1985 Proceedings of the Institution of Mechanical Engineers 199, 287–292. A signal processing technique for detecting local defects in a gear from the signal average of the vibration.
7. D. G and M. S 1990 Mécanique Matériaux Electricité 435, 61–64. Méthodes avancées de traitement du signal en vue de la surveillance des machines.
8. L. Ljung and T. Söderström 1983 Theory and Practice of Recursive Identification. Prentice Hall.
9. O. Macchi 1989 Traitement du Signal 6, 335–345. Adaptatif et non-stationnaire.
10. A. Benveniste, M. Métivier and P. Priouret 1986 Algorithmes Adaptatifs et Approximations Stochastiques. Masson.
11. M. Basseville and I. Nikiforov 1993 Detection of Abrupt Changes: Theory and Applications. Prentice Hall.
12. J. Segen and A. C. Sanderson 1980 IEEE Transactions on Information Theory 26, 249–255. Detecting changes in a time series.
13. V. B, M. T, D. B, A. R, M. C and M. G 1994 Compte Rendu d'Avancement de Travaux CRAN/PSA. Diagnostic de défauts sur boîtes de vitesses par analyse vibratoire.
14. H. O, D. B, V. B and M. T 1995 Proceedings of the UK Symposium on Applications of Time-frequency and Time-scale Methods (TFTS 95), University of Warwick, Coventry, UK, 128–135. A method for analysing gearbox failures using time–frequency representations.
15. H. O, D. B, V. B and M. T 1995 Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Detroit, 128–135. Examination of gearbox cracks using time-frequency distributions.
16. H. Akaike 1974 IEEE Transactions on Automatic Control 19, 716–723. A new look at the statistical model identification.
17. D. B, M. T, T. C and H. N 1993 Proceedings of the IMAC/IFAC 2nd International Symposium on Mathematical and Intelligent Models in System Simulation, Brussels 1, 402–406. Application of local test statistics to order estimation.

APPENDIX A: SECOND-ORDER CHARACTERISATION OF RANDOM AMPLITUDE MODULATION

 
Let {a(t) } be a random process, which is assumed to be zero mean. Consider the new
random process defined by:

$y(t) = (a_0 + \alpha\, a(t))\, e^{\,j(\omega_0 t + \varphi_0)}$   (A1)
where $a_0$, $\alpha$ and $\omega_0$ are real constants, and $\varphi_0$ is the initial phase, uniformly distributed over the
interval $[0, 2\pi[$. Note that this is a simplified version of model (1b) because it includes only
one component and has a constant phase.
The first- and second-order statistics of {y(t) } are defined by:

$E\{y(t)\} = E\{\mathrm{Re}(y(t))\} + j\, E\{\mathrm{Im}(y(t))\}$
$R_y(t,s) = E\{y(t)\, y^*(s)\}$   (A2)
where $y^*(t)$ denotes the complex conjugate of $y(t)$. $R_y(t,s)$ will be referred to as the
autocorrelation function (ACF) of $\{y(t)\}$. In the stationary case, we have:
$R_y(t,s) = R_y(t-s) = R_y(\tau)$   (A3)
Moreover, $R_y(t,s)$ has the hermitian symmetry property. This may be directly verified from
definition (A2):
$R_y(t,s) = R_y^*(s,t)$   (A4)

The aim of this appendix is to determine the ACF of {y(t) } as a function of the ACF of
{a(t) }. It will be used to determine the power spectral density (PSD) and the equivalent
spectral bandwidth of {y(t) }.

      {() }


According to definition (A2), $R_y(t,s)$ is given by:
$R_y(t,s) = E\{y(t)\, y^*(s)\}$
$\;= E\{(a_0 + \alpha\, a(t))(a_0 + \alpha\, a(s))\}\, e^{\,j\omega_0(t-s)}$
$\;= [a_0^2 + \alpha a_0\,(E\{a(t)\} + E\{a(s)\}) + \alpha^2 E\{a(t)\, a(s)\}]\, e^{\,j\omega_0(t-s)}$
$\;= [a_0^2 + \alpha^2 R_a(t,s)]\, e^{\,j\omega_0(t-s)}$   (A5)
where $R_a(t,s) = R_a(\tau)$ is the autocorrelation function of $\{a(t)\}$ (stationary case). The PSD is
then obtained as the Fourier transform of the autocorrelation function:
$|Y(\omega)|^2 = a_0^2\,\delta(\omega - \omega_0) + \alpha^2\, |A(\omega - \omega_0)|^2, \qquad |A(\omega)|^2 = \mathrm{FT}(R_a(\tau)) = \int_{-\infty}^{+\infty} R_a(\tau)\, e^{-j\omega\tau}\, d\tau$   (A6)

Defining the equivalent spectral bandwidth as:
$\Delta\omega_y^2 = \dfrac{\displaystyle\int_{-\infty}^{+\infty} \omega^2\, |Y(\omega)|^2\, d\omega}{\displaystyle\int_{-\infty}^{+\infty} |Y(\omega)|^2\, d\omega}$   (A7)
it follows immediately that:
$\Delta\omega_y = \Delta\omega_a$   (A8)


   165
APPENDIX B: SECOND-ORDER CHARACTERISATION OF RANDOM PHASE
MODULATION
 
Let $\{\varphi(t)\}$ be a random process, which is assumed to have zero mean. Consider the new
random process defined by:
$y(t) = A\, e^{\,j(\omega_0 t + \alpha\varphi(t) + \varphi_0)}$   (B1)
where $A$, $\alpha$ (modulation rate) and $\omega_0$ (carrier frequency) are real constants, and $\varphi_0$ is the initial
phase, uniformly distributed over the interval $[0, 2\pi[$. Note that this is a simplified version
of model (1b) because it includes only one component and has constant amplitude. First-
and second-order statistics of $\{y(t)\}$ are defined as in (A2).
The goal of this section is to determine the ACF of $\{y(t)\}$ as a function of the ACF of $\{\varphi(t)\}$.
In the particular case where $\{\varphi(t)\}$ has a Gaussian probability density function (PDF), the
result may be used to obtain an explicit form of $R_y(t,s)$. The study of the influence of $R_\varphi(t,s)$
on the spectral bandwidth of $\{y(t)\}$ is made in one special case.

   {() }


According to definition (A2), $R_y(t,s)$ is given by:
$R_y(t,s) = E\{y(t)\, y^*(s)\}$
$\;= A^2\, E\{e^{\,j(\omega_0(t-s) + \alpha(\varphi(t) - \varphi(s)))}\}$
$\;= A^2\, e^{\,j\omega_0(t-s)}\, E\{e^{\,j\alpha(\varphi(t) - \varphi(s))}\}$
Recalling the definition of the characteristic function of a random variable $\xi$ with
probability density $p(\xi)$:
$M_\xi(\nu) = \int_{-\infty}^{+\infty} p(\xi)\, e^{\,j\nu\xi}\, d\xi = E\{e^{\,j\nu\xi}\}$
and taking $\nu = \alpha$ and $\xi = \varphi(t) - \varphi(s)$, it follows immediately that:
$R_y(t,s) = A^2\, e^{\,j\omega_0(t-s)}\, M_{\varphi(t) - \varphi(s)}(\alpha)$   (B2)
Thus, the autocorrelation of $\{y(t)\}$ is given simply in terms of the characteristic function
of the phase increment between two different times.

     


The following supplementary assumptions are made: $\{\varphi(t)\}$ is stationary and Gaussian with
zero mean and autocorrelation function $R_\varphi(\tau)$. Without loss of generality, it may be
assumed that $R_\varphi(0) = 1$. The process $\{\varphi(t)\}$ being Gaussian, the random variable $\tilde{\varphi}(t,s)$ defined
by:
$\tilde{\varphi}(t,s) = \varphi(t) - \varphi(s)$
is also Gaussian with zero mean and variance:
$\sigma^2_{\tilde{\varphi}}(t,s) = E\{\tilde{\varphi}(t,s)\, \tilde{\varphi}^*(t,s)\} = R_\varphi(t,t) + R_\varphi(s,s) - 2 R_\varphi(t,s)$
which in the stationary case reduces to:
$\sigma^2_{\tilde{\varphi}}(\tau) = 2(R_\varphi(0) - R_\varphi(\tau)) = 2(1 - R_\varphi(\tau))$
The characteristic function of $\tilde{\varphi}(t,s)$ is then given by:
$M_{\tilde{\varphi}}(\alpha) = e^{-\alpha^2 (1 - R_\varphi(\tau))}$
and finally the autocorrelation function of $\{y(t)\}$ becomes:
$R_y(\tau) = A^2\, e^{-\alpha^2}\, e^{\alpha^2 R_\varphi(\tau)}\, e^{\,j\omega_0\tau}$   (B3)
The PSD of $\{y(t)\}$ may be obtained as the Fourier transform of $R_y(\tau)$. Due to the hermitian
symmetry of $R_y(\tau)$ and $R_\varphi(\tau)$, their Fourier transforms are real and positive. Thus, we may
introduce the following notations:
$|Y(\omega)|^2 = \mathrm{FT}(R_y(\tau)) = \int_{-\infty}^{+\infty} R_y(\tau)\, e^{-j\omega\tau}\, d\tau$
$|Z(\omega)|^2 = \mathrm{FT}\bigl(e^{\alpha^2 R_\varphi(\tau)}\bigr) = \int_{-\infty}^{+\infty} e^{\alpha^2 R_\varphi(\tau)}\, e^{-j\omega\tau}\, d\tau$   (B4)
$|F(\omega)|^2 = \mathrm{FT}(R_\varphi(\tau)) = \int_{-\infty}^{+\infty} R_\varphi(\tau)\, e^{-j\omega\tau}\, d\tau$
From equations (B3) and (B4), we get:
$|Y(\omega)|^2 = A^2\, e^{-\alpha^2}\, |Z(\omega)|^2 \otimes \delta(\omega - \omega_0) = A^2\, e^{-\alpha^2}\, |Z(\omega - \omega_0)|^2$   (B5)
It is clear from equation (9) that $|Z(\omega)|^2$ plays a key role in evaluating $|Y(\omega)|^2$. In particular,
the study of the bandwidth of $|Z(\omega)|^2$ with respect to that of $|F(\omega)|^2$ may give some insight
into the behaviour of $|Y(\omega)|^2$. In the general case, this is a very difficult problem and we
will only address it in the special case where the process $\{\varphi(t)\}$ has a Gaussian ACF. The
reader should keep in mind that this supplementary assumption only concerns the shape
of the autocorrelation function; the PDF of $\{\varphi(t)\}$ still remains Gaussian. So, we have:
$R_\varphi(\tau) = e^{-\frac{1}{2}\tau^2/\sigma_t^2}, \qquad |F(\omega)|^2 = \sqrt{2\pi}\,\sigma_t\, e^{-\frac{1}{2}\sigma_t^2\omega^2}$   (B6)

We now want to determine the spectral bandwidth $\Delta\omega_y$ of $\{y(t)\}$, defined as:
$\Delta\omega_y^2 = \dfrac{\displaystyle\int_{-\infty}^{+\infty} \omega^2\, |Y(\omega)|^2\, d\omega}{\displaystyle\int_{-\infty}^{+\infty} |Y(\omega)|^2\, d\omega}$
where, for simplicity, the mean frequency is assumed to be equal to zero. The first step in
determining $\Delta\omega_y$ is thus to calculate $|Y(\omega)|^2$. Recalling the series expansion of $e^{\alpha^2 R_\varphi(\tau)}$:
$e^{\alpha^2 R_\varphi(\tau)} = \sum_{n=0}^{\infty} \dfrac{(\alpha^2 R_\varphi(\tau))^n}{n!}$   (B7)
   167
From (B6) and (B7), we have:
$e^{\alpha^2 R_\varphi(\tau)} = \sum_{n=0}^{\infty} \dfrac{(\alpha^2)^n\, (R_\varphi(\tau))^n}{n!}, \qquad (R_\varphi(\tau))^n = e^{-\tau^2/(2(\sigma_t/\sqrt{n})^2)}$
We finally obtain:
$|Y(\omega)|^2 = A^2\, e^{-\alpha^2} \left[\delta(\omega - \omega_0) + \sum_{n=1}^{\infty} \dfrac{(\alpha^2)^n}{n!}\, |F_n(\omega - \omega_0)|^2\right]$   (B8)
where:
$|F_n(\omega)|^2 = \int_{-\infty}^{+\infty} (R_\varphi(\tau))^n\, e^{-j\omega\tau}\, d\tau = \sqrt{\dfrac{2\pi\,\sigma_t^2}{n}}\; e^{-\frac{1}{2}(\sigma_t^2/n)\,\omega^2}$   (B9)
An important remark is that, assuming $R_\varphi(0) = 1$, we also have $(R_\varphi(0))^n = 1$ and, due to the
Wiener–Khintchine theorem, $\int |F_n(\omega)|^2\, d\omega = 1$. The power of $\{y(t)\}$ then calculates to:
$\int_{-\infty}^{+\infty} |Y(\omega)|^2\, d\omega = A^2\, e^{-\alpha^2} \sum_{n=0}^{\infty} \dfrac{(\alpha^2)^n}{n!} = A^2$   (B10)

and the spectral bandwidth of $\{y(t)\}$ is:
$\Delta\omega_y^2 = e^{-\alpha^2} \sum_{n=1}^{\infty} \dfrac{(\alpha^2)^n}{n!}\, \Delta\omega_n^2$
with:
$\Delta\omega_n^2 = \int_{-\infty}^{+\infty} \omega^2\, |F_n(\omega)|^2\, d\omega = n\, \Delta\omega_\varphi^2$
which finally gives the very simple result:
$\Delta\omega_y^2 = \alpha^2\, \Delta\omega_\varphi^2, \qquad \Delta\omega_y = \alpha\, \Delta\omega_\varphi$   (B11)
