Abdellah KACHA
I. INTRODUCTION
The separation of the speech signal into its vocal tract and
glottal source contributions is an important topic in speech
processing. Once the two components are isolated, they can be
modeled independently. The glottal source characterization can
be used in many areas of speech processing such as speaker
recognition [1], analysis of voice disorders [2], speech
recognition [3] and speech synthesis [4]. These applications
motivate the development of algorithms that estimate the glottal
source robustly and reliably.
Although vocal tract modeling techniques are fairly well
established, the same is not true of the representation of the
glottal source. Several works have addressed the problem of
estimating the glottal source directly from the speech
waveform. Most approaches rely on a parametric model of the
vocal tract; inverse filtering is then used to remove the effect
of the vocal tract and obtain an estimate of the glottal source
signal. In [5], the discrete all-pole model is used to model the
vocal tract. The iterative adaptive inverse filtering method
described in [6] isolates the source by iteratively estimating
the contributions of both the vocal tract and the source signal.
In [7], the
estimated glottal signal is refined over several glottal cycles. A
2015 IEEE
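The inverse-filtering idea described in the introduction can be illustrated with a minimal sketch. It is not the method of [5], [6] or [7]: it assumes a plain all-pole (linear prediction) vocal tract model fitted with the autocorrelation method, and it uses a white-noise excitation rather than a glottal pulse so that the result is easy to check. The function names `autocorr`, `levinson_durbin` and `inverse_filter` and the toy filter coefficients are illustrative assumptions.

```python
import random

def autocorr(x, max_lag):
    """Autocorrelation sums r[0..max_lag] of a real sequence."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations.

    Returns the prediction-error filter A(z) as [1, a1, ..., ap].
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # updated prediction-error energy
    return a

def inverse_filter(x, a):
    """Filter x by A(z): e[n] = sum_j a[j] * x[n - j] (the source estimate)."""
    p = len(a) - 1
    return [sum(a[j] * x[n - j] for j in range(min(p, n) + 1)) for n in range(len(x))]

# Toy "speech": white-noise excitation through a known all-pole filter
# 1 / (1 - 1.5 z^-1 + 0.8 z^-2). These coefficients are assumptions.
rng = random.Random(0)
excitation = [rng.gauss(0.0, 1.0) for _ in range(5000)]
speech = []
for n, e in enumerate(excitation):
    y1 = speech[n - 1] if n >= 1 else 0.0
    y2 = speech[n - 2] if n >= 2 else 0.0
    speech.append(1.5 * y1 - 0.8 * y2 + e)

a_hat = levinson_durbin(autocorr(speech, 2), 2)   # estimated vocal tract model
residual = inverse_filter(speech, a_hat)          # estimated source signal
```

With enough samples, `a_hat` should come out close to `[1, -1.5, 0.8]`, and `residual` carries much less energy than `speech` because the resonance of the all-pole filter has been removed.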
[...]-th IMF: [...] 1 (j-th IMF), r_j(t) [...] x(t)
2.1. For [...] do
2.1.1. Project the bivariate-valued signal [...] on direction [...]:
2.1.2. Extract the location [...] of the maxima of [...]
2.1.3. Interpolate the set [...] to obtain the envelope curve in direction [...]
2.2. Compute the mean of all envelope curves
2.3. [...]
2.4. [...]
3. [...]
4. [...]
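The projection, maxima extraction, interpolation and averaging steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the directions are taken as uniformly spaced angles 2*pi*d/n_dirs (the direction symbols are not legible in the listing), piecewise-linear interpolation stands in for the spline interpolation usually used for envelopes, and the names `local_maxima`, `interpolate` and `envelope_mean` are hypothetical.

```python
import cmath
import math

def local_maxima(p):
    """Indices of interior local maxima of a real sequence."""
    return [i for i in range(1, len(p) - 1) if p[i - 1] < p[i] >= p[i + 1]]

def interpolate(idx, vals, n):
    """Piecewise-linear interpolation of (idx, vals) onto 0..n-1.

    Values are held constant before the first and after the last knot.
    """
    out, k = [], 0
    for t in range(n):
        if t <= idx[0]:
            out.append(vals[0])
        elif t >= idx[-1]:
            out.append(vals[-1])
        else:
            while idx[k + 1] < t:
                k += 1
            w = (t - idx[k]) / (idx[k + 1] - idx[k])
            out.append((1 - w) * vals[k] + w * vals[k + 1])
    return out

def envelope_mean(x, n_dirs=8):
    """Mean of the envelope curves of a complex (bivariate) signal x."""
    n = len(x)
    envelopes = []
    for d in range(n_dirs):
        phi = 2 * math.pi * d / n_dirs                 # assumed direction angle
        rot = cmath.exp(-1j * phi)
        projection = [(rot * z).real for z in x]       # project on direction phi
        idx = local_maxima(projection)                 # locations of the maxima
        if idx:
            # Envelope in this direction: interpolate the signal at the maxima.
            envelopes.append(interpolate(idx, [x[i] for i in idx], n))
    if not envelopes:
        return [0j] * n
    return [sum(env[t] for env in envelopes) / len(envelopes) for t in range(n)]

# Toy bivariate signal: a unit circle rotating around the fixed point c.
c = 2 + 1j
x = [c + cmath.exp(2j * math.pi * t / 16) for t in range(256)]
mean = envelope_mean(x)
```

For this toy signal each directional envelope is the constant `c` plus a unit vector in that direction, so averaging over the eight directions gives a mean that stays at the centre `c` of the rotation.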
$$x(t) = \sum_{j=1}^{N} \mathrm{IMF}_j(t) + r_N(t) \qquad (1)$$
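The decomposition in (1) states that the signal equals the sum of its IMFs plus the final residual. The sketch below checks this identity numerically on a real-valued toy signal. It is a simplified stand-in for a full empirical mode decomposition: envelopes are piecewise linear rather than spline based, each IMF is obtained with a fixed number of sifting iterations instead of a stopping criterion, and the function names and the test signal are assumptions.

```python
import math

def extrema(x):
    """Indices of interior local maxima and minima."""
    maxima = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] <= x[i + 1]]
    return maxima, minima

def envelope(idx, x):
    """Piecewise-linear envelope through x at idx, flat beyond the end knots."""
    out, k = [], 0
    for t in range(len(x)):
        if t <= idx[0]:
            out.append(x[idx[0]])
        elif t >= idx[-1]:
            out.append(x[idx[-1]])
        else:
            while idx[k + 1] < t:
                k += 1
            w = (t - idx[k]) / (idx[k + 1] - idx[k])
            out.append((1 - w) * x[idx[k]] + w * x[idx[k + 1]])
    return out

def sift(x, n_iter=10):
    """Repeatedly subtract the mean of the upper and lower envelopes."""
    h = list(x)
    for _ in range(n_iter):
        maxima, minima = extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break
        upper, lower = envelope(maxima, h), envelope(minima, h)
        h = [v - 0.5 * (u + l) for v, u, l in zip(h, upper, lower)]
    return h

def emd(x, max_imfs=6):
    """Return (imfs, residual) such that x = sum(imfs) + residual."""
    imfs, residual = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = extrema(residual)
        if len(maxima) < 2 or len(minima) < 2:
            break                       # too few extrema: keep as residual
        imf = sift(residual)
        imfs.append(imf)
        residual = [r - c for r, c in zip(residual, imf)]
    return imfs, residual

# Toy signal: fast oscillation + slow oscillation + linear trend.
signal = [math.sin(2 * math.pi * t / 10) + 0.5 * math.sin(2 * math.pi * t / 100)
          + 0.002 * t for t in range(400)]
imfs, residual = emd(signal)
rebuilt = [sum(imf[t] for imf in imfs) + residual[t] for t in range(len(signal))]
```

Because every extracted IMF is subtracted from the running residual, `rebuilt` matches `signal` up to floating-point rounding regardless of how crude the envelope estimate is.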
III.
$$X_w(f) = E_w(f)\,V(f) \qquad (6)$$
$$\log X_w(f) = \log E_w(f) + \log V(f) \qquad (7)$$
[Figure 2, panels (a) to (c). Panel (b): horizontal axis "IMFs order" (tick annotations m_t, m_p). Panel (c): Amplitude (dB) versus Frequency (Hz), 0 to 5000 Hz.]
Figure 2. Illustration of the separation of the harmonic component and
spectral envelope of synthetic /a/ via empirical mode decomposition. (a) Log
magnitude spectrum and IMF components. (b) IMF variances. (c) Estimated
glottal source magnitude.
$$\log E_w(f) = \sum_{j=1}^{\cdots} \mathrm{IMF}_j(f) \qquad (8)$$
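The relation between (1), (7) and (8) can be written out as a short worked step. This is a sketch under assumptions, not a restatement of the authors' derivation: it assumes that the empirical mode decomposition is applied to the log spectrum as a function of frequency, and it introduces a hypothetical cut-off order $m$ for the upper limit of the sum in (8).

```latex
\begin{align}
\log X_w(f) &= \sum_{j=1}^{N} \mathrm{IMF}_j(f) + r_N(f)
  && \text{decomposition (1) applied to the log spectrum} \\
\log E_w(f) &= \sum_{j=1}^{m} \mathrm{IMF}_j(f)
  && \text{(8), with an assumed cut-off order } m \\
\log V(f) &= \log X_w(f) - \log E_w(f)
  = \sum_{j=m+1}^{N} \mathrm{IMF}_j(f) + r_N(f)
  && \text{from (7)}
\end{align}
```

Under these assumptions the two factors of (6) are separated simply by assigning the low-order IMFs to one term and the remaining IMFs plus the residual to the other.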
[Figure: block diagram. Block labels: Speech signal, FFT, CEMD, IMF, Clustering method, |E|, Low pass filter, Estimated glottal source]
REFERENCES
[1]
[3]
IV.
[2]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[Figure: Amplitude (dB) versus Frequency (Hz)]
V. CONCLUSION
[17]
[18]
[19]
[20]
[21]