
GMM BASED SPEECH RECOGNITION

Objective
Speech recognition is the process of identifying which word, from a predefined
vocabulary, a speaker has spoken, based on the information contained in the speech
waveform. The Gaussian Mixture Model (GMM) algorithm compares the cepstral coefficients
derived from speech samples in the training phase with those from the testing phase.
The same technique also makes it possible to verify a speaker's identity from their
voice. This project is implemented on the ADSP 2181 processor.

Project Description

Speech recognition is divided into two phases:

1) Training Phase
2) Testing Phase

In the training phase, the frequency components of the given speech signal are
extracted. Each registered speaker has to provide samples of their speech (the given
words) so that the system can build, or train, a reference model for that speaker. In
addition, a speaker-specific threshold is computed from the training samples.
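
The source does not say how the reference model or the threshold is computed. As a minimal sketch, assuming the reference model is a diagonal-covariance GMM fitted by expectation-maximization (EM), training could look like the Python below. The function names, the number of components, the synthetic two-dimensional "features", and the threshold rule (mean training log-likelihood minus `margin` standard deviations) are all illustrative assumptions. Python stands in here for the project's Matlab code, which is not shown.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def component_logs(x, model):
    """Log of (weight * density) for every mixture component."""
    weights, means, variances = model
    return [math.log(w) + log_gauss(x, m, v)
            for w, m, v in zip(weights, means, variances)]


def frame_log_likelihood(x, model):
    """Log-likelihood of one frame, using log-sum-exp for stability."""
    logs = component_logs(x, model)
    peak = max(logs)
    return peak + math.log(sum(math.exp(l - peak) for l in logs))


def train_gmm(frames, n_components=2, n_iter=20, seed=0, var_floor=1e-3):
    """Fit a diagonal-covariance GMM to feature vectors with EM."""
    rng = random.Random(seed)
    dim = len(frames[0])
    means = [list(f) for f in rng.sample(frames, n_components)]
    variances = [[1.0] * dim for _ in range(n_components)]
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each frame.
        resp = []
        for x in frames:
            logs = component_logs(x, (weights, means, variances))
            peak = max(logs)
            probs = [math.exp(l - peak) for l in logs]
            norm = sum(probs)
            resp.append([p / norm for p in probs])
        # M-step: re-estimate weights, means, and variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-10
            weights[k] = nk / len(frames)
            means[k] = [sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                        for d in range(dim)]
            variances[k] = [max(var_floor,
                                sum(r[k] * (x[d] - means[k][d]) ** 2
                                    for r, x in zip(resp, frames)) / nk)
                            for d in range(dim)]
    return weights, means, variances


def training_threshold(frames, model, margin=2.0):
    """Hypothetical rule: mean training score minus `margin` std deviations."""
    scores = [frame_log_likelihood(x, model) for x in frames]
    mean = sum(scores) / len(scores)
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    return mean - margin * std


# Synthetic 2-D "features" standing in for real cepstral coefficients.
data_rng = random.Random(1)
frames = ([[data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)] for _ in range(50)]
          + [[data_rng.gauss(4, 0.5), data_rng.gauss(4, 0.5)] for _ in range(50)])
model = train_gmm(frames)
threshold = training_threshold(frames, model)
```

In practice one such model (and threshold) would be trained per speaker or per word, from real feature vectors rather than synthetic points.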

In the testing phase, the input speech is matched against the stored reference
model(s), and the recognition decision is made using Mel Frequency Cepstral
Coefficients (MFCC) as features and a Gaussian Mixture Model (GMM) as the classifier.
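
The matching step can be sketched as scoring the test feature vectors against each stored GMM and picking the model with the highest average log-likelihood. This is a minimal Python illustration, not the project's Matlab code: the model parameters, the two-word vocabulary, the two-dimensional feature vectors, and the optional `threshold` argument are hypothetical values chosen only to make the example runnable.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of `frames` under one GMM.

    `gmm` is a list of (weight, mean, variance) components.
    """
    total = 0.0
    for x in frames:
        logs = [math.log(w) + log_gaussian_diag(x, mu, var)
                for w, mu, var in gmm]
        peak = max(logs)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(l - peak) for l in logs))
    return total / len(frames)


def recognize(frames, models, threshold=None):
    """Return (best word or None, all scores).

    The word is None when the best score falls below `threshold`.
    """
    scores = {word: gmm_log_likelihood(frames, gmm)
              for word, gmm in models.items()}
    best = max(scores, key=scores.get)
    if threshold is not None and scores[best] < threshold:
        return None, scores
    return best, scores


# Hypothetical stored reference models, one GMM per vocabulary word.
models = {
    "yes": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "no": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 4.0], [1.0, 1.0])],
}

test_frames = [[0.2, -0.1], [0.9, 1.2], [0.4, 0.6]]
word, scores = recognize(test_frames, models)

# An input far from every model is rejected once a threshold is applied.
rejected, _ = recognize([[20.0, 20.0]], models, threshold=-20.0)
```

Averaging over frames keeps scores comparable between utterances of different lengths, which matters if a fixed threshold is applied.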

Block Diagram
Speech Input -> Framing -> Windowing -> |FFT| -> Mel-Filtering -> DCT -> Static coefficients -> GMM Classifier -> Recognized O/P

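
The feature-extraction part of the chain (framing, windowing, |FFT|, mel filtering, DCT) can be sketched in plain Python as follows. This is an illustrative approximation rather than the project's Matlab code: the sample rate, frame length, hop size, filter count, and number of coefficients are assumed values, the logarithm before the DCT is the usual MFCC step but does not appear in the diagram, and a naive DFT replaces the FFT to keep the example self-contained.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def frame_signal(signal, frame_len, hop):
    """Framing: split the signal into overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def hamming(frame):
    """Windowing: taper the frame edges with a Hamming window."""
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]


def magnitude_spectrum(frame):
    """|FFT|: magnitudes of bins 0..N/2, via a naive DFT for clarity."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Mel filtering: triangular filters equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges_hz]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):      # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def dct_ii(values, n_coeffs):
    """DCT: keep the first n_coeffs type-II DCT coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]


def mfcc(signal, sample_rate=8000, frame_len=256, hop=128,
         n_filters=20, n_coeffs=12):
    """Return one list of static cepstral coefficients per frame."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for frame in frame_signal(signal, frame_len, hop):
        spectrum = magnitude_spectrum(hamming(frame))
        power = [m * m for m in spectrum]
        energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
        log_energies = [math.log(e + 1e-10) for e in energies]  # assumed step
        features.append(dct_ii(log_energies, n_coeffs))
    return features


# A synthetic 440 Hz tone stands in for sampled speech.
tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(1024)]
features = mfcc(tone)
```

Each inner list in `features` is the static coefficient vector for one frame; these vectors are what a GMM classifier would consume.
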
Implementation

a. The speech signal is sampled through a PC port.
b. The sampled speech signal is passed to the Matlab code.
c. In the training phase, the sampled speech signal is converted to MFCC
feature vectors.
d. In the testing phase, the test signal is compared with the trained models and
recognized by the GMM algorithm.
e. The recognized word is displayed on the PC.
