This research aims to develop a robust text-independent speaker recognition system from
speech data collected from different speakers. Since the data are collected naturally in a real-time
environment, they may include various kinds of noise, such as babble, vocal noise, reverberation, and
speech from other speakers. Pre-processing of the speech plays an important role in developing a speaker
recognition system. In this work, the credibility of existing speech processing techniques is
explored first, and then the techniques required to improve performance are to be developed.
The literature survey reveals that although researchers have continuously striven to
improve the performance of speaker recognition under degraded conditions, the performance obtained
is not yet at an acceptable level, because speech is non-stationary and efficient noise-removal
techniques are not available. Developing a robust speaker recognition system under degraded
conditions may help people interact with the system as if they were interacting with human beings.
Objectives
To collect speech data from different speakers in a real-time environment.
To eliminate the noise in the degraded speech data collected from the speakers.
Research Methodology
The research methodology is explained with the help of the block diagram depicted in figure 1
below.
[Figure 1: Block diagram. Input Speech Signal -> Pre-processing -> Feature Extraction -> Speaker Modeling -> Testing -> Identification]
Pre-processing is the process of removing unwanted noise, dividing the speech into voiced and
unvoiced sounds, applying channel compensation, and similar steps that enhance the speech signal.
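One part of this step, the voiced/unvoiced division, can be sketched with two classic frame-level measures: short-time energy and zero-crossing rate. The proposal does not say which method it will use, so this is only an illustrative assumption; the function names, the thresholds, and the 8 kHz sampling rate are all hypothetical, and noise removal and channel compensation are not covered here.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def classify_frame(frame, energy_thresh=0.01, zcr_thresh=0.3):
    """Label a frame as silence, voiced, or unvoiced.

    Both thresholds are illustrative assumptions, not values from the proposal.
    """
    if short_time_energy(frame) < energy_thresh:
        return "silence"
    return "voiced" if zero_crossing_rate(frame) < zcr_thresh else "unvoiced"

# Synthetic signals at an assumed 8 kHz sampling rate.
rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 100 * n / rate) for n in range(400)]  # low-frequency, periodic
hiss = [0.5 * (-1) ** n for n in range(400)]                               # sign flips every sample
quiet = [0.0] * 400
labels = [classify_frame(f)
          for sig in (tone, hiss, quiet)
          for f in frame_signal(sig, 200, 200)]
```

On real recordings the thresholds would need to be tuned to the noise level, and in strongly degraded speech these two measures alone are unlikely to be sufficient.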
Feature extraction
The purpose of this module is to convert the speech waveform, using digital signal processing
(DSP) tools, to a set of features (at a considerably lower information rate) for further analysis. This
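As a rough illustration of converting a waveform into a compact feature set, the sketch below computes log energies in a few uniform frequency bands for each windowed frame. The proposal does not name its features, so this choice is an assumption standing in for whatever representation is finally used; the frame length, hop, band count, and function names are hypothetical, and the DFT is written naively for clarity rather than speed.

```python
import cmath
import math

def hamming(n_len):
    """Hamming window of length n_len."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_len - 1)) for n in range(n_len)]

def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (O(N^2), fine for a small demo)."""
    n_len = len(frame)
    spectrum = []
    for k in range(n_len // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_len)
                  for n, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n_len)
    return spectrum

def log_band_energies(frame, n_bands=8):
    """Log energy in n_bands uniform bands; the last band absorbs leftover bins."""
    window = hamming(len(frame))
    spectrum = power_spectrum([x * w for x, w in zip(frame, window)])
    width = len(spectrum) // n_bands
    feats = []
    for b in range(n_bands):
        lo = b * width
        hi = (b + 1) * width if b < n_bands - 1 else len(spectrum)
        feats.append(math.log(sum(spectrum[lo:hi]) + 1e-10))  # small offset avoids log(0)
    return feats

def extract_features(samples, frame_len=256, hop=128, n_bands=8):
    """One feature vector per frame; frame_len, hop and n_bands are assumed values."""
    return [log_band_energies(samples[i:i + frame_len], n_bands)
            for i in range(0, len(samples) - frame_len + 1, hop)]

# A 750 Hz tone at an assumed 8 kHz sampling rate: 1024 samples become 7 vectors of 8 values.
rate = 8000
tone = [math.sin(2 * math.pi * 750 * n / rate) for n in range(1024)]
features = extract_features(tone)
```

With these settings each hop of 128 samples is summarised by 8 numbers, which is the kind of rate reduction the paragraph above describes.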
Modeling
From the extracted features, a speaker model is created for each speaker in the
database. The created speaker model should be unique and contain maximally the speaker-specific
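A minimal sketch of per-speaker modeling follows, assuming each speaker is represented by a single diagonal-covariance Gaussian fitted to that speaker's feature vectors. The proposal does not specify the model type, so this is a deliberately simple stand-in; the function names, the variance floor, and the toy data are hypothetical.

```python
import math

def train_speaker_model(feature_vectors, var_floor=1e-3):
    """Fit per-dimension mean and variance; var_floor is an assumed safeguard against zero variance."""
    n = len(feature_vectors)
    dim = len(feature_vectors[0])
    mean = [sum(v[d] for v in feature_vectors) / n for d in range(dim)]
    var = [max(sum((v[d] - mean[d]) ** 2 for v in feature_vectors) / n, var_floor)
           for d in range(dim)]
    return {"mean": mean, "var": var}

def log_likelihood(model, vector):
    """Log density of one feature vector under the diagonal Gaussian model."""
    return sum(-0.5 * (math.log(2 * math.pi * s) + (x - m) ** 2 / s)
               for x, m, s in zip(vector, model["mean"], model["var"]))

def build_database(training_data):
    """Map each speaker label to its own model."""
    return {speaker: train_speaker_model(vectors)
            for speaker, vectors in training_data.items()}

# Toy two-dimensional features for two hypothetical speakers.
training_data = {
    "spk_a": [[0.0, 1.0], [0.2, 1.2], [-0.2, 0.8]],
    "spk_b": [[5.0, -1.0], [5.2, -0.8], [4.8, -1.2]],
}
models = build_database(training_data)
probe = [0.1, 1.0]
score_a = log_likelihood(models["spk_a"], probe)
score_b = log_likelihood(models["spk_b"], probe)
```

Here `score_a` exceeds `score_b` because the probe vector lies close to the first speaker's training data, which is the sense in which each model should be distinct from the others.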
Testing
After modeling the speech signal, in the testing phase, the identification accuracy of the learned
models is evaluated using data that was not included in model training. In this phase, the input speech
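The evaluation described here can be sketched as follows: each held-out utterance is assigned to the speaker whose model scores it highest, and accuracy is the fraction of correct assignments. The scoring function is passed in as a parameter because the proposal does not fix a model type; the nearest-mean scorer, the speaker labels, and the toy data below are hypothetical.

```python
def identify_speaker(models, utterance, score_fn):
    """Return the speaker whose model gives the highest total score over all frames."""
    return max(models,
               key=lambda spk: sum(score_fn(models[spk], frame) for frame in utterance))

def identification_accuracy(models, test_set, score_fn):
    """test_set holds (true_speaker, utterance) pairs that were not used for training."""
    correct = sum(1 for speaker, utterance in test_set
                  if identify_speaker(models, utterance, score_fn) == speaker)
    return correct / len(test_set)

def neg_sq_distance(model_mean, frame):
    """Illustrative scorer: negative squared distance to the speaker's mean vector."""
    return -sum((x - m) ** 2 for x, m in zip(frame, model_mean))

# Toy models (mean vectors only) and held-out utterances, each a list of feature frames.
models = {"spk_a": [0.0, 1.0], "spk_b": [5.0, -1.0]}
held_out = [
    ("spk_a", [[0.1, 0.9], [-0.1, 1.1]]),
    ("spk_b", [[4.9, -1.1]]),
    ("spk_a", [[4.0, -0.5]]),  # lies nearer to spk_b, so it is misidentified
]
accuracy = identification_accuracy(models, held_out, neg_sq_distance)
```

In this toy case two of the three held-out utterances are identified correctly. Keeping the test utterances separate from the training data is what makes the figure an estimate of performance on unseen speech.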
Possible Outcome
Creating a large database of speech data from 500 different speakers under degraded
conditions.
The results of the experimental analysis could be published in conference proceedings and
reputed journals.
Security
Multi-speaker tracking
Second year:
Third year: