ROBUST SPEAKER IDENTIFICATION IN THE CONTEXT OF
TEXT-INDEPENDENT BY COLLECTING THE REAL-TIME
SPEECH DATA UNDER NOISY AND REVERBERANT
ENVIRONMENT

This research aims to develop a robust text-independent speaker recognition system from
speech data collected from different speakers. Because the data are collected naturally in a
real-time environment, they may include various kinds of noise, such as babble, vocal noise,
reverberation, and speech from other speakers. Pre-processing of speech plays an important
role in developing a speaker recognition system. In this work, the credibility of existing
speech processing techniques is explored first, and the techniques required to improve
performance are then developed. The literature survey reveals that, although researchers
have been striving continuously to improve the performance of speaker recognition under
degraded conditions, the performance obtained is not yet at an acceptable level, because
speech is non-stationary and efficient techniques for removing the noise are not available.
Developing a robust speaker recognition system under degraded conditions may help people
interact with the system as if they were interacting with human beings.

Objectives

The main objectives of this research are:

To collect speech data from different speakers in a real-time environment.

To eliminate the noise in the degraded speech data collected from the speakers.

To identify and verify speakers from their speech data.

To build a robust speaker recognition system.

To improve the accuracy of the speaker recognition system.

Research Methodology
The research methodology is explained with the help of the block diagram depicted in Figure 1
below.
Input Speech Signal -> Pre-processing -> Feature Extraction -> Modeling -> Testing -> Speaker Identification

Figure 1: Steps involved in implementing an automatic speaker identification system


Pre-processing

Pre-processing is the process of removing unwanted noise, separating voiced sounds from
unvoiced sounds, compensating for channel effects, and similar operations that enhance the
speech signal.
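
As an illustration of two such steps, the following Python sketch applies a pre-emphasis filter and labels frames as voiced, unvoiced, or silence using short-time energy and zero-crossing rate. The function names, the 0.97 pre-emphasis coefficient, the frame sizes, and both thresholds are assumptions made for this example, not values fixed by this proposal. Noise removal and channel compensation are not covered here.

```python
import math

def pre_emphasis(signal, coeff=0.97):
    """First-order high-pass filter: y[n] = x[n] - coeff * x[n-1]."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - coeff * signal[n - 1]
                          for n in range(1, len(signal))]

def split_frames(signal, frame_len, hop):
    """Cut the signal into overlapping frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def label_frames(signal, frame_len=200, hop=100, energy_thr=0.01, zcr_thr=0.25):
    """Label each frame 'voiced', 'unvoiced', or 'silence'.

    The thresholds are illustrative assumptions, not tuned values.
    """
    labels = []
    for frame in split_frames(signal, frame_len, hop):
        if short_time_energy(frame) < energy_thr:
            labels.append("silence")      # too little energy to be speech
        elif zero_crossing_rate(frame) < zcr_thr:
            labels.append("voiced")       # energetic, slowly oscillating
        else:
            labels.append("unvoiced")     # energetic, rapidly oscillating
    return labels
```

For a signal sampled at 8 kHz, the default frame length of 200 samples corresponds to 25 ms. In practice the thresholds would have to be tuned on the collected data.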

Feature extraction

The purpose of this module is to convert the speech waveform, using digital signal processing
(DSP) tools, into a set of features (at a considerably lower information rate) for further
analysis. This is often referred to as the signal-processing front end.
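
One way to obtain such a compact representation is linear predictive coding (LPC). It is used here only as an example, since this proposal does not fix a feature type. The sketch below windows each frame, computes its autocorrelation, and solves for the predictor coefficients with the Levinson-Durbin recursion, so that a 200-sample frame is reduced to 10 numbers. The function names, frame sizes, and model order are assumptions.

```python
import math

def hamming(n_samples):
    """Hamming window of the given length."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]

def autocorrelation(frame, max_lag):
    """Autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (coeffs, error) where coeffs[k-1] is a[k] in the predictor
    x[n] ~ sum_k a[k] * x[n-k], and error is the final prediction error.
    """
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:          # silent or numerically degenerate frame
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error           # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= 1.0 - k * k
    return a[1:], error

def lpc_features(signal, frame_len=200, hop=100, order=10):
    """Return one list of `order` LPC coefficients per frame."""
    window = hamming(frame_len)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        coeffs, _ = levinson_durbin(autocorrelation(frame, order), order)
        features.append(coeffs)
    return features
```

With these defaults, a 1000-sample signal yields 9 feature vectors of 10 coefficients each.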

Modeling

From the extracted features, a speaker model is created for each speaker in the database.
Each speaker model should be unique and should contain as much speaker-specific information
as possible for representing the speaker's vocal tract.
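
A minimal sketch of this step follows, assuming each speaker is represented by a single diagonal-covariance Gaussian over the feature vectors. This is a deliberate simplification for illustration (richer models such as Gaussian mixtures are a common alternative), and the function names and data layout are hypothetical.

```python
def train_speaker_model(feature_vectors, var_floor=1e-6):
    """Fit a diagonal-covariance Gaussian to one speaker's feature vectors.

    var_floor keeps every variance strictly positive (assumed value).
    """
    n = len(feature_vectors)
    dim = len(feature_vectors[0])
    mean = [sum(v[d] for v in feature_vectors) / n for d in range(dim)]
    var = [max(sum((v[d] - mean[d]) ** 2 for v in feature_vectors) / n, var_floor)
           for d in range(dim)]
    return {"mean": mean, "var": var}

def train_all(training_data):
    """training_data maps a speaker ID to that speaker's list of feature vectors."""
    return {speaker: train_speaker_model(vectors)
            for speaker, vectors in training_data.items()}
```

Each entry of the returned dictionary is one stored reference model, keyed by speaker ID.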

Testing

After the speaker models have been built, the testing phase evaluates the identification
accuracy of the learned models using data that was not included in the model training. In this
phase, the input speech is matched against the stored reference model(s) and an identification
decision is made.
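
The matching and decision step can be sketched as follows, assuming each stored reference model is a diagonal Gaussian given as lists of means and variances (a hypothetical format chosen for this example). The identified speaker is the one whose model gives the highest average log-likelihood, and accuracy is the fraction of held-out utterances identified correctly.

```python
import math

def log_likelihood(vector, model):
    """Log density of one feature vector under a diagonal Gaussian model."""
    total = 0.0
    for x, mean, var in zip(vector, model["mean"], model["var"]):
        total += -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)
    return total

def identify(utterance, models):
    """Return the speaker ID whose model best explains the utterance.

    utterance is a list of feature vectors; models maps speaker ID to model.
    """
    def score(model):
        return sum(log_likelihood(v, model) for v in utterance) / len(utterance)
    return max(models, key=lambda speaker: score(models[speaker]))

def identification_accuracy(test_set, models):
    """test_set is a list of (true_speaker, utterance) pairs not used in training."""
    correct = sum(1 for speaker, utterance in test_set
                  if identify(utterance, models) == speaker)
    return correct / len(test_set)
```

Keeping the test utterances separate from the training data is what makes the reported accuracy an estimate of performance on unseen speech.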

Possible Outcome

Creating a large database of speech data from 500 different speakers under degraded
conditions.

Developing efficient methods for noise elimination.

Implementing a robust speaker recognition system under degraded conditions.

Publishing the results of the experimental analysis in conference proceedings and
renowned journals.

Patenting innovative ideas.


Applications

Speaker recognition for authentication

Speaker recognition for surveillance

Forensic speaker recognition

Security

Multi-speaker tracking

Personalized user interfaces

Year-wise plan of work


First year:
Literature survey

Implementing the feature extraction module

Publishing the results in SCI journals

Second year:

Implementing the modeling/training techniques

Publishing the results in SCI journals

Presenting papers at international conferences

Third year:

Implementing the testing module

Publishing the results in SCI journals

Patenting innovative ideas

Thesis writing and submission