
VOICE MORPHING

Batch 10-03

K. Mounika 10881A0481

Under the supervision of Mr. M. Nagarjuna, Assistant Professor, ECE Department

Structure of Presentation
INTRODUCTION
CLASSIFICATION
BLOCK DIAGRAMS
APPLICATIONS
LIMITATIONS AND CHALLENGES
FUTURE SCOPE
REFERENCES

What is Voice Morphing?


Voice morphing is a technique for modifying a source speaker's speech so that it sounds as if it were spoken by a designated target speaker.
Signal 1 + Signal 2 -> Morphing algorithm -> Morphed signal

SPEECH IDENTITY OF A PERSON

SPEECH = PITCH + ENVELOPE IN A CONVOLVED FORM

PITCH
FUNDAMENTAL FREQUENCY
HARMONICS

ENVELOPE
PHONEME PHASE, AMPLITUDE
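
The statement that speech is pitch and envelope "in a convolved form" can be written out as a short derivation. This is a sketch of the standard source-filter view, not notation taken from the slides: the symbols s(n) for the speech signal, e(n) for the pitch excitation, and h(n) for the vocal-tract envelope response are assumptions introduced here.

```latex
\begin{aligned}
s(n) &= e(n) * h(n)
  && \text{(convolution in time)} \\
S(\omega) &= E(\omega)\, H(\omega)
  && \text{(product in frequency)} \\
\log\lvert S(\omega)\rvert &= \log\lvert E(\omega)\rvert + \log\lvert H(\omega)\rvert
  && \text{(sum after taking the log magnitude)} \\
c(n) &= \mathcal{F}^{-1}\{\log\lvert S(\omega)\rvert\}
  && \text{(real cepstrum)}
\end{aligned}
```

Because the logarithm turns the product into a sum, the slowly varying envelope term and the rapidly varying pitch term land in different regions of the cepstrum c(n), which is what makes them separable.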

Quality

Phonetics

Timing

Goals of Voice Morphing

High quality (natural and intelligible).

Requirement of the same utterances to be spoken by the source and target speaker.

The ability to operate with target voice training data ranging from a few seconds to tens of minutes.

Classification of VM techniques
Voice-based models: Vocal Tract Length Normalization (VTLN), Formant Frequencies, and Glottal Flow models.
Mixed Voice/Signal Models: Linear Prediction Coding (LPC), Line Spectral Frequencies (LSF), Cepstral Coefficients, and Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrum (STRAIGHT).
Signal-based models: Improved Power Spectrum Envelope (IPSE), Discrete Wavelet Transform (DWT), Harmonic plus Noise Model (HNM).

Typical Voice Morphing System


Typically, voice morphing takes place in two phases: a training phase and a transformation phase.

Block diagram of a typical voice morphing system:

Training phase: Source speaker -> Extract spectral envelope; Target speaker -> Extract spectral envelope; both envelopes -> Time alignment -> Estimate transforms

Transformation phase: Source speaker -> Extract spectral envelope -> Spectral envelope conversion -> Converted speech

Voice Morphing Process


Pre-Processing or Representation conversion

Pitch and Envelope analysis


Morphing, which includes warping and interpolation

Signal re-estimation

[Block diagram: speech signal 1 and speech signal 2 each pass through representation conversion and cepstral analysis, which yields a pitch and an envelope for each signal; the pitches and envelopes are morphed, and signal estimation produces the morph.]

Pre-Processing
This involves processes like signal acquisition in discrete form and windowing.

Signal Acquisition

Windowing
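
The windowing step can be illustrated with a minimal sketch. The frame length, hop size, and choice of a Hann window are assumptions made for this example (the slides do not specify them), and `frame_signal` is a hypothetical helper name.

```python
import numpy as np

def frame_signal(signal, frame_len=512, hop=256):
    """Split a discrete signal into overlapping frames and apply a Hann window.

    frame_len and hop are illustrative values, not taken from the slides.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        # Zero-pad very short inputs so that at least one frame exists.
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    return frames * window  # the window tapers every frame toward zero at its edges

# Stand-in for an acquired (sampled) speech signal: one second of a 200 Hz tone.
fs = 8000
t = np.arange(fs) / fs
acquired = np.sin(2 * np.pi * 200 * t)
frames = frame_signal(acquired)
```

Each row of `frames` is one short segment over which the speech is treated as approximately stationary, which is the form the later analysis steps operate on.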

Pitch and Envelope analysis.


This process extracts the pitch and formant information from the speech signal.
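
As a rough illustration of pitch and envelope analysis, the sketch below uses the real cepstrum of a single windowed frame. It assumes a simple cutoff ("lifter") at 30 cepstral samples and a pitch search range of 60 to 400 Hz; these values and the function name `cepstral_analysis` are illustrative choices, not parameters given in the slides.

```python
import numpy as np

def cepstral_analysis(frame, fs, n_lifter=30, f0_min=60.0, f0_max=400.0):
    """Return (pitch_hz, log_envelope) for one windowed frame.

    n_lifter, f0_min and f0_max are assumed values for this sketch.
    """
    spectrum = np.fft.rfft(frame)
    log_mag = np.log(np.abs(spectrum) + 1e-10)  # small offset avoids log(0)
    cepstrum = np.fft.irfft(log_mag)            # real cepstrum, same length as frame

    # Envelope: keep only the low-quefrency part (and its mirror image).
    liftered = np.zeros_like(cepstrum)
    liftered[:n_lifter] = cepstrum[:n_lifter]
    liftered[-n_lifter + 1:] = cepstrum[-n_lifter + 1:]
    log_envelope = np.fft.rfft(liftered).real

    # Pitch: strongest cepstral peak inside the plausible pitch-period range.
    q_min = int(fs / f0_max)
    q_max = int(fs / f0_min)
    peak = q_min + int(np.argmax(cepstrum[q_min:q_max]))
    return fs / peak, log_envelope

# Synthetic voiced frame: a pulse train with a 40-sample period (200 Hz at 8 kHz).
fs = 8000
frame_len = 1024
pulses = np.zeros(frame_len)
pulses[::40] = 1.0
frame = pulses * np.hanning(frame_len)
pitch_hz, log_envelope = cepstral_analysis(frame, fs)
```

The low-quefrency samples describe the smooth spectral envelope (where the formants appear), while the isolated peak at higher quefrency marks the pitch period.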

Morphing: Matching and Warping


Both signals will have a number of 'time-varying properties'. To create an effective morph, it is necessary to match one or more of these properties of each signal to those of the other signal in some way. Dynamic Time Warping (DTW) is used to find the best match between the pitch of the two sounds.
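
A minimal sketch of Dynamic Time Warping applied to two pitch contours follows. It uses the textbook formulation with an absolute-difference local cost and three allowed steps; the slides do not state which cost or step pattern was used, so both are assumptions, and `dtw_path` is a hypothetical name.

```python
import numpy as np

def dtw_path(a, b):
    """Align two 1-D sequences with classic DTW; return (total_cost, path).

    path is a list of (index_in_a, index_in_b) pairs from start to end.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance between pitch values
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])

    # Backtrack from the end to recover the warping path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (cost[i - 1, j - 1], i - 1, j - 1),
            (cost[i - 1, j], i - 1, j),
            (cost[i, j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    return cost[n, m], path[::-1]

# Two made-up pitch contours (Hz): the second is a time-stretched copy of the first.
pitch_a = [100, 110, 120, 130]
pitch_b = [100, 100, 110, 120, 120, 130]
total_cost, path = dtw_path(pitch_a, pitch_b)
```

The returned path says which frame of one signal corresponds to which frame of the other, so the two can be interpolated frame by frame even though they differ in timing.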

Signal re-estimation
Because the signals are transformed into the cepstral domain, a magnitude function is used, which results in a loss of phase information in the representation of the data. It is therefore necessary to estimate a signal whose magnitude DFT is close to the processed magnitude DFT.
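
One common way to estimate a signal from magnitude-only data is the iterative Griffin-Lim procedure, sketched below. The slides do not name the algorithm used, so treating it as Griffin-Lim is an assumption; the FFT size, hop, iteration count, and all function names here are illustrative.

```python
import numpy as np

def stft(x, n_fft=256, hop=64):
    """Short-time Fourier transform with a Hann window (frames along axis 0)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_fft=256, hop=64):
    """Least-squares overlap-add inverse of stft()."""
    window = np.hanning(n_fft)
    n_frames = spec.shape[0]
    length = n_fft + hop * (n_frames - 1)
    x = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        x[i * hop : i * hop + n_fft] += frames[i] * window
        norm[i * hop : i * hop + n_fft] += window ** 2
    return x / np.maximum(norm, 1e-8)

def estimate_signal(target_mag, n_iter=50, n_fft=256, hop=64, seed=0):
    """Find a signal whose STFT magnitude approximates target_mag."""
    rng = np.random.default_rng(seed)
    phase = np.exp(2j * np.pi * rng.random(target_mag.shape))  # random initial phase
    for _ in range(n_iter):
        x = istft(target_mag * phase, n_fft, hop)
        # Keep the phase of the re-analysed signal, discard its magnitude.
        phase = np.exp(1j * np.angle(stft(x, n_fft, hop)))
    return istft(target_mag * phase, n_fft, hop)

def spectral_error(x, target_mag):
    """Relative distance between the STFT magnitude of x and the target."""
    return np.linalg.norm(np.abs(stft(x)) - target_mag) / np.linalg.norm(target_mag)

# Demo: discard the phase of a test tone, then rebuild a signal from magnitudes only.
fs = 8000
t = np.arange(2048) / fs
original = np.sin(2 * np.pi * 440 * t) * np.hanning(len(t))
target_mag = np.abs(stft(original))
first_guess = estimate_signal(target_mag, n_iter=0)
rebuilt = estimate_signal(target_mag, n_iter=50)
```

Comparing `spectral_error(first_guess, target_mag)` with `spectral_error(rebuilt, target_mag)` shows how the iterations bring the magnitude spectrum of the estimate closer to the processed one, even though the original phase is never recovered exactly.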

Overview of Voice Morphing

[Figure: formant structure of each of the two signals, and their cepstral form]

Applications
Military applications: Voice morphing is a powerful battlefield weapon which can be used to provide fake orders to the enemy's troops, appearing to come from their own commanders.

Interesting Fact: Voice morphing technology was used by the U.S. military during the war with Iraq to mislead enemy forces.

Applications
Medical applications:

Voice restoration systems
Training interfaces for speech pathologies

Entertainment
Voice editing and dubbing: regenerating the voices of actors and actresses who are no longer alive or who have lost their voice to old age or illness.
Creation of peculiar voices in animated films.

Limitations and Challenges


There are many open problems in voice conversion, which have been identified in several previous articles:



Phonetic Issues
Prosody Issues
Quality Issues
Similarity Issues
Evaluation Issues
Overfitting Issues

AVAILABLE SOFTWARE FOR VOICE MORPHING

MORPH VOX PRO VOICE CHANGER 2.0.6
MORPH VOX PRO VOICE CHANGER 4.2.2
MORPH VOX PRO VOICE CHANGER 4.3.8
TERA VOICE SERVER 2004
FLASH VOICE BUTTONS 3.0
VOICE TWISTER 1.0.4
VOICE AGAIN 1.5.2
QUICK VOICE FOR OSX 2.2.0
QUICK VOICE FOR WINDOWS 2.2.0

Future Scope

Improve the quality of the converted speech.
Cross-language voice conversion will be another challenge.

Extend the functionality of the tool: create a powerful and flexible morphing tool.
Increase user interaction: a graphical user interface could be designed and integrated to make the package more user-interactive.

References
[1] Ye H. and Young S., "High quality voice morphing", International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Montreal, 2004, Vol. 1, pp. 9-12.

[2] Abe M., Nakamura S., Shikano K. and Kuwabara H., "Voice conversion through vector quantization", Proc. IEEE ICASSP, 1988.

[3] T. Dutoit, A. Holzapfel, M. Jottrand, A. Moinet, J. Perez, Y. Stylianou, "Towards a voice conversion system based on frame selection", in ICASSP, pp. 513-516, 2007.

[4] Hui Ye, "High Quality Voice Morphing", Cambridge University. http://www.sshchd.org/publications/kkhe_seminar1.pdf

[5] H. Duxans, D. Erro, J. Perez, F. Diego, A. Bonafonte, A. Moreno, "Voice conversion of non-aligned data using unit selection", in TC-STAR WSST, 2006.
