Batch 10-03
K. Mounika 10881A0481
Structure of Presentation
INTRODUCTION
CLASSIFICATION
BLOCK DIAGRAMS
APPLICATIONS
LIMITATIONS AND CHALLENGES
FUTURE SCOPE
REFERENCES
Morphing algorithm
Morphed Signal
PITCH
FUNDAMENTAL FREQUENCY
HARMONICS
ENVELOPE
PHONEME PHASE, AMPLITUDE
Quality
Phonetics
Timing
The source and target speakers are required to speak the same utterances.
The system should be able to operate with target voice training data ranging from a few seconds to tens of minutes.
Classification of VM techniques
Voice-based models: Vocal Tract Length Normalization (VTLN), Formant Frequencies, and Glottal Flow models.
Mixed voice/signal models: Linear Predictive Coding (LPC), Line Spectral Frequencies (LSF), Cepstral Coefficients, and Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrum (STRAIGHT).
Signal-based models: Improved Power Spectrum Envelope (IPSE), Discrete Wavelet Transform (DWT), and Harmonic plus Noise Model (HNM).
[Block diagram: Speech signal 1 and Speech signal 2 each go through Representation Conversion and Cepstral Analysis, giving a Pitch and an Envelope for each signal. The two are combined in the Morph (Pitch Morphing) stage, followed by Signal estimation. Other labels in the figure: Estimate Transforms, Signal re-estimation.]
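The cepstral analysis stage of the block diagram can be illustrated with a minimal sketch. It assumes one windowed frame as a NumPy array and uses the textbook real cepstrum: the low-quefrency part is kept as the envelope and the strongest high-quefrency peak is read as the pitch period. The function names and the values of `cutoff`, `fmin` and `fmax` are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum: inverse DFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(frame)
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small floor avoids log(0)
    return np.fft.ifft(log_mag).real

def split_envelope_and_pitch(frame, sample_rate, cutoff=30, fmin=60.0, fmax=400.0):
    """Return (log spectral envelope, pitch estimate in Hz) for one frame.

    cutoff (>= 2), fmin and fmax are assumed, illustrative values.
    """
    n = len(frame)
    cep = real_cepstrum(frame)

    # Low-quefrency lifter: keep both symmetric ends of the cepstrum,
    # then transform back to get a smooth log-magnitude envelope.
    lifter = np.zeros(n)
    lifter[:cutoff] = 1.0
    lifter[-(cutoff - 1):] = 1.0
    log_envelope = np.fft.fft(cep * lifter).real

    # Strongest peak in the plausible pitch range gives the period in samples.
    q_lo = int(sample_rate / fmax)
    q_hi = int(sample_rate / fmin)
    period = q_lo + np.argmax(cep[q_lo:q_hi])
    return log_envelope, sample_rate / period
```

For unvoiced frames there is no clear cepstral peak, so a practical system would also need a voicing decision; that step is left out here.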
Pre-Processing
Pre-processing covers acquiring the signal in discrete (sampled) form and windowing it.
Signal Acquisition
Windowing
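A rough sketch of the windowing step, assuming the signal has already been acquired as an array of samples (for example, read from a WAV file). The frame length, hop size and the choice of a Hann window are assumptions made for illustration; the slides do not specify them.

```python
import numpy as np

def frame_and_window(signal, frame_len=512, hop=128):
    """Split a sampled signal into overlapping frames and window each one."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)  # assumed window type
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return frames  # shape: (n_frames, frame_len)
```

Overlapping frames (hop smaller than the frame length) keep the analysis smooth across frame boundaries; trailing samples that do not fill a whole frame are dropped in this sketch.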
Signal re-estimation
Because the signals are transformed into the cepstral domain, a magnitude function is used, so the phase information is lost from the representation of the data. It is therefore necessary to estimate a signal whose magnitude DFT is close to the processed magnitude DFT.
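One common way to estimate a signal from magnitudes alone is an iterative scheme in the style of Griffin and Lim: start from random phases, resynthesise, re-analyse, and keep only the new phases while restoring the target magnitudes. The sketch below is an assumption about how this step could be done, not necessarily the method used in this presentation. The helper names `stft`, `istft` and `estimate_signal` are hypothetical.

```python
import numpy as np

def stft(x, window, hop):
    """Short-time DFT: one rfft per windowed frame."""
    n = len(window)
    starts = range(0, len(x) - n + 1, hop)
    return np.array([np.fft.rfft(x[s:s + n] * window) for s in starts])

def istft(frames, window, hop):
    """Least-squares overlap-add resynthesis from short-time spectra."""
    n = len(window)
    length = n + (len(frames) - 1) * hop
    x = np.zeros(length)
    norm = np.zeros(length)
    for i, spec in enumerate(frames):
        s = i * hop
        x[s:s + n] += np.fft.irfft(spec, n) * window
        norm[s:s + n] += window ** 2
    return x / np.maximum(norm, 1e-8)

def estimate_signal(target_mag, window, hop, n_iter=50, seed=0):
    """Estimate a signal whose short-time magnitude DFT is close to target_mag."""
    rng = np.random.default_rng(seed)
    phase = np.exp(2j * np.pi * rng.random(target_mag.shape))  # random start
    for _ in range(n_iter):
        x = istft(target_mag * phase, window, hop)
        # Keep the phases of the re-analysed signal, discard its magnitudes.
        phase = np.exp(1j * np.angle(stft(x, window, hop)))
    return istft(target_mag * phase, window, hop)
```

The result generally matches the target magnitudes only approximately, and the recovered phase is not unique, which is one reason re-estimation affects output quality.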
[Figure: cepstral form, with two formants labelled]
Applications
Military applications: Voice morphing can serve as a battlefield weapon, used to give enemy troops fake orders that appear to come from their own commanders.
Interesting Fact: Voice morphing technology was reportedly used by the U.S. military during the war with Iraq to mislead the opposing forces.
Medical applications:
Entertainment
Voice editing and dubbing: regenerating the voices of actors and actresses who are no longer alive or who have lost their voice to old age or illness.
Creating peculiar voices for animated films.
Limitations and Challenges
Phonetic Issues
Prosody Issues
Quality Issues
Similarity Issues
Evaluation Issues
Overfitting Issues
Future Scope
Improve the quality of the converted speech.
Cross-language voice conversion will be another challenge.
Extending the functionality of the tool.
- Create a powerful and flexible morphing tool.
Increased user interaction.
- A graphical user interface could be designed and integrated to make the package more interactive.