Why Speech?
No visual contact required
No special equipment required
Can be done while doing other things
Telephones
AT&T
Mobile Phones (1G and 2G)
Speech Processing
Speech Coding
Speech Synthesis
Speech Recognition
Speaker Recognition/Verification
Dyslexia and Auditory problems
Audio Engineering
Speech Coding
Compress a Speech File
Why not use standard compression techniques?
MP3 Format
Perceptual Coding: exploits sensory organ biases
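One way to see what "exploiting sensory organ biases" can mean is mu-law companding, the logarithmic quantizer used in G.711 telephony. It is not named on the slide and it is much simpler than the psychoacoustic masking model used by MP3; it is offered here only as a minimal sketch of the idea that hearing is roughly logarithmic in loudness, so quiet samples deserve finer quantization steps than loud ones. The function names and the choice of 8-bit codes are assumptions for illustration.

```python
import math

MU = 255  # companding constant used by 8-bit mu-law telephony


def mu_law_encode(x: float, mu: int = MU) -> int:
    """Map a sample in [-1.0, 1.0] to an integer code in 0..mu."""
    x = max(-1.0, min(1.0, x))  # clip out-of-range input
    # Logarithmic compression: small amplitudes get finer resolution.
    y = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    # Shift from [-1, 1] to [0, mu] and round to the nearest code.
    return int(round((y + 1.0) / 2.0 * mu))


def mu_law_decode(code: int, mu: int = MU) -> float:
    """Invert mu_law_encode, returning a sample in [-1.0, 1.0]."""
    y = 2.0 * code / mu - 1.0
    # Exponential expansion undoes the logarithmic compression.
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


# Round-trip a quiet sample and a loud sample through the 8-bit code.
quiet = mu_law_decode(mu_law_encode(0.01))
loud = mu_law_decode(mu_law_encode(0.9))
print(quiet, loud)
```

The quiet sample comes back with a much smaller absolute error than a uniform 8-bit quantizer (step size about 0.008) could guarantee, which is the trade a general-purpose compressor has no reason to make.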
Speech Synthesis
Construct Speech waveform from words
Speaker Quality and Accent
Prosody?
http://www.research.att.com/~ttsweb/tts/demo.php
Speech Recognition
Convert a sound waveform to words
The most relevant and important task in the industry
90% accuracy in lab conditions, much lower in factory conditions
Sphinx by CMU, ViaVoice by IBM, and an SDK by Microsoft
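Recognition performance of this kind is commonly reported through the word error rate (WER): the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The slide does not say how its 90% figure was measured, so the sketch below assumes a word-level metric; the function name and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One dropped word and one misrecognized word out of four.
wer = word_error_rate("turn the machine off", "turn machine of")
print(f"WER = {wer:.2f}, word accuracy = {1 - wer:.2f}")
```

Under this metric, a recognizer at 90% accuracy gets roughly one word in ten wrong, and noise of the kind found on a factory floor pushes the error count up through extra substitutions and deletions.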
Speaker Recognition
Concerned with Biometrics
Acceptable as a verification technique
How would this be different from Speech recognition?
Speaker Quality
Prosody
Pitch, Accent etc.
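Of the speaker cues listed above, pitch is the easiest to measure directly. The sketch below estimates the fundamental frequency of a short frame by picking the lag with the largest autocorrelation. This is one common textbook approach, not something the slides prescribe; the function name, the 50 to 400 Hz search range, and the synthetic test tone are all assumptions for illustration.

```python
import math


def estimate_pitch(samples, sample_rate, f_min=50.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of one frame by autocorrelation."""
    lag_min = int(sample_rate / f_max)  # shortest period considered
    lag_max = int(sample_rate / f_min)  # longest period considered
    lag_max = min(lag_max, len(samples) - 1)

    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(
            samples[n] * samples[n + lag] for n in range(len(samples) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Synthetic 200 Hz tone, 0.1 s long, sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
f0 = estimate_pitch(tone, sample_rate)
print(f0)
```

Real speech is noisier than a pure tone, so a practical system would window the frame, normalize the correlation, and smooth the estimate over time before using it as a speaker feature.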
Audio Engineering
Adding effects to sound
Clarity of reproduction
A big industry with players like Dolby, Bose, Philips etc.
Voice Morphing!
[Voice morphing samples: SOURCE, TARGET, CONV 1, CONV 2]
ASR: Problems
ASR: Method
ASR: Application
Speech Production