
Lecture 3 Vocal Organs and Articulatory Phonetics

Overview
How is speech produced by humans?

Vocal organs
  Lungs and trachea (windpipe)
  Larynx (voicebox)
  Vocal tract

Speech generation
  Excitation and modulation

Articulatory phonetics
  Phonetic alphabets
  Consonants: manner and place of articulation
  Vowels

Continuous speaking effects

Phonology

MRI movie of speech

The vocal organs

Vocal tract (nasal cavity, oral cavity, pharynx): modulates the waveform from the larynx
Vocal folds (within the larynx): vibrate to give pitch
Trachea and lungs: provide energy for speech production

Text copyright J. J. Ohala, Sept 2001, from Sharon Rose slide

Lungs and trachea


Speech generation relies on a supply of compressed air on which the different speech sounds are modulated.
This compressed air is produced in the lungs and delivered through the trachea to the larynx and the rest of the vocal organs.
Non-speech breathing normally consists of roughly equal inhalation and exhalation periods, occurring about 17 times per minute.
When speaking, the blood must still be oxygenated, so breathing tends to occur as short inhalations and long exhalations, with the expelled air being used to generate speech (egressive sounds).
It is also possible to make ingressive sounds: a snore, for example, is a pulmonic ingressive uvular trill.
The main effect the lungs have on the output speech is on its loudness.

Larynx
The larynx is a continuation of the trachea, but contains a highly specialised cartilage structure and associated muscles.
The most important parts of the larynx are:
1. The vocal folds (cords)
2. The arytenoid cartilages
The arytenoid cartilages are the main controllers of the vocal folds.
Breathing: vocal folds held open.
Swallowing: vocal folds closed.

The vocal folds stretch across the larynx and when closed, separate the larynx from the trachea. The opening made by the vocal folds is known as the glottis.

Vocal Fold Vibration I


1. When the air pressure below the closed vocal folds (sub-glottal pressure) is high enough, the folds are forced open (1-5 on diagram).
2. As the flow between the folds increases, the pressure drops (the Venturi effect) and the folds come together again (6-10 on diagram).
3. Back to step 1.

Vocal Fold Vibration II


Varying the tension on the vocal folds affects the rate at which they open and close, and this in turn affects the frequency (pitch) of the resulting speech signal.

Pitch period, T

The frequency of the pitch signal is f = 1/T, where T is the pitch period.
Speech sounds produced in this way are known as voiced.
When the tension on the vocal folds is reduced, they become quite loose. Air forced from the lungs then creates a turbulent airflow rather than pulses.
This is known as unvoiced speech.
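The relation f = 1/T can be checked with a few lines of Python. This is only a minimal sketch: the function name `pitch_from_period` and the 5.5 ms example value are illustrative choices, not part of the lecture's own material at this point.

```python
def pitch_from_period(period_s: float) -> float:
    """Return the pitch frequency in Hz for a pitch period in seconds (f = 1/T)."""
    if period_s <= 0:
        raise ValueError("pitch period must be positive")
    return 1.0 / period_s


# Illustrative value: a pitch period of 5.5 ms.
f0 = pitch_from_period(0.0055)
print(f"{f0:.1f} Hz")  # 181.8 Hz
```

A shorter pitch period (tenser vocal folds) gives a higher pitch, and a longer period gives a lower one.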

Unvoiced Sounds
Whispering occurs when the vocal folds are moved close together but with a small opening remaining.
Air passing through this opening from the lungs creates turbulent airflow, and the resulting excitation signal resembles wideband noise.

Excitation signal
There are no regular pulses of air, so the excitation has no specific pitch period.
Speech sounds produced by this form of excitation are termed unvoiced.
Examples are /s/ as in step and /f/ as in from.
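The contrast between the two kinds of excitation can be sketched in Python. This is a simplified illustration, not a physiological model: the voiced case is idealised as a train of unit pulses, the unvoiced case as uniform random noise, and the function names, sample rate and pitch are all assumed values.

```python
import random


def voiced_excitation(n_samples, sample_rate_hz, f0_hz):
    """Idealised voiced excitation: one unit pulse at the start of every pitch period."""
    period = round(sample_rate_hz / f0_hz)  # pitch period in samples
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def unvoiced_excitation(n_samples, seed=0):
    """Idealised unvoiced excitation: wideband noise with no pitch period."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


# Assumed values: 8 kHz sampling, 100 Hz pitch, 0.1 s of signal.
voiced = voiced_excitation(800, 8000, 100)
unvoiced = unvoiced_excitation(800)

pulse_positions = [n for n, x in enumerate(voiced) if x == 1.0]
print(pulse_positions[:4])  # [0, 80, 160, 240]
```

The voiced signal has pulses at a regular spacing (the pitch period), while the noise signal has no such regularity.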

Laryngograph
A laryngograph is an instrument for recording the movements of the vocal folds during speech.
A small high-frequency signal (3 MHz) is transmitted across the larynx. As the vocal folds move, the conductance seen by this signal changes, and this change is recovered by the laryngograph.


Vocal Tract
The vocal tract is normally considered to be everything past the vocal folds.
The vocal tract can be divided into the following regions:
1. Oral cavity
2. Nasal cavity
3. Pharynx
It contains a number of articulators:
a. Tongue
b. Soft palate (velum)
c. Hard palate
d. Teeth
e. Alveolar ridge
f. Lips


Vocal Tract
The vocal tract can be considered to be a series of connected cavities, each with its own resonant frequencies.
The movement of the articulators (tongue, velum) changes the physical shape of these cavities and hence the resonant properties of the vocal tract.
As the air from the larynx flows through the vocal tract, the resonances of these cavities shape the air flow and so create different speech sounds.
Using different components of the vocal tract has different effects on the produced speech signal. Moving the tongue changes the shape of the oral cavity and hence its resonant properties. The nasal cavity can also be closed off by the velum.

A model of speech generation


We have now considered the physical operation of the vocal organs.
Their operation in generating speech can be divided into excitation and modulation, with the lungs supplying the energy.
Excitation is the process of taking the air flow from the lungs and building from it a suitable flow of air which can be used as the input to the modulation process.
Modulation occurs in the vocal tract and is the process of using the articulators to impart information onto the excitation signal.

Air flow (lungs) -> Excitation (vocal cords) -> Modulation (vocal tract) -> Speech
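The chain from excitation to speech can be sketched as a pulse train passed through a single resonance. This is a heavily simplified illustration: the two-pole resonator stands in for one resonance of the vocal tract, and the function name, the 500 Hz centre frequency, the 100 Hz bandwidth and the 8 kHz sample rate are all assumptions chosen for the example.

```python
import math


def resonator(signal, centre_hz, bandwidth_hz, sample_rate_hz):
    """Two-pole resonator: y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate_hz)  # pole radius
    theta = 2.0 * math.pi * centre_hz / sample_rate_hz      # pole angle
    a1 = 2.0 * r * math.cos(theta)
    a2 = -r * r
    out = []
    y1 = y2 = 0.0  # previous two output samples
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


sample_rate = 8000
# Excitation: idealised voiced source, one pulse every 80 samples (100 Hz pitch).
excitation = [1.0 if n % 80 == 0 else 0.0 for n in range(800)]
# Modulation: one assumed vocal tract resonance at 500 Hz.
speech = resonator(excitation, centre_hz=500, bandwidth_hz=100,
                   sample_rate_hz=sample_rate)
```

Each pulse of the excitation sets the resonator ringing at its own frequency, so the output keeps the pitch of the excitation but takes its tone colour from the resonance.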


Modulation
Modulation is the process of imposing information on the excitation, or glottal, waveform.
Considering the effect of modulation from a physiological point of view is termed articulatory phonetics, while considering modulation in terms of its acoustic properties is known as acoustic phonetics.
Articulatory phonetics considers how changes in the vocal organs change the resulting speech sounds. These changes are mainly effected by the tongue, but also by other components such as the velum, teeth and lips.
Acoustic phonetics considers the modulation as a filtering operation applied to the excitation waveform. The vocal tract can be thought of as an acoustic tube which has various resonant frequencies.
These resonant frequencies can be adjusted by the articulators in the vocal tract and are known as formants.
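As a rough illustration of the acoustic tube idea, the resonances of a uniform tube closed at one end (the glottis) and open at the other (the lips) fall at odd multiples of c/4L. The sketch below assumes this textbook idealisation, a tube length of 17 cm and a speed of sound of 340 m/s; none of these values comes from the lecture, and a real vocal tract is not a uniform tube.

```python
def uniform_tube_formants(length_m=0.17, speed_of_sound_m_s=340.0, count=3):
    """Resonant frequencies in Hz of a uniform tube closed at one end:
    F_n = (2n - 1) * c / (4 * L) for n = 1, 2, 3, ...
    """
    return [(2 * n - 1) * speed_of_sound_m_s / (4.0 * length_m)
            for n in range(1, count + 1)]


print([round(f) for f in uniform_tube_formants()])  # [500, 1500, 2500]
```

Moving the articulators changes the shape of the tube, which shifts these resonances away from the evenly spaced values of the uniform case.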

Illustration of Speech Generation I


Analysis of the speech production mechanism shows that the resultant speech output is determined by modulating the excitation signal with the vocal tract.
Consider the stationary vowel sound at the end of the utterance "three", which is voiced speech.
Excitation (glottis) -> Modulation (vocal tract) -> Speech
Pitch period = 5.5 ms, so pitch F0 = 1/T ≈ 182 Hz
[Figure: excitation signal; frequency response of vocal tract, with formants F1, F2, F3, F4 marked; speech signal]


Illustration of Speech Generation II


Consider now an unvoiced sound: the /th/ sound at the start of the utterance "three".
Excitation (glottis) -> Modulation (vocal tract) -> Speech
Excitation consists of wideband noise: no pitch
Much less formant structure
Output speech also noise-like
[Figure: excitation signal; frequency response of vocal tract; speech signal]

Articulatory phonetics and phonology


Phonetics: relates to the anatomy and physics required to understand how different speech sounds are produced. This is independent of any particular language.
Phonology: the study of the organisation of speech sounds in relation to a particular language (phonology is to the speech sounds in a language as syntax is to the words of a language).
Articulatory phonetics: the study of how speech sounds are produced in relation to anatomical details, in particular the position of the vocal organs when producing particular speech sounds.
Acoustic phonetics: the study of observable and measurable characteristics of speech sounds, with particular focus on how to distinguish between different sounds. It provides important background for speech recognition and synthesis.

Phonemes and phones


A phoneme is an abstract signalling unit: if two speech sounds differentiate two words, they are said to be different phonemes.
Consider pat and bat: the sounds at the start of these words differentiate them in English, so they are different phonemes, written as /p/ and /b/.
In some languages these two sounds might not be differentiated, so they would not be separate phonemes. They are still different sounds, however, and are regarded as phones rather than phonemes.
In Japanese, [r] and [l] are not distinct phonemes, so Japanese speakers often find it difficult to recognise and pronounce the difference.
A phoneme is actually a set of acoustically similar sounds which, for a given language, are accepted as conveying the same meaning. Members of the same set are called allophones.
Consider the phoneme /k/ at the beginning of kin and cup. Both represent the same phoneme, but sound slightly different.
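The pat/bat test for distinct phonemes is a minimal-pair check: two words whose sound sequences differ in exactly one position. A small sketch of that check follows. The function name is hypothetical, and the transcriptions are written by hand in SAMPA-style symbols, with "{" used for the vowel of pat.

```python
def is_minimal_pair(word_a, word_b):
    """True if two phone sequences have the same length and differ in exactly one position."""
    if len(word_a) != len(word_b):
        return False
    differences = sum(1 for a, b in zip(word_a, word_b) if a != b)
    return differences == 1


# Hand-written transcriptions (assumed for illustration).
pat = ["p", "{", "t"]
bat = ["b", "{", "t"]

print(is_minimal_pair(pat, bat))  # True
```

If a language has a minimal pair that differs only in two given sounds, those sounds belong to different phonemes in that language.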

International Phonetic Alphabet (IPA)


Examples of symbols and their associated sounds.


SAMPA symbols for British English


Symbol   Example word
p        pear
b        bear
f        fear
v        very
S        sheer
Z        treasure
tS       cheer
dZ       jeer
@        ago
I        bit
V        bud
T        thing
D        this
U        good
u        boot
3        bird

Some symbols appear logical, others not so.
Complete set available. See, for instance: http://www.ims.uni-stuttgart.de/projekte/mate/mdag/pd/pd_2.htm
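Because SAMPA uses only plain keyboard characters, a transcription can be split into symbols with ordinary string handling. The sketch below covers only the symbols listed on this slide, not the complete SAMPA set, and the tokenizer function is a hypothetical helper written for illustration.

```python
# Symbol -> example word, restricted to the symbols listed on this slide.
SAMPA_EXAMPLES = {
    "p": "pear", "b": "bear", "f": "fear", "v": "very",
    "S": "sheer", "Z": "treasure", "tS": "cheer", "dZ": "jeer",
    "@": "ago", "I": "bit", "V": "bud",
    "T": "thing", "D": "this", "U": "good", "u": "boot", "3": "bird",
}


def tokenize_sampa(transcription):
    """Split a transcription into symbols, trying two-character symbols first."""
    symbols = sorted(SAMPA_EXAMPLES, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(transcription):
        for sym in symbols:
            if transcription.startswith(sym, i):
                tokens.append(sym)
                i += len(sym)
                break
        else:
            # No listed symbol matches at this position.
            raise ValueError(f"unknown symbol at position {i}: {transcription[i]!r}")
    return tokens


print(tokenize_sampa("tSIp"))  # ['tS', 'I', 'p']
```

Matching the longer symbols first keeps "tS" and "dZ" together as single units.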


Consonants and vowels


Now that we know how speech sounds are produced by the vocal organs, we will examine specifically the production of consonants and vowels.


Consonants
Consonants are relatively easy to define in anatomical terms. They are principally distinguished by:
1. Place of articulation
2. Manner of articulation
3. Voicing or phonation


Place of articulation
This is the place in the vocal tract where the major constriction occurs.
It is defined in terms of both the active and the passive articulators.

     Name           Description                                              Example
 1.  Bilabial       Between lips                                             pea
 2.  Labio-dental   Lower lip to upper teeth                                 fee
 3.  Dental         Front of tongue between teeth                            thigh
 4.  Alveolar       Front of tongue to alveolar ridge                        see
 5.  Alveopalatal   Front of tongue between alveolar ridge and hard palate   she
 6.  Palatal        Middle of tongue to hard palate                          you
 7.  Velar          Root of tongue to rear of mouth                          key
 8.  Uvular         Tongue to uvula                                          French r (none in English)
 9.  Pharyngeal     Tongue to pharynx                                        Arabic (none in English)
10.  Glottal        Constriction at glottis                                  "sort of" with the t as a glottal stop (Cockney)

Manner of articulation
Manner of articulation is characterised by the degree of constriction and the manner of its release into the following sound.


Voicing
Voicing indicates the presence or absence of phonation.
The previous tables have shown that consonants can have voiced or unvoiced forms, depending on phonation.
Examples are voiced zzz and unvoiced sss.
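The three-way description of a consonant (place, manner, voicing) maps naturally onto a small record type. The sketch below is an illustration only: the class and function names are hypothetical, the handful of consonants is a sample rather than a full inventory, and the manner labels "plosive" and "fricative" are standard terms supplied here as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consonant:
    symbol: str
    place: str    # place of articulation
    manner: str   # manner of articulation
    voiced: bool  # presence of phonation


# A small sample of English consonants (not a complete inventory).
CONSONANTS = [
    Consonant("p", "bilabial", "plosive", False),
    Consonant("b", "bilabial", "plosive", True),
    Consonant("f", "labio-dental", "fricative", False),
    Consonant("v", "labio-dental", "fricative", True),
    Consonant("s", "alveolar", "fricative", False),
    Consonant("z", "alveolar", "fricative", True),
]


def voicing_partner(consonant):
    """Return the consonant with the same place and manner but opposite voicing, if listed."""
    for other in CONSONANTS:
        same_articulation = (other.place, other.manner) == (consonant.place, consonant.manner)
        if same_articulation and other.voiced != consonant.voiced:
            return other
    return None


s = CONSONANTS[4]
print(voicing_partner(s).symbol)  # z
```

Pairs such as s/z share place and manner and differ only in voicing, which is the sss/zzz contrast.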


Vowels
Vowels are less well defined than consonants. The tongue rarely touches another organ in the vocal tract, so there are no specific points of articulation.
Vowels can be roughly described in terms of their position in the mouth by four variables:
1. Tongue high or low
2. Tongue front or back
3. Lips rounded or unrounded
4. Nasalised or unnasalised

Considering only the position of the tongue, a vowel diagram can be used to show the location of different vowels.
[Vowel diagram: horizontal axis Front to Back, vertical axis High, Middle, Low; example words heed, hid, hayed, head, had, ahead, bud, hawed, hoed, who'd; symbols i, e, u, o]

Vowels
The vowels shown in the diagram are the most common in English, but the diagram is not entirely consistent.
Front vowels are pronounced with the lips unrounded (heed). Back vowels are pronounced with the lips rounded (who'd).
This is customary in English, but not in all languages. It leads to two vowel diagrams to accommodate the differences: one for rounded vowels and the other for unrounded vowels.
The neutral vowel [ə] is one of the most commonly used and is given the name schwa. Examples of schwa are the start of above and the end of soda.
The velum determines whether a vowel is nasalised or unnasalised. When the velum is open, this opens up the nasal cavity, which can impart more structure onto the vowel sound.
In English, no distinction is made between nasalised and unnasalised vowels. However, in some languages (for example French) the difference is important.
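The four variables used to describe a vowel can be held in a small record, which makes it easy to see which features separate two vowels. This is a sketch with hypothetical names. The feature values for heed and who'd follow the description above, with both taken to be high vowels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vowel:
    example: str
    height: str              # "high", "middle" or "low"
    backness: str            # "front" or "back"
    rounded: bool            # lips rounded or unrounded
    nasalised: bool = False  # velum open or closed


heed = Vowel("heed", "high", "front", rounded=False)
whod = Vowel("who'd", "high", "back", rounded=True)


def differing_features(a, b):
    """List the descriptive variables on which two vowels differ."""
    names = ("height", "backness", "rounded", "nasalised")
    return [name for name in names if getattr(a, name) != getattr(b, name)]


print(differing_features(heed, whod))  # ['backness', 'rounded']
```

In English, backness and lip rounding change together, which is why a single diagram works for English but two are needed for languages where they vary independently.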

Continuous speaking effects


It is useful to analyse speech generation using phonetics, but in reality phones are not produced accurately in the context of other phones.
Each phone can be considered a target at which the vocal organs aim but which, in reality, they never actually reach. Typically, once the target has been approached closely enough to be intelligible, the organs change their destination to the next phone.
Coarticulation is the altering of phonemes as a result of their neighbouring sounds. For example, "I went by bus" is typically spoken as "I wemp by bus". The changes are /n/ -> /m/ and /t/ -> /p/, because of the bilabial /b/ in "by".
Some words change if the following word begins with a vowel, e.g. "the" in "the mouse" and "the elephant".
Some words are shortened:
you and me -> you an me
I must see you -> I ms ee you
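The "went by" example can be written as a simple rewrite rule: a sound moves to the bilabial place when the sound after it is bilabial. The sketch below is a toy illustration of that one rule, not a general model of coarticulation. The names are hypothetical and the phone symbols are informal SAMPA-style labels chosen for the example.

```python
BILABIAL = {"p", "b", "m"}
# Bilabial counterparts for the two changes in the example: /n/ -> /m/, /t/ -> /p/.
TO_BILABIAL = {"n": "m", "t": "p"}


def assimilate_place(phones):
    """Apply bilabial place assimilation, working from right to left so that
    a changed sound can in turn affect the sound before it."""
    out = list(phones)
    for i in range(len(out) - 2, -1, -1):
        if out[i + 1] in BILABIAL and out[i] in TO_BILABIAL:
            out[i] = TO_BILABIAL[out[i]]
    return out


# "went by" as an informal phone sequence.
went_by = ["w", "e", "n", "t", "b", "aI"]
print(assimilate_place(went_by))  # ['w', 'e', 'm', 'p', 'b', 'aI']
```

Working right to left matters here: /t/ first becomes /p/ because of /b/, and only then does /n/ become /m/ because of the new /p/.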

Summary
Examined vocal organs used in speech generation: lungs/trachea, larynx/vocal cords, vocal tract
Considered speech generation in terms of an excitation signal and its modulation by the vocal tract
Looked at differences between voiced and unvoiced sounds
Articulatory phonetics: IPA alphabet of speech sounds
Vowels and consonants: position of articulators to produce the different speech sounds
Continuous speaking effects and coarticulation

