Pronunciation in Action


The top-down approach: American English pronunciation

People learn pronunciation best in whole fixed phrases, like the lyrics of a song.  Learning the whole phrase rather than the individual words imprints the rhythm, melody, and linking of a phrase.

Judy B. Gilbert

You can study English reading, writing and grammar for many years, but when you begin to talk with English speakers from other countries, especially fluent speakers talking together informally, you may be surprised to find that spoken English is very different from written English.  

The reason for this is that there are several important features of spoken English which are not apparent in the written language.  Understanding these features can be a great help to English learners, but unfortunately they are not always taught in English classes.  These features make up the unique “music of English.”

Word Stress
Thought Groups
Intonation (Pitch Pattern)
Connected Speech

The suprasegmentals listed above (as opposed to segmentals, or individual sounds) work together to “package” American English in a way that can be easily processed and understood by fluent speakers.  Speaking English without them, that is, pronouncing each word distinctly and separately as written, can actually make an English learner less fluent and less easily understood.

In this course we will begin with the prosody of English: rhythm, stress and intonation.  It is called “The Top-Down Approach” because prosody operates at the larger levels of language: phrases, sentences and paragraphs.  However, we will also work “from the bottom up,” with problematic consonant and vowel sounds as they occur in the larger segments of language.

* * * * *

Word Stress

Because identifying word stress is so important for communication in English, fluent speakers use a combination of signals to show which syllable in a word is stressed.  The most important signals are the length and clarity of the vowel in the stressed syllable.  Equally important for contrast is weakening the unstressed syllables by reducing the length and clarity of their vowels.

Thought Groups

Perhaps the most important way English speakers help their listeners understand them is by breaking the continuous string of words into groups of words that belong together.  These smaller groups are easier to say, and can be processed more easily by the listener.  A thought group can be a short sentence or part of a longer sentence, and each thought group contains a “focus word” (most important word) that is marked by a change in pitch.  Understanding thought groups can also help improve reading comprehension.


Intonation (Pitch Pattern)

English depends mainly on intonation, or pitch pattern (“melody”), to help the listener notice the most important (focus) word in a thought group.  By making a major pitch change (higher or lower) on the stressed syllable of the focus word, the speaker gives emphasis to that word and thereby highlights it for the listener.  This emphasis can indicate meaning, new information, contrast, or emotion.

We also use intonation to help the listener know what is ahead.  The pitch stays up between thought groups (to show that more is coming), and usually goes down to show the end of a sentence (except in yes/no questions, which typically end with a rise).


Rhythm

We learn the rhythm of our native language in the first months of life, and tend to mistakenly apply that rhythm to any new language we learn.  It is important to learn the unique rhythm of each language.  English is one of the “stress-timed” languages, and the basic unit of English rhythm is the syllable.

The rhythm of English is largely determined by the “beats” falling on the stressed syllables of certain words in phrases and sentences.  Stressed and unstressed syllables occur in relatively regular alternating patterns in both phrases and multi-syllable words.  In phrases, “content words” (words that carry meaning, such as nouns, main verbs, adjectives and adverbs) rather than “function words” (words with grammatical function only, such as articles, prepositions and auxiliary verbs) usually receive the stress.


Reduction

Reduction helps highlight important syllables in yet another way: by de-emphasizing unstressed syllables.  The vowel in an unstressed syllable is reduced in both length and clarity.  The most common reduced vowel sound in English is the “schwa” (/ə/).  Though represented by many different spellings, the schwa is always a short, completely relaxed and open sound (like the vowel in the second syllable of “pizza”).

Contractions are another example of reduction.  They reduce the number of syllables and eliminate some vowels completely (I am/I’m, you are/you’re, etc.).

Connected Speech

Connected speech is a general term for the adjustments native speakers make between words, “linking” them so they become easier to pronounce.  Words that English learners might easily understand in isolation can sometimes be unrecognizable in connected speech.  Likewise, English learners trying to pronounce each word separately and distinctly, as it is written, sometimes make it harder for native listeners to understand them.

Adapted from:

Chun, Dorothy M. (2002) Discourse Intonation in L2.
Gilbert, Judy B. (2005) Clear Speech.
Goodwin, Janet (2001) “Teaching Pronunciation.” Teaching English as a Second or Foreign Language, Third Edition. Marianne Celce-Murcia, Ed.