
Chapter 2 Speech Sounds

2.1 Speech production and perception


Human beings are capable of making all kinds of sounds, but only some of these
sounds have become units in the language system. Language is first and foremost a
"system of vocal symbols", as we have seen in the discussion of language. Speech
sounds had existed long before writing was invented, and even today, in some parts of
the world, there are still languages that have no writing systems. Therefore, the study of
speech sounds is a major part of linguistics and, in this chapter, we will look at ways of
studying speech sounds and the patterns in which they are used.
We will begin with the study of sounds, which is called "PHONETICS", and then go
on to the study of sound patterns, "PHONOLOGY". A speech sound is articulated by
Speaker A. It is then transmitted to and received by Speaker B. Consequently, a
speech sound goes through a three-step process, as shown in Fig. 2.1 below.

[Figure: Speech Production (Speaker A) -> Speech Perception (Speaker B)]

Fig. 2.1 The process of speech production and perception

Naturally, the study of sounds is divided into three main areas, each dealing with one
part of the process.
ARTICULATORY PHONETICS is the study of the production of speech sounds.
ACOUSTIC PHONETICS is the study of the physical properties of the sounds
produced in speech.
AUDITORY PHONETICS is concerned with the perception of speech sounds.

For the purpose of this book, we will concentrate only on Articulatory Phonetics,
which deals with how sounds are produced, and leave the other two areas aside.

2.2 Speech organs


SPEECH ORGANS, as shown in Fig. 2.2, are also known as VOCAL ORGANS. They
are those parts of the human body involved in the production of speech. The organs,
however, are not used for speech alone, as their primary function is to fulfill the basic
biological needs of breathing and eating. In spite of this, there seems to have been
considerable evolutionary justification for them to fulfill the special task of speech, as they
have been formed in such a way that they can function efficiently for the act of speech.
It is striking to see how much of the human body is involved in the production of
speech: the LUNGS, the TRACHEA (or windpipe), the THROAT, the NOSE, and the
MOUTH. Inside the mouth, we need to distinguish the TONGUE and various parts of
the PALATE while, inside the throat, we have to distinguish the PHARYNX, the upper
part, from the LARYNX, the lower part containing the VOCAL FOLDS (or VOCAL
CORDS). The pharynx, mouth, and nose form the three cavities of the VOCAL TRACT.
The mouth and the nose are often referred to, respectively, as the ORAL CAVITY and
the NASAL CAVITY.

Speech sounds are produced with an AIRSTREAM as their source of energy. In most
circumstances, the airstream comes from the lungs. It is forced out of the lungs and then
passes through the BRONCHIOLES and BRONCHI, a series of branching tubes, into
the trachea. Sounds that are produced in this way are called PULMONIC sounds.
At the top of the trachea is the LARYNX, the front of which is the Adam's Apple. The
larynx contains two pairs of structures, the VOCAL FOLDS and the VENTRICULAR FOLDS.
The vocal folds lie horizontally below the latter and their front ends are joined together at
the back of the Adam's Apple. Their rear ends, however, remain separated and can move
into various positions: inwards, outwards, forwards, backwards and, tilting slightly,
upwards and downwards. For most phonetic purposes, it is sufficient to say that the vocal
folds are either (a) apart, (b) close together, or (c) totally closed.
When the vocal folds are apart, the air can pass through easily and the sound
produced is said to be VOICELESS. The consonants [p, s, t] are produced in this
way. (Fig. 2.3)
When they are close together, the airstream causes them to vibrate against each
other and the resultant sound is said to be VOICED. [b, z, d] are voiced
consonants. (Fig. 2.4)
When they are totally closed, no air can pass between them. The result of this
gesture is the glottal stop [ʔ]. (Fig. 2.5)
Fig. 2.5 Positions of the vocal folds: glottal stop
(Source: Roca & Johnson, 1999:22)
The larynx opens into a muscular tube, the PHARYNX, part of which can be seen in a
mirror. The upper part of the pharynx connects to the oral and nasal cavities.
The contents of the mouth are very important for speech production. Starting from the
front, the upper part of the mouth includes the UPPER LIP, the UPPER TEETH, the
ALVEOLAR RIDGE, the HARD PALATE, the SOFT PALATE (or the VELUM), and the
UVULA. The soft palate can be lowered to allow air to pass through the nasal cavity.
When the oral cavity is at the same time blocked, a NASAL sound is produced. In English,
[m, n, ŋ] are all nasals, with the oral cavity blocked at the lips, the alveolar ridge and the
velum respectively.
The bottom part of the mouth contains the LOWER LIP, the LOWER TEETH, the
TONGUE, and the MANDIBLE (i.e. the lower jaw). In phonetics, the tongue is divided
into five parts: the TIP, the BLADE, the FRONT, the BACK and the ROOT. (In some
analyses the major part of the tongue is often referred to as the TONGUE BODY or the
DORSUM. )
Some languages contain speech sounds that are produced without an airstream from
the lungs. These sounds include EJECTIVES, IMPLOSIVES and CLICKS. Such
sounds are labeled NON-PULMONIC sounds. As these sounds do not appear in
English or Chinese, we will not go into the details of their description.
2.3 Segments, divergences, and phonetic transcription

2.3.1 Segments and divergences


We all know that English spelling does not represent its pronunciation consistently. In the
production of the word "above", for example, although the spelling suggests five sounds,
there are in fact four. When the word is said slowly, we recognize the four sound
SEGMENTS that are comparable to the "a", "b", "o" and "v" of the spelling. In this case,
the "e" is silent. George Bernard Shaw (1856-1950) highlighted the lack of precision in
English orthography by spelling the word fish as ghoti, as gh is pronounced as [f] in
enough, o as [ɪ] in women, and ti as [ʃ] in nation.
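The ghoti respelling can be worked through mechanically. The following is a minimal sketch in Python. The table GHOTI_SOURCES and the function respell are names we have made up for illustration, and the table holds only the three correspondences mentioned above, not a general spelling-to-sound rule.

```python
# Each entry pairs a spelling chunk with the sound it has in ONE particular
# source word. The table is illustrative only (an assumption for this sketch).
GHOTI_SOURCES = [
    ("gh", "f", "enough"),
    ("o", "ɪ", "women"),
    ("ti", "ʃ", "nation"),
]


def respell(sources):
    """Join the spelling chunks and their sounds into (spelling, transcription)."""
    spelling = "".join(chunk for chunk, _sound, _word in sources)
    sounds = "".join(sound for _chunk, sound, _word in sources)
    return spelling, "[" + sounds + "]"


spelling, transcription = respell(GHOTI_SOURCES)
print(spelling, transcription)
```

The point of the sketch is that each chunk is legitimate in its own word, yet the combined spelling is one that no English reader would pronounce as fish.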
The reason for this divergence between sound and symbol may seem to be
simple: as there are more sounds in English than its letters can represent, each letter
must represent more than one sound. In fact, it is much more complicated than this. In
Old English, the relation between sound and symbol was much more regular. Some of the
sounds, especially the vowels, have undergone changes in the history of English.
Additionally, many English words have been borrowed from other languages throughout
history, and such borrowings have made the irregularity of its spelling worse.
2.3.2 Phonetic transcription
The divergence between spelling and pronunciation becomes greater when we
consider the many accents of English used by people from different regions. In addition,
there are still many languages in the world that do not have a writing system of their own,
and we need to rely on a set of symbols to record their sounds too.
For these reasons, it is necessary to devise sets of symbols that can be used
for transcribing sounds in language. Several such systems are in use and, in this book,
we will introduce and use the notation system of the INTERNATIONAL PHONETIC
ALPHABET (IPA).
In 1886, the INTERNATIONAL PHONETIC ASSOCIATION was inaugurated by a small
group of language teachers in France who had found the practice of phonetics useful in
their teaching and wished to popularize their methods. It was first known as the Phonetic
Teachers' Association and was changed to its present title in 1897.
One of the first activities of the Association was to produce a journal in which the
contents were printed entirely in PHONETIC TRANSCRIPTION. The idea of establishing
a phonetic alphabet was first proposed by the Danish grammarian Otto Jespersen
(1860-1943) in 1886, and the first version of the IPA was published in August 1888. Its main
principles were that there should be a separate letter for each distinctive sound, and that
the same symbol should be used for that sound in any language in which it appears. The
alphabet was to consist of as many Roman alphabet letters as possible, using new letters
and diacritics only when absolutely necessary. These principles continue to be followed
today.
The IPA has been revised and corrected several times and is now widely used in
dictionaries and textbooks throughout the world. The present system of the IPA derives
mainly from one developed in the 1920s by the British phonetician Daniel Jones
(1881-1967) and his colleagues at the University of London. Some of its special letters
have even been accepted as part of the new orthographies devised for previously unwritten
languages.
The latest version of the IPA was revised in 1993 and corrected (updated) in 1996.
2.4 Consonants
2.4.1 Consonants and vowels
The sound segments are grouped into CONSONANTS and VOWELS. Consonants
are produced by constricting or obstructing the vocal tract at some place to divert,
impede, or completely shut off the flow of air in the oral cavity. By contrast, a vowel is
produced without such obstruction, so no turbulence or total stopping of the air can be
perceived.
Theoretically, as far as phoneticians are concerned, any segment must be either a
vowel or a consonant. If a segment is not a vowel, it is a consonant. The problematic
area is that the initial sound in hot gives little turbulence, depending on how forcefully it is
said, and in yet and wet the initial segments are obviously vowels. To get around this
problem, the usual solution is to say that these segments are neither vowels nor
consonants but midway between the two categories. For this purpose, the terms
"SEMIVOWEL" or "SEMI-CONSONANT" are often used. Other suggestions have been made,
but as these affect only a small number of segments, the distinction between vowels and
consonants will be retained for our purposes in this book.

2.4.2 Consonants
In the production of consonants at least two articulators are involved. For example,
the initial sound in bad involves both lips and its final segment involves the blade (or the
tip) of the tongue and the alveolar ridge. The categories of consonant, therefore, are
established on the basis of several factors. The most important of these factors are: (a)
the actual relationship between the articulators and thus the way in which the air passes
through certain parts of the vocal tract and (b) where in the vocal tract there is
approximation, narrowing, or obstruction of the air. The former is known as the
MANNER OF ARTICULATION and the latter as the PLACE OF ARTICULATION.
2.4.3 Manners of articulation
There are several basic ways in which articulation can be accomplished: the
articulators may close off the oral tract for an instant or a relatively long period; they may
narrow the space considerably; or they may simply modify the shape of the tract by
approaching each other.
(1) STOP (or PLOSIVE): complete closure of the articulators involved so that the
airstream cannot escape through the mouth. It is essential to separate three phases in
the production of a stop: (a) the closing phase, in which the articulators come together;
(b) the hold or compression phase, during which air is compressed behind the closure;
(c) the release phase, during which the articulators forming the obstruction come rapidly
apart and the air is suddenly released. Technically this third phase is called "plosion",
hence the name "plosive", but because of the closure involved in the production of
plosives, the alternative name "stop" is frequently used to refer to this category of
sounds.
If the air is stopped in the oral cavity but the soft palate is down so that it can go
out through the nasal cavity, the sound produced is a NASAL STOP. Otherwise it is an
ORAL STOP. Although both types of sounds are stops, phoneticians have retained the
term STOP to indicate an oral stop and used the term NASAL to indicate a nasal stop. In
English, [p, b, t, d, k, g] are stops and [m, n, ŋ] are nasals.
(2) FRICATIVE: close approximation of two articulators so that the airstream is
partially obstructed and turbulent airflow is produced. The audible friction defines this
class of sounds and thus explains the label "fricative". [f, v, θ, ð, s, z, ʃ, ʒ, h] are fricatives
in English.
(3) (MEDIAN) APPROXIMANT: an articulation in which one articulator is close to
another, but without the vocal tract being narrowed to such an extent that a turbulent
airstream is produced. The gap between the articulators is therefore larger than for a
fricative and no turbulence is generated. It is important to note that this category overlaps
with that of vowels.
(4) LATERAL (APPROXIMANT): obstruction of the airstream at a point along the
center of the oral tract, with incomplete closure between one or both sides of the tongue
and the roof of the mouth. As the lateral passage forms a stricture of open approximation
and no noise of friction is produced, it can come under the umbrella of "approximants". [l]
is the only lateral in English.
Other consonantal articulations include TRILL, TAP or FLAP, and AFFRICATE. A trill
(sometimes called ROLL) is produced when an articulator is set vibrating by the
airstream. A major trill sound is [r], as in red and rye in some forms of Scottish English.
The Spanish "rr" in perro (dog) is a trill [r]. If only one vibration is produced, i.e. the
tongue makes a single tap against the alveolar ridge, the sound is called a tap or a flap.
Affricates involve more than one of these manners of articulation in that they consist
of a stop followed immediately by a fricative at the same place of articulation.
In English, the "ch" [tʃ] of church and the "j" [dʒ] of jet are both affricates. The status
of [ts, dz, tr, dr] as affricates has been rejected for English because the first two are used
only for suffixes and foreign words, while the latter two are often realized as two sounds
in many people's speech. In Chinese, however, both [tsʰ] and [ts] are legitimate affricates
as they appear in words like "" and "".
2.4.4 Places of articulation
Consonants may be produced at practically any place between the lips and the vocal
folds. Eleven places of articulation are distinguished on the IPA chart.
(1) BILABIAL: made with the two lips. In English, bilabial sounds include [p, b, m], as
in pet, bet and met. [w], as in we and wet, involves an approximation of the two lips but is
produced slightly differently: the tongue body is raised towards the velum at the same
time, and in the IPA chart it is treated as a labial-velar approximant, outside the consonant
chart. However, as far as English is concerned, most linguists today place it under
the label "bilabial".
(2) LABIODENTAL: made with the lower lip and the upper front teeth. [f, v], as in fire
and via, are produced by raising the lower lip until it nearly touches the upper front teeth.
(3) DENTAL: made by the tongue tip or blade (depending on the accent or language)
and the upper front teeth. Only fricatives ([θ, ð]) are found to be strictly dental. Some
speakers have the tip of the tongue protruding between the upper and lower front teeth
whereas others have it close behind the upper front teeth. Both are normal in English,
and both may be called dental. The term INTERDENTAL is sometimes used to describe
the first kind in order to make a distinction.
(4) ALVEOLAR: made with the tongue tip or blade and the alveolar ridge. Sounds
produced at this place include [t, d, n, s, z, ɹ, l] for English, which is a large group of
sounds.
(5) POSTALVEOLAR: made with the tongue tip and the back of the alveolar ridge.
Such sounds include [ʃ, ʒ] as in ship and genre. In some systems, this place is also
known as palato-alveolar.
(6) RETROFLEX: made with the tongue tip or blade curled back (retroflexed) so
that the underside of the tongue tip or blade forms a stricture with the back of the alveolar
ridge or the hard palate. For English, the use of retroflex sounds, e.g. the "r" of red,
depends on accent and many speakers do not use such sounds at all.
(7) PALATAL: made with the front of the tongue and the hard palate. The only English
sound made here is [j], as in yes and yet, but many speakers do use a palatal fricative [ç]
for the "h" in he or Hugh.
(8) VELAR: made with the back of the tongue and the soft palate. In making such
sounds, the back of the tongue is raised to touch the velum. Examples in English are
the velar stops [k, g], as in cat and get, and the velar nasal [ŋ], as in sing. The pronunciation of
the Scots word loch contains a velar fricative [x] after the vowel. The initial consonant in
the Chinese word "t " is also the velar fricative [x].
(9) UVULAR: made with the back of the tongue and the uvula, the short projection of
soft tissue and muscle at the posterior end of the velum. Such sounds are not found in
standard English but uvular fricatives [χ] are occasionally heard in certain rural Northern
accents of English as realizations of the "r" in try and dry. The sounds are, however,
standard in some other languages.
(10) PHARYNGEAL: made with the root of the tongue and the walls of the pharynx.
There are few sounds at this place because of the physiological difficulty. Arabic is a
language which contains pharyngeal fricatives.
(11) GLOTTAL: made with the two vocal folds pushed towards each other.
The [h] in hat and hold is often described as a glottal fricative, although some people hold
that it may be more realistic to interpret it as a type of vowel. The glottal stop [ʔ] is formed by
bringing together the vocal folds, building up pressure behind them as for a stop and then
releasing the airstream suddenly. Because of this gesture, it is more a lack of
sound than a sound. A glottal stop is often perceived in words like fat [fæʔt] and pack
[pæʔk], and many speakers of English have it for the "t" in words like button [bʌʔn],
beaten [bi:ʔn], and fatten [fæʔn].
2.4.5 The consonants of English
As we have noticed, in many cases the pronunciation of English depends on the
individual speaker's accent and personal preference. There are different accents even
within Great Britain, let alone outside it. Although no standard has been established for
the way English should be pronounced, one form of English pronunciation is the most
common model accent in the teaching of English as a foreign language. It is referred to
as RECEIVED PRONUNCIATION (RP), and many people also call it BBC English or
Oxford English. RP originates historically in the southeast of England and is spoken by
the upper-middle and upper classes throughout England. It is widely used in the private
sector of the education system and spoken by most newsreaders of the BBC network.
Table 2.1 is a chart of English consonants as used by RP speakers.
Table 2.1 A chart of English consonants

Manner of                              Place of Articulation
Articulation  Bilabial  Labiodental  Dental  Alveolar  Postalveolar  Palatal  Velar  Glottal
Stop          p b                            t d                              k g
Nasal         m                              n                                ŋ
Fricative               f v          θ ð     s z       ʃ ʒ                           h
Approximant   w                              ɹ                       j
Lateral                                      l
Affricate                                              tʃ dʒ


In many cases there are two sounds that share the same place and manner of
articulation. These pairs of consonants are distinguished by voicing (see Section 2.2): the
one appearing on the left is voiceless and the one on the right is voiced. The consonants
of English can now be described in the following way:
[p] voiceless bilabial stop
[b] voiced bilabial stop
[s] voiceless alveolar fricative
[z] voiced alveolar fricative
When no distinction is made in voicing, only two features are necessary. Therefore,
[m] is a "bilabial nasal", [j] a "palatal approximant", and [h] a "glottal fricative". [l] may be
called an "alveolar lateral" or simply a "lateral".
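The labelling scheme above (voicing, place, manner, with voicing left out when it is not contrastive) can be sketched as a small data structure. This is only an illustration under our own assumptions: the class Consonant, the function describe, and the short list CONSONANTS are hypothetical names, and the list covers only the sounds used as examples in this section.

```python
from typing import NamedTuple, Optional


class Consonant(NamedTuple):
    symbol: str
    voicing: Optional[str]  # None when no voicing distinction is made
    place: str
    manner: str


def describe(c: Consonant) -> str:
    """Build the conventional label: (voicing) place manner."""
    parts = [c.voicing, c.place, c.manner]
    return " ".join(p for p in parts if p)


# Only the example sounds from this section, not the full chart.
CONSONANTS = [
    Consonant("p", "voiceless", "bilabial", "stop"),
    Consonant("b", "voiced", "bilabial", "stop"),
    Consonant("s", "voiceless", "alveolar", "fricative"),
    Consonant("z", "voiced", "alveolar", "fricative"),
    Consonant("m", None, "bilabial", "nasal"),
    Consonant("j", None, "palatal", "approximant"),
    Consonant("h", None, "glottal", "fricative"),
]

for c in CONSONANTS:
    print(f"[{c.symbol}] {describe(c)}")
```

Storing the voicing as an optional field mirrors the point made in the text: a three-term label is needed only where a voiceless/voiced pair shares the same place and manner.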
2.5 Vowels
2.5.1 The criteria of vowel description
As we have discussed earlier, the distinction between vowels and consonants lies in
the obstruction of airstream (2.4.1). In the production of vowels, there is no obstruction of
air as is the case with consonants. Therefore, the description of the vowels cannot be
done along the lines of the description of the consonants. To get out of this problem,
vowels are normally described with reference to four criteria:
(1) the part of the tongue that is raised: front, center, or back;
(2) the extent to which the tongue rises in the direction of the palate. Normally, three or
four degrees are recognized: HIGH, MID (often divided into MID-HIGH and MID-LOW), and LOW;
(3) the kind of opening made at the lips: various degrees of lip rounding or spreading;
(4) the position of the soft palate: raised for oral vowels, and lowered for vowels which
have been nasalized.
A little more needs to be said about tongue height. Alternatively, tongue height can be
described as CLOSE, CLOSE-MID, OPEN-MID, and OPEN, in reference to the way the
two lips are rounded (LIP-ROUNDING) when producing sounds with different tongue
height. This is exemplified in Fig. 2.6.

            unrounded    rounded
high        i            u
mid-high    e            o
mid-low     ɛ            ɔ
low         a            ɒ

Fig. 2.6 Lip positions used in the pronunciation of the cardinal vowels
It should be pointed out that it is difficult to be precise about the exact articulatory
positions of the tongue and palate because very slight movements are involved. Absolute
values are not possible due to differences in the mouth dimensions of individual
speakers.
2.5.2 The theory of cardinal vowels
The idea of a system of CARDINAL VOWELS was first suggested by A.J. Ellis in
1844 and was taken up by A. M. Bell in his Visible Speech (1867). The system we are
now considering here is the most famous of all and was put forward by Daniel Jones in a
number of writings from 1917 onwards, particularly in his Outline of English Phonetics
(1962). For Jones, the cardinal vowels are a set of vowel qualities arbitrarily defined, fixed
and unchanging, intended to provide a frame of reference for the description of the actual
vowels of existing languages. When the cardinal vowels are explained, examples are usually
given from various languages to help the student. It should not be thought, however, that the
cardinal vowels are actually based on whatever examples are given.

The cardinal vowel diagram (or quadrilateral), therefore, is a set of standard
reference points based on a combination of articulatory and auditory judgments. The
front, center, and back of the tongue are distinguished, as are four levels of tongue
height:
the highest position the tongue can achieve without producing audible friction;
the lowest position the tongue can achieve; and
two intermediate levels, dividing the intervening space into auditorily equivalent
areas.
The system then defines eight "primary" cardinal vowels, in relation to which a further
set of "secondary" cardinal vowels can be defined. The reader is referred to the IPA chart
of vowels for the positions of these vowels. Note that where symbols appear in pairs, the
one to the right represents a rounded vowel and the one to the left represents an
unrounded vowel.
By convention, the eight primary cardinal vowels are numbered from one to eight as
follows: CV1 [i], CV2 [e], CV3 [ɛ], CV4 [a], CV5 [ɑ], CV6 [ɔ], CV7 [o], CV8 [u]. The first five of
these are unrounded vowels while CV6, CV7 and CV8 are rounded ones.
A set of secondary cardinal vowels is obtained by reversing the lip-rounding for a
given position: CV9 [y], CV10 [ø], CV11 [œ], CV12 [ɶ], CV13 [ɒ], CV14 [ʌ], CV15 [ɤ], CV16 [ɯ].
Further secondary cardinal vowels can be added to the inventory: vowels which have
tongue-positions half-way between [i] and [u] are represented as [ɨ] (unrounded) and [ʉ]
(rounded). Moreover, the tongue-position for the neutral vowel [ə] is neither high nor low
and neither front nor back. This vowel is often referred to as SCHWA.
A final word may be said about the abstractness of the cardinal vowels. If we imagine
that for the production of [ə] the tongue is in a neutral position (neither high nor low,
neither front nor back), the cardinal vowels are as remote as possible from this neutral
position. They represent extreme points of a theoretical vowel space: approximation of
the articulators beyond this vowel space would involve friction or contact. Remember also
that all cardinal vowels are monophthongs and their quality does not change during their
production. This is an important point as in both English and Chinese diphthongs abound.
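The relation between the primary and secondary cardinal vowels (same tongue position, reversed lip-rounding, number raised by eight) can be sketched as follows. The class CardinalVowel and the function secondary are illustrative names of our own, and the height and backness labels are a simplification of the positions on the IPA vowel chart.

```python
from typing import NamedTuple


class CardinalVowel(NamedTuple):
    number: int
    symbol: str
    height: str
    backness: str
    rounded: bool


# The eight primary cardinal vowels, CV1 to CV8.
PRIMARY = [
    CardinalVowel(1, "i", "high", "front", False),
    CardinalVowel(2, "e", "mid-high", "front", False),
    CardinalVowel(3, "ɛ", "mid-low", "front", False),
    CardinalVowel(4, "a", "low", "front", False),
    CardinalVowel(5, "ɑ", "low", "back", False),
    CardinalVowel(6, "ɔ", "mid-low", "back", True),
    CardinalVowel(7, "o", "mid-high", "back", True),
    CardinalVowel(8, "u", "high", "back", True),
]

# IPA symbols for CV9 to CV16, in the same order as PRIMARY.
SECONDARY_SYMBOLS = ["y", "ø", "œ", "ɶ", "ɒ", "ʌ", "ɤ", "ɯ"]


def secondary(primary):
    """Keep the tongue position, reverse the lip-rounding, add 8 to the number."""
    return [
        CardinalVowel(v.number + 8, sym, v.height, v.backness, not v.rounded)
        for v, sym in zip(primary, SECONDARY_SYMBOLS)
    ]


for v in secondary(PRIMARY):
    rounding = "rounded" if v.rounded else "unrounded"
    print(f"CV{v.number} [{v.symbol}] {v.height} {v.backness} {rounding}")
```

Only the rounding flag and the symbol change between a primary vowel and its secondary counterpart, which is exactly what "reversing the lip-rounding for a given position" means.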
2.5.3 Vowel glides
As in English and Chinese, languages frequently make use of a distinction between
vowels where the quality remains constant throughout the articulation and those
where there is an audible change of quality. The former are known as PURE or
MONOPHTHONG VOWELS and the latter, VOWEL GLIDES. If a single movement of the
tongue is involved, the glides are called DIPHTHONGS. A double movement produces
TRIPHTHONGS. Diphthongal glides in English can be heard in such words as way [weɪ],
tide [taɪd], how [haʊ], toy [tɔɪ], and toe [təʊ]. Triphthongal glides are found in words like
wire [waɪə] and tower [taʊə].
2.5.4 The vowels of RP
As with consonants, we will examine the English vowels in the form of RP. Various
symbols have been used for the representation of vowels by different writers.
The first four columns show symbols used in dictionaries published in the UK. In
1990, Professor John Wells, holder of the Chair in Phonetics at University College
London and the leading authority on contemporary English pronunciation, published his
Longman Pronunciation Dictionary. This has been a major work in English pronunciation
and gives both the British English and General American pronunciation of over 75,000
words. As Ladefoged (1993: 76) notes, "everyone seriously interested in English
pronunciation should be using this dictionary." The second and third columns show
symbols that are used in the Cambridge International Dictionary of English (1995) and the
Oxford Advanced Learner's Dictionary (5th ed., 1995). These are basically the same and
follow the practice of Wells' system. The fourth column, however, represents a major
change of viewpoint. This system is used in The New Oxford Dictionary of English (1998)
and the authors claim that "the transcriptions reflect pronunciation as it actually is in
modern English, unlike some longer-established systems, which reflect the standard
pronunciation of broadcasters and public schools in the 1930s." (p. xvii) The changes
mainly lie in the notation of three vowels:
the use of [a] for [æ] as in bat, which shows that it is now a lower vowel;
the use of [ʌɪ] for [aɪ] as in bike and fire, showing a change of initial position; and
the use of [ɛ:] for [eə] as in hair, which is an example of vowel merger.
This notation system has been used in all Oxford dictionaries published since 1997.
(Where two symbols appear for the same vowel, the one to the left is appropriate for
most speakers of British English and the one to the right for most speakers of American
English.)
The fifth and sixth columns are symbols used by American linguists in A Pronouncing
Dictionary of American English by J. S. Kenyon and T. A. Knott (1953) and A Manual of
American English Pronunciation by C. Prator and B. Robinett (4th ed., 1985). It is
interesting to compare these in that some of the "long" vowels are represented by two
symbols (e.g. [iy]) and some of the diphthongs are represented by a single symbol (e.g.
[e] for [eɪ]).
The last two columns are phonetic symbols used in textbooks on linguistics.
Ladefoged's A Course in Phonetics (3rd ed., 1993) is an influential textbook for students
beginning their work on phonetics. His notation system in this edition has changed from
the second edition (1982), as the earlier edition was based on the 1979 version of the IPA
and the present edition is based on the 1989 version of the IPA. The symbols used in the
third edition also show great similarity to Wells (1990), as Ladefoged thinks very highly of
Wells' dictionary.
Radford et al.'s Linguistics: An Introduction (1999) is a major introductory text on
linguistics at the end of the millennium. In the words of Neil Smith, Professor of
Linguistics at University College London, "This introduction, by some of today's most
distinguished linguists, should rapidly become the market leader." The symbols they have
adopted here are comparable to the Oxford system to a great extent, except that they are
still unsure about whether the [aɪ] should actually give way to [ʌɪ].
A comparison of the works above shows that, despite the divergences, linguists and
lexicographers had reached general agreement by the 1990s on the vowel segments
of English. Several issues have not been settled, however. Firstly, as the oral cavity is in fact
extremely small, the difference in quality of some of the vowels may depend heavily on
the speaker's accent and personal preference. In French, for example, [e] and [ɛ] are two
distinct sounds that make a difference of meaning in combination with other sounds. In
English they do not. Therefore, the use of [e] or [ɛ] for words like
bed and peg is completely a matter of habit and preference. The same is true of using [a]
or [æ], [ɛ:] or [eə], [ɜ:] or [ə:], and [aɪ] or [ʌɪ]. Secondly, the length of a particular vowel may
vary according to the context in which it occurs. For example, the same vowel is longer
before a voiced consonant and shorter before a voiceless one. Consequently, the vowel
in bead is longer than the vowel in beat, which has about the same length as the vowel in
bid. The difference between the vowels in beat and bid is therefore not one of length but of
other qualities. The [ɪ] sound is slightly lower in position than [i:], so it requires less tension
of the muscles. In this light, [i:] is often referred to as a TENSE VOWEL and [ɪ] a LAX
VOWEL. Some confusion arises here, however. Some linguists use [i] for beat, while others
use [i:]. But all agree that the vowel in bit is [ɪ]. In each case the former is called a
tense vowel and the latter, a lax vowel.
In this book, we will follow the English vowel system of Radford et al. (1999), which is shown in
Table 2.3.
Table 2.3 English vowels

          Front    Central    Back
High      i:                  u:
Mid
Low
Diphthongs:


To summarize, the description of these vowels needs to fulfill four basic requirements:
the height of tongue raising (high, mid, low);
the position of the highest part of the tongue (front, central, back);
the length or tenseness of the vowel (tense vs. lax or long vs. short); and
lip-rounding (rounded vs. unrounded).
Consequently, we describe the vowels in this way:
[i:] high front tense unrounded vowel
[ʊ] high back lax rounded vowel
[ə] central lax unrounded vowel
2.6 Coarticulation and phonetic transcription

2.6.1 Coarticulation
Speech is a continuous process, so the vocal organs do not move from one sound
segment to the next in a series of separate steps. Rather, sounds continually show the
influence of their neighbors. For example, if a nasal consonant (such as [m]) precedes an
oral vowel (such as [æ] in map), some of the nasality will carry forward so that the vowel
[æ] will begin with a somewhat nasal quality. This is because in producing a nasal the soft
palate is lowered to allow airflow through the nasal tract. To produce the following vowel
[æ] the soft palate must move back to its normal position. Of course it takes time for the
soft palate to move from its lowered position to the raised position. This process is still in
progress when the articulation of [æ] has begun. Similarly, when [æ] is followed by [m], as
in lamb, the velum will begin to lower itself during the articulation of [æ] so that it is ready
for the following nasal.
When such simultaneous or overlapping articulations are involved, we call the process
COARTICULATION. If the sound becomes more like the following sound, as in the case
of lamb, it is known as ANTICIPATORY COARTICULATION. If the sound displays the
influence of the preceding sound, it is PERSEVERATIVE COARTICULATION, as is the
case of map.
Anticipatory coarticulation effects are far more common than perseverative
coarticulation effects. Note how the lip-positions of the unrounded vowel [i:] and the
rounded vowel [u:] affect the [s] in seat and soup respectively. In the production of the
[s] of seat the lips are unrounded, while in the [s] of soup they are rounded.
2.6.2 Broad and narrow transcriptions
We have noticed that the vowel [a] in lamb has some quality of the following nasal,
and we call this phenomenon NASALIZATION. How, then, do we transcribe it
in IPA symbols? The IPA chart contains a set of DIACRITICS for the
purpose of transcribing the minute differences between variations of the same sound. To
indicate that a vowel has been nasalized, we simply add a curved line (a tilde) to the top of the
symbol [a], as [ã]. By the same token, we can use these diacritics for recording many
other variations of the same sound. Take [p], for example: it is ASPIRATED in peak and
UNASPIRATED in speak. The aspirated voiceless bilabial stop is thus indicated by a
raised h, as [ph], whereas the unaspirated counterpart is transcribed as [p=] for
contrast. For most purposes, however, it is not necessary to indicate such variations of a
sound every time. When we use a simple set of symbols in our transcription, it is called a
BROAD TRANSCRIPTION. The use of more specific symbols to show more
phonetic detail is referred to as a NARROW TRANSCRIPTION.
2.7 Phonological analysis
The study of speech sounds is partitioned between two distinct but related disciplines,
phonetics and phonology. As we have seen from above, phonetics studies how speech
sounds are made, transmitted, and received. PHONOLOGY, on the other hand, is the
study of the sound systems of languages. There is a fair degree of overlap in what
concerns the two subjects, so it is sometimes very difficult to draw the boundary between
them. Phonology is concerned with the linguistic patterning of sounds in human
languages, with its primary aim being to discover the principles that govern the way
sounds are organized in languages, and to explain the variations that occur.
The human vocal apparatus can produce a very wide range of sounds, but only a
small number of these are used in a language to construct all of its words and sentences.
Phonetics is the study of all possible speech sounds while phonology studies the way in
which speakers of a language systematically use a selection of these sounds in order to
express meaning. A common methodology of phonology is to begin by analyzing an
individual language, to determine its PHONOLOGICAL STRUCTURE, i.e. which sound
units are used and how they pattern. Then the properties of different sound systems are
compared so that hypotheses can be developed about the rules underlying the use of
sounds in particular groups of languages, and ultimately in all languages.
Phonology is not specifically concerned with aspects of speech production or
perception as these are purely the result of the physical properties of the system. In the
study of coarticulation in English, for example, it is often said that the articulation of the [t]
sounds in the words tea and too differ from each other slightly. In the [t] of tea the tongue
is brought towards the front of the mouth in comparison with the [t] of too. The reason for
this is that the vowel [i:] of tea drags the tongue slightly further forward in the mouth than
the vowel [u:] of too. In fact, it is virtually impossible to pronounce a clear and pure
[i:]-type vowel immediately after the kind of [t] sound found in too. In other words, it would
appear that some degree of fronting in these circumstances is physiologically inevitable.
Phoneticians are concerned with how these two [t]s differ in the way they are pronounced
while phonologists are interested in the patterning of such sounds and the rules that
underlie such variations.
2.8 Phonemes and allophones
2.8.1 Minimal pairs
Phonological analysis relies on the principle that certain sounds cause changes in
the meaning of a word, whereas other sounds do not. An early approach to the subject
used a simple methodology to demonstrate this. It would take a word, replace one sound
by another, and see whether a different meaning resulted. For instance, the word tin in
English consists of three separate sounds, each of which can be given a symbol in a
phonetic transcription, [tIn]. If we replace [t] by [d], a different word results: din. [t] and [d]
are thus important sounds in English, because they enable us to distinguish tin and din,
tie and die, and many more word pairs.
Similarly, [i:] and [I] can be shown to be important units too, because they
distinguish between beat and bit, bead and bid and many other pairs. This technique,
called the "MINIMAL PAIRS" test, can be used to find out which sound substitutions
cause differences of meaning. The method has its limitations as it is not always possible
to find pairs of words illustrating a particular distinction in a language, but it works well for
English, where it leads to the identification of over 40 important units. In the earliest
approach to phonological analysis, these "important units" are called PHONEMES.
Phonemes are transcribed using the normal set of phonetic symbols but within slant lines
instead of square brackets--/p/, /t/, /e/, etc. This shows that these units are seen as part of a
language, and not just as physical sounds. Some of the minimal pairs for English
phonemes are shown in Fig. 2.7 below:

Fig. 2.7 Some of the minimal pairs for English phonemes (Source: Crystal, 1997: 162)

Vowels: beat-bit, bit-bet, bet-bat, bat-but, but-heart, heart-hot, pot-port, port-put,
full-fool, cool-curl, girl-gale, tale-tile, tile-toil, toil-toll, tone-town, how-here,
here-hair, pair-poor, poor-pen

Consonants: pin-bin, bin-tin, tin-din, din-kin, coat-goat, got-hot, height-might,
might-night, kin-king, tin-till, led-red, lad-wad, wet-yet, yolk-choke, choke-joke,
jade-fade, fail-veil, heave-heath, wreath-wreathe, though-sew, bus-buzz, zoo-shoe,
Confucian-confusion, beige-bait

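The substitution test described above can be mechanized. The following is a minimal sketch in Python, assuming that each word is stored as a tuple of segments. The small LEXICON, its transcriptions, and the function names are illustrative assumptions and are not taken from Crystal's table.

```python
def is_minimal_pair(a, b):
    """Return True if two transcriptions have the same number of
    segments and differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


def minimal_pairs(lexicon):
    """List every pair of words in the lexicon that forms a minimal pair."""
    words = sorted(lexicon)
    pairs = []
    for i, first in enumerate(words):
        for second in words[i + 1:]:
            if is_minimal_pair(lexicon[first], lexicon[second]):
                pairs.append((first, second))
    return pairs


# Hypothetical mini-lexicon; each word is a tuple of segments.
LEXICON = {
    "tin": ("t", "ɪ", "n"),
    "din": ("d", "ɪ", "n"),
    "bin": ("b", "ɪ", "n"),
    "pin": ("p", "ɪ", "n"),
    "beat": ("b", "i:", "t"),
    "bit": ("b", "ɪ", "t"),
    "king": ("k", "ɪ", "ŋ"),
}

if __name__ == "__main__":
    for first, second in minimal_pairs(LEXICON):
        print(first, "-", second)
```

Storing long vowels such as "i:" as single segments matters here: if words were compared character by character, beat and bit would differ in length and the pair would be missed.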

2.8.2 The phoneme theory


The use of phonemic analysis is ancient but the first explicit formulation of a
phoneme theory was made only in the 1870s by Jan Baudouin de Courtenay and his
student Mikolaj Kruszewski at Kazan. The French word phoneme had already been
invented, with the meaning of "speech-sound", in which sense de Saussure and other
French-speaking writers continued to employ it.
In the early part of the 20th century, the idea was developed in many centers by
such renowned linguists as Daniel Jones in London, N.S. Trubetzkoy and the Prague
School in Vienna, and numerous American linguists, including F. Boas, E. Sapir, L.
Bloomfield, Y.R. Chao, and C.F. Hockett. Several theories were put forward and
discussed but for the practical task of describing sound-systems, the "minimal pairs" test
shows that the word phoneme simply refers to a "unit of explicit sound contrast": the
existence of a minimal pair automatically grants phonemic status to the sounds
responsible for the contrasts. A linguistic system is built on the idea of contrasts. By
selecting one type of sound instead of another we can distinguish one word from another.
Languages differ in the selection of contrastive sounds. In English, for example, the
distinction between aspirated [ph] and unaspirated [p=] is not phonemic. They both
belong to the same phoneme /p/ but are realized as different phonetic sounds
conditioned by different positions. Compare the words peak and speak, for instance.
The /p/ in peak is aspirated, phonetically transcribed as [ph], while the /p/ in speak is
unaspirated, phonetically [p=]. In Chinese, however, the distinction between /p=/ and /ph/
is phonemic so that "宾" (bin, guest) and "拼" (pin, to piece together) are transcribed as
/p=in/ (or /pin/ to be economical) and /phin/ respectively.
2.8.3 Allophones
Dictionaries often transcribe the words peak and speak as /pi:k/ and /spi:k/
respectively. Such "broad" transcription is said to be "phonemic" as it only shows the
sounds by phonemes. However, when the two words are actually pronounced, the /p/ is
aspirated in peak and unaspirated in speak. We know that in English there is a rule that
this sound is unaspirated after /s/ but aspirated in other places. To bring out the
"phonetic" difference, an aspirated sound is transcribed with a raised "h" after the
symbol of the sound, so a phonetic transcription for peak is [phi:k] and that for speak is
[sp=i:k]. Phonemic transcriptions are placed between slant lines (/ /) while phonetic
transcriptions are placed between square brackets ([ ]). In phonetic terms, phonemic
transcriptions represent the "broad" transcriptions.
In the above example, [ph, p=] are two different PHONES and are variants of the
phoneme /p/. Such variants of a phoneme are called ALLOPHONES of the same
phoneme. In this case the allophones are said to be in COMPLEMENTARY
DISTRIBUTION because they never occur in the same context. That is to say that [p=]
always occurs after [s] while [ph] always occurs in other places. We can represent this
rule as:
(1) /p/ →  [p=] / [s] ____
           [ph] elsewhere

(Note: "[s] ____" is the environment in which /p/ appears.)
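Rule (1) can be read as a small procedure that maps a phonemic transcription to a phonetic one. The sketch below is only an illustration in Python: the function name realize_p and the list-of-segments representation are our own assumptions, and the rule is applied exactly as stated in (1), with aspiration everywhere except after [s].

```python
def realize_p(phonemes):
    """Apply rule (1): /p/ is realized as [p=] after [s] and as [ph] elsewhere.

    `phonemes` is a list of segment strings (a phonemic transcription);
    the result is a new list with the allophones of /p/ spelled out.
    """
    phones = []
    for i, segment in enumerate(phonemes):
        if segment == "p":
            after_s = i > 0 and phonemes[i - 1] == "s"
            phones.append("p=" if after_s else "ph")
        else:
            phones.append(segment)
    return phones


if __name__ == "__main__":
    print(realize_p(["p", "i:", "k"]))       # peak
    print(realize_p(["s", "p", "i:", "k"]))  # speak
```

Because the two environments never overlap, the function never has to choose between the allophones, which is what complementary distribution amounts to.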


This phenomenon of variation in the pronunciation of phonemes in different positions
is called ALLOPHONY or ALLOPHONIC VARIATION. Another example of allophony in
English is the phoneme /l/. We all know that it is pronounced differently in lead and deal,
where in the second case the back of the tongue is raised a little towards the soft palate
(VELARIZATION). We often call this "dark l" and use the symbol [ɫ] in phonetic (or
narrow) transcription. [l], as pronounced in lead, is called "clear l". Consequently, lead is
transcribed as [li:d] and deal as [di:ɫ] phonetically. The rule is very simple: the phoneme
/l/ is pronounced as [l] before a vowel and as [ɫ] after a vowel. They are again in
complementary distribution. It can be represented as:
(2) /l/ →  [l] / ____ V
           [ɫ] / V ____
To say that [ph, p=] belong to the phoneme /p/ and [l, ɫ] belong to the phoneme /l/
reduces the number of phonemes in English--the four sounds are attributed to only two
phonemes. There are also other cases of allophony, of course, and the student can start
thinking about what others exist as allophones in English at this stage.
Not all the phones in complementary distribution are considered to be allophones of
the same phoneme, however. There are some restrictions for phones to fall into the same
phoneme: they must be phonetically similar and in complementary distribution.
PHONETIC SIMILARITY means that the allophones of a phoneme must bear some
phonetic resemblance. For example, [l, ɫ] are both lateral approximants, and they only
differ in places of articulation; [ph, p=] are both voiceless bilabial stops differing only in
aspiration. In either case, the allophones are in complementary distribution.
Apart from complementary distribution, a phoneme may sometimes have FREE
VARIANTS. For example, the final consonant of cup may not be released by some
speakers so there is no audible sound at the end of this word. In this case, it is the same
word pronounced in two different ways: [khʌph] and [khʌp̚]. (The diacritic on [p̚] indicates "no
audible release" in IPA symbols.) The difference may be caused by dialect, habit, or
individual preference, instead of by any distribution rule. Such a phenomenon is called
"FREE VARIATION". Free variation is also seen in regional differences. For example,
most Americans pronounce the word "either" as [i:ðər] whereas most British people say
[aiðə]. Individual differences may determine the use of [di'rekʃən] or [dai'rekʃən] for the word
direction. In dictionaries, free variants are often listed side by side. Of course, a
dictionary produced in the UK will normally put British pronunciation on the left.
2.9 Phonological processes
2.9.1 Assimilation
Let us begin by looking at the following sets of words and phrases. Consider their
pronunciation in each case.
ex. 2-1
a. cap [kap]        can [kãn]
b. tap [tap]        tan [tãn]

ex. 2-3
a. since [sins]     sink [siŋk]
b. mince [mins]     mink [miŋk]

In both exx. 2-1a and 2-1b, the words differ in two sounds. The vowel in the second
word of each pair is "nasalized" because of the influence of the following nasal
consonant. In ex. 2-3, the alveolar nasal /n/ becomes the velar nasal [ŋ] before the velar
stop [k]. NASALIZATION, DENTALIZATION, and VELARIZATION are all instances of
ASSIMILATION, a process by which one sound takes on some or all the characteristics
of a neighboring sound.
Assimilation is often used synonymously with coarticulation. Similarly, there are two
possibilities of assimilation: if a following sound is influencing a preceding sound, we call
it REGRESSIVE ASSIMILATION; the converse process, in which a preceding sound is
influencing a following sound, is known as PROGRESSIVE ASSIMILATION. All our
examples in exx. 2-1 to 2-3 are instances of "regressive assimilation".
Assimilation can occur across syllable or word boundaries, as shown by the following:
ex. 2-4
a. pan[ŋ]cake
b. sun[ŋ]glasses
ex. 2-5
a. you can[ŋ] keep them
b. he can[ŋ] go now
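The velar assimilation seen in exx. 2-3 to 2-5 can be sketched as a single pass over a string of segments. This is a simplified illustration in Python: the function name, the VELAR_STOPS set, and the transcriptions are our own assumptions, and word boundaries are simply ignored by writing a phrase as one flat list of segments.

```python
VELAR_STOPS = {"k", "g"}


def assimilate_n(segments):
    """Regressive assimilation: /n/ becomes [ŋ] when the next segment
    is a velar stop, whether or not a word boundary intervenes."""
    result = list(segments)
    for i in range(len(result) - 1):
        if result[i] == "n" and result[i + 1] in VELAR_STOPS:
            result[i] = "ŋ"
    return result


if __name__ == "__main__":
    print(assimilate_n(["s", "i", "n", "k"]))             # sink
    print(assimilate_n(["s", "i", "n", "s"]))             # since
    print(assimilate_n(["p", "a", "n", "k", "ei", "k"]))  # pancake
```

The rule only ever inspects the segment to the right of /n/, which is what makes this assimilation regressive.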
Studies of English fricatives and affricates have shown that their voicing is severely
influenced by the voicing of the following sound. The five pairs of English fricatives and
affricates are listed in (3).
(3) f, v; θ, ð; s, z; ʃ, ʒ; tʃ, dʒ
Examples in ex. 2-6 show how fricatives and affricates in English may be assimilated
in voicing.
ex. 2-6
a. five past          [faivpa:st]       [faifpa:st]
b. love to            [lʌvtə]           [lʌftə]
c. has to             [haztə]           [hastə]
d. as can be shown    [əzkənbiʃəun]     [əskənbiʃəun]
e. lose five-nil      [lu:zfaivnil]     [lu:sfaivnil]
f. edge to edge       [edʒtəedʒ]        [etʃtəedʒ]

The first column of symbols shows the way these phrases are pronounced in slow
or careful speech while the second column shows how they are pronounced in normal,
connected speech. Examination of other sounds reveals that DEVOICING, a process by
which voiced sounds become voiceless, does not occur in such contexts with other
sounds (such as stops).
2.9.2 Phonological processes and phonological rules
The changes discussed above exhibit PHONOLOGICAL PROCESSES in which a
TARGET or AFFECTED SEGMENT undergoes a structural change in certain
ENVIRONMENTS or CONTEXTS. In each process the change is conditioned or triggered by
a following sound or, in the case of progressive assimilation, a preceding sound.
Consequently, we can say that any phonological process must have three aspects to it:
(1) a set of sounds to undergo the process;
(2) a set of sounds produced by the process;
(3) a set of situations in which the process applies.
We can represent the process by means of an arrow:
(4) /v/ → [f]
Our data have shown that this does not only apply to /v/ but also to other fricatives.
Therefore, we can make a more general rule to indicate that voiced fricatives are
transformed into voiceless fricatives before voiceless segments:
(5) voiced fricative → voiceless / ____ voiceless
This is a PHONOLOGICAL RULE. The slash (/) specifies the environment in which
the change takes place. The bar (called the FOCUS BAR) indicates the position of the
target segment. So the rule reads: a voiced fricative is transformed into the
corresponding voiceless sound when it appears before a voiceless sound. Nasalization,
dentalization, and velarization are also typical phonological processes that can be
represented by the following rules:
(6) Nasalization rule
[-nasal] → [+nasal] / ____ [+nasal]
(7) Dentalization rule
[-dental] → [+dental] / ____ [+dental]
(8) Velarization rule
[-velar] → [+velar] / ____ [+velar]
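A rule of this shape has the three aspects listed earlier: a set of targets, a set of outputs, and an environment. As a rough illustration, rule (5) can be sketched in Python as follows. The DEVOICED mapping, the VOICELESS set, and the transcriptions are our own simplified assumptions, and the affricate /dʒ/ is included because the data in ex. 2-6 show that it behaves like the fricatives.

```python
# Targets and their outputs: each voiced fricative (or affricate)
# is paired with its voiceless counterpart.
DEVOICED = {"v": "f", "ð": "θ", "z": "s", "ʒ": "ʃ", "dʒ": "tʃ"}

# Environment: the segment that follows must be voiceless.
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ", "h"}


def devoice(segments):
    """Apply rule (5): voiced fricative -> voiceless / ____ voiceless."""
    result = list(segments)
    for i in range(len(result) - 1):
        if result[i] in DEVOICED and result[i + 1] in VOICELESS:
            result[i] = DEVOICED[result[i]]
    return result


if __name__ == "__main__":
    # "five past" in connected speech
    print(devoice(["f", "ai", "v", "p", "a:", "s", "t"]))
```

The dictionary plays the part of the arrow, and the test on the following segment plays the part of the slash and the focus bar.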
An interesting case is the indefinite article a/an in English. Consider the following:
ex. 2-7

a. a hotel, a boy, a use, a wagon, a big man, a yellow rug, a white house
b. an apple, an honor, an orange curtain, an old lady
All the phrases in ex. 2-7a take a, while an is used in ex. 2-7b. We know that an
is used when the following word begins with a vowel sound. How do we capture this in
phonological representation? We should notice that it is the lack of a consonant that
requires the nasal [n] to be added to the article a. For that matter, we treat the change of
a to an as an insertion of a nasal sound. Technically, this process of insertion is known as
EPENTHESIS. We can formulate this rule as (with ∅ indicating an empty position):

(9) ∅ → [n] / [ə] ____ V
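Rule (9) depends on the first sound of the following word, not on its first letter, which is why we say a use but an honor. A minimal sketch in Python, assuming a hypothetical vowel inventory VOWELS and hand-made transcriptions, might look like this:

```python
# Hypothetical vowel inventory for this sketch; diphthongs and long
# vowels are treated as single segments.
VOWELS = {"i:", "I", "e", "a", "ʌ", "ɒ", "ɔ:", "u", "u:", "ə", "a:",
          "ei", "ai", "əu", "au"}


def indefinite_article(next_word):
    """Return the article as a list of segments.

    `next_word` is the transcription (list of segments) of the word that
    follows. Rule (9) inserts [n] after [ə] when a vowel follows.
    """
    article = ["ə"]
    if next_word and next_word[0] in VOWELS:
        article.append("n")
    return article


if __name__ == "__main__":
    print(indefinite_article(["a", "p", "l"]))   # apple
    print(indefinite_article(["j", "u:", "s"]))  # use (noun)
    print(indefinite_article(["ɒ", "n", "ə"]))   # honor
```

Because the input is a transcription, silent letters (honor) and letters that stand for a consonant sound (use) need no special treatment.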

The regular plural and past tense forms in English are also of interest because their
variants are governed by phonologically conditioned rules.
2.9.3 Rule ordering
So far we have seen how certain changes in pronunciation are governed by rules.
In this section we will examine a more complex phenomenon. We know that in English
most nouns form their plurals regularly. The regular plural pattern,
however, is highly dependent on the phonological environment. Look at the following
forms:
ex. 2-8
a. desk [desk]       desks [desks]
b. chair [tʃeə]      chairs [tʃeəz]
c. box [bɒks]        boxes [bɒksəz]


We see that the plural suffix, -(e)s in written form, is pronounced in three different
ways: [s], [z], and [əz]. Our task here is to work out what rule governs this variation.
It is easy to see that, unlike many of the examples we have seen earlier in this
chapter, these variants cannot be governed by the following sounds as there aren't any.
Then what features of the sounds on the left contribute to this change? In ex. 2-9, we see
some more examples of plural forms:

ex. 2-9
a. tables      -z         k. beds        -z
b. seats       -s         l. hammocks    -s
c. benches     -əz        m. rugs        -z
d. stools      -z         n. cushions    -z
e. couches     -əz        o. bridges     -əz
f. sofas       -z         p. bunks       -s
g. divans      -z         q. pillows     -z
h. mattresses  -əz        r. ashes       -əz
i. quilts      -s         s. cupboards   -z
j. wardrobes   -z         t. cases       -əz


Now we sort them according to the plural variants (with transcription for the singular
forms only):
ex. 2-10
a. -z    tables      /teibl/         stools      /stu:l/
         sofas       /səufə/         divans      /divan/
         wardrobes   /wɔ:drəub/      beds        /bed/
         rugs        /rʌg/           cushions    /kuʃn/
         pillows     /piləu/         cupboards   /kʌbəd/
b. -s    seats       /si:t/          quilts      /kwilt/
         hammocks    /hamək/         bunks       /bʌŋk/
c. -əz   mattresses  /matrəs/        benches     /bentʃ/
         bridges     /bridʒ/         couches     /kautʃ/
         cases       /keis/          ashes       /aʃ/


/z/ appears after these sounds: /l, ə, n, b, d, g, əu/, /s/ is found after /t, k/, and /əz/
occurs after /s, ʃ, tʃ, dʒ/. If we examine more words, we find that they follow the same kind
of pattern. It is easy to see that /s/ is used when the preceding sound is a voiceless
consonant other than /s, ʃ, tʃ/, /z/ occurs when the preceding sound is a vowel or a voiced
consonant other than /z, ʒ, dʒ/, and /əz/ follows any of the following sounds: /s, z, ʃ, ʒ, tʃ, dʒ/. This
group of fricatives and affricates, which often behave in the same way, is traditionally
known as SIBILANTS.
Now, the three variants of the plural form in English are applied in the following
fashion:
(10)
a. The /s/ appears after voiceless sounds.
b. The /z/ appears after voiced sounds. (All vowels are voiced.)
c. The /əz/ appears after sibilants.


In order to bring out the rule that governs this pattern, we need to say that /z/, which
occurs in most cases, is the basic form and the other two forms are derived from it.
The basic form is technically known as the UNDERLYING FORM or UNDERLYING
REPRESENTATION (UR). The derived form is the SURFACE FORM or SURFACE
REPRESENTATION (SR). Therefore, /s/ is a matter of devoicing and /əz/ is a case of
epenthesis. The two rules are represented as follows:
(11) z → s / [-voice, C] ____        (Devoicing)
(12) ∅ → ə / sibilant ____ z         (Epenthesis)

With these two rules at hand, we can see if we can derive the correct SRs from the
URs. Consider the derivations in (13):
(13)            a. //si:t+z//    b. //bed+z//    c. //keis+z//
Devoicing       s                N/A             s
Epenthesis      N/A              N/A             N/A
Output          si:ts            bedz            *keiss


Clearly, something has gone wrong. The problem is that Devoicing will always apply
to /z/ after a voiceless consonant, and then the environment for Epenthesis never
arises. The obvious solution is to say that Epenthesis will always apply before Devoicing,
as in (14):
(14)            a. //si:t+z//    b. //bed+z//    c. //keis+z//
Epenthesis      N/A              N/A             ə
Devoicing       s                N/A             N/A
Output          si:ts            bedz            keisəz


Thus, in this particular case, we have to follow a specially stipulated RULE
ORDERING. If this order is disturbed, an incorrect derivation will result.
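The effect of rule ordering can be checked mechanically. In the following Python sketch, rules (11) and (12) are written as functions and applied to the underlying form in a given order. The function names, the SIBILANTS and VOICELESS_CONSONANTS sets, and the list-of-segments representation are our own assumptions, and both rules look only at the suffix at the end of the word.

```python
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS_CONSONANTS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ", "h"}


def epenthesis(form):
    """Rule (12): insert ə between a sibilant and the suffix z."""
    if len(form) >= 2 and form[-1] == "z" and form[-2] in SIBILANTS:
        return form[:-1] + ["ə", "z"]
    return form


def devoicing(form):
    """Rule (11): the suffix z becomes s after a voiceless consonant."""
    if len(form) >= 2 and form[-1] == "z" and form[-2] in VOICELESS_CONSONANTS:
        return form[:-1] + ["s"]
    return form


def derive(stem, rules):
    """Attach the underlying suffix /z/ and apply the rules in order."""
    form = list(stem) + ["z"]
    for rule in rules:
        form = rule(form)
    return "".join(form)


if __name__ == "__main__":
    stems = [["s", "i:", "t"], ["b", "e", "d"], ["k", "ei", "s"]]
    for stem in stems:
        wrong = derive(stem, [devoicing, epenthesis])   # order of (13)
        right = derive(stem, [epenthesis, devoicing])   # order of (14)
        print("".join(stem), wrong, right)
```

With Devoicing first, the form for case loses its final z before Epenthesis can see it; with Epenthesis first, the inserted vowel separates the stem-final /s/ from the suffix, so Devoicing no longer applies.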
2.10 Distinctive features
As we have seen from the discussion of IPA symbols in the last chapter, speech
sounds are divided up into classes according to a number of properties. For example,
consonants are described according to their places and manners of articulation, and
vowels are described according to their frontness or backness. One important property
is "voicing", which plays an important part in distinguishing obstruents in English.
Because voicing can distinguish one phoneme from another, it is a DISTINCTIVE
FEATURE for English obstruents. There are other features too and many of them are
BINARY FEATURES because we can group sounds into two categories: one with this
feature and the other without. Binary features have two values or specifications denoted
by "+" and "-" so voiced obstruents are marked [+voiced] and voiceless obstruents are
marked [-voiced]. Sonorants are always [+voiced] so the feature [±voiced] is
redundant for such sounds. Consequently, voicing is not a distinctive feature for
sonorants. By the same token, the feature [±nasal] is used for distinguishing nasals from
non-nasals so the nasal sounds are marked [+nasal] and all other sounds are [-nasal]. In
contemporary phonology, some twenty such features are used to group speech sounds
from different angles. Table 2.5 shows the feature specifications for English consonant
phonemes.
A word needs to be said about the feature for places of articulation, [PLACE]. The
place feature is divided up into four values: [PLACE: Labial]p, [PLACE: Coronal]p,
[PLACE: Dorsal]p, and [PLACE: Guttural]p, which are often written in shorthand forms as
[Labial]p, [Coronal]p, [Dorsal]p, and [Guttural]p.
Table 2.5 Distinctive feature matrix for English consonant phonemes

Notes:
1) L=LABIAL, C=CORONAL, D=DORSAL, G=GUTTURAL
2) "-/+" is a special type of feature value for an affricate indicating that the sound has
both specifications, one after another.
(Source: Radford, A., M. Atkinson, D. Britain, H. Clahsen & A. Spencer. 1999.
Linguistics: An introduction. Cambridge University Press, p. 141)



Now we can represent the rule that governs the aspiration of /p/ in (1) in terms of
features, as in
(15) a. [-voiced, -cont] → [-spread] / s ____
     b.                    [+spread] elsewhere

This means that /p, t, k/ are unaspirated ([-spread]) after /s/ and aspirated ([+spread])
in all other positions. At this level, there is no need to know exactly what each feature
value means. A complete description of distinctive features used in recent years can be
found in Andrew Spencer's Phonology, pp. 141-4, published by Blackwell, 1996.

2.11 Syllables
In this section and the next, we will consider SUPRASEGMENTAL FEATURES--those
aspects of speech that involve more than single sound segments. The principal
suprasegmental features are syllable, stress, tone, and intonation.
Our discussion so far has been concentrated on the single-line or LINEAR approach
of phonology, as initiated by Chomsky and Halle's monumental book The Sound Pattern
of English (SPE, 1968). In SPE, words are held to consist of sequences or strings of
consonants and vowels and the word "SYLLABLE" does not even appear in the index. The
syllabic theory, however, is often represented by a tree diagram. Such theories are often known
as NON-LINEAR or MULTILEVEL PHONOLOGY.
2.11.1 The syllable structure
Different languages permit different kinds of syllables. In Chinese Putonghua, for
example, syllables typically consist of a consonant followed by a vowel. Only the nasals /n, ŋ/
can occur after the vowel and there are no consonant clusters. This is why the English
monosyllabic word please is often pronounced as /pulisi/ by Chinese who are beginning
to learn English.
In English, a word may be MONOSYLLABIC (with one syllable, like cat and dog) or
POLYSYLLABIC (with more than one syllable, like transplant or festival). A syllable must
have a NUCLEUS or PEAK, which is often the task of a vowel. However, sometimes it is
also possible for a consonant to play the part of a nucleus, as in the word table, which
consists of a syllable [tei] and a syllable [bl]. In the second syllable there is only the
syllabic consonant [l] to function as the nucleus.
When we say that words like bed, dead, fed, head, led, red, said, thread, wed rhyme,
we mean that the sounds after the first consonant or consonant cluster are identical.
Therefore, we can divide a syllable into two parts, the RHYME (or RIME) and the
ONSET. As the vowel within the rhyme is the nucleus, the consonant(s) after it will be
termed the CODA. We can thus represent the SYLLABIC STRUCTURE of the word
clasp in (16). The Greek letter σ ("sigma") is used to represent a syllable.

(16)               σ
                /     \
         O(nset)       R(hyme)
            |         /       \
            |   N(ucleus)   Co(da)
            |       |          |
           k l      a         s p

All syllables must have a nucleus but not all syllables contain an onset and a coda. A
syllable that has no coda is called an OPEN SYLLABLE while a syllable with a coda is
known as a CLOSED SYLLABLE.
In English, there are both closed and open syllables but only tense vowels (long vowels
and diphthongs) can occur in open syllables. Differences in syllable structure also exist
cross-linguistically. In English, the onset position may be empty or filled by a cluster of as
many as three consonants, while the coda position may be filled by as many as four
consonants (as in sixths [siksθs]). Accordingly, the English syllable may be
represented as (((C)C)C)V((((C)C)C)C). The Chinese syllable, however, allows
at most one consonant in the onset position and only the nasals [n, ŋ] in the coda in
Putonghua. Thus the Chinese syllable is represented as (C)V(C).
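The two templates can be compared directly if each syllable is reduced to a skeleton of C's and V's. The sketch below, in Python, translates the bracket notation into regular expressions; the names ENGLISH, PUTONGHUA, and fits are our own, and the templates only count consonants, so they do not capture which consonants may combine or the restriction of the Putonghua coda to nasals.

```python
import re

# (((C)C)C)V((((C)C)C)C): up to three onset and four coda consonants.
ENGLISH = re.compile(r"C{0,3}VC{0,4}")

# (C)V(C): at most one consonant on either side of the nucleus.
PUTONGHUA = re.compile(r"C?VC?")


def fits(template, skeleton):
    """Return True if the whole CV skeleton matches the template."""
    return template.fullmatch(skeleton) is not None


if __name__ == "__main__":
    for word, skeleton in [("sixths", "CVCCCC"), ("clasp", "CCVCC"), ("tea", "CV")]:
        print(word, fits(ENGLISH, skeleton), fits(PUTONGHUA, skeleton))
```

A skeleton such as CCVCC is accepted by the English template but rejected by the Putonghua one, which is consistent with the extra vowels in the learner pronunciation /pulisi/ for please mentioned above.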
2.11.2 Sonority scale
It is interesting to find that in English consonant clusters in onset and coda positions
disallow many consonant combinations. For example, we can have help, lump, pray, and
quick, but not forms in which the consonants of the cluster are reversed, such as *hepl
or *rpay. It is found that a SONORITY SCALE is at work.
The DEGREE OF SONORITY of different classes of sound affects their possible
positions in the syllable:
(17) Sonority scale:
Most sonorous     5 Vowels
                  4 Approximants
                  3 Nasals
                  2 Fricatives
Least sonorous    1 Stops


In a word such as clasp, the sonority of each sound gradually rises to a peak at the
nucleus and then falls at the coda, as shown in (18):
(18)
5           *
4       *
3
2               *
1   *               *
    k   l   a   s   p

This explains why *lkaps is not allowed:
(19)
5           *
4   *
3
2                   *
1       *       *
    l   k   a   p   s


The phoneme /s/ behaves unusually, however, in that it can combine with almost any
onset to form a cluster of up to three consonants, e.g. [spl-], [spr-], [str-], [skw-]. No
explanation has been found to account for this issue.
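The reasoning behind (18) and (19) can be expressed as a simple check: look up the sonority value of each segment and test whether the values rise strictly to a single peak and then fall. The Python sketch below uses the five-point scale in (17); the assignment of individual segments to classes and the function names are our own assumptions.

```python
# Sonority values from scale (17); the segment lists are a small
# illustrative sample, not a full inventory.
SONORITY = {}
for value, segments in [
    (5, ["a", "e", "i", "o", "u"]),  # vowels
    (4, ["l", "r", "w", "j"]),       # approximants
    (3, ["m", "n", "ŋ"]),            # nasals
    (2, ["f", "v", "s", "z"]),       # fricatives
    (1, ["p", "b", "t", "d", "k", "g"]),  # stops
]:
    for segment in segments:
        SONORITY[segment] = value


def sonority_profile(segments):
    """Map each segment of a syllable to its sonority value."""
    return [SONORITY[s] for s in segments]


def rises_then_falls(profile):
    """True if sonority rises strictly to one peak and then falls strictly."""
    peak = profile.index(max(profile))
    rising = all(a < b for a, b in zip(profile[:peak], profile[1:peak + 1]))
    falling = all(a > b for a, b in zip(profile[peak:], profile[peak + 1:]))
    return rising and falling


if __name__ == "__main__":
    for syllable in (["k", "l", "a", "s", "p"], ["l", "k", "a", "p", "s"]):
        profile = sonority_profile(syllable)
        print("".join(syllable), profile, rises_then_falls(profile))
```

The check is deliberately strict, so onsets such as [spl-] or [str-] fail it. This is in line with the observation above that /s/ behaves unusually.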
2.11.3 Syllabification and the maximal onset principle
No agreement has been reached as to what forms a syllable. Consequently, the
division of syllables in polysyllabic words has to be solved according to some principles.
Consider the word country /kʌntri/, which contains the consonant cluster /ntr/. Obviously,
we can't split it into /kʌ.ntri/ or /kʌntr.i/ as */ntri/ and */kʌntr/ are not permissible syllables
in English. The correct syllabification should be /kʌn.tri/ according to the MAXIMAL
ONSET PRINCIPLE, which states that when there is a choice as to where to place a
consonant, it is put into the onset rather than the coda.
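The maximal onset principle lends itself to a short procedure: for the consonants between two nuclei, give the following syllable the longest sequence that is a permissible onset and leave the rest in the coda of the preceding syllable. The Python sketch below assumes a tiny hand-made set LEGAL_ONSETS and vowel set VOWELS, both hypothetical; a realistic version would need the full inventory of English onsets.

```python
VOWELS = {"ʌ", "i", "a", "e", "ə"}

# Hypothetical sample of permissible English onsets. The empty onset ()
# is included because a syllable may begin with its nucleus.
LEGAL_ONSETS = {(), ("k",), ("n",), ("t",), ("r",), ("s",), ("p",),
                ("t", "r"), ("p", "r"), ("s", "t"), ("s", "t", "r")}


def syllabify(segments):
    """Split a word into syllables, maximizing each non-initial onset."""
    nuclei = [i for i, s in enumerate(segments) if s in VOWELS]
    boundaries = []
    for left, right in zip(nuclei, nuclei[1:]):
        cluster = segments[left + 1:right]
        # Try the longest candidate onset first, then shorter ones.
        for split in range(len(cluster) + 1):
            if tuple(cluster[split:]) in LEGAL_ONSETS:
                boundaries.append(left + 1 + split)
                break
    syllables, start = [], 0
    for boundary in boundaries:
        syllables.append(segments[start:boundary])
        start = boundary
    syllables.append(segments[start:])
    return syllables


if __name__ == "__main__":
    print(syllabify(["k", "ʌ", "n", "t", "r", "i"]))  # country
```

For country, the candidate onset /ntr/ is rejected, /tr/ is accepted, and /n/ is left as the coda of the first syllable.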
2.12 Stress
STRESS refers to the degree of force used in producing a syllable. In transcription, a
raised vertical line ['] is used just before the syllable it relates to. A basic distinction is
made between stressed and unstressed syllables, the former being more prominent than
the latter, usually due to an increase in loudness, length or pitch. This means that stress is
a relative notion. At the word level, it only applies to words with at least two syllables. At
the sentence level, a monosyllabic word may be said to be stressed relative to other
words in the sentence.
The stress pattern in English is no easy matter. In principle, the stress may fall on any
syllable. Stress patterns also change over history and exhibit regional or dialectal differences. For
example, it has been observed that inTEGral, coMMUNal, forMIDable, and conTROVersy
are becoming the norm whereas INtegral, COMMunal, FORmidable, and CONtroversy
are often considered conservative. (The capitalized syllables are stressed.) Speakers of
RP and those of General American (GA) also differ in their preferences in the stress
pattern of certain words: laBORatory (RP), LABoratory (GA); DEBris (RP), deBRIS (GA);
GARage (RP), gaRAGE (GA).
It has also been observed that stress is sometimes placed on a different syllable for
the different grammatical function a word plays. For example, conVICT (v.)--CONvict (n.),
inSULT (v.)--INsult (n.), proDUCE (v.)--PROduce (n.), reBEL (v.)--REbel (n.).
Notice that alternations of stress often occur between compounds and phrases. A
BLACKboard is used in the classroom for teachers to write on whereas a black BOARD
is any piece of board that is black in color. Similarly, a BLACKbird is a special kind of bird
but a black BIRD is any bird that is black in color.
For long words, there are often two stressed syllables, one being more stressed than
the other. The more stressed syllable carries the PRIMARY STRESS while the less stressed
syllable carries the SECONDARY STRESS. In the word epiphenomenal, for
example, the primary stress falls on -no- while the secondary stress falls on epi-. All other
syllables are unstressed ones.
Sentence stress is much more interesting. In general situations, notional words are
normally stressed while structural words are unstressed. Nevertheless, sentence stress
is often used to express emphasis, surprise etc. so that in principle stress may fall on any
word or any syllable. For example,
ex. 2-11
a. John bought a red bicycle.
b. JOHN bought a red bicycle.
c. John BOUGHT a red bicycle.
d. John bought a RED bicycle.
e. John bought a red BICYCLE.

Further Reading:
Ball, M. & J. Rahilly. 1999. Phonetics: The Science of Speech. London: Edward Arnold.
Carr, Philip. 1999. English Phonetics and Phonology. Oxford: Blackwell.
Clark, John & Colin Yallop. 1995. An Introduction to Phonetics and Phonology. 2nd ed.
Oxford: Blackwell.
Davenport, Mike & S.J. Hannahs. 1998. Introducing Phonetics and Phonology. London:
Edward Arnold.
Giegerich, Heinz J. 1992. English Phonology: An Introduction. Cambridge University
Press.
Gussenhoven, Carlos & Haike Jacobs. 1998. Understanding Phonology. London:
Edward Arnold.
Ladefoged, Peter. 1993. A Course in Phonetics. 3rd ed. Fort Worth, TX: Harcourt Brace.
Ladefoged, Peter. 2000. Vowels and Consonants. Oxford: Blackwell.
Laver, John. 1994. Principles of Phonetics. Cambridge University Press.
Poole, Stuart. 1999. An Introduction to Linguistics. London: Macmillan. (Chapters 4-5 )
Roach, Peter. 1991. English Phonetics and Phonology. 2nd ed. Cambridge University
Press.
Roca, Iggy & Wyn Johnson. 1999. A Course in Phonology. Oxford: Blackwell.
Spencer, Andrew. 1996. Phonology. Oxford: Blackwell.
Yule, George. 1996. The Study of Language. Cambridge University Press. (Chapters 5-6)