
Scat syllables and markedness theory*
Patricia A. Shaw
University of British Columbia

There could be no more appropriate dedication to Jack:


Thou Swell (scat solo)

Thou swell 1927. Words by Lorenz Hart, Music by Richard Rodgers.


Scat solo by Betty Carter, transcribed by William R. Bauer (2002a: 251).

A highly creative domain between the prosodies of human language
and the riffs of instrumental jazz is the dynamic vocal jazz idiom of
scat. The present analysis proceeds from the observation that, despite
the distinctly individualistic approaches to scatting by renowned jazz
masters such as Louis Armstrong, Betty Carter, and Chet Baker, the
inventory of the semantically empty syllables used in scat is extremely
limited in comparison to the rich range of combinatorial possibilities
that define well-formed syllables in English. This paper explores the
degree to which the form of scat syllables in the performance repertoire
of various artists conforms to postulated universal markedness
constraints on natural language syllables. Significantly, markedness
theory plausibly accounts for a considerable range of the data.
Nonetheless, certain systematic deviations occur. It is proposed that
the relative markedness of such properties may be genre-dependent,
functioning in scat to enhance musical form or modality.

1. Introduction

Like the majority of human languages in the world, which evolved and persist as strictly
oral traditions, scat emerged in the realm of musical genres as a vibrant, expressive, and
exclusively oral idiom. However, unlike human languages, scat does not build on a
consistent, conventionalized relationship between sound and meaning. Its essence is
creative, improvisational vocal tract sound. Its syllables and sequences are evocative and
emotive, but not denotative. There is no standardized or systematic interpretability to the
musically parsed cadences of scat syllables. For example, the title of Louis Armstrong's
1926 hit Heebie Jeebies has a consistent interpretation, verifiable across different speakers, as
* I am deeply indebted to Mike Fitzgerald, Kate Hammett-Vaughn, Ted Moore, Tyler Peterson, Suzanne
Pittson, Fred Stride, and particularly Bill Bauer and Alan Matheson for their generous guidance. Special
thanks to Walter Pedersen for his enthusiastic assistance with transcription and in tracking recordings.

Toronto Working Papers in Linguistics 27: 145–191


Copyright 2008 Patricia Shaw


referring to a kind of nervous energy or a scattered uneasy feeling, the jitters. However, the
sequence of syllables in the scat line in (1) would elicit no coherent consensual meaning.
(1) Bars 5–7 of the scat solo by Louis Armstrong in Heebie Jeebies (1926)
WRB: | duw daw diy duw d | diy d d dow diy | dow di dow duw duw |
| duw daw diy duw d | diy d d dow diy | dow d dow duw duw |

In a formal linguistic sense, then, scat syllables are semantically empty.


Nonetheless, of considerable linguistic interest is their form. The present analysis
proceeds from the observation that, despite the distinctly individualistic approaches to
scatting by great vocal jazz masters, the repertoire of the semantically empty syllables used
in scat is extremely limited in comparison to the rich range of combinatorial possibilities
that define well-formed syllables in English. For example, two properties of the excerpt
in (1) are immediately noteworthy, and, as it turns out, are robustly characteristic of scat
vocables produced by a broad diversity of performers. First, consider the onset and coda
structure of the scat syllables in (1): of the 15 syllables, all have a single consonant as
onset (there are no clusters and no onsetless syllables), and none has even a single consonant
as coda. In other words, all are open CV or CVG syllables, despite the fact that English
words are built on an inventory of combinatorial possibilities that readily sanctions codas,
and allows quite extensive complexity within both onset and coda clusters, e.g. as [str...]
and [...ŋkθs] in strengths [strɛŋkθs]. Secondly, not only do all the scat syllables in (1) have a
non-complex onset, but in fact they all have the same consonant [d] as the syllable onset.
As a means of comparison, now consider in (2) the structure of the syllables in
another scat solo by Louis Armstrong from Hotter Than That, recorded 3 years later
(cited from Reeves 2001 by Bauer 2002b: 308). Just as in (1), all the syllables in (2) are
canonical open syllables: all have a single segment onset, none has a complex onset, and
none has a coda. However, in contrast to (1), there are no [d]s. Rather, here all 16 syllables
have [b] as their onset.
(2) Bars 49–54 of the scat solo by Louis Armstrong in Hotter Than That (1929)
WRB: | boh b boh | ba b biy | b biy | bow b bow | b ba biy | ba biy |
| b b b | ba b biy | b biy | bow b bow | b ba biy | ba biy |

An independent measure of what has, or has not, conventionalized semantic interpretability is
reflected by which sequences of sounds are accorded entry as words in standard English dictionaries.
Consistent with the particular example chosen here, heebie-jeebies is listed as a word in the American
Heritage dictionary: "slang. A feeling of uneasiness or nervousness; the jitters." However, none of the
various potential spellings of the scat syllables (de, dee, deh, di, dih, du, duh, doo, etc.) are.
The transcription line labelled WRB is by Bauer 2002b: 308; the transliteration beneath it follows the
principles of phonemic interpretation in Appendix 1.
The use of the terms onset and coda here does not entail the attribution of category or constituency
status within a formal theory of prosodic structure. Rather, these are simply cover terms to reference (i)
as onset, the string (possibly null) of segments between the left edge of a syllable and the Nucleus, and
(ii) as coda, the string (possibly null) of segments between the Nucleus and the right edge of a syllable.
The Nucleus is assumed to be an independent category node, which in English dominates a short vowel
(V), long vowel (Vː), or diphthong (VG). C abbreviates consonant, V vowel, G glide, σ syllable.
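The cover terms onset, Nucleus, and coda defined here can be illustrated with a minimal sketch. It assumes a simplified ASCII transliteration with five vowel letters and treats a post-vocalic w or y as part of the Nucleus (a VG diphthong); the function names and the alphabet are illustrative assumptions, not the conventions of Appendix 1.

```python
VOWELS = set("aeiou")   # assumption: simplified ASCII vowel letters
GLIDES = set("wy")      # a post-vocalic glide is treated as part of the nucleus

def split_syllable(syl):
    """Return (onset, nucleus, coda) for one transliterated syllable."""
    # Onset: the (possibly empty) string before the first vowel letter.
    start = next((i for i, ch in enumerate(syl) if ch in VOWELS), None)
    if start is None:
        raise ValueError(f"no vowel nucleus found in {syl!r}")
    end = start
    while end < len(syl) and syl[end] in VOWELS:
        end += 1
    # A glide directly after the vowel belongs to the nucleus (VG).
    if end < len(syl) and syl[end] in GLIDES:
        end += 1
    return syl[:start], syl[start:end], syl[end:]

def is_open(syl):
    """An open syllable has an empty coda."""
    return split_syllable(syl)[2] == ""

for s in ["duw", "diy", "dow", "la", "bam", "duwt", "skiyp"]:
    print(s, split_syllable(s), "open" if is_open(s) else "closed")
```

Under these assumptions, forms such as duw and diy come out as open CVG syllables, while bam and skiyp are closed by a single coda consonant.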


In sum, two generalizations are strikingly evident from the data in (1) and (2). First,
of the 24 consonants available in the English phonemic inventory, the only two used in
these excerpts are [b] and [d]. Secondly, the syllable structure is consistently open, i.e. not
closed by a coda consonant. To an extreme then, Louis Armstrong's repertoire in these
citations exemplifies the fundamental premise of this research: scat draws on a very limited
subset of the sounds and of the syllabic groupings that are regularly used in English.
However, how representative are these generalizations? Is the favouring of [b] and
[d] part of Satchmo's own particular idiosyncratic style, or is this genuinely something that
is broadly characteristic of scat? What (you are doubtless wanting to interject) about
the [ʃ] in shoo be doo? And to what extent do other scat singers use a more diversified and
complex range of syllable shapes? What about codas? After all, who put the bop in the bop
shoo bop shoo bop?
A diverse sampling of vocal scat is investigated here, ranging from classic jazz icons
like Louis Armstrong, Betty Carter, and Chet Baker to pop music song-writers/recording
artists like Johnny Cymbal and Barry Mann, who in the early '60s wittily transported the
playful and unmistakably sexy edginess of scat directly into their rock 'n' roll lyrics. Across
these artists, generations, and genres, the basic introductory observations about scat
are consistently affirmed: the inventory of sounds used and their syllabic organization
constitute a significantly small subset of the full diversity of available English options.
The principal goal is to identify just what generalizations about phonological form hold
within this body of scat data, and to explore various hypotheses that might plausibly
explain why the particular patterns that are attested emerge in scat.
From the perspective of linguistic theory, the observations are evaluated in the
context of postulated universal constraints on articulatory phonetics and phonological
markedness. Interestingly, a considerable range of the data is plausibly accounted for by
markedness theory. Equally interesting is the finding that certain systematically attested
scat properties run directly counter to markedness expectations. The highly marked, yet
robustly attested status of these characteristics suggests that over-riding the body of
linguistic constraints on the scat phonological system are competing constraints on scat
as a musical performance genre, constraints that function to enhance the melodic pitch
contour, the musical phrasing, the auditory interpretation, or the distinctive trademarking
of individual artistic style. What results from this analysis is a unique perspective into
the structural and performative interface of two complex systems of human vocal
expression, music and language, each subject to distinct sets of constraints and
conventions, sometimes convergent and sometimes conflicting, but ultimately combining
in the creative exuberance of scat.

1.1. Purview

The analyses in §2 below are sequenced with respect to the recording date chronology.
Beginning with the seminal 1926 Heebie Jeebies recording, the full context of the Louis
Armstrong scat solo from which the three-bar excerpt in (1) was drawn is explored
Barry Mann and Gerry Goffin did, in their 1961 hit single, Who Put the Bomp?


in §2.1. This is then compared to the phonological properties of a 1929 rendition
of Hotter Than That. In §2.2, the focus shifts to Chet Baker (1955; 1989), an icon
of consonantal minimalism. In contrast, Betty Carter's repertoire, representatively
examined (1955; 1979) in §2.3, introduces a considerably expanded consonantal
inventory. Through the subsequent decades, these two artists, Chet Baker and Betty
Carter, remained committed to the vocal jazz idiom of scat despite a significant shift
in the general public's musical interests away from bebop. For each, a comparison of
performances recorded nearly a quarter century apart provides an interesting measure
of individual creative evolution, as well as of particular consistencies despite dramatic
shifts in the musical and cultural backdrop of the latter half of the twentieth century.
Although the popularity of bebop (the jazz medium that had become virtually
synonymous with scat) had significantly declined by the '60s, the vocabulary of scat
itself surged into a different realm of wide-spread prominence in that same period: the
American Hit Parade. As seen in §3.1 and §3.2, in major hits by recording artists like Barry
Mann (1961) and Johnny Cymbal (1963; re-recorded by Sha Na Na in 1980), canonical
scat syllables are directly imported into lyrics like Who Put the Bomp in the Bomp bah
Bomp bah Bomp? Here scat is explicitly objectified, transported, and incorporated into
a different and evolving musical genre. Although bereft of its improvisational core, this
embedded scat phenomenon carries forward the continuing identification of scat as
infectiously fun and irresistibly seductive. Despite rife competition for cornering the
sex appeal market from a burgeoning and rapidly diversifying popular music scene in
America, it was scat (the bop, the dip, and the rama lama ding dong) that made my baby
fall in love with me, yeah!! By 1963, Mr. Bass Man's baw b b baw b baw b baw baw
had elevated him to being the hidden King of Rock 'n' Roll (Johnny Cymbal 1963), and
scat had clearly spread from bebop jazz to become established in the R 'n' R mainstream
as eminently cool.

1.2. Methodology
Transcriptions of the body of scat data that informs the present study are presented
in Appendix 2. With a few notable exceptions, particularly Bauer (2002a, b), there is a
paucity of formal documentation of scat, and the diverse original sources that have been
drawn on here differ considerably in transcription conventions and rigour.
Bauer's work constitutes an immensely detailed and valuable resource: in the
extensive Appendix (2002a: 245–343) to his outstanding contribution to the study of
Betty Carter's musical genius, Bauer provides a full transcription in musical notation of
Carter's melodic line, synchronized with the lyrics, for 15 tunes. Of these, six incorporate
scat vocables, phonologically transcribed by Bauer in Trager-Smith notation. The two
chosen for the analysis in §2.3 allow a comparison across a 24-year time frame stretching
from 1955 to 1979. As well as Bauer's Betty Carter material, the present analysis also
incorporates his transcription (2002b) of Louis Armstrong's Heebie Jeebies and his


citation of Reeves' (2001) transcription of bars 49–55 of Louis Armstrong's Hotter Than
That scat solo. Note, however, that the Trager-Smith system adopted by Bauer has been
transliterated here, following the transcription conventions detailed in Appendix 1.
Two other helpful sources were Kernfeld's (1995) transcription of Armstrong's
Hotter Than That and Bastian's (Bastian and Alexander 1995) transcriptions of Chet
Baker's scat solos. As both these writers used different non-standardized representations
(duh, day, doe, etc.) that were ambiguously interpretable, these were re-transcribed from
audio files of the original recordings, following the principles in Appendix 1. This
re-transcription is directly paired with the source transcriptions in Appendix 2.
For the other songs (§3.1, §3.2), the transcriptions presented here are novel. It is
worth foregrounding the complexity and relatively narrow focus of this task. Because the
goal is to relate the articulatory expression of these singers to the range of phonological
parameters that typologically characterize natural language systems, many features
of the sophisticated manipulations of vocal tract sound are not represented in the
relatively broad transcription system adopted here. Further, individual perceptions of the
appropriate categorization of a constantly mutating cadence of vocables into segmental
values may differ, as discussed in detail in Appendix 1. Given the paucity of literature
on linguistic properties of scat, this preliminary study will hopefully open the door to
further research into the nature of this interface.

2. The Phonological Properties of Scat

The analytical goal in this section is to examine the phonological inventory of onsets
and codas in the scat syllables of the tunes documented in the database in Appendix
2, as well as to determine general properties of syllable shape in the output. Some
challenges related to the fluidity of the medium or of individual expression are raised
in the discussion of particular performances below. More general methodological issues
pertaining to the classification of syllabic form are presented in Appendix 1.

2.1. Louis Armstrong


Of Louis Armstrong's vast repertoire, an examination of two of his recordings from the
early heyday of jazz in the 1920s serves here to establish a frame of reference both for
Armstrong's own style and for subsequent diachronic developments in scat.
Reeves, Scott. 2001. Creative Jazz Improvisation. 3rd ed. Upper Saddle River, N.J.: Prentice-Hall. This
resource was not available to me, and hence is cited only through Bauer's (2002b) reference.
These were re-transcribed independently by myself and by a research assistant with both musical and
linguistic training. Where there was variance in the transcriptions, either between us and/or with cited
sources (e.g. §2.2), I assume sole responsibility for the interpretation adopted in this analysis.
Thus, the present focus is on consonantal patterns. For analysis of vowel quality in scat syllable nuclei,
the interested reader is referred to Bauer (2002b), which presents detailed discussion of vocalic timbre.


2.1.1. Louis Armstrong, Heebie Jeebies (1926)


Even a cursory look at the first 4 non-lexical syllables ([e iyf gf mf]) that lead into
the scat solo of Heebie Jeebies (see Appendix 2.1.1) suffices to identify them as unusual in
comparison with the syllabic patterns which follow. Therefore, the analysis below focuses
first on the subsequent 48 syllable tokens.
The chart in (3) summarizes the findings about simplex syllable onsets. Consonants
which are attested in onset position are in white cells, along with their raw frequency
count. Possible, but unattested, onset consonants appear in shaded cells. Additional
information about onsetless syllables and cluster behaviour is on the right.
(3) Onsets: d = 37, b = 5, l = 1, r = 1
    No Onset: 1/48
    Onset clusters: sk = 3
    Unattested (shaded): the remaining 20 English onset consonants, e.g. n

Viewed against the full backdrop of the 24 consonants which can function as syllable onsets
in English, the fact that 20 (83.3%) are not used at all (viz. the shaded grey cells) clearly
underscores the initial premise that scat is highly selective in its segmental inventory. Of
the four segments [d, b, l, r] that do appear as onsets, [d] is the clear favourite, initiating
37 of the 48 syllables (77.1%). As one might expect from the discussion in §1, the runner-up
is [b], and although it trails far behind with only five appearances (10.4%), its occurrence
is nonetheless salient. The liquids [l] and [r] make an early appearance in syllables 5 and
7 respectively of this set of 48, followed very shortly (beginning with σ3 of bar 4) by a
running stream of 19 consecutive [d]-initial syllables.
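The proportions quoted in this paragraph follow directly from the raw counts. A minimal sketch of the arithmetic, using only the figures stated in the prose (the variable and function names are illustrative):

```python
ENGLISH_ONSET_CONSONANTS = 24      # figure assumed in the text
TOTAL_SYLLABLES = 48               # syllables after the four-syllable lead-in
attested = ["d", "b", "l", "r"]    # simplex onsets attested in the solo
counts = {"d": 37, "b": 5}         # only the counts quoted in the prose

def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

unused = ENGLISH_ONSET_CONSONANTS - len(attested)
print(f"unused onsets: {unused} ({pct(unused, ENGLISH_ONSET_CONSONANTS)}%)")
for seg, n in counts.items():
    print(f"[{seg}]: {n}/{TOTAL_SYLLABLES} = {pct(n, TOTAL_SYLLABLES)}%")
```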
Markedly heralding the start of a new phrase in bar 8, an initial [b] breaks the
[d]-only alliteration, leading into an alternating b-d-b-d-d sequence. Then, after
this cascade of 22 [d] onsets with only two [b] onsets having disrupted the auditory
flow, in bar 9 the only consonant cluster hits: [sk]. Its alliterative sequencing (three in
a row), its timing, and its composition all contribute to its striking impact. Nothing
has primed the listener for an [sk] cluster. Although [sk] is not at all an uncommon
English onset, in the context of the segmental composition of Louis Armstrong's
scat sequence here, it is totally deviant: neither [s] nor [k] occurs anywhere else, either
before or after, and it has unique status as the only onset cluster. Frequency, then, is
significant, not only at the high end in terms of ascertaining what segments might
most commonly appear in scat vocalization, but also at the low end in terms of observing
what segments and/or combinations are drawn on only very rarely, to powerful effect.
Although the vast majority of syllables in the Heebie Jeebies solo are open (39/48 =
81%), the identity and frequency of the attested coda consonants is shown in the chart
in (4). Of the 21 possibilities, only three appear, with [p] being the most common. Note
that there is no overlap at all in the identity of the consonants that occur as onsets
As post-vocalic [w] and [y] appear only in diphthongs, they are not counted as possible codas.


[d, b, l, r, sk] and those that occur as codas [p, m, t]. This is patently not an inherent
characteristic of English (cf. words like pad, tab, mask, etc.), but will be seen to be a
common characteristic, particularly of obstruents, in scat.
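(4) Codas: p = 6, m = 2, t = 1
    No Coda: 39/48
    Coda clusters: none
    Unattested (shaded): the remaining 18 possible coda consonants, e.g. b, n, l, r
    Not counted as codas: (y), (w)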

A final question is whether any particular syllabic forms, from a holistic
perspective, are preferred. In this 48-syllable sample, there are three favoured shapes:
nine tokens of [d], eight each of [diy] and [duw]. Aside from these, there is remarkably
little repetition of exactly the same phonological form in the residual 23 syllables. The
frequency counts of the particular scat shapes are given in the following table:
(5) Frequency     Syllable form
    9 (18.8%)     d
    8 (16.7%)     diy, duw
    3 (6.2%)      d, dow
    2 (4.2%)      dp, daw, b
    1 (2.1%)      biy, bam, bp, duwt, dey, la, rp, p, skiyp, skm, sk
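As a quick consistency check on (5), the percentages can be converted back into token counts and summed. This is only a sketch: it assumes each percentage is a share of the 48-syllable sample, and it pairs each percentage with the number of distinct syllable forms listed at that frequency.

```python
SAMPLE_SIZE = 48

# (percentage per form, number of distinct forms listed at that frequency)
rows = [(18.8, 1), (16.7, 2), (6.2, 2), (4.2, 3), (2.1, 11)]

def tokens(percent):
    """Recover the integer token count behind a rounded percentage."""
    return round(percent / 100 * SAMPLE_SIZE)

per_form = [tokens(p) for p, _ in rows]
total = sum(tokens(p) * n_forms for p, n_forms in rows)
# Syllables left over once the three favoured shapes are set aside.
residual = SAMPLE_SIZE - (tokens(18.8) + 2 * tokens(16.7))

print(per_form, total, residual)
```

On these assumptions the rows account for the whole sample, and the residue after the three favoured shapes matches the figure given in the text.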

Having established this body of generalizations about onsets (3), codas (4), and overall
syllabic form (5), let us return to formally consider the properties of the introductory
four syllables: [e iyf gf mf]. Clearly the initial impression that these four syllables
are unusual in the context of the entire scat sequence is indeed validated. Three of the
four are onsetless, compared to only one of the 48 syllables that follow. The only onset
consonant, [g], is unique: this segment appears nowhere else in the full scat database
examined here. With respect to codas, note that there are no coda clusters anywhere
else in the work, whereas this introductory sequence ends emphatically with an [mf]
cluster. Moreover, the last three of these four syllables reiterate the coda [f]: not only
are codas relatively infrequent in the rest of the work (there are only nine codas in 48
syllables: 18.7%), but the particular segment [f] is unattested elsewhere as either a coda
or an onset. Louis Armstrong's choice of such unusual scat form in this quadra-syllabic
bridge functions dramatically to grab the listener's attention as Armstrong moves from
the preceding English lyrics invoking everyone to c'mon and do the Heebie Jeebies
dance to settle into the full-blown canonical scat syllables that follow.

2.1.2. Louis Armstrong, Hotter Than That (1929)


The second Louis Armstrong tune analyzed here is the much longer 165-syllable scat solo
from Hotter Than That (see Appendix 2.1.2), from which the excerpt cited earlier

in (2) was taken. Whereas bars 49–54, as seen in (2), draw exclusively on a sequence of
[b]-initial syllables, a full count of onsets throughout the solo shows that [d] (= 78) is in
fact used more frequently than [b] (= 57). [d] and [b] are by far the most prevalent onset
consonants, with [b] exceeding the next ranked candidate [w] by a difference of 49.
(6)

Onsets: d (), b (), w (), l (), r (), n (), m (), y (), h (), t ()
No Onset: /
Onset clusters: zw (), mw (), bw ()

Even in this work where the inventory of onsets stretches to 10 different segments,
consistent patterns recur. For example, the four onsets attested in Heebie Jeebies, viz. [d,
b, l, r] are all included within this larger set. Of the residual segments, all are attested,
though with low frequency, in the other scat data investigated here, except [t]. The
occurrence of [t] as an onset is unique not only in this song (in the second syllable of the
otherwise uniform [d]-initial syllables in line BK), but also in the entire sample of scat
repertoire studied here.
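The inclusion relation described here can be stated compactly with sets. The sketch below uses the onset inventories as given in the text for the two solos; the variable names are illustrative.

```python
heebie_jeebies_onsets = {"d", "b", "l", "r"}
hotter_than_that_onsets = {"d", "b", "w", "l", "r", "n", "m", "y", "h", "t"}

# Every Heebie Jeebies onset recurs in the larger Hotter Than That set.
is_subset = heebie_jeebies_onsets <= hotter_than_that_onsets
# Segments that are new in the later solo.
residual = sorted(hotter_than_that_onsets - heebie_jeebies_onsets)

print(is_subset, residual)
```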
Moving to a consideration of the onset clusters attested in Hotter Than That, we
encounter an interesting trio: [zw] (time 2:02), followed in the same line by [mw] and
shortly thereafter by [bw]. Not only are none of these found elsewhere in the present
database, but none of these /Cw/ sequences is part of the standard repertoire of English
syllable onsets. Louis Armstrong here is clearly deviating from the canonical constraints
on English well-formedness, and native English listeners would, of course, attend to such
novelty immediately. The hypothesis to be advanced here is that such cases illustrate a
domain of tension between linguistic form and musical expression, where enhancement
of the latter is achieved by violation of markedness constraints on the former.
Consider next the coda inventory:
(7)

Codas: p (), t (), m (), (), n (), l (), g ()


No Coda: /
Coda clusters:

As was the case in Heebie Jeebies, most (76.4%) of the syllables in this tune too are open.
Although there is somewhat greater segmental diversity in the coda repertoire, it is still
very limited: only six of the 21 possible consonantal codas are attested. There are no
coda clusters. Interestingly, the three coda segments ([p, t, m]) that appear in Heebie
Jeebies constitute a proper subset of the larger coda inventory here, with [p] again being
significantly more frequent (2.75 times more; 52.4% of coda attestations) than its closest
contender [t] (19.0%). Three syllables in this work are realized exclusively as a syllabic
[m̩]. Apart from these cases, [m] functions once as an onset (see (6)) and three times as
a post-vocalic coda. Interestingly, all instances of [m̩] follow a coda [w] or a [u] in the
preceding syllable. The shared labial gesture across this sequence is a kind of harmonic
pattern which recurs in various forms in other case studies below.
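The two coda claims in this paragraph (the proper-subset relation and the relative frequency of [p] and [t]) can be checked with a short sketch. It assumes that the six consonantal codas of Hotter Than That are p, t, m, n, l, and g, and it works from the rounded percentages quoted in the text, so the ratio is only approximate.

```python
heebie_jeebies_codas = {"p", "t", "m"}
# Assumption: the six consonantal codas listed for Hotter Than That.
hotter_than_that_codas = {"p", "t", "m", "n", "l", "g"}

# "<" on sets tests for a proper subset (included, but not equal).
proper_subset = heebie_jeebies_codas < hotter_than_that_codas

# Shares of coda attestations quoted in the text, in percent.
ratio_p_to_t = 52.4 / 19.0

print(proper_subset, round(ratio_p_to_t, 2))
```

The ratio of the rounded shares comes out at roughly 2.76, in line with the stated factor of 2.75.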
Although [d] is attested as an onset segment 21 more times than [b] (78 vs. 57), when one
looks at which full syllable shapes recur most frequently, the two are pretty comparable:
For space reasons, for the rest of the discussion attested consonants will not be contextualized within
the full inventory of English as in (3) and (4), but will simply be listed in rank order of frequency.


[ba] edges out [d] by a count of 16 to 15. [bi] in its variant realizations (i.e. with length
and/or homorganic glide) is tied with [da] at twelve occurrences each, then the favoured
[d] takes over in the next most frequent syllables [di] and [du].
(8) Frequency     Syllable form
    16 (9.7%)     ba
    15 (9.1%)     d
    12 (7.3%)     bi ~ biː ~ biy, da
    11 (6.7%)     di ~ diː ~ diy
    8 (4.8%)      du ~ duː ~ duw

Note that none of the most common syllables here have front/back lax or front/back
mid vowels.

2.2. Chet Baker


Among the major scat artists through the decades, Chet Baker is renowned for the
extreme minimalism of the consonant set that forms the basis for his scat improvisations.
A comparison of different takes of the same tune, Everything Happens to Me, recorded
more than three decades apart (1955 compared with 1989), illustrates remarkable
consistency in the consonantal repertoire employed, despite major differences in the
melodic and rhythmic structure.
Transcriptions of the eight-bar scat bridge in these two versions are given in
Appendix 2.2.1 and 2.2.2. Although Jim Bastian's transcriptions (labelled JB) and my own
(labelled PAS) differ in orthographic form, they are generally consistent in those features
relevant to the present focus. However, two domains of difference merit comment.
One pertains to vowel quality: Chet Baker's vocalization is extraordinarily mobile.
The looseness and fluidity of movement in Baker's vocalic articulation present significant
challenges, such that the transcribed values that I propose are at best an approximation of
a nuclear target range within the interconsonantal domain. What emerges most reliably
is a general pattern of lax quality (primarily [ ]) and the predominant openness
of syllabic form.
The second notable difference between Bastian's notation and mine pertains to
consonants. Whereas Bastian remarks on the fact that "Chet's scat vocabulary made
predominant use of syllables beginning with the letter D" (Bastian and Alexander 1995:
4), not all [d]s are distinctly articulated with a full stop closure. In a number of cases, what
is phonetically realized is the corresponding fricative [ð]. For example, the AIF wave file
in (9) from the 1955 version (time = [2:37.6–2:38.4]) shows a sequence of two syllables,
Whereas Bastian's orthographic interpretation is English-like, e.g. ee for [i], the transcription I offer
follows the principles in Appendix 1, with explicit representation of the more prominent glides but
otherwise just length on the tense vowels.
A discrepancy in bar 6 of the 1989 version is that JB documents 2 more syllables than I am able to
discriminate. The present analysis is based on my total count of 53 syllables vs. Bastian's 55. However, the
generalizations remain statistically robust, regardless of the difference in syllable count.


the first with a clear [d] stop closure attack in comparison with the lack of full closure [ð]
in the onset of the second syllable:
(9) [Waveform of the two-syllable sequence, with the JB and PAS transcription tiers aligned beneath it]

This tendency is much more prevalent in the 1955 version, where of the 42 "D"
onsets, 13 are realized as [ð]. In the 1989 version, only one of 45 "D"s is. It is entirely
plausible that the phonological target in cases like the second consonantal onset in (9)
is indeed a /d/, as consistently represented in Bastian's transcriptions, but that its lenition
to the smooth, non-punctuated continuant [ð] may reflect Chet Baker's airy, almost
weightless, romantic crooner style, disarmingly characterized as "being sweet talked by
the void" (Bastian and Alexander 1995: 4).
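The difference between the two takes is easier to appreciate as a rate. A minimal sketch, using the counts given in this paragraph (the function name is illustrative):

```python
def lenition_rate(lenited, d_onsets):
    """Share of /d/ onsets realized without full stop closure, in percent."""
    return round(100 * lenited / d_onsets, 1)

rate_1955 = lenition_rate(13, 42)   # 13 of the 42 "D" onsets in 1955
rate_1989 = lenition_rate(1, 45)    # 1 of the 45 "D" onsets in 1989

print(rate_1955, rate_1989)
```

On these figures, nearly a third of the 1955 /d/ onsets are lenited, against roughly two percent in 1989.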
Invoking Sapir's (1933) "psychological reality of the phoneme" argument, the
hypothesis advanced here is that Bastian's perceived "D" is interpretable as a more abstract
level of representation, i.e. phonemic /d/, and that its sometimes lenited non-plosive
phonetic realization as [ð] is a phonologically non-distinctive, surface level articulation.
Consistent with this interpretation is the broad-based generalization in §1 that [d] is part
of the standard scat repertoire; [ð] is not otherwise attested in any of the scat pieces by
other artists studied here. In the analyses that follow, then, Baker's [ð] articulations are
taken to be epiphenomenal and are not independently represented in his scat inventory.

2.2.1. Chet Baker, Everything Happens to Me (1955)


The onset repertoire of the early (1955) version of Everything Happens to Me reveals a
highly skewed frequency distribution:
(10)

Onsets: d (), y (), b (), h ()


Onset clusters:

No Onset: /
Ambisyllabic [t] coda/onset: /

Similar to what was seen in Louis Armstrong's rendition of Heebie Jeebies (§2.1.1), where
[d] initiates 37 of the 48 syllables (77%), here /d/ accounts for 77.8% (42/54) of the onsets.
Concomitantly, the relative infrequency of the residual segments raises questions as to
their distribution and functional load. The next most frequent onset is [y]; it occurs only
three times (3/54).
Interestingly, the distribution of these markedly less frequent segments is often
melodically significant. For example, both [h] and [b], which occur only once each,
The evaluation of No Onset status is challenged by Baker's fluidity of articulation. Specifically,
there are six cases in Baker 1955 and three in Baker 1989 where a coda [t] precedes a syllabic [n̩]: as
the [t] is interpretable as an ambisyllabic transition creating an onset for [n̩], these are not counted
as No Onset.


appear in particularly prominent prosodic positions. Each is phrase-initial: the only
occurrence of [h] introduces the second major phrase in bar 3, and the sole instance of
[b], in the up-take into bar 7, initiates the final phrase of the scat bridge.
Summarized in (11), the coda inventory is even more minimal.
(11)

Codas: t (), (), n ()


Coda clusters:

No Coda: /

Combining the consonantal repertoires of (10) and (11), we see that Baker's 1955
improvisation utilizes a mere six segments from the full English set of 24 options: 25%
of the available inventory.
As observed in the previous works, here too there is a strong preference for open
syllables (39/54 = 72.2%). However, in contrast to Louis Armstrong, for whom [p] was
the most frequent coda, Chet Baker does not use [p] at all, in either of the two scat
performances examined here. Rather, his codas are exclusively alveolar [t, n], with [t] being
the more prevalent.
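The three proportions cited for the 1955 take can be reproduced from the stated counts. A minimal sketch (the names are illustrative; the inventory set simply combines the onsets and codas named in the text):

```python
def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

SYLLABLES_1955 = 54
ENGLISH_CONSONANTS = 24                       # figure assumed in the text

d_share = pct(42, SYLLABLES_1955)             # /d/ onsets
open_share = pct(39, SYLLABLES_1955)          # syllables without a coda
inventory = {"d", "y", "b", "h"} | {"t", "n"}  # onsets plus codas
inventory_share = pct(len(inventory), ENGLISH_CONSONANTS)

print(d_share, open_share, len(inventory), inventory_share)
```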
Somewhat parallel to the trans-syllabic gestural continuity of the feature [labial]
leading into [m̩] in Louis Armstrong's Hotter Than That, there is a consistent homorganic
pattern observed in the distribution of [n̩] in Baker's scat. Specifically, all instances of [n̩]
are immediately preceded by a homorganic coda [t]. Further, all cases of coda [n] or [n̩]
are followed by a homorganic /d/ onset of the subsequent syllable.
In terms of syllable shape, Chet Baker's preferred forms in this 1955 take are syllables
where his near-ubiquitous /d/ combines with a non-low, non-high lax vowel:
(12) Frequency      Syllable form
     19 (35.2%)     d() ~ ()
     9 (16.7%)      d() ~ ()

2.2.2. Chet Baker, Everything Happens to Me (1989)


Although by no means identical in rhythmic, melodic, or expressive form, the 1989
performance of this same song is remarkably consistent in its consonantal inventory. The
most transparent difference in the onset repertoire is the fact that [b], used only once in
the 1955 version, is completely absent in the 1989 take.
(13)
Onsets: d (), y (), h ()
Onset clusters:
No Onset: /
Ambisyllabic [t] coda/onset: /

As seen in (13), the prevalence of /d/ in the 1989 version emerges as even more
disproportionate, accounting for 85% (45/53) of the onsets. Clearly, /d/ in and of itself
constitutes the core of Chet Baker's consonantal inventory. Again, where another segment
is used by Baker, it functions through its very uniqueness to demarcate a prosodically
prominent position. Thus, the sole occurrence of [h] introduces what is arguably the most
prosodically salient position: the very first syllable of the first phrase of the scat bridge.
In the 1989 version, Baker's sparse and tightly restrictive treatment of codas is
remarkably consistent with his 1955 repertoire, though their particular distribution in the
scat melodic lines is entirely divergent.
(14)
Codas: t (), (), n ()
Coda clusters:
No Coda: /

Footnote: For example, a very rudimentary comparison shows the opening bar in the 1955 version has 6
syllables moving from Ebm to A b+ towards Db , whereas bar 1 in the 1989 recording has 10
syllables modulating from Fm through Bb towards Eb . (Note: = major 7).

As documented in (14), the same 2 segmental values are attested as in Baker's 1955 coda
chart in (11). Again, all three instances of [] are introduced by a dual function coda/onset
[t], and are followed by a homorganic onset [d].
Given the fluid mobility of Chet Baker's vowel articulations, a characterization
of his favoured syllable shapes unequivocally identifies an open syllable with a [d]
onset but is much less definitive in terms of vowel quality. Most generally, as in the 1955
version (see (12)), his articulation meanders around a mid lax vowel, either schwa [] or
a neutral position [], identified for English as the characteristic articulatory setting for
the onset of speech (Chomsky and Halle 1967). However, on notes of longer duration, his
resonant crooning often ascends to a tenser high back [u]. Based on the transcriptions in
Appendix 2, there is considerable consistency between the 1955 and 1989 versions in terms
of a frequency of use ranking:
(15) 1955 Frequency   Syllable form     1989 Frequency   Syllable form
     (35.2%)                            (17%)
     (16.7%)                            each (13.2%)     d, du, d

However, as is evident from the lower frequency numbers and the three-way tie for
second place in the 1989 count, there are no strongly identifiable constraints on his
wide-ranging vocalic diversity.

2.3. Betty Carter


Even as the repertoire of scat vocabulary expanded through the creatively explosive bebop
rush of the 1940s, [d] and [b] remained particularly prominent. For example, although
Betty Carter was a major innovative force in extending the repertoire of jazz vocables,
Bauer notes that in Carter's short scat solo in Babe's Blues (1958), of the nine consonants
which are used as syllable onsets, /b/ and /d/ together initiate more than half of the
vocable classes used in the solo (2002b: 312). Other Betty Carter songs attest to this
same generalization: in my count of the 197 syllables in her 36-bar scat solo rendition of
You're Driving Me Crazy (1958; transcribed by Bauer 2002a: 252-254), the most frequent
onset consonant is [d] (in 80 of the 197 syllables) and the next most frequent is [b] (in
36 of the 197 syllables). Thus, although Carter uses 10 different consonants as onsets in
this solo, the two segments [b] and [d] together comprise the majority (58.9%) of onset
choices. In the following sections, we look at two of her other tunes to broaden the base
of comparison further.

Footnote: My commentary on Betty Carter is deeply indebted to Bauer's (2002a,b) insightful and superbly
documented interpretation of her life and work.

2.3.1. Betty Carter, Thou Swell (1955)


Recorded the same year (1955) as the early version of Chet Baker's Everything Happens to
Me that was considered in §2.2.1, Betty Carter's scat rendition of the original 1927 classic
Thou Swell draws on the following inventory of eight consonants as simplex onsets (see
Appendix 2.3.1).
(16)
Onsets: d (), b (), l (), y (), w (), h (), r (), ()
Onset clusters: ly (), dl (), sp ()
No Onset: /

Constituting a combined total of 75/114 (= 65.8%), the consonants [d] and [b] are
reaffirmed as incontestably at the core of Carter's (and everyone else's) stock of scat
resources. Although less frequently drawn on, the consonants [l, r, y, w, h] are all familiar
as staple scat segments that have been attested in the work of Louis Armstrong and Chet
Baker examined in the preceding sections.
The innovative element in (16) is Carter's once-only exploitation of [] (bar 13,
coupled with the unique attestation of [r] in the sequence [iy ra]). The use of [] is rare in
Betty Carter's scat, although it figured prominently in the influential repertoire of Sarah
Vaughan and became a flagship marker of 1950s doo wop motifs like shoo bee doo and
sha na na. Despite the collective recognition among jazz artists of certain segments
being standard communal property in the scat arsenal, other specific sounds acquired the
status of individual trademarks. Carter reportedly admonished a young vocalist in 1978:
"Why are you using scat syllables like shoo-bee-doo-bee? Those belong to Sarah, and
they belong to the fifties." (Berliner 1994: 254, 804, cited by Bauer 2002b: 314-315) At the
heart of improvisational creativity in music, as in language, is the challenge of innovation
under the constraints of structural limitations, critically the inventory of segments and
restrictions on their combination. Given the very small set of sounds that came to be
established as the conventional scat inventory in the works of the early artists, to then
have certain consonants among these evolve into sound symbolic associations with a
particular singer and/or decade effectively heightens the challenge for new artists to
create an individualistic scat voice.
Onset clusters are generally quite rare in scat. Of the four that occur in this work,
only one, [sp], conforms to standard well-formedness constraints of English. The other
two, [ly] and [dl], draw on segments that are very common in the scat inventory of onsets,
but in bundling them into tauto-syllabic onset sequences Carter pushes beyond the
canonical bounds of regular English. Just as [] became a Sarah Vaughan scat trademark,
the [dl] onset is a strong candidate for a Betty Carter signature: jumping ahead 27 years
to her 1982 recording of What a Little Moonlight Can Do, this same highly marked onset
appears eleven times, most strikingly in a sequence of six syllables in the climactic scat
line of bars 189-190 (WB line as transcribed by Bauer 2002a: 317; transliteration (2nd
line) as in Appendix 1):
(17)
WB: | weh dlow dlow | dl dle dle | dlow dow | ...
| w dlow dlow | dl dle dl | dlow dow | ...

Footnote: It was from the vocals in the Silhouettes' 1957 hit song Get a Job that the '50s revival group, Sha Na Na,
took its name.
Footnote: What a Little Moonlight Can Do (1982) Whatever Happened to Love? Verve/Polygram 835 6831; see
transcription by Bauer (2002a: 310-343).

Carter's usage of codas in Thou Swell is infrequent, as seen in (18), and the observed
patterns are familiar. She draws strictly on the resonants [m, n, l]. Both of the syllabic
segments are alveolar, and follow a homorganic onset [d].
(18)
Codas: m (), (), l (), ()
Coda clusters:
No Coda: /

The syllable shapes which surface most frequently in this piece are not at all
surprising either:
(19) Frequency      Syllable form     Frequency      Syllable form
     (13.2%)                          (10.5%)        ba
     (12.3%)        duw               (8.8%)

In sum, despite the creative uniqueness of how her scat artistry uses them, Carter's
arsenal of tools as represented by this acclaimed 1955 performance draws on a markedly
standard repertoire.

2.3.2. Betty Carter, Open the Door (1979)


Based on his intimate and broad-based musical insights into the full body of Betty Carter's
relatively small recorded output, Bauer (2002a: xi) contends that the defining features
of Carter's style remained consistent even as her approach kept changing. From the
linguistic perspective of the present study, a comparative analysis of the scat interludes in
Carter's 1979 version of Open the Door (see Appendix 2.3.2), recorded 24 years later than
the 1955 work discussed above, reveals a tightly focussed phonological repertoire. The
five onset segments that appear in the 1979 version, documented in (20), are a subset of
the eight that were used in Thou Swell (see (16)).
(20)
Onsets: d (), y (), w (), l (), h ()
Onset clusters:
No Onset: /

Notably absent from the attested onsets in (20) is [b]. However, the ubiquitous scat onset
[d] is not only present, but strongly dominant, introducing 19 of the 27 syllables (= 70.4%).
The other onset segments here, viz. [l, y, w, h], are all scat basics, not just in Carter's
earlier work, but in that of other scat vocalists.
Of particular interest in (21) is the total absence of post-vocalic coda consonants.


(21)
Codas: (), ()
Coda clusters:
No Coda: /

The only syllables that are not open CV or CVG structures are the 3 cases where
there is a syllabic nasal. A comparison of the first three scat lines (cf. bars 9, 14, and 16,
respectively, in Appendix 2.3.2) reveals a striking and doubtless strategic parallelism of
form and function where these three syllabic nasals occur. Specifically, each is in absolute
phrase-initial position of the first three scat lines, with each new cycle entailing some
minimal variation from the preceding one: labial [] in the first phrase shifts place of
articulation to alveolar [] in the second phrase, which itself is repeated in the third line
but differentiated by the introduction of an [h] onset. Abstracting away from rhythm,
duration, and pitch, the segmental content of these three lines is reproduced below:
(22)

d dow ...
duw duw duw ...
h duw duw diy duw ...

What this short prosodic progression illustrates is that far from scat being comprised
of randomly articulated sequences of a delimited set of nonsense syllables, the skill of a
brilliant scat artist like Betty Carter entails masterful structuring of content and sequence:
here, each nasal syllable introduces an iteration of exclusively [d]-initial syllables, and
each line builds substance and momentum by adding one more syllable.
Finally, in determining which syllable shapes are most prevalent, there are two that
clearly emerge as most frequent:
(23) Frequency      Syllable form
     (29.6%)        du(w)
     (25.9%)        dey

While [du(w)] figures prominently in the repertoire of her other work (cf. (19)) and that
of the other singers sampled here, [dey] is less favoured, though not unattested (cf. (5)).

2.4. Syllable Structure Generalizations


Having documented specific aspects of syllable content and form in two different works
from each of three renowned jazz vocalists, spanning the 63 years between 1926 and
1989, we are now in a position to determine what generalizations, if any, hold across this
sample, despite each artist's highly individualistic musicianship and distinctly unique
approach to the idiom.
The initial question posed in §1 was to what extent the delimitation of onsets to [d]
and [b], as exemplified by the brief excerpts in (1) and (2), is representative of a broader
database of scat. The onset tabulations from each previous section (viz. (3), (6), (10), (13),
(16), (20)) are summarized in the table in (24) below. Note in (24) that the frequencies
of [d] and [b] are given both as a token count and as a percentage value of the number
of scat syllables in each piece. There are three particularly interesting facts revealed by
these results. First, none of the six works studied here (including the full texts of each of
the classic performances from which (1) and (2) were drawn) uses exclusively [d] and/or
[b] onsets. In every case, the vocalist has chosen some scat syllables with other onset
consonants, however minimal this extended range may be. For example, in cases like
Louis Armstrong's 1926 recording of Heebie Jeebies (§2.1.1) and Chet Baker's 1989 version
of Everything Happens to Me, there is only one occurrence of each of two other onsets.
(24) Simplex Onsets: comparative usage by different scat vocalists

                            [d]             [b]
2.1.1. Armstrong 1926       37 = 77.1%      5 = 10.4%
2.1.2. Armstrong 1929       78 = 47.3%      57 = 34.5%
2.2.1. Baker 1955           42 = 77.8%      1 = 1.9%
2.2.2. Baker 1989           45 = 84.9%      ---
2.3.1. Carter 1955          40 = 35.1%      35 = 30.7%
2.3.2. Carter 1979          19 = 70.4%      ---
Total syllables: 461        261 = 56.6%     98 = 21.3%

At the other end of the spectrum, Louis Armstrong's Hotter Than That employs the
greatest diversity: ten different consonants. A further observation is that there are only nine
consonants other than [d] and [b] which comprise the full set of onsets that are collectively
utilized by these artists. Together, these latter two facts affirm the initial premise of this
research: of the full complement of 24 consonants that can potentially function as syllable
onsets in English, scat draws on a very limited, and largely recurrent, subset.
The third conclusion that emerges from (24) is that there is a consistent asymmetry
in the relative frequency of [d] over [b]. In two of the songs (2.2.2, 2.3.2), there is no [b]
at all; in a third (2.2.1), there is a single attestation; in the remaining three, though the
degree of imbalance differs, the direction of difference is constant.
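
The totals and the [d]-over-[b] asymmetry reported for (24) can be rechecked mechanically. The sketch below is illustrative only: the [d] and [b] token counts are those given in (24) and in the individual sections, while the per-piece syllable totals for the two Armstrong recordings (48 and 165) are assumptions inferred here from the quoted percentages and the overall total of 461.

```python
# (piece, total scat syllables, [d]-onset tokens, [b]-onset tokens)
# Assumption: 48 and 165 are inferred from 37 = 77.1% and 78 = 47.3%;
# the other totals (54, 53, 114, 27) are stated in the text.
ONSET_COUNTS = [
    ("Armstrong 1926", 48, 37, 5),
    ("Armstrong 1929", 165, 78, 57),
    ("Baker 1955", 54, 42, 1),
    ("Baker 1989", 53, 45, 0),
    ("Carter 1955", 114, 40, 35),
    ("Carter 1979", 27, 19, 0),
]

def share(tokens, total):
    """Percentage of `total` syllables accounted for by `tokens`, to one decimal."""
    return round(100 * tokens / total, 1)

total_syllables = sum(n for _, n, _, _ in ONSET_COUNTS)
total_d = sum(d for _, _, d, _ in ONSET_COUNTS)
total_b = sum(b for _, _, _, b in ONSET_COUNTS)

for piece, n, d, b in ONSET_COUNTS:
    print(f"{piece:15s} [d] {share(d, n):5.1f}%   [b] {share(b, n):5.1f}%")
print(f"{'All pieces':15s} [d] {share(total_d, total_syllables):5.1f}%   "
      f"[b] {share(total_b, total_syllables):5.1f}%")

# The third conclusion: [d] outnumbers [b] in every piece.
d_always_ahead = all(d > b for _, _, d, b in ONSET_COUNTS)
print("[d] more frequent than [b] in every piece:", d_always_ahead)
```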
In contrast to the robust generalizations about simplex onsets, the usage patterns
with respect to complex onsets, as summarized in (25), do not at all cohere.
(25) Complex Onsets: comparative usage by different scat vocalists

                            Clusters        % of syllables with CmplxOns
2.1.1. Armstrong 1926       sk              6.25
2.1.2. Armstrong 1929       zw, mw, bw      1.8
2.2.1. Baker 1955           ---             ---
2.2.2. Baker 1989           ---             ---
2.3.1. Carter 1955          ly, sp, dl      3.5
2.3.2. Carter 1979          ---             ---
Total:                                      2.17

Clusters appear in only three of the six pieces, with an extremely low frequency count
(averaging just over 2%). Significantly, there is no overlap at all in the specific clusters
used in each of the works, even by the same singer. Moreover, with respect to the identity
of segments involved in these clusters, it is patently not the case that these sequences
are compositionally built from the simplex onset consonant inventory: [k], [p], [s], and
[z] in the clusters of (25) are not part of the repertoire of (24). Most striking is that a
majority (5/7) of the attested clusters violate canonical English patterns: although all of
the individual segments involved are legitimate potential simplex onsets, none of [zw],
[mw], [bw], [ly], or [dl] conform to standard well-formed sequences in English. Across
these diverse observations, there is in fact a consistent generalization, namely: complex
onsets are highly marked. In terms of frequency they are rare, and in terms of content
they are often exceptional.
Consider now the properties of codas, summarized in (26). Whereas the excerpts
in (1) and (2) in §1 were comprised exclusively of open syllables, this generalization does
not hold of any single work considered in its entirety. Nonetheless, open syllables are
unequivocally dominant, ranging from 94% to 72% in individual works and with the
overall average being 82.65%.
(26) Codas: comparative usage by different scat vocalists

                            % of syllables with no Coda
2.1.1. Armstrong 1926       81.25
2.1.2. Armstrong 1929       74.5
2.2.1. Baker 1955           72.2
2.2.2. Baker 1989           86.8
2.3.1. Carter 1955          93.9
2.3.2. Carter 1979          88.9
Totals:                     82.0
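
As a rough cross-check on the Totals figure in (26), the per-piece percentages can be converted back into implied counts of open syllables and pooled. This is a sketch under stated assumptions: the syllable totals for the two Armstrong pieces (48 and 165) are inferred rather than quoted, and the open-syllable counts are recovered by rounding, not taken from the transcriptions.

```python
# (piece, total scat syllables, % of syllables with no coda)
# Assumption: 48 and 165 are inferred totals; the percentages are those listed in (26).
NO_CODA = [
    ("Armstrong 1926", 48, 81.25),
    ("Armstrong 1929", 165, 74.5),
    ("Baker 1955", 54, 72.2),
    ("Baker 1989", 53, 86.8),
    ("Carter 1955", 114, 93.9),
    ("Carter 1979", 27, 88.9),
]

def implied_count(total, pct):
    """Nearest whole number of syllables corresponding to `pct` percent of `total`."""
    return round(total * pct / 100)

open_counts = [implied_count(n, pct) for _, n, pct in NO_CODA]
all_syllables = sum(n for _, n, _ in NO_CODA)

# Pooled proportion: all open syllables over all syllables in the sample.
pooled = round(100 * sum(open_counts) / all_syllables, 1)
print("implied open-syllable counts:", open_counts)
print(f"pooled open-syllable rate: {pooled}%")
```

The pooled rate weights each piece by its length, so longer solos such as Hotter Than That contribute more to it than shorter ones.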

Moreover, there were no complex codas. With respect to segmental identity, of the 21
potential English coda consonants, only six different segments appear. Compared with the
inventory of scat onsets in (24), it is interesting to note that there is overlap in the resonant
repertoire /m, n, l/, but complementarity in the obstruent stops: onset /b, d/ vs. coda
/p, t/. Once again, Louis Armstrong is the king of segmental diversity in his Hotter Than
That rendition (§2.1.2), which draws on seven different codas, whereas the other artists
employ a much more restricted range of between two and four. Across the artists, the most
favoured segments are [t] and [n/], although Armstrong's clear personal favourite is [p].
Finally, consider in (27) the generalizations that hold regarding the overall form
of scat syllables that are used by these diverse singers. Only syllables which occurred at
least three times, and with greater than 7% frequency in each song are included in the
table below. Because the overall syllable count differed considerably across the different
selections, the most frequent syllables for each artist are simply ranked, with 1 being
the most frequent. Ties are represented by the same number. The scale descends for
each artist, but may stop at either 2, 3, or 4 depending on the actual frequency values (as
detailed in the corresponding tables in each individual section above). Thus, for example,
for each of Chet Baker in §2.2.1 and Betty Carter in §2.3.2, the very high frequency of
two particular syllables results in no others exceeding the criterion level.
(27) Syllables: comparative usage by different scat vocalists (>7%, Ranked 1 = high)

                            d   du(w)   ba   da   bi(y)   di(y)   d   dey   b
2.1.1. Armstrong 1926
2.1.2. Armstrong 1929
2.2.1. Baker 1955
2.2.2. Baker 1989
2.3.1. Carter 1955
2.3.2. Carter 1979

Footnote: Thus, the unique instances of onset /t/ and coda /g/, both in Armstrong's Hotter Than That (§2.1.2)
appear anomalous: the /g/ in terms of both place and voicing, and the /t/ in terms of voicing.

Comparing the scat choices in the two different recordings by each vocalist shows Baker
to be the most consistent, despite the 34-year time interval between these performances:
[d] and [d] rank 1 and 2, respectively, in both. In contrast, in each of the two recordings
by Armstrong and Carter, different syllables rank 1 and 2, and for each of them, the
top-ranked syllable in one of their pieces does not even reach criterion in the other, viz. [ba]
in Armstrong, and [b] in Carter. Clearly, there is no single favoured syllable shape: in
this sample of six scat performances, there are five different syllables that emerge as the
most frequently used in any given piece.
Nonetheless, the tabulations in (27) provide striking confirmation of the two
generalizations originally observed in the brief Louis Armstrong extracts in (1) and (2).
First, all these favoured syllables are open. Secondly, they all start with [d] or [b]. The
overall favourite is [d], with [du(w)], as in doo-wop, coming in second.

3.

Beyond Bebop

As bebop morphed into hard bop and doo-wop in the 1950s, and classical jazz of the
previous decades diversified under multiple influences, particularly R&B and the explosive
impact of rock'n'roll, jazz scat began to wane in popularity. A few exemplary jazz vocalists
continued the bebop scat tradition, but other genres had come to dominate the pop music
scene. Not until the uniquely versatile creative talents of Al Jarreau and Bobby McFerrin
emerged in the late '70s did vocal improvisation once again top the charts.
The next two tunes come not from core vocal jazz repertoire, but from the heart
of the early '60s Hit Parade era, in the decade following the bebop heyday. What makes
these works substantially different from the preceding ones is that scat is formally scripted
into the lyrics, not improvised: in the first example (§3.1), scat syllables are explicitly
referenced in the English text, and in the second example (§3.2), more extensive scat
lines alternate with English. These case studies are of interest in two respects: first, for
interrogating the extent to which the segmental content and shape of these select scat
tokens conform to the generalizations established for the classical scat vocables examined
in the preceding vocal jazz tunes; and secondly, for the insights that this phenomenon
provides from a historical perspective on the evolving diversification of the cultural
impact of scat. Despite bebop itself having shifted out of the popular mainstream at that
time, the fact that very young creative songwriters chose to incorporate scat syllables
into their lyrics in the 1960s reflects its strong formative influence on their own musical
identities and its enduring legacy in the broad-based musical culture of the era.

3.1. Barry Mann, Who Put the Bomp? (1961)


The infectiously popular music and words of this 1961 hit were co-written by Barry Mann
and Gerry Goffin, with Barry Mann as the original recording artist. Because the lyrics
here are not improvised, but rather are composed in conformity with a tightly structured,
fixed melodic and rhythmic framework, the methodology of previous sections (namely,
a frequency count of attested segmental tokens in a stream of spontaneously improvised
scat) is less revealing than simply the inventory of segments and syllable shapes that are
drawn on. That is, what is particularly significant is just which scat vocables are chosen
for the lyrics, as this very choice implies that these particular forms already (in 1961) had
significant currency in the general public domain as cool and hip.
Archetypal and high-profile scat syllables here (see Appendix 3.1) include the [u]
~ [] attributed to Sarah Vaughan (§2.3.1), the [dp] that surfaces as early as Heebie Jeebies
(see (5)), as well as the [bap] that not only persists to this day as the name of the genre, but
that had become the basis of Betty Carter's moniker: Betty Bebop. The rhythmically
alternating syllables [b] (line 2) and [d] (line 8) are clearly canonical scat form, adhering
to both the preference for [b]/[d] onsets and No Coda (open syllable).
Although it was noted in every improvisational jazz sample investigated earlier that
open syllables were much more frequent than closed syllables, a superficially inconsistent
observation is that the reverse is the case in Who Put the Bomp?. What this illustrates, I
would suggest, is the potential over-riding effect of prosodic constraints when scripting
lyrics to a fixed melodic line and rhythmic beat. The lyrics for the lines with the scat
syllables [bam], [bap], [dp] are basically structured as follows, with the CVC closed
syllables out-numbering the open CV syllables four to two. Each of the underlined
syllables in (28) is directly synchronized with a rhythmic beat.
(28) Who put the [CVC]
     in the [CVC] [CV] [CVC] [CV] [CVC] ?

Because closed CVC syllables are prosodically heavier, aligning a closed syllable
with a rhythmically strong position functions to enhance the prominence of the beat.
Note that the initial who [huw], even though open, is also heavy by virtue of the long/tense
diphthong. Further enhancing the strong rhythmic stability of these lines is the fact
that the light open scat syllables [b], [], and [d] are never aligned with the beat. While
this kind of prosodic alignment of heavy syllables with positions of rhythmic prominence,
and the complementary preference for light syllables in weak rhythmic positions, most
certainly occurs in improvised scat as well, it would appear to be a significantly less
dominant factor, perhaps since rhythm itself is also subject to improvisation.

Footnote: Barry Mann and Gerry Goffin were 22 when their co-written success Who Put the Bomp? was released,
and teen idol Johnny Cymbal was 18 when he wrote and recorded Mr. Bass Man.
Footnote: This was Lionel Hampton's nickname for her, despite her expressed dislike of it (Bauer 2002a: 45).
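
The weight-to-beat alignment described for (28) can be stated as a simple check. The sketch below is hypothetical in several respects: syllables are reduced to shape strings, a syllable counts as heavy if it ends in a consonant (C) or a glide (G, standing in for the long/tense diphthong of who), and the beat flags are an assumption that places every CVC slot and the initial who on a beat, since the underlining in (28) is not reproduced here.

```python
def is_heavy(shape):
    """Treat a syllable as heavy if it is closed (ends in C) or ends in a glide (G)."""
    return shape[-1] in ("C", "G")

# Scat slots of template (28) as (shape, on_beat) pairs.
# Assumption: CVC slots fall on the beat, CV slots fall between beats.
SCAT_SLOTS = [
    ("CVC", True),
    ("CVC", True), ("CV", False), ("CVC", True), ("CV", False), ("CVC", True),
]

# The full line adds the initial "who" [huw], open but heavy (shape CVG).
LINE = [("CVG", True)] + SCAT_SLOTS

closed_count = sum(shape.endswith("C") for shape, _ in SCAT_SLOTS)
open_count = len(SCAT_SLOTS) - closed_count

# Every beat-aligned syllable is heavy, and no light syllable sits on a beat.
beats_are_heavy = all(is_heavy(shape) for shape, on_beat in LINE if on_beat)
light_off_beat = all(not on_beat for shape, on_beat in LINE if not is_heavy(shape))

print(f"closed vs open scat slots: {closed_count} to {open_count}")
print("all beats carry heavy syllables:", beats_are_heavy)
print("light syllables avoid the beat:", light_off_beat)
```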
Of further interest in the lyrics of this song is that there is a category of forms that
are neither standard English lexical items nor syllables that conform to the characteristics
of scat. Concatenations of essentially semantically empty compounds that rhyme or
alliterate, such as rama lama, or that carry some onomatopoetic value like ding dong,
or that live on a hip fringe of the English lexicon like boogity boogity were also drawn
from the pop music scene of the '50s, namely the Edsels' major doo-wop hit Rama Lama
Ding Dong, originally released in 1958 on Dub Records and re-released on Twin Records
in 1961, and the Quincy Jones composition Boogity Boogity, recorded on Milt Jackson's
1958 album Plenty, Plenty Soul. Unlike scat, these sequences each pattern basically as a
lexicalized unit, without independent freedom of realization of the constituent syllables.
The form of all 3 of these expressions is essentially reduplicative, with the nature of any
deviance from full identity falling directly within recognized cross-linguistic patterns
of reduplication (e.g. Moravcsik 1978, McCarthy and Prince 1986, Hurch 2005). Finally,
based on the generalizations established in §2, some of the segmental content in these
examples falls markedly outside of that found in core scat, viz. the [] codas in ding
dong, and the [g] onset in boogity.
In sum, Who Put the Bomp? is highly syncretic in its explicit references to many
of the rapidly evolving musical influences of the era. The lyrics integrate unmistakably
identifiable scat syllables from the classical vocal jazz tradition, with references from the
rhythm and blues progression into doo-wop, along with the blues-based modern jazz
sophistication of Quincy Jones and Milt Jackson. What this tells us is that although the
pure jazz scat genre itself isn't charting in the mainstream at this point in time, it remains
a major foundational force in the broader musical scene. Moreover, of all the diverse
genres referenced in these lyrics, it is a scat line that is attributed with ultimate success
in the conquest of love: When my baby heard bam b b bam b bam b bam bam, every
word went right into her heart...

3.2. Johnny Cymbal, as recorded by Sha Na Na, Mr. Bass Man (1963)
The second example illustrating the continuing legacy of scat in the pop scene of the early
'60s is Johnny Cymbal's signature song, Mr. Bass Man. Sha Na Na's re-recording of it in
1980 stands both as a tribute to its enduring popularity and as a major contribution to
ensuring its continued exposure to subsequent generations. The transcription in Appendix
3.2 is based on the Sha Na Na version, and differentiates the scat lines that are sung by Mr.
Bass Man himself (abbreviated BM in Appendix 2) from the fledgling attempts of the
Wanna-be guy (abbreviated W in Appendix 2) who sings, following line 9, I wanna be a
bass man too. Interestingly, this separation reveals some fascinating differences.

Footnote: Although the Edsels, like the ill-fated car model they named themselves after, were defunct as a
group by the time their version of Rama Lama Ding Dong rose to prominence on the national charts,
the song itself attained significantly greater longevity as the title song of Sha Na Na's 1980 album.
Note too that in the historical context of the '50s Ding Dong itself carried an established frame of
reference from the title and lyrics of Louis Armstrong's early 1930s hit, I'm a Ding Dong Daddy From
Dumas (on The Best of Louis Armstrong and His Orchestra: 1930-31. Classics B000001NJB).
As seen in (29), Mr. Bass Man himself uses exclusively [b] onsets. In contrast,
the majority of Wanna-be's onsets are [b], but his inventory also includes a substantial
number of [d]s and [y]s, both of which accord with the standard scat onsets documented
in (24). Although [] is not included in (24), its absence is directly attributable to the
transcriptional principles outlined in Appendix 1, so the two attestations of [] here are
not anomalous. The unique occurrence of [s] at the beginning of line 5 is odd, given the
generalizations of (24), but may be explicable as perseveration of the final sibilant of the
immediately preceding word songs, across the juncture from English lyrics to scat.
(29) a. Mr. Bass Man's scat lines (including back-up line and joint BM/W lines):
Onsets: b ()
Onset clusters:
No Onset: /
b. Wanna-be's scat lines:
Onsets: b (), d (), y (), (), s ()
Onset clusters:
No Onset: /

Not only is the greater diversity of segments in the novice's attempts of interest, so
too is the distribution of these segments. For example, in three lines (lines 5, 6, 17),
Wanna-be switches in mid-sequence from [d]-onsets to [b]-onsets (significantly, a switch to the
correct target), but never does he switch in the opposite direction. All other lines are
either exclusively [d] (lines 13, 14, 25, 26) or exclusively [b] (3, 7, 10, 12, 22, 24, 29).
There is also a marked discrepancy in coda patterns between Mr. Bass Man and
Wanna-be. Mr. Bass Man uses exclusively [m]/[] codas, whereas Wanna-be models
[m] most frequently, to be sure, but he also draws on the 3 most favoured scat codas that
were documented in (26): [p, t, ]. Nonetheless, note that Wanna-be's very last solo line
achieves perfect canonical form as defined by Mr. Bass Man: exclusively [b] onsets and
exclusively [m] codas.
(30) a. Mr. Bass Man's scat lines (including back-up line and joint BM/W lines):
Codas: m (), ()
Coda clusters:
No Coda: /
b. Wanna-be's scat lines:
Codas: m (), t (), p (), ()
Coda clusters:
No Coda: /

Although Wanna-be uses a broader inventory of both onsets and codas, these
segments are significantly constrained in their distribution, in that a consistent pattern
of syllable-internal consonant harmony obtains with respect to place of articulation in
closed syllables. That is, a labial [b] onset is followed by a labial [m] or [p] coda, regardless
of the vowel quality in the nucleus, e.g.: bam, bum, bm, bom, bm, bp. Similarly, an
alveolar [d] onset is closed by [] or [t]. Given that none of the other onsets /y, , s/
occur in closed syllables, this generalization regarding intra-syllabic consonant harmony
holds throughout the entire work.
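
The harmony generalization can be expressed as a small predicate over syllables. This is a minimal sketch, not a model of the transcription system in Appendix 1: syllables are written in plain ASCII, the place table covers only the consonants named in the passage, and the test items other than bam, bum, and bom (that is, bap, dat, du, bat, and dam) are hypothetical forms chosen for illustration.

```python
# Place of articulation for the onset and coda consonants named in the passage.
PLACE = {
    "b": "labial", "p": "labial", "m": "labial",
    "d": "alveolar", "t": "alveolar", "n": "alveolar",
}
VOWELS = set("aeiou")

def split_syllable(syllable):
    """Split an ASCII CV(C) syllable into (onset, nucleus, coda)."""
    start = 0
    while start < len(syllable) and syllable[start] not in VOWELS:
        start += 1
    end = start
    while end < len(syllable) and syllable[end] in VOWELS:
        end += 1
    return syllable[:start], syllable[start:end], syllable[end:]

def is_harmonic(syllable):
    """True if the syllable is open, or if its onset and coda share a place."""
    onset, _, coda = split_syllable(syllable)
    if not coda:
        return True  # the generalization only concerns closed syllables
    if onset not in PLACE or coda not in PLACE:
        return False
    return PLACE[onset] == PLACE[coda]

conforming = ["bam", "bum", "bom", "bap", "dat", "du"]
violating = ["bat", "dam"]  # hypothetical forms mixing labial and alveolar

for syllable in conforming + violating:
    print(f"{syllable:4s} harmonic: {is_harmonic(syllable)}")
```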

4.

Explanatory Hypotheses

The analyses of these several examples of scat show that, across the diversity of musical
styles and individual expressions, the repertoire of sounds and syllable shapes is
remarkably consistent and extremely limited in comparison to the extensive range of
segments and combinatorial possibilities that are used in English, let alone available
within the articulatory range of the human vocal apparatus. To address the question of
what might account for these patterns, three hypotheses are explored: that vocal scat
is essentially imitative of instrumental jazz (§4.1); that the repertoire of sounds in scat
is constrained by phonological markedness theory (§4.2); and that scat production is
subject to independent constraints on musical form and vocal performance (§4.3).
Although each of these, among other cognitive and performative factors, doubtless
contributes to shaping the output of scat, the argumentation to follow suggests that
specific tenets of phonological markedness theory interacting with the melodic imperative
for a voice line to carry pitch contribute substantially to broadening our understanding of the
attested patterns.

4.1. The Imitative Hypothesis


A number of theorists within the musical literature have hypothesized that scat vocalization
is essentially imitative of jazz instrumental expression. For example, Robinson (2002: 515)
attributes the origin of scat to singers imitat[ing] the sounds of jazz instrumentalists.
Bauer (2002b: 303) cites Milton Stewart (1987: 65, 68, 74) as showing that the vocables
used by such notable exponents of scat as Ella Fitzgerald and Sarah Vaughan often mimic
the tonguing, phrasing, and articulation of instrumentalists. Stoloff (2003: 4) notes that
Louis Armstrong, like many other instru-vocalists who followed, unconsciously used
scat syllables that emanated from his trumpet style. The core question in considering
the Imitative Hypothesis is to what extent such comparisons are based on essentially
arbitrary associations, as opposed to qualities of instrumental sound production that
are directly reproduced in the choice of consonants and vowels in a scat syllable. That
is, are there consistent, independently verifiable articulatory correlations between an
instrumental rendition and a particular scat vocalization? Or, like the arbitrariness of
the sound-meaning correspondences in natural language, is the seemingly imitative
association based on fundamentally arbitrary, conventionalized interpretations?
One type of case is illustrated by the fact that sometimes hand gestures lent an
explicit instrumental identity to the vocables. Stoloff (2003: 5) points out that Ella, for
example, often used trombone-like hand motions while scatting du-wah type syllables.
The question here then is whether there is anything inherent in the phonetic properties
of the syllables du-wah [du wa] that is uniquely representative of the production or
perception of trombone sound, or whether the explicitly iconic identification established
by Ella's hand gestures substantially contributes to creating a conventionalized significance.
Weighing against a one-to-one interpretation of the Imitative Hypothesis is the fact, noted
earlier in (27), that [du(w)] is the second most frequent syllable used by Louis Armstrong
in Heebie Jeebies, Chet Baker in Everything Happens to Me (1989), and Betty Carter in
Thou Swell. In other words, the documentation in §2 establishes that throughout the scat
repertoire, [du(w)] is simply an extremely common syllable. What seems most plausible, then,
is that a du-wah/trombone sound-meaning connection evolved into a conventionalized
relationship, with the explicit interpretive overlay of hand gestures contributing significantly
to establishing this as a semi-lexicalized associative correspondence.

Footnote: All 3 instances have the same vowel: [dt].

A second type of case exemplifying the frequent interpretation of scat as directly
representative of instrumental effects is illustrated by Robinson's (2002: 515) identification
of the following line from Louis Armstrong's Hotter Than That (1927, OK 8535) as one
which illustrates his clear imitation of a trumpet rip:
(31) From L. Armstrong, Hotter Than That (1927); transcription J.B. Robinson:
[musical notation not reproduced]

A basic question here is: How much of the interpretation of this phrase being a
trumpet rip follows from the initial monosyllabic identity tag rip? First, the research
documentation in §2 establishes that rip is not in the common inventory of scat syllables.
In fact, it is a unique attestation in the database of 461 scat syllables. Secondly, rip is
a recognizable English word, with a particularized semantic interpretation specifically
within the jazz lexicon. Thirdly, this word is positioned strategically at the very beginning
of the scat sequence that is interpreted by Robinson as a trumpet rip. In terms of
perceptual salience, initial position is the locus of greatest prosodic prominence in the
phrasal domain. Moreover, note in (31) that rip bears the highest pitch level and its
rhythmic value (a quarter note) is twice the value of each individual note in the sequence
of eighth notes that follows. Collectively these prosodic cues of position, pitch level, and
duration converge to focus the listener's attention on this entry, which is realized not
by a familiar scat syllable, but rather by the lexically informative label that this is a rip.
Finally, a complementary question stemming from Robinson's characterization of this
sequence as a trumpet rip is whether there is anything in the choice of the particular
scat syllables (independently of the lead signifier rip) that is uniquely associated with a
trumpet, as opposed to a sax, bass, or any other instrument. Again the collective evidence
in §2 establishes that the specific syllables that follow rip in (31) are all unequivocally
canonical scat, used by a diversity of singers across a diversity of melodies, chord
progressions, tempos, and rhythms.
Nonetheless, the fact that it is Louis Armstrong himself, one of the most virtuoso
jazz trumpeters of all time, who is scatting in (31) unquestionably establishes an association
between his vocal and instrumental expression. Of course, a particular musician's primary
instrumental identity would not preclude scat excursions into imitative or evocative effects
of other instrumentation. However, one might ask: given that Chet Baker and Louis
Armstrong are both jazz trumpeters and scat vocalists, is there any significant parallelism
between them in the choice of scat repertoire? Comparison of their use of onsets in the
chart in (24) and of codas in (26) not only provides a distinct profile for each, but also
establishes no greater similarity between them than between either one of them and
Betty Carter, who was not a trumpet player. In short, the research evidence here argues
that the specific choice of scat syllables for each of these performers follows a canon
of phonological constraints on scat repertoire that are independent of trumpet (or any
other) instrumental realization. Most fundamentally, I would submit, it is the musical
individuality of each of these artists and their unique creative mastery of the cognitive
systems involved that transcends defined conventions on the essential form of notes and
syllables, and systemic constraints on their patterning.
However, to explore the empirical bases of the Imitative Hypothesis yet further,
consider commentary such as that advanced by Stewart (1987: 65–66), who interprets
Ella Fitzgerald's 1949 performance of Flying Home as follows:
Fitzgerald alternates the bilabial b and p plosives with the lingua-alveolar d
plosives. The b and p sounds are formed similarly to the sounds of jazz wind
instruments, which sound by the release of built-up mouth air pressure onto the
reed, while the d sound is similar to the tonguing on jazz brass instruments.

On the basis of a phonological model of natural language sound production, my
hypotheses about the articulatory correlations entailed in initiating and modifying air
flow on reed and brass wind instruments differ from Stewart's. Specifically, pitch-based
sound on a trumpet or any other brass instrument is produced by bilabial constriction:
labial is the primary articulator. As well, tonguing effects (most commonly coronal,
but also dorsal) function significantly to modify the stream of sound in terms of attacks,
closures, trills, duration, phrasing, tonal quality, etc. Less frequent, but certainly available
within the repertory of articulatory modifications, are uvular and laryngealization effects.
Consequently, under an articulatorily-based Imitative Hypothesis, trumpet-denotative
scat would liberally draw on an inventory of both labial and coronal consonants, but
could also include other articulatory effects. In contrast, in producing the primary sound
on a reed instrument, like a sax or clarinet, the player's lips and upper teeth hold the
mouthpiece: although lip compression can modify pitch, tone, or timbre, labial is not
a primary articulator in the way that it is with brass instruments. However, the range of
tonguing effects and other articulatory modifications would be similar. The implication of
the Imitative Hypothesis that follows from this comparison would be that sax- or
clarinet-imitative scat should have no [p]s or [b]s (contra Stewart's interpretation above), whereas
brass-imitative scat could. Essential to testing such articulatory-modeling claims would
be a body of data where the intentionality of the scat singer is unambiguous. As none
of the references drawn on here provide adequate documentation to explore these
hypotheses more definitively, they are left for future research.

In summary, despite various approaches to the hypothesis that scat vocalization is
essentially imitative of jazz instrumental expression, what has been shown is that there
is in fact little empirical evidence to sustain a non-arbitrary relationship in the form of
realization across the two modalities. Moreover, compared with the huge range of distinct
combinatorial possibilities in jazz instrumentation, whether articulated by mouth, hand,
valve, slide, bow, or mallet, the exceedingly small set of segments in the core repertoire of
scat presents a striking contrast. What the Imitative Hypothesis fundamentally fails to
explain is why the rich diversity of instrumental sound is not more extensively mirrored
in scat. The possible articulatory range of the human vocal apparatus far exceeds what
is found in human language systems, let alone in scat. Moreover, even the much more
limited range of segmental and combinatory possibilities in the English phonological
system significantly exceeds what is found in scat. The fundamental question then is
what hypotheses might offer a more insightful and constrained explanation for the small
and remarkably consistent inventory of segments and syllable shapes that characterize
scat. In the next section it is argued that phonological markedness theory constitutes a
productive basis of inquiry.

4.2. Markedness Theory


From a linguistic perspective, the framework of phonological markedness theory
embodies a number of hypotheses against which these empirical generalizations about
scat can be evaluated. It is markedness theory that negotiates the interface of fundamental
questions regarding linguistic diversity vs. universality, seeking to understand across
the manifest differences of human languages just what properties of language may be
universally attested, what properties may be correlated with or implicated by another
property, and what properties are rare or may in fact never be attested. The basic premise
to be evaluated in the context of specific constraints identified in the discussion to follow
is that the phonological form and content of scat are relatively unmarked along various
diverse, independent measures of markedness.

4.2.1. Markedness Hypotheses about Syllabic Shape


Consider first syllabic form. Evidence from several diverse domains of natural language
(cross-linguistic studies of canonical syllable structures, phonological epenthesis, cluster
simplification processes, language acquisition, prosodic morphology, etc.) independently
identifies CV syllables as the most basic and the single universally attested syllable shape,
justifying the characterization of CV as the core syllable. In accord with this empirical
generalization, all of the diverse approaches to markedness theory (cf. Jakobson 1941/1968;
Trubetzkoy 1939; Chomsky and Halle 1968; Greenberg 1966; Kaye and Lowenstamm
1984; Prince and Smolensky 1993; McCarthy and Prince 1994; de Lacy 2002 among
others) converge on a recognition of open CV syllables as the least marked syllable type.
Within the framework of Optimality Theory (Prince and Smolensky 1993, McCarthy
and Prince 1995, Kager 1999, etc.), the relative markedness of an output sequence is
determined with respect to its violation of each of a ranked set of universal constraints
on phonological structure. Constraints relevant to syllable shape properties are stated in
(32), adapted from Kager (1999: 93, 94, 97):
(32) a. Onset           *[V     A syllable must have an onset.
     b. NoCoda          *C]     A syllable must not have a coda.
     c. *ComplexOnset   *[CC    Onsets are simple.
     d. *ComplexCoda    *CC]    Codas are simple.

The optimization of CV results from the fact that this syllable shape violates none of the
constraints in (32).
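
As a rough illustration of why CV emerges as optimal under (32), the following Python sketch tallies constraint violations for a few syllable skeletons. The function names, the C/V skeleton notation, and the sample shapes are assumptions made for this illustration only. The tally ignores constraint ranking and faithfulness, so it is a simplification and not a full Optimality Theory evaluation.

```python
def violations(skeleton):
    """Tally violations of the constraints in (32) for a C/V skeleton.

    Assumes the skeleton is a string of 'C' and 'V' containing exactly
    one 'V' (a hypothetical simplification made for this sketch).
    """
    nucleus = skeleton.index("V")
    onset, coda = skeleton[:nucleus], skeleton[nucleus + 1:]
    return {
        "Onset": int(len(onset) == 0),         # (32a) *[V
        "NoCoda": int(len(coda) > 0),          # (32b) *C]
        "*ComplexOnset": int(len(onset) > 1),  # (32c) *[CC
        "*ComplexCoda": int(len(coda) > 1),    # (32d) *CC]
    }


def total_violations(skeleton):
    """Sum the violations across all four constraints (unranked)."""
    return sum(violations(skeleton).values())


if __name__ == "__main__":
    for shape in ["CV", "V", "CVC", "CCV", "CCVCC"]:
        print(shape, total_violations(shape), violations(shape))
```

Under these assumptions, CV is the only skeleton in the sample with no violations; each of the other shapes incurs at least one.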
The emergence of core CV syllables as ubiquitously preferred in scat is therefore
entirely in conformity with markedness predictions about syllable shape. Different
measures confirm their special status, from the lead observation that the scat excerpts in
(1) and (2) contain exclusively core syllables to the accumulated evidence in (27) that the
10 most frequently used syllable shapes are all open CV syllables.
Although the survey of scat in §2 sustains the generalization that the vast majority
of scat syllables adhere to the simplex onset plus no coda pattern, it also reveals that
not one of the six pieces analyzed here consists only of such syllables. Deviation from
this optimally unmarked canon falls into two categories: violations of (32c)
*ComplexOnset (§4.2.2), and violations of (32b) NoCoda (§4.2.3). Notably, there are no syllables
documented in the present database that violate the *ComplexCoda constraint in (32d):
all codas in the tunes sampled here consist of a single consonant.

4.2.2. Complex Onsets


A very small set of syllables (an overall total of 2.17% of the sample, as shown in (25)) have two
consonants as opposed to one in the onset. Such cases violate the constraint *ComplexOnset
in (32c), and fall into two subtypes, dependent on specific segmental content.
First are the clusters [sk] and [sp]. What differentiates these from the second
subtype of *ComplexOnset violations is that [sk] and [sp] are familiar, frequent,
well-formed clusters of English. Interestingly, however, they are not common in scat.
Only Armstrong (1926: bar 9-10 in §2.1.1) uses [sk], and it occurs only in the alliterative
sequence [skiyp skm sk]. Similarly, only Carter uses [sp], and it occurs only once (1955:
bar 17 in §2.3.1). Thus, not only are these clusters marked cross-linguistically by virtue of
being structurally complex onsets, but they are also foregrounded in terms of perceptual
salience within the scat repertoire by virtue of being so infrequent. A final observation is
that outside of their occurrence in these clusters, nowhere else in this scat database do
any of the individual segments [s], [k], or [p] occur as simplex onsets. As a consequence,
these sequences do not conform to the basic generalization that complex margins in
natural language phonological systems are characteristically compositional. That is, the
well-formedness of an [sk] or an [sp] onset cluster in English builds on the independent
availability of each of [s], [k], and [p] as a simplex onset. Thus, on yet another dimension
of general properties of phonological systems, these clusters are marked. In short, despite
their being entirely within the well-formedness constraints of English, the rare injection
of an [sp] or [sk] cluster into a stream of the more limited consonantal playing field of
scat syllables will effectively cause them to stand out as highly unusual.

Footnote: As stated by Greenberg (1963: 263): If syllables containing sequences of n consonants in a language
are to be found..., then sequences of n-1 consonants are also to be found in the corresponding position
(prevocalic or postvocalic).

In contrast, the second subtype of violations of the *ComplexOnset constraint
in (32c) consists of clusters that deviate from standard English: [bw], [mw], and [zw]
in Armstrong (1929: 2:02, 2:15 in §2.1.2), and [dl] and [ly] in Carter (1955: bar 6, 7, 16
in §2.3.1). Interestingly, although these segmental concatenations are not well-formed
English onsets, they differ from the first subtype in that they are basically compositional
within the scat repertoire of onset consonants. That is, with the exception of [z], each of
the components of these clusters (viz. [b], [d], [m], [l], [w], and [y]) occurs as a simplex
onset in the scat database, as charted in (24). There are two other ways that this second
set of clusters differs from the [sk] and [sp] clusters. First, they comprise exclusively
voiced segments. The fact that the segments in these clusters agree in voice conforms
with Greenberg's (1978: 252) markedness generalization that combinations which are
homogeneous in respect to voicing are favoured over those which are heterogeneous.
Secondly, drawing on the Sonority Hierarchy in (33a), note that each of these onset
sequences conforms to the Sonority Sequencing Principle in (33b), in that there is an
increasing sonority cline between the first consonant and the second.
(33) a. Sonority Hierarchy (< indicates less sonorant than)
Obstruent (O) < Nasal (N) < Liquid (L) < Glide (G) < Vowel (V)
b. Sonority Sequencing Principle: (Clements 1990: 285)
Between any member of a syllable and the syllable peak, only sounds of
higher sonority rank are permitted.

To summarize, although these clusters are not part of the familiar English repertoire,
there are three general cross-linguistic markedness measures to which they conform:
they are compositional; they are homogeneously voiced; and they obey the Sonority
Sequencing Principle.
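
The three measures just summarized can be checked mechanically. The Python sketch below is illustrative only: the single-letter transcriptions, the assignment of each segment to a sonority class, and all function names are assumptions introduced here, while the set of attested simplex onsets is restricted to the components named in the text.

```python
# Sonority ranks follow the hierarchy in (33a): O < N < L < G < V.
SONORITY = {"O": 1, "N": 2, "L": 3, "G": 4, "V": 5}

# Assumed sonority class of each consonant in the non-English clusters.
SEGMENT_CLASS = {"b": "O", "d": "O", "z": "O", "m": "N",
                 "l": "L", "w": "G", "y": "G"}

# Every segment listed above is voiced.
VOICED = set(SEGMENT_CLASS)

# Components the text reports as attested simplex onsets (all except [z]).
SIMPLEX_ONSETS = {"b", "d", "m", "l", "w", "y"}

CLUSTERS = ["bw", "mw", "zw", "dl", "ly"]


def obeys_ssp(cluster):
    """True if sonority rises strictly through the onset up to the vowel."""
    ranks = [SONORITY[SEGMENT_CLASS[seg]] for seg in cluster]
    ranks.append(SONORITY["V"])
    return all(a < b for a, b in zip(ranks, ranks[1:]))


def is_compositional(cluster):
    """True if every member of the cluster also occurs as a simplex onset."""
    return all(seg in SIMPLEX_ONSETS for seg in cluster)


def voicing_homogeneous(cluster):
    """True if all members of the cluster agree in voicing."""
    return len({seg in VOICED for seg in cluster}) == 1


if __name__ == "__main__":
    for c in CLUSTERS:
        print(c, obeys_ssp(c), is_compositional(c), voicing_homogeneous(c))
```

On these assumptions, all five clusters satisfy sonority sequencing and voicing agreement, and only [zw] fails the compositionality check, in line with the exception for [z] noted above.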
What sets this subset of onsets apart from standard English clusters as well as
from general cross-linguistic expectations is their relatively marked status with respect
to two other constraints on segmental sequencing, both of which fall within the broad
purview of the Obligatory Contour Principle (OCP). First, the systematic absence of
Liquid-Glide sequences in English reflects a general constraint on minimal sonority
distance (34a). In standard English, all Liquid-Glide onset clusters are prohibited:
*[ly-, *[lw-, *[ry-, *[rw-. In Betty Carter's scat, however, [ly- slips past the *[Liquid-Glide
constraint. Secondly, militating against various assimilatory forces within the grammar
are certain context-sensitive pressures to avoid homorganic place. In standard English,
there are no Labial-Labial (*LabLab) onset sequences: *[bw-, *[mw-, *[pw-, *[fw-, *[vw-, but
in Louis Armstrong's scat [bw- and [mw- occur, these being the two that transition from
a voiced [-continuant] attack into the [w]. Similarly, with Betty Carter, it is the voiced
[-continuant] [d] that releases into a liquid [l] that violates the prohibition in standard
English against the *CorCor sequences, *[dl- and *[tl-.
(34) Obligatory Contour Principle (OCP):
a. Minimal Sonority Distance (cf. Vennemann 1988, Clements 1990, Zec 2007):
*[Liquid-Glide: *[ly-
b. Avoidance of homorganicity in consonant-resonant onset clusters:
*LabLab: *[bw-, *[mw-
*CorCor: *[dl-

None of these constraints characterizes the other non-English cluster, [zw], that
Armstrong uses. On a cline of relative markedness, *[zw- is not strongly deviant: it is not
subject to repair strategies in the pronunciation of proper names like Zwicky; and its
voiceless onset counterpart [sw], as in sweet, sway, swan, swoon..., has well-established
familiarity in the non-scat lexicon of the romantic lyricists of this same era. Nonetheless,
[zw] is outside the boundaries of standard English phonotactics, and will be recognized
as such by the listener. The hypothesis developed in §4.3 below is that such violations
of the phonological system are not arbitrary: rather, they are strategic manipulations of
the dynamic constraints that conventionally delimit linguistic structure, functioning to
enhance a range of performative musical effects.
To summarize thus far, the argumentation in this section illustrates how
phonological markedness theory provides an insightful framework for characterizing why
certain overwhelmingly common patterns emerge in the scat syllables of different artists.
At the same time, the discussion reveals that this theoretical approach also functions to
identify what properties of the empirical residue are not amenable to general linguistic
explanation. Based just on an examination of syllable onsets, the fact that this residue is
extremely narrow in scope and in realization is itself an interesting finding. In the next
section, the relative markedness of coda realization is explored.

4.2.3. Coda Constraints


Although the vast majority of scat syllables in the repertoire here do not have a coda, 17%
do, as tabulated in (26). However, like onsets, their realization is very restricted. Of the 21
possible coda segments in English (see (4)), only six different segments appear: there are
multiple occurrences of [p, t, n, m, l] and a single occurrence of [g]. As the transcribed
value of this latter segment (§2.1.2, [1:56]) varies between [g] and [v] (either one of
which would be a unique attestation), it will not be incorporated into the following
discussion. In markedness terms, there are several cross-linguistic generalizations that
characterize the identity and distribution of the five other segments.
Note among the obstruents that there are no fricatives or affricates. There are only
the two plain anterior stops [p] and [t] which, in terms of frequency (see (26)), account
for 59% (49/83) of all attested codas. Given that these are the voiceless counterparts of
[b] and [d], which clearly emerge as the overwhelming segmental favourites in onsets,
a major question is why the value of [voice] is in complementary distribution
between onset and coda. Markedness theory offers a straightforward account of the
coda behaviour, in that the preference for obstruents to be voiceless in syllable-final
position (alternatively, at the end of a word or before another obstruent) is a widely
attested cross-linguistic phenomenon. This contextual neutralization underlies the OT
formalization of the positional markedness constraint in (35):
(35) *Voiced Coda (Kager 1999; cf. Steriade 1999, Gordon 2007, Zec 2007)
Obstruents must not be marked for [voice] in coda position.

This constraint is unviolated in the entire scat corpus documented here, and effectively
captures the relevant generalization: if a coda is an obstruent, then it must be voiceless.
Not all the attested codas are obstruents, however. The residual codas [m], [n], and
[l] are all sonorants. On the basis of the cross-linguistic observation that some languages,
like Chinese, allow only sonorants in coda position, Pepperkamp (2003) proposes the
markedness constraint in (36):
(36) *Obstruent Coda
Codas cannot be obstruents.

The postulated constraint in (36) makes two predictions. First, a phonological system
could have only sonorant codas, as Pepperkamp argues for Chinese. Secondly, a
phonological system could not have exclusively obstruent codas: that is, if it has obstruent
codas, then it also must have sonorant codas. This second type of system is exactly what
is documented for both tunes analyzed for each of Louis Armstrong and Chet Baker
(see (26)). Of particular interest, however, is the fact that this is not what has emerged for
either of the Betty Carter recordings. As summarized in (26), her inventory of codas is
precisely the system characterized by the first prediction: there are only sonorant codas.
This is really quite striking confirmation of the role of universal markedness constraints
in governing the strictly delimited inventory of scat.
Moving to a consideration of place of articulation properties of codas, we note that
the limitation of the set of attested scat codas {p, t, n, m, l} to Labials and Coronals is also
systematically derivable from general tenets of markedness theory. Drawing on various
observed asymmetries in inventories, epenthesis, neutralization, etc., the markedness
hierarchy in (37) identifies Dorsal place as the most highly marked:
(37) Place Markedness Hierarchy (de Lacy 2007: 23)
*Dorsal » *Labial » *Coronal

Hence, the non-attestation of Dorsals and, concomitantly, the preferred status of
Coronals and Labials follow from this markedness generalization.
Finally, it is important to consider not just the distinctive properties of segments
in a particular prosodic position, but also aspects of their sequential relation to their
neighbours. As a dramatic example of harmonic assimilation, all nine instances of [n̩] in
Chet Baker's minimally contrastive articulatory flow are preceded by homorganic [t] and
followed by [d]. Thus, a single coronal non-continuant gesture is sustained across the
trisegmental sequence, modulated only by velic movement for the oral-nasal contrast and
laryngeal voicing. Even in the context of the much more diverse articulatory repertoire
in Louis Armstrong's Hotter Than That, an examination of trans-syllabic properties in it
reveals that the place of articulation in the vast majority of the 42 codas is homorganic
with the place of articulation of the following onset. Specifically, all eight cases of coda [t]
are followed by a [d] onset. Similarly both [n] codas precede [d]. All three post-vocalic
coda [m]s are also homorganic, in one case to [b] and in the other two cases to [w]. All
three tokens of syllabic [m̩] follow a comparable pattern, preceding onsets [w], [m], and
cluster [mw]. Aside from the unique instance of a [g], the only coda segment that is
ever independent of this assimilatory effect is Louis Armstrong's favoured coda in these
works, [p]. Still, the majority of [p] codas (13/22 = 59.1%) precede homorganic [b]. The
residual nine (all of which occur before [d]) are the only non-homorganic codas in this
entire scat set.
Again, these coda-onset assimilatory patterns constitute further evidence of
a remarkably consistent and delimited range of vocal behaviours in scat that are
systematically correlated with a broadly motivated positional markedness constraint, the
Coda-Condition:
(38) Coda-Condition (Itô 1989; Kager 1999)
A coda cannot have a place feature different from the following onset.

Note that (38), which fosters adjacent labial-labial or coronal-coronal articulations, is
differentiated from (34b), which militates against labial-labial or coronal-coronal
sequences, by virtue of prosodic context. The former applies across a coda-onset sequence
whereas the latter obtains between segments within a complex onset.
What has been argued in this section is that all the defining properties of scat
codas in the current sample fall directly within the explanatory framework of the
independently motivated theory of phonological markedness. They may be exclusively
sonorant (36); if obstruent, they are voiceless (35); they are solely coronal and labial (37);
and they are overwhelmingly homorganic with a following onset (38).
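
The four coda properties summarized above can be restated as simple checks over the attested inventory. The Python sketch below is a hedged illustration: the feature table, the function names, and the treatment of [w] as labial are assumptions made here. The coda-onset counts are only those explicitly enumerated in the discussion of Armstrong's codas above (syllabic nasals and the single [g] are left out), so they do not reproduce the full set of 42 codas. The sonorant-only inventory in the usage lines is a hypothetical example, not a transcription of any particular recording.

```python
# (place, manner class, voicing) for each attested coda segment.
CODA_FEATURES = {
    "p": ("labial", "obstruent", "voiceless"),
    "t": ("coronal", "obstruent", "voiceless"),
    "m": ("labial", "sonorant", "voiced"),
    "n": ("coronal", "sonorant", "voiced"),
    "l": ("coronal", "sonorant", "voiced"),
}

# Place of the following onsets; [w] is treated as labial (an assumption).
ONSET_PLACE = {"b": "labial", "w": "labial", "d": "coronal"}

# (coda, following onset) -> count, as enumerated in the text.
PAIRS = {("t", "d"): 8, ("n", "d"): 2, ("m", "b"): 1,
         ("m", "w"): 2, ("p", "b"): 13, ("p", "d"): 9}


def no_voiced_obstruent_codas(codas):
    """(35): every obstruent coda is voiceless."""
    return all(CODA_FEATURES[c][2] == "voiceless"
               for c in codas if CODA_FEATURES[c][1] == "obstruent")


def coda_system(codas):
    """(36): classify an inventory as sonorant-only, obstruent-only, or mixed."""
    kinds = {CODA_FEATURES[c][1] for c in codas}
    if kinds == {"sonorant"}:
        return "sonorant-only"
    if kinds == {"obstruent"}:
        return "obstruent-only"
    return "mixed"


def coda_places(codas):
    """(37): the set of places of articulation used in coda position."""
    return {CODA_FEATURES[c][0] for c in codas}


def homorganic_counts(pairs):
    """(38): (homorganic, non-homorganic) totals over coda-onset pairs."""
    same = sum(n for (coda, onset), n in pairs.items()
               if CODA_FEATURES[coda][0] == ONSET_PLACE[onset])
    return same, sum(pairs.values()) - same


if __name__ == "__main__":
    full = ["p", "t", "n", "m", "l"]
    print(no_voiced_obstruent_codas(full), coda_system(full), coda_places(full))
    print(coda_system(["n", "m", "l"]))  # hypothetical sonorant-only inventory
    print(homorganic_counts(PAIRS))
```

Within this partial tally, the only non-homorganic pairs are the nine [p] codas that precede [d].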

4.2.4. Why are [d] and [b] the favoured onsets in scat?
Having examined the markedness properties of syllable shape, of complex onsets, and of
codas, let us now return to two fundamental questions raised in the introduction, where it was
noted that, in the Louis Armstrong excerpts in (1) and (2), [d] and [b] are the only onsets.
The first question was whether this observation is genuinely representative of scat
or whether it is essentially accidental, attributable perhaps to this particular artist, to
selective sampling, to the stylistic phraseology of these brief excerpts, or to some other
factor. The present analysis clearly affirms that these two segments are indeed the most
prevalent onset consonants. As summarized in (24), [d] is incontestably the favoured scat
onset in every tune by every artist examined in §2. The next most frequently attested
onset is [b]. However, as noted earlier, there is an evident asymmetry in their usage. In
two of the six songs, [b] is not used at all; in a third, it appears only once. In the other
three songs, it ranks below [d] with a broad range of variance: from a 66.7% difference
(in Armstrong §2.1.1) to 12.8% (Armstrong §2.1.2) to 4.4% (Carter §2.3.1). Nonetheless,
despite this imbalance between [d] and [b], they both clearly stand out as more prominent
than any of the other nine consonants that appear in onsets.
The second question is what the explanatory basis of this generalization might
be. Notably, it does not mirror standard English frequency patterns. According to the
Francis and Kucera (1982) count of the token frequency of all word-initial onsets in a
corpus of over a million English words, neither [d] nor [b] stands at the head of the
relative frequency ranking of single consonant onsets, summarized in (39a). In fact, in
terms of the absolute measures cited in (39b), [b] is almost twice as frequent as [d] in this
extensive corpus of standard English usage, a result that is opposite to the consistently
greater frequency and breadth of distribution of [d] over [b] in scat.
(39) a. > w > h > b > t > s > k, m > f > d > ...
b. [b] = .05335, [d] = .02762

In short, standard English frequency measures do not account for the two most robust
generalizations that have emerged in the scat data: (1) the preference of [d] over [b]; and
(2) the prevalence of [d] and [b] over all other consonants in the English inventory.
Phonological markedness theory contributes significantly to a principled
interpretation of these results. First, consider place of articulation. The Place Markedness
Hierarchy already introduced in (37) effects an internal ranking of the coronal place as
the most optimal (least marked), of labial as an intermediate class, and of dorsal as
the least optimal (most marked). Two major empirical findings about scat onsets follow
directly from this hypothesis of a fixed place hierarchy: one is the preferred status of
coronal [d] over labial [b]; and the second is the absence of dorsal [k] or [g]. Dorsals,
the most marked of the English stops, are simply unattested as scat onsets. Thus, the
place features of [d] and [b] are clearly consistent with fundamental markedness tenets.
But, what about their manner and laryngeal properties?
With respect to manner, the fact that [d] and [b] are both obstruents accords with
the cross-linguistic preference for low sonority onsets, captured by the fixed positional
markedness constraint hierarchy in (40a). Further, within the class of obstruents, the
articulated subcategorizations of (40b) identify stops as less sonorant than fricatives.
(40) a. Optimal Onset Sonority (de Lacy 2001; Prince and Smolensky 2004)
*Onset/L » *Onset/N » *Onset/O
b. Obstruent Sonority (Dell and Elmedlaoui 1985; Prince and Smolensky 2004)
*voicedFric » *voicelessFric » *voicedStop » *voicelessStop

The combined effect of the markedness relations in (40a) and (40b) accords
directly with the notable paucity (a single attestation: see (24)) of fricatives as a simplex
onset (Sarah Vaughan's and Sha Na Na's trademark [ʃ] notwithstanding). We conclude
then that, for manner features, phonological markedness theory provides considerable
explanatory coverage of the favoured status of [d] and [b] in the full context of the highly
constrained scat inventory.
However, a major anomaly persists with respect to laryngeal markedness: the
privileged status of the voiced obstruents [d] and [b] and, correlatively, the extreme
rarity of their voiceless counterparts [p] and [t] as scat onsets are directly counter to
the predictions of markedness theory. That is, based on cross-linguistic generalizations
from a variety of perspectives (including the typology of sound systems, natural classes,
direction of neutralization, direction of language change, segmental complexity, perceptual
and articulatory contrast, and other factors), voiceless obstruents are the unmarked series,
this generalization motivating the markedness constraint in (41):
(41) Voicing markedness (de Lacy 2002)
*[+voice, -sonorant] Obstruents must be voiceless.

Footnote on (39): This rank order is constructed from the Francis-Kucera token frequencies for single consonant onsets
cited in the appendix to Stemberger 1990: 157. The cited values in (39b) are from this same source.

As shown in (24), [d] and [b] together comprise 77.9% of scat syllable onsets. In
contrast, there are no instances of [p] in onset position, and only a single occurrence of
[t] (in Louis Armstrong's Hotter Than That). Beyond the database of tunes analyzed
in §2, a full examination of Bauer's (2002a: 245–343) prodigious body of transcriptions
of Betty Carter's scat corroborates this generalization: in the entire collection, [t] is
unattested as an onset and [p] is exceedingly uncommon. Given robust cross-linguistic
support for (41), it can only be concluded that the overwhelming preponderance
of the voiced obstruents [d] and [b] as onsets in scat, combined with the virtual
absence of voiceless [t] and [p], is distinctly odd from a markedness perspective. The
very fact that this pervasive asymmetry is clearly defined in markedness terms as
deviant is a productive consequence of the theory, and is pursued further in 4.3.

4.2.5. The Contributions of Markedness Theory


To summarize, the goal of §4.2 has been to explore the degree to which phonological
markedness theory provides an explanatory framework for the observed patterns
in syllable shape and segmental inventory in scat. The results of this approach are of
considerable interest, I believe, to deepening our understanding of the interface of natural
language systems and musical vocal performance. Although couched in an optimality
theoretic framework, the markedness generalizations invoked here essentially derive
from the confluence of a diversity of insightful conceptual approaches to markedness
issues that have spanned many decades of linguistic research. The breadth of empirical
coverage offered by an essentially small body of tightly constrained and independently
motivated hypotheses is considerable.
First, in §4.2.1 it is seen that the robust preference for open CV syllables in scat
directly accords with the four universal markedness constraints in (32) that govern
syllabic form. There are two kinds of deviations from this basic canon: a very small set
(2%) of syllables have complex onsets and a larger set (17%) have simplex codas. There
are no complex codas.

[Note: Obstruents are exclusively voiceless in the phonological inventories of many languages, e.g.
hən̓q̓əmin̓əm̓ (Salish), Nuu-chah-nulth (Wakashan), Hawaiian (Austronesian), Korean, etc.
Further, the presence of voiced obstruents in a language characteristically entails the presence of
their voiceless counterparts, as in English.]
[Note: Two tokens of [p] are in "You're Driving Me Crazy" (Bauer 2002a: 252; bars 7, 9); a third token is in
the initial syllable of bar 57 in the 1979 take of "I Could Write a Book" (Bauer 2002a: 307). These last
two are plausibly interpretable as an ambisyllabic coda-onset from the preceding [bap] syllable.]


Then, to extend the analysis beyond syllable shape, the particular properties
that mark the attested complex onsets are examined in §4.2.2, the properties of codas
are analyzed in §4.2.3, and the observed featural asymmetries of simplex onsets are
investigated in §4.2.4. As shown in §4.2.3, the identity and distribution of the limited
repertoire of codas conform to positional markedness constraints on voice (35), manner
(36), and coda-onset place agreement (38), as well as to the fixed place hierarchy in (37).
In §4.2.4, it is seen that all but one dimension of the featural identity of the restricted
inventory of scat onsets follows markedness patterns. Specifically, they comply with the
fixed place hierarchy of (37), with the positional markedness constraints governing the
intersection of sonority and manner in onsets in (40a), as well as with the fixed hierarchy
in (40b) that optimizes non-continuant manner and sonority. Onsets deviate on only
one (albeit a perceptually highly salient) measure: voice, as formalized by the constraint
in (41). An alternate hypothesis was therefore tested, namely, that this anti-markedness
result might correlate with identified frequency patterns for standard English onsets.
However, comparative evaluation of the evidence shows no systematic relationship to
support a frequency hypothesis.
In sum, the explanatory power of a markedness account of these diverse
and strikingly consistent factors is substantial. A further empirical strength
of applying markedness theory to the analysis of scat is that the theory characterizes
not only what corresponds with English and/or universal language patterns, but also
the specific nature and locus of deviance. In essence, markedness
theory effectively subcategorizes the residue of scat properties that fall outside of the
predictions of markedness-governed sound patterns into two domains. One is absolutely
pervasive across all singers, namely the consistent realization of voiced [d] and [b] onsets,
to the virtual exclusion of their voiceless unmarked counterparts [t] and [p]. The other
is a much less coherent set of often unique attestations of highly marked segments, such
as Louis Armstrong's once-only insertion of a triple sequence of [f] codas and later of
[sk] onsets in "Heebie Jeebies". The fact that the empirical residue that is not amenable
to a linguistic markedness explanation is extremely narrow in scope is theoretically
interesting, and suggests that quite specific competing functional forces external to the
linguistic system may be at play. Scat is, after all, a component of each jazz artist's musical
repertoire and individual creativity. In the following section, the musical interface of
vocal performance with the marked linguistic residue is examined.

4.3. The Performative Interface between Vocal Music and Phonological Markedness

Given that scat is at the interface of constraints on linguistic sound structure and the
exigencies of vocal music production, the central issue to be addressed here is whether
there are factors of musical performance or creative expression that conflict with and
override phonological markedness, thus providing a systematic explanation for the
deviant residue identified in §4.2.


4.3.1. Voice
The most salient property of scat syllables that breaches markedness patterns is the
pervasive preference for [d] and [b] as onsets, to the virtual exclusion of [t] and [p].
In natural language systems, there is a robustly attested cross-linguistic asymmetry:
a phonological inventory may have only voiceless obstruents, or both voiced and voiceless,
but not only voiced obstruents. This asymmetry is formalized in (41) by hypothesizing a
markedness constraint that identifies *[+voice, -sonorant] segments as marked, in the
absence of a corresponding constraint prohibiting [-voice, -sonorant] segments. What
has been documented in §2 is that in scat this asymmetry is reversed. The fundamental
question is why: what properties obtain in the performative context of scat that would
create pressure to systematically violate this constraint?
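
The inventory asymmetry described above can be stated as a simple check. The following Python sketch is illustrative only: the segment sets and the function names `voicing_profile` and `respects_asymmetry` are assumptions introduced here, the language inventories are simplified, and the scat inventory is idealized to its two dominant onsets, setting aside the rare [t] and [p] tokens.

```python
# Illustrative segment classes (an assumption: a small subset of obstruents).
VOICED_OBSTRUENTS = {"b", "d", "g", "v", "z"}
VOICELESS_OBSTRUENTS = {"p", "t", "k", "f", "s"}


def voicing_profile(inventory):
    """Classify an inventory by which obstruent voicing series it contains."""
    has_voiced = bool(set(inventory) & VOICED_OBSTRUENTS)
    has_voiceless = bool(set(inventory) & VOICELESS_OBSTRUENTS)
    if has_voiced and has_voiceless:
        return "both"
    if has_voiceless:
        return "voiceless only"
    if has_voiced:
        return "voiced only"
    return "no obstruents"


def respects_asymmetry(inventory):
    """True unless the inventory has voiced obstruents without voiceless ones."""
    return voicing_profile(inventory) != "voiced only"


# Simplified, hypothetical inventories used only for illustration.
hawaiian_like = {"p", "k", "m", "n", "l", "w"}   # voiceless stops only
english_like = {"p", "t", "k", "b", "d", "g"}    # both series
scat_onsets = {"d", "b"}                         # idealized dominant scat onsets

for name, inv in [("Hawaiian-like", hawaiian_like),
                  ("English-like", english_like),
                  ("scat onsets", scat_onsets)]:
    print(name, "|", voicing_profile(inv), "|", respects_asymmetry(inv))
```

On these idealized inputs, only the scat onset set falls into the "voiced only" class, which is the reversal at issue in this section.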
We have seen the role played by the canonical syllable structure constraints in (32)
to optimize a repetitive CV alternation in natural language. Further, it is hypothesized,
through constraints like (40), that the optimal CV contour alternates between minimal
sonority onsets (voiceless stops) and maximal sonority nuclei ([a]), thus enhancing
perceptual contrast of the peaks and troughs. What is happening in scat is that the
optimal voiceless stop onset is being compromised just one notch up the Obstruent
Sonority hierarchy in (40b) to the category of voiced stops. The articulatory factor that
differentiates these two categories is vocal cord vibration. The functional relevance of
this difference is that vocal cord vibration is essential to carry pitch. In natural language,
pitch plays a criterial role in marking tone, intonation, and often stress. However,
linguistically significant pitch is characteristically carried not by the onset, but by the
syllabic nucleus, and sometimes by subsequent coda elements dependent on sonority.
In music, pitch is fundamental to the expression of melody. Unlike language, the
melodic line of music is not structured by constraints that optimize a voiceless-voiced
contour. To the contrary, although periods of relative silence (rests) contribute to phrasing
a melody, the foundation of a melodic line is a continuous soundwave that allows pitch
realization and modification in order to create a succession of tones.
Consequently, in vocal music that incorporates natural language, there is an
inevitable tension between the articulation of voiceless sounds and the fluidity of
the melodic line, since voiceless segments cannot bear pitch. Research on singing in
Tashlhiyt Berber (Dell and Elmedlaoui 2008), a language with extensive sequences of
voiceless obstruents, reveals two strategies for the realization of the melodic line across
such stretches. One strategy is to prolong a preceding vowel so that it carries not only its
own tone, but also the tone that metrical scansion would normally assign to the following
voiceless syllable. A second strategy is to epenthesize an unmarked schwa vowel to carry
the pitch of the associated melody.
What I hypothesize is happening in scat is a third strategy, namely: the musical
melodic imperative for voiced realization overrides the natural language markedness
constraint in (40b) that optimizes voiceless stops in onsets. Because voiced stops can,
however briefly, carry some pitch realization, they satisfy the high-ranked musical
constraint on melodic voicing and therefore emerge as the most prevalent onsets in
scat. This interface between the vocal music Melodic Voicing constraint, formalized in
(42a), and the markedness constraints on onset sonority (deconstructing *Onset/O from
(40a) into the more refined hierarchy of (40b)) is illustrated in the tableau in (42b):
(42) a. Vocal Music Melodic Voicing Constraint: *[-voice]/melody
        Sung segments must be voiced.
     b.
                     Music System:       Language System:
                     *[-voice]/melody    *Onset/voicedStop    *Onset/voicelessStop
        a.    [ta]   *!
        b. ☞ [da]                        *


Similarly, the absence of [p] onsets, despite the occurrence of [b] onsets, follows from this
same interface where the musical constraint (42a) outranks the language constraints.
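
The ranking logic of (42b) can be sketched computationally. The Python fragment below is a minimal illustration, not the author's formalization: the function names, the restriction to stops, and the two rankings `LANGUAGE` and `SCAT` are assumptions made here. Strict domination is modelled by comparing violation profiles lexicographically, which is what Python's tuple comparison does.

```python
VOICELESS_STOPS = {"p", "t", "k"}
VOICED_STOPS = {"b", "d", "g"}


def melodic_voicing(syllable):
    """*[-voice]/melody: one violation per voiceless segment (stops only, in this sketch)."""
    return sum(1 for seg in syllable if seg in VOICELESS_STOPS)


def no_voiced_stop_onset(syllable):
    """*Onset/voicedStop: violated when the first segment is a voiced stop."""
    return 1 if syllable[0] in VOICED_STOPS else 0


def no_voiceless_stop_onset(syllable):
    """*Onset/voicelessStop: violated when the first segment is a voiceless stop."""
    return 1 if syllable[0] in VOICELESS_STOPS else 0


def optimal(candidates, ranking):
    """Return the candidate with the best violation profile under strict domination."""
    return min(candidates, key=lambda cand: tuple(con(cand) for con in ranking))


# Language system alone: voiced-stop onsets are penalized above voiceless ones.
LANGUAGE = [no_voiced_stop_onset, no_voiceless_stop_onset]
# Scat: the musical constraint is ranked above the language constraints.
SCAT = [melodic_voicing] + LANGUAGE

print(optimal(["ta", "da"], LANGUAGE))  # ta
print(optimal(["ta", "da"], SCAT))      # da
print(optimal(["pa", "ba"], SCAT))      # ba
```

With only the language constraints, the voiceless onset wins; once the musical constraint dominates, the voiced onset wins for both the coronal and the labial pair, as the text states.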

4.3.2. Performative Markedness


The second kind of deviance identified by markedness theory does not cohere in a
single identifiable characteristic. Rather, there is an assortment of infrequent (indeed,
often unique) attestations of sounds that are relatively marked in terms of general
cross-linguistic patterns identified in §4.2. The question to be explored in this section
is whether there might be an independent functional motivation for the inclusion of
marked structure.
Consider, for example, the diverse array of complex onset clusters listed in (25)
and discussed in §4.2.2. A variety of different frequency measures related to these onset
clusters attest to their rarity. Clusters occur in only three of the six scat tunes, and in
the output of only two of the three artists: there are no clusters in Chet Baker's work.
In total, 10 cluster tokens are attested in the database of 461 syllables: thus, they mark
a mere 2.17% of onsets. Louis Armstrong's output contains six of the ten clusters; even
within his scat corpus of 213 syllables, the percentage of complex onsets is only marginally
higher: 2.8% (6/213). In terms of individual frequency, five of the seven different types of
documented clusters are unique attestations. As shown in the summary onset chart in
(24), even certain simplex onsets are attested only once in the present sample.
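
The cluster percentages cited above follow directly from the reported counts. As a quick check, the short Python sketch below recomputes them; the helper `percent` and the variable names are introduced here for illustration, and the figure for Betty Carter is an inference from the statements that Armstrong accounts for six of the ten clusters and that Baker has none.

```python
def percent(part, whole, ndigits=2):
    """Percentage of `part` in `whole`, rounded to `ndigits` decimal places."""
    return round(100 * part / whole, ndigits)


total_syllables = 461       # syllables in the full database
cluster_tokens = 10         # complex onset tokens in the full database
armstrong_syllables = 213   # syllables in Armstrong's scat corpus
armstrong_clusters = 6      # complex onset tokens in Armstrong's output

# Inferred, not reported directly: the remaining clusters must be Carter's.
carter_clusters = cluster_tokens - armstrong_clusters

print(percent(cluster_tokens, total_syllables))              # 2.17
print(percent(armstrong_clusters, armstrong_syllables, 1))   # 2.8
print(carter_clusters)                                       # 4
```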
The basic generalization that the full body of frequency measures reported on
here sustains is that the inventory of segments and combinatorial possibilities in scat
is extremely limited in comparison to the full inventory of sounds in English, not to
mention the even more extensive repertoire of phones in other natural language systems
or, in fact, in the extraordinary array of oral articulations that the human vocal apparatus
is capable of. A further claim has been advanced that the very limited inventory of scat
as defined in terms of frequency correlates very strongly with segments and structures
that are characterized as relatively unmarked according to basic tenets of phonological
markedness theory. Consider then the cognitive impact of deviation from what has
been identified as the high frequency, phonologically unmarked norm for scat. The
introduction of novelty into an established, familiar sequence will command immediate
auditory attention and interest.

For example, as noted earlier (§1.2.1) in the discussion of "Heebie Jeebies", it is after
an extended auditory train of 22 [d]-initial syllables, minimally broken by two [b]-initial
syllables, that Louis Armstrong interjects the alliterative sequence of three syllables with
an [sk] onset cluster. The auditory impact, phonologically and musically, is undeniably
powerful. A variation on Armstrong's creative use of deviant phonology to musical effect
occurs in "Hotter Than That", where the trio of non-English clusters [zw], [mw], and
[bw] occur within a few bars of one another, establishing a brief articulatory leitmotif
that through its disruption of the familiar scat canon simultaneously produces auditory
tension and artistic interest. Betty Carter's repertoire, as observed in §2.3.1, is similarly
enriched and the listener's attention engaged by unanticipated occurrences of the highly
marked [dl] or [ly] clusters. Chet Baker's implementation of phonological deviance is
illustrated by the distribution of [h]. As pointed out in §2.2, in each of the 1955 and the
1989 versions of "Everything Happens to Me", [h] occurs only once, in different places, but
in both cases its uniqueness functions to demarcate the initial syllable, a particularly
prominent prosodic position, of a musically important phrase. The hypothesis, then,
is that phonological deviance and low frequency may functionally conspire in scat to
enhance perceptual salience. By challenging the limits of phonological markedness
constraints manifest in scat, a jazz artist can effectively arrest the listener's attention,
strategically manipulate the cognitive tension of phonological deviance, and creatively
expand the expressive repertoire at the interface of language and vocal music.

5. Conclusions

Scat extends the vibrant improvisational genre of instrumental jazz to the human voice. As
an oral idiom, scat draws on the same articulatory apparatus as natural human languages
do. Because it is uniquely situated at the interface of the cognitive and performative
systems that underlie both music and language, scat can potentially deepen our insight
into the complex organizational structure of each of these particularly human creative
systems, as well as of their interaction.
Through investigation of the form of scat syllables used by three renowned jazz
vocalists (Louis Armstrong, Chet Baker, and Betty Carter) in performances that range
across time from 1926 to 1989, the analysis in §2 establishes that, despite their markedly
different musical styles and individual creativity, the set of consonantal sounds that these
diverse performers draw on in the creation of scat syllables is strikingly consistent and is
extremely limited in comparison to the extensive range of segments and combinatorial
possibilities that define the inventory of syllables in English. Moreover, as shown in §3,
the observed generalizations apply not only within the classical scat canon of these scat
masters, but also in scat-derived pop music lyrics of the early '60s rock 'n' roll era.
Given that the articulatory range of the human vocal apparatus far exceeds what
is found in scat, the fundamental goal pursued in §4 is to explore what might account
for the attested sound patterns in scat. Three hypotheses are investigated. The first (§4.1),
familiar from the jazz literature, holds that jazz vocalization is essentially imitative of
jazz instrumental expression. Although this hypothesis holds considerable popular appeal,
it is difficult to substantiate from an empirical perspective.
The second hypothesis (§4.2) is that phonological markedness theory provides an
insightful framework for characterizing why certain overwhelmingly common patterns
emerge in the scat syllables of different artists. It is argued that there is substantial support
for this approach. Specifically, all the defining properties of scat codas in the current
sample fall directly within the explanatory framework of independently motivated
markedness constraints: codas may be exclusively sonorant, in accordance with (36)
*Obstruent Coda; if obstruent, they are voiceless, following (35) *Voiced Coda; they
are solely coronal and labial, to the exclusion of dorsals, in conformity with (37) the Place
Markedness Hierarchy; and they are overwhelmingly homorganic with a following onset,
in adherence to (38) the Coda-Condition positional markedness constraint. With respect
to onsets, the clear preference for stops accords with the cross-linguistic preference for low
sonority onsets, captured by the positional markedness constraint in (40a) that defines
Optimal Onset Sonority, combined with the obstruent sonority hierarchy in (40b). The
fact that dorsals, the most marked of the English stops, are unattested as scat onsets, as
well as the preferred status of coronal /d/ over labial /b/ follow, as in the case with codas,
from the Place Markedness Hierarchy in (37). In sum, phonological markedness theory
provides considerable explanatory coverage of the highly constrained scat inventory.
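
The coda generalizations summarized above can be paraphrased as a violation check. The sketch below is hypothetical throughout: the feature table, the function name `coda_violations`, and the decision to apply *Voiced Coda only to obstruents are assumptions made here from the prose summary, since the formal statements of (35) to (38) are given earlier in the paper and are not reproduced in this section.

```python
# (place, is_voiced, is_sonorant) for a handful of segments; an illustrative table.
FEATURES = {
    "p": ("labial", False, False), "b": ("labial", True, False),
    "t": ("coronal", False, False), "d": ("coronal", True, False),
    "k": ("dorsal", False, False), "g": ("dorsal", True, False),
    "m": ("labial", True, True), "n": ("coronal", True, True),
    "l": ("coronal", True, True),
}


def coda_violations(coda, next_onset=None):
    """List the markedness constraints a coda violates, in the order (36), (35), (37), (38)."""
    place, voiced, sonorant = FEATURES[coda]
    violations = []
    if not sonorant:
        violations.append("*Obstruent Coda (36)")
        if voiced:  # assumption: (35) targets voiced obstruents, not sonorants
            violations.append("*Voiced Coda (35)")
    if place == "dorsal":
        violations.append("Place Markedness (37): dorsal")
    if next_onset is not None and FEATURES[next_onset][0] != place:
        violations.append("Coda-Condition (38)")
    return violations


print(coda_violations("m", "b"))  # sonorant, homorganic with [b]: no violations
print(coda_violations("p", "b"))  # voiceless obstruent, homorganic with [b]
print(coda_violations("g", "d"))  # voiced dorsal obstruent before a coronal onset
```

On this rendering, the attested coda types incur at most the *Obstruent Coda violation, whereas an unattested coda such as [g] before [d] violates all four constraints.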
Not all of the observed scat data is interpretable in terms of phonological
markedness, however. A valuable strength of the theory is precisely this consequence of
its identifying two sets of empirical residue that are not amenable to general linguistic
explanation. In §4.3 it is argued that such violations of phonological markedness are
systematic, and function to enhance performative components at the interface of vocal
music and the phonological system.
The most striking and consistent property of scat syllables that challenges
markedness theory is the overwhelming preference for the voiced obstruents [d] and [b]
as onsets, in conjunction with the extreme rarity of their voiceless counterparts [p] and
[t]. Whereas in a natural language system, the perceptual contrast between onset and
nucleus is optimized by [-voice] onsets, in a vocal music system, the pitch level of the
melodic line can only be carried by [+voice] segments. What is hypothesized here is that
the musical imperative for melodic/voiced realization over-rides (in optimality terms,
outranks) the natural language markedness constraint.
The second type of scat anomaly involves a diversity of infrequent attestations
of relatively marked sounds. Because the introduction of deviation into a stream of
high frequency, phonologically unmarked scat has considerable cognitive impact, it is
hypothesized that phonological deviance and low frequency may functionally conspire in
scat to enhance perceptual salience. By defying the familiar bounds of the scat inventory,
a jazz singer can effectively capture the listener's attention, extend the articulatory
repertoire that the singer can creatively work with, and transcend the familiar.
At the heart of improvisational creativity in music, as in language, is the challenge of
innovation under the constraints of structural limitations. Although an intricate variety
of cognitive and performative factors contribute to scat improvisation, what has been
argued in this paper is that phonological markedness theory provides an explanatory
framework of the structural constraints that largely define the articulatory domain of
scat. Interfacing with this phonological framework, and sometimes over-riding specific
constraints within it, are performative exigencies of melodic realization and the ineffable
creative workings of improvisational genius.

References
Bauer, William R. 2002a. Open The Door: The Life and Music of Betty Carter. Ann Arbor:
University of Michigan Press.
Bauer, William R. 2002b. Scat Singing: A Timbral and Phonemic Analysis. Current
Musicology 71–73: 303–322.
Bastian, Jim, and John Alexander. 1995. Chet Baker's Greatest Scat Solos. Smithfield, RI:
Coastal Publishing and Educational Resources.
Berliner, Paul. 1994. Thinking in Jazz: The Infinite Art of Improvisation. Chicago:
University of Chicago Press.
Chomsky, Noam, and Morris Halle 1968. The Sound Pattern of English. NY: Harper and
Row.
Clements, G.N. 1990. The Role of the sonority cycle in core syllabification. In Papers in
Laboratory Phonology I: Between the grammar and physics of speech, ed. J. Kingston
and Mary E. Beckman. Cambridge: Cambridge University Press. 283–333.
de Lacy, Paul. 2001. Markedness in prominent positions. In MITWPL 40: HUMIT 2000,
eds. O. Matushansky et al. Cambridge, MA. 53–66. [also ROA 542]
de Lacy, Paul. 2002. The formal expression of markedness. University of Massachusetts,
Amherst: Doctoral dissertation.
de Lacy, Paul. 2007. Themes in phonology. In The Cambridge Handbook of Phonology, ed.
Paul de Lacy. Cambridge: Cambridge University Press.
Dell, F. and M. Elmedlaoui. 1985. Syllabic consonants and syllabification in Imdlawn
Tashlhiyt Berber. Journal of African Languages and Linguistics 7: 105–130.
Dell, F. and M. Elmedlaoui. 2008. Poetic meter and musical form in Tashlhiyt Berber songs.
Cologne: Rüdiger Köppe.
Francis, W.N. and H. Kucera. 1982. Frequency analysis of English usage: Lexicon and
grammar. Boston: Houghton Mifflin.
Gordon, Matthew. 2007. Functionalism in phonology. In The Cambridge Handbook of
Phonology, ed. Paul de Lacy. Cambridge: Cambridge University Press.
Greenberg, Joseph H. 1963. Memorandum concerning language universals. In Universals
of Language, ed. J.H. Greenberg. Cambridge, MA.
Greenberg, Joseph H. 1966. Universals of Language. Cambridge: MIT Press.
Greenberg, J.H. 1978. Some generalizations concerning initial and final consonant
clusters. In Universals of human language 2: Phonology, ed. J.H. Greenberg. 243–280.
Stanford: Stanford University Press.
Hurch, Bernard. 2005. Studies on reduplication. Empirical approaches to language typology,
No. 28. Berlin: Mouton de Gruyter.

Ito, Junko. 1989. A prosodic theory of epenthesis. Natural Language and Linguistic Theory
7: 217–259.
Jakobson, Roman. 1941/1968. Child language, aphasia and phonological universals. The
Hague and Paris: Mouton.
Kager, R. 1999. Optimality Theory. Cambridge: Cambridge University Press.
Kaye, Jonathan, and Jean Lowenstamm. 1984. De la syllabicité. In Forme sonore du
langage: Structure des représentations en phonologie, ed. F. Dell, D. Hirst and J.-R.
Vergnaud. Paris: Hermann. 123–159.
Kernfeld, Barry. 1995. What to Listen for in Jazz. New Haven, Conn.: Yale University
Press.
McCarthy, John, and Alan Prince. 1986. Prosodic morphology. Technical report 32. Rutgers
University Center for Cognitive Science. (online revised version:
http://ruccs.rutgers.edu/pub/papers/pm86all.pdf)
McCarthy, John, and Alan Prince. 1994. The emergence of the unmarked: Optimality
in prosodic morphology. In Proceedings of the North East Linguistic Society 24, ed.
Mercè Gonzàlez. Amherst, MA: GLSA Publications. 333–379.
McCarthy, John, and Alan Prince. 1995. Faithfulness and Reduplicative Identity. In
University of Massachusetts Occasional Papers in Linguistics 18, eds. Jill Beckman,
Laura Walsh Dickey, and Suzanne Urbanczyk. 249–384. Amherst, MA: GLSA.
Moravcsik, Edith. 1978. Reduplicative constructions. In Universals of human language 3:
Word structure, ed. J. H. Greenberg. 297–334. Stanford, CA: Stanford University Press.
Peperkamp, S. 2003. Phonological Acquisition: Recent Attainments and New Challenges.
Language and Speech 46, 2–3: 78–113.
Prince, Alan, and Paul Smolensky. 2004. Optimality Theory: Constraint Interaction in
Generative Grammar. Oxford: Basil Blackwell. [1993. ROA 537]
Robinson, J. Bradford. 2002. Scat Singing. In New Grove Dictionary of Jazz, Vol. 3, ed.
Barry Kernfeld. 515–516.
Sapir, Edward. 1933. The Psychological Reality of the Phoneme. In Selected Writings of
Edward Sapir in Language, Culture and Personality, ed. David Mandelbaum. 1986.
Berkeley: University of California Press.
Stemberger, Joseph Paul. 1990. Wordshape errors in language. Cognition 35: 123–157.
Steriade, Donca. 1999. Phonetics in phonology: The case of laryngeal neutralization. In
Papers in Phonology 3, ed. Matthew Gordon. UCLA Working Papers in Linguistics
2. UCLA. 25–146.
Stewart, Milton L. 1987. Stylistic Environment and the Scat Singing Styles of Ella Fitzgerald
and Sarah Vaughan. Jazzforschung/Jazz Research 19: 61–76.
Stoloff, Robert. 2003. Blues Scatitudes. Brooklyn: Gerard and Sarzin Publishing Co.
Trubetzkoy, Nikolaj. 1939. Grundzüge der Phonologie. Göttingen: Vandenhoeck & Ruprecht.
Vennemann, Theo. 1988. Preference laws for syllable structure. Berlin: Mouton de Gruyter.
Zec, Draga. 2007. The Syllable. In The Cambridge Handbook of Phonology, ed. Paul de
Lacy. Cambridge: Cambridge University Press.


Appendix 1. Transcription conventions


Reducing the dynamic auditory flux of articulatory movement in scat vocalization
to discrete transcriptional conventions that are defined in terms of phonologically
independent unitary segments and syllables entails both informed choice and compromise.
Some of the major factors impacting on the transcriptions presented in Appendix 2 are
discussed here.
First, there are some different notational symbols that are aligned with different
transcription traditions. Given the American roots of jazz, certain Americanist symbols,
like [š] and [č], are here adopted, rather than their corresponding IPA counterparts [ʃ]
and [tʃ]. Nasalization is indicated by a tilde over the vowel, e.g. [ã]. A syllabic resonant
is marked with a subscript dot, e.g. [ṇ].
More nuanced are issues related to levels of abstraction away from phonetic
detail. The transcriptions in Appendix 2 are basically very broad, ignoring most aspects
of phonetic realization that are entirely regular, such as pre-tonic aspiration of stops.
However, certain other features that are normally non-contrastive in English, but that
surface prominently in unpredictable environments are explicitly marked, for example,
the nasalization in Louis Armstrong's string of syllables initiated in bar 49 of §2.1.2.
A particularly complex domain is the representation of vowels, as their articulation
tends to be highly mobile. Bauer opts for a relatively abstract transcription, based on the
Trager-Smith system for phonemicization of English. Bauer defines the interpretation
of vowels with reference to the words in the table below (see Chart 1 in Bauer (2002a:
238); and Table 3 in Bauer (2002b: 306–307)). In order to standardize the transcription
system used for all the scat data considered in the present study, the Trager-Smith
(abbreviated T-S) representations are here interpreted as in the PAS column below.
The major differences are in the representation of lax vowels and of the T-S post-vocalic
/h/. Bauer's transcriptions reproduced in Appendix 2 include both his T-S notation and
a transliteration in terms of the general correspondences set out in (1).
(1)  English Word   T-S   PAS      English Word   T-S   PAS
     beat           iy    iy       boot           uw    uw
     pit                           put
     bait           ey    ey       boat           ow    ow
     pail           eh             caught         oh
     pet                           pot
     pat                           cut

Both systems neutralize the considerable variation in vowel realization that may relate to
an individual singer's articulation, phonological context, or melodic interpretation.
An inherent limitation of the Trager-Smith transcriptions that impacts on the
present analysis derives from the fact that the T-S system does not include glottal stop,
since [ʔ] is non-phonemic in standard English. As a consequence, there is systematic
ambiguity in the onsetless status of syllables that are transcribed by Bauer as vowel-initial.
There are three different types of contexts where this observation is relevant.
First, consider the post-rest vowel in what is transcribed as / duw .../ in bar 6
of Betty Carter's "Thou Swell" (§2.3.1). In the context of the present evaluation of the
relative markedness of scat syllable structure, the question is whether this syllable is
truly onsetless, i.e. simply [V], which would be a marked syllable structure, or whether
there is a sub-phonemic epenthetic glottal stop functioning as an onset and creating
an unmarked structure, i.e. [ʔV]. My auditory assessment of the phonetic realization of
these contexts basically accords with Bauer's phonemic transcriptions: generally Carter's
mellifluous voice transitions very smoothly into a vocalic realization both phrase-initially
and in phrase-internal vowel-vowel sequences, as in [... ly a m ...] in bar 7. Despite the
musical appropriateness of these seamless transitions to different vowel targets, within
the linguistic analysis such syllables are tallied as onsetless, and hence marked.
The second context is exemplified in the last two syllables of bar 6 of §2.3.1 "Thou
Swell", where Bauer's transcription /duw uw/ implies the second syllable /uw/ has no
onset. Here there are two other potential interpretations: (i) a glottal onset, or (ii) a
transsyllabic perseveration from the preceding glide into an onset role. The retranscription
[du wuw], adopted here, interprets the intervocalic glide as an onset. Alternatively the
[w] closure may be considered ambisyllabic. Either way, such cases are not onsetless, and
hence not categorized as marked in the present analysis.
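
The retranscription of /duw uw/ as [du wuw] can be expressed as a small procedure. The Python sketch below is one possible rendering under stated assumptions: syllables are given as plain ASCII strings in a T-S-like notation, glides are limited to w and y, and the function name `resyllabify_glides` is introduced here for illustration only.

```python
VOWELS = set("aeiou")
GLIDES = {"w", "y"}


def resyllabify_glides(syllables):
    """Move a syllable-final glide into the onset of a following vowel-initial syllable."""
    out = list(syllables)
    for i in range(len(out) - 1):
        cur, nxt = out[i], out[i + 1]
        # Require material before the glide so that no syllable is emptied.
        if len(cur) > 1 and nxt and cur[-1] in GLIDES and nxt[0] in VOWELS:
            out[i] = cur[:-1]
            out[i + 1] = cur[-1] + nxt
    return out


print(resyllabify_glides(["duw", "uw"]))   # ['du', 'wuw']
print(resyllabify_glides(["duw", "diy"]))  # unchanged: the next syllable has an onset
```

The sketch implements only the onset interpretation adopted in the text; the alternative ambisyllabic analysis would keep the glide affiliated with both syllables and is not modelled here.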
A third and similar type of case where there plausibly is dual functionality of
an intervening consonant is the situation where a syllabic nasal is preceded by a syllable
headed by a lax vowel and closed by a stop that is tautosyllabic to the following nasal,
e.g. the sequences [dt ] and [dt ] in line 7 of Chet Baker's "Everything Happens to Me"
(§2.2.2). At a very surface level, such [t]s are arguably ambisyllabic, functioning as both
coda to the preceding syllable and onset to the following one. Consequently, the syllabic
resonant in such cases is not classified in the present analysis as onsetless.
Space limitations here unfortunately preclude inclusion of the corresponding
musical transcription for the full repertoire of scat renditions analyzed here (but see
citations to musical notation by Bauer (2002a, b) and Bastian and Alexander (1995)).
However, because there are, as one might expect, certain phonological correlations in
positions of prosodic prominence, both bar divisions and rests are encoded in some
(but not all) of the transcriptions here. Bar divisions are represented by | . Rests are in
standard musical notation: sixteenth note 𝄿, eighth note 𝄾, quarter note 𝄽, and half note 𝄼.
A hold, where a syllable is held across a bar line, is indicated by a dash on both sides of
the bar, e.g. [... dip| b ...].
The reality, of course, is that a wealth of auditory information that springs from
this improvisational conjunction of creative and physical forces (the finessed range of
articulatory movement, the rich and highly individualistic molding of acoustic shapes
and tonalities) is not captured in conventional phonetic transcription. For the present
purposes, however, the notation adopted in Appendix 2 provides considerable insight
into the linguistic issues under investigation.


Appendix 2: Scat Transcriptions


2.1.1. Armstrong, Louis. Heebie Jeebies scat solo (February 26, 1926)
Okeh Records 8300. Transcription by W.R. Bauer (2002b: 308).
Transliteration (2nd line) follows the principles outlined in Appendix 1.
[Note: see Bauer for a full musical transcription of the melodic line.]

WRB: eh | iyf gf | mf diy b | diy d la bam | rip ip di duw diy duwt |


e | iyf gf | mf diy b | diy d la bam | rp p d duw diy duwt |

WRB: duw duw diy duw d | diy d d dow diy | dow di dow duw duw |
duw duw diy duw d | diy d d dow diy | dow d dow duw duw |

WRB: b duw biy dey d | skiyp skm | ski bp diy d di d |


b duw biy dey d | skiyp skm | sk bp diy d d d |

WRB: dip dw diy dip | duw d dw d ...


dp dw diy dp | duw d dw d ...

2.1.2. Armstrong, Louis. Hotter Than That scat solo (December 13, 1929).
Hotter than That, Track 1, 1:18. Okeh Records 8535.
Transcription BK by Barry Kernfeld (1995: 168).
Transcription PAS and time markers by Patricia A. Shaw.
Transcription WB/R by Bauer (2002b: 308), citing Reeves (2001): bars 49–54.
[1:19]

BK1: Dip deh doop da, doe doe doe doe.


PAS: dp di dup d da do do do

BK2a: Dah dew dah doot doot dew, da dee dee doot,
PAS: da daw da dut dut du d di di dut

BK2b: daw bee do bee dup baw lahp baw.


PAS: da bi du bi dup ba la(p-) baw
[1:26]

BK3a: Bah bee boop, buh dee bee doop bee,


PAS: ba bi bup b di bi dup bi
BK3b: hew law bah de bohm, bah bah bah bough
PAS: h lo ba di bom ba ba ba baw
[1:30]

BK4a: Wah bee bah bee bee, low bah dah-oh-ah,


PAS: wa bi ba bi bi lo ba d o wa
BK4b: lah dah bee bop bah deep bah feh.
PAS: la da bi bap b dip ba b
[1:35]

BK5: Dah to dit dit dew dup, dee duh doe.


PAS: da tu dt dt du dp di d dol

BK6a: Rip dee duh duh dew dah daw-ee-ya doe doe dip dip,
PAS: rp di d d du da do wi yo d dt dp
[1:40]

BK6b: baw buh bah bah baw beep bah beep baw bah baw bah bah beep bah beep
WRB: | boh b boh | ba b biy | b biy | bow b bow | b ba biy | ba biy |
| b b b | ba b biy | b biy | bow b bow | b ba biy | ba biy |
PAS: b ba b b bw bip bw bip b ba b b b bip b bip

BK6c: thiz dit duh duh.


PAS: di dt d d
[1:49]

BK7: Reap dew dit done, dah nah naw naw deep dah dee, dah done dah dew.
PAS: rip diw dut dn da d na n dip d di da dan d dl
[1:52]

BK8: Bah bah dah beep bew.


PAS: ba b da bip biw
BK9: Bah bee dut zuh bow.
PAS: ba bi dp da b
[1:56]

BK10: Wah-oh dove dew, duh boop bee dew the boop, wah-oo-lough.
PAS: wuw dag du d bup bi du di bum wu lw
[guitar]
[2:02]

BK11: Zwee boo bee dew um-wow dah-dah-wow.


PAS: zwi bu bi du mwaw d d waw
[guitar]
[2:06]

BK12: Oooooo dah-dum-wah um-mough hmaf hwow.


PAS: da dm ww mw waw
[guitar ... ]
[2:11]

[2:16]

BK13: Reap deh diddle dee tih duh, boo wuh buh bow.
PAS: rp di du dl di b dp bwa b bo

2.2.1. Baker, Chet. Everything Happens to Me. (1955)


Verve Jazz Masters 32, Verve CD 314 516 9392.
Track = 3:31 minutes, Scat bridge = [2:27–2:58].
Transcription JB by Jim Bastian (Bastian and Alexander 1995: 14).
Transcription PAS and time markers by Patricia A. Shaw.
[2:27]

JB: | Det n deh dah n duh | dit dah dah dah


PAS: | dt d dt d | dt d d
[2:34]

JB: | ah det n deh deh deh | det dee yah dah


PAS: | h dt d d | dt di y d
[2:42]

JB: yah dah deh | deh
PAS: y dn d | d

[2:44]

JB: dah ee dah dut n dah dee dah | dah yah dah deh deh
PAS: d i d dt d de d | d y d
[2:50]

JB: bah | deh deh deh deh dah dah dah det n deh dah dah dah | det n deh
PAS: b | d d t d d d | t

2.2.2. Baker, Chet. Everything Happens to Me. (1989)


Chet Baker Sings and Plays from the Film Let's Get Lost, Novus CD3054-2-N.
Track = 5:15 minutes, Scat bridge = [3:47–4:21].
Transcription JB by Jim Bastian (Bastian and Alexander 1995: 15).
Transcription PAS and time markers by Patricia A. Shaw.
[3:47]

JB: Hoo day dut n dah dah deh deh deh deh | deh
PAS: hu de dt de y dey d d d |

[3:57]

JB: yah deh deh dah deh |


PAS: d d dey d di |
[4:06]

JB: dah deh dah deh deh deh deh deh deh deh deh dee |
PAS: dw d d d d d du dw du d d di |
[4:10]

JB: dee dee dee dee dee dee dee dee dee dee deh det deh |
PAS: di du d de du d d du d dt d |
[4:15]

JB: yeh deh det n deh dit n deh dee ee | day doo ee doo
PAS: d d dt d dt d duiy | d du

2.3.1. Carter, Betty (née Lillie May Jones) 1929–1998. Thou Swell. (1955)
Meet Betty Carter and Ray Bryant. Columbia/Legacy CK 64936, 6. (from A
Connecticut Yankee. 1927. Rodgers and Hart, WB Music Corp.)
Transcription by W. R. Bauer (2002a: 251; includes musical transcription).
Transliteration (2nd line) follows the principles outlined in Appendix 1.

(nasal)

WB: h l dow di dl y d | hiy d d ba yow bow || ba ba duw b ba biy |


h l dow d d y d | hiy d d ba yow bow || ba ba duw b ba biy |

WB: di dn di d yow bow | ba ba duw b ba ba duw b |


d d d d yow bow | ba ba duw b ba ba duw b |

WB: uw d duw ly duw uw | uw duw ly a m | bey dow |


uw d duw ly du wuw | u w duw ly a m | bey dow |

WB: la m biy b duw b duw wiy | da yuw duw b | du du di duw b |


la m biy b duw b duw wiy | da yuw duw b | du du d duw b |

WB: di dl d i duw m | ba b d iy ra | la m biy b duw b |


d dl d i du wm | ba b d iy ra | la m biy b duw b |

WB: iy b duw b ba b | di dliy duw bw | hey ba b spi d l di d l |


iy b duw b ba b | d dli y duw bw | hey ba b sp d l d d l |

WB: ba b di d l bey (ay || fiyl ) ||


ba b d d l bey I feel

2.3.2. Carter, Betty. Open the Door (1979)


Words and music by Betty Carter, 1964. MyKag Publ. Co.
The Audience with Betty Carter. Bet-Car MK 1003; reissue Verve 835 6841.
Transcription by W.R. Bauer (2002a: 294–303; includes musical transcription).
Transliteration (2nd line) follows the principles outlined in Appendix 1.

WB: | d | dow | ... | ... |

WB: duw duw | duw |


WB: h | duw duw | diy | duw | ... | ... |

WB: duw iy uw iy uw | duw | | ... | ... |


du wi yu wi yuw | duw | | ... | ... |

WB: le d dey dey | dey dey dey dey | dey | | ...


l d dey dey | dey dey dey dey | dey | | ...

3.1. Barry Mann. Who Put the Bomp? (1961)


Words and Music by Barry Mann and Gerry Goffin. ABC-Paramount 45 NK-10237.
Lyrics from The Official Barry Mann and Cynthia Weil Website:
http://www.spectropop.com/hmannandweil.html.
Transcription by Patricia A. Shaw.
Due to space limitations, only lines that have scat are included below, and only scat
syllables are transcribed. For reference, lines are numbered at the left.
 1. Who put the bomp                     bamp
 2. In the bomp bah bomp bah bomp?       bam b bam b bam
 3. Who put the ram                      rm
 4. In the rama lama ding dong?          r m l m d d
 5. Who put the bop                      bap
 6. In the bop shoo bop shoo bop?        bap bap bap
 7. Who put the dip                      dp
 8. In the dip da dip da dip?            dp d dp d dp
    ...
 9. Boogity boogity boogity              bu g di bu g di bu g di
10. Boogity boogity boogity shoo         bu g di bu g di bu g di u
    ...

3.2. Sha Na Na Mr. Bass Man


Words and Music by Johnny Cymbal (1963). Original release: Kapp 503.
Re-recorded by Sha Na Na, Rama Lama Ding Dong (1980).
Transcription by Patricia A. Shaw.
[Note: The lyrical lines are marked BM for Mr. Bass Man, W for "I wanna be...",
and Back for the back-up singers' line. English is in italics and in standard
orthography. Only scat syllables are transcribed.]
For reference, scat lines are numbered at the left.
BM: 1. baw b b baw b baw b baw baw
2. b b baw b b baw b baw b baw bm bm
W:

Mr. Bass Man, you've got that certain something...


Mr. Bass Man, you set that music thumpin'
To you it's easy
when you go 1-2-3

3. b b b bam ba?
BM:
W:

[0:33]

BM:
W:

Yeah! Mr. Bass Man, you're on all the songs


5. s d d b b bum bum
6. And the d dt ba ba bw
Hey Mr. Bass Man, you're the hidden King of Rock 'n' Roll
7. b b b ba bw??
No, no!
8. b b ba b b bu ba b ba ba bw
Oh, it don't mean a thing when the leader's singin'
Or when he goes

9. ay yay yay yay y y


Hey Mr. Bass Man, I'm askin' just one thing
Will you teach me, mmm, yeah, the way you sing
'Cause Mr. Bass Man, I wanna be a bass man too
10. b b b ba b?

[0:41]

BM:
[1:00]

You mean:
4. b b baw b b b baw b baw baw ba ?

W:

BM:

Try this:
11. b b b bu bu b ba b bw
Oh Mr. Bass Man, I really think I'm winnin'
12. With the bp bum b
13. And a d dt d d d
Oh Mr. Bass Man, now I'm a bass man too
14. d d d d
That's it!
15. bu b b bu b bu ba b ba

Back: 16. bm bm bm b b bm bm b bm b b b bm bm ba b bm
BM: Now you!
[1:18]

W: 17. d d d d b bum bum bo bm b b b b bom bm


BM: With me

BM/W: 18. bm b b bm b b bm ba bum ba bm


19. bm b b bm b b bm _ bm bm
20. bm b b bum b b bm b b bm b b bm bm
21. bm b bm b b bu b bu b b bm
[1:30]
[1:35]

W: Oh, it don't mean a thing oh when the leader's singin'


Or when he goes
21. ay yay yay yay y ya
Hey Mr. Bass Man, I'm askin' just one thing
Will you teach me, mmm, yeah, the way you sing
'Cause Mr. Bass Man, I wanna be a bass man too
22. ba b
BM:
W:

BM:
[2:08]

That's it.
27. bu b b bu b bu ba b bw
28. bm bm bm b b b bm bm bm b b b b b bm bm b b bm
Now you!

W: 29. b b bom b b bom b bum


BM:

[2:16]

Oh Mr. Bass Man, I really think I'm winnin'


With the

24. bp bum b
And a
25. d dt d d da(w)
Oh Mr. Bass Man, now I'm a bass man too
26. d d d d d dw

[2:05]

[2:12]

Try this:
23. b b b bu bu b ba b bw

Soundin' good ...


With me

BM/W: 30. bm b b bm b b bm ba bm ba bm
31. bm b b bm b b bm _ b bm
32. bm bm b b bm bm b b bm b b bm b b ba b
33. b bm b b b b bm b bm
34. _ bum bum b bm b b b b bm
35. b b b b bm bm b bm b b ...
