
The frequency scale of speech intonation

Hermes, D.J.; Gestel, van, J.C.

Published in:
Journal of the Acoustical Society of America

DOI:
10.1121/1.402397

Published: 01/01/1991

Document Version
Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers)


Citation for published version (APA):


Hermes, D. J., & Gestel, van, J. C. (1991). The frequency scale of speech intonation. Journal of the Acoustical
Society of America, 90(1), 97-102. DOI: 10.1121/1.402397



The frequency scale of speech intonation
Dik J. Hermes and Joost C. van Gestel
Institute for Perception Research/IPO, P.O. Box 515, NL 5600 MB Eindhoven, The Netherlands

(Received 8 May 1990; accepted for publication 3 January 1991)
In intonation research, prominence-lending pitch movements have either been described on a linear or on a logarithmic frequency scale. An experiment has been carried out to check whether pitch movements in speech intonation are perceived on one of these two scales or on a psychoacoustic scale representing the frequency selectivity of the auditory system. This last scale is intermediary between the other two scales. Subjects matched the excursion size of prominence-lending pitch movements in utterances resynthesized in different pitch registers. Their task was to adjust the excursion size in a comparison stimulus in such a way that it lent equal prominence to the corresponding syllable in a fixed test stimulus. The comparison stimulus and the test stimulus had pitches running parallel on either the logarithmic frequency scale, the psychoacoustic scale, or the linear frequency scale. In one-half of the experimental sessions, the test stimulus was presented in the low register, while the comparison stimulus was presented in the high register, and, conversely, for the other half of the sessions. The result is that, in all cases, stimuli are matched in such a way that the average excursion sizes in different registers are equal on the psychoacoustic scale.
PACS numbers: 43.71.Bp, 43.66.Hg, 43.71.Cq

J. Acoust. Soc. Am. 90 (1), July 1991 0001-4966/91/070097-06$00.80 © 1991 Acoustical Society of America

INTRODUCTION

In physics, frequency is generally expressed in terms of the unit hertz (Hz). In various branches of hearing research, other units are used. In music perception, the relative distance between two tones is expressed in a musical interval such as the semitone and the octave. In this musical scale, equal distances represent equal frequency proportions, which amounts to using a logarithmic frequency scale. In psychoacoustics, the Mel scale has been used, based on a subjective measure of pitch "magnitude" (Stevens et al., 1937; Stevens and Volkman, 1940). More often, a related frequency scale is used, the Bark scale. This is approximately linear for frequencies below 500 Hz and approximates a logarithmic frequency scale for higher frequencies. The Bark scale is derived from measurements of the frequency selectivity of the human auditory system, as measured by the so-called critical bandwidth (Fletcher, 1940; Zwicker et al., 1957; Zwicker, 1961). Analytical expressions for the Bark scale are presented by Zwicker and Terhardt (1980), and by Traunmüller (1990). A newer variety of this scale is the equivalent-rectangular-bandwidth-rate (ERB-rate) scale (Patterson, 1976), for which analytical expressions are presented in Moore and Glasberg (1983) and Glasberg and Moore (1990). In the ERB-rate scale, the critical bands are narrower, especially at lower frequencies, than in the Bark scale. For frequencies below 500 Hz, the ERB-rate scale is neither linear, such as the Bark scale, nor logarithmic, but something in between. A detailed and quantitative discussion on this scale is given by Glasberg and Moore (1990).

The frequency scales derived from the frequency selectivity of the auditory system have been associated with distances along the basilar membrane (Fletcher, 1953; Greenwood, 1961, 1990). This has been verified anatomically for cat by Wilson and Evans (1977) and by Liberman (1982), who also supplied formulas relating frequency to place on the basilar membrane. In this study, the formulas from Greenwood (1961, 1990) for man were used:

$$E = 16.7 \log_{10}(1 + f/165.4), \qquad (1)$$

$$f = 165.4\,(10^{0.06E} - 1), \qquad (2)$$

where f is frequency in Hz, and E is the ERB-rate in ERB. These expressions give about the same values as the ERB-rate scale published by Moore and Glasberg (1983) (Moore and Glasberg, 1986, p. 254). Therefore, the psychoacoustic scale as used in this study will also be indicated with ERB-rate scale.

In intonation research, pitch of speech has either been expressed in Hz (e.g., Cooper and Sorensen, 1981) or in semitones (e.g., 't Hart et al., 1990). In an experiment set up to find out whether prominence-lending pitch movements should be expressed in Hz or in semitones, Rietveld and Gussenhoven (1985) concluded that prominence judgments of their subjects were in better agreement with an Hz scale than with a scale of semitones. This result was based on comparisons of the prominence of syllables with pitch movements in different frequency regions. As their stimulus set comprised only sentences recorded from a female speaker, which were resynthesized in an equal or in a lower pitch register, they concluded that it would be premature to conclude that prominence-lending pitch movements should be expressed in Hz. According to Graddol (1986, p. 228), the pitch range used by most women seems to be rather less than that used by most men, when expressed in semitones, but larger, when expressed in a linear scale. Choosing between the two, he concludes that, "whenever intervals in pitch must be compared at different frequencies, a log scale is to be preferred." Only Traunmüller et al. (1989) have so far considered the possibility that pitch movements in speech intonation may best be expressed in a scale derived from the frequency selectivity of the auditory system. Following Graddol (1986), they showed preference for the logarithmic frequency scale, however.
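To make the three candidate scales concrete, the following is a minimal Python sketch, not part of the original study, of how an excursion can be carried from one register to another while keeping its size constant on a linear (Hz), ERB-rate, or logarithmic (semitone) scale. The ERB-rate conversion follows Eq. (1) and its exact inverse (Eq. (2) rounds 1/16.7 to 0.06). The function names, the `SCALES` table, and the 1-Hz semitone reference are illustrative assumptions.

```python
import math


def hz_to_erb_rate(f_hz):
    """Eq. (1): ERB-rate E (in ERB) for a frequency f in Hz."""
    return 16.7 * math.log10(1.0 + f_hz / 165.4)


def erb_rate_to_hz(e_erb):
    """Exact inverse of Eq. (1); Eq. (2) approximates 1/16.7 by 0.06."""
    return 165.4 * (10.0 ** (e_erb / 16.7) - 1.0)


def hz_to_semitones(f_hz, ref_hz=1.0):
    """Semitones relative to ref_hz (the reference is an arbitrary assumption)."""
    return 12.0 * math.log2(f_hz / ref_hz)


def semitones_to_hz(st, ref_hz=1.0):
    """Inverse of hz_to_semitones for the same reference."""
    return ref_hz * 2.0 ** (st / 12.0)


# Each entry maps a scale label to (Hz -> scale, scale -> Hz).
SCALES = {
    "LIN": (lambda f: f, lambda x: x),
    "ERB": (hz_to_erb_rate, erb_rate_to_hz),
    "LOG": (hz_to_semitones, semitones_to_hz),
}


def transfer_excursion(low_hz, high_hz, new_low_hz, scale):
    """Top frequency of an excursion that starts at new_low_hz and has the
    same size on `scale` as the excursion from low_hz to high_hz."""
    to_scale, to_hz = SCALES[scale]
    size = to_scale(high_hz) - to_scale(low_hz)
    return to_hz(to_scale(new_low_hz) + size)


if __name__ == "__main__":
    # A 120-180 Hz excursion moved to a register starting at 240 Hz.
    for name in ("LIN", "ERB", "LOG"):
        top = transfer_excursion(120.0, 180.0, 240.0, name)
        print(f"{name}: 240 Hz to {top:.1f} Hz")
```

Under these assumptions, the linear scale gives a top frequency of 300 Hz, the logarithmic scale 360 Hz, and the ERB-rate scale a value of roughly 325 Hz, in between the other two.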
This problem has consequences for various applications. If in synthetic speech, e.g., one wants to give the same prominence to accented syllables in male as in female speech, the excursion of the pitch movements must be the same on the frequency scale in which the prominence of pitch movements is perceived. As female and male voices differ by almost 1 oct, the difference between these approaches can cause considerable discrepancies. For example, an excursion of 120 to 180 Hz in a male voice would correspond to an excursion of 240 to 300 Hz in a female voice if a linear scale were used, whereas an excursion of 240 to 360 Hz would provide equal prominence if a logarithmic scale were used. On an ERB-rate scale, equal prominence would require an excursion of 240 to 325 Hz.

In order to decide on which scale the excursions of pitch movements are perceived, subjects adjusted the variable excursion size of a pitch movement in a comparison stimulus to the fixed excursion size of a pitch movement in a test stimulus resynthesized in a different frequency register. This was done both in sessions in which the test stimulus was in a low register, while the comparison stimulus was in a high register, and for sessions in which the test stimulus was in a high register, while the comparison stimulus was in a low register. Furthermore, this was done for six different excursion sizes, and for three different prominence-lending pitch movements, a rise, a rise-fall, and a fall.

I. EXPERIMENT

A. Materials

The stimuli consisted of modified versions of one utterance, /mamáma/, spoken by a male speaker. Its duration was 0.77 s. The second syllable carried an accent. Pitch modifications were applied with the pitch-synchronous overlap and add (PSOLA) technique (Hamon et al., 1989), resulting in very natural sounding speech stimuli. Duration and amplitude relations were kept constant.

These stimuli were resynthesized with one of three prominence-lending pitch movements, a rise, a rise-fall, and a fall, superimposed on declination lines as displayed in Fig. 1. These pitch contours consisted of lines that were straight on either a linear frequency scale, an ERB-rate scale, or a logarithmic frequency scale, resulting in three different versions, which will be referred to as LIN, ERB, and LOG, respectively. All versions had a declination end point of either 75 or 180 Hz, defining the low versions and the high versions. The high versions sounded like a male falsetto voice.

[Figure 1 not reproduced: panels for the rise, the rise-fall, and the fall; abscissa: time (s), 0 to 0.9.]

FIG. 1. The three different prominence-lending pitch movements. The pitch movements superimposed on the declination line give prominence to the second syllable, the vowel onset of which is indicated by the crossbar. The rise starts from low declination 70 ms before vowel onset and ends at high declination 50 ms after the vowel onset of the second syllable. The rising part of the rise-fall has the same timing and is followed by the falling part, which starts 80 ms and ends 200 ms after the vowel onset. The fall starts from high declination 20 ms before the vowel onset of the second syllable, and ends at low declination 100 ms after the vowel onset. The same timing is used in the IPO text-to-speech system.

All sessions consisted of adjustment runs in which two stimuli, a low and a high version, were repeatedly presented to the subject with an interstimulus interval of 1 s. The stimulus presented first will be referred to as the test stimulus. This was fixed within one run. The second stimulus, referred to as the comparison stimulus, was presented in another register and had a variable excursion size. In the first trial of a run, the excursion size in the comparison stimulus was zero. Subjects were asked first to increase the excursion size in the comparison stimulus to such an extent that the prominence of its accented syllable clearly exceeded the prominence of the accented syllable in the test stimulus. In the next trials, the subjects were asked to decrease and increase the excursion size in the comparison stimulus until it was judged to give the same prominence as the pitch excursion in the test stimulus. When the subject had done this, the next run started. In each session, there were six runs for six different excursion sizes in the test stimulus. The six different excursion sizes in the test stimulus were presented in random order with a different order in each session.

As mentioned, the two stimuli matched in prominence were presented in different registers. In one set of sessions, the test stimuli were presented in the high register, while the comparison stimuli were presented in the low register (see Fig. 2). These will be referred to as downward sessions. In another set of sessions, the test stimuli were low in register, and the comparison stimuli high. These will be referred to as upward sessions (see Fig. 3). These upward and downward sessions took place for all three pitch movements and for each of the three different frequency scales, giving eighteen different sessions.

In each session, the comparison stimuli formed a set of ten stimuli with increasing excursion size. They were constructed in such a way that, within one register, they were almost identical in all three frequency scales [compare Fig. 2(b), (d), and (f), and Fig. 3(b), (d), and (f)]. Exact equality was impossible as the lines that made up the pitch contour were required to be straight in one of the three frequency scales. A transformation of one scale to another being nonlinear, the pitch contours could only be straight in one frequency scale. The differences were very small, however.

The low versions of these comparison stimuli had a common end point of their low declination line of 75 Hz and a start point of 93 Hz. This amounts to a declination of 4.85 semitones per second (st/s). This is derived from the rule for a declination of 11/(t + 1.5) st/s, t being the duration of the utterance, in this case 0.77 s. (This rule for the declination line is used in the Institute-for-Perception-Research diphone-speech-synthesis system. It dates from times when intonation was described in semitones at IPO.) For the high versions, the common end point was 180 Hz, and the start point was 223.3 Hz. The pitch movements were superimposed on these declination lines. The smallest excursion size was zero, and, for the low versions, the highest was 12.17 semitones for the LOG versions, 2.00 ERB for the ERB versions, and 76.5 Hz for the LIN versions. For the high versions, the largest excursion was 8.25 semitones for the LOG versions, 2.00 ERB for the ERB versions, and 109.9 Hz for the LIN versions. The intermediate excursion sizes were such that they were equidistant in the corresponding frequency scale. As mentioned, the result was such that the comparison stimuli in all three frequency scales were very much the same.

The six test stimuli were then produced by an upward (for the downward sessions) or downward (for the upward sessions) pitch shift of the six middle versions of these comparison stimuli in such a way that the end points of the low declination line concurred with the fixed end point of the low declination line in the other register. For the downward sessions with the rise-fall, the test stimuli with the smallest and the largest excursions are presented in Fig. 2(a) for the LOG versions, in Fig. 2(c) for the ERB versions, and in Fig. 2(e) for the LIN versions. Their corresponding comparison stimuli, i.e., those comparison stimuli that run parallel to them in the corresponding frequency scale, are presented in Fig. 2(b), (d), and (f). Similarly, Fig. 3 shows the test and corresponding comparison stimuli for the upward sessions. Observe that, for each test stimulus, there was a comparison stimulus with a pitch contour running exactly parallel in the corresponding frequency scale. The shifts in the three frequency scales resulted in test stimuli that differed much more from each other than the comparison stimuli, as can be seen in Figs. 2 and 3. An upward shift in semitones resulted in a steeper declination and larger pitch excursions than a shift in Hz. The shift in the ERB-rate scale was somewhere in between. The converse of this was true for the downward shifts.

[Figures 2 and 3 not reproduced: each has panels (a) to (f), with test stimuli on the left and comparison stimuli on the right, and rows for stimuli running parallel on the LOG, ERB, and LIN scales; abscissa: time (s), 0 to 0.9.]

FIG. 2. Stimulus configuration of the downward adjustment sessions for the rise-fall. The range of the test stimuli is displayed in (a), (c), and (e), showing the test stimulus with the lowest and the highest excursion size. The range of the comparison stimuli is shown by the arrow in (b), (d), and (f). The continuous lines show the prominence-lending pitch movements, which, in the frequency scale mentioned at the right, run parallel to the displayed test stimuli. Notice that the comparison stimuli in the three frequency scales are very much the same, whereas the test stimuli are different in both the start frequency of the declination and in the size of the excursions. In anticipation of the result, all stimuli are presented with an ERB-rate scale as ordinate.

FIG. 3. Stimulus configuration of the upward adjustment sessions for the rise-fall; otherwise, as Fig. 2.

In anticipation of the results, the pitch contours shown in Figs. 2 and 3 are presented on an ERB-rate scale. For clarity, if subjects perceived prominence of pitch movements on a logarithmic frequency scale, they would match the upper stimulus in Fig. 2(a) to the upper stimulus in Fig. 2(b), and the upper stimulus in Fig. 3(a) to the upper stimulus in Fig. 3(b). However, if they perceived prominence on an ERB-rate scale, they would match these test stimuli to a higher comparison stimulus in the downward sessions, and to a lower comparison stimulus in the upward sessions, as can be discerned from Figs. 2 and 3.

There were nine subjects, all of whom were students or staff members of this institute involved in speech and hearing research. Some were specialists in intonation research, while others were not. Some were musically trained, and one had absolute pitch. None reported hearing defects. Each subject completed all the sessions.

B. Results

There appeared to be no significant difference between the results for the rise, the rise-fall, and the fall. Therefore, the results of these three conditions are collapsed. The average results across all subjects are presented in Fig. 4. The coordinates represent the frequency scale in which the low
and the high stimulus ran parallel. The abscissa of the diamond shows the end point of the lower declination line of the test stimulus (180 Hz for the downward sessions, and 75 Hz for the upward sessions), while the ordinate of the diamond shows the end point of the lower declination line of the comparison stimulus (75 Hz for the downward sessions, and 180 Hz for the upward sessions). The data points represent the averages across all subjects. The abscissa of a data point represents the end point of the upper declination line of the test stimulus, while its ordinate represents the average end point of the upper declination line of the comparison stimulus. This means that the interval between a coordinate of a data point and the corresponding coordinate of the diamond gives the excursion size of the stimulus. The vertical bars represent the standard deviation of the results. The straight line with a slope of 45 deg represents the expected outcome if the subject had matched prominence in the frequency scale in which the results are plotted. In Fig. 4(a) and (b), the results are presented for the LOG versions, in 4(a) for the downward adjustments, and in 4(b) for the upward adjustments. In Fig. 4(c) and (d), the results for the ERB versions are presented, and in Fig. 4(e) and (f), the results for the LIN versions.

The results show that for the sessions with the LOG versions, a test stimulus presented in a high register is matched to a comparison stimulus, which, in semitones, has a higher excursion size [Fig. 4(a)]. On the other hand, when the test stimulus is presented in the low register, it is matched to a comparison stimulus which, in semitones, has a lower excursion size [Fig. 4(b)]. For the ERB versions, the test stimuli are matched to a comparison stimulus that has about an equal excursion size in the other register, in this frequency scale. There is a tendency to deviate from the ERB-rate scale for the lowest and the highest excursions of the test stimuli, but it will be shown that this could be attributed to a tendency to match the excursion size of the comparison stimulus to the average of all excursion sizes. For the sessions with the LIN versions, a test stimulus presented in a high register is matched to a comparison stimulus, which, in Hz, has a lower excursion size. When the test stimulus is presented in the low register, there is a tendency to match it to a comparison stimulus which, in Hz, has a higher excursion size.

[Figure 4 not reproduced: "AVERAGES ACROSS ALL SUBJECTS"; panels (a) to (f), downward sessions on the left and upward sessions on the right; axes in semitones for (a) and (b), in ERB for (c) and (d), and in Hz for (e) and (f).]

FIG. 4. The average results across all nine subjects who participated in the experiments. The coordinates represent the frequency scale in which the pitches of the low and the high stimulus ran parallel. Thus (a) and (b) show the results of the sessions with the LOG versions, (c) and (d), those for the ERB versions, and (e) and (f) for the LIN versions. The results for the downward sessions are presented in (a), (c), and (e), while (b), (d), and (f) present the results of the upward sessions. The abscissa of the diamond shows the end point of the lower declination line of the test stimulus, while the ordinate of the diamond shows the end point of the lower declination line of the comparison stimulus. The six different excursions of the test stimuli are presented as the interval between the abscissa of the data points and the abscissa of the diamond. The averages of the matchings by all subjects are presented as the intervals between the ordinate of the data points and the ordinate of the diamond. The vertical bars represent the standard deviation of the results. The straight line with a slope of 45 deg represents the expected outcome if the subject had matched prominence in the frequency scale in which the results are plotted.

C. Some comments

Not every subject performed the experiment equally consistently. Some complained about the difficulty of the task. Such subjects produced matchings that tended more to the average of the comparison stimuli, producing a higher variance in the results. These subjects could be selected by comparing their responses in the upward and the downward sessions. If a test stimulus in the low register is matched to a comparison stimulus with some specific excursion size in the high register, a test stimulus in the high register with such an excursion size should, in its turn, be matched to a comparison stimulus in the low register with about the same excursion size as the original test stimulus. This amounts to combining the results of a downward session with those of a corresponding upward session. When the excursion sizes of the lower stimuli are plotted against the excursion sizes of the higher stimuli, there should be a high correlation. This correlation coefficient was calculated for all sessions and for all subjects. This resulted in a quantitative measure that could be used to select subjects responding consistently. All subjects participated in 18 sessions, while for the determination of one such correlation coefficient two sessions were necessary. So, a total of nine correlation coefficients was obtained for each subject. The variances and the bias in the direction of the average were much less when only those five subjects were selected for whom this correlation coefficient exceeded 0.75 in more than six of the nine sessions. Figure 5 shows the average results for the five consistently responding subjects.

After being told that they had matched prominence of pitch movements on an ERB-rate scale, a few subjects felt
challenged to repeat the experiment, matching the stimuli on a musical, i.e., a logarithmic frequency scale. Some of them succeeded in this, and an example is shown in Fig. 6 for a fall. These subjects reported, however, that this task was very difficult for the kind of stimulus used here. It required a different way of listening, in which the relative pitches of successive syllables had to be analysed in terms of musical intervals between the high and the low declination line. This observation shows that listening musically is a completely different task, in which another perceptual mechanism is used than when prominence of accented syllables is perceived. Incidentally, the one subject with absolute pitch also matched pitch movements on an ERB-rate scale.

[Figures 5 and 6 not reproduced: "AVERAGES ACROSS CONSISTENT SUBJECTS" (Fig. 5) and "ONE 'MUSICALLY' LISTENING SUBJECT" (Fig. 6); in each, downward sessions on the left and upward sessions on the right, with axes in semitones, ERB, and Hz as in Fig. 4.]

FIG. 5. Average results of the matchings by five subjects showing consistency in their responses; otherwise, as Fig. 4.

FIG. 6. Results of six sessions with a fall as pitch movement, in which the subject matched the stimuli on a musical scale, and ignored the prominence of the syllables; otherwise, as Fig. 4.

II. DISCUSSION

The results show that pitch movements in speech intonation can best be expressed in a frequency scale that is derived from the frequency selectivity of the auditory system. Excursions that are equal when expressed in Hz or in semitones do not have the same prominence when presented in different registers. These results do not say anything about what in a pitch contour lends prominence to a syllable, e.g., the excursion size, or the slope of the pitch movements. Since the traditional critical-band scale with the Bark as unit is linear under 500 Hz, these results show that the ERB-rate scale is to be preferred as far as speech intonation is concerned.

The experiment included three different prominence-lending pitch movements. For all cases, the same conclusion could be drawn. These three pitch movements lasted a relatively short time and extended over not much more than one syllable. The experiment did not include slower pitch movements that extend over various syllables, as occur in Dutch. Declination was also included, but not varied as an independent variable. So, theoretically, it remains possible that declination and slow pitch movements are perceived on another scale. Nothing was observed that supported this idea, however. Therefore, it is concluded that the number of critical bands crossed by a pitch movement, or the velocity with which critical bands are crossed, determines the extent to which a pitch movement contributes to the prominence of a syllable.

There are some data in the psychoacoustic literature which corroborate these results, when speech is considered as a frequency-modulated (FM) sound signal. Based on Fechner's hypothesis that magnitude perception can be obtained by integration of just-noticeable differences (jnd's) [see also Suchowerskyj (1977) and Houtsma et al. (1980)], Zwislocki (1965, p. 49) derived a nearly exact Mel scale from jnd measurements for FM sine waves. On the other hand, Moore and Glasberg (1989) found that jnd's still differed by a factor of 2, when expressed as fractions of ERBs.


They compared these jnd's for FM within one tone with jnd's for frequency discrimination of two successive tones of constant frequency, and concluded that different mechanisms may account for the difference between these jnd's. This may also hint at different mechanisms underlying the perception of pitch in intonation and the perception of musical melodies, which, there is no doubt, are perceived as identical when they have pitches that run parallel on a logarithmic frequency scale. Klatt (1973) has measured jnd's for pitch discrimination and 't Hart (1981) for pitch distance discrimination in sounds which "included the dynamic qualities characteristic of speech" (Klatt, 1973, p. 8). Their data are not accurate enough, however, to draw any conclusions concerning the scale on which these jnd's are constant.

Also in vowel perception, there are some discussions on whether formant frequency is perceived on a logarithmic scale or on a scale derived from the frequency selectivity of the auditory system (e.g., Nearey, 1989; Miller, 1989). No conclusive experiments have been reported, however. This is probably partly due to the fact that no operational experimental paradigm is known in which equality in different frequency regions can be established.

From the result obtained in this study, some conclusions can be drawn on the way in which pitch is coded in the central nervous system. In speech intonation, the prominence that a pitch movement lends to a syllable appears to be well defined perceptually, so that excursion sizes in different frequency regions can be compared. This was used to determine on which frequency scale pitch movements in speech intonation are judged equal. A frequency scale derived from the frequency selectivity of the auditory system fitted the results best. Since, in speech, most harmonics have frequencies higher than 500 Hz, and also the ERB-rate scale is nearly logarithmic above 500 Hz, prominence of pitch movements would be perceived on an approximately logarithmic frequency scale, if perceived prominence were based on a combination of the excursions of the harmonics. Since this appears not to be the case, it must be concluded that perceived prominence is based on the course of pitch itself and not of its harmonics. This means that there is a pitch-coding array in the human speech processor. It has now been shown that this array has the same linear organization as the array of filters in the peripheral auditory system.

ACKNOWLEDGMENTS

This work was supported by the Instituut voor Doven, Sint-Michielsgestel. We are grateful to Hans 't Hart and Jacques Terken for their fruitful discussions and constructive comments on the manuscript.

Cooper, W. E., and Sorensen, J. M. (1981). Fundamental Frequency in Sentence Production (Springer-Verlag, New York).
Fletcher, H. (1940). "Auditory patterns," Rev. Modern Phys. 12, 47-65.
Fletcher, H. (1953). Speech and Hearing in Communication (Van Nostrand, New York), pp. 153-175.
Glasberg, B. R., and Moore, B. C. J. (1990). "Derivation of auditory filter shapes from notched-noise data," Hear. Res. 47, 103-138.
Graddol, D. (1986). "Discourse specific pitch behavior," in Intonation in Discourse, edited by C. Johns-Lewis (Croom Helm, London), pp. 221-237.
Greenwood, D. D. (1961). "Critical bandwidth and the frequency coordinates of the basilar membrane," J. Acoust. Soc. Am. 33, 1344-1356.
Greenwood, D. D. (1990). "A cochlear frequency-position function for several species--29 years later," J. Acoust. Soc. Am. 87, 2592-2605.
Hamon, C., Moulines, E., and Charpentier, F. (1989). "A diphone synthesis system based on time-domain prosodic modifications of speech," Proc. IEEE Int. Conf. Acoust. Speech Signal Process. ICASSP-89, pp. 238-241.
Houtsma, A. J. M., Durlach, N. I., and Braida, L. D. (1980). "Intensity perception XI. Experimental results on the relation of intensity resolution to loudness matching," J. Acoust. Soc. Am. 68, 807-813.
Klatt, D. H. (1973). "Discrimination of fundamental frequency contours in synthetic speech: implications for models of pitch perception," J. Acoust. Soc. Am. 53, 8-16.
Liberman, M. C. (1982). "The cochlear frequency map for the cat: Labeling auditory-nerve fibers of known characteristic frequency," J. Acoust. Soc. Am. 72, 1441-1449.
Miller, J. D. (1989). "Auditory-perceptual interpretation of the vowel," J. Acoust. Soc. Am. 85, 2114-2134.
Moore, B. C. J., and Glasberg, B. R. (1983). "Suggested formulae for calculating auditory-filter bandwidths and excitation patterns," J. Acoust. Soc. Am. 74, 750-753.
Moore, B. C. J., and Glasberg, B. R. (1986). "The role of frequency selectivity in the perception of loudness, pitch and time," in Frequency Selectivity in Hearing, edited by B. C. J. Moore (Academic, London), pp. 251-308.
Moore, B. C. J., and Glasberg, B. R. (1989). "Mechanisms underlying the frequency discrimination of pulsed tones and the detection of frequency modulation," J. Acoust. Soc. Am. 86, 1722-1732.
Nearey, T. M. (1989). "Static, dynamic, and relational properties in vowel perception," J. Acoust. Soc. Am. 85, 2088-2113.
Patterson, R. D. (1976). "Auditory filter shapes derived with noise stimuli," J. Acoust. Soc. Am. 59, 640-654.
Rietveld, A. C. M., and Gussenhoven, C. (1985). "On the relation between pitch excursion size and prominence," J. Phon. 13, 299-308.
Stevens, S. S., Volkman, J., and Newman, E. B. (1937). "A scale for the measurement of the psychological magnitude pitch," J. Acoust. Soc. Am. 8, 185-190.
Stevens, S. S., and Volkman, J. (1940). "The relation of pitch to frequency: a revised scale," Am. J. Psychol. 53, 329-353.
't Hart, J. (1981). "Differential sensitivity to pitch distance, particularly in speech," J. Acoust. Soc. Am. 69, 811-821.
Suchowerskyj, W. von (1977). "Beurteilung von Unterschieden zwischen aufeinanderfolgenden Schallen," Acustica 38, 131-139.
't Hart, J., Collier, R., and Cohen, A. (1990). A Perceptual Study of Intonation (Cambridge U. P., Cambridge, England).
Traunmüller, H., Branderud, P., and Bigestans, A. (1989). "Paralinguistic speech signal transformations," Phonetic Experimental Research, Institute of Linguistics, University of Stockholm (PERILUS) 10, 47-64.
Traunmüller, H. (1990). "Analytical expressions for the tonotopic sensory scale," J. Acoust. Soc. Am. 88, 97-100.
Wilson, J. P., and Evans, E. F. (1977). "Cochlear frequency map for the cat," in Psychophysics and Physiology of Hearing, edited by E. F. Evans and J. P. Wilson (Academic, New York), p. 69.
Zwicker, E., Flottorp, G., and Stevens, S. S. (1957). "Critical bandwidth in loudness summation," J. Acoust. Soc. Am. 29, 548-557.
Zwicker, E. (1961). "Subdivision of the audible frequency range into critical bands (Frequenzgruppen)," J. Acoust. Soc. Am. 33, 248.
Zwicker, E., and Terhardt, E. (1980). "Analytical expressions for critical-band rate and critical bandwidth as a function of frequency," J. Acoust. Soc. Am. 68, 1523-1525.
Zwislocki, J. (1965). "Analysis of some auditory characteristics," in Handbook of Mathematical Psychology, Vol. III, edited by R. D. Luce, R. R. Bush, and E. Galanter (Wiley, New York), pp. 1-97.
