paper - Institute for Systems Research

Acoustic measures for linguistic features distinguishing the semivowels /w j r l/ in American English

Carol Y. Espy-Wilson

Electrical, Computer and Systems Engineering Department, Boston University, 44 Cummington Street, Boston, Massachusetts 02215 and Research Laboratory of Electronics, Room 36-545, MIT, Cambridge, Massachusetts 02139

(Received 3 December 1990; accepted for publication 23 April 1992)

Acoustic properties related to the linguistic features which characterize the semivowels in American English were quantified and analyzed statistically. The features can be divided into those which separate the semivowels from other sounds and those which distinguish among the semivowels. The features of interest are sonorant, syllabic, consonantal, high, back, front, and retroflex. Acoustic correlates of these features were investigated in this study of the semivowels. The acoustic correlates, which are based on relative measures, were tested on a corpus of 233 polysyllabic words, each of which was spoken once by two males and two females. For the most part, the appropriate distinctions are made by the chosen acoustic properties for features. However, for each property, there was some overlap in the acoustic correlates of features for the sounds being distinguished. An examination of the sounds in the overlap regions reveals that their surface manifestation varies substantially from the canonical form. In large part, the observed variability can be explained in terms of changes due to feature spreading and lenition.

PACS numbers: 43.70.Fq, 43.72.Ar, 43.70.Hs

INTRODUCTION

An acoustic study of the sounds /w j r l/ was conducted as part of the development of a semivowel recognition system (Espy-Wilson, 1987). Recognition of the semivowels is a challenging task since, of the consonants, the semivowels are most like the vowels and, due to phonotactic constraints, they almost always occur adjacent to a vowel. Thus, acoustic changes between semivowels and vowels are often quite subtle so that there are no clear landmarks to guide the sampling of acoustic properties.

Many studies have examined some of the acoustic and perceptual properties of one or more of the semivowels in English (Lisker, 1957; O'Connor et al., 1957; Lehiste, 1962; Kameny, 1974; Dalston, 1975; Bladon and Al-Bamerni, 1976; Bond, 1976). These studies have primarily focused on the acoustic and perceptual cues that distinguish among the semivowels and the coarticulatory effects between semivowels and adjacent vowels. For the most part, these studies have looked at simple contexts and a limited set of acoustic properties.

While the results of past work were used to guide the present examination of the semivowels, this study differs from previous research in that the acoustic properties investigated were chosen to be closely related to the abstract linguistic features which comprise a phonological description of the semivowels. We examine acoustic properties for features that not only distinguish among the semivowels, but that also separate the semivowels from other sounds. The acoustic properties were analyzed to quantify how the surface manifestation of the semivowels changes with context.

The results obtained support previous findings, namely that formant information can be used to distinguish among the semivowels. In addition, there are new findings about the variability of some acoustic properties assumed to be associated with the semivowels. For example, not all prevocalic and intervocalic /l/'s are associated with spectral discontinuities. Finally, the acoustic properties of some of the other sounds also change with context. In particular, some underlyingly voiced obstruents surface as sonorant consonants and, in some cases, they resemble one or more of the semivowels. Some of these variations were also examined. We will argue that the variability observed in the acoustic manifestation of the semivowels and some of the other sounds can be characterized in terms of changes in the phonological features.

I. REVIEW OF THE ACOUSTIC PROPERTIES OF SEMIVOWELS

Spectrograms of the semivowels are shown in Figs. 1 and 2 where they occur in word-initial position before the front vowel /i/ and the back vowel /u/. As can be seen, the semivowels have properties that are similar to both vowels and consonants. Like the vowels, the semivowels are produced orally without complete closure of the vocal tract and without any frication noise. As is also true for the vowels, the degree of constriction needed to produce the semivowels does not inhibit voicing. Thus, as shown in these figures, the semivowels and vowels are both voiced with no evidence of frication noise. In addition, the slower rate of change of the constriction size for the semivowels than other consonants results in slower spectrum changes for these sounds compared to other consonants. For example, the spectrogram of the word "you" in Fig. 1 shows that the formants during the /j/ stay relatively constant for about 130 ms before they move toward the appropriate values for the following vowel. Thus, as in the case of vowels, a voiced steady state is often

J. Acoust. Soc. Am. 92 (2), Pt. 1, August 1992    0001-4966/92/080736-22$00.80    © 1992 Acoustical Society of America

FIG. 1. Wideband spectrograms of the words "we" and "ye" (top), and "woo" and "you" (bottom).

FIG. 2. Wideband spectrograms of the words "lee" and "re" (top), and "rou" and "lou" (bottom).

observed in spectrograms of the semivowels.

Like the other consonants, the semivowels usually occur at syllable margins. That is, they generally do not have or constitute a peak of sonority. (Sonority, in this case, is equated with some measure of acoustic energy.) As shown in Figs. 1 and 2, one or more of the formants during the semivowels is considerably lower in amplitude than it is during the following vowels. In the case of /w/, it is F3 and the higher formants which are weaker. In the case of /j/, F3 and the higher formants are fairly strong, but F2 is not. For /l/, there is less energy in the high-frequency range starting around F4 for "lee" and F3 for "lou." Finally, F3 and the higher formants are lower in amplitude during /r/. The relatively low amplitude of the semivowels as compared to the vowels is probably due to a combination of factors: a low-frequency first formant (Fant, 1960), a large F1 bandwidth caused by the narrower constriction (Bickley and Stevens, 1986), or interaction between the vocal folds and the constriction (Bickley and Stevens, 1986). At present, this phenomenon is not well understood.

The semivowels /w/ and /j/ are often referred to as glides or transitional sounds. They are produced with constant motion of the articulators. Consequently, the formants in the transition toward or away from adjacent vowels exhibit a smooth gliding movement. The semivowels /w/ and /j/ are produced with vocal-tract configurations similar to those of the vowels /u/ and /i/, respectively, but with a more extreme constriction. As a result, /w/ has lower F1 and F2 frequencies than /u/, and /j/ has a lower F1 frequency and usually a higher F2 or F3 frequency than /i/. These differences can be seen in the words "woo" and "ye" of Fig. 1.

The glides occur in prevocalic and intervocalic positions within a word, such as the /j/ in "you" and "yo-yo" and the /w/ in "we" and "away." In addition, they often occur phonetically (even though they are not phonologically specified) as part of the transition between two adjacent vowels. An example of this manifestation of a glide is the intervocalic /j/ sound often observed between /i/ and /o/ in the pronunciation of "radiology" ([redijologi] vs [rediologi]).

The semivowels /l/ and /r/ are often referred to as liquids. Sproat and Fujimura (submitted) found from articulatory and electromyographic data obtained from several speakers that the production of all English /l/'s involves both an apical and a dorsal gestural component. The key articulatory distinction between the two well established variants of /l/, light or clear /l/ (as in "Lee") and dark /l/ (as in "feel"), is that in dark /l/ the tongue body is more retracted than in light /l/, resulting in a much lower F2. Sproat and Fujimura argue that this allophonic variation is not categorical, but is the degree to which the apical and dorsal gestures are realized and the timing between the two gestures. Specifically, they found that in addition to a significantly greater retraction of the tongue dorsum for dark /l/ compared to light /l/, the maximum tongue dorsum position for the dark /l/ is achieved well in advance of the maximum tongue tip position. On the other hand, during the gesture for the light /l/, the tongue tip position is reached before the tongue dorsum position is achieved.

In the case of light /l/, the apical gesture usually involves the placement of the center of the tongue tip against the alveolar ridge. The often rapid release of the tongue tip from the roof of the mouth results in a spectral discontinuity between the /l/ and the following vowel (Dalston, 1975). Joos (1948) reported that /l/ is always marked at its beginning and/or end by an abrupt shift in the formant pattern. Along this line, Fant (1960) observed that the identification of an /l/ relies on a sudden shift up of F1 from the /l/ into the following vowel. Finally, Dalston (1975) found that this abrupt shift in F1 is often accompanied by a transient click in the acoustic spectrum. Some of these properties can be observed at the boundary between the /l/ and the following vowels in Fig. 2.

In the case of dark /l/, Sproat and Fujimura (submitted) report that apical contact is less robust even though it was made during all of the /l/ productions in their study. Giles and Moll (1975) found in an x-ray study of English /l/ that apical contact for dark /l/'s was not always achieved for all speakers and is dependent upon phonetic context and speaking rate. In addition, they found that the mean peak velocity of the tongue apex movement is significantly slower for dark /l/. Furthermore, they found that dark /l/ shows undershoot of articulatory positions with increases in speaking rate. This slower and incomplete apical gesture may help explain why dark /l/ productions are not associated with an abrupt spectral change.

Although the distribution of dark and light /l/ varies across speakers, canonical syllable-final /l/ is dark and, in many dialects, syllable-initial /l/ is light. Sproat and Fujimura (submitted) found in their study of preboundary¹ intervocalic /l/ in the falling stress context /i_ɪ/ that the quality of the /l/ depends upon the phonetic duration of the rime which contains it. (They also show a correlation between the duration of the preboundary rime and the strength of the phonological boundary.) Specifically they found that as the rime becomes longer, the tongue body for /l/ becomes lower and more retracted. Therefore, intervocalic /l/ occurring before a major intonation boundary is dark. On the other hand, they found that the preboundary /l/ preceding the weakest boundaries are as light as initial /l/. These data support previous findings by Lehiste (1962) and Bladon and Al-Bamerni (1976) which show that certain preboundary intervocalic /l/ productions are lighter in quality than preboundary /l/ in prepausal position.

American /r/ may be produced with either a retroflexed or bunched articulation (Delattre and Freeman, 1968). If the upper constriction is at the palate, it is made with either the tongue tip or the tongue blade. If, instead, the upper constriction is further back near the velum, it is made with the tongue body. It is the palatal or palato-velar constriction which lowers F3 (Delattre and Freeman, 1968; Stevens, in preparation), whereas the pharyngeal constriction lowers F2 and raises F1 (Delattre and Freeman, 1968). In terms of perception, Delattre and Freeman found with the use of an electric mouth analog that the palatal constriction is primary in terms of producing the /r/. However, they found that as the pharyngeal constriction is narrowed, the auditory impression of the /r/ is enhanced.

Regardless of whether a bunched or retroflexed /r/ is produced, lip rounding may occur when /r/ is either prevocalic or intervocalic and before a stressed vowel (Delattre and Freeman, 1968). The acoustic consequence of lip rounding is a lowering of all formants. This effect may account for the lower F1, F2, and F3 Lehiste (1962) observed for initial /r/ allophones relative to final /r/ allophones, for which lip rounding does not usually occur.

In summary, many of the acoustic properties which characterize the semivowels have been examined in previous studies. However, this work has largely focused on formant measurements. In this investigation of the semivowels, acoustic properties calculated from formant measurements and energy-based parameters are quantified and analyzed so that we can study to a greater extent some of the variability that occurs in the surface forms of the semivowels. Furthermore, the acoustic properties are related to the linguistic features which provide a framework for understanding the changes that occur.

II. METHOD

A. Stimuli

A data base of 233 polysyllabic words containing semivowels in a variety of phonetic environments was selected from the 20 000-word Merriam-Webster Pocket dictionary. The semivowels occur adjacent to voiced and unvoiced consonants, as well as in word-initial, word-final, and intervocalic positions. (Note that only /l/ and /r/ occur postvocalically.) The semivowels occur adjacent to vowels which are stressed and unstressed, high and low, and front and back. In developing the data base, words were chosen that contained several semivowels so that they satisfy more than one category. The distribution of the semivowels in terms of word position and stress is given in Table I. Examples of words in these categories are given in Table II. While most of the contexts contain several words, there are a few for which only a small number of words were available in the dictionary. For example, only the word "Ghanaian" had a poststressed intervocalic /j/.

TABLE I. Distribution of semivowels in the test words. The number in parentheses specifies the number of semivowels occurring next to a vowel with primary stress.

Category                                  w        j        r        l
Prevocalic                                65       41       63       52
  word-initial (prestressed)              11(9)    8(5)     10(5)    6(3)
  stop cluster (prestressed)              18(8)    11(4)    18(10)   10(4)
  fricative cluster (prestressed)         19(13)   7(3)     18(9)    18(6)
  stop fricative cluster (prestressed)    10(7)    8(3)     10(7)    10(5)
  adjacent to sonorant consonant          7        7        7        9
Intervocalic                              9        6        29       32
  poststressed                            2        1        13       16
  prestressed                             5        4        10       8
  unstressed                              2        1        6        8
Postvocalic                                                 25       26
  word-final (poststressed)                                 13(10)   19(14)
  obstruent cluster                                         6        4
  adjacent to sonorant consonant                            6        3

TABLE II. Examples of test words.

Category                           w              j               r              l
Prevocalic
  word-initial: prestressed        walnut         yell            requiem        leapfrog
                other              walloon        eurologist      rhinoceros     linguistics
  stop cluster: prestressed        aquarius       pule            brilliant      bless
                other              quadruplet     bucolic         fibroid        chlorination
  fricative cluster: prestressed   swollen        view            frivolous      flourish
                other              Swahili        behavior        anthrax        grizzly
  stop fricative cluster: prestressed  disqualify spurious        astrology      exclaim
                other              misquotation   promiscuously   widespread     exploitation
  adjacent to sonorant consonant   carwash        brilliant       walrus         harlequin
Intervocalic
  poststressed                     forward        Ghanaian        caloric        astrology
  prestressed                      unaware        reunion         fluorescence   unilateral
  unstressed                       unctuous       diuretic        correlation    fraudulent
Postvocalic
  word-final: poststressed                                        clear          dwell
              other                                               memoir         whippoorwill
  obstruent cluster                                               cartwheel      oneself
  adjacent to sonorant consonant                                  forewarn       walnut

B. Speakers and recordings

For recording, the words were embedded in the carrier phrase "___ pa." The final "pa" was added in order to avoid glottalization and other types of utterance-final variability. Each word was spoken once by two males and two females. Given that there are several words in most categories (see Table I), one repetition of each word by each speaker provides at least 24 to 260 productions of each semivowel in each major category, e.g., prevocalic /w/. In fact, as explained in Sec. II C, some categories may contain more instances of a semivowel than what is indicated in Table I since speakers often insert semivowels between adjacent vowels.

The speakers were students and employees at the Massachusetts Institute of Technology. The female speakers were from the northeast and the male speakers were from the midwest. All were native speakers of English and reported having normal hearing. The speakers were recorded in a quiet room with a pressure-gradient close-talking noise-canceling microphone (part of Sennheiser HMD 224X microphone/headphone combination). They were instructed to say the utterances at a natural pace.
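The range of 24 to 260 productions quoted above follows directly from the category totals of Table I multiplied by the four speakers. The short Python sketch below reproduces that arithmetic; the dictionary layout and the function name are illustrative choices, not part of the original study.

```python
# Category totals for each semivowel, copied from Table I.
TABLE_I_TOTALS = {
    "prevocalic": {"w": 65, "j": 41, "r": 63, "l": 52},
    "intervocalic": {"w": 9, "j": 6, "r": 29, "l": 32},
    "postvocalic": {"r": 25, "l": 26},
}

# Two males and two females, each saying every word once.
N_SPEAKERS = 4


def productions_per_category(totals, n_speakers):
    """Return {(category, semivowel): number of recorded productions}."""
    return {
        (category, semivowel): count * n_speakers
        for category, counts in totals.items()
        for semivowel, count in counts.items()
    }


productions = productions_per_category(TABLE_I_TOTALS, N_SPEAKERS)
fewest = min(productions.values())  # intervocalic /j/
most = max(productions.values())    # prevocalic /w/
```

The smallest and largest entries correspond to intervocalic /j/ and prevocalic /w/, respectively.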

C. Initial processing

The utterances were digitized using a 6.4-kHz low-pass filter and a 16-kHz sampling rate. The speech signals were also pre-emphasized to compensate for the relatively weak spectral energy at high frequencies (a particular issue for sonorants). Finally, the test words were excised and hand transcribed. This process resulted in 2378 vowels, 1689 semivowels, 479 nasals, and 1894 obstruents (stops, fricatives, and affricates). Specific characteristics of measures and measurement procedures will be indicated in Sec. III.

Segmentation and labeling of the waveforms was performed by the author with the help of playback and displays of several attributes including LPC and wide-band spectra, the speech signal and various bandlimited energy waveforms (Cyphers, 1985; Shipman, 1982; Zue et al., 1986). The Merriam-Webster Pocket dictionary provided a baseline phonemic transcription of the words. However, modifications of some of the labels were made based on the speakers' pronunciations. In addition, when transcribing the data base, we did not normally consider the /w/ and /j/ offglides of diphthongs as being separate from the vowel. In some instances, however, the offglide of a diphthong which was followed by another vowel was articulated with a narrow enough constriction that a semivowel label was inserted. On the other hand, some underlying postvocalic liquids, particularly /l/, in words like "almost" were not always clearly heard. In these instances, the liquid was often omitted from the transcription.

D. Feature analysis

Distinctive feature theory was used to provide a guide to understanding what acoustic properties we should look for to characterize the semivowels. In addition, distinctive feature theory provides a basis for understanding how the acoustic properties of the semivowels may change as a function of context. A feature specification of the semivowels is given in Tables III and IV. Table III contains features that separate the semivowels as a class from other sounds and Table IV contains features that distinguish among the semivowels.

TABLE III. Features which characterize various classes of consonants. A "+" indicates that the designated feature is present in the representation of the sound, and a "-" indicates the absence of the feature.

                                  Sonorant   Syllabic   Nasal
Fricatives, stops, affricates        -          -         -
Semivowels                           +          -         -
Nasals                               +          -         +
Vowels                               +          +         -

TABLE IV. Features which discriminate among the semivowels. A "+" indicates that the designated feature is present in the representation of the sound, and a "-" indicates the absence of the feature.

                    Consonantal   High   Back   Front   Retroflex
/w/                      -          +      +      -         -
/j/                      -          +      -      +         -
/r/                      -          -      -      -         +
prevocalic /l/           +          -      -      -         -
postvocalic /l/          -          -      +      -         -

The features listed are modifications of ones proposed by Jakobson et al. (1952) and later by Chomsky and Halle (1968). For example, in Table IV, we list both the features back and front. For practical reasons, we chose to use both features and classify /r/ and prevocalic /l/ as -back and -front.² Their F2 values clearly lie between those of the back and rounded semivowel /w/ and the front semivowel /j/. We also found it necessary to distinguish between initial and final /l/ allophones on the basis of the features consonantal and back. As stated earlier, several researchers have observed a sharp spectral discontinuity between a prevocalic /l/ and a following vowel due to the rapid release of the tongue tip from the alveolar ridge, as we would expect with a change in the feature consonantal. On the other hand, in the production of postvocalic /l/, alveolar contact is often not realized or is realized only gradually, so that the spectral change between it and a preceding vowel is usually gradual. In addition, a final /l/ is more velarized than an initial /l/. Thus, F2 is much lower and close in value to that of the back and rounded /w/.

Finally, the feature retroflex is used to distinguish /r/ from all other sounds. Although the term "retroflex" is used, this feature relates to the acoustic consequence of either a bunched or retroflexed tongue shape.

Table V shows the acoustic properties for the features in Tables III and IV (with the exception of the feature nasal) and the parameters used for their extraction. To make them insensitive to variations in speaker, speaking rate, and speaking level, all of the properties are based on relative measures instead of absolute thresholds. That is, a measure either examines an attribute in one speech frame³ in relation to another frame, or, within a given frame, examines one part of the spectrum in relation to another nearby part of the spectrum.

In the following sections, we will examine how well the properties listed in Table V distinguish among the semivowels and separate them from other sounds. The results also indicate how the characteristics of the semivowels and other sounds change as a function of context.

III. RESULTS AND DISCUSSION

A. Sonorant measure

Like the vowels, the semivowels are sonorant sounds. That is, the main source of excitation is at the glottis, so that all of the natural frequencies of the vocal tract are excited. Thus, unlike the obstruents, where the main source of excitation is further forward in the vocal tract, there is significant energy at low frequencies. The only other consonants that share these properties are the nasals.

The parameter used to extract the acoustic correlate of the sonorant feature is the bandlimited energy computed from 100 to 400 Hz. More specifically, the value of the parameter in each frame is the difference (in dB) between the maximum energy within the word and the energy in each frame. An example of this parameter is shown in the lower part of Fig. 3 for the word "chlorination." The energy difference is small in the sonorant regions (vowels, semivowels, and nasals), and is large in the obstruent regions (stops, fricatives, and affricates).
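The sonorant parameter just described can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the 25-ms Hamming window, the 5-ms hop, the FFT-based band energy, and the function names are choices made here and are not taken from the study. The difference is expressed as frame energy minus the word maximum, so values are at most 0 dB; that sign convention is also an assumption.

```python
import numpy as np


def band_energy_db(signal, fs, f_lo, f_hi, frame_len=0.025, hop=0.005):
    """Per-frame energy (dB) in the band [f_lo, f_hi] Hz.

    Frame length, hop, and Hamming window are illustrative assumptions.
    """
    n = int(round(frame_len * fs))
    h = int(round(hop * fs))
    if len(signal) < n:
        raise ValueError("signal is shorter than one analysis frame")
    window = np.hamming(n)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    n_frames = 1 + (len(signal) - n) // h
    energy = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * h:i * h + n] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        energy[i] = power[in_band].sum()
    # Small floor avoids log of zero in silent frames.
    return 10.0 * np.log10(energy + 1e-12)


def sonorant_measure(signal, fs):
    """Low-frequency (100-400 Hz) energy of each frame relative to the
    maximum within the word, in dB (0 at the maximum, negative elsewhere)."""
    energy_db = band_energy_db(signal, fs, 100.0, 400.0)
    return energy_db - energy_db.max()


# Synthetic check: a 200-Hz tone (strong low-frequency energy) followed by
# a 3000-Hz tone (almost no energy between 100 and 400 Hz).
fs = 16000
t = np.arange(int(0.2 * fs)) / fs
x = np.concatenate([np.sin(2 * np.pi * 200 * t), np.sin(2 * np.pi * 3000 * t)])
measure = sonorant_measure(x, fs)
```

In the synthetic example the measure stays near 0 dB during the low-frequency tone and drops far below it during the high-frequency tone, which mirrors the small and large energy differences described for sonorant and obstruent regions.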

Figure 4 shows how all of the sounds differ in sonority, as determined with this measure. For each sound, the minimum energy difference occurring within the hand-transcribed region is used. There is considerable overlap between the distributions of the vowels, semivowels, and nasals. If we set a threshold of -20 dB to divide sonorant and nonsonorant sounds, then only about 12.5% of the typically nonsonorant consonants overlap with the sonorant sounds. Of these consonants, 72% are produced with a weakened constriction (this process referred to as lenition is discussed in Catford, 1977) so that they are realized as sonorants. Two examples are shown in Fig. 5, which contains a spectrogram of the word "everyday." Both the /v/ and /d/ surface as sonorant consonants.

The remaining segments which overlap with the sonorants are the closed portions of voiced stops. The low-frequency energy they exhibit is presumably caused by vibrations of the vocal cords which are transmitted through the tissues around the neck.

TABLE V. Mapping of features into acoustic properties. B0, B1, B2, B3, and B4 are the bark transformations of F0, F1, F2, F3, and F4, respectively.

Feature       Acoustic correlate                     Parameter                              Property
Sonorant      No significant decrease in energy      Energy 100-400 Hz                      high(a)
              at low frequencies
Nonsyllabic   Dip in midfrequency energy             Energy 640-2800 Hz                     low(a)
                                                     Energy 2000-3000 Hz                    low(a)
Consonantal   Abrupt amplitude change                First difference of adjacent spectra   high
High          Low F1 frequency                       B1-B0                                  low
Back          Low F2 frequency                       B2-B1                                  low
Front         High F2 frequency                      B3-B2                                  low
                                                     B4-B3                                  low
Retroflex     Low F3 frequency and                   B4-B3                                  high
              close F2 and F3                        B3-B2                                  low

(a) Relative to a maximum value within the utterance.

FIG. 3. An illustration of the parameter used to capture the feature sonorant. (a) Wideband spectrogram of the word "chlorination." (b) The difference between the maximum value of the low-frequency energy (computed from 100 to 400 Hz) in the word and the value in each frame.

FIG. 5. A spectrogram of the word "everyday" which contains the two obstruents /v/ and /d/ that have undergone lenition.

B. Consonantal measure

Consonantal sounds are produced with a narrow constriction at some point along the midline of the vocal tract. Due to the narrow constriction, the release of the consonantal sound into the following vowel involves rapid movement of some of the formants. The result of this formant movement is an abrupt change in the spectrum over at least some part of the frequency range (Stevens and Keyser, 1989).
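One way to quantify an abrupt spectral change of this kind is to take first differences in time of a set of frequency-channel outputs and sum the decreases and the increases separately, which is the idea behind the onset and offset waveforms used in this section. The Python sketch below is a minimal illustration, assuming the channel outputs are already available as a frames-by-channels array in dB-like units; it does not implement the auditory model, and the function name and the synthetic data are hypothetical.

```python
import numpy as np


def onset_offset_waveforms(channel_outputs):
    """Global onset and offset waveforms from channel outputs.

    channel_outputs: array of shape (n_frames, n_channels), assumed to be
    in dB-like units. Returns two arrays of length n_frames - 1.
    The onset waveform sums the negative first differences in time (a drop
    in amplitude going into the consonant); the offset waveform sums the
    positive first differences (a rise in amplitude coming out of it).
    """
    diffs = np.diff(np.asarray(channel_outputs, dtype=float), axis=0)
    onset = np.minimum(diffs, 0.0).sum(axis=1)
    offset = np.maximum(diffs, 0.0).sum(axis=1)
    return onset, offset


# Synthetic vowel-consonant-vowel example with three channels:
# frames 0-9 vowel (60), frames 10-19 consonant (40), frames 20-29 vowel (60).
levels = np.concatenate([np.full(10, 60.0), np.full(10, 40.0), np.full(10, 60.0)])
channels = np.tile(levels[:, None], (1, 3))
onset, offset = onset_offset_waveforms(channels)
```

In this example the onset waveform has a single valley at the vowel-to-consonant boundary and the offset waveform a single peak at the consonant-to-vowel boundary, each with magnitude equal to the 20-unit step summed over the three channels.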

The parameterusedto capturethe rate of spectral changebetweenconsonants and vowelsis basedon the outputsera bankof 40 linearcriticalbandfiltersto whichsome nonlinearities(designedto model the hair-cell/synapse transduction process in theinnerear) areappliedto enhance onsetsand offsets(Seneft,1986). An exampleis shownin part (b) of Fig. 6 for the word "correlation."The wave-

formsthat are spacedabouta half bark apart showsharp onsetsandoffsetsbetween/l/and thesurroundingvowelsin the frequencyregionbetween800 and 1200Hz and between 1800 and 2400 Hz.

Based on the first differences in time of waveforms like

the onesshownin part (b) of Fig. 6, we computedglobal onset and offset waveforms

vowels semivowels na•abobslruenls

for each consonant. The onset

waveformis computedby summing,in eachframe,all the negativefirst differencesin time. Similarly,the offsetwaveform is obtainedby summing,in eachframe,all the positive firstdifferences in time of the channeloutputs.The resulting onset and offset waveforms for the word "correlation"

Class

of

sound

FIG. 4. Averages andstandard deviations ofthechange(in dB} in thelowfrequency energycomputedfrom 100to 400 Hz withinSOhorant andnonsonorantsoundswith respectto themaximumenergywithintheword. 741

J. Acoust.Sec. Am.,Vol.92, No. 2, Pt. 1. August1992

are

shownin parts (c) and (d) of Fig. 6 wherethe sharpamplitude changesbetweenthe/1/and the surroundingvowels showup asa valleyanda peak,respectively. Note that since a 25-ms time window is used,there is a limit to the maximum CarolY. Espy-Wilson: Acousticmeasuresfor semivowels

741

40

"CORRELATION" 4.5



.

'r•1

I•!!

%', c,', • ?' .

'

kHz ,•

,

"

'"

•I?

(a)

20

.. ' . :•

10

'.•.

o o.o

30

0

0.85

w,j,r

I

nasals

obstruents

Sound

(b)

6400Hz.

40

200

Hz

30

0.0

0.85

Amplitude 0J•-'--'"'N/-•-'-"/"-• (c)

-Oj.o * ffset Onse%

I 0.85

2O

[] A

I

lO

nasal obstruent

o -25

0.0

0.85

TIME(seconds) FIG. 6. An illustrationof parameterswhich captureabrupt amplitude changes. (a) Widebandspectrogram of"correlation."(b) Channeloutputs of an auditorymodelwhichshowabruptspectralchanges in two frequency regions between the/1/and adjacentvowels.(c) Onsetwaveform(computed from the sumof the negativefirstdifferences of the channeloutputs) whichshowsa sharpvalleyat the onsetof the/l/. (d) Offsetwaveform (computedfromthesumof thepositivefirstdifferences of thechanneloutputs) whichshowsa sharppeakat theoffsetof the/1/.

-15

-10

-5

Intensity

o

(c) .5

-lO.

-t5.

rate of changethat canbe capturedby thismeasure. The onsetand offsetwaveformswereexaminedduring the time intervalbetweeneachconsonantand its neighboringvowel(s). We definedthe onsetof the consonantto be the maximum absolutevalue of the onsetwaveformoccurring betweenthe precedingvowel and the consonant.Likewise, we defined the offset of the consonant to be the maximum

value of the offsetwaveform occurringbetweenthe consonant and the followingvowel.The time at which thesevalues occurare indicatedby arrowsin parts (c) and (d) of Fig. 6. Figure 7 showsthe data on the onsetsand offsetsacross

.2o.

-25

I

nasals

obstruents

Sound

FIG. 7. Averagesand standarddeviationsof (a) the offsetsbetweenprevocalicconsonants and followingvowels,(b) the onsetsand offsetsbetween intervocalicconsonantsand adjacentvowels,and (c) the onsetbetween postvocalic consonants and precedingvowels.

all wordsand all speakers.The unitsof the onsetand offset valuesarelike dB sincethechanneloutputsafter nonlinearitieshavebeenappliedare approximatelylinear with ampli-

is oftena wide spreadin the distributionof onsetand offset

tude at low signal levels and logarithmic at higher signal levels (Seneft, 1986, p. 88). The data for/1/are separated from/w j r/since, of the semivowels,/1/is most associated

stresspattern of the words and the rate of spectral change betweenthe consonantsand adjacentvowels.That is, onsets

values.

We also observeda strong relationshipbetweenthe

with spectraldiscontinuities.Severalobservationscan be made from the data. First, in general,the spectralchanges betweenobstruentconsonants andadjacentvowelsaremore rapidthan thespectralchangesbetweensemivowelsand adjacentvowels.Second,the spectralchangebetween/1/and adjacentvowelstendsto be more abruptthan the spectral changebetweenthe othersemivowels and adjacentvowels.

and offsetsassociated with consonants that precedestressed vowelsare significantlystrongerthan thoseassociated with consonants that precedeunstressed vowels,presumablybecausetheconstrictionistighterandthe releaseis morerapid. For example,comparethe rate of spectralchangebetween the prevocalic/1/and adjacentvowelsin the words"blurt" and "linguistics,"and betweenthe intervocalic/1/and surroundingvowelsin "walloon" and "swollen"shownin Fig.

However, as can be seenfrom the standard deviations, there

8. The offsetassociatedwith the/1/in

742

J. Acoust. Soc. Am., Vol. 92, No. 2, Pt. 1, August 1992

"blurt" (at about 130

Carol Y. Espy-Wilson:Acoustic measures for semivowels

742

"blurt"

"linguistics"

dB

,'

o.o

oJ

oz

•s

o.,

I I,'

't

i

o.s

Time(seconds)

Time Iseconds)

Amplitude

s ?

ms) is much more abrupt than the one associated with the /l/ in "linguistics" (at about 145 ms). Similarly, the onset and offset associated with the intervocalic /l/ in "walloon" (at 190 and 260 ms, respectively) are much more abrupt than those associated with the intervocalic /l/ in "swollen" (at 350 and 410 ms, respectively).

FIG. 8. An illustration of the rate of spectral change associated with the /l/'s in "blurt," "linguistics," "walloon," and "swollen." (a) Wideband spectrograms. (b) Offset waveform. (c) Onset waveform.

C. Syllabic measure

Because they are more constricted and hence have a relatively low F1, the semivowels usually have considerably less energy in the low- to midfrequency range than the vowels. Like other consonants, the semivowels usually occur as nonsyllabic sounds adjacent to syllable nuclei at a syllable boundary. That is, they generally do not have or constitute a peak of sonority, where we are equating sonority in this case with a midfrequency acoustic energy measure. An acoustic manifestation of a syllable boundary appears to be a significant dip within some bandlimited energy contour.

To assess the difference in energy between semivowels and vowels, and, more generally, between consonants and vowels, we used two bandlimited energies in the frequency ranges 640-2800 Hz and 2000-3000 Hz. We chose the frequency range 640-2800 Hz because, relative to the vowels, the lower F1 for the semivowels is expected to cause a decrease in the amplitudes of the formants in this region. However, we found that several intervocalic /r/'s have energy levels in this range which do not differ from those found on surrounding vowels, presumably because of the proximity of F2 and F3. To avoid this problem, we also examined the bandlimited energy from 2000 to 3000 Hz. Since F3 is normally between 2000 and 3000 Hz for vowels, but falls near or below 2000 Hz for /r/, /r/ will usually be considerably weaker in the 2000- to 3000-Hz range than an adjacent vowel(s).

J. Acoust. Soc. Am., Vol. 92, No. 2, Pt. 1, August 1992

FIG. 9. A schematic of an energy waveform for a vowel-consonant-vowel (VCV) sequence. The extrema within the waveform are used to compute the energy difference between consonants and vowels. In the case of prevocalic consonants (V1 does not exist), points C and B are used. In the case of postvocalic consonants (V2 does not exist), points A and B are used. Finally, in the case of intervocalic consonants, the smaller of the difference between points C and B and between points A and B is taken as a measure of the energy dip.

Measurements of the midfrequency energy of semivowels are based on energy contours like the one in Fig. 9. All measures are relative to energy in an adjacent vowel. The depth of the energy dip is considered to be the difference (in dB) between the minimum energy within the consonant, point B, and the maximum energy within the adjacent vowel(s), point A and/or point C. In the case of syllables with prevocalic consonants, the difference in energy between the prevocalic consonant (point B) and the following vowel (point C) was computed. For syllables with postvocalic consonants, the difference in energy between the postvocalic consonant (point B) and the preceding vowel (point A) was computed. Finally, for intervocalic consonants, both the differences in energy at points C and B and points A and B were computed. The depth of the energy dip was taken to be the smaller of the two differences.
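The dip measure described above can be sketched in a few lines of Python. This is only an illustration, assuming the bandlimited energy values (in dB) at points A, B, and C have already been read off the energy contour; the function name and parameter names are hypothetical and do not come from the study.

```python
from typing import Optional


def energy_dip_db(consonant_min_db: float,
                  prev_vowel_max_db: Optional[float] = None,
                  next_vowel_max_db: Optional[float] = None) -> float:
    """Depth of the energy dip (dB) of a consonant relative to adjacent vowel(s).

    consonant_min_db  : minimum energy within the consonant (point B)
    prev_vowel_max_db : maximum energy in the preceding vowel (point A), if any
    next_vowel_max_db : maximum energy in the following vowel (point C), if any
    """
    differences = []
    if prev_vowel_max_db is not None:
        # Postvocalic or intervocalic case: point A minus point B.
        differences.append(prev_vowel_max_db - consonant_min_db)
    if next_vowel_max_db is not None:
        # Prevocalic or intervocalic case: point C minus point B.
        differences.append(next_vowel_max_db - consonant_min_db)
    if not differences:
        raise ValueError("at least one adjacent vowel is required")
    # Intervocalic consonants have two differences; the smaller one is kept.
    return min(differences)
```

With made-up values of 62 dB at A, 50 dB at B, and 58 dB at C, the intervocalic dip would be 8 dB, the smaller of the two differences (12 dB and 8 dB).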

As a basis for comparison with the semivowels, the depths of several types of intravowel energy dips were computed as well. An illustration of this procedure is shown in Fig. 10, which shows a schematic representation of an energy contour of a vowel. First, an estimate of the natural rise in energy within word-initial vowels was computed by calculating the energy difference at points W and T. This vowel energy onset is compared with the energy difference between prevocalic consonants and following vowels. Second, an estimate of the natural energy taper within word-final vowels was computed by calculating the energy difference at points W and Z. This vowel energy offset is compared with the energy difference between postvocalic consonants and preceding vowels. Finally, in cases where there was an intravocalic dip, X, we computed the difference between the energy at points W and X and between points Y and X. In this case, the smaller of the two differences was recorded. Of course, not all vowels will have this type of energy waveform shape so that there will not always be a point X and a point Y. In these cases, the intravowel energy dip is simply 0 dB. This energy measure is compared with the energy difference between intervocalic consonants and surrounding vowels.

FIG. 10. A schematic of an energy waveform for a vowel. The energy difference between points W and T is used to determine the energy rise within word-initial vowels. The energy difference between points W and Z is used to determine the energy taper within word-final vowels. The smaller of the energy differences between points W and X and between points X and Y is used in all vowels with the appropriate energy waveform to determine within-vowel energy dips.

FIG. 11. Averages and standard deviations of the midfrequency energy changes (in dB) between consonants and adjacent vowels and within vowels. Data are shown separately for the (a) prevocalic, (b) intervocalic, and (c) postvocalic consonants. [Axis label: Energy Dip 640-2800 Hz (dB). Legend: vowel, semivowel, nasal, obstruent.]

The results of these measurement procedures are plotted separately in Fig. 11 for prevocalic, intervocalic, and postvocalic consonants. In each plot, the consonants are divided into obstruents, nasals, and semivowels. Also included in the figure are the data for the energy changes within vowels. The data show that the difference in midfrequency energy between the consonants and vowels is, on average, much greater than the energy change within vowels. Of the energy changes between consonants and adjacent vowels, the energy change associated with the semivowels is almost always smallest. In addition, as the standard deviations show, there is sometimes considerable overlap between the distributions of the energy changes within vowels and the energy changes between semivowels and adjacent vowels. On closer examination of these data, patterns in their distribution emerge across consonantal contexts.

1. Prevocalic consonants

In general, the difference in midfrequency energy between the prevocalic semivowels and following vowels is greater than the midfrequency energy change within the beginning portions of word-initial vowels. However, word position has a strong effect on the phonetic realization of the semivowels. There is a more significant energy change between semivowels and following vowels if the semivowel is not in a cluster with another consonant, but is word-initial. Furthermore, if the semivowel is in a cluster with another consonant, there is a greater energy change between it and the following vowel if the preceding consonant is voiced, ensuring that the semivowel is also completely voiced. In addition to the contextual influence of preceding consonants, the degree of stress of the following vowel also matters. There is a more pronounced energy change between the semivowel and vowel if the vowel is stressed.

2. Intervocalic consonants

In intervocalic positions, most of the semivowels showed substantial differences in energy compared to neighboring vowels. The energy dip computed for these VCV segments was greater than 2 dB for 90% of the semivowels. Of the other semivowels which did not show a substantial difference in energy relative to adjacent vowels, 33% were /j/'s, 14% were /r/'s and 5% were /l/'s. Most of these semivowels follow a stressed vowel and precede an unstressed vowel, such as the /l/ in "astrology" and the /r/ in "guarantee." This lack of an energy dip for semivowels in this environment may be a case of phonetic lenition.

While the majority of vowels do not normally have such energy dips, there were several instances of vowels with energy dips comparable to those between intervocalic consonants and adjacent vowels. An examination of such vowels showed that, in general, those with such significant energy dips were either an r-colored vowel, such as the one in "plurality" where an intervocalic /r/ was not included in the transcription, or a diphthong, such as the ones in "queer" and "flour."

3. Postvocalic consonants

The patterns in energy change within word-final vowels and between vowels and following semivowels (including only the liquids /r/ and /l/) are very similar. Two factors contribute to the overlap. First, the /j/ or /w/ offglides of word-final diphthongs often result in energy changes that are comparable to the changes observed between a word-final liquid and the preceding vowel. Such a large energy taper can be seen in Fig. 12 for the word "view," which has a substantial energy change in the frequency range 640-2800 Hz. Second, postvocalic liquids that are followed by another consonant are often as strong as the preceding vowels, as can be seen by comparing the amplitudes of the formants in the /ɑr/ region (0.1 to 0.2 s) in the word "cartwheel" shown in Fig. 13. In many such cases, there is significant assimilation between the postvocalic consonant and the preceding vowel. The fairly constant formant amplitudes and the steady F3 frequency during the /ɑr/ region of "cartwheel" suggest that the /ɑ/ and /r/ are coarticulated so that they are realized acoustically as one segment.

FIG. 13. A spectrogram with automatically extracted formant tracks overlaid on the word "cartwheel."

This energy continuity between vowels and following liquids also occurs when the postvocalic liquids are followed by another sonorant consonant which is not in the same syllable, such as the /l/ in "bellwether." In words like this, there is a nonsyllabic region between the vowel preceding the postvocalic liquid and the vowel after the second sonorant consonant (in this case, the /ɛ/ before the /l/ and the /ɛ/ after the /w/). However, there is little energy change between the postvocalic liquid and the preceding vowel. Instead, the energy offset (referred to as consonant onset in Sec. III B) between the nonsyllabic region and the preceding vowel occurs after the liquid and before the following sonorant consonant. On the other hand, when nasals occupy this postvocalic position, there is substantial energy change between them and the preceding vowel so that the energy offset occurs before the postvocalic nasal consonant.

This difference in where the energy offset occurs is illustrated in Fig. 14, which contains information relating to the intersonorant sequences /rm/ and /nr/ in the words "harmonize" and "unreality," respectively. In the case of "harmonize" (shown on the left), the nonsyllabic dip occurs during the /m/ and, as indicated by the arrows, the energy offset between the /rm/ cluster and the previous vowel occurs after the /r/, at the beginning of the /m/. In contrast, the energy offset between the /nr/ cluster and the preceding vowel in "unreality" (shown on the right) occurs at the point of implosion for the /n/ at about 175 ms as indicated by the arrow.

FIG. 12. Illustration of a large energy taper in word-final diphthongs. The energy towards the end of the vowel is 30 dB or more less than the maximum value within the vowel. (a) Wideband spectrogram of the word "view." (b) Energy 640 to 2800 Hz.

FIG. 14. (a) Wideband spectrograms of the words "harmonize" and "unreality." (b) Onset waveforms that show a valley at the energy offset indicating the beginning of the nonsyllabic region. (c) Offset waveforms which show a peak at the energy onset indicating the end of the nonsyllabic region.

To capture this difference in the temporal properties of the energy offset for nasal-sonorant consonant sequences and liquid-sonorant consonant sequences, we computed the duration of the intersonorant nonsyllabic region. The duration of this energy dip region was taken to be the difference in time between the energy offset and the energy onset immediately surrounding the energy dip. In the case of "harmonize," the energy onset occurs after the /m/ and before the following vowel at about 295 ms. Thus the energy dip region includes only the /m/ and is 75 ms in duration. In the case of "unreality," the energy onset also occurs after the second sonorant consonant and before the following vowel. However, in this case, the energy dip region includes both consonants and is 120 ms in duration.
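The duration measure just described is a simple time difference, sketched below in Python. The function name is hypothetical, and the sketch assumes the energy offset and onset times have already been located (for example, from the valley and peak of the waveforms in Fig. 14) and are given in whole milliseconds.

```python
def dip_region_duration_ms(energy_offset_ms: int, energy_onset_ms: int) -> int:
    """Duration (ms) of the nonsyllabic energy dip region.

    energy_offset_ms : time at which energy falls away from the preceding vowel
                       (beginning of the nonsyllabic region)
    energy_onset_ms  : time at which energy rises into the following vowel
                       (end of the nonsyllabic region)
    """
    if energy_onset_ms < energy_offset_ms:
        raise ValueError("energy onset must not precede energy offset")
    return energy_onset_ms - energy_offset_ms
```

For "unreality," the text gives an offset at about 175 ms and a duration of 120 ms, which implies an onset near 295 ms: `dip_region_duration_ms(175, 295)` gives 120. For "harmonize," an onset at about 295 ms and a duration of 75 ms imply an offset near 220 ms; that offset time is inferred here, not stated in the text.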

Data across all words containing intersonorant clusters are shown in Fig. 15. For comparison, we also included the duration of the energy dip regions when there is only one sonorant consonant occurring between two vowels, an intervocalic nasal or semivowel. In this case, the energy offset and energy onset will correspond to the consonant onset and offset, respectively. Although there is no normalization for variability in speaking rates, the results in Fig. 15 show a distinct pattern. The distributions of the duration of energy dip regions associated with only one sonorant consonant and those associated with two sonorant consonants where the first consonant is a liquid are essentially the same. However, the average duration of the energy dip regions associated with two sonorant consonants where the first is a nasal is considerably longer than those of the other cases. We can infer from this pattern that the energy offset in the cluster where the first member is a postvocalic liquid occurs after the postvocalic liquid so that only one of the sonorant consonants is contained in the energy dip region. On the other hand, the energy offset in the cluster where the first member is a postvocalic nasal occurs before the postvocalic nasal so that both sonorant consonants are part of the energy dip region. These results show that postvocalic liquids which are followed by another sonorant consonant are not a part of the energy dip region. Instead, they appear to be a part of the syllable nucleus. Thus it may be more appropriate to think of the liquid and preceding vowel as a diphthong where the liquid, like the glides in this context, is considered to be a part of the vowel.

The assertion that postvocalic /l/ acts as the second element of a diphthong has also been made by Giles and Moll (1975). Based on x-ray data of prevocalic and postvocalic /l/, they found that postvocalic /l/ shows relatively slow movement characteristics and undershoot of articulatory position. On the other hand, prevocalic /l/ had a relatively high rate of articulatory movement and no undershoot characteristics. Thus they conclude that prevocalic /l/ functions as a consonant while postvocalic /l/ is vocalic in nature. Along this line, Sproat and Fujimura (submitted) postulate that a gesture involving a nonperipheral articulator (such as tongue dorsum retraction) is attracted to the syllable nucleus whereas a gesture involving a peripheral articulator (such as the tongue tip) is attracted to syllable margins. With this assumption, they too conclude that postvocalic /l/, which has a more significant tongue dorsal retraction than prevocalic /l/, should be considered more vocalic.

FIG. 15. A comparison of the average durations of the nonsyllabic regions of words containing one intervocalic sonorant consonant (SC), words containing an intervocalic liquid-sonorant consonant sequence (liquid + SC) and words containing an intervocalic nasal-sonorant consonant sequence (nasal + SC).

D. Formant frequency measures

Important information for distinguishing among the semivowels is carried by the frequencies of the first three formants (F1, F2, and F3). Given minimal-pair words, it has been shown (Lisker, 1957; O'Connor et al., 1957) that F1 separates the glides /w/ and /j/ from the liquids /l/ and /r/, F2 separates /w/ from /l r/ from /j/, and F3 separates the liquids /l/ and /r/. The data in this study concur with these observations.

A formant tracker (Espy-Wilson, 1987) was used to automatically extract the first four formants during the sonorant regions of the words in the database. The frequencies of F1, F2, F3, and F4 were estimated by averaging the value at the time of a minimum or maximum in a particular formant track and the samples in the preceding and following frames within the hand-transcribed semivowel region. In the case of /w/ and /l/, the values of the formants were averaged around the time of the F2 minimum. For /j/, the formant values were averaged around the time of the F2 maximum, and for /r/ the formant values were averaged around the time of the F3 minimum. Thus the formants were measured during the time when the vocal tract could be expected to be most constricted.

We normalized the formants by computing bark differences to reduce the acoustic variability due to contextual effects and speaker differences and to better capture some of the acoustic properties. Chistovich and Lublinskaya (1979) have postulated that when two formants are within a critical distance of 3.0 to 3.5 bark of each other, they are interpreted by the auditory system as one spectral peak whose frequency is at the center of gravity of the prominence. Syrdal and Gopal (1986), in an acoustic study using the Peterson and Barney (1952) vowel data, investigated whether this constant in auditory units held between several formants and between the first formant and the fundamental frequency (F0). They found that with a critical distance of 3 bark, the difference between F1 and F0 provided a reasonable representation of the high-nonhigh vowel distinctions independent of speaker, and the difference between F3 and F2 represented the front-back vowel distinctions. In addition, they found that the bark difference transformations greatly reduced the acoustic variability between vowels spoken by different talkers.

TABLE VI. Formant frequencies (in Hertz) and formant differences (in Hertz and in bark) of semivowels averaged across all speakers.

Prevocalic
        (Hz)
          F1     F2     F3     F4
  w      381    848   2320   3525
  l      399   1074   2553   3767
  r      419   1285   1779   3350
  j      317   2142   2827   3661

        (Hz)                                   (bark)
        F1-F0  F2-F1  F3-F2  F4-F3  F4-F2      B1-B0  B2-B1  B3-B2  B4-B3  B4-B2
  w      241    467   1472   1204   2676        2.4    3.6    6.4    2.5    8.9
  l      258    675   1479   1214   2693        2.6    4.9    5.5    2.3    7.8
  r      242    866    493   1571   2064        2.8    6.0    2.1    3.9    5.9
  j      174   1825    684    834   1518        1.7   10.2    1.7    1.6    3.2

Intervocalic
        (Hz)
          F1     F2     F3     F4
  w      349    771   2340   3508
  l      445   1060   2640   3762
  r      460   1240   1720   3433
  j      361   2270   2920   3824

        (Hz)                                   (bark)
        F1-F0  F2-F1  F3-F2  F4-F3  F4-F2      B1-B0  B2-B1  B3-B2  B4-B3  B4-B2
  w      211    422   1570   1169   2737        2.1    3.4    7.0    2.4    9.4
  l      305    610   1580   1123   2707        3.0    4.5    5.8    2.1    7.9
  r      317    783    473   1717   2190        3.1    5.4    2.1    4.2    6.2
  j      213   1910    648    906   1554        2.1   10.1    1.5    1.6    3.1

Postvocalic
        (Hz)
          F1     F2     F3     F4
  l      465    898   2630   3650
  r      503   1300   1830   3391

        (Hz)                                   (bark)
        F1-F0  F2-F1  F3-F2  F4-F3  F4-F2      B1-B0  B2-B1  B3-B2  B4-B3  B4-B2
  l      323    433   1740   1015   2752        3.2    3.2    6.9    1.9    8.8
  r      363    799    531   1554   2088        3.5    5.4    2.1    3.7    5.8

FIG. 16. Scatter plots of the prevocalic semivowels spoken by two males and two females according to the bark transformed (a) F2-F1 vs F1-F0 and (b) F4-F3 vs F3-F2. A two-dimensional 90% confidence region is drawn around the data for each semivowel.

FIG. 17. Scatter plots of the intervocalic semivowels spoken by two males and two females according to the bark transformed (a) F2-F1 vs F1-F0 and (b) F4-F3 vs F3-F2. A two-dimensional 90% confidence region is drawn around the data for each semivowel.

The formant frequencies obtained in this study are in agreement with previously reported data. The results across speakers are shown in Table VI for prevocalic, intervocalic, and postvocalic semivowels. Also included in the table are normalized formant values (F1-F0, F2-F1, F3-F2, and F4-F3) and bark differences (B1-B0, B2-B1, B3-B2, and B4-B3). (F0 was obtained automatically with the pitch detector described in Gold and Rabiner, 1969.) The distributions of the bark differences are shown in Figs. 16-18 for the prevocalic, intervocalic, and postvocalic semivowels, respectively. A two-dimensional 90% confidence region is drawn around the data for each semivowel. Finally, Table VII summarizes the classification of the semivowels according to a 3.5-bark critical distance criterion and the five bark-difference dimensions. A + indicates that a majority of the semivowels are within 3.5 bark in the bark-difference dimension. Conversely, a - indicates that a majority of the semivowels exceeds the 3.5 bark in the bark-difference di-

Postvocalic Semivowels (a)