Take-home lab exercise 5 on "Korean Stops"

Ling 600.01 Phonetic Theory


Due date.

The writeup for this exercise is due by the beginning of class on Tuesday, November 29.


The data.

The data for this exercise are in the directory KoreanStops on the Course Packet web site. They were given to me by Sun-Ah Jun, and are from an experiment that provided some of the background for her description of the Korean prosodic hierarchy (see, e.g., Jun, 1998). This exercise concerns the phonetic representation of the three-way contrast in phonation type for plosive consonants at each of the four places of articulation for plosives in the language. The contrast is among the aspirated plosives (which we will transcribe with a superscript [h], as in [pʰal]), the "plain" or "lax" plosives (which we will transcribe as simply voiceless, as in [pal]), and the "tense" or "fortis" plosives (which we will transcribe with the IPA diacritic for ejectives, as in [pʼal]). The audio files in the KoreanStops/WAVfiles directory are typical examples of utterances from Prof. Jun's database, which placed the target consonants [pʰ], [p], or [pʼ] in several different positions in a somewhat nonsensical dialogue about animals in a children's story. A sample dialogue is shown below, with the word containing the target consonant underlined in the word-for-word gloss line under the phonemic transcription. (Other dialogues are identical except for the identity of the target consonant.)

The notes to the right of utterances A1, Q2, A2, and A3 identify the position in the prosodic hierarchy where the target consonant occurs: IP stands for "intonational phrase", AP stands for "accentual phrase" (a prosodic unit that corresponds roughly to the eojeol in the Korean writing system), and W stands for "(prosodic) word". The spaces in the text of the dialogue separate prosodic words. Higher-order grouping into intonational phrases and accentual phrases is indicated by the subscripted brackets.

Jun (1998) characterizes the two higher-level constituents in terms of their tone patterns, as follows. The accentual phrase is delimited by a [LH ... LH] or a [H ... LH] pattern -- i.e., a rise to high tone both at the beginning and at the end of the phrase, with a more or less smooth fall in pitch in between. The intonational phrase is marked at the beginning by a pitch range reset, which makes the rise at the beginning of the first accentual phrase especially prominent. The intonational phrase is also marked at the end by the paradigmatic choice from among several boundary pitch movements, which override the AP-level rise. For example, the two intonational phrases in sentence A1 in the file HPI2.WAV each end in a fall to a L% boundary tone, whereas the one intonational phrase in sentence Q2 in file HPAI2.WAV ends in a rising boundary tone.

The rising tone pattern at the beginning of the accentual phrase also interacts with the segmental specification. That is, one of the cues to the contrast between the lax stop, on the one hand, and the aspirated and tense stops, on the other, is the timing of the rise to the H tone. When the accentual phrase begins with a lax stop, the pitch of the first syllable is very low relative to that of the second syllable. By contrast, when the accentual phrase begins with an aspirated or tense stop, pitch is already nearly as high as its peak value in the first syllable. This is why Jun (1998) describes two patterns for the accentual phrase. The [H ... LH] pattern is for when the phrase begins with an aspirated or tense consonant, whereas the "default" [LH ... LH] "double rise" pattern is for the lax plosives and all other segment types.

The audio files in the directory illustrate this interaction. There are examples of each of the three plosive types in each of the four prosodic positions identified in the dialogue. The file names identify the speaker, the plosive type, the prosodic position, and the token number (out of the ten productions of each dialogue that each speaker produced). For example, ...

The H_TextGrid directory contains TextGrid files that I created for the IP-initial and AP-medial audio files produced by speaker H. In creating these files, I used the Praat script markBurstsEtc.praat, which is in the scripts directory of our course page as well as in the KoreanStops directory with the data. The file H_results.txt is a record of five numbers that were extracted using the markBurstsEtc.praat script.

The first three numbers are the times for three events that are marked in these TextGrid files. The three events are illustrated in Figures (a) and (b). (The script that I wrote to make these figures is drawKoreanExamples.Praat in the scripts directory on our course web page.) The time tagged with "c" was placed at the beginning of the silence or stop closure; the time tagged with "b" was placed at the release burst after the stop closure; and the time tagged with "v" was placed at the onset of voicing in the following vowel -- i.e., the VOT. The criterion that I used to identify this "v" time is that I put the cursor at a point near the rise for the first period after the burst where Praat showed an F0 value. The script then moved the cursor to the nearest zero crossing and inserted the tag for the point marked with the "v" tag. Thus, the VOT value that is derived by subtracting the time marked "b" from the time marked "v" is positive even for tokens where there was voicing throughout the closure, as in the utterance shown in Figure (b).
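
The arithmetic behind the two duration measures can be sketched as follows. This is only an illustration in Python, not the markBurstsEtc.praat script itself. The function name and the example times are hypothetical, the times are assumed to be in seconds, and the closure duration is assumed to be the interval from the "c" tag to the "b" tag.

```python
def durations_ms(c, b, v):
    """Return (closure duration, VOT) in milliseconds.

    c, b, v are the times (assumed to be in seconds) of the
    "c", "b", and "v" tags in a TextGrid file.
    """
    if not (c <= b <= v):
        raise ValueError("expected tag times in the order c <= b <= v")
    closure_ms = (b - c) * 1000.0  # closure onset to release burst
    vot_ms = (v - b) * 1000.0      # release burst to marked voice onset
    return closure_ms, vot_ms


# Hypothetical tag times for illustration; NOT values from H_results.txt.
closure, vot = durations_ms(c=0.500, b=0.580, v=0.635)
print(f"closure = {closure:.1f} ms, VOT = {vot:.1f} ms")
```

Because the "v" tag is always placed after the "b" tag, a VOT computed this way can never be negative, which is why tokens with voicing throughout the closure still show a positive value.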

(a)

(b)

The last two numbers in each row in the H_results.txt file are the energy at the first and second harmonic just after the voice onset time -- i.e., the energy in the H1 and H2 identified in a spectrum calculated over a 20 ms Hanning window beginning at the point marked with "v" in the TextGrid file. The method of identification is illustrated in Figure (c), which shows the spectrum calculated for utterance HPI3.WAV -- the same utterance shown in Figure (a). (The script that I wrote to make this figure is drawKoreanExamples2.Praat in the scripts directory on our course web page.) The difference between these two amplitudes (i.e., H1 minus H2) is sometimes used as a measure of the "spectral tilt" -- i.e., an estimate of the degree to which energy at the fundamental dominates in the glottal source waveform.
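
A rough sketch of the H1 minus H2 calculation is given below, in Python rather than Praat. It simplifies the method described above: instead of locating H1 and H2 as peaks in a plotted spectrum, it assumes that an F0 estimate is already available and evaluates the Hanning-windowed spectrum directly at F0 and at twice F0. All function names, the sampling rate, and the synthetic test signal are assumptions made for illustration.

```python
import math


def harmonic_level_db(frame, sample_rate, freq):
    """Level in dB (arbitrary reference) of a Hanning-windowed frame at freq."""
    n = len(frame)
    re = im = 0.0
    for i, x in enumerate(frame):
        w = 0.5 - 0.5 * math.cos(2.0 * math.pi * i / n)  # Hanning window
        phase = 2.0 * math.pi * freq * i / sample_rate
        re += w * x * math.cos(phase)
        im -= w * x * math.sin(phase)
    return 20.0 * math.log10(math.hypot(re, im))


def h1_minus_h2(samples, sample_rate, v_time, f0, window_s=0.020):
    """H1 - H2 in dB over a window that starts at v_time (in seconds)."""
    start = int(round(v_time * sample_rate))
    stop = start + int(round(window_s * sample_rate))
    frame = samples[start:stop]
    h1 = harmonic_level_db(frame, sample_rate, f0)
    h2 = harmonic_level_db(frame, sample_rate, 2.0 * f0)
    return h1 - h2


# Synthetic signal (hypothetical): a 200 Hz fundamental plus a second
# harmonic at half its amplitude, so H1 - H2 should be close to 6 dB.
sr = 16000
f0 = 200.0
signal = [
    math.sin(2.0 * math.pi * f0 * t / sr)
    + 0.5 * math.sin(2.0 * math.pi * 2.0 * f0 * t / sr)
    for t in range(sr // 10)
]
tilt = h1_minus_h2(signal, sr, v_time=0.010, f0=f0)
print(f"H1 - H2 = {tilt:.2f} dB")
```

A larger positive value means that the fundamental dominates the second harmonic more strongly in the analysis window.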

(c)


What to do.

Download the audio files and the TextGrid files and examine the points that I have marked for a few of them. Then examine the following three figures. (These figures were made using the R script KoreanStops.R, which you can find in the scripts directory under our course web page as well as in the KoreanStops directory.)

(d)

(e)

(f)

Figures (d) and (e) show the spectral tilt measure plotted as a function of the VOT duration for each token of a target labial stop produced in IP initial and in AP medial position. Figure (f) plots the same tokens as in Figure (e), but this time putting the closure duration on the y-axis. The numbers next to the points for the plain lax stops in Figures (e) and (f) identify the particular token being plotted. Note that in utterances HPAM1.WAV, HPAM3.WAV, HPAM4.WAV, and HPAM8.WAV, the duration plotted for the VOT is not really what we would normally measure as VOT if we were to follow the definition given in Lisker & Abramson (1964), since there is voicing throughout the closure, as you can see if you look at the waveform and spectrogram for these utterances. Notice also how HPAM5.WAV, HPAM6.WAV, and HPAM9.WAV seem to be outliers in a different way for VOT duration in figures (e) and (f). Look at these files and see if you agree with the positioning of the "v" tag for VOT.

Once you have examined the figures and as many of the audio files as you need to listen to in order to understand the differences among the different plosive types in the two positions plotted here, answer the following questions.


Questions to answer.

  1. Are the three outliers in figures (e) and (f) true outliers or is there evidence of voicing before the "v" tag in these three files? If you find evidence of voicing starting earlier than by the criterion I used for placing the cursor for the "v" marker, what is that evidence? (Hint: try using the "Spectrogram settings" command under the "Spectrum" pulldown menu in the edit window to change the "View range (Hz):" to show just the energy from 0 to 1000 Hz and also change the "Window length (s):" to be 0.05 seconds instead of 0.005 seconds.)
  2. Some textbooks for Korean as a foreign language describe the "lax" stops as being "mildly aspirated" (as compared to the "true" aspirates), and Han and Weizman (1970) show that, on average, the VOT in plain "lax" stops is shorter than the VOT in aspirated stops. The data in the H_results.txt file reproduce Han and Weizman's results, showing a mean VOT value of 55 ms for the "mildly aspirated" stops as compared to 12 ms for the tense stops and 75 ms for the true aspirates. A t-test shows that the 20 ms difference in mean VOT values for the lax versus aspirated stops is very unlikely to have resulted from chance variation (t[34]=-2.3031, p=0.0275). Looking at the distribution of values along the x-axis in Figures (d) and (e), evaluate this characterization of the difference. That is, if we described Korean plosives as being simply a three-way contrast in degree of aspiration, with values ranging from no aspiration (very short lag VOT) for the tense stops to heavy aspiration (very long lag VOT) for the aspirated stops, how reliably could we differentiate the lax stops from the other two types?
  3. In a recent paper in the journal Phonology (Silva, 2006), David Silva suggests that Korean has an incipient tone contrast between words with initial high tone (i.e., words beginning with the tense and aspirated stops) as opposed to words with initial low tone (i.e., all other words). Is there any evidence in the audio files that you listened to that supports this characterization? If so, what is that evidence? Also, is the description appropriate for both of the prosodic positions shown in Figures (d) and (e)?
  4. Can you think of any other "feature" besides degree of aspiration and initial tone that might capture the contrast between the plain lax stops and the true aspirates in intonational-phrase initial position? If so, what is your evidence for this feature? That is, what acoustic cue or cues suggest this characterization? Is this characterization valid also for the lax stops in accentual-phrase medial position?
  5. The Korean hangul orthography writes the tense stops as geminate versions of the lax stops. Is there any phonetic evidence for this analysis in the data for this exercise? If so, what is the evidence, and is the characterization equally valid for the two prosodic positions shown in Figures (d) and (e)?
  6. Can you think of any other "feature" besides gemination that might capture the contrast between the plain lax stops and the tense stops in accentual-phrase medial position? If so, what is your evidence for this feature? That is, what acoustic cue or cues suggest this characterization? Is this characterization valid also for the intonational-phrase initial position?
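
As background for question 2, the form of a two-sample t statistic can be sketched as follows. The handout does not say which variant of the t-test was used; the whole-number 34 degrees of freedom would be consistent with a pooled-variance test on 36 tokens, but that is an assumption. The function name and the VOT values are hypothetical and are not taken from H_results.txt.

```python
import math


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    mean_a = sum(a) / na
    mean_b = sum(b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in a)  # sum of squared deviations
    ss_b = sum((x - mean_b) ** 2 for x in b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    se = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean_a - mean_b) / se, df


# Hypothetical VOT values in ms, chosen only to illustrate the arithmetic.
lax = [40.0, 55.0, 70.0]
aspirated = [60.0, 75.0, 90.0]
t, df = pooled_t(lax, aspirated)
print(f"t[{df}] = {t:.4f}")
```

Note that a reliable difference in means says nothing about how much the two distributions overlap, which is what the question asks you to judge from the figures.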

Acknowledgments and references

The original source code for this HTML file and the R code that generated the figures are from:

Mary E. Beckman & Janet B. Pierrehumbert (forthcoming). A Laboratory Course in Phonology. Under contract with Blackwell, Inc.

We thank Sun-Ah Jun for generously providing the audio files for this exercise. These audio files can be freely downloaded and used for educational purposes and for academic research, so long as this textbook and Prof. Jun are acknowledged as their source. The files may not be used for commercial gain under any circumstances.

The references cited in this file are:

Mieko S. Han & R. S. Weizman (1970). Acoustic features of Korean /P, T, K/, /p, t, k/ and /ph, th, kh/. Phonetica, 22: 112-128.
Sun-Ah Jun (1998). The Accentual Phrase in the Korean prosodic hierarchy. Phonology, 15(2): 189-226.
Leigh Lisker & Arthur M. Abramson (1964). A cross-language study of voicing in initial stops: Acoustical measurements. Word, 20: 384-422.
David Silva (2006). Acoustic evidence for the emergence of tonal contrast in contemporary Korean. Phonology, 23: 287-308.

Copyright © 2006 Mary E. Beckman & Janet B. Pierrehumbert.