Ling 600.01 Phonetic Theory
The writeup for this exercise is due by the beginning of class on Tuesday, November 29.
The data for this exercise are in the directory KoreanStops on the Course Packet web site. They were given to me by Sun-Ah Jun, and are from an experiment that provided some of the background for her description of the Korean prosodic hierarchy (see, e.g., Jun, 1998). This exercise concerns the phonetic representation of the three-way contrast in phonation type for plosive consonants at each of the four places of articulation for plosives in the language. The contrast is among the aspirated plosives (which we will transcribe with a superscript [h], as in [pʰal]), the "plain" or "lax" plosives (which we will transcribe as simply voiceless, as in [pal]), and the "tense" or "fortis" plosives (which we will transcribe with the IPA diacritic for ejectives, as in [pʼal]). The audio files in the KoreanStops/WAVfiles directory are typical examples of utterances from Prof. Jun's database, which placed the target consonants [pʰ], [p], or [pʼ] in several different positions in a somewhat nonsensical dialogue about animals in a children's story. A sample dialogue is shown below, with the word containing the target consonant underlined in the word-for-word gloss line under the phonemic transcription. (Other dialogues are identical except for the identity of the target consonant.)
The notes to the right of utterances A1, Q2, A2, and A3 identify the position in the prosodic hierarchy where the target consonant occurs: IP stands for "intonational phrase", AP stands for "accentual phrase" (a prosodic unit that corresponds roughly to the eojeol in the Korean writing system), and W stands for "(prosodic) word". The spaces in the text of the dialogue separate prosodic words. Higher-order grouping into intonational phrases and accentual phrases is indicated by the subscripted brackets.
Jun (1998) characterizes the two higher-level constituents in terms of their tone patterns, as follows. The accentual phrase is delimited by a [LH ... LH] or a [H ... LH] pattern -- i.e., a rise to high tone both at the beginning and at the end of the phrase, with a more or less smooth fall in pitch in between. The intonational phrase is marked at the beginning by a pitch range reset, which makes the rise at the beginning of the first accentual phrase especially prominent. The intonational phrase is also marked at the end by a paradigmatic choice from among several boundary pitch movements, which override the AP-level rise. For example, the two intonational phrases in sentence A1 in the file HPI2.WAV each end in a fall to a L% boundary tone, whereas the one intonational phrase in sentence Q2 in file HPAI2.WAV ends in a rising boundary tone.
The rising tone pattern at the beginning of the accentual phrase also interacts with the segmental specification. That is, one of the cues to the contrast between the lax stop, on the one hand, and the aspirated and tense stops, on the other, is the timing of the rise to the H tone. When the accentual phrase begins with a lax stop, the first syllable of the phrase is very low relative to the second syllable. By contrast, when the accentual phrase begins with an aspirated or tense stop, pitch is already nearly at its peak value in the first syllable. This is why Jun (1998) describes two patterns for the accentual phrase. The [H ... LH] pattern occurs when the phrase begins with an aspirated or tense consonant, whereas the "default" [LH ... LH] "double rise" pattern occurs with the lax plosives and all other segment types.
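The dependence of the accentual-phrase tone pattern on the phrase-initial segment can be written out as a small lookup. The following Python fragment is only an illustrative sketch of the generalization stated above, not part of the course scripts; the function name and the segment-type labels ("aspirated", "tense", "lax", and so on) are hypothetical names chosen for this example.

```python
# Segment types that trigger the high-initial AP pattern, per the
# description above. The string labels are assumptions for illustration.
HIGH_INITIAL_TYPES = {"aspirated", "tense"}


def ap_tone_pattern(initial_segment_type):
    """Return the tone pattern for an accentual phrase, given the
    phonation type of its first segment.

    Aspirated and tense consonants take [H ... LH]; lax plosives and
    every other segment type take the default [LH ... LH] double rise.
    """
    if initial_segment_type in HIGH_INITIAL_TYPES:
        return "H ... LH"
    return "LH ... LH"


if __name__ == "__main__":
    for seg in ("aspirated", "tense", "lax", "nasal"):
        print(seg, "->", "[" + ap_tone_pattern(seg) + "]")
```

Treating [LH ... LH] as the fall-through case mirrors its description as the "default" pattern: only the two marked phonation types need to be listed explicitly.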
The audio files in the directory illustrate this interaction. There are examples of each of the three plosive types in each of the four prosodic positions identified in the dialogue. The file names identify the speaker, the plosive type, the prosodic position, and the token number (out of the ten productions of each dialogue that each speaker produced). For example, ...
The H_TextGrid directory contains TextGrid files that I created for the IP-initial and AP-medial audio files produced by speaker H. In creating these files, I used the Praat script markBurstsEtc.praat, which is in the scripts directory of our course page as well as in the KoreanStops directory with the data. The file H_results.txt is a record of five numbers that were extracted using the markBurstsEtc.praat script.
The first three numbers are the times for three events that are marked in these TextGrid files. The three events are illustrated in Figures (a) and (b). (The script that I wrote to make these figures is drawKoreanExamples.Praat in the scripts directory on our course web page.) The time tagged with "c" was placed at the beginning of the silence or stop closure; the time tagged with "b" was placed at the release burst after the stop closure; and the time tagged with "v" was placed at the onset of voicing in the following vowel -- i.e., the end of the VOT interval. My criterion for identifying this "v" time was to put the cursor at a point near the rise of the first period after the burst where Praat showed an F0 value. The script then moved the cursor to the nearest zero crossing and inserted the "v" tag at that point. Thus, the VOT value that is derived by subtracting the time marked "b" from the time marked "v" is positive even for tokens where there was voicing throughout the closure, as in the utterance shown in Figure (b).
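To make the arithmetic behind these tags concrete, here is a minimal Python sketch of the two steps described above: snapping a cursor time to the nearest zero crossing, and deriving closure duration and VOT from the "c", "b", and "v" times. It is not the code in markBurstsEtc.praat; the function names, the linear interpolation between samples, and the toy values in the usage lines are all assumptions made for illustration.

```python
def nearest_zero_crossing(samples, sample_rate, t):
    """Return the time (s) of the zero crossing closest to time t,
    or None if the signal never crosses zero.

    Crossings between two samples of opposite sign are located by
    linear interpolation (an assumption of this sketch).
    """
    crossings = []
    for i in range(len(samples) - 1):
        a, b = samples[i], samples[i + 1]
        if a == 0:
            crossings.append(i / sample_rate)
        elif a * b < 0:
            frac = a / (a - b)  # fraction of the way from sample i to i+1
            crossings.append((i + frac) / sample_rate)
    if samples and samples[-1] == 0:
        crossings.append((len(samples) - 1) / sample_rate)
    if not crossings:
        return None
    return min(crossings, key=lambda x: abs(x - t))


def stop_durations(c, b, v):
    """Derive closure duration and VOT (s) from the three tagged times.

    closure = b - c (closure onset to release burst)
    vot     = v - b (release burst to tagged voicing onset)
    """
    if not (c <= b <= v):
        raise ValueError("expected tag times in the order c <= b <= v")
    return {"closure": b - c, "vot": v - b}


if __name__ == "__main__":
    # Toy signal and made-up tag times, not values from H_results.txt.
    print(nearest_zero_crossing([1.0, -1.0, -1.0, 1.0], 4, 0.5))
    print(stop_durations(c=1.00, b=1.08, v=1.10))
```

Because "v" is always placed after "b", the VOT computed this way cannot be negative, which is why tokens with voicing throughout the closure still receive a positive value.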
(a)
(b)
(c)
Download the audio files and the TextGrid files and examine the points that I have marked for a few of them. Then examine the following three figures. (These figures were made using the R script KoreanStops.R, which you can find in the scripts directory under our course web page as well as in the KoreanStops directory.)
(d)
(e)
(f)
Figures (d) and (e) show the spectral tilt measure plotted as a function of the VOT duration for each token of a target labial stop produced in IP-initial and in AP-medial position. Figure (f) plots the same tokens as in Figure (e), but this time putting the closure duration on the y-axis. The numbers next to the points for the plain lax stops in Figures (e) and (f) identify the particular token being plotted. Note that in utterances HPAM1.WAV, HPAM3.WAV, HPAM4.WAV, and HPAM8.WAV, the duration plotted for the VOT is not really what we would normally measure as VOT if we were to follow the definition given in Lisker & Abramson (1964), since there is voicing throughout the closure, as you can see if you look at the waveform and spectrogram for these utterances. Notice also how HPAM5.WAV, HPAM6.WAV, and HPAM9.WAV seem to be outliers in a different way for VOT duration in Figures (e) and (f). Look at these files and see if you agree with the positioning of the "v" tag for VOT.
Once you have examined the figures and as many of the audio files as you need to listen to in order to understand the differences among the different plosive types in the two positions plotted here, answer the following questions.
The original source code for this html file and the R code that generated the figures are from:
Mary E. Beckman & Janet B. Pierrehumbert (forthcoming). A Laboratory Course in Phonology. Under contract with Blackwell, Inc.
We thank Sun-Ah Jun for generously providing the audio files for this exercise. These audio files can be freely downloaded and used for educational purposes and for academic research, so long as this textbook and Prof. Jun are acknowledged as their source. The files may not be used for commercial gain under any circumstances.
The references cited in this file are:
Mieko S. Han & R. S. Weizman (1970). Acoustic features of Korean /P, T, K/, /p, t, k/ and /ph, th, kh/. Phonetica, 22: 112-128.
Sun-Ah Jun (1998). The Accentual Phrase in the Korean prosodic hierarchy. Phonology, 15(2): 189-226.
Leigh Lisker & Arthur M. Abramson (1964). A cross-language study of voicing in initial stops: Acoustical measurements. Word, 20: 384-422.
David Silva (2006). Acoustic evidence for the emergence of tonal contrast in contemporary Korean. Phonology, 23: 287-308.