From: ddurian@ling.ohio-state.edu
Subject: Aug 9 meeting notes
Date: Thu, 10 Aug 2006 13:24:57 -0400

DISCUSSION OF THE 3 NORMALIZATION READINGS

Mary and David began the meeting with a brief discussion of the 3 normalization readings (see last message for citations) that David completed for the meeting. Of the 3, Johnson's JASA paper proved to be the most relevant. In sum, Johnson's findings indicate that listeners appear to make more use of speaker identity (in other words, normalization in perception is based more heavily on external than on internal factors). This doesn't mean that F0 plays no role in the process; instead, it means that F0 across a larger contextual frame than the word is in play.

A NEW IDEA RE: NORMALIZATION, F0, AND F4

Operationally, these findings suggest that anyone wanting to develop a normalization algorithm that accounts for them would want to obtain F0 measurements from vowels occurring at a variety of points throughout a speaker's utterances, rather than only from vowels in isolated words. Optimally, these measurement points would be taken from random, non-specific vowels occurring throughout long stretches of speech, so that speaker identity can be robustly quantified.

In many speech recording events, such as those used in most laboratory phonetics experiments, the volume of continuous speech needed to obtain this robust quantification is usually not available. This is due to the kind of speech captured in these events: either vowels or words in isolation, or short spans of speech contained in one-sentence utterances.

Sociolinguistic interviews are an exceptional context in which this is not the case, as the very nature of the interview involves speakers producing longer stretches of continuous speech, in many cases for minutes at a time.
Furthermore, as the length of sociolinguistic interviews is often at least 30 minutes, a large volume of speech can easily be obtained for making the kind of F0 measurements described above.

ACTING ON THE IDEA I

Johnson's JASA findings show that, in a very real sense, listeners who take speaker identity into account make use of a range of F0 occurrences when speaker normalization occurs. To quantify this range of variation, we decided that the best way to act on the idea above is to obtain measurements of F0 throughout spans of connected speech and then average them across all of these occurrences, so that a gauge of the overall impression that the speaker's F0 leaves on a hearer can be obtained.

We also decided to obtain F4 measurements using the same protocol, so that an "upper ceiling" of the hearer's overall perception of the speaker's vowel system can also be obtained. F4 is useful for this because North American English vowels tend to have fairly invariant F4.* Combining these measurements allows us to combine extrinsic and intrinsic measurements, while also giving us a robust method for quantifying speaker identity, as it is defined within the phonetics literature on speech perception and production.

Specifically, we decided to create a script in PRAAT to assist us in taking these measurements. The script does the following:

# Get F0 mean and F4 mean over enough useable "vowel intervals"
# to add up to 10 seconds/minute in the first 20 minutes = 200 secs total
# Each "vowel interval" needs to be a sonorant interval where both F0 and F4
# are well-defined. Ideally, you also want to have a variety of intonational events
# and vowel types represented, but if the average vowel interval is less than a
# second, this is taking at least 200 vowels, so happy tagging....
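To make the arithmetic in the script description concrete, here is a minimal sketch in Python (not PRAAT, and not the actual script). It assumes each tagged interval already carries a mean F0 and mean F4 measured elsewhere, and it assumes the overall means are weighted by interval duration; the notes only say the measurements are averaged, so the weighting is my guess. The names VowelInterval, select_intervals, and speaker_means are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class VowelInterval:
    """One tagged sonorant interval (hypothetical record, not the script's own format)."""
    start: float    # seconds from the start of the recording
    end: float      # seconds from the start of the recording
    f0_mean: float  # mean F0 within the interval, in Hz
    f4_mean: float  # mean F4 within the interval, in Hz

    @property
    def duration(self) -> float:
        return self.end - self.start


def select_intervals(intervals: Sequence[VowelInterval],
                     budget_per_minute: float = 10.0,
                     minutes: int = 20) -> Tuple[List[VowelInterval], List[float]]:
    """Take intervals in time order until each minute reaches its budget.

    Returns the chosen intervals and the seconds collected in each minute.
    With the defaults, a fully tagged tape yields about 10 * 20 = 200 seconds.
    """
    collected = [0.0] * minutes
    chosen: List[VowelInterval] = []
    for iv in sorted(intervals, key=lambda v: v.start):
        minute = int(iv.start // 60)
        if minute >= minutes:
            break  # past the portion of tape being sampled
        if collected[minute] < budget_per_minute:
            chosen.append(iv)
            collected[minute] += iv.duration
    return chosen, collected


def speaker_means(chosen: Sequence[VowelInterval]) -> Tuple[float, float]:
    """Duration-weighted mean F0 and F4 in Hz (the weighting is an assumption)."""
    total = sum(iv.duration for iv in chosen)
    if total <= 0:
        raise ValueError("no usable vowel intervals")
    f0 = sum(iv.f0_mean * iv.duration for iv in chosen) / total
    f4 = sum(iv.f4_mean * iv.duration for iv in chosen) / total
    return f0, f4


if __name__ == "__main__":
    # Made-up values purely for illustration.
    tagged = [
        VowelInterval(0.0, 1.0, 100.0, 3500.0),
        VowelInterval(1.0, 4.0, 200.0, 3700.0),
    ]
    chosen, collected = select_intervals(tagged)
    f0, f4 = speaker_means(chosen)
    print(f"F0 mean: {f0:.1f} Hz, F4 mean: {f4:.1f} Hz")
    print(f"Seconds collected in minute 0: {collected[0]:.1f}")
```

The per-minute list returned by select_intervals is there so that a tagger can see which minutes still fall short of 10 seconds. An unweighted mean over intervals would be an equally plausible reading of the notes.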
We decided to take F0 and F4 measurements that add up to 10 seconds per minute of tape so that enough sampling points are obtained to make a robust measurement of speaker identity. The first 20 minutes of tape are used because this is the section of my sociolinguistic interviews in which participants typically talk in a relaxed, casual speech style. This is so because the questions asked of them usually deal with family, friends, co-workers, and their views on the 3 biggest things that have changed in Columbus between the time they were children and the present day (informants range in age from 18-67, so this time frame can be quite variable). We also wanted to ensure that vowels are well-formed within the intervals where measurements are taken, because F4 can otherwise be prone to tracking problems in PRAAT.

ACTING ON THE IDEA II

To test this method of normalization, I plan to conduct a study, which I will write up for my second pregenerals paper. To conduct the study, I plan to use /ow/, /i/, and /ae/ as test vowels:

- /i/ will be investigated because it is a vowel that is undergoing little, if any, variation in middle class White Columbus speech.** Thus, it will serve as a "control" vowel for the method.
- /ae/ will be investigated because it is a monophthong that is undergoing variation in Columbus, making it well matched as a counterpart to /i/.
- /ow/ will be investigated because it is a diphthong undergoing variation in Columbus, and thus will allow me to compare specifically how the method works for diphthongs versus monophthongs undergoing variation in a speech community.

To further test the method, I will use 4 middle class White speakers from my corpus: 1 male and 1 female from my 38-65 age group, and 1 male and 1 female from my 18-32 age group. Doing so will allow me to see how well the method works for making comparisons among speakers of different sex and age backgrounds.
This is an important point, since normalization algorithms are used in sociophonetic studies primarily to make these kinds of comparisons.

SUMMARY

By the end of the meeting, Mary and I had gotten as far as discussing the methodological points of my approach, as well as how to operationalize everything so that the method could be applied. We did not have time to actually create the script, although we did begin trying out some ideas.

FOOTNOTES:

* However, note that F4 has not been heavily studied in the literature, and as far as Mary and I are aware, previous studies of F4 in vowels have only gone as far as noting that it doesn't seem to vary much. This is a topic that I believe needs more research, and it is one I plan to investigate as a side issue at some point in the future. Beyond this, the F4 variation that has been studied involves consonantal transitions, in situations where retroflex consonants precede vowels. To make sure we didn't miss any previous studies of F4, we decided to consult Keith Johnson's (1988) dissertation, as it dealt specifically with normalization and contains an extensive literature review.

** On the other hand, /i/ in White working class speech appears to be undergoing a mild form of the Southern Shift in Columbus, as well as in the larger Central Ohio area, according to data analyzed by Erik Thomas in Thomas 1989/1993, 1996, and 2001. In prototypical Southern Shift, /i/ becomes laxer and "wider", such that pronunciations exhibit a diphthong-like quality. Typically, this is represented in the sociophonetic literature as /iy/ (e.g., Labov, 1991).

Footnote References:

Johnson, Keith. 1988. Process of speaker normalization in vowel perception. Dissertation, The Ohio State University.

Labov, William. 1991. The three dialects of English. In Eckert, Penelope (ed.), New ways of analyzing variation. New York: Academic Press. pp. 1-41.

Thomas, Erik R. 1989/1993. Vowel changes in Columbus, OH.
Journal of English Linguistics 22:205-215.

Thomas, Erik R. 1996. A comparison of variation patterns of variables among sixth graders in an Ohio community. In E. W. Schneider (ed.), Focus on the USA: Varieties of English around the world (General Series, Vol. 16). Amsterdam: John Benjamins Publishing. pp. 309-332.

Thomas, Erik R. 2001. An Acoustic Analysis of Vowel Variation in New World English. Publication of the American Dialect Society 85. Durham, NC: Duke University Press.

----------------------------------------------------------------

From: ddurian@ling.ohio-state.edu
Subject: Aug 10 meeting notes
Date: Fri, 11 Aug 2006 10:49:49 -0400

Today, Mary and I developed the script for the vowel normalization method study that I discussed in my Aug 9 notes. The script does the following:

# Get F0 mean and F4 mean over enough useable "vowel intervals"
# to add up to 10 seconds/minute in the first 20 minutes = 200 secs total
# Each "vowel interval" needs to be a sonorant interval where both F0 and F4
# are well-defined. Ideally, you also want to have a variety of intonational
# events and vowel types represented, but if the average vowel interval is less
# than a second, this is taking at least 200 vowels, so happy tagging....

It is available as the script entitled "IntvNorm.praat" on the scripts page of the class web site:

http://www.ling.ohio-state.edu/~mbeckman/795.10/scripts/

We also made some further tweaks to the vowel tagging script ("TagVowels.praat") that we developed last week (Aug 1), so that it plays more nicely with "IntvNorm". The new version has been uploaded and is also now available at

http://www.ling.ohio-state.edu/~mbeckman/795.10/scripts/