Term Project, Part 3: Relating the perception and production data

(Ling H286, Autumn 2007, Ohio State University)

Copyright © 2007 by the class and Mary E. Beckman


0. Synopsis and due date.

This part of the term project ties together the first two parts. You will use analysis methods that we have learned more recently to relate the perception experiment results for the words hod and hawed, which you analyzed in the first part of the term project, to the vowel duration and formant measures that you analyzed in the second part. You will also analyze the perception and production results for these same two words as produced by the 139 Detroit speakers in the Hillenbrand et al. (1995) corpus, as a check on what to expect from a more homogeneous group of talkers than the 20 members of the class. Your group presentation will be in class during the time scheduled for the final exam (7:30-9:30 on Tuesday, December 4), and your personal final summary of the project as a whole is due during that same final class meeting.

1. The measurement files.

The perception test results are in the Au2007perception.txt file in the termProjectPart1Results subdirectory under the course web page, where you got them for the first interim report. The production measures have been collated into a single large file called classVowels.txt, which is in the termProjectPart3 subdirectory of the course web page. This file includes the vowel durations that Group Awesome measured for all 10 vowels produced by all 20 class members, and the corrected formant measures that Group A made for the vowels in hod and hawed for everyone. It also includes corrections to other vowels for some people, which Mary made after looking at each class member's vowel chart and seeing numbers that were obvious mistrackings. You can see which values were changed by looking at the code in projectPart3prep.R in the termProjectPart3 subdirectory. That file contains the R code that was used to collate the vowel formant measures and duration measures.

The perception test results and the vowel formant measures for the 139 Detroiters are still in the HillenbrandHighVowels subdirectory of the class web site, where you got them for the data analyses in report number 4. The relevant files to download from there are bigdata.txt (the file of vowel formant and vowel duration measurements) and iddataMinusAsterisk.txt (the file of identification test results).


2. The questions.

Here are the questions about these two data sets that we came up with in class on Monday, November 19.

In our first analysis of the perception data, we assumed a categorical distinction between two sets of people in the class. If we were to treat the Detroiters the same way, we would distinguish:

  (a) Class members who distinguish.
  (b) Class members who do not distinguish.
  (c) Detroiters who distinguish.
  (d) Detroiters who do not distinguish.

We defined sets (a) versus (b) by designating Joe as our class "reporter" and having him interview everyone to determine how each class member self-identifies. The first thing to ask, then, is:

Once we have determined a criterion for dividing the Detroiters into these two discrete categories, we might want to ask:

The second set of questions looks at the relationship between perception and production in a different way.

There is some R code for reading in the data files, and for doing various analyses that might be relevant to answering the above questions, in the files classHodHawed.R and HillenbrandHodHawed.R in the termProjectPart3 subdirectory under the course web site. These files include sample code that you can use to make histograms and do t-tests, if you think histograms and t-tests would be useful for answering the questions about significant differences between distinguishers and non-distinguishers. They also include code for doing the same kind of normalization to the minimum and maximum values that Group Awesome did for the class formant values in their interim report 2.
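
To make the two calculations mentioned above concrete, here is a minimal sketch of normalizing values to their minimum and maximum and of computing a two-sample t statistic. It is written in Python rather than R, and it is not the code in classHodHawed.R or HillenbrandHodHawed.R. The function names, the made-up numbers, and the choice to rescale one list of values at a time are all assumptions for illustration, so check Group Awesome's report for how the normalization was actually applied. The t statistic shown is the unequal-variance (Welch) version, which is also what R's t.test computes by default.

```python
from math import sqrt
from statistics import mean, variance


def minmax_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; the range is zero")
    return [(v - lo) / (hi - lo) for v in values]


def welch_t(group_a, group_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    # Squared standard error of each group mean (sample variance / n).
    va = variance(group_a) / na
    vb = variance(group_b) / nb
    t = (mean(group_a) - mean(group_b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Made-up F1 values in Hz, for illustration only (not class or Hillenbrand data).
f1_hz = [650.0, 700.0, 720.0, 780.0, 850.0]
f1_norm = minmax_normalize(f1_hz)

# Made-up normalized values for two hypothetical groups of talkers.
distinguishers = [0.10, 0.20, 0.30, 0.40]
non_distinguishers = [0.30, 0.40, 0.50, 0.60]
t_stat, df = welch_t(distinguishers, non_distinguishers)
```

The sketch stops at the t statistic and degrees of freedom. Turning these into a p-value requires the t distribution, which R's t.test reports directly.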