####################################################################################################
# diphone-locator-for-TextGrids-GREEDY.ALGORITHM.praat ( Written by Kyuchul Yoon kyoon@ling.osu.edu )
# Same function as diphone-locator-for-TextGrids9.praat
# but with Greedy Algorithm
##################################################################################
#################### BASIC ALGORITHM #############################################
# Given 400 TextGrid files with a diphone tier, this script makes a list of
# filename-diphoneTypeCount, remembering all the diphoneTypes and their
# total count for each file, gets the name of the file that has the highest
# diphoneTypeCount, marks all the diphones for that file (moving the file to a
# separate 'done' folder), removes all those diphones from the other 399 files
# (deleting the old 399 files and writing the updated files to the same directory),
# and repeats these steps until no diphoneTypes remain.
#
# This entails, for each TextGrid file, that the script
# (1) remembers all distinct diphoneTypes, storing them in an array variable
#     so that they can be removed from the rest of the files in the loop
# (2) remembers the number of those diphoneTypes so that the files can be
#     compared. This comparison is done to choose
#     the file with the most diphoneTypes.
##################################################################################

form Specify forms and files
    word inFolder lab.TextGrid.after.LTS.scheme.diphone.textgrid
    natural diphoneTier 5
    word inFileExt_(with_dot) .TextGrid.prosodic.diphone
    word doneFolder_(to_be_created) marked.diphones.GREEDY.ALGORITHM
    word diphoneCount_(to_be_created) DIPHONE.COUNT
endform

# Make a list of all inFolder files
Create Strings as file list... fileList 'inFolder$'\*'inFileExt$'
numFiles = Get number of strings
pause 'numFiles' files identified. Continue?
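
# ---------------------------------------------------------------------------------
# WORKED EXAMPLE of the greedy selection described under BASIC ALGORITHM above.
# The three files and their diphone labels are hypothetical, chosen only to
# illustrate the passes; they are not taken from the actual data set.
#
#   Start:   A = {a-b, b-c, c-d}   B = {a-b, b-c}   C = {c-d, d-e}
#   Pass 1:  A has the most types (3), so A goes to the 'done' folder and
#            its types are removed elsewhere:   B = {}   C = {d-e}
#   Pass 2:  C has the most types (1), so C goes to the 'done' folder;
#            B has nothing left to remove:   B = {}
#   Pass 3:  no diphoneTypes remain, so the loop stops.
#
# Result: A and C together cover all four types (a-b, b-c, c-d, d-e);
# B is never selected.
# ---------------------------------------------------------------------------------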

#################################################
# FIND THE FILE WITH THE MOST DIPHONE TYPE COUNT
#################################################

# Initialize the mostTotalNumDiphoneTypes to zero
mostTotalNumDiphoneTypes = 0

# Loop through each file
for iFile to numFiles

    # Read a file
    select Strings fileList
    fileName$ = Get string... iFile
    Read from file... 'inFolder$'\'fileName$'
    Rename... inFile

    # Check the number of intervals for the diphone tier and set the total number
    # (the first and last intervals are excluded from the count)
    numIntervals = Get number of intervals... diphoneTier
    totalNumDiphoneTypes = numIntervals - 2

    # Discount the empty intervals from totalNumDiphoneTypes
    for iIntervalNoText from 2 to (numIntervals-1)
        intervalTextNoText$ = Get label of interval... diphoneTier iIntervalNoText
        lenNoText = length(intervalTextNoText$)
        if lenNoText = 0
            totalNumDiphoneTypes = totalNumDiphoneTypes - 1
        endif
    endfor

    # Loop through each interval
    for iInterval from 2 to (numIntervals-1)
        intervalText$ = Get label of interval... diphoneTier iInterval
        arrayIntervalText'iInterval'$ = intervalText$

        # Do the comparison with the earlier interval texts if iInterval > 2.
        # Empty intervals are skipped here because they were already discounted above.
        if iInterval > 2 and intervalText$ <> ""

            # Compare the current with all earlier arrayIntervalText$ variables
            isDuplicate = 0
            for iCompare from 2 to (iInterval-1)
                if arrayIntervalText'iInterval'$ = arrayIntervalText'iCompare'$
                    isDuplicate = 1
                endif
            endfor

            # If the current is the same as any earlier one, then set the
            # current interval text to nothing and reduce the number of
            # total diphone types by one (once per interval, however many
            # earlier intervals it matches)
            if isDuplicate
                Set interval text... diphoneTier iInterval
                totalNumDiphoneTypes = totalNumDiphoneTypes - 1
            endif
        endif
    endfor

    # Delete the old file and write the new one
    filedelete 'inFolder$'\'fileName$'
    Write to text file... 'inFolder$'\'fileName$'
    select TextGrid inFile
    Remove

    # Compare the current totalNumDiphoneTypes with the previous mostTotalNumDiphoneTypes
    # and set the filename and number of diphone types for that file.
    # The running maximum must be updated too; otherwise every file with a
    # nonzero count would overwrite the previous choice.
    if totalNumDiphoneTypes > mostTotalNumDiphoneTypes
        mostTotalNumDiphoneTypes = totalNumDiphoneTypes
        mostDiphoneTypes = totalNumDiphoneTypes
        mostDiphoneTypesFileName$ = fileName$
    endif
endfor

###############################################
# MOVE THE FILE TO DONE FOLDER and
# REMOVE THE DIPHONE TYPES FROM THE OTHER FILES
###############################################

# Now that we have the filename and the number of diphone types for the file
# that has the most diphone types, move that file to the 'done' folder