Language Modeling
Using statistically and morphologically informed
techniques to reduce the out-of-vocabulary problem in Arabic.
Dissertation research. Use the templatic morphology of Arabic
to figure out where stems and affixes are, what the short
vowels might be, and how to predict the next word in a
sequence. Language modeling for eventual use in speech
recognition, especially cross-dialectal.
- Accepted for publication in the LREC 2008 workshop, HLT in
The Arabic World: Arabic Language and local languages
processing: Status Updates and Prospects. Language Modeling for
Local and Modern Standard Arabic
- Poster presentation at the Student Research Workshop, ACL
2008. Arabic Language Modeling with Finite State Transducers
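A toy illustration of why segmenting words into stems and affixes can lower the out-of-vocabulary rate. This is only a sketch under loud assumptions: the romanized words, the affix lists, and the greedy `segment` function are all hypothetical, and simple affix stripping stands in for the templatic, finite-state analysis described above.

```python
# Hypothetical romanized affixes; not a real Arabic affix inventory.
PREFIXES = ("wa", "al", "bi")
SUFFIXES = ("ha", "at", "un")


def segment(word):
    """Greedily strip at most one known prefix and one known suffix."""
    parts = []
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= 3:
            parts.append(p + "+")
            word = word[len(p):]
            break
    suffix = None
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= 3:
            suffix = "+" + s
            word = word[:-len(s)]
            break
    parts.append(word)
    if suffix:
        parts.append(suffix)
    return parts


def segment_all(words):
    """Flatten a list of words into a list of morpheme tokens."""
    return [m for w in words for m in segment(w)]


def oov_rate(train_tokens, test_tokens):
    """Fraction of test tokens that never occur in the training tokens."""
    vocab = set(train_tokens)
    unseen = sum(1 for t in test_tokens if t not in vocab)
    return unseen / len(test_tokens)


# Invented toy data (assumption): inflected forms of two stems.
train_words = ["alkitab", "wakitab", "kitabha", "aldars", "darsun"]
test_words = ["wadars", "alkitabha", "kitabun", "bidars"]

word_oov = oov_rate(train_words, test_words)
morph_oov = oov_rate(segment_all(train_words), segment_all(test_words))
print(f"word-level OOV: {word_oov:.2f}, morpheme-level OOV: {morph_oov:.2f}")
```

On this toy data every test word is unseen at the word level, while after segmentation only one of the nine morpheme tokens is new. A language model over morphemes would still need to recombine them into words, which this sketch does not attempt.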
Modeling Phonological Category Acquisition
How do children learn the difference between
phonological categories like [t] and [d]? How does parental
or societal feedback affect that process? How does the
process differ cross-linguistically? How do adults adapt to
children's speech? How does this compare to category
formation in second language acquisition, if at all? How can
we use this information to do a better job of teaching second
language learners, or hearing- or learning-impaired children?
- Using self-organizing maps to implement our ideas about how
learning and feedback interact. Initial work on /s/ vs. /sh/
vs. /c}/ data presented at MCWOP 2008.
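A minimal sketch of the self-organizing map idea, under stated assumptions: the inputs are hypothetical one-dimensional acoustic measurements (say, a spectral peak in kHz) for two fricative categories, and the feedback component of the model is left out entirely. The function names and parameters are illustrative, not the project's code.

```python
import math


def best_matching_unit(weights, x):
    """Index of the map unit whose weight is closest to the input x."""
    return min(range(len(weights)), key=lambda j: abs(weights[j] - x))


def train_som(samples, n_units=4, epochs=20, lr0=0.5, sigma0=1.0):
    """Train a 1-D self-organizing map on scalar samples."""
    lo, hi = min(samples), max(samples)
    # Spread the initial weights evenly across the data range.
    weights = [lo + (hi - lo) * j / (n_units - 1) for j in range(n_units)]
    total = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        for x in samples:
            frac = 1.0 - step / total          # decays from 1 toward 0
            lr = lr0 * frac                    # learning rate shrinks over time
            sigma = max(sigma0 * frac, 0.1)    # neighborhood width stays positive
            winner = best_matching_unit(weights, x)
            for j in range(n_units):
                # Units near the winner on the map move more than distant ones.
                h = math.exp(-((j - winner) ** 2) / (2 * sigma ** 2))
                weights[j] += lr * h * (x - weights[j])
            step += 1
    return weights


# Invented measurements (assumption): two categories, presented alternately.
samples = [4.0, 7.0, 3.8, 7.2, 4.2, 6.8]
weights = train_som(samples)
low_unit = best_matching_unit(weights, 4.0)
high_unit = best_matching_unit(weights, 7.0)
print(weights, low_unit, high_unit)
```

After training, inputs from the two clusters are claimed by different map units, which is the sense in which the map has formed two categories. Adding a feedback signal would mean changing how `lr` or the update target depends on a caregiver's response, which is left open here.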
Automatic Speech Recognition I.
Automatic Speech Recognition II.
Automatic Speech Recognition III.
Combining multiple types of features into a
Conditional Random Fields phone recognition system. Use
features derived from support vector machines, neural
networks, and TDNNs. Mix of binary and n-ary features. CRFs
handle redundant features well, which makes them a good choice
for this study. Which features complement each other?
Which are only redundant? Part of the ASAT project, research
performed in conjunction with Georgia Tech and Rutgers University.
- Presented at Interspeech 2007: Detection-based
ASR in Automatic Speech Attribute Transcription Project
Varied Techniques in Arabic Information Retrieval.
Which stemming methods and retrieval algorithms, and in what
combination, work best for Information Retrieval, considering the
specific morphological characteristics of Arabic? How can complex
morphological information be better incorporated into a stemming
model, and how does that interact with existing retrieval models?
Systematicity in the Arabic Lexicon.
Using information theoretic
techniques to predict the epenthetic vowel.
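One way to make "information theoretic" concrete is conditional entropy: how much uncertainty about the vowel remains once some context is known. The sketch below assumes that measure (the description above does not say which one was used), and the (context, vowel) pairs are invented purely for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(counts):
    """Shannon entropy in bits of a Counter of outcomes."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c)


def conditional_entropy(pairs):
    """H(vowel | context) in bits for a list of (context, vowel) pairs."""
    by_context = defaultdict(Counter)
    for context, vowel in pairs:
        by_context[context][vowel] += 1
    n = len(pairs)
    # Weight each context's vowel entropy by how often the context occurs.
    return sum(sum(c.values()) / n * entropy(c) for c in by_context.values())


# Invented observations (assumption): context "x" always takes "i",
# while context "y" splits evenly between "a" and "u".
pairs = [("x", "i")] * 4 + [("y", "a")] * 2 + [("y", "u")] * 2

h_vowel = entropy(Counter(v for _, v in pairs))
h_given_context = conditional_entropy(pairs)
print(f"H(V) = {h_vowel} bits, H(V|C) = {h_given_context} bits")
```

The difference between the two values is the information the context carries about the vowel; a context that makes the vowel fully predictable would drive the conditional entropy to zero.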
Ordering Sentences According to Topicality.
A short study in Natural Language Generation. Can LSA (latent
semantic analysis) be used to determine the correct order of the
sentences in a short, single-topic article? (Answer: not very well,
at least not LSA alone.) Presented at the Midwest Computational
Linguistics Colloquium in 2006.