Language Modeling
Using statistically and morphologically informed
techniques to reduce the out-of-vocabulary problem in Arabic.
Dissertation research. This work uses the templatic morphology of
Arabic to identify stems and affixes, infer the likely short
vowels, and predict the next word in a sequence. The resulting
language models are intended for eventual use in speech
recognition, especially across dialects.
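As a rough illustration of why morphological segmentation can reduce the out-of-vocabulary (OOV) rate, the sketch below compares the OOV rate over whole words with the rate over stem and affix units. It is not the dissertation's method: the affix lists, the greedy `segment` function, and the transliterated toy words are all hypothetical and chosen only to make the effect visible.

```python
# Hypothetical, far-from-complete affix lists in a Buckwalter-style
# transliteration; they are assumptions made for this illustration only.
PREFIXES = ("wAl", "Al", "w", "b", "l")
SUFFIXES = ("At", "wn", "yn", "p", "h")


def segment(word, min_stem=3):
    """Greedily strip at most one prefix and one suffix from a word,
    keeping a stem of at least min_stem characters."""
    prefix = next(
        (p for p in PREFIXES
         if word.startswith(p) and len(word) - len(p) >= min_stem),
        "",
    )
    rest = word[len(prefix):]
    suffix = next(
        (s for s in SUFFIXES
         if rest.endswith(s) and len(rest) - len(s) >= min_stem),
        "",
    )
    stem = rest[:len(rest) - len(suffix)] if suffix else rest
    units = []
    if prefix:
        units.append(prefix + "+")   # mark prefixes with a trailing "+"
    units.append(stem)
    if suffix:
        units.append("+" + suffix)   # mark suffixes with a leading "+"
    return units


def oov_rate(train_tokens, test_tokens):
    """Fraction of test tokens that never occur in the training tokens."""
    vocab = set(train_tokens)
    return sum(t not in vocab for t in test_tokens) / len(test_tokens)


# Toy data, invented for the example.
train = ["ktAb", "AlktAb", "wld", "mdrsp"]
test = ["wAlktAb", "ktAbh", "Alwld", "mdrsp"]

word_oov = oov_rate(train, test)
unit_oov = oov_rate(
    [u for w in train for u in segment(w)],
    [u for w in test for u in segment(w)],
)
print(f"word-level OOV: {word_oov:.2f}, unit-level OOV: {unit_oov:.2f}")
```

On this toy data most unseen test words break down into units that were seen in training, so the unit-level rate is lower. A real system would need a far more careful segmenter than greedy affix stripping.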
Automatic Speech Recognition I.
Automatic Speech Recognition II.
Automatic Speech Recognition III.
Combining multiple types of features in a Conditional Random
Field (CRF) phone recognition system. The features are derived
from support vector machines, neural networks, and time-delay
neural networks (TDNNs), and are a mix of binary and n-ary
features. CRFs handle redundant features well, which makes them
a good choice for this study. Which features complement each
other, and which are merely redundant? Part of the ASAT project;
research performed in conjunction with Georgia Tech and Rutgers
University. Presented at Interspeech 2007.
Varied Techniques in Arabic
Information Retrieval.
Which stemming methods and retrieval algorithms, and in what
combination, work best for Information Retrieval, considering the
specific morphological characteristics of Arabic? How can complex
morphological information be better incorporated into a stemming
model, and how does that interact with existing retrieval models?
Systematicity in the Arabic
Lexicon.
Using information-theoretic
techniques to predict the epenthetic vowel.
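One information-theoretic quantity that could be used for a question like this is conditional entropy: how much uncertainty about the vowel remains once some context is known. The sketch below is an assumption about the general approach, not a description of the study itself; the context classes "A" and "B" and all the counts are invented placeholders, not real Arabic data.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of the distribution given by a Counter."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def conditional_entropy(pairs):
    """H(vowel | context) in bits for a list of (context, vowel) pairs."""
    by_context = {}
    for context, vowel in pairs:
        by_context.setdefault(context, Counter())[vowel] += 1
    total = len(pairs)
    # Weight each context's vowel entropy by how often the context occurs.
    return sum(
        sum(c.values()) / total * entropy(c) for c in by_context.values()
    )


# Invented observations: (hypothetical context class, epenthetic vowel).
pairs = [
    ("A", "i"), ("A", "i"), ("A", "u"), ("A", "i"),
    ("B", "u"), ("B", "u"), ("B", "u"), ("B", "u"),
]

h_vowel = entropy(Counter(vowel for _, vowel in pairs))
h_given_context = conditional_entropy(pairs)
print(f"H(vowel) = {h_vowel:.3f} bits")
print(f"H(vowel | context) = {h_given_context:.3f} bits")
```

If the conditional entropy is clearly lower than the unconditional entropy, the context carries information about the vowel, which is one way to quantify systematicity.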
Ordering Sentences According to
Topicality.
A short study in Natural Language Generation. Can latent semantic
analysis (LSA) be used to determine the correct order of the
sentences in a short, single-topic article? (Answer: not very well,
at least not LSA alone.) Presented at the Midwest
Computational Linguistics Colloquium in 2006.