Lightly Supervised Grammar Induction

Ling 8800 — Seminar in Computational Linguistics
Spring '14, TR 3:55–5:15, Derby Hall 0060
Instructor: Michael White
http://www.ling.ohio-state.edu/~mwhite/

Description

The past decade has witnessed exciting research on semantic parsing, where syntax is treated as a latent variable, and lexico-grammars are induced as part of the process of finding the most likely mappings between strings and meanings. More recently, work in this direction has become more lightly supervised, e.g. using question-answer pairs as even more indirect data for training, and has blended into work on grounded language learning. However, the grammars induced in this work have tended to be rather simplified and domain-specific in comparison to hand-crafted grammars hewing closer to linguistic theory. Meanwhile, approaches to engineering high-precision grammars have evolved to include techniques for customizing starter grammars for a new language using typological properties, and inducing lexico-syntactic resources from existing corpora in diverse formalisms, parallel corpora and interlinear glossed text. The inference techniques employed for inducing such resources, however, have remained less sophisticated than those used for grammar induction in semantic parsing, which suggests potential synergies in bringing these strands of research together.

In this seminar, we will read and discuss seminal and recent research on resource induction for semantic parsing and grammar engineering, with topics drawn from those listed below, as well as any related ones suggested by student input.

Expectations

Students will be expected to actively participate in the discussion and research carried out in the seminar. As detailed below, students will be required to facilitate discussions and post questions on the readings in advance, as well as locate relevant background/tutorial materials. Additionally, students taking the course for 3 credits will be required to carry out a class project on a topic related to the seminar; alternatively, for students already working on a related topic, integrating their focus into the seminar will be an option.

Prerequisites

Ling 5802 (formerly 684.02) or equivalent, or permission of the instructor.

Carmen

We'll use Carmen to schedule discussion facilitators and post advance questions on the readings, as well as links to background/tutorial materials. We'll also use the dropbox for project documents.

Requirements

Class participation (25%)

We are aiming for a dynamic discussion of papers, not death by PowerPoint. Thus, we plan on taking a page from Eric Fosler-Lussier's playbook, and requiring everyone (this includes you!) to post at least one question to the discussion list on Carmen by 8 p.m. the evening before each week's readings will be discussed. Participants should also feel free to share their (initial) thoughts and views of the papers in their posts. In particular, questions of the type "What did they mean by X" or "Why did they do X instead of Y" are encouraged. Remember that most of the papers are targeted to people who are already expert in the area, so you shouldn't expect to always understand everything. Airing such questions can help everyone gain a better understanding of the paper, even those who thought they understood it!

Additionally, for this year's seminar we are going to split each week's meetings into one session devoted to the primary readings and one session devoted to background/tutorial materials, which students will be responsible for locating and going over in class. Thus, the expected schedule for each week is as follows:

Facilitating discussions (25%)

Each week's sessions will have a discussion facilitator. For the main readings, the facilitator should look over the posted questions and choose a subset for discussion. In class, the facilitator should start the session with a brief, five to ten minute summary of the papers, including the highlights and lowlights. Following the opening summary, the facilitator is responsible for managing the discussion, and ensuring that as many viewpoints are heard as possible.

For the session on background/tutorial materials, the facilitator should come prepared to go over the materials that s/he found, as well as to determine when it would make sense to ask other participants to go through the materials they found. Note that participants other than the facilitator should therefore also come prepared to go over the background/tutorial materials they found, at least briefly.

Students will be required to facilitate one or two weeks' sessions. If the discussion does not take up the entire class period, the remaining time may be used to (informally) discuss class projects.

Term project (50%)

As noted above, students taking the course for 3 credits will be required to carry out a term project, either alone or in a team setting. A project sketch must be presented informally in class for brainstorming during the eighth week, followed by a presentation during the last week of class and a final report due by the day the final exam would be held (if there were one).

For students taking the course for 1 credit, no project will be required, and class participation and facilitating discussions will each count for half of the class requirements.

Topics

The topics and readings we expect to cover are listed below; these will be refined as the course progresses.

Semantic Parsing

Grounded Language Learning

Grammar Customization from Typological Properties

Treebank Conversion and Syntactic Projection

Policy on Academic Misconduct

As with any class at this university, students are required to follow the Ohio State Code of Student Conduct. In particular, note that students are not allowed to, among other things, submit plagiarized (copied but unacknowledged) work for credit. If any violation occurs, the instructor is required to report the violation to the Council on Academic Misconduct.

Students with Disabilities

Students who need an accommodation based on the impact of a disability should contact me to arrange an appointment as soon as possible to discuss the course format, to anticipate needs, and to explore potential accommodations. I rely on the Office for Disability Services for assistance in verifying the need for accommodations and developing accommodation strategies. Students who have not previously contacted the Office for Disability Services are encouraged to do so (292-3307; http://www.ods.ohio-state.edu).

Disclaimer

This syllabus is subject to change. All important changes will be made in writing (email), with ample time for adjustment.