ACL-08:HLT Tutorials

Morning
    • Introduction to Computational Advertising
    • Building Practical Spoken Dialog Systems
    • Semi-supervised Learning for Natural Language Processing

Afternoon
    • Advanced Online Learning for Natural Language Processing
    • Speech Technology from Research to Industry
    • Interactive Visualization for Computational Linguistics

Tutorial Schedule

9:00-10:30 Morning tutorial part 1
10:30-11:00 Morning break
11:00-12:30 Morning tutorial part 2
2:00-3:30 Afternoon tutorial part 1
3:30-4:00 Afternoon break
4:00-5:30 Afternoon tutorial part 2

Introduction to Computational Advertising

(Evgeniy Gabrilovich, Vanja Josifovski, and Bo Pang)

Short abstract:

Web advertising is the primary driving force behind many Web activities, including Internet search as well as publishing of online content by third-party providers. A new discipline - Computational Advertising - has recently emerged, which studies the process of advertising on the Internet from a variety of angles. A successful advertising campaign should be relevant to the user's immediate information need as well as, more generally, to the user's background and personalized interest profile; it should be economically worthwhile to the advertiser and the intermediaries (e.g., the search engine); and it should be aesthetically pleasant and not detrimental to user experience.

The tutorial does not assume any prior knowledge of Web advertising, and will begin with a comprehensive background survey of the topic. We then focus on one important aspect of online advertising, namely contextual relevance. In most cases the context of user actions is defined by a body of text, so the ad matching problem lends itself to many NLP methods. To a first approximation, the process of obtaining relevant ads can be reduced to conventional information retrieval, where one constructs a query that describes the user's context and then executes this query against a large inverted index of ads.

We show how to augment the standard information retrieval approach using query expansion and text classification techniques. We demonstrate how to employ a relevance feedback assumption and use Web search results retrieved by the query, a step that allows one to use the Web as a repository of relevant query-specific knowledge. We also go beyond conventional bag-of-words indexing and construct additional features using a large external taxonomy and a lexicon of named entities obtained by analyzing the entire Web as a corpus.

Computational advertising poses numerous challenges and open research problems in text summarization, natural language generation, named entity extraction, computer-human interaction, and other areas. The last part of the tutorial will be devoted to recent research results as well as open problems, such as automatically classifying cases when no ads should be shown, handling geographic names, context modeling for vertical portals, and using natural language generation to automatically create advertising campaigns.
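As a rough illustration of the "first approximation" mentioned in the abstract, the sketch below builds a tiny inverted index over ads and scores them against a page's text with a plain TF-IDF sum. The ad texts, the tokenizer, and the function names (`build_index`, `match_ads`) are all hypothetical and chosen only for illustration; this is not the presenters' system, and it omits the query expansion, classification, and taxonomy features the tutorial covers.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (a crude tokenizer)."""
    return "".join(c.lower() if c.isalnum() else " " for c in text).split()


def build_index(ads):
    """Inverted index: term -> {ad_id: term frequency in that ad}."""
    index = defaultdict(dict)
    for ad_id, text in ads.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][ad_id] = tf
    return index


def match_ads(context, index, num_ads, top_k=3):
    """Treat the context text as a query and rank ads by a simple TF-IDF sum."""
    scores = defaultdict(float)
    for term, query_tf in Counter(tokenize(context)).items():
        postings = index.get(term, {})
        if not postings:
            continue  # term appears in no ad
        idf = math.log(1 + num_ads / len(postings))
        for ad_id, ad_tf in postings.items():
            scores[ad_id] += query_tf * ad_tf * idf
    # Highest score first; ties broken by ad id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Hypothetical ads, invented for this example.
ADS = {
    "ad_hotel": "cheap hotel deals in paris",
    "ad_flight": "low cost flights to paris and rome",
    "ad_shoes": "running shoes on sale",
}
INDEX = build_index(ADS)
ranked = match_ads(
    "Planning a weekend trip: flights and a hotel in Paris", INDEX, len(ADS)
)
```

Because no stopword list is used, function words such as "and" or "in" contribute to the score here; a realistic matcher would filter or down-weight them.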

Tutorial outline

Short biographical description of presenter(s)

Evgeniy Gabrilovich
Vanja Josifovski
Bo Pang

Affiliation: Yahoo! Research, Computational Advertising and Search Technology Group

Contact information:
2821 Mission College Blvd, Santa Clara, CA 95054
Fax: 408-349-2270

Evgeniy Gabrilovich is a Senior Research Scientist at Yahoo! Research. His research interests include information retrieval, machine learning, and computational linguistics. He serves on the program committees of ACL-08:HLT, AAAI '08, JCDL '08, CIKM '08 and WWW '08. In the past he served on the program committees of AAAI, EMNLP-CoNLL, and COLING-ACL, was a mentor at SIGIR '07, and reviewed papers for ACM TOIT, IP&M, JNLE, CACM, AAAI, AAMAS, WWW and CIKM. Evgeniy earned his MSc and PhD degrees in Computer Science from the Technion - Israel Institute of Technology.

Vanja Josifovski is a Principal Research Scientist at Yahoo! Research, where he works on search and advertisement technologies for the Internet. He is currently exploring designs for the next generation of ad placement platforms for contextual and search advertising. Previously, Vanja was a Research Staff Member at the IBM Almaden Research Center working on several projects in database runtime and optimization, federated databases, and enterprise. He earned his MSc degree from the University of Florida at Gainesville, and his PhD from Linköping University in Sweden. Vanja has published over thirty peer-reviewed papers, authored around 20 patent applications, and served on the program committees of WWW, SIGIR, ICDE, VLDB and other major conferences in the database, information retrieval, and search areas.

Bo Pang is a Research Scientist at Yahoo! Research. Her primary research interests are in natural language processing, machine learning, and information retrieval. She obtained her PhD in Computer Science from Cornell University, where she worked on automatic analysis of sentiment in text and paraphrase extraction and generation in the context of machine translation. She has served on the program committees of ACL, HLT-NAACL, EMNLP, and AAAI, and reviewed for journals including ACM TOIS, JMLR, JAIR, Computer Speech and Language, and Computational Linguistics.


Building Practical Spoken Dialog Systems

(Antoine Raux, Brian Langner, Maxine Eskenazi, Alan Black)

Abstract:

This tutorial will give a practical description of the free software Carnegie Mellon Olympus 2 Spoken Dialog Architecture. Building real working dialog systems that are robust enough for the general public to use is difficult. Most frequently, the functionality of the conversations is severely limited - down to simple question-answer pairs. While off-the-shelf toolkits help the development of such simple systems, they do not support more advanced, natural dialogs, nor do they offer the transparency and flexibility required by computational linguistics researchers. Olympus 2, by contrast, offers a complete dialog system with automatic speech recognition (Sphinx) and synthesis (SAPI, Festival), and has been used, along with previous versions of Olympus, for teaching and research at Carnegie Mellon and elsewhere for some five years. Overall, a dozen dialog systems have been built using various versions of Olympus, handling tasks ranging from providing bus schedule information, to guidance through maintenance procedures for complex machinery, to personal calendar management. In addition to simplifying the development of dialog systems, Olympus provides a transparent platform for teaching and conducting research on all aspects of dialog systems, including speech recognition and synthesis, natural language understanding and generation, and dialog and interaction management.

The tutorial will give a brief introduction to spoken dialog systems before going into detail about how to create your own dialog system within Olympus 2, using the Let's Go bus information system as an example. Further, we will provide guidelines on how to use an actual deployed spoken dialog system such as Let's Go to validate research results in the real world. As a possible testbed for such research, we will describe Let's Go Lab, which provides access to both the Let's Go system and its genuine user population for research experiments.

Attendees will receive a CD with the latest version of the Olympus 2 architecture, along with several tutorials and example systems.

Tutorial Outline:

Presenter Bios:

Antoine Raux
Language Technologies Institute
Carnegie Mellon University
website: http://www.cs.cmu.edu/~antoine/

Antoine Raux is a PhD student at the Language Technologies Institute at Carnegie Mellon University. He has conducted research and published more than 15 reviewed papers on several aspects of dialog systems, including speech recognition, speech synthesis, dialog and interaction management, and system building. His teaching experience includes two teaching assistantships in natural language-related graduate courses, as well as the ongoing design of online tutorials for the Olympus architecture.

Brian Langner
Language Technologies Institute
Carnegie Mellon University
website: http://www.cs.cmu.edu/~blangner/

Brian Langner is a PhD student at the Language Technologies Institute at Carnegie Mellon University. He has conducted research and published more than 12 reviewed papers on speech synthesis, natural language generation, and spoken dialog systems. He has six semesters of experience as a teaching assistant for graduate and undergraduate computing- or natural language-related courses, including some course design, in addition to continuing work on the Olympus architecture tutorials.

Dr. Alan W Black
Language Technologies Institute
Carnegie Mellon University
website: http://www.cs.cmu.edu/~awb/

Alan W Black is an Associate Research Professor in the Language Technologies Institute at Carnegie Mellon University. He previously worked at the University of Edinburgh, and before that at ATR in Japan. He received his PhD in Computational Linguistics from Edinburgh University in 1993. He is one of the principal authors of the Festival Speech Synthesis System. In addition to speech synthesis, he also works on two-way speech-to-speech translation systems and telephone-based spoken dialog systems. He has served on the IEEE Speech Technical Committee (2003-2006), is on the editorial board of Speech Communication, and is a board member of ISCA. He teaches a number of graduate and undergraduate courses and has taught a number of short-term tutorials on speech synthesis, speech technology, and rapid support for new languages.

Dr. Maxine Eskenazi
Language Technologies Institute
Carnegie Mellon University
website: http://www.cs.cmu.edu/~max/

Maxine Eskenazi is on the faculty of the Language Technologies Institute at Carnegie Mellon University. She has a BA from Carnegie Mellon University in French and Education and a Thèse de Troisième Cycle from the Université de Paris 11 in Computer Science. She has extensive publications on the use of automatic speech processing for spoken dialog systems and on the use of language technologies for computer-assisted language learning. She is the Principal Investigator on the NSF Let's Go project.


Semi-supervised Learning for Natural Language Processing

(John Blitzer and Xiaojin (Jerry) Zhu)

Statistical natural language processing tools are being applied to an ever wider and more varied range of linguistic data. Researchers and engineers are using statistical models to organize and understand financial news, legal documents, biomedical abstracts, and weblog entries, among many other types of data. Creating high-coverage, accurate labeled datasets for so many different types of data can be prohibitively expensive, but for many tasks we have large amounts of unlabeled data that we can exploit.

This tutorial covers semi-supervised learning for natural language processing. Semi-supervised learning methods use a large amount of unlabeled data and a small amount of labeled data to estimate a statistical model [1]. Our emphasis is on practical application, and we will treat semi-supervised learning methods as tools for building effective models from limited training data. An attendee will leave our tutorial with basic knowledge of the general classes of semi-supervised learning, as well as the ability to decide which class will be useful in her research and where to find detailed information on several methods within each class. We will cover three main classes: self-training, graph-based methods, and structural learning. From each general class we will choose one specific model (or two, in the case of self-training) to cover in detail, with a demo and a detailed discussion of known success and failure cases. There will also be a high-level description of several other methods within each class.

References: [1] Xiaojin (Jerry) Zhu. Semi-supervised Learning Literature Survey. Technical Report 1530, Computer Sciences, University of Wisconsin-Madison 2005.
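To make the first of the three classes concrete, here is a minimal sketch of self-training: a model is fit on the small labeled set, its most confident predictions on unlabeled points are added as if they were gold labels, and the process repeats. The nearest-centroid classifier, the margin-based confidence measure, the toy 2-D points, and the parameter names (`batch_size`, `max_rounds`) are all assumptions made for this illustration, not methods taken from the tutorial.

```python
def centroids(points, labels):
    """Mean 2-D vector of each class."""
    sums, counts = {}, {}
    for (x, y), lab in zip(points, labels):
        sx, sy = sums.get(lab, (0.0, 0.0))
        sums[lab] = (sx + x, sy + y)
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: (sx / counts[lab], sy / counts[lab])
            for lab, (sx, sy) in sums.items()}


def predict(cents, point):
    """Return (label, margin): the nearest centroid's label and how much
    closer it is than the runner-up (used here as a confidence score)."""
    dists = sorted(
        (((point[0] - cx) ** 2 + (point[1] - cy) ** 2) ** 0.5, lab)
        for lab, (cx, cy) in cents.items()
    )
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin


def self_train(labeled, unlabeled, batch_size=2, max_rounds=10):
    """Repeatedly label the most confident unlabeled points and retrain."""
    points = [p for p, _ in labeled]
    labels = [lab for _, lab in labeled]
    pool = list(unlabeled)
    for _ in range(max_rounds):
        if not pool:
            break
        cents = centroids(points, labels)
        scored = sorted(((predict(cents, p), p) for p in pool),
                        key=lambda item: -item[0][1])
        for (lab, _), p in scored[:batch_size]:
            points.append(p)
            labels.append(lab)  # the model's own guess, treated as a label
            pool.remove(p)
    return centroids(points, labels)


# Toy data: two labeled points and six unlabeled ones (invented).
LABELED = [((0.0, 0.0), "neg"), ((4.0, 4.0), "pos")]
UNLABELED = [(0.5, 0.2), (1.0, 1.0), (3.0, 3.5),
             (4.2, 3.8), (1.5, 1.2), (2.8, 3.0)]
MODEL = self_train(LABELED, UNLABELED)
```

Since the model's own guesses are never revisited in this sketch, an early mistake can reinforce itself, which is one of the failure cases usually associated with self-training.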

Outline:

  1. Introduction and overview:
  2. Self-training:
    • Overview:
    • Co-training:
    • Prototype-driven learning:
  3. Graph methods:
  4. Structural learning:
  5. Wrapup and pointers to external references:

Biographical information:

John Blitzer
3330 Walnut Street
Philadelphia, PA 19104

John Blitzer is currently a PhD student in computer science under Fernando Pereira at the University of Pennsylvania. Beginning in February 2008, he will be a visiting researcher at Microsoft Research Labs Asia, and in August 2008, he will start a position as a postdoctoral fellow under Dan Klein at the University of California, Berkeley. John's research area is machine learning for natural language processing, with a primary focus on unsupervised dimensionality reduction of text. Recently, he has worked on empirical and theoretical analyses of structural learning for semi-supervised domain adaptation. He has been a teaching assistant for courses in cognitive science and numerical linear algebra at the University of Pennsylvania.

Xiaojin Zhu
University of Wisconsin, Madison
1210 West Dayton Street
Madison, WI 53706-1685

Xiaojin Zhu is an Assistant Professor in Computer Sciences at the University of Wisconsin, Madison. His research interests are statistical machine learning (in particular semi-supervised learning) and its applications to natural language analysis. He received a Ph.D. in Language Technologies from CMU in 2005, with thesis research on graph-based semi-supervised learning. His current research projects aim at bridging the different approaches in semi-supervised learning and making them more effective for practitioners. He has taught several graduate and undergraduate courses in AI, machine learning and NLP at the University of Wisconsin, Madison.


Advanced Online Learning for Natural Language Processing

(Koby Crammer)

Most research in machine learning has been focused on binary classification, in which the learned classifier outputs one of two possible answers. Important fundamental questions can be analyzed in terms of binary classification, but real-world natural language processing problems often involve richer output spaces. In this tutorial, we will focus on classifiers with a large number of possible outputs with interesting structure. Notable examples include information retrieval, part-of-speech tagging, NP chunking, parsing, entity extraction, and phoneme recognition.

Our algorithmic framework will be that of online learning, for several reasons. First, online algorithms are in general conceptually simple and easy to implement. In particular, online algorithms process one example at a time and thus require little working memory. Second, our example applications have all been treated successfully using online algorithms. Third, the analysis of online algorithms uses simpler mathematical tools than other types of algorithms. Fourth, the online learning framework provides a very general setting that can be applied to a broad range of problems, where the only machinery assumed is the ability to perform exact inference, which computes a maximum over some score function.
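The online setting described above can be sketched with a plain multiclass perceptron: examples are processed one at a time, the only inference required is an argmax over a score function, and the weights change only when the prediction is wrong. This is a generic textbook algorithm used here for illustration, not necessarily one of the algorithms presented in the tutorial; the toy tagging features and labels are invented.

```python
from collections import defaultdict


def score(weights, features, label):
    """Linear score of a candidate label for the given feature values."""
    return sum(weights[(label, f)] * v for f, v in features.items())


def predict(weights, features, labels):
    """Exact inference: return the label that maximizes the score."""
    return max(labels, key=lambda y: score(weights, features, y))


def train_perceptron(stream, labels, epochs=5):
    """Process one example at a time; update only on mistakes."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for features, gold in stream:
            guess = predict(weights, features, labels)
            if guess != gold:
                for f, v in features.items():
                    weights[(gold, f)] += v   # promote the correct label
                    weights[(guess, f)] -= v  # demote the wrong prediction
    return weights


# Invented toy examples: sparse features of a word in context -> tag.
LABELS = ["NOUN", "VERB", "ADV"]
DATA = [
    ({"suffix=ing": 1, "bias": 1}, "VERB"),
    ({"suffix=ly": 1, "bias": 1}, "ADV"),
    ({"prev=the": 1, "bias": 1}, "NOUN"),
    ({"suffix=ing": 1, "prev=is": 1, "bias": 1}, "VERB"),
    ({"suffix=ly": 1, "prev=very": 1, "bias": 1}, "ADV"),
    ({"prev=the": 1, "suffix=s": 1, "bias": 1}, "NOUN"),
]
WEIGHTS = train_perceptron(DATA, LABELS)
```

For structured outputs such as tag sequences or parse trees, `predict` would be replaced by a search procedure (for example, dynamic programming) over the output space, while the update rule keeps the same shape.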

The goals of the tutorial:

  1. To provide the audience with systematic methods to design, analyze, and efficiently implement learning algorithms for their specific complex-output problems: from simple binary classification through multi-class categorization to information extraction, parsing and speech recognition.
  2. To introduce new online algorithms which provide state-of-the-art performance in practice backed by interesting theoretical guarantees.

Theory and Algorithms

Implementation and Practice

Koby Crammer is a research associate at the University of Pennsylvania (PhD Hebrew University). His research focuses on the design, analysis and implementation of machine learning algorithms for complex prediction problems, and applying them for various natural language processing tasks and other structured problems.


Interactive Visualization for Computational Linguistics

(Christopher Collins, Gerald Penn, and Sheelagh Carpendale)

Interactive information visualization is an emerging and powerful research technique for understanding models of language, and their abstract representations. Much of what computational linguists fall back upon to improve natural language processing and to model language "understanding" is structure that has, at best, only an indirect attestation in observable data. An important part of research progress depends on our ability to fully investigate, explain, and explore these structures, both empirically and as outcomes of grammar design relative to accepted linguistic theory. The sheer complexity of these abstract structures, and the observable patterns on which they are based, however, usually limits their accessibility, often even to the researchers creating or attempting to learn them.

To aid in understanding, visual 'externalizations' are used in CL for presentation and explanation - traditional statistical graphs and custom-designed data illustrations fill the pages of ACL papers. Such visualizations do provide insight into the representations and algorithms designed by researchers, but visualization can also be used as an aid in the process of research itself. In fact, there are special statistical methods, falling under the rubric of "exploratory data analysis", and visualization techniques designed just for this purpose, but these are not widely used or even known in CL. These novel data visualization techniques, which we have used successfully in the CL domain, offer the potential for creating new methods that reveal structure and detail in data. Instructed by a team of computational linguists and information visualization researchers, this tutorial will bridge computational linguistic and information visualization expertise, providing attendees with a basis from which they can begin to accelerate their own research.

Tutorial Objectives:

This tutorial will equip participants with:

Tutorial Outline:

  1. Introduction
  2. Information Visualization Theory
    • Representational theory, cognitive psychology, preattentive processing
    • Interaction & animation
    • Assessing and validating visualization
      i. Evaluation challenges
      ii. Measuring insight
      iii. Metrics for evaluation
      iv. Heuristic approaches to evaluation
  3. Review of Linguistic Visualization
    • Document content visualizations
    • Text collection analysis
    • Literary analysis
    • Streaming data visualization
    • Convergence of language and other data
    • Corpora exploration
    • Visualization of statistical NLP outputs
    • Linguistic analysis
    • Visualization of non-textual linguistic data
  4. Tools for Visualization
    • Software solutions
    • Programming toolkits
    • Online tools
    • Collaborative visualization tools in development
  5. Case Studies in Linguistic Visualization
  6. Open Research Problems
  7. Closing

Tutorial Instructors

Christopher Collins
PhD Candidate, University of Toronto Computer Science

Christopher Collins received his M.Sc. in the area of Computational Linguistics from University of Toronto in 2004. His PhD research focus is inter-disciplinary, combining computational linguistics and information visualization. He is currently in his final year of PhD studies, investigating interactive visualizations of linguistic data with a focus on convergence and coordination of multiple views of data to provide enhanced insight. He has developed various methods for generating, reading, and comparing visual summaries of document thematic content for everyday users and data analysts. Recent publications include a new method for revealing relationships amongst visualizations, and a system for exposing the uncertainty in statistical natural language systems. He recently embarked on a study of visualization use in a team of machine translation researchers and plans to continue collaboration with language engineers to provide them with an enhanced ability to analyse and improve their algorithms.

Gerald Penn
Associate Professor, University of Toronto Computer Science

Gerald Penn's research interests are in computational linguistics, theoretical computer science, programming languages, spoken language processing, and human-computer interaction. He is probably best known as the co-designer and maintainer of the ALE programming language, and has published widely on topics pertaining to logics and discrete algorithms for natural language processing applications. He is a member of the advisory board to Computational Linguistics and the editorial board of Linguistics & Philosophy, and is a past president of the ACL Mathematics of Language Society.

Sheelagh Carpendale
Associate Professor, University of Calgary Computer Science
Canada Research Chair: Information Visualization

Sheelagh Carpendale holds a Canada Research Chair in Information Visualization and an NSERC/SMART/iCORE Industrial Research Chair in Interactive Technologies at the University of Calgary. She is the recipient of several major awards including the British Academy of Film and Television Arts Award (BAFTA) for Off-line Learning, and has been involved with successful technology transfer to Idelix Software Inc. Her research focuses on the visualization, exploration and manipulation of information. Current research includes: visualizing uncertainty particularly in medical data, visualizing biological data, developing visualizations to support computational linguistic research and the development of methodologies to support collaborative data analysis with visualization. Sheelagh Carpendale's research in information visualization and interaction design draws on her dual background in Computer Science (Ph.D. Simon Fraser University) and Visual Arts (Sheridan College, School of Design and Emily Carr, College of Art).

Speech Technology from Research to Industry

Roberto Pieraccini
CTO, SpeechCycle, Inc.
26 Broadway, 11th floor
New York, NY 10004
Tel.: (646) 792 2744

This tutorial is about the evolution of speech technology from research to a mature industry. Today, spoken language communication with computers is becoming part of everyday life. Thousands of interactive applications using spoken language technology - known also as "conversational machines" - are only a phone call away, allowing millions of users each day to access information, perform transactions, and get help. Speech recognition, language understanding, text-to-speech synthesis, machine learning, and dialog management enabled this revolution after more than 50 years of research. The speech industry continues to mature with its evolving standards, platforms, architectures, and business models within different sectors of the market. In this tutorial I will briefly trace the history of speech technology, with a special focus on speech recognition and spoken language understanding, from the early attempts to today's commercial deployments. I will briefly describe the most successful ideas and algorithms that led to today's technology. I will discuss the struggle for ever-increasing performance, the importance of data for training and evaluation, and the role played by government-funded projects in creating effective evaluation benchmarks. I will then describe the birth of the speech industry in the mid 1990s, with the role played by the Voice User Interface and dialog engineering disciplines in bringing speech recognition from a laboratory "accuracy challenge" to an enabler of usable interfaces. I will describe the rise of standards (such as VoiceXML, SRGS, SSML, etc.) and their importance in the growth of the market. I will proceed with an overview of the current architectures and processes utilized for creating commercial spoken dialog systems, and will provide several case studies of the use of speech technology. I will conclude with a discussion of the current open problems and challenges.

The tutorial will last about 3 hours, with a short break. Several audio and video samples will be shown during the tutorial. The tutorial is aimed at a general HLT audience with no prior knowledge of speech technology.

Tutorial Outline

Roberto Pieraccini has spent more than 25 years in the area of speech and language technologies. He worked at research labs such as CSELT in Torino, Italy (the research center of the Italian telephone company in the 1980s), Bell Laboratories, AT&T Shannon Labs, and IBM T.J. Watson Research. He was director of R&D at SpeechWorks, one of the companies that had a major impact on the definition of the current speech technology market in the late 1990s and early 2000s. He is now the Chief Technology Officer of SpeechCycle, a company specializing in complex spoken language interaction systems for technical support and customer care. He is the author of more than 100 publications and book chapters. He is a senior member of IEEE, the current chair of the IEEE Speech and Language Technical Committee, and a member of the IEEE Signal Processing Society's Conference Board.

Tutorial Speaker Responsibilities

Accepted tutorial speakers will be notified by February 4, 2008, and must then provide abstracts of their tutorials for inclusion in the conference registration material by February 21, 2008. The description should be in two formats: an ASCII version that can be included in email announcements and published on the conference web site, and a PDF version for inclusion in the electronic proceedings (detailed instructions to follow). Tutorial speakers must provide tutorial materials, at least containing copies of the course slides as well as a bibliography for the material covered in the tutorial, by May 5, 2008.

Chairs:

Ani Nenkova (Univ of Pennsylvania, USA)
Marilyn Walker (Univ of Sheffield, UK)
Eugene Agichtein (Emory University, USA)

Please send inquiries concerning ACL-08 tutorials to .