ACL 2010
July 11-16

Tutorials

Sunday July 11: Venue A, Uppsala University Main Building
  Time          Hall X    Hall IX   Hall IV
  09:00-10:30   T5        T3        T6
  10:30-11:00   Break
  11:00-12:30   T5        T3        T6
  12:30-14:00   Lunch
  14:00-15:30   T2        T4        T1
  15:30-16:00   Break
  16:00-17:30   T2        T4        T1

T1: Annotation
Presenter: Eduard Hovy

T2: From Structured Prediction to Inverse Reinforcement Learning
Presenter: Hal Daumé III

T3: Wide-Coverage NLP with Linguistically Expressive Grammars
Presenters: Josef van Genabith, Julia Hockenmaier and Yusuke Miyao

T4: Semantic Parsing: The Task, the State of the Art and the Future
Presenters: Rohit J. Kate and Yuk Wah Wong

T5: Tree-based and Forest-based Translation
Presenters: Yang Liu and Liang Huang

T6: Discourse Structure: Theory, Practice and Use
Presenters: Bonnie Webber, Markus Egg and Valia Kordoni

Tutorial Chairs

Lluís Màrquez (Technical University of Catalonia, Spain)
Haifeng Wang (王海峰) (Baidu.com Inc., China)

E-mail: tutorials@acl2010.org

Tutorial descriptions

T1: Annotation

Presenter: Eduard Hovy
July 11, 14:00-17:30, Hall IV

Abstract

As researchers seek to apply their machine learning algorithms to new problems, corpus annotation is increasingly gaining importance in the NLP community. But since the community currently has no general paradigm, no textbook that covers all the issues (though Wilcock's book, published in December 2009, covers some basic ones very well), and no accepted standards, setting up and running small-, medium-, and large-scale annotation projects remains somewhat of an art.

This tutorial is intended to provide the attendee with an in-depth look at the procedures, issues, and problems in corpus annotation, and to highlight the pitfalls that the annotation manager should avoid. The tutorial first discusses why annotation is becoming increasingly relevant for NLP and how it fits into the generic NLP methodology of train-evaluate-apply. It then reviews currently available resources, services, and frameworks that make it easy to start an annotation project, including the QDAP annotation center, Amazon's Mechanical Turk, the annotation facilities in GATE, and other resources such as UIMA. It then discusses the seven major open issues at the heart of annotation, for which there are as yet no standard and fully satisfactory answers or methods. Each issue is described in detail and current practice is shown. The seven issues are:

1. How does one decide what specific phenomena to annotate? How does one adequately capture the theory behind the phenomenon/a and express it in simple annotation instructions?
2. How does one obtain a balanced corpus to annotate, and when is a corpus balanced (and representative)?
3. When hiring annotators, what characteristics are important? How does one ensure that they are adequately (but not over- or under-) trained?
4. How does one establish a simple, fast, and trustworthy annotation procedure? How and when does one apply measures to ensure that the procedure remains on track? How and where can active learning help?
5. What interface(s) are best for each type of problem, and which pitfalls should one know to avoid? How can one ensure that the interfaces do not influence the annotation results?
6. How does one evaluate the results? What are the appropriate agreement measures (a minimal example appears below)? At which cutoff points should one redesign or re-do the annotations?
7. How should one formulate and store the results? When, and to whom, should one release the corpus? How should one report the annotation effort and results for best impact?

The notes include several pages of references and suggested readings.

Participants do not need special expertise in computation or linguistics.
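
Issue 6 above asks about agreement measures. Purely for illustration (this is our sketch, not part of the tutorial materials), one widely used measure for two annotators, Cohen's kappa, can be computed as follows:

    # A minimal sketch of Cohen's kappa: chance-corrected agreement
    # between two annotators who labelled the same items.
    from collections import Counter

    def cohens_kappa(labels_a, labels_b):
        assert len(labels_a) == len(labels_b)
        n = len(labels_a)
        # Observed agreement: fraction of items with identical labels.
        p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
        # Expected agreement: chance of agreeing if each annotator labelled
        # independently according to their own label distribution.
        freq_a, freq_b = Counter(labels_a), Counter(labels_b)
        p_e = sum((freq_a[c] / n) * (freq_b[c] / n)
                  for c in freq_a.keys() | freq_b.keys())
        return (p_o - p_e) / (1 - p_e)

    # Two annotators marking items as coreferent (Y) or not (N).
    print(cohens_kappa(list("YYNYNNYY"), list("YNNYNNYY")))  # 0.75

Values near 1 indicate near-perfect agreement and values near 0 indicate agreement no better than chance; which cutoff warrants redesigning the annotation scheme is one of the questions the tutorial addresses.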

Outline

1. Toward a Science of Annotation
   a. What is Annotation, and Why do We Need It?
2. Setting up an Annotation Project
   a. The Basic Steps
   b. Useful Resources and Services
3. Examples of Annotation Projects
4. The Seven Questions of Annotation
   a. Instantiating the Theory
   b. Selecting the Corpus
   c. Designing the Annotation Interface
   d. Selecting and Training Annotators
   e. Specifying the Annotation Procedure
   f. Evaluation and Validation
   g. Distribution and Maintenance
5. Closing: The Future of Annotation in NLP

Presenter

Eduard Hovy
Information Sciences Institute
University of Southern California
email: hovy@isi.edu
website: http://www.isi.edu/~hovy

Eduard Hovy is the director of the Natural Language Group at USC/ISI. His research focuses on questions in information extraction, automated text summarization, the semi-automated construction of large lexicons and ontologies, machine translation, question answering, and digital government. Much of this work has required annotation. Together with colleagues, students, and visitors, he has run annotation projects in biomedical information extraction (http://www.neuroscholar.org/), coreference, word sense annotation (http://www.bbn.com/ontonotes/), ontology creation, noun-noun relations, and discourse structure. The smallest of these projects (discourse structure) involved three annotators over a period of three months, and the largest (OntoNotes noun senses) involved more than 25 annotators over several years. Some of these projects used the CAT annotation interface developed at UPitt (http://www.qdap.pitt.edu/cat3.htm), others involved home-grown interfaces, and some of them involved Amazon's Mechanical Turk.


 

T2: From Structured Prediction to Inverse Reinforcement Learning

Presenter: Hal Daumé III
July 11, 14:00-17:30, Hall X

Abstract

Machine learning is all about making predictions; language is full of
complex rich structure. Structured prediction marries these two.
However, structured prediction isn't always enough: sometimes the
world throws even more complex data at us, and we need reinforcement
learning techniques. This tutorial is all about the *how* and the
*why* of structured prediction and inverse reinforcement learning (aka
inverse optimal control): participants should walk away comfortable
that they could implement many structured prediction and IRL
algorithms, and have a sense of which ones might work for which
problems.

The first half of the tutorial will cover the "basics" of structured
prediction: the structured perceptron and Magerman's incremental
parsing algorithm. It will then build up to more advanced algorithms
that are shockingly reminiscent of these simple approaches: maximum
margin techniques and search-based structured prediction.
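
For readers who have not seen it before, the structured perceptron fits in
a few lines; the sketch below is our illustration (not the presenter's
code) and assumes a user-supplied feature map and argmax decoder:

    # Minimal structured perceptron sketch.
    # feats(x, y)  -> dict mapping feature names to counts for candidate y
    # decode(x, w) -> argmax over output structures of the model score
    def structured_perceptron(train, feats, decode, epochs=5):
        w = {}                                # feature weights
        for _ in range(epochs):
            for x, y_gold in train:
                y_hat = decode(x, w)          # the "argmax" problem
                if y_hat != y_gold:           # mistake-driven update
                    for f, v in feats(x, y_gold).items():
                        w[f] = w.get(f, 0.0) + v
                    for f, v in feats(x, y_hat).items():
                        w[f] = w.get(f, 0.0) - v
        return w

Everything problem-specific hides inside decode; handling that argmax
efficiently (or sidestepping it via search) is a recurring theme of the
tutorial.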

The second half of the tutorial will ask the question: what happens
when our standard assumptions about our data are violated? This is
what leads us into the world of reinforcement learning (the basics of
which we'll cover) and then to inverse reinforcement learning and
inverse optimal control.

Throughout the tutorial, we will see examples ranging from simple
(part of speech tagging, named entity recognition, etc.) through
complex (parsing, machine translation).

The tutorial does not assume attendees know anything about structured
prediction or reinforcement learning (though it will hopefully be
interesting even to those who know some!), but *does* assume some
knowledge of simple machine learning (e.g., binary classification).

Outline

PART I (75 minutes):

(10 mins) What is structure? What is structured prediction?

(20 mins) Refresher on binary classification
   - What does it mean to learn?
   - Linear models for classification
   - Batch versus stochastic optimization

(15 mins) From perceptron to structured perceptron
   - Linear models for structured prediction
   - The "argmax" problem
   - From perceptron to margins

(30 mins) Search-based structured prediction
   - Training classifiers to make parsing decisions
   - Searn and generalizations

PART II (75 minutes):

(20 mins) Refresher on reinforcement learning
   - Markov decision processes
   - Q learning (a minimal sketch follows this outline)

(25 mins) Inverse optimal control and A* search
   - Maximum margin planning
   - Learning to search

(20 mins) Apprenticeship learning

(10 mins) Open problems
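
As a point of reference for Part II, tabular Q-learning can be sketched as
follows (our illustration, not tutorial code; the env object with reset()
and step() is an assumed interface):

    import random
    from collections import defaultdict

    # Minimal tabular Q-learning sketch.
    # env.reset() -> state;  env.step(a) -> (next_state, reward, done)
    def q_learning(env, actions, episodes=500, alpha=0.1, gamma=0.9, eps=0.1):
        Q = defaultdict(float)                 # Q[(state, action)] -> value
        for _ in range(episodes):
            s, done = env.reset(), False
            while not done:
                # epsilon-greedy action selection
                if random.random() < eps:
                    a = random.choice(actions)
                else:
                    a = max(actions, key=lambda act: Q[(s, act)])
                s2, r, done = env.step(a)
                # TD update toward r + gamma * max_a' Q(s', a')
                best_next = 0.0 if done else max(Q[(s2, a2)] for a2 in actions)
                Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
                s = s2
        return Q

Inverse reinforcement learning turns this setting around: instead of
learning Q from a given reward, it infers the reward (or cost) function
that makes observed expert behaviour appear optimal.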

Presenter

Hal Daumé III
School of Computing
University of Utah
50 S Central Campus Drive
Salt Lake City, UT 84112
801-585-3586
me@hal3.name

Hal Daumé III is an assistant professor in the School of Computing at
the University of Utah. His primary research interests are in
understanding how to get human knowledge into a machine learning
system in the most efficient way possible. In practice, he works
primarily in the areas of Bayesian learning (particularly
non-parametric methods), structured prediction and domain adaptation
(with a focus on problems in language and biology). He associates
himself most with conferences like ACL, ICML, NIPS and EMNLP. He
earned his PhD at the University of Southern California with a thesis
on structured prediction for language (his advisor was Daniel
Marcu). He spent the summer of 2003 working with Eric Brill in the
machine learning and applied statistics group at Microsoft
Research. Prior to that, he studied math (mostly logic) at Carnegie
Mellon University.


 

T3: Wide-Coverage NLP with Linguistically Expressive Grammars

Presenters: Josef van Genabith, Julia Hockenmaier and Yusuke Miyao
July 11, 09:00-12:30, Hall IX

Abstract

In recent years, there has been a lot of research on wide-coverage
statistical natural language processing with linguistically expressive
grammars such as Combinatory Categorial Grammars (CCG), Head-driven
Phrase-Structure Grammars (HPSG), Lexical-Functional Grammars (LFG)
and Tree-Adjoining Grammars (TAG). But although many young
researchers in natural language processing are very well trained in
machine learning and statistical methods, they often lack the
necessary background to understand the linguistic motivation behind
these formalisms. Furthermore, in many linguistics departments,
syntax is still taught from a purely Chomskian perspective.
Additionally, research on these formalisms often takes place within
tightly-knit, formalism-specific subcommunities. It is therefore
often difficult for outsiders as well as experts to grasp the
commonalities of and differences between these formalisms.

This tutorial gives an overview of the basic ideas of TAG/CCG/LFG/HPSG
and provides attendees with a comparison of these formalisms from a
linguistic and computational point of view. We start by stating the
motivation for using these expressive grammar formalisms in NLP,
contrasting them with shallow formalisms like context-free grammars.
We introduce
a common set of examples illustrating various linguistic constructions
that elude context-free grammars, and reuse them when introducing
each formalism: bounded and unbounded non-local dependencies that
arise through extraction and coordination, scrambling, mappings to
meaning representations, etc. In the second half of the tutorial, we
explain two key technologies for wide-coverage NLP with these grammar
formalisms: grammar acquisition and parsing models. Finally, we show
NLP applications where these expressive grammar formalisms provide
additional benefits.
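
As a flavour of what "linguistically expressive" means in practice, the toy
sketch below (our illustration, not part of the tutorial) implements the two
basic CCG application rules and derives "John sees Mary"; real CCG
additionally has composition, type-raising and slash modalities:

    # Categories are strings; "(S\\NP)/NP" is a transitive verb that first
    # seeks an NP object to its right, then an NP subject to its left.
    def forward_apply(left, right):
        """X/Y  Y  =>  X"""
        if left.endswith("/" + right):
            return left[:-(len(right) + 1)].strip("()")
        return None

    def backward_apply(left, right):
        """Y  X\\Y  =>  X"""
        if right.endswith("\\" + left):
            return right[:-(len(left) + 1)].strip("()")
        return None

    # "John sees Mary":   NP   (S\NP)/NP   NP
    vp = forward_apply("(S\\NP)/NP", "NP")   # -> "S\\NP"
    s  = backward_apply("NP", vp)            # -> "S"
    print(vp, s)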

Target audience:

- Researchers, developers and PhD students with a background in
machine-learning and data-driven NLP who have not been exposed to
linguistically expressive computational grammars.

- Researchers, developers and PhD students who have hand-crafted
linguistically expressive computational grammars and who would like to
get acquainted with state-of-the-art treebank-based acquisition of
wide-coverage and robust linguistically expressive grammars.

- Researchers, developers and PhD students in data-driven parsing and
generation who would like to get acquainted with efficient and scalable
parsing and generation models for rich linguistically expressive
grammars.

Outline

1. Introduction: Why expressive grammars
   1.1 What is grammar, what is grammar formalism
   1.2 Shortcomings of context-free grammars
   1.3 Lexicalized formalisms
2. Introduction to TAG
   2.1 Definition: elementary trees, adjunction, substitution
   2.2 Formal properties
   2.3 Example analyses
3. Introduction to CCG
   3.1 Definition: combinatory rules, categories
   3.2 Formal properties
   3.3 Example analyses
4. Introduction to LFG
   4.1 Definition: C-structure, F-structure, functional uncertainty
   4.2 Formal properties
   4.3 Example analyses
5. Introduction to HPSG
   5.1 Definition: signs, principles, schemas
   5.2 Formal properties
   5.3 Example analyses
6. Inducing expressive grammars from corpora
   6.1 Motivation and basic ideas
   6.2 Translating Penn Treebank into TAG/CCG/HPSG/LFG
   6.3 Other treebanks, dependency banks
7. Wide-coverage parsing with expressive grammars
   7.1 Wide-coverage parsers and basic architectures
   7.2 Supertagging
   7.3 Parsing models: generative vs. discriminative
   7.4 Pipeline/integrated architectures for LFG parsing
   7.5 Other issues: efficiency and across-framework comparison
8. Applications
   8.1 Sentence realization
   8.2 Semantics construction
   8.3 IE/IR
9. Summary

Presenters

Julia Hockenmaier, Department of Computer Science, University of
Illinois, 201 North Goodwin Avenue, Urbana, IL 61801-2302,
juliahmr@illinois.edu

Julia Hockenmaier is an assistant professor in the Department of Computer
Science at the University of Illinois, Urbana-Champaign. She has been
working on translating the English Penn Treebank and the German Tiger
corpus to CCG, and developed one of the first statistical parsers for
CCG.

Yusuke Miyao, National Institute of Informatics, Hitotsubashi 2-1-2,
Chiyoda-ku, Tokyo 101-8430 Japan, +81-3-4212-2590, yusuke@nii.ac.jp

Yusuke Miyao is an associate professor at the National Institute of
Informatics, Japan. He has been engaged in research on wide-coverage
HPSG parsing, specifically focusing on statistical
models for parse disambiguation, and the treebank-based development of
wide-coverage grammars. He has also been working on the applications
of the HPSG parser, including biomedical IE/IR and wide-coverage
logical form construction.

Josef van Genabith, Centre for Next Generation Localisation, School of
Computing, Dublin City University, Dublin 9, Ireland,
+353-(0)1-7006700, josef@computing.dcu.ie

Josef van Genabith is an associate professor in the School of
Computing at Dublin City University and the director of the Centre for
Next Generation Localisation (CNGL). He has been working on
treebank-based acquisition of wide-coverage LFG resources (for
English, German, Spanish, French, Chinese, Arabic and Japanese) and
data-driven parsing and generation models for these resources.


 

T4: Semantic Parsing: The Task, the State of the Art and the Future

Presenters: Rohit J. Kate and Yuk Wah Wong
July 11, 14:00-17:30, Hall IX

Abstract

Semantic parsing is the task of mapping natural language sentences into
complete formal meaning representations which a computer can execute for
some domain-specific application. This is a challenging task and is
critical for developing computing systems that can understand and process
natural language input, for example, a computing system that answers
natural language queries about a database, or a robot that takes commands
in natural language. While the importance of semantic parsing was realized
a long time ago, it is only in the past few years that the
state-of-the-art in semantic parsing has been significantly advanced with
more accurate and robust semantic parser learners that use a variety of
statistical learning methods. Semantic parsers have also been extended to
work beyond a single sentence, for example, to use discourse contexts and
to learn domain-specific language from perceptual contexts. Some of the
future research directions of semantic parsing with potentially large
impacts include mapping entire natural language documents into machine
processable form to enable automated reasoning about them, and converting
natural language web pages into machine processable representations for
the Semantic Web to support automated high-end web applications.

This tutorial will introduce the semantic parsing task and will bring the
audience up-to-date with the current research and state-of-the-art in
semantic parsing. It will also provide insights about semantic parsing and
how it relates to and differs from other natural language processing
tasks. It will point out research challenges and some promising future
directions for semantic parsing. The target audience is NLP researchers
and practitioners; no prior knowledge of semantic parsing will be assumed.
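
To make the task concrete, here is a toy illustration (ours, not the
presenters' systems) of the kind of input/output pair a semantic parser
produces, using a GeoQuery-style meaning representation; a learned semantic
parser induces such mappings from training pairs rather than from
hand-written patterns:

    import re

    # Hypothetical hand-written patterns, for illustration only.
    RULES = [
        (r"what is the capital of (\w+)\s*\?",
         lambda m: "answer(capital(loc_2(stateid('{}'))))".format(m.group(1))),
        (r"how many people live in (\w+)\s*\?",
         lambda m: "answer(population_1(stateid('{}')))".format(m.group(1))),
    ]

    def toy_semantic_parser(sentence):
        for pattern, build in RULES:
            m = re.fullmatch(pattern, sentence.lower())
            if m:
                return build(m)   # an executable, domain-specific meaning representation
        return None

    print(toy_semantic_parser("What is the capital of Texas?"))
    # answer(capital(loc_2(stateid('texas'))))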

Outline

1. Introduction to the task of semantic parsing
   (a) Definition of the task
   (b) Examples of application domains and meaning representation languages
   (c) Distinctions from and relations to other NLP tasks

2. Semantic parsers
   (a) Earlier hand-built systems
   (b) Learning for semantic parsing
       i. Semantic parsing learning task
       ii. Non-statistical semantic parser learners
       iii. Statistical semantic parser learners
       iv. Exploiting syntax for semantic parsing
       v. Various forms of supervision: semi-supervision, ambiguous supervision
   (c) Underlying commonalities and differences between different
       semantic parser learners

3. Semantic parsing beyond a sentence
   (a) Using discourse contexts for semantic parsing
   (b) Learning language from perceptual contexts

4. Research challenges and future directions
   (a) Machine reading of documents: Connecting with knowledge representation
   (b) Applying semantic parsing techniques to the Semantic Web
   (c) Future research directions

5. Conclusions

Presenters

Rohit J. Kate
Department of Computer Science
The University of Texas at Austin
Email: rjkate@cs.utexas.edu

Rohit J. Kate is a postdoctoral fellow in the Department of Computer Science
at the University of Texas at Austin. He obtained his Ph.D. from the same
university. His research interests are in natural language processing,
especially in semantic parsing and information extraction, and in machine
learning. He has worked extensively on semantic parsing, various forms of
supervision for semantic parser learners, and kernel-based methods for natural
language processing.

Yuk Wah Wong
Google Inc.
Email: ywwong@google.com

Yuk Wah Wong is a Senior Software Engineer at Google Pittsburgh. He
obtained his Ph.D. from the University of Texas at Austin. His
research interests are in natural language processing and machine
learning. His thesis topic was on semantic parsing and generation
using statistical machine translation techniques. Since joining
Google, he has worked on information extraction, data integration, and
natural language processing, with applications in web search and
vertical search.


 

T5: Tree-based and Forest-based Translation

Presenters: Yang Liu and Liang Huang
July 11, 09:00-12:30, Hall X

Abstract

The past several years have witnessed rapid advances in syntax-based machine translation, which exploits natural language syntax to guide translation. Depending on the type of input, most of these efforts can be divided into two broad categories: (a) string-based systems whose input is a string, which is simultaneously parsed and translated by a synchronous grammar (Wu, 1997; Chiang, 2005; Galley et al., 2006), and (b) tree-based systems whose input is already a parse tree to be directly converted into a target tree or string (Lin, 2004; Ding and Palmer, 2005; Quirk et al., 2005; Liu et al., 2006; Huang et al., 2006).

Compared with their string-based counterparts, tree-based systems offer many attractive features: they are much faster in decoding (linear time vs. cubic time), do not require sophisticated binarization (Zhang et al., 2006), and can use separate grammars for parsing and translation (e.g. a context-free grammar for the former and a tree substitution grammar for the latter).

However, despite these advantages, most tree-based systems suffer from a major drawback: they only use 1-best parse trees to direct translation, which potentially introduces translation mistakes due to parsing errors (Quirk and Corston-Oliver, 2006). This situation becomes worse for resource-poor source languages without enough Treebank data to train a high-accuracy parser.

This problem can be alleviated elegantly by using packed forests (Huang, 2008), which encode exponentially many parse trees in polynomial space. Forest-based systems (Mi et al., 2008; Mi and Huang, 2008) thus take a packed forest instead of a parse tree as input. In addition, packed forests can also be used for translation rule extraction, which helps alleviate the propagation of parsing errors into the rule set. Forest-based translation can be regarded as a compromise between the string-based and tree-based methods, combining the advantages of both: decoding is still fast, yet does not commit to a single parse. Surprisingly, translating a forest of millions of trees is even faster than translating 30 individual trees, and offers significantly better translation quality. This approach has since become a popular topic.

This tutorial surveys tree-based and forest-based translation methods. For each approach, we will discuss the two fundamental tasks: decoding, which performs the actual translation, and rule extraction, which learns translation rules from real-world data automatically. Finally, we will introduce some more recent developments in tree-based and forest-based translation, such as tree-sequence-based models, tree-to-tree models, joint parsing and translation, and faster decoding algorithms. We will conclude by pointing out some directions for future work.
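
For concreteness, the sketch below (our illustration, not the presenters' systems) shows the core idea of tree-to-string decoding on a toy English-to-Chinese example: translation rules pattern-match source subtrees and emit reordered target strings. A real decoder explores many competing rule combinations and integrates a language model (via cube pruning, as covered in Part 1 below) instead of applying one rule per node.

    # Trees are (label, child, child, ...) tuples; leaves are plain strings.
    tree = ("S", ("NP", "Bush"),
                 ("VP", ("VB", "held"),
                        ("PP", ("P", "with"), ("NP", "Sharon")),
                        ("NP", ("NN", "talks"))))

    # Hypothetical tree-to-string rules: source pattern -> target template.
    # "x0", "x1", ... are variables that match (and recurse into) whole subtrees.
    RULES = {
        ("S", "x0", "x1"): "{0} {1}",
        ("VP", ("VB", "held"), "x0", ("NP", ("NN", "talks"))):
            "{0} juxing le huitan",                  # note the reordering
        ("PP", ("P", "with"), "x0"): "yu {0}",
        ("NP", "Bush"): "bushi",
        ("NP", "Sharon"): "shalong",
    }

    def match(pattern, node, binding):
        if isinstance(pattern, str) and pattern.startswith("x"):
            binding.append(node)
            return True
        if isinstance(pattern, str) or isinstance(node, str):
            return pattern == node
        return (len(pattern) == len(node)
                and all(match(p, n, binding) for p, n in zip(pattern, node)))

    def translate(node):
        for pattern, template in RULES.items():
            binding = []
            if match(pattern, node, binding):
                return template.format(*[translate(b) for b in binding])
        raise ValueError("no rule for {}".format(node))

    print(translate(tree))   # bushi yu shalong juxing le huitan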

Outline

Part 1: Tree-based translation
* Motivations and Overview
* Tree-to-String Model and Decoding
* Tree-to-String Rule Extraction
* Language Model-Integrated Decoding: Cube Pruning

Part 2: Forest-based translation
* Packed Forest
* Forest-based Decoding
* Forest-based Rule Extraction

Part 3: Extensions
* Tree-Sequence-to-String Models
* Tree-to-Tree Models
* Faster Decoding Methods

Part 4: Conclusion and Open Problems

Prerequisites

Prior experience with statistical machine translation is recommended, but NOT required (we'll get you interested!).

Presenters

Yang Liu
Key Laboratory of Intelligent Information Processing
Institute of Computing Technology, Chinese Academy of Sciences
No. 6 Kexueyuan South Road, Haidian District
P.O. Box 2704, Beijing 100190, China
+86-10-62600667
yliu@ict.ac.cn
http://nlp.ict.ac.cn/~liuyang/
Short Bio: Yang Liu is an Associate Researcher at the Institute of Computing Technology, Chinese Academy of Sciences (CAS/ICT). He obtained his PhD from CAS/ICT in 2007. His research interests include syntax-based translation, word alignment, and system combination. He has published five full ACL papers in the machine translation area in the past five years. His work on "tree-to-string translation" received a Meritorious Asian NLP Paper Award at ACL 2006.

Liang Huang
Natural Language Group
Information Sciences Institute, University of Southern California
4676 Admiralty Way, Marina del Rey, CA 90292, USA
+1-310-448-9184
lhuang@isi.edu
http://www.isi.edu/~lhuang
Short Bio: Liang Huang is a Computer Scientist at the Information Sciences Institute, University of Southern California (USC/ISI), and a Research Assistant Professor in USC's Computer Science Department. He obtained his PhD from the University of Pennsylvania under Aravind Joshi and Kevin Knight. His research interests are mainly in the theoretical aspects of NLP, especially efficient algorithms for parsing and translation. His work on "forest-based algorithms" received an Outstanding Paper Award at ACL 2008, as well as Best Paper Nominations at ACL 2007 and EMNLP 2008. He has taught two tutorials on Advanced Dynamic Programming, at COLING 2008 and NAACL 2009, and is currently (co-)teaching two NLP courses at USC.



T6: Discourse Structure: Theory, Practice and Use

Presenters: Bonnie Webber, Markus Egg and Valia Kordoni
July 11, 09:00-12:30, Hall IV

Abstract

Discourse structure concerns the ways that discourses (monologic,
dialogic and multi-party) are organised and those aspects of meaning
that such organisation encodes. It is a potent influence on
clause-level syntax, and the meaning it encodes is as essential to
communication as that conveyed in a clause. Hence no modern language
technology (LT) – information extraction, machine translation, opinion
mining, or summarisation – can fully succeed without taking discourse
structure into account. Attendees of this tutorial should gain insight
into discourse structure (discourse relations; scope of attribution,
modality and negation; centering; topic structure; dialogue moves and
acts; macro-structure), its relevance for LT, and methods and
resources that support its use. Our target audience is researchers
and practitioners in LT (not necessarily discourse) who are interested
in LT tasks that involve or could benefit from considering language
and communication beyond the individual sentence.

Outline

• PART I – General Overview
1. Introduction
2. Levels and kinds of structure in monologic, dialogic and multiparty discourse
3. Contribution of structure to discourse

• PART II – Computational Approaches (rule-based and statistical)
1. Recognizing the elements of structure
2. Recognizing how elements are combined

• PART III – Applications and Resources
1. Applications in Language Technology
2. Discourse structure resources (mono-lingual and multilingual)

• PART IV – Future Developments

Presenters

Bonnie Webber (bonnie@inf.ed.ac.uk) is a Professor of Informatics at
Edinburgh University. She is best known for work on Question Answering
(starting with LUNAR in the early 1970s) and discourse phenomena
(starting with her PhD thesis on discourse anaphora). She has also
carried out research on animation from instructions, medical decision
support systems and biomedical text processing.

Markus Egg (markus.egg@anglistik.hu-berlin) is a Professor of
Linguistics at the Dept. of English and American Studies of the
Humboldt University in Berlin. His main areas of interest are syntax,
semantics, pragmatics, and discourse; the interfaces between them; and
their implementation in NLP systems.

Valia Kordoni (kordoni@dfki.de) is a Senior Researcher at the Language
Technology Lab of the German Research Centre for Artificial
Intelligence (DFKI GmbH) and an assistant professor at the Department
of Computational Linguistics of Saarland University. Her main areas of
interest are syntax, semantics, pragmatics and discourse. She works on
the theoretical development of these areas as well as on their
implementation in NLP systems.