Workshops


WS1: CoNLL-2003: Seventh Conference on Natural Language Learning
Saturday and Sunday May 31 and June 1, 2003
Submission due date: March 16

CoNLL is an international forum for discussion and presentation of research on natural language learning. We invite submission of papers about natural language learning topics, including but not limited to: computational models of human language acquisition; computational models of the origins and evolution of language; machine learning methods applied to natural language processing tasks (speech processing, phonology, morphology, syntax, semantics, discourse processing, language engineering applications); symbolic learning methods (Rule Induction and Decision Tree Learning, Lazy Learning, Inductive Logic Programming, Analytical Learning, Transformation-based Error-driven Learning); biologically-inspired methods (Neural Networks, Evolutionary Computing); statistical methods (Bayesian Learning, HMM, maximum entropy, SNoW, Support Vector Machines); Reinforcement Learning; Active learning, ensemble methods, meta-learning; Computational Learning Theory analysis of language learning; empirical and theoretical comparisons of language learning methods; models of induction and analogy in Linguistics. As in previous editions, CoNLL-03 will feature an invited speaker, a shared task (named-entity recognition) and a special theme (semi-supervised, unsupervised and sample selection techniques for language learning).

WS2: DUC03: Text Summarization Workshop
Saturday and Sunday May 31 and June 1, 2003
Submission due date: February 28

Over the last three years, the DUC series (Document Understanding Conference, http://duc.nist.gov/) has been the main forum for research and evaluation of text summarization. The workshop requests papers on all aspects of text summarization, including but not limited to: non-extractive summarization, spoken language (including dialogue) summarization, language modeling for text and speech summarization, multi-document and multilingual summarization, integration of question answering and summarization, Web-based summarization, and evaluation of summarization systems. The second day of the workshop will be devoted to discussing the results of the DUC 2003 evaluation, administered by NIST, that took place in mid-February.

WS3: Research Directions in Dialogue Processing
Saturday and Sunday May 31 and June 1, 2003
Submission due date: March 21

This workshop will provide a forum for discussion of current directions in dialogue research, specifically to assess the current state of the art in the area of dialogue processing, and to identify key themes and directions that are driving research in the field. Progress in research presupposes the existence of a common infrastructure that includes tools and corpora, evaluation techniques, and some consensus on effective research paradigms. Thus one of the outcomes of the workshop will be a set of recommendations for developing and supporting infrastructure, and encouraging agreement on research paradigms and evaluation methodologies. The motivation to hold a workshop at this time is the need to establish the role of dialogue as a core element in human-human and human-computer communication, to identify the resources that are needed to support research in the area, and to define its role in the forthcoming NSF Human Language and Communication program.

WS4: Building and Using Parallel Texts: Data Driven Machine Translation and Beyond
Saturday May 31, 2003
Submission due date: March 10 (regular papers) / April 1 (short papers)

This workshop provides a forum for researchers working on problems related to building and using parallel text corpora, which are vital resources for efficiently deriving multi-lingual text processing tools. In addition to regular papers, the workshop also includes a shared task that will result in a comparative evaluation of word alignment techniques. We invite submissions of papers addressing any of the following issues, with work addressing languages with scarce resources being particularly welcome: constructing parallel corpora, including the automatic identification and harvesting of parallel corpora from the Web; evaluating the quality of parallel corpora and word alignments; tools for processing parallel corpora, including automatic sentence alignment, word alignment, phrase alignment, detection of omissions and gaps in translations, and others; using parallel corpora for data driven Machine Translation; using parallel corpora for the derivation of language processing tools in new languages; using parallel corpora for automatic annotation; language learning applied to parallel corpora; translation memory systems as a source of aligned corpora. The invited speaker will be Elliott Macklovitch from the University of Montreal.

WS5: Software Engineering and Architecture of Language Technology Systems (SEALTS)
Saturday May 31, 2003
Submission due date: March 23

Over the past few years a number of significant systems and practices have been developed in what may be called Software Architecture for Language Engineering. Among the most prominent are:

- RAGS, Reference Architecture for Generation Systems (Brighton and Edinburgh)
- LT XML (Edinburgh)
- TEI, CES, XCES (Oxford, Vassar, etc.)
- ATLAS (LDC, NIST)
- Galaxy Communicator Software Infrastructure (MIT & MITRE)
- Protege (Stanford)
- GATE, a General Architecture for Text Engineering (Sheffield)

This workshop represents an opportunity to discuss in a coordinated setting the advances and problems of computational infrastructure and architectures for large-scale and robust Natural Language Processing systems.

WS6: Text Meaning
Saturday May 31, 2003
http://www.research.umbc.edu/~sergei/
Submission due date: March 17

Most, if not all, high-end HLT/NLP applications - from the earliest, MT, to the latest, question answering and text summarization - stand to benefit from being able to use text meaning in their processing. But the bulk of work in the field has not, over the years, pertained to treatment of meaning. The main reason given is the complexity of the task of comprehensive meaning analysis. The principal goal of this workshop is to re-establish the research community of knowledge-based meaning processing and to help to explicate the currently implicit treatments of meaning in knowledge-lean approaches and how the advances in the latter and in formal semantics should influence the task. Please submit papers (not to exceed 8 pages in the HLT/NAACL two-column format) electronically, PDF strongly preferred, to sergei@umbc.edu.

WS7: Building Educational Applications Using Natural Language Processing
Saturday May 31, 2003
Submission due date: March 17 (note new deadline)

There is an increased use of HLT-based educational applications for both large-scale assessment and classroom instruction. This has occurred for two primary reasons. First, there has been a significant increase in the availability of computers in schools, from elementary school to the university. Second, there has been notable development in computer-based educational applications that incorporate advanced methods in HLT that can be used to evaluate students' work. Educational applications have been developed across a variety of subject domains in automated evaluation of free responses and intelligent tutoring. To date, these two research areas have remained autonomous. We hope that this workshop will facilitate communication between researchers who work on all types of instructional applications, for K-12, undergraduate, and graduate school. Since most of this work in HLT-based educational applications is text-based, we are especially interested in any work of this type that incorporates speech processing and other input/output modalities. We wish to expose the NLP research community to these technologies with the hope that they may see novel opportunities for use of their tools in an educational application.

WS8: Learning Word Meaning from Non-Linguistic Data
Saturday May 31, 2003
Submission due date: March 10

One of the grand challenges of NLP, AI, and Cognitive Science is to develop models of what words mean (lexical semantics) in terms of the non-linguistic world. Recently there has been growing interest in using corpus- and data-based techniques for this task, that is, in trying to learn what words mean by analysing a 'parallel corpus' of (A) non-linguistic data and (B) linguistic texts that describe or otherwise are based on the non-linguistic data. Recent examples of such work include learning verb semantics from visual-image sequences; learning the meaning of time phrases from a collection of weather forecasts based on numerical weather simulations; and learning the meaning of mathematical predicates from human verbalisations of theorem-prover output. We hope that this workshop will help "gel" this new and exciting research area, by bringing together interested people who may not be aware of what is being done elsewhere. Participants from other areas of AI and Cognitive Science are very welcome, including vision and robotics researchers who are interested in learning how to relate sensor data to words, and psychologists who are interested in cognitive models of how people learn to relate words to the non-linguistic world.

WS9: Analysis of Geographic References
Saturday May 31, 2003
Submission due date: March 15

This workshop will discuss how existing NLP techniques can be adapted and new ones developed to advance core technology in geographic reference analysis. Two-page extended abstracts are due March 15 (electronic submissions only, please, mailed to geowkshp@kornai.com).

Student Workshop

Date to be determined
Submission due date: March 15

The Student Research Workshop, a tradition at ACL conferences, is being expanded to also include students from the Information Retrieval and Speech communities. Participants will have the opportunity to get feedback both from a wide audience and from selected panelists: experienced researchers who prepare in-depth comments and questions in advance of the presentation. We invite student researchers to submit their work to the workshop. The emphasis will be on work in progress. Original and unpublished research is invited on all aspects of speech, information retrieval, and computational linguistics. We especially encourage research that lies at the intersection of two or three of these areas, or that falls within the primary workshop topics:

- Language modeling
- Topic detection
- Information extraction

Additional topics of interest: Dialog systems and spoken language understanding, Speech synthesis and natural language generation, Speech-to-speech translation, Spoken document retrieval, Prosody and parsing, Question answering, Summarization and gisting, User modeling and adaptation, Disambiguation, stemming, lexical chains.