2nd CFP - ACL 2012 Joint Workshop on Statistical Parsing and Semantic Processing of Morphologically-Rich Languages (SP-Sem-MRL)

Tuesday, 12 June 2012 to Saturday, 16 June 2012
South Korea
Marianna Apidianaki
Ido Dagan
Jennifer Foster
Yuval Marton
Djamé Seddah
Reut Tsarfaty
Katrin Erk
Ines Rehbein
Peter Turney
Yannick Versley
Saturday, 31 March 2012

ACL 2012 Joint Workshop on Statistical Parsing and Semantic Processing
of Morphologically-Rich Languages (SP-Sem-MRL)



Submission deadline: Mar 31, 2012
Notification to authors: Apr 19, 2012
Camera-ready deadline: Apr 30, 2012
Workshop dates: TBD, during the ACL-2012 workshop period (Jul 12-14, 2012)


Morphologically Rich Languages (MRLs) are languages in which
grammatical relations such as Subject, Predicate, Object, etc., are
indicated morphologically (e.g. through inflection) instead of
positionally (as in, e.g. English), and the position of words and
phrases in the sentence may vary substantially. The tight connection
between the morphology of words and the grammatical relations between
them, and the looser connection between the position and grouping of
words and their syntactic roles, pose serious challenges for syntactic
and semantic processing. Furthermore, since grammatical relations
provide the interface to compositional semantics, morphosyntactic
phenomena may significantly complicate processing the syntax-semantics
interface. In statistical parsing, which has been a cornerstone of
research in NLP and has seen great advances due to the widespread
availability of syntactically annotated corpora, English parsing
performance has reached a high plateau in certain genres. This is,
however, not always indicative of parsing performance in MRLs,
dependency-based and constituency-based alike. Semantic processing of
natural language has similarly seen much progress in recent years.
However, as in parsing, the bulk of the work has concentrated on
English, and MRLs may present processing challenges that the community
is as yet unaware of, and which current semantic processing
technologies may have difficulty coping with. These challenges may
lurk in areas where parses may be used as input, such as semantic role
labeling, distributional semantics, paraphrasing and textual
entailment, or where inadequate pre-processing of morphological
variation hurts parsing and semantic tasks alike.

This joint workshop aims to build upon the first and second SPMRL
workshops (at NAACL-HLT 2010 and IWPT 2011, respectively) while
extending the overall scope to include semantic processing where MRLs
pose challenges for algorithms or models initially designed to process
English. In particular, we seek to explore the use of newly available
syntactically and/or semantically annotated corpora, or data sets for
semantic evaluation that can contribute to our understanding of the
difficulty that such phenomena pose. One goal of this workshop is to
encourage cross-fertilization among researchers working on different
languages and among those working on different levels of processing.
Of particular interest is work addressing the lexical sparseness and
out-of-vocabulary issues that occur in both syntactic and semantic
processing.

The workshop will be organised around three broad themes:

- Syntactic Models: Models and architectures that explicitly integrate
morphological analysis and parsing; Cross-language and cross-model
comparison of strengths and weaknesses regarding particular linguistic
phenomena.

- Semantic Models: State-of-the-art semantic analysis and generation
methods for MRLs, including semantic similarity and entailment
criteria and their task-specific instantiation, and suitable
representations for semantic tasks in MRLs.

- Joint Modeling Aspects: Improving lexical coverage and handling of
out-of-vocabulary (OOV) words by utilising lexical knowledge or
unsupervised/semi-supervised learning techniques; The role of parsing
in semantic analysis for MRLs; Preprocessing issues that jointly
affect parsing and semantic analysis; Syntax-Semantics interfaces for
monolingual or multilingual systems.

The areas of interest for this joint workshop include, but are not
limited to, the following topics:

--Syntactic Parsing of MRLs

* Parsing models and architectures that explicitly integrate
morphological analysis and parsing
* Parsing models and architectures that focus on lexical coverage
and the handling of OOV words either by incorporating linguistic
knowledge or through the use of unsupervised/semi-supervised learning
* Cross-language and cross-model comparison of models' strengths
and weaknesses in the face of particular linguistic phenomena (e.g.
morphosyntactic characteristics, degree of word-order freedom)
* Comprehensive analyses of the strengths and weaknesses of various
parsing models on particular linguistic (e.g. morphosyntactic)
phenomena with respect to variation in tagsets, annotation schemes and
additional data transformations

--Semantic Processing of MRLs

* Semantic distance and entailment criteria in the MRL space (e.g.,
with respect to inflection, derivation, root, pattern, lemma, tense,
and/or aspect, etc.); possibly task-specific criteria
* Lexical resources and morphological analysis tools facilitating
semantic distance measures and semantic relation detection
* Methods and models for semantic similarity/distance calculation,
clustering and paraphrasing relying on MRL properties, and using:
probability, vector/graph representation, data-driven and/or
linguistic rules, pivoting/SMT, machine-learning, etc.
* Paraphrase and textual entailment detection or generation,
specific to MRLs (e.g., task-specific issues of inclusion or exclusion
of certain paraphrase and textual entailment patterns differing in
* Use of morphological analysis for semantic calculation aimed at
reducing sparsity / OOV rate, preferably without losing information
due to mere lemmatization
* Semantic role labeling (SRL) for MRLs; verbal/nominalized
selectional preferences

--The Syntax-Semantics Interface

* Parsing-based semantic processing tasks, e.g., semantic role
labeling (SRL)
* Processing of compounds and multi-morphemic words: optimal
level(s) of tokenization, representation, and morphological analysis
for either/both tasks
* Syntax-aware semantic distance measures, paraphrasing and textual
entailment
* Semantic classes and/or relations as input to syntactic parsing

In addition to the standard (oral or poster) presentations in the
sessions, the SP-Sem-MRL workshop will feature a panel of commentators
for a selection of the talks, allowing for an extended discussion
period. This new feature is introduced in order to foster in-depth
discussions and to nurture interactions among researchers. It is our
hope that these interactions will help to bring ideas (and solutions)
to the fore and promote a more rapid advance of the state-of-the-art
in the field.

Shared Task

There will be no shared task on MRLs this year. However, we will take
this opportunity to disclose, during a special session of SP-Sem-MRL,
the data sets and evaluation procedures for the cross-linguistic
cross-framework shared task which was discussed at previous SPMRL
panels, and which is planned for SPMRL 2013 at IWPT 2013. Researchers
who are interested in participating in the shared task or teams that
wish to add their data sets or extrinsic evaluation procedures to the
task are encouraged to attend the session and contribute to the


Authors are invited to submit long papers (up to 10 pages + any number
of reference pages) and short papers (up to 5 pages + any number of
reference pages). Long papers should describe unpublished, substantial
and completed research. Short papers should be position papers, papers
describing work in progress or short, focused contributions.

Papers may be submitted until 31 March 2012 in PDF format via the START system:


Submitted papers must follow the style and formatting guidelines
given in the current ACL recommendations
(http://www.acl2012.org/call/sub01.asp). As the reviewing will be
blind, the papers must not include the authors' names and
affiliations. Furthermore, self-references that reveal the author's
identity, e.g., "We previously showed (Smith, 1991) ..." must be
avoided. Instead, use citations such as "Smith previously showed
(Smith, 1991) ..." Papers that do not conform to these requirements
will be rejected without review. In addition, please do not post your
submissions on the web until after the review process is complete.


General chairs:
Marianna Apidianaki (LIMSI-CNRS, France)
Ido Dagan (Bar-Ilan University, Israel)
Jennifer Foster (Dublin City University, Ireland)
Yuval Marton (IBM Watson Research Center, US)
Djamé Seddah (University of Paris 4, France)
Reut Tsarfaty (Uppsala University, Sweden)

Shared session chairs:
Katrin Erk (University of Texas at Austin, US)
Ines Rehbein (University of Potsdam, Germany)
Peter Turney (National Research Council, Canada)
Yannick Versley (University of Tuebingen, Germany)


e-mail: sp.sem.mrl2012@gmail.com
website: https://sites.google.com/site/spsemmrl2012/

This workshop is endorsed by SIGLEX and SIGPARSE, and
sponsored by INRIA's Alpage project.