<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://www.aclweb.org/aclwiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Michells</id>
	<title>ACL Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://www.aclweb.org/aclwiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Michells"/>
	<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/Special:Contributions/Michells"/>
	<updated>2026-04-27T10:43:06Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.43.6</generator>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=DISCo_2011_shared_task_data:_Compositionality_judgments_(Repository)&amp;diff=8894</id>
		<title>DISCo 2011 shared task data: Compositionality judgments (Repository)</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=DISCo_2011_shared_task_data:_Compositionality_judgments_(Repository)&amp;diff=8894"/>
		<updated>2011-06-30T09:19:55Z</updated>

		<summary type="html">&lt;p&gt;Michells: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
* &#039;&#039;&#039;ADCR ID:&#039;&#039;&#039; ADCR2011T007 &lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Name of Dataset:&#039;&#039;&#039; DISCo 2011 shared task dataset, see http://disco2011.fzi.de/ &lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Contributor:&#039;&#039;&#039; Chris Biemann, TU Darmstadt, Germany, biemann@tk.informatik.tu-darmstadt.de&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Copyright:&#039;&#039;&#039; (c) 2011, Chris Biemann. Deposited in the [[ACL Data and Code Repository]] by Chris Biemann.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Licensing:&#039;&#039;&#039; This work is not licensed. You can use it as you wish.&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Citation:&#039;&#039;&#039; If you use the DISCo 2011 shared task dataset in your research, please include the following citation in any resulting papers: &lt;br /&gt;
&lt;br /&gt;
:: Biemann, C. and Giesbrecht, E. (2011): Distributional Semantics and Compositionality 2011: Shared Task Description and Results. Proceedings of the ACL-HLT 2011 Workshop on Distributional Semantics and Compositionality (DISCo 2011), Portland, Oregon, USA. http://aclweb.org/anthology-new/W/W11/W11-1304.pdf&lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Description:&#039;&#039;&#039; The DISCo 2011 shared task dataset contains compositionality judgments for ADJ_NN, V_SUBJ and V_OBJ phrases in German and English. Each phrase-level judgment was aggregated over judgments on 5 sentence contexts. The sentence-level judgments are also included in the dataset. &lt;br /&gt;
&lt;br /&gt;
* &#039;&#039;&#039;Download&#039;&#039;&#039;:  http://aclweb.org/aclwiki/code/3/38/Disco2011-shared-task-complete-dataset.zip (&#039;&#039;link to download file&#039;&#039;) &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Data and code repository]]&lt;/div&gt;</summary>
		<author><name>Michells</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Multiword_Expressions&amp;diff=8883</id>
		<title>Multiword Expressions</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Multiword_Expressions&amp;diff=8883"/>
		<updated>2011-06-22T15:44:21Z</updated>

		<summary type="html">&lt;p&gt;Michells: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Multiword expressions (MWEs) are expressions that consist of at least two words and that can be syntactically and/or semantically idiosyncratic. Moreover, they act as a single unit at some level of linguistic analysis. Sag et al.&amp;lt;ref&amp;gt;Sag et al. (2002), p. 1. &amp;lt;/ref&amp;gt; roughly define MWEs as &amp;quot;idiosyncratic interpretations that cross word boundaries&amp;quot;. &lt;br /&gt;
MWEs can be regarded as lying at the interface of grammar and lexicon: they are usually instances of productive syntactic patterns but nevertheless show peculiar lexical behaviour.&amp;lt;ref&amp;gt;Calzolari et al. (2002), p. 1934.&amp;lt;/ref&amp;gt; &lt;br /&gt;
&lt;br /&gt;
MWEs are also pervasive in all areas of language use: Jackendoff&amp;lt;ref&amp;gt;Cf. Jackendoff (1997).&amp;lt;/ref&amp;gt; estimates that the number of MWEs in a speaker&#039;s lexicon is comparable to the number of single words. Examples of MWEs are idioms such as &amp;quot;kick the bucket&amp;quot;, compound nouns such as &amp;quot;telephone box&amp;quot; and &amp;quot;post office&amp;quot;, verb-particle constructions such as &amp;quot;look sth. up&amp;quot; and proper names such as &amp;quot;San Francisco&amp;quot;. Because MWEs are so frequent, there is growing awareness in the NLP ([http://en.wikipedia.org/wiki/Natural_language_processing Natural Language Processing]) community of the problems they pose.&lt;br /&gt;
&lt;br /&gt;
{{stub}}&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Classification of MWEs == &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
MWEs can be divided into &#039;&#039;&#039;lexicalized phrases&#039;&#039;&#039;, which have at least partially idiosyncratic syntax or semantics, and &#039;&#039;&#039;institutionalized phrases&#039;&#039;&#039;, which are syntactically and semantically compositional. Lexicalized phrases can be further subclassified into &#039;&#039;&#039;fixed expressions&#039;&#039;&#039;, &#039;&#039;&#039;semi-fixed expressions&#039;&#039;&#039; and &#039;&#039;&#039;syntactically flexible expressions&#039;&#039;&#039;.&amp;lt;ref&amp;gt;This article follows the classification scheme proposed in Sag et al. (2002). Most of the examples are taken from their article as well.&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;1.1 Fixed expressions&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Fixed expressions are fully lexicalized and allow neither morphosyntactic variation nor internal modification. Examples of fixed expressions are &#039;&#039;in short&#039;&#039;, &#039;&#039;by and large&#039;&#039; and &#039;&#039;every which way&#039;&#039;. They are fixed in that one cannot say &#039;&#039;in shorter&#039;&#039; or &#039;&#039;in very short&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;1.2 Semi-fixed expressions&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
In semi-fixed expressions, word order and composition are strictly invariable, while inflection, variation in reflexive form and determiner selection are possible. &lt;br /&gt;
&lt;br /&gt;
In &#039;&#039;&#039;non-decomposable idioms&#039;&#039;&#039; (i.e. idioms whose meaning cannot be assigned to the parts of the MWE) such as &#039;&#039;kick the bucket&#039;&#039;, the verb can be inflected to fit a particular context: &#039;&#039;he kicks the bucket&#039;&#039;. On the other hand, non-decomposable idioms do not undergo syntactic variation. For example, a passive sentence such as &#039;&#039;the bucket was kicked&#039;&#039; is not possible, or at least does not have the same meaning.&lt;br /&gt;
&lt;br /&gt;
Another type of semi-fixed expression is the &#039;&#039;&#039;compound nominal&#039;&#039;&#039;, such as &#039;&#039;car park&#039;&#039; or &#039;&#039;peanut butter&#039;&#039;. Compound nominals are syntactically unalterable but can inflect for number: &#039;&#039;2 car parks&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Proper names&#039;&#039;&#039; are semi-fixed expressions as well, since they can occur in different forms. For example, the name of the U.S. sports team &#039;&#039;the San Francisco 49ers&#039;&#039; can occur as &#039;&#039;the 49ers&#039;&#039; or as a modifier in the compound noun &#039;&#039;a 49ers player&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;1.3 Syntactically-Flexible Expressions&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Syntactically-flexible expressions have a wider range of syntactic variability than semi-fixed expressions. They occur in the form of &#039;&#039;&#039;decomposable idioms&#039;&#039;&#039;, &#039;&#039;&#039;verb-particle constructions&#039;&#039;&#039; and &#039;&#039;&#039;light verbs&#039;&#039;&#039;.&lt;br /&gt;
&lt;br /&gt;
Decomposable idioms are likely to be syntactically flexible to some degree. Examples are &#039;&#039;let the cat out of the bag&#039;&#039; and &#039;&#039;sweep under the rug&#039;&#039;. Yet, it is hard to predict which kind of syntactic variation a given idiom can undergo.&lt;br /&gt;
&lt;br /&gt;
Verb-particle constructions, such as &#039;&#039;write up&#039;&#039; and &#039;&#039;look up&#039;&#039;, are made up of a verb and one or more particles. They are either semantically idiosyncratic, as &#039;&#039;brush up on&#039;&#039;, or compositional, as &#039;&#039;break up&#039;&#039; in &#039;&#039;the meteorite broke up in the earth's atmosphere&#039;&#039;. In some transitive verb-particle constructions, such as &#039;&#039;call s.o. up&#039;&#039;, an NP argument can occur either between the verb and the particle(s) or after them: &#039;&#039;call Kim up&#039;&#039; or &#039;&#039;call up Kim&#039;&#039;, respectively. In addition, adverbs can often be inserted between the verb and the particle, as in &#039;&#039;fight bravely on&#039;&#039;.&lt;br /&gt;
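&lt;br /&gt;
The two positions of the NP argument can be illustrated with a minimal Python sketch. The function name find_vpc and the max_gap parameter are hypothetical and not taken from the cited literature; the sketch matches surface tokens only and ignores inflection and syntactic structure.&lt;br /&gt;
&lt;br /&gt;
```python
def find_vpc(tokens, verb, particle, max_gap=3):
    """Return (verb_index, particle_index) pairs for a verb-particle construction.

    Assumption for illustration: the particle may follow the verb directly
    or after at most max_gap intervening tokens (an NP or an adverb).
    """
    matches = []
    for i, token in enumerate(tokens):
        if token.lower() != verb:
            continue
        # Scan the tokens after the verb, allowing up to max_gap tokens in between.
        window_end = min(len(tokens), i + 2 + max_gap)
        for j in range(i + 1, window_end):
            if tokens[j].lower() == particle:
                matches.append((i, j))
                break
    return matches


# NP between verb and particle, NP after the particle, and an inserted adverb.
split_order = find_vpc("call Kim up".split(), "call", "up")
joined_order = find_vpc("call up Kim".split(), "call", "up")
with_adverb = find_vpc("fight bravely on".split(), "fight", "on")
```

Setting max_gap to 0 restricts the match to the adjacent order, which shows why a matcher that only looks for contiguous word sequences misses &#039;&#039;call Kim up&#039;&#039;.&lt;br /&gt;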
&lt;br /&gt;
For light verb constructions, such as &#039;&#039;make a mistake&#039;&#039; and &#039;&#039;give a demo&#039;&#039;, it is difficult to predict which light verb combines with a given noun. Though they are highly idiosyncratic, they have to be distinguished from idioms: &amp;quot;the noun is used in a normal sense, and the verb meaning appears to be bleached, rather than idiomatic.&amp;quot;&amp;lt;ref&amp;gt;Sag et al. (2002), p. 7.&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;1.4 Institutionalized Phrases&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Institutionalized phrases are conventionalized phrases, such as &#039;&#039;salt and pepper&#039;&#039;, &#039;&#039;traffic light&#039;&#039; and &#039;&#039;to kindle excitement&#039;&#039;. They are semantically and syntactically compositional, but statistically idiosyncratic. In the phrase &#039;&#039;traffic light&#039;&#039;, both &#039;&#039;traffic&#039;&#039; and &#039;&#039;light&#039;&#039; retain their simplex senses and combine constructionally to produce a compositional reading.&lt;br /&gt;
&lt;br /&gt;
== Problems for NLP ==&lt;br /&gt;
&lt;br /&gt;
One problem that arises in NLP when MWEs are treated by general, compositional methods of linguistic analysis is the &#039;&#039;&#039;overgeneration&#039;&#039;&#039; problem. From attested expressions, a system could derive other seemingly possible expressions that are equivalent in meaning but do not exist because they have not been institutionalized. &amp;quot;A generation system that is uninformed about both the patterns of compounding and the particular collocational frequency of the relevant dialect would correctly generate &#039;&#039;telephone booth&#039;&#039; (American) or &#039;&#039;telephone box&#039;&#039; (British/Australian), but might also generate such perfectly compositional, but unacceptable examples as &#039;&#039;telephone cabinet&#039;&#039;, &#039;&#039;telephone closet&#039;&#039;, etc.&amp;quot;&amp;lt;ref&amp;gt;Sag et al. (2002), p. 2.&amp;lt;/ref&amp;gt; &lt;br /&gt;
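&lt;br /&gt;
The overgeneration problem can be made concrete with a small Python sketch. It assumes a purely compositional generator that combines any modifier with any head noun, plus a hand-written set of institutionalized compounds; the word lists and function names are hypothetical illustrations, not part of any cited system.&lt;br /&gt;
&lt;br /&gt;
```python
from itertools import product

# Hypothetical candidate words for each slot of the compound (illustrative only).
MODIFIERS = ["telephone"]
HEADS = ["booth", "box", "cabinet", "closet"]

# Hypothetical set of institutionalized compounds, e.g. from a lexicon or corpus counts.
ATTESTED = {"telephone booth", "telephone box"}


def generate_compounds(modifiers, heads):
    """Combine every modifier with every head, as a purely compositional generator would."""
    return [f"{modifier} {head}" for modifier, head in product(modifiers, heads)]


def filter_attested(candidates, attested):
    """Keep only the candidates that are listed as institutionalized."""
    return [candidate for candidate in candidates if candidate in attested]


candidates = generate_compounds(MODIFIERS, HEADS)
accepted = filter_attested(candidates, ATTESTED)
overgenerated = [candidate for candidate in candidates if candidate not in ATTESTED]
```

Without the attested set, the generator has no basis for rejecting &#039;&#039;telephone cabinet&#039;&#039; or &#039;&#039;telephone closet&#039;&#039;; the filter stands in for the collocational knowledge that the quoted passage says is missing.&lt;br /&gt;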
&lt;br /&gt;
Another problem is the &#039;&#039;&#039;idiomaticity&#039;&#039;&#039; problem. It is difficult to predict the meaning of an expression like &#039;&#039;kick the bucket&#039;&#039;, since its meaning is not related to the meanings of &#039;&#039;kick&#039;&#039;, &#039;&#039;the&#039;&#039; and &#039;&#039;bucket&#039;&#039;, even though the expression appears to conform to the grammar of English verb phrases.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&amp;lt;references /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Nicoletta Calzolari et al.: [http://gandalf.aksis.uib.no/lrec2002/pdf/259.pdf &#039;&#039;Towards Best Practice for Multiword Expressions in Computational Lexicons&#039;&#039;] (2002) in: Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002), pp. 1934–40.&lt;br /&gt;
&lt;br /&gt;
Ray Jackendoff: &#039;&#039;The Architecture of the Language Faculty&#039;&#039; (1997), Cambridge, MA: MIT Press.&lt;br /&gt;
&lt;br /&gt;
Ivan A. Sag et al.: [http://www.springerlink.com/content/978-3-540-43219-7/#section=653450&amp;amp;page=1&amp;amp;locus=0 &#039;&#039;Multiword Expressions: A Pain in the Neck for NLP&#039;&#039;] (2002) in: Lecture Notes in Computer Science, Vol. 2276, pp. 1–15.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Further Literature ==&lt;br /&gt;
&lt;br /&gt;
Timothy Baldwin et al.: [http://acl.ldc.upenn.edu/acl2003/mwexp/pdfs/Baldwin.pdf. &#039;&#039;An Empirical Model of Multiword Expression Decomposability&#039;&#039;] (2003) in: Proceedings of the ACL 2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, pp. 89–96.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Eric Wehrli: [http://www.springerlink.com/content/wkeg2pxg1kha5uq9/ &#039;&#039;Parsing and Collocations&#039;&#039;] (2000) in: Lecture Notes in Computer Science, Vol. 1835, pp. 272–282.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
&lt;br /&gt;
[http://en.wikipedia.org/wiki/Multiword_expression Wikipedia article on MWE]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Research]]&lt;/div&gt;</summary>
		<author><name>Michells</name></author>
	</entry>
</feed>