File:Turkboot615 A-M.zip

From ACL Wiki

Turkboot615_A-M.zip (file size: 5.63 MB, MIME type: application/zip)


TWSI (Turk bootstrap Word Sense Inventory) version 2.0. This is the first part, target letters A-M.

The acquisition process is documented in the accompanying paper. In short, three MTurk tasks were used to produce the data provided here:
- "Substitutable words in context": Workers are presented with a sentence containing a target word and supply substitutions.
- "Are these words used with the same meaning?": Workers are presented with a pair of sentences with the same target word marked in bold and decide whether the meanings are identical, similar, or different.
- "Match the Meaning": Workers are presented with a sense inventory represented by prototypical sentences and align further sentences containing the same target word to those senses.

The TWSI is organized by target word. This version covers the 615 next most frequent nouns in English Wikipedia (dump from January 3rd, 2008) that are not already included in TWSI 1.0. Each target is organized into senses, and each sense has associated substitutions and sentences in which the target word was used in that sense.
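The organization described above (target word, then senses, then substitutions and sentences per sense) can be pictured with a minimal Python sketch. This is only an illustration: the class names, field names, sense identifiers, and the sample entry are hypothetical and are not taken from the archive, whose actual file layout is described in the paper and in the data itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sense:
    # Hypothetical identifier; the real inventory's ID scheme may differ.
    sense_id: str
    # Substitutions supplied for the target word in this sense.
    substitutions: List[str] = field(default_factory=list)
    # Sentences in which the target word was used in this sense.
    sentences: List[str] = field(default_factory=list)


@dataclass
class TargetEntry:
    # The target noun that this entry groups senses for.
    target: str
    senses: List[Sense] = field(default_factory=list)

    def senses_with_substitution(self, word: str) -> List[Sense]:
        """Return the senses that list `word` as a substitution."""
        return [s for s in self.senses if word in s.substitutions]


# Invented illustration data, NOT taken from the TWSI files.
bank = TargetEntry(
    target="bank",
    senses=[
        Sense(
            sense_id="bank#1",
            substitutions=["financial institution", "lender"],
            sentences=["She deposited the check at the bank."],
        ),
        Sense(
            sense_id="bank#2",
            substitutions=["shore", "riverside"],
            sentences=["They walked along the bank of the river."],
        ),
    ],
)

matches = bank.senses_with_substitution("shore")
print([s.sense_id for s in matches])
```

Keeping substitutions and sentences on the sense rather than on the target mirrors the description: both are associated with a specific sense, so a lookup by substitution naturally returns senses.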

This data has been curated and extracted from the output of a turk bootstrapping acquisition cycle. Raw data is not included here, but is available upon request.

File history


Date/Time | Dimensions | User | Comment
current, 14:56, 18 October 2010 | 5.63 MB | Biem | TWSI (Turk bootstrap Word Sense Inventory) version 2.0. This is the first part, target letters A-M. For the description of the process, please consult the paper for further documentation. In short, three Mturk tasks were used to yield the data provided
