Tibor Kiss


2022

GerEO: A Large-Scale Resource on the Syntactic Distribution of German Experiencer-Object Verbs
Johanna M. Poppek | Simon Masloch | Tibor Kiss
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Although studied for several decades, the syntactic properties of experiencer-object (EO) verbs are still under discussion, while most analyses are not supported by substantial corpus data. With GerEO, we intend to fill this lacuna for German EO-verbs by presenting a large-scale database of more than 10,000 examples for 64 verbs (up to 200 per verb) from a newspaper corpus annotated for several syntactic and semantic features relevant for their analysis, including the overall syntactic construction, the semantic stimulus type, and the form of a possible stimulus preposition, i.e. a preposition heading a PP that indicates (a part/aspect of) the stimulus. Non-psych occurrences of the verbs are not excluded from the database but marked as such to make a comparison possible. Data of this kind can be used to develop and test theoretical hypotheses on the properties of EO-verbs, aid in the construction of experiments as well as provide training and test data for AI systems.

2021

A Quantitative Approach towards German Experiencer-Object Verbs
Johanna M. Poppek | Simon Masloch | Amelie Robrecht | Tibor Kiss
Proceedings of the Second Workshop on Quantitative Syntax (Quasy, SyntaxFest 2021)

2017

Issues of Mass and Count: Dealing with ‘Dual-Life’ Nouns
Tibor Kiss | Francis Jeffry Pelletier | Halima Husić | Johanna Poppek
Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)

The topics of mass and count have been studied for many decades in philosophy (e.g., Quine, 1960; Pelletier, 1975), linguistics (e.g., McCawley, 1975; Allen, 1980; Krifka, 1991) and psychology (e.g., Middleton et al., 2004; Barner et al., 2009). More recently, researchers in computational linguistics have also studied the issues involved (e.g., Pustejovsky, 1991; Bond, 2005; Schmidtke & Kuperman, 2016). As is pointed out in these works, there are many difficult conceptual issues involved in the study of this contrast. In this article we study one of these issues – the “Dual-Life” of being simultaneously +mass and +count – by means of an unusual combination of human annotation, online lexical resources, and online corpora.

2016

A sense-based lexicon of count and mass expressions: The Bochum English Countability Lexicon
Tibor Kiss | Francis Jeffry Pelletier | Halima Husic | Roman Nino Simunic | Johanna Marie Poppek
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The present paper describes the current release of the Bochum English Countability Lexicon (BECL 2.1), a large empirical database consisting of lemmata from Open ANC (http://www.anc.org) with added senses from WordNet (Fellbaum 1998). BECL 2.1 contains ≈ 11,800 annotated noun-sense pairs, divided into four major countability classes and 18 fine-grained subclasses. In the current version, BECL also provides information on nouns whose senses occur in more than one class, allowing a closer look at polysemy and homonymy with regard to countability. Further included are sets of similar senses using the Leacock and Chodorow (LCH) score for semantic similarity (Leacock & Chodorow 1998), information on orthographic variation, on the completeness of all WordNet senses in the database and an annotated representation of different types of proper names. The further development of BECL will investigate the different countability classes of proper names and the general relation between semantic similarity and countability as well as recurring syntactic patterns for noun-sense pairs. The BECL 2.1 database is also publicly available via http://count-and-mass.org.

2014

Building a reference lexicon for countability in English
Tibor Kiss | Francis Jeffry Pelletier | Tobias Stadtfeld
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The present paper describes the construction of a resource to determine the lexical preference class of a large number of English noun-senses (≈ 14,000) with respect to the distinction between mass and count interpretations. In constructing the lexicon, we have employed a questionnaire-based approach drawing on existing resources such as the Open ANC (http://www.anc.org) and WordNet (CITATION). The questionnaire requires annotators to answer six questions about a noun-sense pair. Depending on the answers, a given noun-sense pair can be assigned to fine-grained noun classes, spanning the area between count and mass. The reference lexicon contains almost 14,000 noun-sense pairs. An initial data set of 1,000 has been annotated together by four native speakers, while the remaining 12,800 noun-sense pairs have been annotated in parallel by two annotators each. We can confirm the general feasibility of the approach by reporting satisfactory values between 0.694 and 0.755 in inter-annotator agreement using Krippendorff’s 𝛼.

2012

EXCOTATE: An Add-on to MMAX2 for Inspection and Exchange of Annotated Data
Tobias Stadtfeld | Tibor Kiss
Proceedings of COLING 2012: Demonstration Papers

2010

A Logistic Regression Model of Determiner Omission in PPs
Tibor Kiss | Katja Keßelmeier | Antje Müller | Claudia Roch | Tobias Stadtfeld | Jan Strunk
Coling 2010: Posters

An Annotation Schema for Preposition Senses in German
Antje Müller | Olaf Hülscher | Claudia Roch | Katja Keßelmeier | Tobias Stadtfeld | Jan Strunk | Tibor Kiss
Proceedings of the Fourth Linguistic Annotation Workshop

2007

Measuring the Productivity of Determinerless PPs
Florian Dömges | Tibor Kiss | Antje Müller | Claudia Roch
Proceedings of the Fourth ACL-SIGSEM Workshop on Prepositions

2006

Unsupervised Multilingual Sentence Boundary Detection
Tibor Kiss | Jan Strunk
Computational Linguistics, Volume 32, Number 4, December 2006

2002

Scaled Log Likelihood Ratios for the Detection of Abbreviations in Text Corpora
Tibor Kiss | Jan Strunk
COLING 2002: The 17th International Conference on Computational Linguistics: Project Notes

1996

Integrating Syntactic and Prosodic Information for the Efficient Detection of Empty Categories
Anton Batliner | Anke Feldhaus | Stefan Geißler | Andreas Kießling | Tibor Kiss | Ralf Kompe | Elmar Nöth
COLING 1996 Volume 1: The 16th International Conference on Computational Linguistics