Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC'04)
L04-1001
: Marilyn Walker
Can We Talk? Prospects for Automatically Training Spoken Dialogue Systems
L04-1002
: Hans Uszkoreit
Strategic Directions of National and International Research Funding
L04-1003
: Gregor Thurmair
Multilingual Content Processing
L04-1004
: Brian MacWhinney
Collaborative Commentary: Opening Up Spoken Language Databases
L04-1005
: Nick Campbell
Getting to the Heart of the Matter; Speech is More than Just the Expression of Text or Language
L04-1006
: Bente Maegaard
Industrial Needs for Language Resources
L04-1007
: Junichi Tsujii
Thesaurus or Logical Ontology, Which do we Need for Mining Text?
L04-1008
: Kamlesh Dutta; Saroj Kaushik; Nupur Prakash
Information Extraction from Hindi Texts
L04-1009
: Cornelis H.A. Koster; Stefan Gradmann
The Language Belongs to the People!
L04-1010
: Paul Schmidt; Sandrine Garnier; Mike Sharwood; Toni Badia; Lourdes Díaz; Martí Quixal; Ana Ruggia; Antonio S. Valderrabanos; Alberto J. Cruz; Enrique Torrejon; Celia Rico; Jorge Jimenez
ALLES: Integrating NLP in ICALL Applications
L04-1011
: George Doddington; Alexis Mitchell; Mark Przybocki; Lance Ramshaw; Stephanie Strassel; Ralph Weischedel
The Automatic Content Extraction (ACE) Program Tasks, Data, and Evaluation
L04-1012
: Lee Schwartz; Takako Aikawa
Multilingual Corpus-based Approach to the Resolution of English -ing
L04-1013
: Diana Santos; Anabela Barreiro
On the Problems of Creating a Golden Standard of Inflected Forms in Portuguese
L04-1014
: Sebastian Möller; Jan Krebber; Alexander Raake; Paula Smeele; Martin Rajman; Mirek Melichar; Vincenzo Pallotta; Gianna Tsakou; Basilis Kladis; Anestis Vovos; Jettie Hoonhout; Dietmar Schuchardt; Nikos Fakotakis; Todor Ganchev; Ilyas Potamitis
INSPIRE: Evaluation of a Smart-Home System for Infotainment Management and Device Control
L04-1015
: Ielka van der Sluis; Emiel Krahmer
Evaluating Multimodal NLG Using Production Experiments
L04-1016
: Nuno Seco; Tony Veale; Jer Hayes
Concept Creation in Lexical Ontologies
L04-1017
: Tony Veale
Polysemy and Category Structure in WordNet: An Evidential Approach
L04-1018
: Susanne Salmon-Alt; Laurent Romary
Towards a Reference Annotation Framework
L04-1019
: Sebastian Möller
A New ITU-T Recommendation on the Evaluation of Telephone-Based Spoken Dialogue Systems
L04-1020
: Dekai Wu; Grace Ngai; Marine Carpuat
Raising the Bar: Stacked Conservative Error Correction Beyond Boosting
L04-1021
: Franca Debole; Fabrizio Sebastiani
An Analysis of the Relative Difficulty of Reuters-21578 Subsets
L04-1022
: Ajay S. Bhaskarabhatla; Sriganesh Madhvanath
Experiences in Collection of Handwriting Data for Online Handwriting Recognition in Indic Scripts
L04-1023
: Hsin-Hsi Chen; Yi-Cheng Yu; Chih-Long Lin
Collocation Extraction Using Web Statistics
L04-1024
: Ajay S. Bhaskarabhatla; Sriganesh Madhvanath
An XML Representation for Annotated Handwriting Datasets for Online Handwriting Recognition
L04-1025
: Christina Alexandris; Stavroula-Evita Fotinea
Reusing Language Resources for Speech Applications involving Emotion
L04-1026
: Eva Navas; Amaia Castelruiz; Iker Luengo; Jon Sánchez; Inmaculada Hernáez
Designing and Recording an Audiovisual Database of Emotional Speech in Basque
L04-1027
: Gaël Dias; Sérgio Nunes
Evaluation of Different Similarity Measures for the Extraction of Multiword Units in a Reinforcement Learning Environment
L04-1028
: Hiroshi Nakagawa; Hidetaka Masuda; Dai Sato
Terminal Device Oriented Comparable Corpora and its Alignment - Towards Extracting Paraphrasing Patterns
L04-1029
: Serge Sharoff
Towards Basic Categories for Describing Properties of Texts in a Corpus
L04-1030
: Michael Carl; Ecaterina Rascu; Johann Haller
Using Weighted Abduction to Align Term Variant Translations in Bilingual Texts
L04-1031
: Luciana Bordoni
Investigation on Semantics to Improve the COVAX System
L04-1032
: Wim Peters
Incremental Knowledge Acquisition from WordNet and EuroWordNet
L04-1033
: Vivi Năstase; Rada Mihalcea
Finding Semantic Associations on Express Lane
L04-1034
: Mickel Grönroos; Manne Miettinen
Infrastructure for Collaborative Annotation of Speech
L04-1035
: Diana Maynard; Kalina Bontcheva; Hamish Cunningham
Automatic Language-Independent Induction of Gazetteer Lists
L04-1036
: Nikos Fakotakis
Corpus Design, Recording and Phonetic Analysis of Greek Emotional Database
L04-1037
: Yorick Wilks; Nick Webb; Andrea Setzer; Mark Hepple; Roberta Catizone
Human Dialogue Modelling Using Annotated Corpora
L04-1038
: Stefan Schaden
CrossTowns: Automatically Generated Phonetic Lexicons of Cross-lingual Pronunciation Variants of European City Names
L04-1039
: Hsin-Hsi Chen; Yi-Lin Chu
Pattern Discovery in Named Organization Corpus
L04-1040
: Masumi Narita; Chieko Sato; Masatoshi Sugiura
Connector Usage in the English Essay Writing of Japanese EFL Learners
L04-1041
: Patrick Drouin
Detection of Domain Specific Terminology Using Corpora Comparison
L04-1042
: Wolfgang Minker
Comparative Evaluation of a Stochastic Parser on Semantic and Syntactic-semantic Labels
L04-1043
: Chu-Ren Huang; Ru-Yng Chang; Hshiang-Pin Lee
Sinica BOW (Bilingual Ontological Wordnet): Integration of Bilingual WordNet and SUMO
L04-1044
: Stephan Bopp; Sandro Pedrazzini; Elisabeth Maier
How to Disassemble Alphabetical Processions - Morphological Treatment of Unknown Words
L04-1045
: Darinka Verdonik; Matej Rojc; Zdravko Kačič
Creating Slovenian Language Resources for Development of Speech-to-speech Translation Components
L04-1046
: Magnus Sahlgren
Automatic Bilingual Lexicon Acquisition Using Random Indexing of Aligned Bilingual Data
L04-1047
: Bojan Kotnik; Zdravko Kačič; Bogomir Horvat
The Development and Integration of the LDA-Toolkit Into COST249 SpeechDat(II) SIG Reference Recognizer
L04-1048
: Özlem Öztürk; Özgul Salor; Tolga Çiloğlu; Mubeccel Demirekler
Duration Modeling For Turkish Text-to-Speech Synthesis System
L04-1049
: Philipp Cimiano; Andreas Hotho; Steffen Staab
Clustering Concept Hierarchies from Text
L04-1050
: Alvin F. Martin; John S. Garofolo; Jonathan C. Fiscus; Audrey N. Le; David S. Pallett; Mark A. Przybocki; Gregory A. Sanders
NIST Language Technology Evaluation Cookbook
L04-1051
: Satoshi Sekine; Chikashi Nobata
Definition, Dictionaries and Tagger for Extended Named Entity Hierarchy
L04-1052
: Beom-mo Kang; Hunggyu Kim
Sejong Korean Corpora in the Making
L04-1053
: Yong-Ju Lee; Bong-Wan Kim; Young-Il Kim; Dae-Lim Choi; Kwang-Hyun Lee; Yongnam Um
Creation and Assessment of Korean Speech and Noise DB in Car Environment
L04-1054
: Alessandro Cucchiarelli; Roberto Navigli; Francesca Neri; Paola Velardi
Automatic Generation of Glosses in the OntoLearn System
L04-1055
: An Vandecatseye; Jean-Pierre Martens; Joao Neto; Hugo Meinedo; Carmen Garcia-Mateo; Javier Dieguez; France Mihelic; Janez Zibert; Jan Nouza; Petr David; Matus Pleva; Anton Cizmar; Harris Papageorgiou; Christina Alexandris
The COST278 Pan-European Broadcast News Database
L04-1056
: Daan Wissing; Jean-Pierre Martens; Ulrike Janke; Wim Goedertier
A Spoken Afrikaans Language Resource Designed for Research on Pronunciation Variations
L04-1057
: Tania Ellbogen; Florian Schiel; Alexander Steffen
The BITS Speech Synthesis Corpus for German
L04-1058
: Florian Schiel
MAUS Goes Iterative
L04-1059
: Mark Stevenson; Paul Clough
EuroWordNet as a Resource for Cross-language Information Retrieval
L04-1060
: Jonas Sjöbergh; Viggo Kann
Finding the Correct Interpretation of Swedish Compounds, a Statistical Approach
L04-1061
: Maya Ando; Satoshi Sekine; Shun Ishizaki
Automatic Extraction of Hyponyms from Japanese Newspapers Using Lexico-syntactic Patterns
L04-1062
: Karin Kipper; Benjamin Snyder; Martha Palmer
Extending a Verb-lexicon Using a Semantically Annotated Corpus
L04-1063
: J.C.T. Beeken; P.H.J. van der Kamp
The Centre for Dutch Language and Speech Technology (TST Centre)
L04-1064
: Sue Ellen Wright
A Global Data Category Registry for Interoperable Language Resources
L04-1065
: J. G. Kruyt
The Integrated Language Database of 8th - 21st-Century Dutch
L04-1066
: Hans Dybkjær; Laila Dybkjær
From Acts and Topics to Transactions and Dialogue Smoothness
L04-1067
: Hideki Kashioka
Grouping Synonymous Sentences from a Parallel Corpus
L04-1068
: Khurshid Ahmad; Maria Teresa Musacchio
Discovery of (New) Knowledge and the Analysis of Text Corpora
L04-1069
: Harald Höge; Josef G. Bauer; Christian Geißler; Panji Setiawan; Kai Steinert
Evaluation of Microphone Array Front-Ends for ASR - an Extension of the AURORA Framework
L04-1070
: Janez Žibert; France Mihelič
Development of Slovenian Broadcast News Speech Database
L04-1071
: Eckhard Bick
A Named Entity Recognizer for Danish
L04-1072
: M. Teresa Cabré; Carme Bach; Rosa Estopà; Judit Feliu; Gemma Martínez; Jorge Vivaldi
The GENOMA-KB Project: Towards the Integration of Concepts, Terms, Textual Corpora and Entities
L04-1073
: Elisabete Ranchhod; Paula Carvalho; Cristina Mota; Anabela Barreiro
Portuguese Large-scale Language Resources for NLP Applications
L04-1074
: Umut Özge; Bilge Say
Development of a Corpus Workbench for the METU Turkish Corpus
L04-1075
: Raúl Araya; Jordi Vivaldi
Mercedes, a Term-in-Context Highlighter
L04-1076
: Henrik Selsøe Sørensen
The Bilingual Web Dictionary on Demand
L04-1077
: Tomaž Erjavec; Kristina Hmeljak Sangawa; Irena Srdanović; Anton ml. Vahčič
Making an XML-based Japanese-Slovene Learners' Dictionary
L04-1078
: Tomaž Erjavec
MULTEXT-East Version 3: Multilingual Morphosyntactic Specifications, Lexicons and Corpora
L04-1079
: Lorena Seijo Pereiro; Ana Martínez Ínsua; Francisco Méndez Pazó; Francisco Campillo Díaz; Eduardo Rodríguez Banga
A Galician Textual Corpus for Morphosyntactic Tagging with Application to Text-to-Speech Synthesis
L04-1080
: Salvador España; María José Castro; José Luis Hidalgo
The SPARTACUS-Database: a Spanish Sentence Database for Offline Handwriting Recognition
L04-1081
: Sofia Stamou; Goran Nenadic; Dimitris Christodoulakis
Exploring Balkanet Shared Ontology for Multilingual Conceptual Indexing
L04-1082
: Mitsuo Shimohata; Eiichiro Sumita; Yuji Matsumoto
Building a Paraphrase Corpus for Speech Translation
L04-1083
: Yasuhiro Akiba; Eiichiro Sumita; Hiromi Nakaiwa; Seiichi Yamamoto; Hiroshi G. Okuno
Incremental Methods to Select Test Sentences for Evaluating Translation Ability
L04-1084
: Jan Odijk
Reusable Lexical Representations for Idioms
L04-1085
: Daniel Tihelka; Jindřich Matoušek
The Design of Czech Language Formal Listening Tests for the Evaluation of TTS Systems
L04-1086
: Janez Stergar; Caglayan Erdem; Bogomir Horvat; Zdravko Kačič
A Data-driven Adaptation of Prosody in a Multilingual TTS
L04-1087
: M. Taulé; M. Civit; N. Artigas; M. García; L. Màrquez; M.A. Martí; B. Navarro
MiniCors and Cast3LB: Two Semantically Tagged Spanish Corpora
L04-1088
: Johnny Bigert
Probabilistic Detection of Context-Sensitive Spelling Errors
L04-1089
: Andrej Žgank; Tomaž Rotovnik; Mirjam Sepesy Maučec; Darinka Verdonik; Janez Kitak; Damjan Vlaj; Vladimir Hozjan; Zdravko Kačič; Bogomir Horvat
Acquisition and Annotation of Slovenian Broadcast News Database
L04-1090
: Andrej Žgank; Zdravko Kačič; Frank Diehl; Klara Vicsi; Gyorgy Szaszak; Jozef Juhar; Slavomir Lihan
The COST 278 MASPER Initiative - Crosslingual Speech Recognition with Large Telephone Databases
L04-1091
: Reinhard Rapp
Utilizing the One-Sense-per-Discourse Constraint for Fully Unsupervised Word Sense Induction and Disambiguation
L04-1092
: Reinhard Rapp
A Freely Available Automatically Generated Thesaurus of Related Words
L04-1093
: Vincent Vandeghinste; Erik Tjong Kim Sang
Using a Parallel Transcript/Subtitle Corpus for Sentence Compression
L04-1094
: Sofia Stamou; Dimitris Christodoulakis
Handling Subtle Sense Distinctions Through Wordnet Semantic Types
L04-1095
: Athanasios Karasimos; Amy Isard
Multi-lingual Evaluation of a Natural Language Generation System
L04-1096
: Heike Telljohann; Erhard Hinrichs; Sandra Kübler
The Tüba-D/Z Treebank: Annotating German with a Context-Free Backbone
L04-1097
: John S. Garofolo; Christophe D. Laprun; Martial Michel; Vincent M. Stanford; Elham Tabassi
The NIST Meeting Room Pilot Corpus
L04-1098
: Dafydd Gibbon; Catherine Bow; Steven Bird; Baden Hughes
Securing Interpretability: The Case of Ega Language Documentation
L04-1099
: Toshiyuki Takezawa; Genichiro Kikui
A Comparative Study on Human Communication Behaviors and Linguistic Characteristics for Speech-to-Speech Translation
L04-1100
: Núria Bel; Cornelis H.A. Koster; Marta Villegas
Cost-effective Cross-lingual Document Classification
L04-1101
: Katrin Erk; Sebastian Padó
A Powerful and Versatile XML Format for Representing Role-semantic Annotation
L04-1102
: Stefan Baumann; Caren Brinckmann; Silvia Hansen-Schirra; Geert-Jan Kruijff; Ivana Kruijff-Korbayová; Stella Neumann; Erich Steiner; Elke Teich; Hans Uszkoreit
The MULI Project: Annotation and Analysis of Information Structure in German and English
L04-1103
: P. H. J. van der Kamp; J. G. Kruyt
Putting the Dutch PAROLE Corpus to Work
L04-1104
: Julie Carson-Berndsen; Robert Kelly
Acquiring Reusable Multilingual Phonotactic Resources
L04-1105
: Moritz Neugebauer; Stephen Wilson
Phonological Treebanks. Issues in Generation and Application
L04-1106
: Pedro Concejero Cerezo; Juan José Rodríguez Soler; Daniel Tapias Merino; Alberto J. Sánchez García
Methodology for Rapid Prototyping and Testing of ASR Based User Interfaces
L04-1107
: Lars Degerstedt; Arne Jönsson
Open Resources for Language Technology
L04-1108
: Marie-Laure Reinberger; Walter Daelemans
Unsupervised Text Mining for Ontology Extraction: An Evaluation of Statistical Measures
L04-1109
: Daniel Aioanei; Julie Carson-Berndsen; Anja Geumann; Robert Kelly; Moritz Neugebauer; Stephen Wilson
A Multilingual Phonological Resource Toolkit for Ubiquitous Speech Technology
L04-1110
: Oscar Corcho; Raúl García-Castro; Asunción Gómez-Pérez
Benchmarking Ontology Tools. A Case Study for the WebODE Platform.
L04-1111
: Bayan Abu Shawar; Eric Atwell
A Chatbot as a Novel Corpus Visualization Tool
L04-1112
: Florentina Vasilescu; Philippe Langlais; Guy Lapalme
Evaluating Variants of the Lesk Approach for Disambiguating Words
L04-1113
: Sergei Nirenburg; Marjorie McShane; Stephen Beale
The Rationale for Building an Ontology Expressly for NLP
L04-1114
: Marjorie McShane; Stephen Beale; Sergei Nirenburg
Some Meaning Procedures of Ontological Semantics
L04-1115
: Eric K. Ringger; Robert C. Moore; Eugene Charniak; Lucy Vanderwende; Hisami Suzuki
Using the Penn Treebank to Evaluate Non-Treebank Parsers
L04-1116
: Hidetsugu Nanba; Manabu Okumura
Comparison of Some Automatic and Manual Methods for Summary Evaluation Based on the Text Summarization Challenge 2
L04-1117
: Anthony McEnery; Zhonghua Xiao
The Lancaster Corpus of Mandarin Chinese: A Corpus for Monolingual and Contrastive Language Study
L04-1118
: H. Folch; B. Habert; M. Jardino; N. Pernelle; M.C. Rousset; A. Termier
Highlighting Latent Structure in Documents
L04-1119
: Dan Tufis; Radu Ion; Nancy Ide
Word Sense Disambiguation as a Wordnets' Validation Method in Balkanet
L04-1120
: Dan Tufis
Term Translations in Parallel Corpora: Discovery and Consistency Check
L04-1121
: Luís Sarmento; Belinda Maia; Diana Santos
The Corpógrafo - a Web-based Environment for Corpora Research
L04-1122
: Daniel Ferrés; Marc Massot; Muntsa Padró; Horacio Rodríguez; Jordi Turmo
Automatic Classification of Geographic Named Entities
L04-1123
: Olivia Sanchez-Graillet; Massimo Poesio
Acquiring Bayesian Networks from Text
L04-1124
: Thanh Bon Nguyen; Thi Minh Huyen Nguyen; Laurent Romary; Xuan Luong Vu
Developing Tools and Building Linguistic Resources for Vietnamese Morpho-syntactic Processing
L04-1125
: Christoph Draxler; Klaus Jänsch
SpeechRecorder - a Universal Platform Independent Multi-Channel Audio Recording Software
L04-1126
: Yasmina Quatrain; Sylvaine Nugier; Anne Peradotto
An Evaluation Protocol for Text Mining Tools: ALCESTE, SAS Text Miner, SPAD-CRM and Temis Text Mining Solutions Testing
L04-1127
: Alessandro Panunzi; Eugenio Picchi; Massimo Moneglia
Using PiTagger for Lemmatization and PoS Tagging of a Spontaneous Speech Corpus: C-Oral-Rom Italian
L04-1128
: Marco Baroni; Silvia Bernardini; Federica Comastri; Lorenzo Piccioni; Alessandra Volpi; Guy Aston; Marco Mazzoleni
Introducing the La Repubblica Corpus: A Large, Annotated, TEI(XML)-compliant Corpus of Newspaper Italian
L04-1129
: Nancy Ide; David Woolner
Exploiting Semantic Web Technologies for Intelligent Access to Historical Documents
L04-1130
: Marco Baroni; Sabrina Bisi
Using Cooccurrence Statistics and the Web to Discover Synonyms in a Technical Language
L04-1131
: Hiroyuki Shinnou; Minoru Sasaki
Semi-supervised Learning by Fuzzy Clustering and Ensemble Learning
L04-1132
: Nick Campbell
Speech & Expression; the Value of a Longitudinal Corpus
L04-1133
: Salma Jamoussi; Kamel Smaïli; Dominique Fohr; Jean-Paul Haton
A Complete Understanding Speech System Based on Semantic Concepts
L04-1134
: Kiril Simov; Alexander Simov; Hristo Ganev; Krasimira Ivanova; Ilko Grigorov
The CLaRK System: XML-based Corpora Development System for Rapid Prototyping
L04-1135
: Toni Badia; Àngel Gil; Martí Quixal; Oriol Valentín
NLP-enhanced Error Checking for Catalan Unrestricted Text
L04-1136
: Kalina Bontcheva
Open-source Tools for Creation, Maintenance, and Storage of Lexical Resources for Language Generation from Ontologies
L04-1137
: Agnes Lisowska; Andrei Popescu-Belis; Susan Armstrong
User Query Analysis for the Specification and Evaluation of a Dialogue Processing and Retrieval System
L04-1138
: Borislav Popov; Angel Kirilov; Diana Maynard; Dimitar Manov
Creation of Reusable Components and Language Resources for Named Entity Recognition in Russian
L04-1139
: Andrei Popescu-Belis
Abstracting a Dialog Act Tagset for Meeting Processing
L04-1140
: Andrei Popescu-Belis; Loïs Rigouste; Susanne Salmon-Alt; Laurent Romary
Online Evaluation of Coreference Resolution
L04-1141
: Xavier Carreras; Isaac Chao; Lluís Padró; Muntsa Padró
FreeLing: An Open-Source Suite of Language Analyzers
L04-1142
: Hisami Suzuki
Phrase-Based Dependency Evaluation of a Japanese Parser
L04-1143
: Baden Hughes; Catherine Bow; Steven Bird
Functional Requirements for an Interlinear Text Editor
L04-1144
: Baden Hughes; David Penton; Steven Bird; Catherine Bow; Gillian Wigglesworth; Patrick McConvell; Jane Simpson
Management of Metadata in Linguistic Fieldwork: Experience from the ACLA Project
L04-1145
: Adam Przepiórkowski; Zygmunt Krynicki; Łukasz Dębowski; Marcin Woliński; Daniel Janus; Piotr Bański
A Search Tool for Corpora with Positional Tagsets and Ambiguities
L04-1146
: Peter A. Heeman
The American English SALA-II Data Collection
L04-1147
: Andrew Finch; Yasuhiro Akiba; Eiichiro Sumita
How Does Automatic Machine Translation Evaluation Correlate with Human Scoring as the Number of Reference Translations Increases?
L04-1148
: Slaven Bilac; Timothy Baldwin; Hozumi Tanaka
Evaluating the FOKS Error Model
L04-1149
: Guillaume Gibert; Gérard Bailly; Frédéric Eliséi; Denis Beautemps; Rémi Brun
Evaluation of a Speech Cuer: From Motion Capture to a Concatenative Text-to-cued Speech System
L04-1150
: Nikolaos Nanas; Victoria Uren; Anne de Roeck; John Domingue
Beyond TREC's Filtering Track
L04-1151
: Sanni Nimb
A Corpus-based Syntactic Lexicon for Adverbs
L04-1152
: Carol Peters; Martin Braschler; Khalid Choukri; Julio Gonzalo; Michael Kluck
The Future of Evaluation for Cross-Language Information Retrieval Systems
L04-1153
: Henk van den Heuvel; Phil Hall; Harald Höge; Asunción Moreno; Antonio Rincon; Francesco Senia
SALA II Across the Finish Line: A Large Collection of Mobile Telephone Speech Databases from North and Latin America completed
L04-1154
: Xavier Gómez-Guinovart; Elena Sacau Fontenla
Parallel Corpora for the Galician Language: Building and Processing of the CLUVI (Linguistic Corpus of the University of Vigo)
L04-1155
: Junko Hosaka; Igor V. Kurochkin; Akihiko Konagaya
PBIE: A Data Preparation Toolkit Toward Developing a Parsing-Based Information Extraction System
L04-1156
: Andreas Wagner; Bettina Zeisler
A Syntactically Annotated Corpus of Tibetan
L04-1157
: Montserrat Marimon; Núria Bel
Lexical Entry Templates for Robust Deep Parsing
L04-1158
: Dan Tufis; Liviu Dragomirescu
Tiered Tagging Revisited
L04-1159
: Dan Tufis; Eduard Barbu
A Methodology and Associated Tools for Building Interlingual Wordnets
L04-1160
: Doaa Samy; Antonio Moreno-Sandoval; José M. Guirao
Construction of a Bilingual Arabic-Spanish Lexicon of Verbs Based on a Parallel Corpus
L04-1161
: I. Alegria; A. Gurrutxaga; P. Lizaso; X. Saralegi; S. Ugartetxea; R. Urizar
A XML-Based Term Extraction Tool for Basque
L04-1162
: Manolis Maragoudakis; Nikos Fakotakis; George Kokkinakis
A Bayesian Model for Shallow Syntactic Parsing of Natural Language Texts
L04-1163
: Florbela Barreto; Raquel Amaro
Multifunctional Computational Lexicon of Contemporary Portuguese: An Available Resource for Multitype Applications
L04-1164
: Jacques Duchateau; Tim Ceyssens; Hugo Van hamme
Use and Evaluation of Prosodic Annotations in Dutch
L04-1165
: Stephan Busemann; Hans-Ulrich Krieger
Resources and Techniques for Multilingual Information Extraction
L04-1166
: Lei Chen; Yang Liu; Mary Harper; Eduardo Maia; Susan McRoy
Evaluating Factors Impacting the Accuracy of Forced Alignments in a Multimodal Corpus
L04-1167
: C. Barras; G. Adda; M. Adda-Decker; B. Habert; P. Boula de Mareüil; P. Paroubek
Automatic Audio and Manual Transcripts Alignment, Time-code Transfer and Selection of Exact Transcripts
L04-1168
: V. Guijarrubia; I. Torres; L.J. Rodríguez
Evaluation of a Spoken Phonetic Database in Basque Language
L04-1169
: Yves Lepage; Guilhem Peralta
Using Paradigm Tables to Generate New Utterances Similar to those Existing in Linguistic Resources
L04-1170
: Mohamed Afify; Ossama Emam
Collection and Evaluation of Broadcast News Data for Arabic
L04-1171
: Kiril Simov; Petya Osenova; Sia Kolkovska; Elisaveta Balabanova; Dimitar Doikoff
A Language Resources Infrastructure for Bulgarian
L04-1172
: A. Batliner; C. Hacker; S. Steidl; E. Nöth; S. D'Arcy; M. Russell; M. Wong
"You Stupid Tin Box" - Children Interacting with the AIBO Robot: A Cross-linguistic Emotional Speech Corpus
L04-1173
: James Dowdall; Will Lowe; Jeremy Ellman; Fabio Rinaldi; Michael Hess
The Role of MultiWord Terminology in Knowledge Management
L04-1174
: Jörg Tiedemann; Lars Nygaard
The OPUS Corpus - Parallel and Free: http://logos.uio.no/opus
L04-1175
: Javier Farreres; Horacio Rodríguez
Selecting the Correct English Synset for a Spanish Sense
L04-1176
: Asunción Moreno; Khalid Choukri; Phil Hall; Henk van den Heuvel; Eric Sanders; Francesco Senia; Herbert Tropf
Collection of SLR in the Asian-Pacific Area
L04-1177
: Jaroslava Hlaváčová; Jana Klímová
Derivational Relations in Flectional Languages - Czech Case
L04-1178
: David Dalby; Lee Gillam; Christopher Cox; Debbie Garside
Standards for Language Codes: developing ISO 639
L04-1179
: Henk van den Heuvel; Dorota Iskra; Eric Sanders; Folkert de Vriend
SLR Validation: Current Trends and Developments
L04-1180
: Horacio Saggion
Identifying Definitions in Text Collections for Question Answering
L04-1181
: Laura Alonso; Irene Castellón; Jordi Escribano; Xavier Messeguer; Lluís Padró
Multiple Sequence Alignment for Characterizing the Lineal Structure of Revision
L04-1182
: Ben Hutchinson
Mining the Web for Discourse Markers
L04-1183
: Magnus Merkel; Andreas Lange
A Pattern Extraction Workbench Combining Multiple Linguistic Levels
L04-1184
: Anke Holler; Jan Frederik Maas; Angelika Storrer
Exploiting Coreference Annotations for Text-to-Hypertext Conversion
L04-1185
: Laura Hasler
"Why do you Ignore me?" - Proof that not all Direct Speech is Bad
L04-1186
: Costanza Navarretta; Bolette Sandford Pedersen; Dorte Haltrup Hansen
Human Language Technology Elements in a Knowledge Organisation System - The VID Project
L04-1187
: Kedar Bellare; Anish Das Sarma; Atish Das Sarma; Navneet Loiwal; Vaibhav Mehta; Ganesh Ramakrishnan; Pushpak Bhattacharyya
Generic Text Summarization Using WordNet
L04-1188
: Natalia V. Loukachevitch; Boris V. Dobrov
Development of Bilingual Domain-Specific Ontology for Automatic Conceptual Indexing
L04-1189
: Natalia V. Loukachevitch; Boris V. Dobrov
Development of Ontologies with Minimal Set of Conceptual Relations
L04-1190
: Maria Fernanda Bacelar do Nascimento; Amália Mendes; Luísa Pereira
Providing On-line Access to Portuguese Language Resources: Corpora and Lexicons
L04-1191
: Bruno Cartoni; Pierrette Bouillon; Yalina Alphonse; Sabine Lehmann
Automatisation of the Activity of Term Collection in Different Languages
L04-1192
: Jorge Vivaldi; Horacio Rodríguez
Automatically Selecting Domain Markers for Terminology Extraction
L04-1193
: Anne Vilnat; Patrick Paroubek; Laura Monceaux; Isabelle Robba; Véronique Gendner; Gabriel Illouz; Michèle Jardino
The Ongoing Evaluation Campaign of Syntactic Parsing of French: EASY
L04-1194
: Kateřina Veselá; Jiří Havelka; Eva Hajičová
Annotators Agreement: The Case of Topic-Focus Articulation
L04-1195
: Scott S. L. Piao; Paul Rayson; Dawn Archer; Tony McEnery
Evaluating Lexical Resources for a Semantic Tagger
L04-1196
: Frédéric Landragin; Alexandre Denis; Annalisa Ricci; Laurent Romary
Multimodal Meaning Representation for Generic Dialogue Systems Architectures
L04-1197
: Anna Braasch; Sussi Olsen
STO: A Danish Lexicon Resource - Ready for Applications
L04-1198
: Kalliopi Zervanou; John McNaught
A Domain-Independent Approach to IE Rule Development
L04-1199
: Laurence Devillers; Hélène Maynard; Sophie Rosset; Patrick Paroubek; Kevin McTait; D. Mostefa; Khalid Choukri; Laurent Charnay; Caroline Bousquet; Nadine Vigouroux; Frédéric Béchet; Laurent Romary; Jean-Yves Antoine; J. Villaneau; Myriam Vergnes; J. Goulian
The French MEDIA/EVALDA Project: the Evaluation of the Understanding Capability of Spoken Language Dialogue Systems
L04-1200
: Emanuela Cresti; Fernanda Bacelar do Nascimento; Antonio Moreno Sandoval; Jean Veronis; Philippe Martin; Khalid Choukri
The C-ORAL-ROM CORPUS. A Multilingual Resource of Spontaneous Speech for Romance Languages
L04-1201
: Bodil Nistrup Madsen; Hanne Erdman Thomsen; Carl Vikner
Principles of a System for Terminological Concept Modelling
L04-1202
: Christophe Van Bael; Helmer Strik; Henk van den Heuvel
On the Usefulness of Large Spoken Language Corpora for Linguistic Research
L04-1203
: Dafydd Gibbon; Firmin Ahoua; Eddi Gbéry; Eno-Abasi Urua; Moses Ekpenyong
WALA: A Multilingual Resource Repository for West African Languages
L04-1204
: Sabine Bartsch
Annotating a Corpus for Building a Domain-specific Knowledge Base
L04-1205
: Constantin Orăsan; Viktor Pekar; Laura Hasler
A Comparison of Summarisation Methods Based on Term Specificity Estimation
L04-1206
: Massimo Moneglia
Measurements of Spoken Language Variability in a Multilingual Corpus. Predictable Aspects
L04-1207
: L. Devillers; I. Vasilescu
Reliability of Lexical and Prosodic Cues in Two Real-life Spoken Dialog Corpora
L04-1208
: Carlo Strapparava; Alessandro Valitutti
WordNet Affect: an Affective Extension of WordNet
L04-1209
: Margarita Hospedales; Manel Rodríguez
The GENOMA-KB Platform: Queries over Integrated Linguistic Resources
L04-1210
: Morena Danieli; Juan María Garrido; Massimo Moneglia; Andrea Panizza; Silvia Quazza; Marc Swerts
Evaluation of Consensus on the Annotation of Prosodic Breaks in the Romance Corpus of Spontaneous Speech "C-ORAL-ROM"
L04-1211
: Maja Popović; Hermann Ney
Towards the Use of Word Stems and Suffixes for Statistical Machine Translation
L04-1212
: Matthias Eck; Stephan Vogel; Alex Waibel
Language Model Adaptation for Statistical Machine Translation Based on Information Retrieval
L04-1213
: Olga Uryupina
Evaluating Name-Matching for Coreference Resolution
L04-1214
: Carlos Amaral; Dominique Laurent; André Martins; Afonso Mendes; Cláudia Pinto
Design and Implementation of a Semantic Search Engine for Portuguese
L04-1215
: Richard Campbell; Eric Ringger
Converting Treebank Annotations to Language Neutral Syntax
L04-1216
: Yalina Alphonse; Pierrette Bouillon
Methodology For Building Thematic Indexes In Medicine For French
L04-1217
: Carmen Garcia-Mateo; Javier Dieguez-Tirado; Laura Docio-Fernandez; Antonio Cardenal-Lopez
Transcrigal: A Bilingual System for Automatic Indexing of Broadcast News
L04-1218
: Arantza Díaz de Ilarraza; Aitzpea Garmendia; Maite Oronoz
Abar-Hitz: An Annotation Tool for the Basque Dependency Treebank
L04-1219
: Valia Kordoni; Julia Neu
Creating Multi-purpose Linguistic Resources for Modern Greek: a Deep Modern Greek Grammar
L04-1220
: Leo Wanner; Margarita Alonso Ramos; Antonia Martí
Enriching the Spanish EuroWordNet by Collocations
L04-1221
: Charles J. Fillmore; Collin F. Baker; Hiroaki Sato
FrameNet as a "Net"
L04-1222
: Alfonso Ortega; Federico Sukno; Eduardo Lleida; Alejandro Frangi; Antonio Miguel; Luis Buera; Ernesto Zacur
AV@CAR: A Spanish Multichannel Multimodal Corpus for In-Vehicle Automatic Audio-Visual Speech Recognition
L04-1223
: Robert S. Belvin; Win May; Shrikanth Narayanan; Panayiotis Georgiou; Shadi Ganjavi
Creation of a Doctor-Patient Dialogue Corpus Using Standardized Patients
L04-1224
: Brian MacWhinney; Steven Bird; Christopher Cieri; Craig Martell
Talkbank: Building an Open Unified Multimodal Database of Communicative Interaction
L04-1225
: Robert S. Belvin; Susanne Riehemann; Kristin Precoda
A Fine-Grained Evaluation Method for Speech-to-Speech Machine Translation Using Concept Annotations
L04-1226
: David M. de Matos; Ricardo Ribeiro; Nuno J. Mamede
Rethinking Reusable Resources
L04-1227
: Adam Meyers; Ruth Reeves; Catherine Macleod; Rachel Szekely; Veronika Zielinska; Brian Young
The Cross-Breeding of Dictionaries
L04-1228
: Adam Meyers; Ruth Reeves; Catherine Macleod; Rachel Szekely; Veronika Zielinska; Brian Young; Ralph Grishman
Annotating Noun Argument Structure for NomBank
L04-1229
: Felix Sasaki; Andreas Witt; Dafydd Gibbon; Thorsten Trippel
Concept-based Queries: Combining and Reusing Linguistic Corpus Formats and Query Languages
L04-1230
: Felix Sasaki; Andreas Witt
Co-reference in Japanese Task-oriented Dialogues: A Contribution to the Development of Language-specific and Language-general Annotation Schemes and Resources
L04-1231
: Hiroyuki Kaji; Osamu Imaichi
Constructing Word-Sense Association Networks from Bilingual Dictionary and Comparable Corpora
L04-1232
: Alexis Palmer; Jonas Kuhn; Carlota Smith
Utilization of Multiple Language Resources for Robust Grammar-Based Tense and Aspect Classification
L04-1233
: Yoshida Kyôsuke; Hashimoto Taiichi; Tokunaga Takenobu; Tanaka Hozumi
Retrieving Annotated Corpora for Corpus Annotation
L04-1234
: Tokunaga Takenobu; Koyama Tomofumi; Saito Suguru; Nakajima Masayuki
Classification of Japanese Spatial Nouns
L04-1235
: Antonio Sanfilippo; Gus Calapristi; Vernon Crow; Beth Hetzler; Alan Turner
Meaningful Clusters
L04-1236
: V. Finley Lacatusu; Steven J. Maiorano; Sanda M. Harabagiu
Multi-Document Summarization Using Multiple-Sequence Alignment
L04-1237
: Jahna Otterbacher; Dragomir Radev
RevisionBank: A Resource for Revision-based Multi-document Summarization and Evaluation
L04-1238
: Sandra Aluisio; Gisele Montilha Pinheiro; Aline M. P. Manfrin; Leandro H. M. de Oliveira; Luiz C. Genoves Jr.; Stella E. O. Tagnin
The Lácio-Web: Corpora and Tools to Advance Brazilian Portuguese Language Investigations and Computational Linguistic Tools
L04-1239
: Dragomir Radev; Jahna Otterbacher; Zhu Zhang
CST Bank: A Corpus for the Study of Cross-document Structural Relationships
L04-1240
: Jonas Kuhn; B'alam Mateo-Toledo
Applying Computational Linguistic Techniques in a Documentary Project for Q'anjob'al (Mayan, Guatemala)
L04-1241
: Minoru Sasaki; Hiroyuki Shinnou
Information Retrieval System Using Latent Contextual Relevance
L04-1242
: Daisuke Kawahara; Ryohei Sasano; Sadao Kurohashi
Toward Text Understanding: Integrating Relevance-tagged Corpus and Automatically Constructed Case Frames
L04-1243
: Sun-Mee Bae; Key-Sun Choi
Lexical Analysis of Agglutinative Languages Using a Dictionary of Lemmas and Lexical Transducers
L04-1244
: Rita Nübel
Evaluation and Adaptation of a Specialised Language Checking Tool for Non-specialised Machine Translation and Non-expert MT Users for Multi-lingual Telecooperation
L04-1245
: A. Lavelli; M. E. Califf; F. Ciravegna; D. Freitag; C. Giuliano; N. Kushmerick; L. Romano
A Critical Survey of the Methodology for IE Evaluation
L04-1246
: Jer Hayes; Tony Veale; Nuno Seco
Enriching WordNet Via Generative Metonymy and Creative Polysemy
L04-1247
: Tom Laureys; Guy De Pauw; Hugo Van hamme; Walter Daelemans; Dirk Van Compernolle
Evaluation and Adaptation of the Celex Dutch Morphological Database
L04-1248
: Li Tang; Donghong Ji; Lingpeng Yang; Yu Nie
A Model of Semantic Representations Analysis for Chinese Sentences
L04-1249
: Kyonghee Paik; Kiyonori Ohtake; Kazuhide Yamamoto
A Comparison of Two Variant Corpora: The Same Content with Different Source
L04-1250
: Christopher B. Quirk
Training a Sentence-Level Machine Translation Confidence Measure
L04-1251
: Sonja E. Bosch; Laurette Pretorius
Software Tools for Morphological Tagging of Zulu Corpora and Lexicon Development
L04-1252
: David Wible; Chin-Hwa Kuo; Nai-Lung Tsao
Improving Collocation Extraction for High Frequency Words
L04-1253
: Ai Kawazoe; Asanobu Kitamoto; Nigel Collier
Annotation of Coreference Relations Among Linguistic Expressions and Images in Biological Articles
L04-1254
: Michael Kluck
Evaluation of Cross-Language Information Retrieval Using the Domain-Specific GIRT Data as Parallel German-English Corpus
L04-1255
: Hélène Manuélian
Generating Coreferential Descriptions from a Structured Model of the Context
L04-1256
: Thatsanee Charoenporn; Virach Sornlertlamvanich; Sawit Kasuriya; Chatchawarn Hansakunbuntheung; Hitoshi Isahara
Open Collaborative Development of the Thai Language Resources for Natural Language Processing
L04-1257
: Lambros Kranias; Anna Samiotou
Automatic Translation Memory Fuzzy Match Post-Editing: A Step Beyond Traditional TM/MT Integration
L04-1258
: Ineke Schuurman; Wim Goedertier; Heleen Hoekstra; Nelleke Oostdijk; Richard Piepenbrock; Machteld Schouppe
Linguistic Annotation of the Spoken Dutch Corpus: If We Had To Do It All Over Again
L04-1259
: Attila Novák; Viktor Nagy; Csaba Oravecz
Combining Symbolic and Statistical Methods in Morphological Analysis and Unknown Word Guessing
L04-1260
: Balázs Kis; Begoña Villada; Gosse Bouma; Gábor Ugray; Tamás Bíró; Gábor Pohl; John Nerbonne
A New Approach to the Corpus-based Statistical Investigation of Hungarian Multi-word Lexemes
L04-1261
: M. Begoña Villada Moirón
Discarding Noise in an Automatically Acquired Lexicon of Support Verb Constructions
L04-1262
: Francisco Nevado; Francisco Casacuberta; Josu Landa
Translation Memories Enrichment by Statistical Bilingual Segmentation
L04-1263
: J. C. Roux; P. H. Louw; T. R. Niesler
The African Speech Technology Project: An Assessment
L04-1264
: Kris Demuynck; Tom Laureys; Patrick Wambacq; Dirk Van Compernolle
Automatic Phonemic Labeling and Segmentation of Spoken Dutch
L04-1265
: Nelleke Oostdijk; Lou Boves
Using Large Multi-purpose Corpora for Specific Research Questions: Discourse Phenomena Related to Wh-questions in the Spoken Dutch Corpus
L04-1266
: Paola Mariani; Costanza Badii
Methods of Digital Access for Legal Language Documentation
L04-1267
: Peter Wittenburg; Heidi Johnson; Markus Buchhorn; Hennie Brugman; Daan Broeder
Architecture for Distributed Language Resource Management and Archiving
L04-1268
: Hanne Fersøe; Elviira Hartikainen; Henk van den Heuvel; Giulio Maltese; Asunción Moreno; Shaunie Shammass; Ute Ziegenhain
Creation and Validation of Large Lexica for Speech-to-Speech Translation Purposes
L04-1269
: Antoni Oliver; Marko Tadić
Enlarging the Croatian Morphological Lexicon by Automatic Lexical Acquisition from Raw Corpora
L04-1270
: Panagiotis Zervas; Manolis Maragoudakis; Nikos Fakotakis; George Kokkinakis
Learning to Predict Pitch Accents Using Bayesian Belief Networks for Greek Language
L04-1271
: Joaquim Moré; Salvador Climent; Antoni Oliver
A Grammar and Style Checker Based on Internet Searches
L04-1272
: Peter Wittenburg; Greg Gulrajani; Daan Broeder; Marcus Uneson
Cross-Disciplinary Integration of Metadata Descriptions
L04-1273
: Valeria Quochi
Representing Italian Complex Nominals: A Pilot Study
L04-1274
: Hayssam Traboulsi; David Cheng; Khurshid Ahmad
Text Corpora, Local Grammars and Prediction
L04-1275
: Helmut Schmid; Arne Fitschen; Ulrich Heid
SMOR: A German Computational Morphology Covering Derivation, Composition and Inflection
L04-1276
: Emi Izumi; Kiyotaka Uchimoto; Hitoshi Isahara
The Overview of the SST Speech Corpus of Japanese Learner English and Evaluation Through the Experiment on Automatic Detection of Learners' Errors
L04-1277
: Paul Gévaudan; Dirk Wiebel
Dynamic Lexicographic Data Modelling. A Diachronic Dictionary Development Report
L04-1278
: Laura Alonso; Maria Fuentes; Marc Massot; Horacio Rodríguez
Re-using High-quality Resources for Continued Evaluation of Automated Summarization Systems
L04-1279
: Marc Rössler
Corpus-based Learning of Lexical Resources for German Named Entity Recognition
L04-1280
: Hennie Brugman; Onno Crasborn; Albert Russel
Collaborative Annotation of Sign Language Data with Peer-to-Peer Technology
L04-1281
: Glòria Vàzquez; Ana Fernández Montraveta; Irene Castellón; Laura Alonso
Semantic Categorization of Spanish Se-constructions
L04-1282
: Angelo Dalli; Valentin Tablan; Kalina Bontcheva; Yorick Wilks; Daan Broeder; Hennie Brugman; Peter Wittenburg
Web Services Architecture for Language Resources
L04-1283
: Daan Broeder; Thierry Declerck; Laurent Romary; Marcus Uneson; Sven Strömqvist; Peter Wittenburg
A Large Metadata Domain of Language Resources
L04-1284
: Tamás Gröbler; Gábor Hodász; Balázs Kis
MetaMorpho TM: A Rule-Based Translation Corpus
L04-1285
: Hennie Brugman; Albert Russel
Annotating Multi-media/Multi-modal Resources with ELAN
L04-1286
: Agnès Tutin; Meriam Haddara; Ruslan Mitkov; Constantin Orasan
Annotation of Anaphoric Expressions in an Aligned Bilingual Corpus
L04-1287
: Tylman Ule; Kiril Simov
Unexpected Productions May Well be Errors
L04-1288
: Avik Sarkar; Anne De Roeck
A Framework for Evaluating the Suitability of Non-English Corpora for Language Engineering
L04-1289
: Anna Samiotou; Lambros Kranias; Dimitrios Kokkinakis
Intelligent Building of Language Resources for HLT Applications
L04-1290
: Tomoyosi Akiba; Atsushi Fujii; Katunobu Itou
Collecting Spontaneously Spoken Queries for Information Retrieval
L04-1291
: Hristo Tanev; Milen Kouylekov; Matteo Negri; Bonaventura Coppola; Bernardo Magnini
Multilingual Pattern Libraries for Question Answering: a Case Study for Definition Questions
L04-1292
: Michael Daum; Kilian A. Foth; Wolfgang Menzel
Automatic Transformation of Phrase Treebanks to Dependency Trees
L04-1293
: Maria Luigia Ceccotti; Manuela Sassi
Computational Lexicography and Carlo Emilio Gadda, Principe dell'Analisi e Duca della Buona Cognizione
L04-1294
: Yoko Mizuta; Nigel Collier
An Annotation Scheme for a Rhetorical Analysis of Biology Articles
L04-1295
: Antoinette Renouf; Andrew Kehoe
Textual Distraction as a Basis for Evaluating Automatic Summarisers
L04-1296
: Milena Slavcheva
Verb Valency Descriptors for a Syntactic Treebank
L04-1297
: Walter Kasper; Jörg Steffen; Jakub Piskorski; Paul Buitelaar
Integrated Language Technologies for Multilingual Information Services in the MEMPHIS Project
L04-1298
: S.R. Deepa; Kalika Bali; A.G. Ramakrishnan; Partha Pratim Talukdar
Automatic Generation of Compound Word Lexicon for Hindi Speech Synthesis
L04-1299
: Saif Ahmad; Paulo C F de Oliveira; Khurshid Ahmad
Summarization of Multimodal Information
L04-1300
: Toomas Altosaar; Matti Karjalainen
Design of an Interactive Web-based User Interface for Speech Database Query Formation
L04-1301
: Syd Bauman; Alejandro Bia; Lou Burnard; Tomaž Erjavec; Christine Ruotolo; Susan Schreibman
Migrating Language Resources from SGML to XML: The Text Encoding Initiative Recommendations
L04-1302
: Niels Ole Bernsen; Laila Dybkjær; Svend Kiilerich
Evaluating Conversation with Hans Christian Andersen
L04-1303
: Catia Cucchiarini; Elisabeth D'Halleweyn
The New Dutch-Flemish HLT Programme: a Concerted Effort to Stimulate the HLT Sector
L04-1304
: Eiko Yamamoto; Kyoji Umemura
Related Word-pairs Extraction Without Dictionaries
L04-1305
: Rachel Aires; Aline Manfrin; Sandra Aluísio; Diana Santos
What is my Style? Using Stylistic Features of Portuguese Web Texts to Classify Web Pages According to Users' Needs
L04-1306
: Marco Baroni; Silvia Bernardini
BootCaT: Bootstrapping Corpora and Terms from the Web
L04-1307
: Jörg Steffen
N-Gram Language Modeling for Robust Multi-Lingual Document Classification
L04-1308
: Ana-Maria Barbu
A Word Alignment System Based on a Translation Equivalence Extractor
L04-1309
: Daan Broeder; Peter Wittenburg; Onno Crasborn
Using Profiles for IMDI Metadata Creation
L04-1310
: Karlheinz Mörth
Rethinking Readability of Digital Editions: The Case of the AAC's "Digital Brenner"
L04-1311
: Daniel Ferrés; Marc Massot; Muntsa Padró; Horacio Rodríguez; Jordi Turmo
Automatic Building Gazetteers of Co-referring Named Entities
L04-1312
: Nilda Ruimy; Pierrette Bouillon; Bruno Cartoni
Semi-Automatic Derivation of a French Lexicon from CLIPS
L04-1313
: Nancy Ide; Keith Suderman
The American National Corpus First Release
L04-1314
: Stefan Evert; Ulrich Heid; Kristina Spranger
Identifying Morphosyntactic Preferences in Collocations
L04-1315
: Laila Dybkjær; Niels Ole Bernsen
Towards General-Purpose Annotation Tools: How Far Are We Today?
L04-1316
: Uwe D. Reichel; Karl Weilhammer
Automated Morphological Segmentation and Evaluation
L04-1317
: Nancy Ide; Laurent Romary
A Registry of Standard Data Categories for Linguistic Annotation
L04-1318
: Andrew Hippisley; Chara Karavasili
A Natural Language Approach to Information Management: Tracking Scientific Advances Through the Structure of Words
L04-1319
: Rita Marinelli; Adriana Roventini; Alessandro Enea
Building a Maritime Domain Lexicon: a Few Considerations on the Database Structure and the Semantic Coding
L04-1320
: Péter Halácsy; András Kornai; László Németh; András Rung; István Szakadát; Viktor Trón
Creating Open Language Resources for Hungarian
L04-1321
: Atsushi Fujii; Makoto Iwayama; Noriko Kando
Test Collections for Patent-to-Patent Retrieval and Patent Map Generation in NTCIR-4 Workshop
L04-1322
: Yuka Tateisi; Jun-ichi Tsujii
Part-of-Speech Annotation of Biology Research Abstracts
L04-1323
: Božo Bekavac; Petya Osenova; Kiril Simov; Marko Tadić
Making Monolingual Corpora Comparable: a Case Study of Bulgarian and Croatian
L04-1324
: Lina Henriksen; Bart Jongejan; Bente Maegaard
Corporate Voice, Tone of Voice and Controlled Language Techniques
L04-1325
: Nikos Fakotakis
Cypriot Speech Database: Data Collection and Greek to Cypriot Dialect Adaptation
L04-1326
: Borja Navarro; Manuel Palomar; Patricio Martínez-Barco
Automatic Extraction of Syntactic Semantic Patterns for Multilingual Resources
L04-1327
: Dominique Dutoit; Pierre Nugues; Patrick de Torcy
The Integral Dictionary: An Ontological Resource for the Semantic Web: Integration of EuroWordNet, Balkanet, TID, and SUMO
L04-1328
: Viktor Pekar; Richard Evans; Ruslan Mitkov
Categorizing Web Pages as a Preprocessing Step for Information Extraction
L04-1329
: Christian Weiss
A Framework for Data-driven Video-realistic Audio-visual Speech-synthesis
L04-1330
: Manuela Kunze; Dietmar Rösner
Corpus Based Enrichment of GermaNet Verb Frames
L04-1331
: Thierry Poibeau; Bénédicte Goujon
Semi-automatic Acquisition of Command Grammar
L04-1332
: Thierry Declerck; Paul Buitelaar; Nicoletta Calzolari; Alessandro Lenci
Towards a Language Infrastructure for the Semantic Web
L04-1333
: Alvin Martin; David Miller; Mark Przybocki; Joseph Campbell; Hirotaka Nakasone
Conversational Telephone Speech Corpus Collection for the NIST Speaker Recognition Evaluation 2004
L04-1334
: Stephan Vogel; Christian Monson
Augmenting Manual Dictionaries for Statistical Machine Translation Systems
L04-1335
: Christian Biemann; Uwe Quasthoff; Christian Wolff
Linguistic Corpus Search
L04-1336
: Nicoletta Calzolari; Khalid Choukri; Maria Gavrilidou; Bente Maegaard; Paola Baroni; Hanne Fersøe; Alessandro Lenci; Valérie Mapelli; Monica Monachini; Stelios Piperidis
ENABLER Thematic Network of National Projects: Technical, Strategic and Political Issues of LRs
L04-1337
: Evie Coussé; Steven Gillis; Hanne Kloots; Marc Swerts
The Influence of the Labeller's Regional Background on Phonetic Transcriptions: Implications for the Evaluation of Spoken Language Resources
L04-1338
: Paul Buitelaar; Diana Steffen; Martin Volk; Dominic Widdows; Bogdan Sacaleanu; Špela Vintar; Stanley Peters; Hans Uszkoreit
Evaluation Resources for Concept-based Cross-Lingual Information Retrieval in the Medical Domain
L04-1339
: Chris Biemann; Stefan Bordag; Uwe Quasthoff
Automatic Acquisition of Paradigmatic Relations Using Iterated Co-occurrences
L04-1340
: Paul Buitelaar; Daniel Olejnik; Mihaela Hutanu; Alexander Schutz; Thierry Declerck; Michael Sintek
Towards Ontology Engineering Based on Linguistic Analysis
L04-1341
: Dorota Iskra; Rainer Siemund; Jamal Borno; Asuncion Moreno; Ossama Emam; Khalid Choukri; Oren Gedge; Herbert Tropf; Albino Nogueiras; Imed Zitouni; Anastasios Tsopanoglou; Nikos Fakotakis
OrienTel - Telephony Databases Across Northern Africa and the Middle East
L04-1342
: Hanne Fersøe; Monica Monachini
ELRA Validation Methodology and Standard Promotion for Linguistic Resources
L04-1343
: Hanno Biber; Evelyn Breiteneder
The AAC [Austrian Academy Corpus] - An Enterprise to Develop Large Electronic Text Corpora
L04-1344
: Diana Binnenpoorte; Catia Cucchiarini; Helmer Strik; Lou Boves
Improving Automatic Phonetic Transcription of Spontaneous Speech Through Variant-Based Pronunciation Variation Modelling
L04-1345
: Massimo Poesio; Mijail A. Kabadjov
A General-Purpose, Off-the-shelf Anaphora Resolution Module: Implementation and Preliminary Evaluation
L04-1346
: Donghong Ji; Li Tang; Lingpeng Yang
Building a Conceptual Graph Bank for Chinese Language
L04-1347
: Anne Abeillé; Nicolas Barrier
Enriching a French Treebank
L04-1348
: Béatrice Daille; Samuel Dufour-Kowalski; Emmanuel Morin
French-English Multi-word Term Alignment Based on Lexical Context Analysis
L04-1349
: Vincenzo Pallotta; Hatem Ghorbel; Patrick Ruch; Giovanni Coray
An Argumentative Annotation Schema for Meeting Discussions
L04-1350
: Jochen Trommer; Dalina Kallulli
A Morphological Analyzer for Standard Albanian
L04-1351
: Abdelhadi Soudi; Andreas Eisele
Generating an Arabic Full-form Lexicon for Bidirectional Morphology Lookup
L04-1352
: Petr Pollák; Jan Černocký
Orthographic and Phonetic Annotation of Very Large Czech Corpora with Quality Assessment
L04-1353
: Catarina Ribeiro; Ricardo Santos; João Correia; Rui Pedro Chaves; Palmira Marrafa
INQUER: A WordNet-based Question-Answering Application
L04-1354
: António Branco; João Silva
Evaluating Solutions for the Rapid Development of State-of-the-Art POS Taggers for Portuguese
L04-1355
: Stefan Klatt
A High Quality Partial Parser for Annotating German Text Corpora
L04-1356
: Manolis Maragoudakis; Nikos Fakotakis
Bayesian Semantics Incorporation to Web Content for Natural Language Information Retrieval
L04-1357
: Lars Bo Larsen
Usability Evaluation of Spoken Dialogue Systems
L04-1358
: Iulia Nica; Mª Antònia Martí; Andrés Montoyo; Sonia Vázquez
Enriching EWN with Syntagmatic Information by Means of WSD
L04-1359
: Rita Marinelli
Proper Names and Polysemy: From a Lexicographic Experience
L04-1360
: Ulrich Heid; Bettina Säuberlich; Esther Debus-Gregor; Werner Scholze-Stubenrecht
Tools for Upgrading Printed Dictionaries by Means of Corpus-based Lexical Acquisition
L04-1361
: Jakub Piskorski
Extraction of Polish Named-Entities
L04-1362
: Juan Fernández; Mauro Castillo; German Rigau; Jordi Atserias; Jordi Turmo
Automatic Acquisition of Sense Examples Using ExRetriever
L04-1363
: Cvetana Krstev; Duško Vitas; Ranka Stanković; Ivan Obradović; Gordana Pavlović-Lažetić
Combining Heterogeneous Lexical Resources
L04-1364
: Viet-Bac Le; Do-Dat Tran; Eric Castelli; Laurent Besacier; Jean-François Serignat
Spoken and Written Language Resources for Vietnamese
L04-1365
: Andrei Popescu-Belis; Maria Georgescul; Alexander Clark; Susan Armstrong
Building and Using a Corpus of Shallow Dialogue Annotated Meetings
L04-1366
: Lorenzo Piccioni; Eros Zanchetta
XTERM: A Flexible Standard-Compliant XML-Based Termbase Management System
L04-1367
: Márton Miháltz
Word Sense Disambiguation Using Random Indexing
L04-1368
: Ulrich Heid; Holger Voormann; Jan-Torsten Milde; Ulrike Gut; Katrin Erk; Sebastian Padó
Querying Both Time-aligned and Hierarchical Corpora with NXT Search
L04-1369
: A. Chalamandaris; P. Tsiakoulis; S. Raptis; G. Giannopoulos; G. Carayannis
Bypassing Greeklish!
L04-1370
: Catarina Ribeiro; Ricardo Santos; Rui Pedro Chaves; Palmira Marrafa
Semi-Automatic UNL Dictionary Generation Using WordNet.PT
L04-1371
: Alexander Geyken
Bootstrapping a Database of German Multi-word Expressions
L04-1372
: Le An Ha
A Practical Comparison of Different Filters Used in Automatic Term Extraction
L04-1373
: Jesús Giménez; Lluís Màrquez
SVMTool: A General POS Tagger Generator Based on Support Vector Machines
L04-1374
: Stefanie Herrmann; Hartmut Keck; Stephan Kepser
A Multi-Modal Documentation System for Warao
L04-1375
: Ulrich Callmeier; Andreas Eisele; Ulrich Schäfer; Melanie Siegel
The DeepThought Core Architecture Framework
L04-1376
: Jordi Atserias; Salvador Climent; German Rigau
Towards the Meaning Top Ontology: Sources of Ontological Meaning
L04-1377
: Zygmunt Vetulani
An Environment for Dialogue Corpora Collection (ENDIACC)
L04-1378
: G. Bordel; A. Ezeiza; K. Lopez de Ipina; M. Méndez; M. Peñagarikano; T. Rico; C. Tovar; E. Zulueta
Development of Resources for a Bilingual Automatic Index System of Broadcast News in Basque and Spanish
L04-1379
: António Teixeira; Liliana Ferreira; Lurdes Moutinho; Rosa Lídia Coimbra; Raquel Lisboa
An Acoustic Corpus Contemplating Regional Variation for Studies of European Portuguese Nasals
L04-1380
: Laurent Romary; Amalia Todirascu; David Langlois
Experiments on Building Language Resources for Multi-Modal Dialogue Systems
L04-1381
: David Day; Chad McHenry; Robyn Kozierok; Laurel Riek
Callisto: A Configurable Annotation Workbench
L04-1382
: Ray Clifford; Neil Granoien; Douglas Jones; Wade Shen; Clifford Weinstein
The Effect of Text Difficulty on Machine Translation Performance -- A Pilot Study with ILR-Rated Texts in Spanish, Farsi, Arabic, Russian and Korean
L04-1383
: Joachim Wermter; Udo Hahn
An Annotated German-Language Medical Text Corpus as Language Resource
L04-1384
: Diana Pérez; Enrique Alfonseca; Pilar Rodríguez
Application of the BLEU Method for Evaluating Free-text Answers in an E-learning Environment
L04-1385
: Kyoko Kanzaki; Qing Ma; Eiko Yamamoto; Masaki Murata; Hitoshi Isahara
Extraction of Hyperonymy of Adjectives from Large Corpora by Using the Neural Network Model
L04-1386
: Eleni Miltsakaki; Rashmi Prasad; Aravind Joshi; Bonnie Webber
The Penn Discourse Treebank
L04-1387
: Violeta Seretan; Luka Nerima; Eric Wehrli
Using the Web as a Corpus for the Syntactic-Based Collocation Identification
L04-1388
: Michael Schiehlen; Kristina Spranger
Automatic Methods to Supplement Broad-Coverage Subcategorization Lexicons
L04-1389
: Henk Harkema; Robert Gaizauskas; Mark Hepple; Neil Davis; Yikun Guo; Angus Roberts; Ian Roberts
A Large-Scale Resource for Storing and Recognizing Technical Terminology
L04-1390
: Holmer Hemsen
Evaluation of a Multimodal Dialogue System for Small-screen Devices
L04-1391
: Christian Biemann; Stefan Bordag; Uwe Quasthoff; Christian Wolff
Web Services for Language Resources and Language Technology Applications
L04-1392
: Elisabeth Pinto; Delphine Charlet; Hélène François; Djamel Mostefa; Olivier Boëffard; Dominique Fohr; Odile Mella; Frédéric Bimbot; Khalid Choukri; Yann Philip; Francis Charpentier
Development of New Telephone Speech Databases for French: the NEOLOGOS Project
L04-1393
: Karel Pala; Pavel Smrz
Top Ontology as a Tool for Semantic Role Tagging
L04-1394
: Argyrios Vasilakopoulos; Michele Bersani; William J. Black
A Suite of Tools for Marking Up Textual Data for Temporal Text Mining Scenarios
L04-1395
: Anne De Roeck; Avik Sarkar; Paul Garthwaite
Frequent Term Distribution Measures for Dataset Profiling
L04-1396
: Josef Psutka; Pavel Ircing; Jan Hajič; Vlasta Radová; Josef V. Psutka; William J. Byrne; Samuel Gustman
Issues in Annotation of the Czech Spontaneous Speech Corpus in the MALACH project
L04-1397
: Asunción Gómez-Pérez; M. Carmen Suárez-Figueroa
Ontology Evaluation Functionalities of RDF(S), DAML+OIL, and OWL Parsers and Ontology Platforms
L04-1398
: Anna Sinopalnikova; Pavel Smrz
Word Association Norms as a Unique Supplement of Traditional Language Resources
L04-1399
: Nadine Aldinger
Towards a Dynamic Lexicon: Predicting the Syntactic Argument Structure of Complex Verbs
L04-1400
: Robert Král
Semantic Annotating of Czech Corpus via WSD
L04-1401
: Jean Carletta; Shipra Dingare; Malvina Nissim; Tatiana Nikitina
Using the NITE XML Toolkit on the Switchboard Corpus to Study Syntactic Choice: a Case Study
L04-1402
: Malvina Nissim; Shipra Dingare; Jean Carletta; Mark Steedman
An Annotation Scheme for Information Status in Dialogue
L04-1403
: Alex Trutnev; Antoine Rozenknop; Martin Rajman
Speech Recognition Simulation and its Application for Wizard-of-Oz Experiments
L04-1404
: Murat Deviren; Khalid Daoudi; Kamel Smaïli
Language Modeling Using Dynamic Bayesian Networks
L04-1405
: Udo Hahn; Joachim Wermter
Pumping Documents Through a Domain and Genre Classification Pipeline
L04-1406
: Kiril Simov; Petya Osenova
A Hybrid Strategy For Regular Grammar Parsing
L04-1407
: Jordi Atserias; Bernardo Magnini; Octavian Popescu; Eneko Agirre; Aitziber Atutxa; German Rigau; John Carroll; Rob Koeling
Cross-Language Acquisition of Semantic Models for Verbal Predicates
L04-1408
: Andrea Sansò
MED-TYP: A Typological Database for Mediterranean Languages
L04-1409
: Kallirroi Georgila; Nikos Fakotakis; George Kokkinakis
A Graphical Tool for Handling Rule Grammars in Java Speech Grammar Format
L04-1410
: Svetlana Sheremetyeva
A Flexible Language Acquisition Tool Kit for Natural Language Processing
L04-1411
: David Martínez; Eneko Agirre
The Effect of Bias on an Automatically-built Word Sense Corpus
L04-1412
: Victoria Arranz; Núria Castell; Josep Maria Crego; Jesús Giménez; Adrià de Gispert; Patrik Lambert
Bilingual Connections for Trilingual Corpora: An XML Approach
L04-1413
: Thorsten Trippel; Dafydd Gibbon; Alexandra Thies; Jan-Torsten Milde; Karin Looks; Benjamin Hell; Ulrike Gut
CoGesT: a Formal Transcription System for Conversational Gesture
L04-1414
: Anders Nøklestad
Memory-based Classification of Proper Names in Norwegian
L04-1415
: Alex Trutnev; Martin Rajman
Comparative Evaluations in the Domain of Automatic Speech Recognition
L04-1416
: Thorsten Trippel; Felix Sasaki; Dafydd Gibbon
Consistent Storage of Metadata in Inference Lexica: the MetaLex Approach
L04-1417
: Nuno Cavalheiro Marques; Sérgio Gonçalves
Applying a Part-of-Speech Tagger to Postal Address Detection on the Web
L04-1418
: Monica Monachini; Federico Calzolari; Michele Mammini; Sergio Rossi; Marisa Ulivieri
Unifying Lexicons in view of a Phonological and Morphological Lexical DB
L04-1419
: A. Braffort; A. Choisier; C. Collet; P. Dalle; F. Gianni; F. Lenseigne; J. Segouat
Toward an Annotation Software for Video of Sign Language, Including Image Processing Tools and Signing Space Modelling
L04-1420
: Fabio Tamburini
Building Distributed Language Resources By Grid Computing
L04-1421
: Bernd Bohnet; Halyna Seniv
Mapping Dependency Structures to Phrase Structures and the Automatic Acquisition of Mapping Rules
L04-1422
: Georgiana Puşcaşu
A Framework for Temporal Resolution
L04-1423
: Stephan Busemann
EGRAM - A Grammar Development Environment and its Usage for Language Generation
L04-1424
: Louise Guthrie; Roberto Basili; Fabio Zanzotto; Kalina Bontcheva; Hamish Cunningham; David Guthrie; Jia Cui; Marco Cammisa; Jerry Cheng-Chieh Liu; Cassia Farria Martin; Kristiyan Haralambiev; Martin Holub; Klaus Macherey; Frederick Jelinek
Large Scale Experiments for Semantic Labeling of Noun Phrases in Raw Text
L04-1425
: Eneko Agirre; Aitziber Atutxa; Koldo Gojenola; Kepa Sarasola
Exploring Portability of Syntactic Information from English to Basque
L04-1426
: Jordi Atserias; Luís Villarejo; German Rigau
Spanish WordNet 1.6: Porting the Spanish Wordnet Across Princeton Versions
L04-1427
: Magdalena Wolska; Bao Quoc Vo; Dimitra Tsovaltzi; Ivana Kruijff-Korbayová; Elena Karagjosova; Helmut Horacek; Armin Fiedler; Christoph Benzmüller
An Annotated Corpus of Tutorial Dialogs on Mathematical Theorem Proving
L04-1428
: Lonneke van der Plas; Vincenzo Pallotta; Martin Rajman; Hatem Ghorbel
Automatic Keyword Extraction from Spoken Text. A Comparison of Two Lexical Resources: EDR and WordNet
L04-1429
: Anna Kupść; Teruko Mitamura; Benjamin Van Durme; Eric Nyberg
Pronominal Anaphora Resolution for Unrestricted Text
L04-1430
: G. Gravier; J-F. Bonastre; E. Geoffrois; S. Galliano; K. Mc Tait; K. Choukri
The ESTER Evaluation Campaign for the Rich Transcription of French Broadcast News
L04-1431
: Manfred Klenner; Fabio Rinaldi; Michael Hess
Steps Towards Semantically Annotated Language Resources
L04-1432
: Nina Wacholder; Sharon Small; Bing Bai; Diane Kelly; Robert Rittman; Sean Ryan; Robert Salkin; Peng Song; Ying Sun; Liu Ting; Paul Kantor; Tomek Strzalkowski
Designing a Realistic Evaluation of an End-to-end Interactive Question Answering System
L04-1433
: Karin Müller
Semi-Automatic Construction of a Question Treebank
L04-1434
: Bogdan Babych; Debbie Elliott; Anthony Hartley
Calibrating Resource-light Automatic MT Evaluation: a Cheap Approach to Ranking MT Systems by the Usability of Their Output
L04-1435
: Stelios Piperidis; Iason Demiros; Prokopis Prokopidis; Peter Vanroose; Anja Hoethker; Walter Daelemans; Elsa Sklavounou; Manos Konstantinou; Yannis Karavidas
Multimodal, Multilingual Resources in the Subtitling Process
L04-1436
: Kazuki Adachi; Tomoki Toda; Hiromichi Kawanami; Hiroshi Saruwatari; Kiyohiro Shikano
Perceptual Evaluation of Quality Deterioration Owing to Prosody Modification
L04-1437
: Serge A. Yablonsky
Integration of Russian Language Resources
L04-1438
: Roberto Basili; Nicola Lorusso; Maria Teresa Pazienza; Fabio Massimo Zanzotto
A2Q: An Agent-based Architecture for Multilingual Q&A
L04-1439
: Guadalupe Aguado de Cea; Inmaculada Álvarez-de-Mon; Antonio Pareja-Lora
OntoTag's Linguistic Ontologies: Enhancing Higher Level and Semantic Web Annotations
L04-1440
: Kaarel Kaljurand; Fabio Rinaldi; James Dowdall; Michael Hess
Exploiting Language Resources for Semantic Web Annotations
L04-1441
: Kiyong Lee; Lou Burnard; Laurent Romary; Eric de la Clergerie; Thierry Declerck; Syd Bauman; Harry Bunt; Lionel Clément; Tomaž Erjavec; Azim Roussanaly; Claude Roux
Towards an International Standard on Feature Structure Representation
L04-1442
: Ariadna Font Llitjós; Jaime Carbonell
The Translation Correction Tool: English-Spanish User Studies
L04-1443
: Brian Mitchell; Robert Gaizauskas
A Labelled Corpus for Prepositional Phrase Attachment
L04-1444
: Gabriel Infante-Lopez; Maarten de Rijke
Comparing the Ambiguity Reduction Abilities of Probabilistic Context-Free Grammars
L04-1445
: Paul Morarescu; Sanda Harabagiu
NameNet: a Self-Improving Resource for Name Classification
L04-1446
: Katerina Pastra; Yorick Wilks
Image-Language Multimodal Corpora: Needs, Lacunae and an AI Synergy for Annotation
L04-1447
: Na-Rae Han; Martin Chodorow; Claudia Leacock
Detecting Errors in English Article Usage with a Maximum Entropy Classifier Trained on a Large, Diverse Corpus
L04-1448
: Radek Sedláček
The Core of the Czech Derivational Dictionary
L04-1449
: Walter Daelemans; Anja Höthker; Erik Tjong Kim Sang
Automatic Sentence Simplification for Subtitling in Dutch and English
L04-1450
: Canasai Kruengkrai; Thatsanee Charoenporn; Virach Sornlertlamvanich; Hitoshi Isahara
Enriching a Thai Lexical Database with Selectional Preferences
L04-1451
: Jonathan G. Fiscus
Results of the 2003 Topic Detection and Tracking Evaluation
L04-1452
: Jennifer Foster
Parsing Ungrammatical Input: an Evaluation Procedure
L04-1453
: Melania Degeratu; Vasileios Hatzivassiloglou
An Automatic Method for Constructing Domain-Specific Ontology Resources
L04-1454
: Ann Copestake; Fabre Lambeau; Benjamin Waldron; Francis Bond; Dan Flickinger; Stephan Oepen
A Lexicon Module for a Grammar Development Environment
L04-1455
: Bogdan Babych; Anthony Hartley
Modelling Legitimate Translation Variation for Automatic Evaluation of MT Quality
L04-1456
: Roberto Bartolini; Alessandro Lenci; Simonetta Montemagni; Vito Pirrelli; Claudia Soria
Semantic Mark-up of Italian Legal Texts Through NLP-based Techniques
L04-1457
: Lionel Clément; Benoît Sagot; Bernard Lang
Morphology Based Automatic Acquisition of Large-coverage Lexica
L04-1458
: Kiril Ribarov
Towards Intelligent Written Cultural Heritage Processing - Lexical Processing
L04-1459
: Violetta Cavalli-Sforza; Jaime G. Carbonell; Peter J. Jansen
Developing Language Resources for a Transnational Digital Government System
L04-1460
: Mary D. Swift; Myroslava O. Dzikovska; Joel R. Tetreault; James F. Allen
Semi-automatic Syntactic and Semantic Corpus Annotation with a Deep Parser
L04-1461
: Georges Fafiotte; Christian Boitet; Mark Seligman; Zong Chengqing
Collecting and Sharing Bilingual Spontaneous Speech Corpora: the ChinFaDial Experiment
L04-1462
: Judita Preiss; Caroline Gasperin; Ted Briscoe
Can Anaphoric Definite Descriptions be Replaced by Pronouns?
L04-1463
: Roberto Bartolini; Alessandro Lenci; Simonetta Montemagni; Vito Pirrelli
Hybrid Constraints for Robust Parsing: First Experiments and Evaluation
L04-1464
: Véronique Aubergé; Nicolas Audibert; Albert Rilliard
E-Wiz: a Trapper Protocol for Hunting the Expressive Speech Corpora in Lab
L04-1465
: Simone Teufel; Hans van Halteren
Agreement in Human Factoid Annotation for Summarization Evaluation
L04-1466
: Albert Rilliard; Véronique Aubergé; Nicolas Audibert
Evaluating an Authentic Audio-Visual Expressive Speech Corpus
L04-1467
: Nadia Mana; Roldano Cattoni; Emanuele Pianta; Franca Rossi; Fabio Pianesi; Susanne Burger
The Italian NESPOLE! Corpus: a Multilingual Database with Interlingua Annotation in Tourism and Medical Domains
L04-1468
: Eugenio Picchi; Maria Luigia Ceccotti; Sebastiana Cucurullo; Manuela Sassi; Eva Sassolini
Linguistic Miner: An Italian Linguistic Knowledge System
L04-1469
: Antonietta Alonge; Birte Lönneker
Metaphors in Wordnets: From Theory to Practice
L04-1470
: Harry Bunt; Laurent Romary
Standardization in Multimodal Content Representation: Some Methodological Issues
L04-1471
: Roberto Basili; Marco Cammisa; Fabio Massimo Zanzotto
A Similarity Measure for Unsupervised Semantic Disambiguation
L04-1472
: Laila Dybkjær; Niels Ole Bernsen; Wolfgang Minker
Usability Evaluation of Multimodal and Domain-Oriented Spoken Language Dialogue Systems
L04-1473
: Jaap Kamps; Maarten Marx; Robert J. Mokken; Maarten de Rijke
Using WordNet to Measure Semantic Orientations of Adjectives
L04-1474
: Per Weijnitz; Eva Forsbom; Ebba Gustavii; Eva Pettersson; Jörg Tiedemann
MT Goes Farming: Comparing Two Machine Translation Approaches on a New Domain
L04-1475
: Esmeralda Uraga; César Gamboa
VOXMEX Speech Database: Design of a Phonetically Balanced Corpus
L04-1476
: Christopher Brewster; Harith Alani; Srinandan Dasmahapatra; Yorick Wilks
Data Driven Ontology Evaluation
L04-1477
: Oliver Schonefeld; Jan-Torsten Milde
Embedding IMDI Metadata into a Large Phonetic Corpus
L04-1478
: Francesca Bertagna
Using Semantic Language Resources to Support Textual Inference for Question Answering
L04-1479
: Vasco Calais Pedro; Jeongwoo Ko; Eric Nyberg; Teruko Mitamura
An Information Repository Model for Advanced Question Answering Systems
L04-1480
: Francesca Bertagna; Alessandro Lenci; Monica Monachini; Nicoletta Calzolari
Content Interoperability of Lexical Resources: Open Issues and "MILE" Perspectives
L04-1481
: Martin Čmejrek; Jan Cuřín; Jiří Havelka; Jan Hajič; Vladislav Kuboň
Prague Czech-English Dependency Treebank. Syntactically Annotated Resources for Machine Translation
L04-1482
: Christian Monson; Lori Levin; Rodolfo Vega; Ralf Brown; Ariadna Font Llitjos; Alon Lavie; Jaime Carbonell; Eliseo Cañulef; Rosendo Huisca
Data Collection and Analysis of Mapudungun Morphology for Spelling Correction
L04-1483
: Arlindo O. Veiga; Fernando S. Perdigão
An Efficient Word Confidence Measure Using Likelihood Ratio Scores
L04-1484
: Kenji Sagae; Brian MacWhinney; Alon Lavie
Adding Syntactic Annotations to Transcripts of Parent-Child Dialogs
L04-1485
: Huarui Zhang; Churen Huang; Shiwen Yu
Distributional Consistency: As a General Method for Defining a Core Lexicon
L04-1486
: Rebecca J. Passonneau
Computing Reliability for Coreference Annotation
L04-1487
: Eneko Agirre; Oier Lopez de Lacalle
Publicly Available Topic Signatures for all WordNet Nominal Senses
L04-1488
: Timothy Baldwin; Emily M. Bender; Dan Flickinger; Ara Kim; Stephan Oepen
Road-testing the English Resource Grammar Over the British National Corpus
L04-1489
: Ying Zhang; Stephan Vogel; Alex Waibel
Interpreting BLEU/NIST Scores: How Much Improvement do We Need to Have a Better System?
L04-1490
: Peter Anick
Exploiting Anchor Text as a Lexical Resource
L04-1491
: Dragomir Radev; Timothy Allison; Sasha Blair-Goldensohn; John Blitzer; Arda Çelebi; Stanko Dimitrov; Elliott Drabek; Ali Hakim; Wai Lam; Danyu Liu; Jahna Otterbacher; Hong Qi; Horacio Saggion; Simone Teufel; Michael Topper; Adam Winkel; Zhu Zhang
MEAD - A Platform for Multidocument Multilingual Text Summarization
L04-1492
: Saurabh Garg; Bilyana Martinovski; Susan Robinson; Jens Stephan; Joel Tetreault; David R. Traum
Evaluation of Transcription and Annotation Tools for a Multi-modal, Multi-party Dialogue Corpus
L04-1493
: Michael Emonts
Current Projects in Languages of Military Interest at the Defense Language Institute
L04-1494
: Aline Villavicencio; Timothy Baldwin; Benjamin Waldron
A Multilingual Database of Idioms
L04-1495
: Kazuaki Maeda; Stephanie Strassel
Annotation Tools for Large-Scale Corpus Development: Using AGTK at the Linguistic Data Consortium
L04-1496
: Stephanie Strassel
Linguistic Resources for Effective, Affordable, Reusable Speech-to-Text
L04-1497
: Marc Vilain
Building part-of-speech Corpora Through Histogram Hopping
L04-1498
: Gregory Ernest Monaco; Abdelhadi Soudi
An Emerging Transcontinental Collaborative Research and Education Agenda in Human Language Technologies
L04-1499
: Susan Robinson; Bilyana Martinovski; Saurabh Garg; Jens Stephan; David Traum
Issues in Corpus Development for Multi-party Multi-modal Task-oriented Dialogue
L04-1500
: Christopher Cieri; David Miller; Kevin Walker
The Fisher Corpus: a Resource for the Next Generations of Speech-to-Text
L04-1501
: David R. Traum; Susan Robinson; Jens Stephan
Evaluation of Multi-party Virtual Reality Dialogue Interaction
L04-1502
: Christopher Cieri; Joseph P. Campbell; Hirotaka Nakasone; David Miller; Kevin Walker
The Mixer Corpus of Multilingual, Multichannel Speaker Recognition Data
L04-1503
: Alessandro Mazzei; Vincenzo Lombardo
Building a Large Grammar for Italian
L04-1504
: Kitazawa Shigeyoshi; Kiriyama Shinya; Itoh Toshihiko; Nick Campbell
Japanese MULTEXT: a Prosodic Corpus
L04-1505
: Giuseppe Cappeli; Paulo Alberto
The OLISSIPO and LECTIO Projects
L04-1506
: Long Qiu; Min-Yen Kan; Tat-Seng Chua
A Public Reference Implementation of the RAP Anaphora Resolution Algorithm
L04-1507
: Mark Hepple; Neil Ireson; Paolo Allegrini; Simone Marchi; Simonetta Montemagni; Jose Maria Gomez Hidalgo
NLP-enhanced Content Filtering Within the POESIA Project
L04-1508
: Philippe Martin
WinPitch Corpus, a Text to Speech Alignment Tool for Multimodal Corpora
L04-1509
: Stefan Evert
The Statistical Analysis of Morphosyntactic Distributions
L04-1510
: Luciana Bordoni; Leonardo Pasqualini; Filippo Sciarrone
CHeM: A System for the Automatic Analysis of e-mails in the Restoration and Conservation Domain
L04-1511
: Robert Irie; Beth Sundheim
Resources for Place Name Analysis
L04-1512
: Bente Maegaard
NEMLAR - An Arabic Language Resources Project
L04-1513
: Key-Sun Choi; Hee-Sook Bae; Wonseok Kang; Juho Lee; Eunhe Kim; Hekyeong Kim; Donghee Kim; Youngbin Song; Hyosik Shin
Korean-Chinese-Japanese Multilingual Wordnet with Shared Semantic Hierarchy
L04-1514
: Christophe Jouis; Jean-Marie Ferru
Intranet Try To Find Project (ITTF): An Approach for the Search of Relevant Information Inside an Organization
L04-1515
: Christopher Cieri; Mark Liberman
A Progress Report from the Linguistic Data Consortium: Recent Activities in Resource Creation and Distribution and the Development of Tools and Standards
L04-1516
: Khalid Choukri
Recent Activities within the European Language Resources Association: Issues on Sharing Language Resources and Evaluation
L04-1517
: Widad Mustafa El Hadi; Ismail Timimi; Marianne Dabbadie
EVALDA-CESART Project: Terminological Resources Acquisition Tools Evaluation Campaign
L04-1518
: Gabriella Pardelli; Manuela Sassi; Sara Goggi
From Weaver to the ALPAC Report
L04-1519
: Rute Costa; Raquel Silva
The Verb in the Terminological Collocations. Contribution to the Development of a Morphological Analyser: MorphoCom
L04-1520
: Joaquim F. Ferreira da Silva; Zornitsa Kozareva; José Gabriel Pereira Lopes
Cluster Analysis and Classification of Named Entities
L04-1521
: Khalid Choukri; Mahtab Nikkhou; Niklas Paulsson
Network of Data Centres (NetDC): BNSC - An Arabic Broadcast News Speech Corpus
L04-1522
: Valérie Mapelli; Maria Nava; Sylvain Surcin; Djamel Mostefa; Khalid Choukri
Technolangue: A Permanent Evaluation and Information Infrastructure
L04-1523
: Palmira Marrafa
Extending Wordnets To Implicit Information
L04-1524
: Boris Dobrov; Igor Kuralenok; Natalia Loukachevitch; Igor Nekrestyanov; Ilya Segalovich
Russian Information Retrieval Evaluation Seminar
