ACL Anthology
A Digital Archive of Research Papers in Computational Linguistics

Proceedings of the Second International Conference on Language Resources and Evaluation (LREC'00)

L00-1001 : Gérard Bailly; Eduardo R. Banga; Alex Monaghan; Erhard Rank
The Cost258 Signal Generation Test Array

L00-1002 : Kosho Shudo; Masahito Takahashi; Yasuo Koyama; Kenji Yoshimura
Collocations as Word Co-occurrence Restriction Data - An Application to Japanese Word Processor -

L00-1003 : Amit Bagga
Enhancing the TDT Tracking Evaluation

L00-1004 : Amalia Arvaniti; Mary Baltazani
GREEK ToBI: A System for the Annotation of Greek Speech Corpora

L00-1005 : Adam Kilgarriff; Joseph Rosenzweig
English Senseval: Report and Results

L00-1006 : Asunción Moreno; Robrecht Comeyne; Keith Haslam; Henk van den Heuvel; Harald Höge; Sabine Horbach; Giorgio Micca
SALA: SpeechDat across Latin America. Results of the First Phase

L00-1007 : Dan Tufiş
Using a Large Set of EAGLES-compliant Morpho-syntactic Descriptors as a Tagset for Probabilistic Tagging

L00-1008 : Elliott Macklovitch; Michel Simard; Philippe Langlais
TransSearch: A Free Translation Memory on the World Wide Web

L00-1009 : Bolette Sandford Pedersen; Sanni Nimb
Semantic Encoding of Danish Verbs in SIMPLE - Adapting a Verb Framed Model to a Satellite-framed Language

L00-1010 : Mochizuki Hajime; Okumura Manabu
A Comparison of Summarization Methods Based on Task-based Evaluation

L00-1011 : Zheng Jie; Mao Yuhang
A Word Sense Disambiguation Method Using Bilingual Corpus

L00-1012 : Stavroula-Evita Fotinea; Ioannis Dologlou; Stylianos Bakamidis; Gregory Stainhaouer; George Carayannis
Perceptual Evaluation of a New Subband Low Bit Rate Speech Compression System based on Waveform Vector Quantization and SVD Postfiltering

L00-1013 : Sandro Pedrazzini; Elisabeth Maier; Dierk König
Terms Specification and Extraction within a Linguistic-based Intranet Service

L00-1014 : Eva Hajičová; Petr Sgall
Semantico-syntactic Tagging of Very Large Corpora: the Case of Restoration of Nodes on the Underlying Level

L00-1015 : Eva Hajičová; Jarmila Panenová; Petr Sgall
Coreference in Annotating a Large Corpus

L00-1016 : Peter Bennison; Lynne Bowker
Designing a Tool for Exploiting Bilingual Comparable Corpora

L00-1017 : Diana Maynard; Sophia Ananiadou
Creating and Using Domain-specific Ontologies for Terminological Applications

L00-1018 : Ellen M. Voorhees; Dawn M. Tice
The TREC-8 Question Answering Track

L00-1019 : Satoshi Sekine; Hitoshi Isahara
IREX: IR & IE Evaluation Project in Japanese

L00-1020 : Svetlana Sheremetyeva; Sergei Nirenburg
Towards A Universal Tool For NLP Resource Acquisition

L00-1021 : Hu Junfeng; Yu Shiwen
The Multi-layer Language Knowledge Base of Chinese NLP

L00-1022 : Yasmina Abbas; Marie-Luce Picard
With WORLDTREK Family, Create, Update and Browse your Terminological World

L00-1023 : N. Chenfour; A. Benabbou; A. Mouradi
Etude et Evaluation de la Di-Syllabe comme Unité Acoustique pour le Système de Synthèse Arabe PARADIS

L00-1024 : Marcela Charfuelán; José Relaño Gil; M. Carmen Rodríguez Gancedo; Daniel Tapias Merino; Luis Hernández Gómez
Dialogue Annotation for Language Systems Evaluation

L00-1025 : Philippe Langlais; Sébastien Sauvé; George Foster; Elliott Macklovitch; Guy Lapalme
Evaluation of TRANSTYPE, a Computer-aided Translation Typing System: A Comparison of a Theoretical- and a User-oriented Evaluation Procedures

L00-1026 : Gerardo Sierra; John McNaught
Extraction of Semantic Clusters for Terminological Information Retrieval from MRDs

L00-1027 : Jean-Yves Antoine; Jacques Siroux; Jean Caelen; Jeanne Villaneau; Jérôme Goulian; Mohamed Ahafhaf
Obtaining Predictive Results with an Objective Evaluation of Spoken Dialogue Systems: Experiments with the DCR Assessment Paradigm

L00-1028 : Guy Pérennou; Martine De Calmès
MHATLex: Lexical Resources for Modelling the French Pronunciation

L00-1029 : Carine-Alexia Lavelle; Martine De Calmès; Guy Pérennou
Dialogue and Prompting Strategies Evaluation in the DEMON System

L00-1030 : Henk van den Heuvel; Lou Boves; Khalid Choukri; Simo Goddijn; Eric Sanders
SLR Validation: Present State of Affairs and Prospects

L00-1031 : Thierry Dutoit; Michel Bagein; Fabrice Malfrère; Vincent Pagel; Alain Ruelle; Nawfal Tounsi; Dominique Wynsberghe
EULER: an Open, Generic, Multilingual and Multi-platform Text-to-Speech System

L00-1032 : Marc Swerts; Emiel Krahmer
On the Use of Prosody for On-line Evaluation of Spoken Dialogue Systems

L00-1033 : I. Aduriz; E. Agirre; I. Aldezabal; X. Arregi; J. M. Arriola; X. Artola; K. Gojenola; A. Maritxalar; K. Sarasola; M. Urkia
A Word-level Morphosyntactic Analyzer for Basque

L00-1034 : Albert Russel; Hennie Brugman; Daan Broeder; Peter Wittenburg
The EUDICO Project, Multi Media Annotation over the Internet

L00-1035 : Anna Braasch; Sussi Olsen
Towards a Strategy for a Representation of Collocations - Extending the Danish PAROLE-lexicon

L00-1036 : Stavroula-Evita Fotinea; Athanassios Protopapas; Dimitris Dimitriadis; George Carayannis
Perceptual Evaluation of Text-to-Speech Implementation of Enclitic Stress in Greek

L00-1037 : Tami Rannon; Ofra Golani; Anat Goren; Sherrie Shammass; Ami Moyal
Creation of Spoken Hebrew Databases

L00-1038 : Damjan Vlaj; Janez Kaiser; Ralph Wilhelm; Ute Ziegenhain
PLEDIT - A New Efficient Tool for Management of Multilingual Pronunciation Lexica and Batchlists

L00-1039 : Rosa Estopà; Jordi Vivaldi; M. Teresa Cabré
Use of Greek and Latin Forms for Term Detection

L00-1040 : Maria Canelli; Daniele Grasso; Margaret King
Methods and Metrics for the Evaluation of Dictation Systems: a Case Study

L00-1041 : Noah A. Smith; Michael E. Jahr
Cairo: An Alignment Visualization Tool

L00-1042 : Andreas Mengel; Wolfgang Lezius
An XML-based Representation Format for Syntactically Annotated Corpora

L00-1043 : Ornella Corazzari; Nicoletta Calzolari; Antonio Zampolli
An Experiment of Lexical-Semantic Tagging of an Italian Corpus

L00-1044 : Nuria Bel; Federica Busa; Nicoletta Calzolari; Elisabetta Gola; Alessandro Lenci; Monica Monachini; Antoine Ogonowski; Ivonne Peters; Wim Peters; Nilda Ruimy; Marta Villegas; Antonio Zampolli
SIMPLE: A General Framework for the Development of Multilingual Lexicons

L00-1045 : Zygmunt Vetulani
Electronic Language Resources for Polish: POLEX, CEGLEX and GRAMLEX

L00-1046 : Rainer Siemund; Harald Höge; Siegfried Kunzmann; Krzysztof Marasek
SPEECON - Speech Data for Consumer Devices

L00-1047 : Antonio Moreno; Ralph Grishman; Susana López; Fernando Sánchez; Satoshi Sekine
A Treebank of Spanish and its Application to Parsing

L00-1048 : Susanne J. Jekat; Lorenzo Tessiore
End-to-End Evaluation of Machine Interpretation Systems: A Graphical Evaluation Tool

L00-1049 : X. Artola; A. Díaz de Ilarraza; N. Ezeiza; K. Gojenola; A. Maritxalar; A. Soroa
A Proposal for the Integration of NLP Tools using SGML-Tagged Documents

L00-1050 : Thierry Fontenelle
A Bilingual Electronic Dictionary for Frame Semantics

L00-1051 : Martin Braschler; Donna Harman; Michael Hess; Michael Kluck; Carol Peters; Peter Schäuble
The Evaluation of Systems for Cross-language Information Retrieval

L00-1052 : José Bettencourt Gonçalves; Rita Veloso
Spoken Portuguese: Geographic and Social Varieties

L00-1053 : Maria Fernanda Bacelar do Nascimento; Luisa Pereira; João Saramago
Portuguese Corpora at CLUL

L00-1054 : Antonio Moreno; Chantal Pérez
Reusing the Mikrokosmos Ontology for Concept-based Multilingual Terminology Databases

L00-1055 : Kimura Kazuhiro; Hirakawa Hideki
Abstraction of the EDR Concept Classification and its Effectiveness in Word Sense Disambiguation

L00-1056 : Alessandro Cucchiarelli; Enrico Faggioli; Paola Velardi
Will Very Large Corpora Play For Semantic Disambiguation The Role That Massive Computing Power Is Playing For Other AI-Hard Problems?

L00-1057 : Shuichi Itahashi
Guidelines for Japanese Speech Synthesizer Evaluation

L00-1058 : Masumi Narita
Constructing a Tagged E-J Parallel Corpus for Assisting Japanese Software Engineers in Writing English Abstracts

L00-1059 : Hiroyuki Shinnou; Masanori Ikeya
Extraction of Unknown Words Using the Probability of Accepting the Kanji Character Sequence as One Word

L00-1060 : Rosen Ivanov
Automatic Speech Segmentation in High Noise Condition

L00-1061 : Elisa Gavieiro-Villatte; Laurent Spaggiari
Open Ended Computerized Overview of Controlled Languages

L00-1062 : Rodolfo Delmonte
Shallow Parsing and Functional Structure in Italian Corpora

L00-1063 : Dimitrios Kokkinakis; Maria Toporowska Gronostaj; Karin Warmenius
Annotating, Disambiguating & Automatically Extending the Coverage of the Swedish SIMPLE Lexicon

L00-1064 : Diana Santos; Eckhard Bick
Providing Internet Access to Portuguese Corpora: the AC/DC Project

L00-1065 : Sharon Inkelas; Aylin Küntay; C. Orhan Orgun; Ronald Sprouse
Turkish Electronic Living Lexicon (TELL): A Lexical Database

L00-1066 : Wim Goedertier; Simo Goddijn; Jean-Pierre Martens
Orthographic Transcription of the Spoken Dutch Corpus

L00-1067 : Giulia Bernardis; Hervé Bourlard; Martin Rajman; Jean-Cédric Chappelier
Development of Acoustic and Linguistic Resources for Research and Evaluation in Interactive Vocal Information Servers

L00-1068 : Guillermo Rojo; Maria Concepción Álvarez; Pilar Alvariño; Adelaida Gil; María Paula Santalla; Susana Sotelo
An Architecture for Document Routing in Spanish: Two Language Components, Pre-processor and Parser

L00-1069 : John A. Bateman; Anthony F. Hartley
Target Suites for Evaluating the Coverage of Text Generators

L00-1070 : Claire Grover; Colin Matheson; Andrei Mikheev; Marc Moens
LT TTT - A Flexible Tokenisation Tool

L00-1071 : Albert Rilliard; Véronique Aubergé
Perception and Analysis of a Reiterant Speech Paradigm: a Functional Diagnostic of Synthetic Prosody

L00-1072 : Marcello Federico; Dimitri Giordani; Paolo Coletti
Development and Evaluation of an Italian Broadcast News Corpus

L00-1073 : Marta Villegas; Nuria Bel; Alessandro Lenci; Nicoletta Calzolari; Nilda Ruimy; Antonio Zampolli; Teresa Sadurní; Joan Soler
Multilingual Linguistic Resources: From Monolingual Lexicons to Bilingual Interrelated Lexicons

L00-1074 : Alessandro Lenci; Simonetta Montemagni; Vito Pirrelli; Claudia Soria
Where Opposites Meet. A Syntactic Meta-scheme for Corpus Annotation and Parsing Evaluation

L00-1075 : Paolo Allegrini; Simonetta Montemagni; Vito Pirrelli
Controlled Bootstrapping of Lexico-semantic Classes as a Bridge between Paradigmatic and Syntagmatic Knowledge: Methodology and Evaluation

L00-1076 : Rodger Kibble; Kees van Deemter
Coreference Annotation: Whither?

L00-1077 : R. López-Cózar; A.J. Rubio; J.E. Díaz Verdejo; A. De la Torre
Evaluation of a Dialogue System Based on a Generic Model that Combines Robust Speech Understanding and Mixed-initiative Control

L00-1078 : Cosmin Munteanu; Marian Boldea
MDWOZ: A Wizard of Oz Environment for Dialog Systems Development

L00-1079 : Dan Bohuş; Marian Boldea
A Web-based Text Corpora Development System

L00-1080 : Byron Georgantopoulos; Stelios Piperidis
Term-based Identification of Sentences for Text Summarisation

L00-1081 : Kristīne Levāne; Andrejs Spektors
Morphemic Analysis and Morphological Tagging of Latvian Corpus

L00-1082 : Patrick Kremer; Laurent Schmitt
Textual Information Retrieval Systems Test: The Point of View of an Organizer and Corpuses Provider

L00-1083 : Nelleke Oostdijk
The Spoken Dutch Corpus. Overview and First Evaluation

L00-1084 : Toni Badia; Àngels Egea
A Strategy for the Syntactic Parsing of Corpora: from Constraint Grammar Output to Unification-based Processing

L00-1085 : Joan Soler i Bou
Producing LRs in Parallel with Lexicographic Description: the DCC project

L00-1086 : Atsushi Fujii; Tetsuya Ishikawa
A Novelty-based Evaluation Method for Information Retrieval

L00-1087 : Ruslan Mitkov
Towards More Comprehensive Evaluation in Anaphora Resolution

L00-1088 : Joseph Polifroni; Stephanie Seneff
Galaxy-II as an Architecture for Spoken Dialogue Evaluation

L00-1089 : Marko Tadić
Building the Croatian-English Parallel Corpus

L00-1090 : Tamás Váradi
Lexical and Translation Equivalence in Parallel Corpora

L00-1091 : D. Broeder; H. Brugman; A. Russel; R. Skiba; P. Wittenburg
Towards a Standard for Meta-descriptions of Language Resources

L00-1092 : Einar Meister; Arvo Eek; Toomas Altosaar; Martti Vainio
Object-oriented Access to the Estonian Phonetic Database

L00-1093 : Adriana Roventini; Antonietta Alonge; Nicoletta Calzolari; Bernardo Magnini; Francesca Bertagna
ItalWordNet: a Large Semantic Database for Italian

L00-1094 : Cătălina Barbu
FAST - Towards a Semi-automatic Annotation of Corpora

L00-1095 : François Trouilleux; Eric Gaussier; Gabriel G. Bès; Annie Zaenen
Coreference Resolution Evaluation Based on Descriptive Specificity

L00-1096 : Dominique Dutoit
A Text->Meaning->Text Dictionary and Process

L00-1097 : Philippe Boula de Mareüil; Christophe d'Alessandro; François Yvon; Véronique Aubergé; Jacqueline Vaissière; Angélique Amelot
A French Phonetic Lexicon with Variants for Speech and Language Processing

L00-1098 : Laila Dybkjær; Morten Baun Møller; Niels Ole Bernsen; Michael Grosse; Martin Olsen; Amanda Schiffrin
Annotating Communication Problems Using the MATE Workbench

L00-1099 : Niels Ole Bernsen; Laila Dybkjær
A Methodology for Evaluating Spoken Language Dialogue Systems and Their Components

L00-1100 : Niamh Bohan; Elisabeth Breidt; Martin Volk
Evaluating Translation Quality as Input to Product Development

L00-1101 : Lars Ahrenberg; Magnus Merkel; Anna Sågvall Hein; Jörg Tiedemann
Evaluation of Word Alignment Systems

L00-1102 : Hervé Déjean
How To Evaluate and Compare Tagsets? A Proposal

L00-1103 : John White; Jennifer Doyon; Susan Talbott
Determining the Tolerance of Text-handling Tasks for MT Output

L00-1104 : Johann Gamper
A Parallel Corpus of Italian/German Legal Texts

L00-1105 : Sabine Buchholz; Antal van den Bosch
Integrating Seed Names and ngrams for a Named Entity List and Classifier

L00-1106 : Hideki Kashioka; Satosi Shirai
Automatically Expansion of Thesaurus Entries with a Different Thesaurus

L00-1107 : Daniel Zeman; Anoop Sarkar
Learning Verb Subcategorization from Corpora: Counting Frame Subsets

L00-1108 : Sašo Džeroski; Tomaž Erjavec; Jakub Zavrel
Morphosyntactic Tagging of Slovene: Evaluating Taggers and Tagsets

L00-1109 : Giorgio Micca; Alessandra Frasca; Maria Gabriella Di Benedetto
Cross-lingual Interpolation of Speech Recognition Models

L00-1110 : Wim Peters; Ivonne Peters
Lexicalised Systematic Polysemy in WordNet

L00-1111 : Björn Gambäck; Fredrik Olsson
Experiences of Language Engineering Algorithm Reuse

L00-1112 : Jana Klímová; Jan Kocek
Derivation in the Czech National Corpus

L00-1113 : Jakub Zavrel; Walter Daelemans
Bootstrapping a Tagged Corpus through Combination of Existing Heterogeneous Taggers

L00-1114 : Barbora Hladká
The Context (not only) for Humans

L00-1115 : Lars Borin
Something Borrowed, Something Blue: Rule-based Combination of POS Taggers

L00-1116 : Jon Mills
Screffva: A Lexicographer's Workbench

L00-1117 : Philippe Alcouffe; Nicolas Gacon; Claude Roux; Frédérique Segond
A Step toward Semantic Indexing of an Encyclopedic Corpus

L00-1118 : Thomas Brey; Gerhard Hanrieder; Paul Heisterkamp; Ludwig Hitzenberger; Peter Regel-Brietzmann
Issues in the Evaluation of Spoken Dialogue Systems - Experience from the ACCeSS Project

L00-1119 : Gees C. Stein; Tomek Strzalkowski; G. Bowden Wise; Amit Bagga
Evaluating Summaries for Multiple Documents in an Interactive Environment

L00-1120 : Jorge Kinoshita
Grammarless Bracketing in an Aligned Bilingual Corpus

L00-1121 : W.J. Black; J. McNaught; G.P. Zarri; A. Persidis; A. Brasher; L. Gilardoni; E. Bertino; G. Semeraro; P. Leo
A Semi-automatic System for Conceptual Annotation, its Application to Resource Construction and Evaluation

L00-1122 : Amy Isard; David McKelvie; Andreas Mengel; Morten Baun Møller
The MATE Workbench Annotation Tool, a Technical Description

L00-1123 : Rhys James Jones; John S. Mason; Louise Helliker; Mark Pawlewski
Recruitment Techniques for Minority Language Speech Databases: Some Observations

L00-1124 : Charles L. Wayne
Multilingual Topic Detection and Tracking: Successful Research Enabled by Corpora and Evaluation

L00-1125 : Montserrat Marimon Felipe; Jordi Porta Zamorano
PoS Disambiguation and Partial Parsing Bidirectional Interaction

L00-1126 : Hamish Cunningham; Kalina Bontcheva; Valentin Tablan; Yorick Wilks
Software Infrastructure for Language Resources: a Taxonomy of Previous Work and a Requirements Analysis

L00-1127 : Nancy Ide; Patrice Bonhomme; Laurent Romary
XCES: An XML-based Encoding Standard for Linguistic Corpora

L00-1128 : Iason Demiros; Sotiris Boutsis; Voula Giouli; Maria Liakata; Harris Papageorgiou; Stelios Piperidis
Named Entity Recognition in Greek Texts

L00-1129 : Sotiris Boutsis; Prokopis Prokopidis; Voula Giouli; Stelios Piperidis
A Robust Parser for Unrestricted Greek Text

L00-1130 : Matej Rojc; Zdravko Kačič
A Computational Platform for Development of Morphologic and Phonetic Lexica

L00-1131 : Constantin Orăsan; Ramesh Krishnamurthy
An Open Architecture for the Construction and Administration of Corpora

L00-1132 : Matej Rojc; Zdravko Kačič
Design of Optimal Slovenian Speech Corpus for Use in the Concatenative Speech Synthesis System

L00-1133 : Constantin Orăsan
CLinkA A Coreferential Links Annotator

L00-1134 : Adam Kilgarriff; Colin Yallop
What's in a Thesaurus?

L00-1135 : Harris Papageorgiou; Prokopis Prokopidis; Voula Giouli; Stelios Piperidis
A Unified POS Tagging Architecture and its Application to Greek

L00-1136 : Patrice Bonhomme; Patrice Lopez
Resources for Lexicalized Tree Adjoining Grammars and XML Encoding: TagML

L00-1137 : Andreas Witt; Harald Lüngen; Dafydd Gibbon
Enhancing Speech Corpus Resources with Multiple Lexical Tag Layers

L00-1138 : Steven Bird; David Day; John Garofolo; John Henderson; Christophe Laprun; Mark Liberman
ATLAS: A Flexible and Extensible Architecture for Linguistic Annotation

L00-1139 : Pavel Skrelin; Tatiana Sherstinova
Models of Russian Text/Speech Interactive Databases for Supporting of Scientific, Practical and Cultural Researches

L00-1140 : Lluís de Yzaguirre; Marta Ribas; Jordi Vivaldi; M. Teresa Cabré
Some Technical Aspects about Aligning Near Languages

L00-1141 : Tony McEnery; Paul Baker; Lou Burnard
Corpus Resources and Minority Language Engineering

L00-1142 : Brigitte Krenn
CDB - A Database of Lexical Collocations

L00-1143 : Marilyn Walker; Lynette Hirschman; John Aberdeen
Evaluation for Darpa Communicator Spoken Dialogue Systems

L00-1144 : Edouard Geoffrois; Claude Barras; Steven Bird; Zhibiao Wu
Transcribing with Annotation Graphs

L00-1145 : Massimo Poesio
Annotating a Corpus to Develop and Evaluate Discourse Entity Realization Algorithms: Issues and Preliminary Results

L00-1146 : Steven Bird; Peter Buneman; Wang-Chiew Tan
Towards a Query Language for Annotation Graphs

L00-1147 : Catherine Macleod; Nancy Ide; Ralph Grishman
The American National Corpus: A Standardized Resource for American English

L00-1148 : Martha Palmer; Hoa Trang Dang; Joseph Rosenzweig
Semantic Tagging for the Penn Treebank

L00-1149 : Kiril Ribarov
Rule-based Tagging: Morphological Tagset versus Tagset of Analytical Functions

L00-1150 : Kiril Ribarov
The (Un)Deterministic Nature of Morphological Context

L00-1151 : David Day; Alan Goldschen; John Henderson
A Framework for Cross-Document Annotation

L00-1152 : Peggy Cadel; Hélène Ledouble
Extraction of Concepts and Multilingual Information Schemes from French and English Economics Documents

L00-1153 : Eric J. Breck; John D. Burger; Lisa Ferro; Lynette Hirschman; David House; Marc Light; Inderjeet Mani
How to Evaluate Your Question Answering System Every Day ... and Still Get Real Work Done

L00-1154 : Daniela Oppermann; Susanne Burger; Karl Weilhammer
What are Transcription Errors and Why are They Made?

L00-1155 : Barbara Di Eugenio
On the Usage of Kappa to Evaluate Agreement on Coding Tasks

L00-1156 : Sun Le; Jin Youbing; Du Lin; Sun Yufang
Automatic Extraction of English-Chinese Term Lexicons from Noisy Bilingual Corpora

L00-1157 : Christopher Cieri; Mark Liberman
Issues in Corpus Creation and Distribution: The Evolution of the Linguistic Data Consortium

L00-1158 : Christopher Cieri; David Graff; Mark Liberman; Nii Martey; Stephanie Strassel
Large, Multilingual, Broadcast News Corpora for Cooperative Research in Topic Detection and Tracking: The TDT-2 and TDT-3 Corpus Efforts

L00-1159 : Yuji Matsumoto; Tatsuo Yamashita
Using Machine Learning Methods to Improve Quality of Tagged Corpora and Learning Models

L00-1160 : Stephanie Strassel; David Graff; Nii Martey; Christopher Cieri
Quality Control in Large Annotation Projects Involving Multiple Judges: The Case of the TDT Corpora

L00-1161 : Takehito Utsuro
Learning Preference of Dependency between Japanese Subordinate Clauses and its Evaluation in Parsing

L00-1162 : Lin-Shan Lee; Lee-Feng Chien
Live Lexicons and Dynamic Corpora Adapted to the Network Resources for Chinese Spoken Language Processing Applications in an Internet Era

L00-1163 : Lori Levin; Boris Bartlog; Ariadna Font Llitjos; Donna Gates; Alon Lavie; Dorcas Wallace; Taro Watanabe; Monika Woszczyna
Lessons Learned from a Task-based Evaluation of Speech-to-Speech Machine Translation

L00-1164 : Frank Van Eynde; Jakub Zavrel; Walter Daelemans
Part of Speech Tagging and Lemmatisation for the Spoken Dutch Corpus

L00-1165 : Karl Weilhammer; Daniela Oppermann; Susanne Burger
The Influence of Scenario Constraints on the Spontaneity of Speech. A Comparison of Dialogue Corpora

L00-1166 : Leonardo Lesmo; Vincenzo Lombardo
Automatic Assignment of Grammatical Relations

L00-1167 : Bernardo Magnini; Gabriela Cavaglià
Integrating Subject Field Codes into WordNet

L00-1168 : Cristina Bosco; Vincenzo Lombardo; Daniela Vassallo; Leonardo Lesmo
Building a Treebank for Italian: a Data-driven Annotation Schema

L00-1169 : Kyongho Min; William H. Wilson; Yoo-Jin Moon
Typographical and Orthographical Spelling Error Correction

L00-1170 : Jana Klímová; Karel Pala
Application of WordNet ILR in Czech Word-formation

L00-1171 : Byeongchang Kim; Jin-seok Lee; Jeongwon Cha; Geunbae Lee
POSCAT: A Morpheme-based Speech Corpus Annotation Tool

L00-1172 : Uwe Quasthoff; Christian Wolff
A Flexible Infrastructure for Large Monolingual Corpora

L00-1173 : Byung-Ju Kang; Key-Sun Choi
Automatic Transliteration and Back-transliteration by Decision Tree Learning

L00-1174 : Klaus Ries; Lori Levin; Liza Valle; Alon Lavie; Alex Waibel
Shallow Discourse Genre Annotation in CallHome Spanish

L00-1175 : Anne Abeillé; Lionel Clément; Alexandra Kinyon
Building a Treebank for French

L00-1176 : Paola Merlo; Suzanne Stevenson
Establishing the Upper Bound and Inter-judge Agreement of a Verb Classification Task

L00-1177 : Nadjet Bouayad-Agha
Layout Annotation in a Corpus of Patient Information Leaflets

L00-1178 : D. Vaufreydaz; C. Bergamini; J.F. Serignat; L. Besacier; M. Akbar
A New Methodology for Speech Corpora Definition from Internet Documents

L00-1179 : Luisa Bentivogli; Emanuele Pianta; Fabio Pianesi
Coping with Lexical Gaps when Building Aligned Multilingual Wordnets

L00-1180 : Young-Soog Chae; Key-Sun Choi
Design and Construction of Knowledge base for Verb using MRD and Tagged Corpus

L00-1181 : Young-Soog Chae; Key-Sun Choi
Introduction of KIBS (Korean Information Base System) Project

L00-1182 : John Bateman; Elke Teich; Geert-Jan Kruijff; Ivanna Kruijff-Korbayová; Serge Sharoff; Hana Skoumalová
Resources for Multilingual Text Generation in Three Slavic Languages

L00-1183 : Dafydd Gibbon; Thorsten Trippel
A Multi-view Hyperlexicon Resource for Speech and Language System Development

L00-1184 : Lynne Cahill; Christy Doran; Roger Evans; Rodger Kibble; Chris Mellish; D. Paiva; Mike Reape; Donia Scott; Neil Tipper
Enabling Resource Sharing in Language Generation: an Abstract Reference Architecture

L00-1185 : Zdravko Kačič; Bogomir Horvat; Aleksandra Zögling
Issues in Design and Collection of Large Telephone Speech Corpus for Slovenian Language

L00-1186 : Christophe Jouis; ARC A3
ARC A3: A Method for Evaluating Term Extracting Tools and/or Semantic Relations between Terms from Corpora

L00-1187 : Richard F. E. Sutcliffe; Sadao Kurohashi
A Parallel English-Japanese Query Collection for the Evaluation of On-Line Help Systems

L00-1188 : Dan Tufiş; Péter Dienes; Csaba Oravecz; Tamás Váradi
Principled Hidden Tagset Design for Tiered Tagging of Hungarian

L00-1189 : Felisa Verdejo; Julio Gonzalo; Anselmo Peñas; Fernando López; David Fernández
Evaluating Wordnets in Cross-language Information Retrieval: the ITEM Search Engine

L00-1190 : Dafydd Gibbon; Ana Paula Quirino Simões; Martin Matthiesen
An Optimised FS Pronunciation Resource Generator for Highly Inflecting Languages

L00-1191 : Gabriel Illouz
Sublanguage Dependent Evaluation: Toward Predicting NLP performances

L00-1192 : Jan-Torsten Milde; Markus Reinsch
The Universal XML Organizer: UXO

L00-1193 : Helka Folch; Serge Heiden; Benoît Habert; Serge Fleury; Gabriel Illouz; Pierre Lafon; Julien Nioche; Sophie Prévost
TyPTex: Inductive Typological Text Classification by Multivariate Statistical Analysis for NLP Systems Tuning/Evaluation

L00-1194 : Davide Turcato; Janine Toole; Stavroula Tsiplakou; Trude Heift; Paul McFetridge
An Approach to Lexical Development for Inflectional Languages

L00-1195 : Luzia Wittmann; Ricardo Daniel Ribeiro; Tânia Pêgo; Fernando Batista
Some Language Resources and Tools for Computational Processing of Portuguese at INESC

L00-1196 : Takehito Utsuro; Manabu Sassano
Minimally Supervised Japanese Named Entity Recognition: Resources and Evaluation

L00-1197 : Joyce Yue Chai
Evaluation of a Generic Lexical Semantic Resource in Information Extraction

L00-1198 : Jim Talley
The Establishment of Motorola's Human Language Data Resource Center: Addressing the Criticality of Language Resources in the Industrial Setting

L00-1199 : Katsunobu Itou; Kiyohiro Shikano; Tatsuya Kawahara; Kazuya Takeda; Atsushi Yamada; Akinori Itou; Takehito Utsuro; Tetsunori Kobayashi; Nobuaki Minematsu; Mikio Yamamoto; Shigeki Sagayama; Akinobu Lee
IPA Japanese Dictation Free Software Project

L00-1200 : Kikuo Maekawa; Hanae Koiso; Sadaoki Furui; Hitoshi Isahara
Spontaneous Speech Corpus of Japanese

L00-1201 : Sean Boisen; Michael R. Crystal; Richard Schwartz; Rebecca Stone; Ralph Weischedel
Annotating Resources for Information Extraction

L00-1202 : Thierry Declerck; Alexander Werner Jachmann; Hans Uszkoreit
The New Edition of the Natural Language Software Registry (an Initiative of ACL hosted at DFKI)

L00-1203 : Jong-mi Kim
Design Methodology for Bilingual Pronunciation Dictionary

L00-1204 : Constandina Economou; Spyros Raptis; Gregory Stainhaouer
LEXIPLOIGISSI: An Educational Platform for the Teaching of Terminology in Greece

L00-1205 : Malgorzata Marciniak; Agnieszka Mykowiecka; Anna Kupść; Adam Przepiórkowski
An HPSG-Annotated Test Suite for Polish

L00-1206 : Finn Tore Johansen; Narada Warakagoda; Børge Lindberg; Gunnar Lehtinen; Zdravko Kačič; Andrej Žgank; Kjell Elenius; Giampiero Salvi
The COST 249 SpeechDat Multilingual Reference Recogniser

L00-1207 : Marianna Katsoyannou; Eleni Efthimiou
Terminology Encoding in View of Multifunctional NLP Resources

L00-1208 : Key-Sun Choi; Young-Soog Chae
Terminology in Korea: KORTERM

L00-1209 : Gaëlle Birocheau
Morphological Tagging to Resolve Morphological Ambiguities

L00-1210 : Sonja Nießen; Franz Josef Och; Gregor Leusch; Hermann Ney
An Evaluation Tool for Machine Translation: Fast Evaluation for MT Research

L00-1211 : Fiammetta Namer; Georgette Dal
GéDériF: Automatic Generation and Analysis of Morphologically Constructed Lexical Resources

L00-1212 : Josué Ndamba; Jean Silence Bayamboussa
Le Programme Compalex (COMPAraison LEXicale)

L00-1213 : David Graff; Steven Bird
Many Uses, Many Annotations for Large Speech Corpora: Switchboard and TDT as Case Studies

L00-1214 : Gerhard Budin; Alan K. Melby
Accessibility of Multilingual Terminological Resources - Current Problems and Prospects for the Future

L00-1215 : Bilel Gargouri; Mohamed Jmaiel; Abdelmajid Ben Hamadou
Using a Formal Approach to Evaluate Grammars

L00-1216 : Alvin Martin; Mark Przybocki
Design Issues in Text-Independent Speaker Recognition Evaluation

L00-1217 : Fei Xia; Martha Palmer; Nianwen Xue; Mary Ellen Okurowski; John Kovarik; Fu-Dong Chiou; Shizhe Huang; Tony Kroch; Mitch Marcus
Developing Guidelines and Ensuring Consistency for Chinese Text Annotation

L00-1218 : Jerneja Gros; France Mihelič; Simon Dobrišek; Tomaž Erjavec; Mario Žganec
Corpora of Slovene Spoken Language for Multi-lingual Applications

L00-1219 : E. Kavallieratou; N. Liolios; E. Koutsogeorgos; N. Fakotakis; G. Kokkinakis
GRUHD: A Greek Database of Unconstrained Handwriting

L00-1220 : France Mihelič; Jerneja Gros; Elmar Nöth; Volker Warnke
Labeling of Prosodic Events in Slovenian Speech Database GOPOLIS

L00-1221 : Catia Cucchiarini; Johan Van Hoorde; Elisabeth D'Halleweyn
NL-Translex: Machine Translation for Dutch

L00-1222 : Jaroslava Hlaváčová
Rarity of Words in a Language and in a Corpus

L00-1223 : Ángel Martín Municio; Guillermo Rojo; Fernando Sánchez León; Octavio Pinillos
Language Resources Development at the Spanish Royal Academy

L00-1224 : Irina Prodanof; Amedeo Cappelli; Lorenzo Moretti
Reusability as Easy Adaptability: A Substantial Advance in NL Technology

L00-1225 : Andrew Bredenkamp; Berthold Crysmann; Mirela Petrea
Looking for Errors: A Declarative Formalism for Resource-adaptive Language Checking

L00-1226 : Martin Gellerstam; Yvonne Cederholm; Torgny Rasmark
The Bank of Swedish

L00-1227 : George Tambouratzis; Stella Markantonatou; Nikolaos Hairetakis; George Carayannis
Automatic Style Categorisation of Corpora in the Greek Language

L00-1228 : Aristomenis Thanopoulos; Nikos Fakotakis; George Kokkinakis
Automatic Extraction of Semantic Similarity of Words from Raw Technical Texts

L00-1229 : H. Bonneau-Maynard; L. Devillers; S. Rosset
Predictive Performance of Dialog Systems

L00-1230 : Penny Labropoulou; Elena Mantzari; Harris Papageorgiou; Maria Gavrilidou
Automatic Generation of Dictionary Definitions from a Computational Lexicon

L00-1231 : Nicole Beringer; Marcia Neff
Regional Pronunciation Variants for Automatic Segmentation

L00-1232 : Mario Refice; Michelina Savino; Marco Altieri; Roberto Altieri
SegWin: a Tool for Segmenting, Annotating, and Controlling the Creation of a Database of Spoken Italian Varieties

L00-1233 : Klaus Bengler
Automotive Speech-Recognition - Success Conditions Beyond Recognition Rates

L00-1234 : Wolfgang Menzel; Eric Atwell; Patrizia Bonaventura; Daniel Herron; Peter Howarth; Rachel Morton; Clive Souter
The ISLE Corpus of Non-Native Spoken English

L00-1235 : Kallirroi Georgila; Nikos Fakotakis; George Kokkinakis
A Graphical Parametric Language-Independent Tool for the Annotation of Speech Corpora

L00-1236 : Georges Vignaux
The PAROLE Program

L00-1237 : Stéphane Chaudiron; Khalid Choukri; Audrey Mance; Valérie Mapelli
For a Repository of NLP Tools

L00-1238 : Jeffrey Allen; Khalid Choukri
Survey of Language Engineering Needs: a Language Resources Perspective

L00-1239 : Jo Calder
Interarbora and Thistle - Delivering Linguistic Structure by the Internet

L00-1240 : George Demetriou; Robert Gaizauskas
Automatically Augmenting Terminological Lexicons from Untagged Text

L00-1241 : Andrea Setzer; Robert Gaizauskas
Annotating Events and Temporal Information in Newswire Texts

L00-1242 : Bonnie J. Dorr; Gina-Anne Levow; Dekang Lin; Scott Thomas
Chinese-English Semantic Resource Construction

L00-1243 : Vera Fluhr-Semenova; Christian Fluhr; Stéphanie Brisson
Production of NLP-oriented Bilingual Language Resources from Human-oriented Dictionaries

L00-1244 : J.C. Roux; E.C. Botha; J.A. du Preez
Developing a Multilingual Telephone Based Information System in African Languages

L00-1245 : Roberto Basili; Maria Teresa Pazienza; Michele Vindigni; Fabio Massimo Zanzotto
Tuning Lexicons to New Operational Scenarios

L00-1246 : José A.R. Fonollosa; Asunción Moreno
SpeechDat-Car Fixed Platform

L00-1247 : Thorsten Brants
Inter-annotator Agreement for a German Newspaper Corpus

L00-1248 : Thorsten Brants; Oliver Plaehn
Interactive Corpus Annotation

L00-1249 : Tomaž Erjavec; Roger Evans; Nancy Ide; Adam Kilgarriff
The Concede Model for Lexical Databases

L00-1250 : Nick Hatzigeorgiu; Maria Gavrilidou; Stelios Piperidis; George Carayannis; Anastasia Papakostopoulou; Athanassia Spiliotopoulou; Anna Vacalopoulou; Penny Labropoulou; Elena Mantzari; Harris Papageorgiou; Iason Demiros
Design and Implementation of the Online ILSP Greek Corpus

L00-1251 : Saturnino Luz
A Software Toolkit for Sharing and Accessing Corpora Over the Internet

L00-1252 : Ülle Viks
Tools for the Generation of Morphological Entries in Dictionaries

L00-1253 : Paula Guerreiro
Improving Lexical Databases with Collocational Information: Data from Portuguese

L00-1254 : Kiyoaki Shirai; Hozumi Tanaka; Takenobu Tokunaga
Semi-automatic Construction of a Tree-annotated Corpus Using an Iterative Learning Statistical Language Model

L00-1255 : Marilyn Mason
Issues from Corpus Analysis that have influenced the On-going Development of Various Haitian Creole Text- and Speech-based NLP Systems and Applications

L00-1256 : David Portabella; Albert Febrer; Asunción Moreno
NaniTrans: a Speech Labelling Tool

L00-1257 : Sanda M. Harabagiu; Steven J. Maiorano
Acquisition of Linguistic Patterns for Knowledge-based Information Extraction

L00-1258 : Elisabeth D'Halleweyn; Erwin Dewallef; Jeannine Beeken
A Platform for Dutch in Human Language Technologies

L00-1259 : Marilyn Walker; Candace Kamm; Julie Boland
Developing and Testing General Models of Spoken Dialogue System Performance

L00-1260 : Claude de Loupy; Marc El-Bèze
Using Few Clues Can Compensate the Small Amount of Resources Available for Word Sense Disambiguation

L00-1261 : George Mikros; George Carayannis
Modern Greek Corpus Taxonomy

L00-1262 : Patrick Paroubek
Language Resources as by-Product of Evaluation: The MULTITAG Example

L00-1263 : Judith L. Klavans; Nina Wacholder; David K. Evans
Evaluation of Computational Linguistic Techniques for Identifying Significant Topics for Browsing Applications

L00-1264 : Satoshi Nakamura; Kazuo Hiyane; Futoshi Asano; Takanobu Nishiura; Takeshi Yamada
Acoustical Sound Database in Real Environments for Sound Scene Understanding and Hands-Free Speech Recognition

L00-1265 : George Demetriou; Eric Atwell; Clive Souter
Using Lexical Semantic Knowledge from Machine Readable Dictionaries for Domain Independent Language Modelling

L00-1266 : L. Cristoforetti; M. Matassoni; M. Omologo; P. Svaizer; E. Zovato
Annotation of a Multichannel Noisy Speech Corpus

L00-1267 : John Kontos; Ioanna Malagardi; Spyros Fountoukis
ARISTA Generative Lexicon for Compound Greek Medical Terms

L00-1268 : Knut Hofland
A Self-Expanding Corpus Based on Newspapers on the Web

L00-1269 : Janne Bondi Johannessen; Anders Nøklestad; Kristin Hagen
A Web-based Advanced and User Friendly System: The Oslo Corpus of Tagged Norwegian Texts

L00-1270 : Nick Campbell
COCOSDA - a Progress Report

L00-1271 : Ivonne Peters; Wim Peters
The Treatment of Adjectives in SIMPLE: Theoretical Observations

L00-1272 : Christine Michel
Cardinal, Nominal or Ordinal Similarity Measures in Comparative Evaluation of Information Retrieval Process

L00-1273 : Laurie E. Damianos; Jill Drury; Tari Fanderclai; Lynette Hirschman; Jeff Kurtz; Beatrice Oshika
Evaluating Multi-party Multi-modal Systems

L00-1274 : Claudia Kunze
Extension and Use of GermaNet, a Lexical-Semantic Database

L00-1275 : Serge A. Yablonsky
Russian Monitor Corpora: Composition, Linguistic Encoding and Internet Publication

L00-1276 : Ann Copestake; Dan Flickinger
An Open Source Grammar Development Environment and Broad-coverage English Grammar Using HPSG

L00-1277 : Sun Maosong; Sun Honglin; Huang Changning; Zhang Pu; Xing Hongbing; Zhou Qiang
Hua Yu: A Word-segmented and Part-Of-Speech Tagged Chinese Corpus

L00-1278 : Asunción Moreno; Børge Lindberg; Christoph Draxler; Gaël Richard; Khalid Choukri; Stephan Euler; Jeffrey Allen
SPEECHDAT-CAR. A Large Speech Database for Automotive Environments

L00-1279 : Giovanna Turrini; Laura Cignoni; Alessandro Paccosi
Addizionario: an Interactive Hypermedia Tool for Language Learning

L00-1280 : Khalid Choukri; Audrey Mance; Valérie Mapelli
Recent Developments within the European Language Resources Association (ELRA)