Daniel M. Bikel

Also published as: Dan Bikel, Daniel Bikel


2022

Exploring Dual Encoder Architectures for Question Answering
Zhe Dong | Jianmo Ni | Dan Bikel | Enrique Alfonseca | Yuan Wang | Chen Qu | Imed Zitouni
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Dual encoders have been used for question-answering (QA) and information retrieval (IR) tasks with good results. There are two major types of dual encoders: Siamese Dual Encoders (SDE), with parameters shared across the two encoders, and Asymmetric Dual Encoders (ADE), with two distinctly parameterized encoders. In this work, we explore dual encoder architectures for QA retrieval tasks. Evaluating on MS MARCO, open-domain NQ, and the MultiReQA benchmarks, we show that SDE performs significantly better than ADE. We further propose three improved versions of ADE. Based on the evaluation of QA retrieval tasks and direct analysis of the embeddings, we demonstrate that sharing parameters in the projection layers enables ADEs to perform competitively with SDEs.
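The architectural contrast, and the projection-sharing remedy, can be made concrete with a short sketch. This is a minimal illustration, not the paper's implementation: a toy MLP stands in for the transformer encoder towers, and all module names and dimensions are invented for the example.

```python
# Illustrative sketch only: toy MLP encoders stand in for transformer towers;
# all module names and sizes are invented, not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_encoder(dim=64):
    # Stand-in for a full transformer encoder tower.
    return nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

class SiameseDualEncoder(nn.Module):
    """SDE: one encoder and one projection serve both questions and answers."""
    def __init__(self, dim=64, proj_dim=32):
        super().__init__()
        self.encoder = make_encoder(dim)
        self.proj = nn.Linear(dim, proj_dim)

    def forward(self, q, a):
        return self.proj(self.encoder(q)), self.proj(self.encoder(a))

class AsymmetricDualEncoder(nn.Module):
    """ADE variant with distinct encoder towers but a shared projection layer,
    the parameter-sharing pattern the abstract reports as closing the gap."""
    def __init__(self, dim=64, proj_dim=32):
        super().__init__()
        self.q_encoder = make_encoder(dim)
        self.a_encoder = make_encoder(dim)
        self.shared_proj = nn.Linear(dim, proj_dim)  # shared across both towers

    def forward(self, q, a):
        return self.shared_proj(self.q_encoder(q)), self.shared_proj(self.a_encoder(a))

# Retrieval score is the similarity between the two embeddings.
q, a = torch.randn(2, 64), torch.randn(2, 64)
qe, ae = AsymmetricDualEncoder()(q, a)
scores = F.cosine_similarity(qe, ae)
```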

2021

Benchmarking Scalable Methods for Streaming Cross Document Entity Coreference
Robert L Logan IV | Andrew McCallum | Sameer Singh | Dan Bikel
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Streaming cross-document entity coreference (CDC) systems disambiguate mentions of named entities in a scalable manner via incremental clustering. Unlike other approaches for named entity disambiguation (e.g., entity linking), streaming CDC allows for the disambiguation of entities that are unknown at inference time. Thus, it is well-suited for processing streams of data where new entities are frequently introduced. Despite these benefits, this task is currently difficult to study, as existing approaches are either evaluated on datasets that are no longer available, or omit crucial details needed to ensure fair comparison. In this work, we address this issue by compiling a large benchmark adapted from existing free datasets, and performing a comprehensive evaluation of a number of novel and existing baseline models. We investigate how best to encode mentions, which clustering algorithms are most effective for grouping mentions, how models transfer to different domains, and how bounding the number of mentions tracked during inference impacts performance. Our results show that the relative performance of neural and feature-based mention encoders varies across different domains, and in most cases the best performance is achieved using a combination of both approaches. We also find that performance is minimally impacted by limiting the number of tracked mentions.
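The streaming setting can be illustrated with a minimal greedy clusterer over mention embeddings. This is a sketch of the task setup, not any of the paper's models: the similarity threshold, oldest-first eviction policy, and cosine similarity measure are all assumptions chosen for the example.

```python
# Greedy incremental clustering of mention embeddings, with a cap on how many
# mentions are tracked (mirroring the bounded-memory condition the paper
# evaluates). Illustration only; threshold and eviction policy are assumptions.
from collections import deque
import numpy as np

class StreamingClusterer:
    def __init__(self, threshold=0.8, max_mentions=10_000):
        self.threshold = threshold
        # (embedding, cluster_id) pairs; the oldest is evicted past the cap.
        self.mentions = deque(maxlen=max_mentions)
        self.next_id = 0

    def add(self, emb):
        emb = emb / np.linalg.norm(emb)
        best_sim, best_id = -1.0, None
        for other, cid in self.mentions:  # nearest tracked mention (cosine)
            sim = float(other @ emb)
            if sim > best_sim:
                best_sim, best_id = sim, cid
        if best_id is None or best_sim < self.threshold:
            # No sufficiently similar mention: start a new entity cluster.
            best_id, self.next_id = self.next_id, self.next_id + 1
        self.mentions.append((emb, best_id))
        return best_id

clusterer = StreamingClusterer()
for emb in np.random.randn(100, 64):   # stand-in for a mention-encoder output stream
    cluster_id = clusterer.add(emb)
```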

MOLEMAN: Mention-Only Linking of Entities with a Mention Annotation Network
Nicholas FitzGerald | Dan Bikel | Jan Botha | Daniel Gillick | Tom Kwiatkowski | Andrew McCallum
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

We present an instance-based nearest-neighbor approach to entity linking. In contrast to most prior entity retrieval systems, which represent each entity with a single vector, we build a contextualized mention encoder that learns to place similar mentions of the same entity closer in vector space than mentions of different entities. This approach allows all mentions of an entity to serve as “class prototypes”: inference retrieves from the full set of labeled entity mentions in the training set and applies the nearest mention neighbor’s entity label. Our model is trained on a large multilingual corpus of mention pairs derived from Wikipedia hyperlinks, and performs nearest-neighbor inference on an index of 700 million mentions. It is simpler to train, gives more interpretable predictions, and outperforms all other systems on two multilingual entity linking benchmarks.
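Inference in this instance-based scheme reduces to a nearest-neighbor lookup over labeled mention embeddings. The sketch below shows that step with a brute-force index and synthetic data; the real system searches an index of 700 million mentions with a learned mention encoder, and the names and sizes here are hypothetical stand-ins.

```python
# Nearest-mention-neighbor entity linking: brute-force stand-in for the
# paper's large-scale mention index. Embeddings and labels are synthetic.
import numpy as np

rng = np.random.default_rng(0)
# Index of labeled mention embeddings (unit-normalized rows).
index_embs = rng.standard_normal((1000, 64)).astype(np.float32)
index_embs /= np.linalg.norm(index_embs, axis=1, keepdims=True)
entity_labels = rng.integers(0, 50, size=1000)  # entity id for each indexed mention

def link(mention_emb):
    """Assign the query mention the entity label of its nearest indexed mention."""
    mention_emb = mention_emb / np.linalg.norm(mention_emb)
    nearest = int(np.argmax(index_embs @ mention_emb))  # cosine similarity
    return entity_labels[nearest]

predicted_entity = link(rng.standard_normal(64).astype(np.float32))
```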

2008

Event Matching Using the Transitive Closure of Dependency Relations
Daniel M. Bikel | Vittorio Castelli
Proceedings of ACL-08: HLT, Short Papers

2004

A Distributional Analysis of a Lexicalized Statistical Parsing Model
Daniel M. Bikel
Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing

Intricacies of Collins’ Parsing Model
Daniel M. Bikel
Computational Linguistics, Volume 30, Number 4, December 2004

2002

Recovering Latent Information in Treebanks
David Chiang | Daniel M. Bikel
COLING 2002: The 19th International Conference on Computational Linguistics

2000

Two Statistical Parsing Models Applied to the Chinese Treebank
Daniel M. Bikel | David Chiang
Second Chinese Language Processing Workshop

Automatic WordNet Mapping Using Word Sense Disambiguation
Changki Lee | Geunbae Lee | Jungyun Seo
2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora

A Statistical Model for Parsing and Word-Sense Disambiguation
Daniel M. Bikel
2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora

1997

Nymble: a High-Performance Learning Name-finder
Daniel M. Bikel | Scott Miller | Richard Schwartz | Ralph Weischedel
Fifth Conference on Applied Natural Language Processing

1996

Progress in Information Extraction
Ralph Weischedel | Sean Boisen | Daniel Bikel | Robert Bobrow | Michael Crystal | William Ferguson | Allan Wechsler | The PLUM Research Group
TIPSTER TEXT PROGRAM PHASE II: Proceedings of a Workshop held at Vienna, Virginia, May 6-8, 1996

Approaches in MET (Multi-Lingual Entity Task)
Damaris Ayuso | Daniel Bikel | Tasha Hall | Erik Peterson | Ralph Weischedel | Patrick Jost
TIPSTER TEXT PROGRAM PHASE II: Proceedings of a Workshop held at Vienna, Virginia, May 6-8, 1996