David A. Smith

Also published as: David Addison Smith, David Smith


2019

Noisy Neural Language Modeling for Typing Prediction in BCI Communication
Rui Dong | David Smith | Shiran Dudy | Steven Bedrick
Proceedings of the Eighth Workshop on Speech and Language Processing for Assistive Technologies

Language models have broad adoption in predictive typing tasks. When the typing history contains numerous errors, as in open-vocabulary predictive typing with brain-computer interface (BCI) systems, we observe significant performance degradation in both n-gram and recurrent neural network language models trained on clean text. In evaluations of ranking character predictions, training recurrent LMs on noisy text makes them much more robust to noisy histories, even when the error model is misspecified. We also propose an effective strategy for combining evidence from multiple ambiguous histories of BCI electroencephalogram measurements.

2018

Modeling the Decline in English Passivization
Liwen Hou | David Smith
Proceedings of the Society for Computation in Linguistics (SCiL) 2018

A Multi-Context Character Prediction Model for a Brain-Computer Interface
Shiran Dudy | Shaobin Xu | Steven Bedrick | David Smith
Proceedings of the Second Workshop on Subword/Character LEvel Models

Brain-computer interfaces and other augmentative and alternative communication devices introduce language-modeling challenges distinct from other character-entry methods. In particular, the acquired EEG (electroencephalogram) signal is noisier, which, in turn, makes the user intent harder to decipher. In order to adapt to this condition, we propose to maintain ambiguous history for every time step, and to employ, apart from the character language model, word information to produce a more robust prediction system. We present preliminary results that compare this proposed Online-Context Language Model (OCLM) to current algorithms that are used in this type of setting. Evaluation on both perplexity and predictive accuracy demonstrates promising results when dealing with ambiguous histories in order to provide to the front end a distribution of the next character the user might type.

Multi-Input Attention for Unsupervised OCR Correction
Rui Dong | David Smith
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We propose a novel approach to OCR post-correction that exploits repeated texts in large corpora both as a source of noisy target outputs for unsupervised training and as a source of evidence when decoding. A sequence-to-sequence model with attention is applied for single-input correction, and a new decoder with multi-input attention averaging is developed to search for consensus among multiple sequences. We design two ways of training the correction model without human annotation, either training to match noisily observed textual variants or bootstrapping from a uniform error model. On two corpora of historical newspapers and books, we show that these unsupervised techniques cut the character and word error rates nearly in half on single inputs and, with the addition of multi-input decoding, can rival supervised methods.

2017

Can You See the (Linguistic) Difference? Exploring Mass/Count Distinction in Vision
David Addison Smith | Sandro Pezzelle | Francesca Franzon | Chiara Zanini | Raffaella Bernardi
IWCS 2017 — 12th International Conference on Computational Semantics — Short papers

2016

Online Multilingual Topic Models with Multi-Level Hyperpriors
Kriste Krstovski | David Smith | Michael J. Kurtz
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Bootstrapping Translation Detection and Sentence Extraction from Comparable Corpora
Kriste Krstovski | David Smith
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

Detecting and Evaluating Local Text Reuse in Social Networks
Shaobin Xu | David Smith | Abigail Mullen | Ryan Cordell
Proceedings of the Joint Workshop on Social Dynamics and Personal Attributes in Social Media

2013

Online Polylingual Topic Models for Fast Document Translation Detection
Kriste Krstovski | David A. Smith
Proceedings of the Eighth Workshop on Statistical Machine Translation

2012

A Dictionary of Wisdom and Wit: Learning to Extract Quotable Phrases
Michael Bendersky | David Smith
Proceedings of the NAACL-HLT 2012 Workshop on Computational Linguistics for Literature

Discovering Factions in the Computational Linguistics Community
Yanchuan Sim | Noah A. Smith | David A. Smith
Proceedings of the ACL-2012 Special Workshop on Rediscovering 50 Years of Discoveries

Parse, Price and Cut—Delayed Column and Row Generation for Graph Based Parsers
Sebastian Riedel | David Smith | Andrew McCallum
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Improving NLP through Marginalization of Hidden Syntactic Structure
Jason Naradowsky | Sebastian Riedel | David Smith
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Grammarless Parsing for Joint Inference
Jason Naradowsky | Tim Vieira | David Smith
Proceedings of COLING 2012

2011

Joint Annotation of Search Queries
Michael Bendersky | W. Bruce Croft | David A. Smith
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

A Discriminative Model for Joint Morphological Disambiguation and Dependency Parsing
John Lee | Jason Naradowsky | David A. Smith
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

A Minimally Supervised Approach for Detecting and Ranking Document Translation Pairs
Kriste Krstovski | David A. Smith
Proceedings of the Sixth Workshop on Statistical Machine Translation

2010

Relaxed Marginal Inference and its Application to Dependency Parsing
Sebastian Riedel | David A. Smith
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2009

Parser Adaptation and Projection with Quasi-Synchronous Grammar Features
David A. Smith | Jason Eisner
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Polylingual Topic Models
David Mimno | Hanna M. Wallach | Jason Naradowsky | David A. Smith | Andrew McCallum
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

2008

Dependency Parsing by Belief Propagation
David Smith | Jason Eisner
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

HotSpots: Visualizing Edits to a Text
Srinivas Bangalore | David Smith
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

2007

Probabilistic Models of Nonprojective Dependency Trees
David A. Smith | Noah A. Smith
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

Bootstrapping Feature-Rich Dependency Parsers with Entropic Priors
David A. Smith | Jason Eisner
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

Log-Linear Models of Non-Projective Trees, k-best MST Parsing and Tree-Ranking
Keith Hall | Jiří Havelka | David A. Smith
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

Minimum Risk Annealing for Training Log-Linear Models
David A. Smith | Jason Eisner
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

Vine Parsing and Minimum Risk Reranking for Speed and Precision
Markus Dreyer | David A. Smith | Noah A. Smith
Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X)

Quasi-Synchronous Grammars: Alignment by Soft Projection of Syntactic Dependencies
David Smith | Jason Eisner
Proceedings on the Workshop on Statistical Machine Translation

2005

Context-Based Morphological Disambiguation with Random Fields
Noah A. Smith | David A. Smith | Roy W. Tromble
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

2004

Bilingual Parsing with Factored Estimation: Using English to Parse Korean
David A. Smith | Noah A. Smith
Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing

A Smorgasbord of Features for Statistical Machine Translation
Franz Josef Och | Daniel Gildea | Sanjeev Khudanpur | Anoop Sarkar | Kenji Yamada | Alex Fraser | Shankar Kumar | Libin Shen | David Smith | Katherine Eng | Viren Jain | Zhen Jin | Dragomir Radev
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics: HLT-NAACL 2004

2003

Bootstrapping toponym classifiers
David A. Smith | Gideon S. Mann
Proceedings of the HLT-NAACL 2003 Workshop on Analysis of Geographic References