Dan Klein


2019

A Deep Factorization of Style and Structure in Fonts
Akshay Srivatsan | Jonathan Barron | Dan Klein | Taylor Berg-Kirkpatrick
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We propose a deep factorization model for typographic analysis that disentangles content from style. Specifically, a variational inference procedure factors each training glyph into the combination of a character-specific content embedding and a latent font-specific style variable. The underlying generative model combines these factors through an asymmetric transpose convolutional process to generate the image of the glyph itself. When trained on corpora of fonts, our model learns a manifold over font styles that can be used to analyze or reconstruct new, unseen fonts. On the task of reconstructing missing glyphs from an unknown font given only a small number of observations, our model outperforms both a strong nearest neighbors baseline and a state-of-the-art discriminative model from prior work.

Pragmatically Informative Text Generation
Sheng Shen | Daniel Fried | Jacob Andreas | Dan Klein
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

We improve the informativeness of models for conditional text generation using techniques from computational pragmatics. These techniques formulate language production as a game between speakers and listeners, in which a speaker should generate output text that a listener can use to correctly identify the original input that the text describes. While such approaches are widely used in cognitive science and grounded language learning, they have received less attention for more standard language generation tasks. We consider two pragmatic modeling methods for text generation: one where pragmatics is imposed by information preservation, and another where pragmatics is imposed by explicit modeling of distractors. We find that these methods improve the performance of strong existing systems for abstractive summarization and generation from structured meaning representations.

Cross-Domain Generalization of Neural Constituency Parsers
Daniel Fried | Nikita Kitaev | Dan Klein
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Neural parsers obtain state-of-the-art results on benchmark treebanks for constituency parsing—but to what degree do they generalize to other domains? We present three results about the generalization of neural parsers in a zero-shot setting: training on trees from one corpus and evaluating on out-of-domain corpora. First, neural and non-neural parsers generalize comparably to new domains. Second, incorporating pre-trained encoder representations into neural parsers substantially improves their performance across all domains, but does not give a larger relative improvement for out-of-domain treebanks. Finally, despite the rich input representations they learn, neural parsers still benefit from structured output prediction of output trees, yielding higher exact match accuracy and stronger generalization both to larger text spans and to out-of-domain corpora. We analyze generalization on English and Chinese corpora, and in the process obtain state-of-the-art parsing results for the Brown, Genia, and English Web treebanks.

Pre-Learning Environment Representations for Data-Efficient Neural Instruction Following
David Gaddy | Dan Klein
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We consider the problem of learning to map from natural language instructions to state transitions (actions) in a data-efficient manner. Our method takes inspiration from the idea that it should be easier to ground language to concepts that have already been formed through pre-linguistic observation. We augment a baseline instruction-following learner with an initial environment-learning phase that uses observations of language-free state transitions to induce a suitable latent representation of actions before processing the instruction-following training data. We show that mapping to pre-learned representations substantially improves performance over systems whose representations are learned from limited instructional data alone.

Multilingual Constituency Parsing with Self-Attention and Pre-Training
Nikita Kitaev | Steven Cao | Dan Klein
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We show that constituency parsing benefits from unsupervised pre-training across a variety of languages and a range of pre-training conditions. We first compare the benefits of no pre-training, fastText, ELMo, and BERT for English and find that BERT outperforms ELMo, in large part due to increased model capacity, whereas ELMo in turn outperforms the non-contextual fastText embeddings. We also find that pre-training is beneficial across all 11 languages tested; however, large model sizes (more than 100 million parameters) make it computationally expensive to train separate models for each language. To address this shortcoming, we show that joint multilingual pre-training and fine-tuning allows sharing all but a small number of parameters between ten languages in the final model. The 10x reduction in model size compared to fine-tuning one model per language causes only a 3.2% relative error increase in aggregate. We further explore the idea of joint fine-tuning and show that it gives low-resource languages a way to benefit from the larger datasets of other languages. Finally, we demonstrate new state-of-the-art results for 11 languages, including English (95.8 F1) and Chinese (91.8 F1).

Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation
Ronghang Hu | Daniel Fried | Anna Rohrbach | Dan Klein | Trevor Darrell | Kate Saenko
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Vision-and-Language Navigation (VLN) requires grounding instructions, such as “turn right and stop at the door”, to routes in a visual environment. The actual grounding can connect language to the environment through multiple modalities, e.g. “stop at the door” might ground into visual objects, while “turn right” might rely only on the geometric structure of a route. We investigate where the natural language empirically grounds under two recent state-of-the-art VLN models. Surprisingly, we discover that visual features may actually hurt these models: models which only use route structure, ablating visual features, outperform their visual counterparts in unseen new environments on the benchmark Room-to-Room dataset. To better use all the available modalities, we propose to decompose the grounding procedure into a set of expert models with access to different modalities (including object detections) and ensemble them at prediction time, improving the performance of state-of-the-art models on the VLN task.

2018

What’s Going On in Neural Constituency Parsers? An Analysis
David Gaddy | Mitchell Stern | Dan Klein
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

A number of differences have emerged between modern and classic approaches to constituency parsing in recent years, with structural components like grammars and feature-rich lexicons becoming less central while recurrent neural network representations rise in popularity. The goal of this work is to analyze the extent to which information provided directly by the model structure in classical systems is still being captured by neural methods. To this end, we propose a high-performance neural model (92.08 F1 on PTB) that is representative of recent work and perform a series of investigative experiments. We find that our model implicitly learns to encode much of the same information that was explicitly provided by grammars and lexicons in the past, indicating that this scaffolding can largely be subsumed by powerful general-purpose neural machinery.

Unified Pragmatic Models for Generating and Following Instructions
Daniel Fried | Jacob Andreas | Dan Klein
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

We show that explicit pragmatic inference aids in correctly generating and following natural language instructions for complex, sequential tasks. Our pragmatics-enabled models reason about why speakers produce certain instructions, and about how listeners will react upon hearing them. Like previous pragmatic models, we use learned base listener and speaker models to build a pragmatic speaker that uses the base listener to simulate the interpretation of candidate descriptions, and a pragmatic listener that reasons counterfactually about alternative descriptions. We extend these models to tasks with sequential structure. Evaluation of language generation and interpretation shows that pragmatic inference improves state-of-the-art listener models (at correctly interpreting human instructions) and speaker models (at producing instructions correctly interpreted by humans) in diverse settings.
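The pragmatic speaker described above can be illustrated with a minimal sketch: candidates from a base speaker are rescored by how likely a base listener is to recover the intended target from each one. The function name `pragmatic_speaker`, the interpolation weight `lam`, and the toy probability tables are assumptions made for illustration; this is not the authors' implementation.

```python
import math
from typing import Callable, Sequence


def pragmatic_speaker(
    candidates: Sequence[str],
    speaker_logprob: Callable[[str], float],
    listener_logprob: Callable[[str], float],
    lam: float = 0.5,
) -> str:
    """Pick the candidate with the best weighted sum of listener and speaker scores.

    listener_logprob(c): log-probability that the base listener recovers the
    intended target from candidate c (hypothetical interface).
    speaker_logprob(c): log-probability of c under the base speaker.
    lam: assumed interpolation weight; lam=0 falls back to the base speaker.
    """
    return max(
        candidates,
        key=lambda c: lam * listener_logprob(c) + (1.0 - lam) * speaker_logprob(c),
    )


# Toy numbers, invented for illustration only.
SPEAKER_P = {"go left": 0.6, "go left past the chair": 0.4}
LISTENER_P = {"go left": 0.3, "go left past the chair": 0.9}

choice = pragmatic_speaker(
    list(SPEAKER_P),
    speaker_logprob=lambda c: math.log(SPEAKER_P[c]),
    listener_logprob=lambda c: math.log(LISTENER_P[c]),
    lam=0.5,
)
```

In this toy setting the base speaker alone prefers the shorter instruction, while the rescored choice favors the instruction the listener is more likely to interpret correctly.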

Learning with Latent Language
Jacob Andreas | Dan Klein | Sergey Levine
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

The named concepts and compositional operators present in natural language provide a rich source of information about the abstractions humans use to navigate the world. Can this linguistic background knowledge improve the generality and efficiency of learned classifiers and control policies? This paper aims to show that using the space of natural language strings as a parameter space is an effective way to capture natural task structure. In a pretraining phase, we learn a language interpretation model that transforms inputs (e.g. images) into outputs (e.g. labels) given natural language descriptions. To learn a new concept (e.g. a classifier), we search directly in the space of descriptions to minimize the interpreter’s loss on training examples. Crucially, our models do not require language data to learn these concepts: language is used only in pretraining to impose structure on subsequent learning. Results on image classification, text editing, and reinforcement learning show that, in all settings, models with a linguistic parameterization outperform those without.

Constituency Parsing with a Self-Attentive Encoder
Nikita Kitaev | Dan Klein
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We demonstrate that replacing an LSTM encoder with a self-attentive architecture can lead to improvements to a state-of-the-art discriminative constituency parser. The use of attention makes explicit the manner in which information is propagated between different locations in the sentence, which we use to both analyze our model and propose potential improvements. For example, we find that separating positional and content information in the encoder can lead to improved parsing accuracy. Additionally, we evaluate different approaches for lexical representation. Our parser achieves new state-of-the-art results for single models trained on the Penn Treebank: 93.55 F1 without the use of any external data, and 95.13 F1 when using pre-trained word representations. Our parser also outperforms the previous best-published accuracy figures on 8 of the 9 languages in the SPMRL dataset.

Policy Gradient as a Proxy for Dynamic Oracles in Constituency Parsing
Daniel Fried | Dan Klein
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Dynamic oracles provide strong supervision for training constituency parsers with exploration, but must be custom defined for a given parser’s transition system. We explore using a policy gradient method as a parser-agnostic alternative. In addition to directly optimizing for a tree-level metric such as F1, policy gradient has the potential to reduce exposure bias by allowing exploration during training; moreover, it does not require a dynamic oracle for supervision. On four constituency parsers in three languages, the method substantially outperforms static oracle likelihood training in almost all settings. For parsers where a dynamic oracle is available (including a novel oracle which we define for the transition system of Dyer et al., 2016), policy gradient typically recaptures a substantial fraction of the performance gain afforded by the dynamic oracle.

2017

Translating Neuralese
Jacob Andreas | Anca Dragan | Dan Klein
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Several approaches have recently been proposed for learning decentralized deep multiagent policies that coordinate via a differentiable communication channel. While these policies are effective for many tasks, interpretation of their induced communication strategies has remained a challenge. Here we propose to interpret agents’ messages by translating them. Unlike in typical machine translation problems, we have no parallel data to learn from. Instead we develop a translation model based on the insight that agent messages and natural language strings mean the same thing if they induce the same belief about the world in a listener. We present theoretical guarantees and empirical evidence that our approach preserves both the semantics and pragmatics of messages by ensuring that players communicating through a translation layer do not suffer a substantial loss in reward relative to players with a common language.

A Minimal Span-Based Neural Constituency Parser
Mitchell Stern | Jacob Andreas | Dan Klein
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In this work, we present a minimal neural model for constituency parsing based on independent scoring of labels and spans. We show that this model is not only compatible with classical dynamic programming techniques, but also admits a novel greedy top-down inference algorithm based on recursive partitioning of the input. We demonstrate empirically that both prediction schemes are competitive with recent work, and when combined with basic extensions to the scoring model are capable of achieving state-of-the-art single-model performance on the Penn Treebank (91.79 F1) and strong performance on the French Treebank (82.23 F1).
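As a rough illustration of greedy top-down inference by recursive partitioning, the sketch below labels a span, picks the split point whose two child spans score highest, and recurses. The scoring functions stand in for the neural scorers and are hypothetical toys; the simplified objective (sum of child span scores) is an assumption and may differ from the paper's exact formulation.

```python
from typing import Callable


def greedy_top_down(
    i: int,
    j: int,
    best_label: Callable[[int, int], str],
    span_score: Callable[[int, int], float],
):
    """Greedily parse the words between fenceposts i and j.

    Returns (label, i, j) for a single word, otherwise (label, left, right).
    """
    label = best_label(i, j)  # label chosen independently of the split
    if j - i == 1:
        return (label, i, j)
    # Choose the split k that maximizes the summed scores of the two children.
    k = max(range(i + 1, j), key=lambda s: span_score(i, s) + span_score(s, j))
    left = greedy_top_down(i, k, best_label, span_score)
    right = greedy_top_down(k, j, best_label, span_score)
    return (label, left, right)


# Toy scorers, invented for illustration: spans (0, 1) and (1, 3) are "good".
GOOD_SPANS = {(0, 1), (1, 3)}


def toy_span_score(i: int, j: int) -> float:
    return 1.0 if (i, j) in GOOD_SPANS else 0.0


def toy_best_label(i: int, j: int) -> str:
    return "S" if (i, j) == (0, 3) else "X"


tree = greedy_top_down(0, 3, toy_best_label, toy_span_score)
```

Each call makes one labeling decision and one split decision, so the procedure never revisits a choice, in contrast to chart-based dynamic programming over the same scores.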

Abstract Syntax Networks for Code Generation and Semantic Parsing
Maxim Rabinovich | Mitchell Stern | Dan Klein
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Tasks like code generation and semantic parsing require mapping unstructured (or partially structured) inputs to well-formed, executable outputs. We introduce abstract syntax networks, a modeling framework for these problems. The outputs are represented as abstract syntax trees (ASTs) and constructed by a decoder with a dynamically-determined modular structure paralleling the structure of the output tree. On the benchmark Hearthstone dataset for code generation, our model obtains 79.2 BLEU and 22.7% exact match accuracy, compared to previous state-of-the-art values of 67.1 and 6.1%. Furthermore, we perform competitively on the Atis, Jobs, and Geo semantic parsing datasets with no task-specific engineering.

Improving Neural Parsing by Disentangling Model Combination and Reranking Effects
Daniel Fried | Mitchell Stern | Dan Klein
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Recent work has proposed several generative neural models for constituency parsing that achieve state-of-the-art results. Since direct search in these generative models is difficult, they have primarily been used to rescore candidate outputs from base parsers in which decoding is more straightforward. We first present an algorithm for direct search in these generative models. We then demonstrate that the rescoring results are at least partly due to implicit model combination rather than reranking effects. Finally, we show that explicit model combination can improve performance even further, resulting in new state-of-the-art numbers on the PTB of 94.25 F1 when training only on gold data and 94.66 F1 when using external data.

Fine-Grained Entity Typing with High-Multiplicity Assignments
Maxim Rabinovich | Dan Klein
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

As entity type systems become richer and more fine-grained, we expect the number of types assigned to a given entity to increase. However, most fine-grained typing work has focused on datasets that exhibit a low degree of type multiplicity. In this paper, we consider the high-multiplicity regime inherent in data sources such as Wikipedia that have semi-open type systems. We introduce a set-prediction approach to this problem and show that our model outperforms unstructured baselines on a new Wikipedia-based fine-grained typing corpus.

Parsing with Traces: An O(n^4) Algorithm and a Structural Representation
Jonathan K. Kummerfeld | Dan Klein
Transactions of the Association for Computational Linguistics, Volume 5

General treebank analyses are graph structured, but parsers are typically restricted to tree structures for efficiency and modeling reasons. We propose a new representation and algorithm for a class of graph structures that is flexible enough to cover almost all treebank structures, while still admitting efficient learning and inference. In particular, we consider directed, acyclic, one-endpoint-crossing graph structures, which cover most long-distance dislocation, shared argumentation, and similar tree-violating linguistic phenomena. We describe how to convert phrase structure parses, including traces, to our new representation in a reversible manner. Our dynamic program uniquely decomposes structures, is sound and complete, and covers 97.3% of the Penn English Treebank. We also implement a proof-of-concept parser that recovers a range of null elements and trace types.

Where is Misty? Interpreting Spatial Descriptors by Modeling Regions in Space
Nikita Kitaev | Dan Klein
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We present a model for locating regions in space based on natural language descriptions. Starting with a 3D scene and a sentence, our model is able to associate words in the sentence with regions in the scene, interpret relations such as ‘on top of’ or ‘next to,’ and finally locate the region described in the sentence. All components form a single neural network that is trained end-to-end without prior knowledge of object segmentation. To evaluate our model, we construct and release a new dataset consisting of Minecraft scenes with crowdsourced natural language descriptions. We achieve a 32% relative error reduction compared to a strong neural baseline.

Effective Inference for Generative Neural Parsing
Mitchell Stern | Daniel Fried | Dan Klein
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Generative neural models have recently achieved state-of-the-art results for constituency parsing. However, without a feasible search procedure, their use has so far been limited to reranking the output of external parsers in which decoding is more tractable. We describe an alternative to the conventional action-level beam search used for discriminative neural models that enables us to decode directly in these generative models. We then show that by improving our basic candidate selection strategy and using a coarse pruning function, we can improve accuracy while exploring significantly less of the search space. Applied to the model of Choe and Charniak (2016), our inference procedure obtains 92.56 F1 on section 23 of the Penn Treebank, surpassing prior state-of-the-art results for single-model systems.

Analogs of Linguistic Structure in Deep Representations
Jacob Andreas | Dan Klein
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We investigate the compositional structure of message vectors computed by a deep network trained on a communication game. By comparing truth-conditional representations of encoder-produced message vectors to human-produced referring expressions, we are able to identify aligned (vector, utterance) pairs with the same meaning. We then search for structured relationships among these aligned pairs to discover simple vector space transformations corresponding to negation, conjunction, and disjunction. Our results suggest that neural representations are capable of spontaneously developing a “syntax” with functional analogues to qualitative properties of natural language.

2016

Learning-Based Single-Document Summarization with Compression and Anaphoricity Constraints
Greg Durrett | Taylor Berg-Kirkpatrick | Dan Klein
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Capturing Semantic Similarity for Entity Linking with Convolutional Neural Networks
Matthew Francis-Landau | Greg Durrett | Dan Klein
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Learning to Compose Neural Networks for Question Answering
Jacob Andreas | Marcus Rohrbach | Trevor Darrell | Dan Klein
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Reasoning about Pragmatics with Neural Listeners and Speakers
Jacob Andreas | Dan Klein
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

2015

Neural CRF Parsing
Greg Durrett | Dan Klein
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

When and why are log-linear models self-normalizing?
Jacob Andreas | Dan Klein
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Disfluency Detection with a Semi-Markov Model and Prosodic Features
James Ferguson | Greg Durrett | Dan Klein
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Unsupervised Code-Switching for Multilingual Historical Document Transcription
Dan Garrette | Hannah Alpert-Abrams | Taylor Berg-Kirkpatrick | Dan Klein
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

GPU-Friendly Local Regression for Voice Conversion
Taylor Berg-Kirkpatrick | Dan Klein
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

An Empirical Analysis of Optimization for Max-Margin NLP
Jonathan K. Kummerfeld | Taylor Berg-Kirkpatrick | Dan Klein
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Alignment-Based Compositional Semantics for Instruction Following
Jacob Andreas | Dan Klein
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

2014

Sparser, Better, Faster GPU Parsing
David Hall | Taylor Berg-Kirkpatrick | Dan Klein
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Less Grammar, More Features
David Hall | Greg Durrett | Dan Klein
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Structured Learning for Taxonomy Induction with Belief Propagation
Mohit Bansal | David Burkett | Gerard de Melo | Dan Klein
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Improved Typesetting Models for Historical OCR
Taylor Berg-Kirkpatrick | Dan Klein
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

How much do word embeddings encode about syntax?
Jacob Andreas | Dan Klein
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Grounding Language with Points and Paths in Continuous Spaces
Jacob Andreas | Dan Klein
Proceedings of the Eighteenth Conference on Computational Natural Language Learning

A Joint Model for Entity Analysis: Coreference, Typing, and Linking
Greg Durrett | Dan Klein
Transactions of the Association for Computational Linguistics, Volume 2

We present a joint model of three core tasks in the entity analysis stack: coreference resolution (within-document clustering), named entity recognition (coarse semantic typing), and entity linking (matching to Wikipedia entities). Our model is formally a structured conditional random field. Unary factors encode local features from strong baselines for each task. We then add binary and ternary factors to capture cross-task interactions, such as the constraint that coreferent mentions have the same semantic type. On the ACE 2005 and OntoNotes datasets, we achieve state-of-the-art results for all three tasks. Moreover, joint modeling improves performance on each task over strong independent baselines.

2013

Decentralized Entity-Level Modeling for Coreference Resolution
Greg Durrett | David Hall | Dan Klein
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Unsupervised Transcription of Historical Documents
Taylor Berg-Kirkpatrick | Greg Durrett | Dan Klein
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

An Empirical Examination of Challenges in Chinese Parsing
Jonathan K. Kummerfeld | Daniel Tse | James R. Curran | Dan Klein
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Variational Inference for Structured NLP Models
David Burkett | Dan Klein
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Tutorials)

Learning Dependency-Based Compositional Semantics
Percy Liang | Michael I. Jordan | Dan Klein
Computational Linguistics, Volume 39, Issue 2 - June 2013

Error-Driven Analysis of Challenges in Coreference Resolution
Jonathan K. Kummerfeld | Dan Klein
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Decipherment with a Million Random Restarts
Taylor Berg-Kirkpatrick | Dan Klein
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

A Multi-Teraflop Constituency Parser using GPUs
John Canny | David Hall | Dan Klein
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Easy Victories and Uphill Battles in Coreference Resolution
Greg Durrett | Dan Klein
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2012

Coreference Semantics from Web Features
Mohit Bansal | Dan Klein
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Large-Scale Syntactic Language Modeling with Treelets
Adam Pauls | Dan Klein
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Robust Conversion of CCG Derivations to Phrase Structure Trees
Jonathan K. Kummerfeld | Dan Klein | James R. Curran
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Fast Inference in Phrase Extraction Models with Belief Propagation
David Burkett | Dan Klein
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Variational Inference for Structured NLP Models
David Burkett | Dan Klein
Tutorial Abstracts at the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Syntactic Transfer Using a Bilingual Lexicon
Greg Durrett | Adam Pauls | Dan Klein
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Transforming Trees to Improve Syntactic Convergence
David Burkett | Dan Klein
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

An Empirical Investigation of Statistical Significance in NLP
Taylor Berg-Kirkpatrick | David Burkett | Dan Klein
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Parser Showdown at the Wall Street Corral: An Empirical Investigation of Error Types in Parser Output
Jonathan K. Kummerfeld | David Hall | James R. Curran | Dan Klein
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Training Factored PCFGs with Expectation Propagation
David Hall | Dan Klein
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2011

Faster and Smaller N-Gram Language Models
Adam Pauls | Dan Klein
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Jointly Learning to Extract and Compress
Taylor Berg-Kirkpatrick | Dan Gillick | Dan Klein
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Learning Dependency-Based Compositional Semantics
Percy Liang | Michael Jordan | Dan Klein
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Web-Scale Features for Full-Scale Parsing
Mohit Bansal | Dan Klein
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

An Empirical Investigation of Discounting in Cross-Domain Language Models
Greg Durrett | Dan Klein
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

The Surprising Variance in Shortest-Derivation Parsing
Mohit Bansal | Dan Klein
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Mention Detection: Heuristics for the OntoNotes annotations
Jonathan K. Kummerfeld | Mohit Bansal | David Burkett | Dan Klein
Proceedings of the Fifteenth Conference on Computational Natural Language Learning: Shared Task

Simple Effective Decipherment via Combinatorial Optimization
Taylor Berg-Kirkpatrick | Dan Klein
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

pdf bib
Large-Scale Cognate Recovery
David Hall | Dan Klein
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

2010

pdf bib
Finding Cognate Groups Using Phylogenies
David Hall | Dan Klein
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

pdf bib
Simple, Accurate Parsing with an All-Fragments Grammar
Mohit Bansal | Dan Klein
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

pdf bib
Phylogenetic Grammar Induction
Taylor Berg-Kirkpatrick | Dan Klein
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

pdf bib
Discriminative Modeling of Extraction Sets for Machine Translation
John DeNero | Dan Klein
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

pdf bib
Top-Down K-Best A* Parsing
Adam Pauls | Dan Klein | Chris Quirk
Proceedings of the ACL 2010 Conference Short Papers

pdf bib
An Entity-Level Approach to Information Extraction
Aria Haghighi | Dan Klein
Proceedings of the ACL 2010 Conference Short Papers

pdf bib
Hierarchical A* Parsing with Bridge Outside Scores
Adam Pauls | Dan Klein
Proceedings of the ACL 2010 Conference Short Papers

pdf bib
Learning Better Monolingual Models with Unannotated Bilingual Text
David Burkett | Slav Petrov | John Blitzer | Dan Klein
Proceedings of the Fourteenth Conference on Computational Natural Language Learning

pdf bib
Unsupervised Syntactic Alignment with Inversion Transduction Grammars
Adam Pauls | Dan Klein | David Chiang | Kevin Knight
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Joint Parsing and Alignment with Weakly Synchronized Grammars
David Burkett | John Blitzer | Dan Klein
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Coreference Resolution in a Modular, Entity-Centered Model
Aria Haghighi | Dan Klein
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Type-Based MCMC
Percy Liang | Michael I. Jordan | Dan Klein
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Painless Unsupervised Learning with Features
Taylor Berg-Kirkpatrick | Alexandre Bouchard-Côté | John DeNero | Dan Klein
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
A Game-Theoretic Approach to Generating Spatial Descriptions
Dave Golland | Percy Liang | Dan Klein
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

pdf bib
A Simple Domain-Independent Probabilistic Approach to Generation
Gabor Angeli | Percy Liang | Dan Klein
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

2009

pdf bib
Simple Coreference Resolution with Rich Syntactic and Semantic Features
Aria Haghighi | Dan Klein
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf bib
Consensus Training for Consensus Decoding in Machine Translation
Adam Pauls | John DeNero | Dan Klein
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf bib
Improved Reconstruction of Protolanguage Word Forms
Alexandre Bouchard-Côté | Thomas L. Griffiths | Dan Klein
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Efficient Parsing for Transducer Grammars
John DeNero | Mohit Bansal | Adam Pauls | Dan Klein
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Hierarchical Search for Parsing
Adam Pauls | Dan Klein
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Online EM for Unsupervised Models
Percy Liang | Dan Klein
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Learning Semantic Correspondences with Less Supervision
Percy Liang | Michael Jordan | Dan Klein
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf bib
Better Word Alignments with Supervised ITG Models
Aria Haghighi | John Blitzer | John DeNero | Dan Klein
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf bib
K-Best A* Parsing
Adam Pauls | Dan Klein
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf bib
Asynchronous Binarization for Synchronous Grammars
John DeNero | Adam Pauls | Dan Klein
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

2008

pdf bib
Coarse-to-Fine Syntactic Machine Translation using Language Projections
Slav Petrov | Aria Haghighi | Dan Klein
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

pdf bib
Sampling Alignment Structure under a Bayesian Translation Model
John DeNero | Alexandre Bouchard-Côté | Dan Klein
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

pdf bib
Sparse Multi-Scale Grammars for Discriminative Latent Variable Parsing
Slav Petrov | Dan Klein
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

pdf bib
Two Languages are Better than One (for Syntactic Parsing)
David Burkett | Dan Klein
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

pdf bib
Parsing German with Latent Variable Grammars
Slav Petrov | Dan Klein
Proceedings of the Workshop on Parsing German

pdf bib
Learning Bilingual Lexicons from Monolingual Corpora
Aria Haghighi | Percy Liang | Taylor Berg-Kirkpatrick | Dan Klein
Proceedings of ACL-08: HLT

pdf bib
Analyzing the Errors of Unsupervised Learning
Percy Liang | Dan Klein
Proceedings of ACL-08: HLT

pdf bib
The Complexity of Phrase Alignment Problems
John DeNero | Dan Klein
Proceedings of ACL-08: HLT, Short Papers

2007

pdf bib
Tailoring Word Alignments to Syntactic Machine Translation
John DeNero | Dan Klein
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

pdf bib
Unsupervised Coreference Resolution in a Nonparametric Bayesian Model
Aria Haghighi | Dan Klein
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

pdf bib
Improved Inference for Unlexicalized Parsing
Slav Petrov | Dan Klein
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

pdf bib
Approximate Factoring for A* Search
Aria Haghighi | John DeNero | Dan Klein
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

pdf bib
Introduction to Classification: Likelihoods, Margins, Features, and Kernels
Dan Klein
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Tutorial Abstracts

pdf bib
The Infinite PCFG Using Hierarchical Dirichlet Processes
Percy Liang | Slav Petrov | Michael Jordan | Dan Klein
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

pdf bib
A Probabilistic Approach to Diachronic Phonology
Alexandre Bouchard | Percy Liang | Thomas Griffiths | Dan Klein
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

pdf bib
Learning Structured Models for Phone Recognition
Slav Petrov | Adam Pauls | Dan Klein
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

pdf bib
Learning Accurate, Compact, and Interpretable Tree Annotation
Slav Petrov | Leon Barrett | Romain Thibaux | Dan Klein
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
An End-to-End Discriminative Approach to Machine Translation
Percy Liang | Alexandre Bouchard-Côté | Dan Klein | Ben Taskar
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
Prototype-Driven Grammar Induction
Aria Haghighi | Dan Klein
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
Alignment by Agreement
Percy Liang | Ben Taskar | Dan Klein
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

pdf bib
Word Alignment via Quadratic Assignment
Simon Lacoste-Julien | Ben Taskar | Dan Klein | Michael I. Jordan
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

pdf bib
Prototype-Driven Learning for Sequence Models
Aria Haghighi | Dan Klein
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

pdf bib
Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X)
Lluís Màrquez | Dan Klein
Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X)

pdf bib
Non-Local Modeling with a Mixture of PCFGs
Slav Petrov | Leon Barrett | Dan Klein
Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X)

pdf bib
Why Generative Phrase Models Underperform Surface Heuristics
John DeNero | Dan Gillick | James Zhang | Dan Klein
Proceedings of the Workshop on Statistical Machine Translation

2005

pdf bib
Unsupervised Learning of Field Segmentation Models for Information Extraction
Trond Grenager | Dan Klein | Christopher Manning
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

pdf bib
A Discriminative Matching Approach to Word Alignment
Ben Taskar | Simon Lacoste-Julien | Dan Klein
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

pdf bib
A Core-Tools Statistical NLP Course
Dan Klein
Proceedings of the Second ACL Workshop on Effective Tools and Methodologies for Teaching NLP and CL

2004

pdf bib
Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency
Dan Klein | Christopher Manning
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

pdf bib
Max-Margin Parsing
Ben Taskar | Dan Klein | Mike Collins | Daphne Koller | Christopher Manning
Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing

2003

pdf bib
Accurate Unlexicalized Parsing
Dan Klein | Christopher D. Manning
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

pdf bib
Named Entity Recognition with Character-Level Models
Dan Klein | Joseph Smarr | Huy Nguyen | Christopher D. Manning
Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003

pdf bib
A* Parsing: Fast Exact Viterbi Parse Selection
Dan Klein | Christopher D. Manning
Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network
Kristina Toutanova | Dan Klein | Christopher D. Manning | Yoram Singer
Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Optimization, Maxent Models, and Conditional Estimation without Magic
Christopher Manning | Dan Klein
Companion Volume of the Proceedings of HLT-NAACL 2003 - Tutorial Abstracts

2002

pdf bib
A Generative Constituent-Context Model for Improved Grammar Induction
Dan Klein | Christopher D. Manning
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics

pdf bib
Combining Heterogeneous Classifiers for Word Sense Disambiguation
Dan Klein | Kristina Toutanova | H. Tolga Ilhan | Sepandar D. Kamvar | Christopher D. Manning
Proceedings of the ACL-02 Workshop on Word Sense Disambiguation: Recent Successes and Future Directions

pdf bib
Conditional Structure versus Conditional Estimation in NLP Models
Dan Klein | Christopher D. Manning
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)

2001

pdf bib
Parsing with Treebank Grammars: Empirical Bounds, Theoretical Models, and the Structure of the Penn Treebank
Dan Klein | Christopher D. Manning
Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics

pdf bib
Distributional Phrase Structure Induction
Dan Klein | Christopher D. Manning
Proceedings of the ACL 2001 Workshop on Computational Natural Language Learning (CoNLL)

pdf bib
Parsing and Hypergraphs
Dan Klein | Christopher D. Manning
Proceedings of the Seventh International Workshop on Parsing Technologies

pdf bib
Combining Heterogeneous Classifiers for Word-Sense Disambiguation
H. Tolga Ilhan | Sepandar D. Kamvar | Dan Klein | Christopher D. Manning | Kristina Toutanova
Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems