Christopher D. Manning

Also published as: Chris Manning, Christopher Manning


2019

Do Massively Pretrained Language Models Make Better Storytellers?
Abigail See | Aneesh Pappu | Rohun Saxena | Akhila Yerukola | Christopher D. Manning
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Large neural language models trained on massive amounts of text have emerged as a formidable strategy for Natural Language Understanding tasks. However, the strength of these models as Natural Language Generators is less clear. Though anecdotal evidence suggests that these models generate better quality text, there has been no detailed study characterizing their generation abilities. In this work, we compare the performance of an extensively pretrained model, OpenAI GPT2-117 (Radford et al., 2019), to a state-of-the-art neural story generation model (Fan et al., 2018). By evaluating the generated text across a wide variety of automatic metrics, we characterize the ways in which pretrained models do, and do not, make better storytellers. We find that although GPT2-117 conditions more strongly on context, is more sensitive to ordering of events, and uses more unusual words, it is just as likely to produce repetitive and under-diverse text when using likelihood-maximizing decoding algorithms.

BAM! Born-Again Multi-Task Networks for Natural Language Understanding
Kevin Clark | Minh-Thang Luong | Urvashi Khandelwal | Christopher D. Manning | Quoc V. Le
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

It can be challenging to train multi-task neural networks that outperform or even match their single-task counterparts. To help address this, we propose using knowledge distillation where single-task models teach a multi-task model. We enhance this training with teacher annealing, a novel method that gradually transitions the model from distillation to supervised learning, helping the multi-task model surpass its single-task teachers. We evaluate our approach by multi-task fine-tuning BERT on the GLUE benchmark. Our method consistently improves over standard single-task and multi-task training.
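
The transition from distillation to supervised learning described above can be sketched as an interpolated training target. This is a minimal illustration, not the authors' implementation: the linear schedule, the function name `annealed_target`, and the toy distributions are all assumptions made for the example.

```python
def annealed_target(gold, teacher_probs, step, total_steps):
    """Mix the gold one-hot label with the teacher's predicted distribution.

    lam rises from 0 (pure distillation) to 1 (pure supervised learning).
    The linear schedule is an assumption of this sketch.
    """
    lam = step / total_steps
    return [lam * g + (1.0 - lam) * t for g, t in zip(gold, teacher_probs)]


gold = [0.0, 1.0, 0.0]       # one-hot gold label (toy example)
teacher = [0.2, 0.5, 0.3]    # hypothetical single-task teacher output

start = annealed_target(gold, teacher, step=0, total_steps=100)
middle = annealed_target(gold, teacher, step=50, total_steps=100)
end = annealed_target(gold, teacher, step=100, total_steps=100)
```

At step 0 the student is trained only on the teacher's distribution; by the final step it is trained only on the gold label, so it is no longer tied to the teacher's mistakes.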

Answering Complex Open-domain Questions Through Iterative Query Generation
Peng Qi | Xiaowen Lin | Leo Mehr | Zijian Wang | Christopher D. Manning
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

It is challenging for current one-step retrieve-and-read question answering (QA) systems to answer questions like “Which novel by the author of ‘Armada’ will be adapted as a feature film by Steven Spielberg?” because the question seldom contains retrievable clues about the missing entity (here, the author). Answering such a question requires multi-hop reasoning where one must gather information about the missing entity (or facts) to proceed with further reasoning. We present GoldEn (Gold Entity) Retriever, which iterates between reading context and retrieving more supporting documents to answer open-domain multi-hop questions. Instead of using opaque and computationally expensive neural retrieval models, GoldEn Retriever generates natural language search queries given the question and available context, and leverages off-the-shelf information retrieval systems to query for missing entities. This allows GoldEn Retriever to scale up efficiently for open-domain multi-hop reasoning while maintaining interpretability. We evaluate GoldEn Retriever on the recently proposed open-domain multi-hop QA dataset, HotpotQA, and demonstrate that it outperforms the best previously published model despite not using pretrained language models such as BERT.

What Does BERT Look at? An Analysis of BERT’s Attention
Kevin Clark | Urvashi Khandelwal | Omer Levy | Christopher D. Manning
Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

Large pre-trained neural networks such as BERT have had great recent success in NLP, motivating a growing body of research investigating what aspects of language they are able to learn from unlabeled data. Most recent analysis has focused on model outputs (e.g., language model surprisal) or internal vector representations (e.g., probing classifiers). Complementary to these works, we propose methods for analyzing the attention mechanisms of pre-trained models and apply them to BERT. BERT’s attention heads exhibit patterns such as attending to delimiter tokens, specific positional offsets, or broadly attending over the whole sentence, with heads in the same layer often exhibiting similar behaviors. We further show that certain attention heads correspond well to linguistic notions of syntax and coreference. For example, we find heads that attend to the direct objects of verbs, determiners of nouns, objects of prepositions, and coreferent mentions with remarkably high accuracy. Lastly, we propose an attention-based probing classifier and use it to further demonstrate that substantial syntactic information is captured in BERT’s attention.

A Structural Probe for Finding Syntax in Word Representations
John Hewitt | Christopher D. Manning
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Recent work has improved our ability to detect linguistic knowledge in word representations. However, current methods for detecting syntactic knowledge do not test whether syntax trees are represented in their entirety. In this work, we propose a structural probe, which evaluates whether syntax trees are embedded in a linear transformation of a neural network’s word representation space. The probe identifies a linear transformation under which squared L2 distance encodes the distance between words in the parse tree, and one in which squared L2 norm encodes depth in the parse tree. Using our probe, we show that such transformations exist for both ELMo and BERT but not in baselines, providing evidence that entire syntax trees are embedded implicitly in deep models’ vector geometry.
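
The two quantities the probe measures can be written down directly. The sketch below only evaluates them for a given matrix; it does not learn the transformation. The matrix `B`, the toy vectors, and the function names are hypothetical, and the abstract indicates that distance and depth each use their own transformation, so in practice two separate matrices would be passed in.

```python
def matvec(B, h):
    """Multiply matrix B (list of rows) by vector h."""
    return [sum(b * x for b, x in zip(row, h)) for row in B]


def squared_norm(v):
    return sum(x * x for x in v)


def probe_distance(B, h_i, h_j):
    """Squared L2 distance between two word vectors after applying B."""
    diff = [a - b for a, b in zip(h_i, h_j)]
    return squared_norm(matvec(B, diff))


def probe_depth(B, h):
    """Squared L2 norm of a word vector after applying B."""
    return squared_norm(matvec(B, h))


# Hypothetical 2x3 transformation and toy word representations.
B = [[1.0, 0.0, 0.0],
     [0.0, 2.0, 0.0]]
h_a = [1.0, 1.0, 5.0]
h_b = [0.0, 0.0, 9.0]

distance = probe_distance(B, h_a, h_b)
depth = probe_depth(B, h_a)
```

A probe of this kind would be trained so that `probe_distance` approximates the number of edges between two words in the parse tree and `probe_depth` approximates a word's depth in it.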

CoQA: A Conversational Question Answering Challenge
Siva Reddy | Danqi Chen | Christopher D. Manning
Transactions of the Association for Computational Linguistics, Volume 7

Humans gather information through conversations involving a series of interconnected questions and answers. For machines to assist in information gathering, it is therefore essential to enable them to answer conversational questions. We introduce CoQA, a novel dataset for building Conversational Question Answering systems. Our dataset contains 127k questions with answers, obtained from 8k conversations about text passages from seven diverse domains. The questions are conversational, and the answers are free-form text with their corresponding evidence highlighted in the passage. We analyze CoQA in depth and show that conversational questions have challenging phenomena not present in existing reading comprehension datasets (e.g., coreference and pragmatic reasoning). We evaluate strong dialogue and reading comprehension models on CoQA. The best system obtains an F1 score of 65.4%, which is 23.4 points behind human performance (88.8%), indicating that there is ample room for improvement. We present CoQA as a challenge to the community at https://stanfordnlp.github.io/coqa.

2018

Textual Analogy Parsing: What’s Shared and What’s Compared among Analogous Facts
Matthew Lamm | Arun Chaganty | Christopher D. Manning | Dan Jurafsky | Percy Liang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

To understand a sentence like “whereas only 10% of White Americans live at or below the poverty line, 28% of African Americans do” it is important not only to identify individual facts, e.g., poverty rates of distinct demographic groups, but also the higher-order relations between them, e.g., the disparity between them. In this paper, we propose the task of Textual Analogy Parsing (TAP) to model this higher-order meaning. Given a sentence such as the one above, TAP outputs a frame-style meaning representation which explicitly specifies what is shared (e.g., poverty rates) and what is compared (e.g., White Americans vs. African Americans, 10% vs. 28%) between its component facts. Such a meaning representation can enable new applications that rely on discourse understanding such as automated chart generation from quantitative text. We present a new dataset for TAP, baselines, and a model that successfully uses an ILP to enforce the structural constraints of the problem.

Semi-Supervised Sequence Modeling with Cross-View Training
Kevin Clark | Minh-Thang Luong | Christopher D. Manning | Quoc Le
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Unsupervised representation learning algorithms such as word2vec and ELMo improve the accuracy of many supervised NLP models, mainly because they can take advantage of large amounts of unlabeled text. However, the supervised models only learn from task-specific labeled data during the main training phase. We therefore propose Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data. On labeled examples, standard supervised learning is used. On unlabeled examples, CVT teaches auxiliary prediction modules that see restricted views of the input (e.g., only part of a sentence) to match the predictions of the full model seeing the whole input. Since the auxiliary modules and the full model share intermediate representations, this in turn improves the full model. Moreover, we show that CVT is particularly effective when combined with multi-task learning. We evaluate CVT on five sequence tagging tasks, machine translation, and dependency parsing, achieving state-of-the-art results.

Graph Convolution over Pruned Dependency Trees Improves Relation Extraction
Yuhao Zhang | Peng Qi | Christopher D. Manning
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Dependency trees help relation extraction models capture long-range relations between words. However, existing dependency-based models either neglect crucial information (e.g., negation) by pruning the dependency trees too aggressively, or are computationally inefficient because it is difficult to parallelize over different tree structures. We propose an extension of graph convolutional networks that is tailored for relation extraction, which pools information over arbitrary dependency structures efficiently in parallel. To incorporate relevant information while maximally removing irrelevant content, we further apply a novel pruning strategy to the input trees by keeping words immediately around the shortest path between the two entities among which a relation might hold. The resulting model achieves state-of-the-art performance on the large-scale TACRED dataset, outperforming existing sequence and dependency-based neural models. We also show through detailed analysis that this model has complementary strengths to sequence models, and combining them further improves the state of the art.
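
The pruning idea above, keeping the shortest path between the two entities plus the words immediately around it, can be sketched on a toy tree. This is an illustration under simplifying assumptions: each entity is a single token, the tree is given as a list of head indices, and the distance parameter `k`, the function name `prune_tree`, and the example sentence are all hypothetical.

```python
from collections import deque


def prune_tree(heads, subj, obj, k=1):
    """Return sorted indices of tokens within k edges of the subj-obj path.

    heads[i] is the index of token i's head, or -1 for the root.
    """
    adj = [[] for _ in heads]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child].append(head)
            adj[head].append(child)

    # Breadth-first search from subj to recover the unique tree path to obj.
    parent = {subj: None}
    queue = deque([subj])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    path = []
    node = obj
    while node is not None:
        path.append(node)
        node = parent[node]

    # Multi-source search: expand at most k edges away from the path.
    dist = {p: 0 for p in path}
    queue = deque(path)
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return sorted(dist)


tokens = ["Alice", "did", "not", "marry", "Bob", "in", "the", "old", "chapel"]
heads = [3, 3, 3, -1, 3, 8, 8, 8, 3]   # hand-built toy dependency tree

path_only = [tokens[i] for i in prune_tree(heads, subj=0, obj=4, k=0)]
with_context = [tokens[i] for i in prune_tree(heads, subj=0, obj=4, k=1)]
```

In this toy tree, `k=0` keeps only "Alice", "marry" and "Bob" and so drops the negation, while `k=1` retains "not" and still discards "in the old".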

HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
Zhilin Yang | Peng Qi | Saizheng Zhang | Yoshua Bengio | William Cohen | Ruslan Salakhutdinov | Christopher D. Manning
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Existing question answering (QA) datasets fail to train QA systems to perform complex reasoning and provide explanations for answers. We introduce HotpotQA, a new dataset with 113k Wikipedia-based question-answer pairs with four key features: (1) the questions require finding and reasoning over multiple supporting documents to answer; (2) the questions are diverse and not constrained to any pre-existing knowledge bases or knowledge schemas; (3) we provide sentence-level supporting facts required for reasoning, allowing QA systems to reason with strong supervision and explain the predictions; (4) we offer a new type of factoid comparison questions to test QA systems’ ability to extract relevant facts and perform necessary comparison. We show that HotpotQA is challenging for the latest QA systems, and the supporting facts enable models to improve performance and make explainable predictions.

Learning to Summarize Radiology Findings
Yuhao Zhang | Daisy Yi Ding | Tianpei Qian | Christopher D. Manning | Curtis P. Langlotz
Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis

The Impression section of a radiology report summarizes crucial radiology findings in natural language and plays a central role in communicating these findings to physicians. However, the process of generating impressions by summarizing findings is time-consuming for radiologists and prone to errors. We propose to automate the generation of radiology impressions with neural sequence-to-sequence learning. We further propose a customized neural model for this task which learns to encode the study background information and use this information to guide the decoding process. On a large dataset of radiology reports collected from actual hospital studies, our model outperforms existing non-neural and neural baselines under the ROUGE metrics. In a blind experiment, a board-certified radiologist indicated that 67% of sampled system summaries are at least as good as the corresponding human-written summaries, suggesting significant clinical validity. To our knowledge our work represents the first attempt in this direction.

Universal Dependency Parsing from Scratch
Peng Qi | Timothy Dozat | Yuhao Zhang | Christopher D. Manning
Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

This paper describes Stanford’s system at the CoNLL 2018 UD Shared Task. We introduce a complete neural pipeline system that takes raw text as input and performs all tasks required by the shared task, ranging from tokenization and sentence segmentation to POS tagging and dependency parsing. Our single system submission achieved very competitive performance on big treebanks. Moreover, after fixing an unfortunate bug, our corrected system would have placed 2nd, 1st, and 3rd on the official evaluation metrics LAS, MLAS, and BLEX, respectively, and would have outperformed all submitted systems on low-resource treebank categories on all metrics by a large margin. We further show the effectiveness of different model components through extensive ablation studies.

Simpler but More Accurate Semantic Dependency Parsing
Timothy Dozat | Christopher D. Manning
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

While syntactic dependency annotations concentrate on the surface or functional structure of a sentence, semantic dependency annotations aim to capture between-word relationships that are more closely related to the meaning of a sentence, using graph-structured representations. We extend the LSTM-based syntactic parser of Dozat and Manning (2017) to train on and generate these graph structures. The resulting system on its own achieves state-of-the-art performance, beating the previous, substantially more complex state-of-the-art system by 0.6% labeled F1. Adding linguistically richer input representations pushes the margin even higher, allowing us to beat it by 1.9% labeled F1.

Sentences with Gapping: Parsing and Reconstructing Elided Predicates
Sebastian Schuster | Joakim Nivre | Christopher D. Manning
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Sentences with gapping, such as Paul likes coffee and Mary tea, lack an overt predicate to indicate the relation between two or more arguments. Surface syntax representations of such sentences are often produced poorly by parsers, and even if correct, not well suited to downstream natural language understanding tasks such as relation extraction that are typically designed to extract information from sentences with canonical clause structure. In this paper, we present two methods for parsing to a Universal Dependencies graph representation that explicitly encodes the elided material with additional nodes and edges. We find that both methods can reconstruct elided material from dependency trees with high accuracy when the parser correctly predicts the existence of a gap. We further demonstrate that one of our methods can be applied to other languages based on a case study on Swedish.

2017

Gapping Constructions in Universal Dependencies v2
Sebastian Schuster | Matthew Lamm | Christopher D. Manning
Proceedings of the NoDaLiDa 2017 Workshop on Universal Dependencies (UDW 2017)

Key-Value Retrieval Networks for Task-Oriented Dialogue
Mihail Eric | Lakshmi Krishnan | Francois Charette | Christopher D. Manning
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue

Neural task-oriented dialogue systems often struggle to smoothly interface with a knowledge base. In this work, we seek to address this problem by proposing a new neural dialogue agent that is able to effectively sustain grounded, multi-domain discourse through a novel key-value retrieval mechanism. The model is end-to-end differentiable and does not need to explicitly model dialogue state or belief trackers. We also release a new dataset of 3,031 dialogues that are grounded through underlying knowledge bases and span three distinct tasks in the in-car personal assistant space: calendar scheduling, weather information retrieval, and point-of-interest navigation. Our architecture is simultaneously trained on data from all domains and significantly outperforms a competitive rule-based system and other existing neural dialogue architectures on the provided domains according to both automatic and human evaluation metrics.

Position-aware Attention and Supervised Data Improve Slot Filling
Yuhao Zhang | Victor Zhong | Danqi Chen | Gabor Angeli | Christopher D. Manning
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Organized relational knowledge in the form of “knowledge graphs” is important for many applications. However, the ability to populate knowledge bases with facts automatically extracted from documents has improved frustratingly slowly. This paper simultaneously addresses two issues that have held back prior work. We first propose an effective new model, which combines an LSTM sequence model with a form of entity position-aware attention that is better suited to relation extraction. Then we build TACRED, a large (119,474 examples) supervised relation extraction dataset obtained via crowdsourcing and targeted towards TAC KBP relations. The combination of better supervised data and a more appropriate high-capacity model enables much better relation extraction performance. When the model trained on this new dataset replaces the previous relation extraction component of the best TAC KBP 2015 slot filling system, its F1 score increases markedly from 22.2% to 26.7%.

Importance sampling for unbiased on-demand evaluation of knowledge base population
Arun Chaganty | Ashwin Paranjape | Percy Liang | Christopher D. Manning
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Knowledge base population (KBP) systems take in a large document corpus and extract entities and their relations. Thus far, KBP evaluation has relied on judgements on the pooled predictions of existing systems. We show that this evaluation is problematic: when a new system predicts a previously unseen relation, it is penalized even if it is correct. This leads to significant bias against new systems, which counterproductively discourages innovation in the field. Our first contribution is a new importance-sampling-based evaluation which corrects for this bias by annotating a new system’s predictions on demand via crowdsourcing. Using data from the 2015 TAC KBP task, we show that this eliminates bias and reduces variance. Our second contribution is an implementation of our method made publicly available as an online KBP evaluation service. We pilot the service by testing diverse state-of-the-art systems on the TAC KBP 2016 corpus and obtain accurate scores in a cost-effective manner.

A Copy-Augmented Sequence-to-Sequence Architecture Gives Good Performance on Task-Oriented Dialogue
Mihail Eric | Christopher Manning
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Task-oriented dialogue focuses on conversational agents that participate in dialogues with user goals on domain-specific topics. In contrast to chatbots, which simply seek to sustain open-ended meaningful discourse, existing task-oriented agents usually explicitly model user intent and belief states. This paper examines bypassing such an explicit representation by depending on a latent neural embedding of state and learning selective attention to dialogue history together with copying to incorporate relevant prior context. We complement recent work by showing the effectiveness of simple sequence-to-sequence neural architectures with a copy mechanism. Our model outperforms more complex memory-augmented models by 7% in per-response generation and is on par with the current state-of-the-art on DSTC2, a real-world task-oriented dialogue dataset.

CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
Daniel Zeman | Martin Popel | Milan Straka | Jan Hajič | Joakim Nivre | Filip Ginter | Juhani Luotolahti | Sampo Pyysalo | Slav Petrov | Martin Potthast | Francis Tyers | Elena Badmaeva | Memduh Gokirmak | Anna Nedoluzhko | Silvie Cinková | Jan Hajič jr. | Jaroslava Hlaváčová | Václava Kettnerová | Zdeňka Urešová | Jenna Kanerva | Stina Ojala | Anna Missilä | Christopher D. Manning | Sebastian Schuster | Siva Reddy | Dima Taji | Nizar Habash | Herman Leung | Marie-Catherine de Marneffe | Manuela Sanguinetti | Maria Simi | Hiroshi Kanayama | Valeria de Paiva | Kira Droganova | Héctor Martínez Alonso | Çağrı Çöltekin | Umut Sulubacak | Hans Uszkoreit | Vivien Macketanz | Aljoscha Burchardt | Kim Harris | Katrin Marheinecke | Georg Rehm | Tolga Kayadelen | Mohammed Attia | Ali Elkahky | Zhuoran Yu | Emily Pitler | Saran Lertpradit | Michael Mandl | Jesse Kirchner | Hector Fernandez Alcalde | Jana Strnadová | Esha Banerjee | Ruli Manurung | Antonio Stella | Atsuko Shimada | Sookyoung Kwak | Gustavo Mendonça | Tatiana Lando | Rattima Nitisaroj | Josie Li
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

The Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets. In 2017, the task was devoted to learning dependency parsers for a large number of languages, in a real-world setting without any gold-standard annotation on input. All test sets followed a unified annotation scheme, namely that of Universal Dependencies. In this paper, we define the task and evaluation methodology, describe how the data sets were prepared, report and analyze the main results, and provide a brief categorization of the different approaches of the participating systems.

Stanford’s Graph-based Neural Dependency Parser at the CoNLL 2017 Shared Task
Timothy Dozat | Peng Qi | Christopher D. Manning
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

This paper describes the neural dependency parser submitted by Stanford to the CoNLL 2017 Shared Task on parsing Universal Dependencies. Our system uses relatively simple LSTM networks to produce part of speech tags and labeled dependency parses from segmented and tokenized sequences of words. In order to address the rare word problem that abounds in languages with complex morphology, we include a character-based word representation that uses an LSTM to produce embeddings from sequences of characters. Our system was ranked first according to all five relevant metrics for the system: UPOS tagging (93.09%), XPOS tagging (82.27%), unlabeled attachment score (81.30%), labeled attachment score (76.30%), and content word labeled attachment score (72.57%).

Naturalizing a Programming Language via Interactive Learning
Sida I. Wang | Samuel Ginn | Percy Liang | Christopher D. Manning
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Our goal is to create a convenient natural language interface for performing well-specified but complex actions such as analyzing data, manipulating text, and querying databases. However, existing natural language interfaces for such tasks are quite primitive compared to the power one wields with a programming language. To bridge this gap, we start with a core programming language and allow users to “naturalize” the core language incrementally by defining alternative, more natural syntax and increasingly complex concepts in terms of compositions of simpler ones. In a voxel world, we show that a community of users can simultaneously teach a common system a diverse language and use it to build hundreds of complex voxel structures. Over the course of three days, these users went from using only the core language to using the naturalized language in 85.9% of the last 10K utterances.

Get To The Point: Summarization with Pointer-Generator Networks
Abigail See | Peter J. Liu | Christopher D. Manning
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Neural sequence-to-sequence models have provided a viable new approach for abstractive text summarization (meaning they are not restricted to simply selecting and rearranging passages from the original text). However, these models have two shortcomings: they are liable to reproduce factual details inaccurately, and they tend to repeat themselves. In this work we propose a novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways. First, we use a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information, while retaining the ability to produce novel words through the generator. Second, we use coverage to keep track of what has been summarized, which discourages repetition. We apply our model to the CNN / Daily Mail summarization task, outperforming the current abstractive state-of-the-art by at least 2 ROUGE points.
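
The hybrid of copying and generating described above can be sketched as a mixture of two distributions for a single decoding step. This is a simplified illustration rather than the paper's model: the mixing weight `p_gen`, the dictionary-based vocabulary, the function name `final_distribution`, and the toy numbers are assumptions of the example.

```python
def final_distribution(p_gen, vocab_probs, attention, source_tokens):
    """Combine generation and copying into one word distribution.

    vocab_probs: dict mapping vocabulary words to generator probabilities.
    attention:   weights over source positions (assumed to sum to 1).
    """
    final = {word: p_gen * prob for word, prob in vocab_probs.items()}
    for token, weight in zip(source_tokens, attention):
        # Copy mass accumulates on every source token, in vocabulary or not.
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight
    return final


vocab_probs = {"the": 0.5, "team": 0.3, "won": 0.2}    # toy generator output
source_tokens = ["Arsenal", "won", "the", "match"]     # toy source text
attention = [0.6, 0.2, 0.1, 0.1]                       # toy attention weights

dist = final_distribution(0.5, vocab_probs, attention, source_tokens)
```

Here "Arsenal" is absent from the toy vocabulary yet still receives probability through the copy term, which is how pointing helps reproduce source details accurately.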

Arc-swift: A Novel Transition System for Dependency Parsing
Peng Qi | Christopher D. Manning
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Transition-based dependency parsers often need sequences of local shift and reduce operations to produce certain attachments. Correct individual decisions hence require global information about the sentence context and mistakes cause error propagation. This paper proposes a novel transition system, arc-swift, that enables direct attachments between tokens farther apart with a single transition. This allows the parser to leverage lexical information more directly in transition decisions. Hence, arc-swift can achieve significantly better performance with a very small beam size. Our parsers reduce error by 3.7–7.6% relative to those using existing transition systems on the Penn Treebank dependency parsing task and English Universal Dependencies.

2016

Combining Natural Logic and Shallow Reasoning for Question Answering
Gabor Angeli | Neha Nayak | Christopher D. Manning
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Improving Coreference Resolution by Learning Entity-Level Distributed Representations
Kevin Clark | Christopher D. Manning
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models
Minh-Thang Luong | Christopher D. Manning
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

A Fast Unified Model for Parsing and Sentence Understanding
Samuel R. Bowman | Jon Gauthier | Abhinav Rastogi | Raghav Gupta | Christopher D. Manning | Christopher Potts
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task
Danqi Chen | Jason Bolton | Christopher D. Manning
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Learning Language Games through Interaction
Sida I. Wang | Percy Liang | Christopher D. Manning
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Neural Machine Translation
Thang Luong | Kyunghyun Cho | Christopher D. Manning
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

Neural Machine Translation (NMT) is a simple new architecture for getting machines to learn to translate. Despite being relatively new (Kalchbrenner and Blunsom, 2013; Cho et al., 2014; Sutskever et al., 2014), NMT has already shown promising results, achieving state-of-the-art performance for various language pairs (Luong et al., 2015a; Jean et al., 2015; Luong et al., 2015b; Sennrich et al., 2016; Luong and Manning, 2016). While many of these NMT papers were presented to the ACL community, research and practice of NMT are only at their beginning stage. This tutorial would be a great opportunity for the whole community of machine translation and natural language processing to learn more about a very promising new approach to MT. This tutorial has four parts. In the first part, we start with an overview of MT approaches, including: (a) traditional methods that have been dominant over the past twenty years and (b) recent hybrid models with the use of neural network components. From these, we motivate why an end-to-end approach like neural machine translation is needed. The second part introduces a basic instance of NMT. We start out with a discussion of recurrent neural networks, including the back-propagation-through-time algorithm and stochastic gradient descent optimizers, as these are the foundation on which NMT builds. We then describe in detail the basic sequence-to-sequence architecture of NMT (Cho et al., 2014; Sutskever et al., 2014), the maximum likelihood training approach, and a simple beam-search decoder to produce translations. The third part of our tutorial describes techniques to build state-of-the-art NMT. We start with approaches to extend the vocabulary coverage of NMT (Luong et al., 2015a; Jean et al., 2015; Chitnis and DeNero, 2015). We then introduce the idea of jointly learning both translations and alignments through an attention mechanism (Bahdanau et al., 2015); other variants of attention (Luong et al., 2015b; Tu et al., 2016) are discussed too. We describe a recent trend in NMT, namely translating at the sub-word level (Chung et al., 2016; Luong and Manning, 2016; Sennrich et al., 2016), so that language variations can be effectively handled. We then give tips on training and testing NMT systems, such as batching and ensembling. In the final part of the tutorial, we briefly describe promising approaches, such as (a) how to combine multiple tasks to help translation (Dong et al., 2015; Luong et al., 2016; Firat et al., 2016; Zoph and Knight, 2016) and (b) how to utilize monolingual corpora (Sennrich et al., 2016). Lastly, we conclude with challenges that remain to be solved for future NMT. PS: we would also like to acknowledge the very first paper by Forcada and Ñeco (1997) on sequence-to-sequence models for translation!

pdf bib
Compression of Neural Machine Translation Models via Pruning
Abigail See | Minh-Thang Luong | Christopher D. Manning
Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning

pdf bib
Evaluating Word Embeddings Using a Representative Suite of Practical Tasks
Neha Nayak | Gabor Angeli | Christopher D. Manning
Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP

pdf bib
Deep Reinforcement Learning for Mention-Ranking Coreference Models
Kevin Clark | Christopher D. Manning
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
A comparison of Named-Entity Disambiguation and Word Sense Disambiguation
Angel Chang | Valentin I. Spitkovsky | Christopher D. Manning | Eneko Agirre
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Named Entity Disambiguation (NED) is the task of linking a named-entity mention to an instance in a knowledge-base, typically Wikipedia-derived resources like DBpedia. This task is closely related to word-sense disambiguation (WSD), where the mention of an open-class word is linked to a concept in a knowledge-base, typically WordNet. This paper analyzes the relation between two annotated datasets on NED and WSD, highlighting the commonalities and differences. We detail the methods to construct a NED system following the WSD word-expert approach, which requires a dictionary and builds one classifier for each target entity mention string. Constructing a dictionary for NED proved challenging, and although similarity and ambiguity are higher for NED, the results are also higher due to the larger amount of training data and the crisper, more skewed meaning differences.

pdf bib
Universal Dependencies v1: A Multilingual Treebank Collection
Joakim Nivre | Marie-Catherine de Marneffe | Filip Ginter | Yoav Goldberg | Jan Hajič | Christopher D. Manning | Ryan McDonald | Slav Petrov | Sampo Pyysalo | Natalia Silveira | Reut Tsarfaty | Daniel Zeman
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Cross-linguistically consistent annotation is necessary for sound comparative evaluation and cross-lingual learning experiments. It is also useful for multilingual system development and comparative linguistic studies. Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a dependency-based lexicalist framework. In this paper, we describe v1 of the universal guidelines, the underlying design principles, and the currently available treebanks for 33 languages.

pdf bib
Enhanced English Universal Dependencies: An Improved Representation for Natural Language Understanding Tasks
Sebastian Schuster | Christopher D. Manning
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Many shallow natural language understanding tasks use dependency trees to extract relations between content words. However, strict surface-structure dependency trees tend to follow the linguistic structure of sentences too closely and frequently fail to provide direct relations between content words. To mitigate this problem, the original Stanford Dependencies representation also defines two dependency graph representations which contain additional and augmented relations that explicitly capture otherwise implicit relations between content words. In this paper, we revisit and extend these dependency graph representations in light of the recent Universal Dependencies (UD) initiative and provide a detailed account of an enhanced and an enhanced++ English UD representation. We further present a converter from constituency to basic, i.e., strict surface structure, UD trees, and a converter from basic UD trees to enhanced and enhanced++ English UD graphs. We release both converters as part of Stanford CoreNLP and the Stanford Parser.

2015

pdf bib
A large annotated corpus for learning natural language inference
Samuel R. Bowman | Gabor Angeli | Christopher Potts | Christopher D. Manning
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong | Hieu Pham | Christopher D. Manning
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Last Words: Computational Linguistics and Deep Learning
Christopher D. Manning
Computational Linguistics, Volume 41, Issue 4 - December 2015

pdf bib
Distributed Representations of Words to Guide Bootstrapped Entity Classifiers
Sonal Gupta | Christopher D. Manning
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Text to 3D Scene Generation with Rich Lexical Grounding
Angel Chang | Will Monroe | Manolis Savva | Christopher Potts | Christopher D. Manning
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf bib
Leveraging Linguistic Structure For Open Domain Information Extraction
Gabor Angeli | Melvin Jose Johnson Premkumar | Christopher D. Manning
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf bib
Robust Subgraph Generation Improves Abstract Meaning Representation Parsing
Keenon Werling | Gabor Angeli | Christopher D. Manning
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf bib
Entity-Centric Coreference Resolution with Model Stacking
Kevin Clark | Christopher D. Manning
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf bib
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
Kai Sheng Tai | Richard Socher | Christopher D. Manning
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf bib
Deep Neural Language Models for Machine Translation
Thang Luong | Michael Kayser | Christopher D. Manning
Proceedings of the Nineteenth Conference on Computational Natural Language Learning

pdf bib
Learning Distributed Representations for Multilingual Text Sequences
Hieu Pham | Thang Luong | Christopher Manning
Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing

pdf bib
Bilingual Word Representations with Monolingual Quality in Mind
Thang Luong | Hieu Pham | Christopher D. Manning
Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing

pdf bib
Invited Talk: The Case for Universal Dependencies
Christopher Manning
Proceedings of the Third International Conference on Dependency Linguistics (Depling 2015)

pdf bib
Does Universal Dependencies need a parsing representation? An investigation of English
Natalia Silveira | Christopher Manning
Proceedings of the Third International Conference on Dependency Linguistics (Depling 2015)

pdf bib
Generating Semantically Precise Scene Graphs from Textual Descriptions for Improved Image Retrieval
Sebastian Schuster | Ranjay Krishna | Angel Chang | Li Fei-Fei | Christopher D. Manning
Proceedings of the Fourth Workshop on Vision and Language

pdf bib
Recursive Neural Networks Can Learn Logical Semantics
Samuel R. Bowman | Christopher Potts | Christopher D. Manning
Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality

2014

pdf bib
Robust Logistic Regression using Shift Parameters
Julie Tibshirani | Christopher D. Manning
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Faster Phrase-Based Decoding by Refining Feature State
Kenneth Heafield | Michael Kayser | Christopher D. Manning
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Two Knives Cut Better Than One: Chinese Word Segmentation with Dual Decomposition
Mengqiu Wang | Rob Voigt | Christopher D. Manning
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Word Segmentation of Informal Arabic with Domain Adaptation
Will Monroe | Spence Green | Christopher D. Manning
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
The Stanford CoreNLP Natural Language Processing Toolkit
Christopher Manning | Mihai Surdeanu | John Bauer | Jenny Finkel | Steven Bethard | David McClosky
Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations

pdf bib
Universal Stanford dependencies: A cross-linguistic typology
Marie-Catherine de Marneffe | Timothy Dozat | Natalia Silveira | Katri Haverinen | Filip Ginter | Joakim Nivre | Christopher D. Manning
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

pdf bib
A Gold Standard Dependency Corpus for English
Natalia Silveira | Timothy Dozat | Marie-Catherine de Marneffe | Samuel Bowman | Miriam Connor | John Bauer | Chris Manning
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

pdf bib
Event Extraction Using Distant Supervision
Kevin Reschke | Martin Jankowiak | Mihai Surdeanu | Christopher Manning | Daniel Jurafsky
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

pdf bib
Cross-lingual Projected Expectation Regularization for Weakly Supervised Learning
Mengqiu Wang | Christopher D. Manning
Transactions of the Association for Computational Linguistics, Volume 2

We consider a multilingual weakly supervised learning scenario where knowledge from annotated corpora in a resource-rich language is transferred via bitext to guide the learning in other languages. Past approaches project labels across bitext and use them as features or gold labels for training. We propose a new method that projects model expectations rather than labels, which facilitates transfer of model uncertainty across language boundaries. We encode expectations as constraints and train a discriminative CRF model using Generalized Expectation Criteria (Mann and McCallum, 2010). Evaluated on standard Chinese-English and German-English NER datasets, our method demonstrates F1 scores of 64% and 60% when no labeled data is used. Attaining the same accuracy with supervised CRFs requires 12k and 1.5k labeled sentences. Furthermore, when combined with labeled examples, our method yields significant improvements over state-of-the-art supervised methods, achieving best reported numbers to date on Chinese OntoNotes and German CoNLL-03 datasets.

pdf bib
Grounded Compositional Semantics for Finding and Describing Images with Sentences
Richard Socher | Andrej Karpathy | Quoc V. Le | Christopher D. Manning | Andrew Y. Ng
Transactions of the Association for Computational Linguistics, Volume 2

Previous work on Recursive Neural Networks (RNNs) shows that these models can produce compositional feature vectors for accurately representing and classifying sentences or images. However, the sentence vectors of previous models cannot accurately represent visually grounded meaning. We introduce the DT-RNN model which uses dependency trees to embed sentences into a vector space in order to retrieve images that are described by those sentences. Unlike previous RNN-based models which use constituency trees, DT-RNNs naturally focus on the action and agents in a sentence. They are better able to abstract from the details of word order and syntactic expression. DT-RNNs outperform other recursive and recurrent neural networks, kernelized CCA and a bag-of-words baseline on the tasks of finding an image that fits a sentence description and vice versa. They also give more similar representations to sentences that describe the same image.

pdf bib
NaturalLI: Natural Logic Inference for Common Sense Reasoning
Gabor Angeli | Christopher D. Manning
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
A Fast and Accurate Dependency Parser using Neural Networks
Danqi Chen | Christopher Manning
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Human Effort and Machine Learnability in Computer Aided Translation
Spence Green | Sida I. Wang | Jason Chuang | Jeffrey Heer | Sebastian Schuster | Christopher D. Manning
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Modeling Biological Processes for Reading Comprehension
Jonathan Berant | Vivek Srikumar | Pei-Chun Chen | Abby Vander Linden | Brittany Harding | Brad Huang | Peter Clark | Christopher D. Manning
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
GloVe: Global Vectors for Word Representation
Jeffrey Pennington | Richard Socher | Christopher Manning
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Combining Distant and Partial Supervision for Relation Extraction
Gabor Angeli | Julie Tibshirani | Jean Wu | Christopher D. Manning
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Learning Spatial Knowledge for Text to 3D Scene Generation
Angel Chang | Manolis Savva | Christopher D. Manning
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Proceedings of the 2nd Workshop on Continuous Vector Space Models and their Compositionality (CVSC)
Alexandre Allauzen | Raffaella Bernardi | Edward Grefenstette | Hugo Larochelle | Christopher Manning | Scott Wen-tau Yih
Proceedings of the 2nd Workshop on Continuous Vector Space Models and their Compositionality (CVSC)

pdf bib
Improved Pattern Learning for Bootstrapped Entity Extraction
Sonal Gupta | Christopher Manning
Proceedings of the Eighteenth Conference on Computational Natural Language Learning

pdf bib
Semantic Parsing for Text to 3D Scene Generation
Angel Chang | Manolis Savva | Christopher Manning
Proceedings of the ACL 2014 Workshop on Semantic Parsing

pdf bib
Interactive Learning of Spatial Knowledge for Text to 3D Scene Generation
Angel Chang | Manolis Savva | Christopher Manning
Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces

pdf bib
SPIED: Stanford Pattern based Information Extraction and Diagnostics
Sonal Gupta | Christopher Manning
Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces

pdf bib
Phrasal: A Toolkit for New Directions in Statistical Machine Translation
Spence Green | Daniel Cer | Christopher Manning
Proceedings of the Ninth Workshop on Statistical Machine Translation

pdf bib
Stanford University’s Submissions to the WMT 2014 Translation Task
Julia Neidert | Sebastian Schuster | Spence Green | Kenneth Heafield | Christopher Manning
Proceedings of the Ninth Workshop on Statistical Machine Translation

pdf bib
An Empirical Comparison of Features and Tuning for Phrase-based Machine Translation
Spence Green | Daniel Cer | Christopher Manning
Proceedings of the Ninth Workshop on Statistical Machine Translation

2013

pdf bib
SUTime: Evaluation in TempEval-3
Angel Chang | Christopher D. Manning
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

pdf bib
Feature Noising for Log-Linear Structured Prediction
Sida Wang | Mengqiu Wang | Stefan Wager | Percy Liang | Christopher D. Manning
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Bilingual Word Embeddings for Phrase-Based Machine Translation
Will Y. Zou | Richard Socher | Daniel Cer | Christopher D. Manning
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
Richard Socher | Alex Perelygin | Jean Wu | Jason Chuang | Christopher D. Manning | Andrew Ng | Christopher Potts
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Learning Biological Processes with Global Constraints
Aju Thalappillil Scaria | Jonathan Berant | Mengqiu Wang | Peter Clark | Justin Lewis | Brittany Harding | Christopher D. Manning
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Fast and Adaptive Online Training of Feature-Rich Translation Models
Spence Green | Sida Wang | Daniel Cer | Christopher D. Manning
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Parsing with Compositional Vector Grammars
Richard Socher | John Bauer | Christopher D. Manning | Andrew Y. Ng
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Joint Word Alignment and Bilingual Named Entity Recognition Using Dual Decomposition
Mengqiu Wang | Wanxiang Che | Christopher D. Manning
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Named Entity Recognition with Bilingual Constraints
Wanxiang Che | Mengqiu Wang | Christopher D. Manning | Ting Liu
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Deep Learning for NLP (without Magic)
Richard Socher | Christopher D. Manning
NAACL HLT 2013 Tutorial Abstracts

pdf bib
Feature-Rich Phrase-based Translation: Stanford University’s Submission to the WMT 2013 Translation Task
Spence Green | Daniel Cer | Kevin Reschke | Rob Voigt | John Bauer | Sida Wang | Natalia Silveira | Julia Neidert | Christopher D. Manning
Proceedings of the Eighth Workshop on Statistical Machine Translation

pdf bib
Positive Diversity Tuning for Machine Translation System Combination
Daniel Cer | Christopher D. Manning | Dan Jurafsky
Proceedings of the Eighth Workshop on Statistical Machine Translation

pdf bib
Proceedings of the Workshop on Continuous Vector Space Models and their Compositionality
Alexandre Allauzen | Hugo Larochelle | Christopher Manning | Richard Socher
Proceedings of the Workshop on Continuous Vector Space Models and their Compositionality

pdf bib
Better Word Representations with Recursive Neural Networks for Morphology
Thang Luong | Richard Socher | Christopher Manning
Proceedings of the Seventeenth Conference on Computational Natural Language Learning

pdf bib
Philosophers are Mortal: Inferring the Truth of Unseen Facts
Gabor Angeli | Christopher Manning
Proceedings of the Seventeenth Conference on Computational Natural Language Learning

pdf bib
More Constructions, More Genres: Extending Stanford Dependencies
Marie-Catherine de Marneffe | Miriam Connor | Natalia Silveira | Samuel R. Bowman | Timothy Dozat | Christopher D. Manning
Proceedings of the Second International Conference on Dependency Linguistics (DepLing 2013)

pdf bib
Learning a Product of Experts with Elitist Lasso
Mengqiu Wang | Christopher D. Manning
Proceedings of the Sixth International Joint Conference on Natural Language Processing

pdf bib
Effect of Non-linear Deep Architecture in Sequence Labeling
Mengqiu Wang | Christopher D. Manning
Proceedings of the Sixth International Joint Conference on Natural Language Processing

pdf bib
Parsing Models for Identifying Multiword Expressions
Spence Green | Marie-Catherine de Marneffe | Christopher D. Manning
Computational Linguistics, Volume 39, Issue 1 - March 2013

2012

pdf bib
Improving Word Representations via Global Context and Multiple Word Prototypes
Eric Huang | Richard Socher | Christopher Manning | Andrew Ng
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Baselines and Bigrams: Simple, Good Sentiment and Topic Classification
Sida Wang | Christopher Manning
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Deep Learning for NLP (without Magic)
Richard Socher | Yoshua Bengio | Christopher D. Manning
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

pdf bib
SUTime: A library for recognizing and normalizing time expressions
Angel X. Chang | Christopher Manning
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

pdf bib
Did It Happen? The Pragmatic Complexity of Veridicality Assessment
Marie-Catherine de Marneffe | Christopher D. Manning | Christopher Potts
Computational Linguistics, Volume 38, Issue 2 - June 2012

pdf bib
SPEDE: Probabilistic Edit Distance Metrics for MT Evaluation
Mengqiu Wang | Christopher Manning
Proceedings of the Seventh Workshop on Statistical Machine Translation

pdf bib
Accurate Unsupervised Joint Named-Entity Extraction from Unaligned Parallel Text
Robert Munro | Christopher D. Manning
Proceedings of the 4th Named Entity Workshop (NEWS) 2012

pdf bib
Entity Clustering Across Languages
Spence Green | Nicholas Andrews | Matthew R. Gormley | Mark Dredze | Christopher D. Manning
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Parsing Time: Learning to Interpret Time Expressions
Gabor Angeli | Christopher Manning | Daniel Jurafsky
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Multi-instance Multi-label Learning for Relation Extraction
Mihai Surdeanu | Julie Tibshirani | Ramesh Nallapati | Christopher D. Manning
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

pdf bib
Learning Constraints for Consistent Timeline Extraction
David McClosky | Christopher D. Manning
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

pdf bib
Probabilistic Finite State Machines for Regression-based MT Evaluation
Mengqiu Wang | Christopher D. Manning
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

pdf bib
Semantic Compositionality through Recursive Matrix-Vector Spaces
Richard Socher | Brody Huval | Christopher D. Manning | Andrew Y. Ng
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2011

pdf bib
Event Extraction as Dependency Parsing
David McClosky | Mihai Surdeanu | Christopher Manning
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Proceedings of the Fifteenth Conference on Computational Natural Language Learning
Sharon Goldwater | Christopher Manning
Proceedings of the Fifteenth Conference on Computational Natural Language Learning

pdf bib
Customizing an Information Extraction System to a New Domain
Mihai Surdeanu | David McClosky | Mason Smith | Andrey Gusev | Christopher Manning
Proceedings of the ACL 2011 Workshop on Relational Models of Semantics

pdf bib
Event Extraction as Dependency Parsing for BioNLP 2011
David McClosky | Mihai Surdeanu | Christopher Manning
Proceedings of BioNLP Shared Task 2011 Workshop

pdf bib
Model Combination for Event Extraction in BioNLP 2011
Sebastian Riedel | David McClosky | Mihai Surdeanu | Andrew McCallum | Christopher D. Manning
Proceedings of BioNLP Shared Task 2011 Workshop

pdf bib
Analyzing the Dynamics of Research by Extracting Key Aspects of Scientific Papers
Sonal Gupta | Christopher Manning
Proceedings of 5th International Joint Conference on Natural Language Processing

pdf bib
Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions
Richard Socher | Jeffrey Pennington | Eric H. Huang | Andrew Y. Ng | Christopher D. Manning
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

pdf bib
Multiword Expression Identification with Tree Substitution Grammars: A Parsing tour de force with French
Spence Green | Marie-Catherine de Marneffe | John Bauer | Christopher D. Manning
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

2010

pdf bib
Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading
Rutu Mulkar-Mehta | James Allen | Jerry Hobbs | Eduard Hovy | Bernardo Magnini | Chris Manning
Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading

pdf bib
Viterbi Training Improves Unsupervised Dependency Parsing
Valentin I. Spitkovsky | Hiyan Alshawi | Daniel Jurafsky | Christopher D. Manning
Proceedings of the Fourteenth Conference on Computational Natural Language Learning

pdf bib
Parsing to Stanford Dependencies: Trade-offs between Speed and Accuracy
Daniel Cer | Marie-Catherine de Marneffe | Dan Jurafsky | Chris Manning
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

pdf bib
“Was It Good? It Was Provocative.” Learning the Meaning of Scalar Adjectives
Marie-Catherine de Marneffe | Christopher D. Manning | Christopher Potts
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

pdf bib
Hierarchical Joint Learning: Improving Joint Parsing and Named Entity Recognition with Non-Jointly Labeled Data
Jenny Rose Finkel | Christopher D. Manning
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

pdf bib
Better Arabic Parsing: Baselines, Evaluations, and Analysis
Spence Green | Christopher D. Manning
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

pdf bib
Probabilistic Tree-Edit Models with Structured Latent Variables for Textual Entailment and Question Answering
Mengqiu Wang | Christopher Manning
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

pdf bib
Subword Variation in Text Message Classification
Robert Munro | Christopher D. Manning
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
The Best Lexical Metric for Phrase-Based Statistical MT System Optimization
Daniel Cer | Christopher D. Manning | Daniel Jurafsky
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Ensemble Models for Dependency Parsing: Cheap and Good?
Mihai Surdeanu | Christopher D. Manning
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Improved Models of Distortion Cost for Statistical Machine Translation
Spence Green | Michel Galley | Christopher D. Manning
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Accurate Non-Hierarchical Phrase-Based Translation
Michel Galley | Christopher D. Manning
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Phrasal: A Statistical Machine Translation Toolkit for Exploring New Model Features
Daniel Cer | Michel Galley | Daniel Jurafsky | Christopher D. Manning
Proceedings of the NAACL HLT 2010 Demonstration Session

pdf bib
A Multi-Pass Sieve for Coreference Resolution
Karthik Raghunathan | Heeyoung Lee | Sudarshan Rangarajan | Nathanael Chambers | Mihai Surdeanu | Dan Jurafsky | Christopher Manning
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

2009

pdf bib
Nested Named Entity Recognition
Jenny Rose Finkel | Christopher D. Manning
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf bib
Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora
Daniel Ramage | David Hall | Ramesh Nallapati | Christopher D. Manning
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf bib
Joint Parsing and Named Entity Recognition
Jenny Rose Finkel | Christopher D. Manning
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Hierarchical Bayesian Domain Adaptation
Jenny Rose Finkel | Christopher D. Manning
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Robust Machine Translation Evaluation with Entailment Features
Sebastian Padó | Michel Galley | Dan Jurafsky | Christopher D. Manning
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf bib
Quadratic-Time Dependency Parsing for Machine Translation
Michel Galley | Christopher D. Manning
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf bib
Machine Translation Evaluation with Textual Entailment Features
Sebastian Padó | Michel Galley | Daniel Jurafsky | Christopher D. Manning
Proceedings of the Fourth Workshop on Statistical Machine Translation

pdf bib
Disambiguating “DE” for Chinese-English Machine Translation
Pi-Chuan Chang | Daniel Jurafsky | Christopher D. Manning
Proceedings of the Fourth Workshop on Statistical Machine Translation

pdf bib
Discriminative Reordering with Chinese Grammatical Relations Features
Pi-Chuan Chang | Huihsin Tseng | Dan Jurafsky | Christopher D. Manning
Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation (SSST-3) at NAACL HLT 2009

pdf bib
Proceedings of the 2009 Workshop on Applied Textual Inference (TextInfer)
Chris Callison-Burch | Ido Dagan | Christopher Manning | Marco Pennacchiotti | Fabio Massimo Zanzotto
Proceedings of the 2009 Workshop on Applied Textual Inference (TextInfer)

pdf bib
Multi-word expressions in textual inference: Much ado about nothing?
Marie-Catherine de Marneffe | Sebastian Padó | Christopher D. Manning
Proceedings of the 2009 Workshop on Applied Textual Inference (TextInfer)

pdf bib
Presupposed Content and Entailments in Natural Language Inference
David Clausen | Christopher D. Manning
Proceedings of the 2009 Workshop on Applied Textual Inference (TextInfer)

pdf bib
Random Walks for Text Semantic Similarity
Daniel Ramage | Anna N. Rafferty | Christopher D. Manning
Proceedings of the 2009 Workshop on Graph-based Methods for Natural Language Processing (TextGraphs-4)

pdf bib
WikiWalk: Random walks on Wikipedia for Semantic Relatedness
Eric Yeh | Daniel Ramage | Christopher D. Manning | Eneko Agirre | Aitor Soroa
Proceedings of the 2009 Workshop on Graph-based Methods for Natural Language Processing (TextGraphs-4)

pdf bib
An extended model of natural logic
Bill MacCartney | Christopher D. Manning
Proceedings of the Eighth International Conference on Computational Semantics

2008

pdf bib
Studying the History of Ideas Using Topic Models
David Hall | Daniel Jurafsky | Christopher D. Manning
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

pdf bib
Legal Docket Classification: Where Machine Learning Stumbles
Ramesh Nallapati | Christopher D. Manning
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

pdf bib
A Phrase-Based Alignment Model for Natural Language Inference
Bill MacCartney | Michel Galley | Christopher D. Manning
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

pdf bib
A Simple and Effective Hierarchical Phrase Reordering Model
Michel Galley | Christopher D. Manning
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

pdf bib
Lexicon Schemas and Related Data Models: when Standards Meet Users
Thorsten Trippel | Michael Maxwell | Greville Corbett | Cambell Prince | Christopher Manning | Stephen Grimes | Steve Moran
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

pdf bib
Modeling Semantic Containment and Exclusion in Natural Language Inference
Bill MacCartney | Christopher D. Manning
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

pdf bib
A Global Joint Model for Semantic Role Labeling
Kristina Toutanova | Aria Haghighi | Christopher D. Manning
Computational Linguistics, Volume 34, Number 2, June 2008 - Special Issue on Semantic Role Labeling

pdf bib
Regularization and Search for Minimum Error Rate Training
Daniel Cer | Dan Jurafsky | Christopher D. Manning
Proceedings of the Third Workshop on Statistical Machine Translation

pdf bib
Optimizing Chinese Word Segmentation for Machine Translation Performance
Pi-Chuan Chang | Michel Galley | Christopher D. Manning
Proceedings of the Third Workshop on Statistical Machine Translation

pdf bib
Parsing Three German Treebanks: Lexicalized and Unlexicalized Baselines
Anna Rafferty | Christopher D. Manning
Proceedings of the Workshop on Parsing German

pdf bib
Coling 2008: Proceedings of the workshop on Cross-Framework and Cross-Domain Parser Evaluation
Johan Bos | Edward Briscoe | Aoife Cahill | John Carroll | Stephen Clark | Ann Copestake | Dan Flickinger | Josef van Genabith | Julia Hockenmaier | Aravind Joshi | Ronald Kaplan | Tracy Holloway King | Sandra Kuebler | Dekang Lin | Jan Tore Lønning | Christopher Manning | Yusuke Miyao | Joakim Nivre | Stephan Oepen | Kenji Sagae | Nianwen Xue | Yi Zhang
Coling 2008: Proceedings of the workshop on Cross-Framework and Cross-Domain Parser Evaluation

pdf bib
The Stanford Typed Dependencies Representation
Marie-Catherine de Marneffe | Christopher D. Manning
Coling 2008: Proceedings of the workshop on Cross-Framework and Cross-Domain Parser Evaluation

pdf bib
Which Words Are Hard to Recognize? Prosodic, Lexical, and Disfluency Factors that Increase ASR Error Rates
Sharon Goldwater | Dan Jurafsky | Christopher D. Manning
Proceedings of ACL-08: HLT

pdf bib
Efficient, Feature-based, Conditional Random Field Parsing
Jenny Rose Finkel | Alex Kleeman | Christopher D. Manning
Proceedings of ACL-08: HLT

pdf bib
Finding Contradictions in Text
Marie-Catherine de Marneffe | Anna N. Rafferty | Christopher D. Manning
Proceedings of ACL-08: HLT

pdf bib
Enforcing Transitivity in Coreference Resolution
Jenny Rose Finkel | Christopher D. Manning
Proceedings of ACL-08: HLT, Short Papers

2007

pdf bib
The Infinite Tree
Jenny Rose Finkel | Trond Grenager | Christopher D. Manning
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

pdf bib
Learning Alignments and Leveraging Natural Logic
Nathanael Chambers | Daniel Cer | Trond Grenager | David Hall | Chloe Kiddon | Bill MacCartney | Marie-Catherine de Marneffe | Daniel Ramage | Eric Yeh | Christopher D. Manning
Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing

pdf bib
Natural Logic for Textual Inference
Bill MacCartney | Christopher D. Manning
Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing

2006

pdf bib
Learning to recognize features of valid textual entailments
Bill MacCartney | Trond Grenager | Marie-Catherine de Marneffe | Daniel Cer | Christopher D. Manning
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

pdf bib
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Tutorial Abstracts
Chris Manning | Doug Oard | Jim Glass
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Tutorial Abstracts

pdf bib
Unsupervised Discovery of a Statistical Verb Lexicon
Trond Grenager | Christopher D. Manning
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

pdf bib
Solving the Problem of Cascading Errors: Approximate Bayesian Inference for Linguistic Annotation Pipelines
Jenny Rose Finkel | Christopher D. Manning | Andrew Y. Ng
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

pdf bib
Generating Typed Dependency Parses from Phrase Structure Parses
Marie-Catherine de Marneffe | Bill MacCartney | Christopher D. Manning
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

pdf bib
An Effective Two-Stage Model for Exploiting Non-Local Dependencies in Named Entity Recognition
Vijay Krishnan | Christopher D. Manning
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

2005

pdf bib
Morphological features help POS tagging of unknown words across language varieties
Huihsin Tseng | Daniel Jurafsky | Christopher Manning
Proceedings of the Fourth SIGHAN Workshop on Chinese Language Processing

pdf bib
A Conditional Random Field Word Segmenter for Sighan Bakeoff 2005
Huihsin Tseng | Pichuan Chang | Galen Andrew | Daniel Jurafsky | Christopher Manning
Proceedings of the Fourth SIGHAN Workshop on Chinese Language Processing

pdf bib
A Joint Model for Semantic Role Labeling
Aria Haghighi | Kristina Toutanova | Christopher Manning
Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005)

pdf bib
Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling
Jenny Rose Finkel | Trond Grenager | Christopher Manning
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

pdf bib
Unsupervised Learning of Field Segmentation Models for Information Extraction
Trond Grenager | Dan Klein | Christopher Manning
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

pdf bib
Joint Learning Improves Semantic Role Labeling
Kristina Toutanova | Aria Haghighi | Christopher Manning
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

pdf bib
Robust Textual Inference via Graph Matching
Aria Haghighi | Andrew Ng | Christopher Manning
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

2004

pdf bib
Deep Dependencies from Context-Free Statistical Parsers: Correcting the Surface Dependency Approximation
Roger Levy | Christopher Manning
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

pdf bib
Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency
Dan Klein | Christopher Manning
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

pdf bib
Solving logic puzzles: From robust processing to precise semantics
Iddo Lev | Bill MacCartney | Christopher Manning | Roger Levy
Proceedings of the 2nd Workshop on Text Meaning and Interpretation

pdf bib
Exploiting Context for Biomedical Entity Recognition: From Syntax to the Web
Jenny Finkel | Shipra Dingare | Huy Nguyen | Malvina Nissim | Christopher Manning | Gail Sinclair
Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications (NLPBA/BioNLP)

pdf bib
Language Learning: Beyond Thunderdome
Christopher D. Manning
Proceedings of the Eighth Conference on Computational Natural Language Learning (CoNLL-2004) at HLT-NAACL 2004

pdf bib
Max-Margin Parsing
Ben Taskar | Dan Klein | Mike Collins | Daphne Koller | Christopher Manning
Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing

pdf bib
Verb Sense and Subcategorization: Using Joint Inference to Improve Performance on Complementary Tasks
Galen Andrew | Trond Grenager | Christopher Manning
Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing

pdf bib
The Leaf Path Projection View of Parse Trees: Exploring String Kernels for HPSG Parse Selection
Kristina Toutanova | Penka Markova | Christopher Manning
Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing

2003

pdf bib
Named Entity Recognition with Character-Level Models
Dan Klein | Joseph Smarr | Huy Nguyen | Christopher D. Manning
Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003

pdf bib
A* Parsing: Fast Exact Viterbi Parse Selection
Dan Klein | Christopher D. Manning
Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network
Kristina Toutanova | Dan Klein | Christopher D. Manning | Yoram Singer
Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Optimization, Maxent Models, and Conditional Estimation without Magic
Christopher Manning | Dan Klein
Companion Volume of the Proceedings of HLT-NAACL 2003 - Tutorial Abstracts

pdf bib
Accurate Unlexicalized Parsing
Dan Klein | Christopher D. Manning
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

pdf bib
Is it Harder to Parse Chinese, or the Chinese Treebank?
Roger Levy | Christopher D. Manning
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

2002

pdf bib
The LinGO Redwoods Treebank: Motivation and Preliminary Applications
Stephan Oepen | Kristina Toutanova | Stuart Shieber | Christopher Manning | Dan Flickinger | Thorsten Brants
COLING 2002: The 17th International Conference on Computational Linguistics: Project Notes

pdf bib
Combining Heterogeneous Classifiers for Word Sense Disambiguation
Dan Klein | Kristina Toutanova | H. Tolga Ilhan | Sepandar D. Kamvar | Christopher D. Manning
Proceedings of the ACL-02 Workshop on Word Sense Disambiguation: Recent Successes and Future Directions

pdf bib
Conditional Structure versus Conditional Estimation in NLP Models
Dan Klein | Christopher D. Manning
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)

pdf bib
Extensions to HMM-based Statistical Word Alignment Models
Kristina Toutanova | H. Tolga Ilhan | Christopher Manning
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)

pdf bib
Feature Selection for a Rich HPSG Grammar Using Decision Trees
Kristina Toutanova | Christopher D. Manning
COLING-02: The 6th Conference on Natural Language Learning 2002 (CoNLL-2002)

pdf bib
A Generative Constituent-Context Model for Improved Grammar Induction
Dan Klein | Christopher D. Manning
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics

2001

pdf bib
Distributional phrase structure induction
Dan Klein | Christopher D. Manning
Proceedings of the ACL 2001 Workshop on Computational Natural Language Learning (CoNLL)

pdf bib
Parsing and Hypergraphs
Dan Klein | Christopher D. Manning
Proceedings of the Seventh International Workshop on Parsing Technologies

pdf bib
Parsing with Treebank Grammars: Empirical Bounds, Theoretical Models, and the Structure of the Penn Treebank
Dan Klein | Christopher D. Manning
Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics

pdf bib
Combining Heterogeneous Classifiers for Word-Sense Disambiguation
H. Tolga Ilhan | Sepandar D. Kamvar | Dan Klein | Christopher D. Manning | Kristina Toutanova
Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems

2000

pdf bib
Enriching the Knowledge Sources Used in a Maximum Entropy Part-of-Speech Tagger
Kristina Toutanova | Christopher D. Manning
2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora

1999

pdf bib
Linguistics in an Age of Engineering
Christopher Manning
Proceedings of the 13th Pacific Asia Conference on Language, Information and Computation

1998

pdf bib
The segmentation problem in morphology learning
Christopher D. Manning
New Methods in Language Processing and Computational Natural Language Learning

1993

pdf bib
Automatic Acquisition of a Large Sub Categorization Dictionary From Corpora
Christopher D. Manning
31st Annual Meeting of the Association for Computational Linguistics
