Duyu Tang


2019

pdf bib
Asking Clarification Questions in Knowledge-Based Question Answering
Jingjing Xu | Yuechen Wang | Duyu Tang | Nan Duan | Pengcheng Yang | Qi Zeng | Ming Zhou | Xu SUN
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

The ability to ask clarification questions is essential for knowledge-based question answering (KBQA) systems, especially for handling ambiguous phenomena. Despite its importance, clarification has not been well explored in current KBQA systems. Further progress requires supervised resources for training and evaluation, and powerful models for clarification-related text understanding and generation. In this paper, we construct a new clarification dataset, CLAQUA, with nearly 40K open-domain examples. The dataset supports three serial tasks: given a question, identify whether clarification is needed; if yes, generate a clarification question; then predict answers base on external user feedback. We provide representative baselines for these tasks and further introduce a coarse-to-fine model for clarification question generation. Experiments show that the proposed model achieves better performance than strong baselines. The further analysis demonstrates that our dataset brings new challenges and there still remain several unsolved problems, like reasonable automatic evaluation metrics for clarification question generation and powerful models for handling entity sparsity.

pdf bib
Multi-Task Learning for Conversational Question Answering over a Large-Scale Knowledge Base
Tao Shen | Xiubo Geng | Tao QIN | Daya Guo | Duyu Tang | Nan Duan | Guodong Long | Daxin Jiang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We consider the problem of conversational question answering over a large-scale knowledge base. To handle huge entity vocabulary of a large-scale knowledge base, recent neural semantic parsing based approaches usually decompose the task into several subtasks and then solve them sequentially, which leads to following issues: 1) errors in earlier subtasks will be propagated and negatively affect downstream ones; and 2) each subtask cannot naturally share supervision signals with others. To tackle these issues, we propose an innovative multi-task learning framework where a pointer-equipped semantic parsing model is designed to resolve coreference in conversations, and naturally empower joint learning with a novel type-aware entity detection model. The proposed framework thus enables shared supervisions and alleviates the effect of error propagation. Experiments on a large-scale conversational question answering dataset containing 1.6M question answering pairs over 12.8M entities show that the proposed framework improves overall F1 score from 67% to 79% compared with previous state-of-the-art work.

pdf bib
Coupling Retrieval and Meta-Learning for Context-Dependent Semantic Parsing
Daya Guo | Duyu Tang | Nan Duan | Ming Zhou | Jian Yin
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

In this paper, we present an approach to incorporate retrieved datapoints as supporting evidence for context-dependent semantic parsing, such as generating source code conditioned on the class environment. Our approach naturally combines a retrieval model and a meta-learner, where the former learns to find similar datapoints from the training data, and the latter considers retrieved datapoints as a pseudo task for fast adaptation. Specifically, our retriever is a context-aware encoder-decoder model with a latent variable which takes context environment into consideration, and our meta-learner learns to utilize retrieved datapoints in a model-agnostic meta-learning paradigm for fast adaptation. We conduct experiments on CONCODE and CSQA datasets, where the context refers to class environment in JAVA codes and conversational history, respectively. We use sequence-to-action model as the base semantic parser, which performs the state-of-the-art accuracy on both datasets. Results show that both the context-aware retriever and the meta-learning strategy improve accuracy, and our approach performs better than retrieve-and-edit baselines.

2018

pdf bib
Question Generation from SQL Queries Improves Neural Semantic Parsing
Daya Guo | Yibo Sun | Duyu Tang | Nan Duan | Jian Yin | Hong Chi | James Cao | Peng Chen | Ming Zhou
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

In this paper, we study how to learn a semantic parser of state-of-the-art accuracy with less supervised training data. We conduct our study on WikiSQL, the largest hand-annotated semantic parsing dataset to date. First, we demonstrate that question generation is an effective method that empowers us to learn a state-of-the-art neural network based semantic parser with thirty percent of the supervised training data. Second, we show that applying question generation to the full supervised training data further improves the state-of-the-art model. In addition, we observe that there is a logarithmic relationship between the accuracy of a semantic parser and the amount of training data.

pdf bib
Learning to Collaborate for Question Answering and Asking
Duyu Tang | Nan Duan | Zhao Yan | Zhirui Zhang | Yibo Sun | Shujie Liu | Yuanhua Lv | Ming Zhou
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Question answering (QA) and question generation (QG) are closely related tasks that could improve each other; however, the connection of these two tasks is not well explored in literature. In this paper, we give a systematic study that seeks to leverage the connection to improve both QA and QG. We present a training algorithm that generalizes both Generative Adversarial Network (GAN) and Generative Domain-Adaptive Nets (GDAN) under the question answering scenario. The two key ideas are improving the QG model with QA through incorporating additional QA-specific signal as the loss function, and improving the QA model with QG through adding artificially generated training instances. We conduct experiments on both document based and knowledge based question answering tasks. We have two main findings. Firstly, the performance of a QG model (e.g in terms of BLEU score) could be easily improved by a QA model via policy gradient. Secondly, directly applying GAN that regards all the generated questions as negative instances could not improve the accuracy of the QA model. Learning when to regard generated questions as positive instances could bring performance boost.

pdf bib
Semantic Parsing with Syntax- and Table-Aware SQL Generation
Yibo Sun | Duyu Tang | Nan Duan | Jianshu Ji | Guihong Cao | Xiaocheng Feng | Bing Qin | Ting Liu | Ming Zhou
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We present a generative model to map natural language questions into SQL queries. Existing neural network based approaches typically generate a SQL query word-by-word, however, a large portion of the generated results is incorrect or not executable due to the mismatch between question words and table contents. Our approach addresses this problem by considering the structure of table and the syntax of SQL language. The quality of the generated SQL query is significantly improved through (1) learning to replicate content from column names, cells or SQL keywords; and (2) improving the generation of WHERE clause by leveraging the column-cell relation. Experiments are conducted on WikiSQL, a recently released dataset with the largest question- SQL pairs. Our approach significantly improves the state-of-the-art execution accuracy from 69.0% to 74.4%.

2017

pdf bib
Question Generation for Question Answering
Nan Duan | Duyu Tang | Peng Chen | Ming Zhou
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

This paper presents how to generate questions from given passages using neural networks, where large scale QA pairs are automatically crawled and processed from Community-QA website, and used as training data. The contribution of the paper is 2-fold: First, two types of question generation approaches are proposed, one is a retrieval-based method using convolution neural network (CNN), the other is a generation-based method using recurrent neural network (RNN); Second, we show how to leverage the generated questions to improve existing question answering systems. We evaluate our question generation method for the answer sentence selection task on three benchmark datasets, including SQuAD, MS MARCO, and WikiQA. Experimental results show that, by using generated questions as an extra signal, significant QA improvement can be achieved.

2016

pdf bib
Aspect Level Sentiment Classification with Deep Memory Network
Duyu Tang | Bing Qin | Ting Liu
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
English-Chinese Knowledge Base Translation with Neural Network
Xiaocheng Feng | Duyu Tang | Bing Qin | Ting Liu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Knowledge base (KB) such as Freebase plays an important role for many natural language processing tasks. English knowledge base is obviously larger and of higher quality than low resource language like Chinese. To expand Chinese KB by leveraging English KB resources, an effective way is to translate English KB (source) into Chinese (target). In this direction, two major challenges are to model triple semantics and to build a robust KB translator. We address these challenges by presenting a neural network approach, which learns continuous triple representation with a gated neural network. Accordingly, source triples and target triples are mapped in the same semantic vector space. We build a new dataset for English-Chinese KB translation from Freebase, and compare with several baselines on it. Experimental results show that the proposed method improves translation accuracy compared with baseline methods. We show that adaptive composition model improves standard solution such as neural tensor network in terms of translation accuracy.

pdf bib
Effective LSTMs for Target-Dependent Sentiment Classification
Duyu Tang | Bing Qin | Xiaocheng Feng | Ting Liu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Target-dependent sentiment classification remains a challenge: modeling the semantic relatedness of a target with its context words in a sentence. Different context words have different influences on determining the sentiment polarity of a sentence towards the target. Therefore, it is desirable to integrate the connections between target word and context words when building a learning system. In this paper, we develop two target dependent long short-term memory (LSTM) models, where target information is automatically taken into account. We evaluate our methods on a benchmark dataset from Twitter. Empirical results show that modeling sentence representation with standard LSTM does not perform well. Incorporating target information into LSTM can significantly boost the classification accuracy. The target-dependent LSTM models achieve state-of-the-art performances without using syntactic parser or external sentiment lexicons.

pdf bib
A Language-Independent Neural Network for Event Detection
Xiaocheng Feng | Lifu Huang | Duyu Tang | Heng Ji | Bing Qin | Ting Liu
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2015

pdf bib
Document Modeling with Gated Recurrent Neural Network for Sentiment Classification
Duyu Tang | Bing Qin | Ting Liu
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Learning Semantic Representations of Users and Products for Document Level Sentiment Classification
Duyu Tang | Bing Qin | Ting Liu
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

pdf bib
Building Large-Scale Twitter-Specific Sentiment Lexicon : A Representation Learning Approach
Duyu Tang | Furu Wei | Bing Qin | Ming Zhou | Ting Liu
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
A Joint Segmentation and Classification Framework for Sentiment Analysis
Duyu Tang | Furu Wei | Bing Qin | Li Dong | Ting Liu | Ming Zhou
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Learning Sentiment-Specific Word Embedding for Twitter Sentiment Classification
Duyu Tang | Furu Wei | Nan Yang | Ming Zhou | Ting Liu | Bing Qin
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Adaptive Recursive Neural Network for Target-dependent Twitter Sentiment Classification
Li Dong | Furu Wei | Chuanqi Tan | Duyu Tang | Ming Zhou | Ke Xu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Coooolll: A Deep Learning System for Twitter Sentiment Classification
Duyu Tang | Furu Wei | Bing Qin | Ting Liu | Ming Zhou
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)