Qingkai Zeng


2023

Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models
Zhihan Zhang | Shuohang Wang | Wenhao Yu | Yichong Xu | Dan Iter | Qingkai Zeng | Yang Liu | Chenguang Zhu | Meng Jiang
Findings of the Association for Computational Linguistics: EMNLP 2023

Large language models (LLMs) can perform a wide range of tasks by following natural language instructions, without the necessity of task-specific fine-tuning. Unfortunately, the performance of LLMs is greatly influenced by the quality of these instructions, and manually writing effective instructions for each task is a laborious and subjective process. In this paper, we introduce Auto-Instruct, a novel method to automatically improve the quality of instructions provided to LLMs. Our method leverages the inherent generative ability of LLMs to produce diverse candidate instructions for a given task, and then ranks them using a scoring model trained on a variety of 575 existing NLP tasks. In experiments on 118 out-of-domain tasks, Auto-Instruct surpasses both human-written instructions and existing baselines of LLM-generated instructions. Furthermore, our method exhibits notable generalizability even with other LLMs that are not incorporated into its training process.
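
As a rough illustration of the generate-then-rank pipeline described above, the sketch below samples candidate instructions from a black-box LLM and keeps the highest-scoring one. Here `call_llm` and `score_instruction` are hypothetical placeholders for the LLM API and the trained scoring model; they are not part of the paper's released code.

```python
# Minimal sketch of a generate-then-rank instruction pipeline.
# `call_llm` and `score_instruction` are hypothetical stand-ins for a black-box
# LLM API and the trained scoring model described in the abstract.

def call_llm(prompt: str, n_samples: int = 8) -> list[str]:
    """Placeholder: sample diverse candidate instructions from a black-box LLM."""
    raise NotImplementedError

def score_instruction(instruction: str, task_examples: list[dict]) -> float:
    """Placeholder: trained ranker estimating how useful an instruction is."""
    raise NotImplementedError

def auto_instruct(task_description: str, task_examples: list[dict]) -> str:
    # Step 1: let the LLM propose diverse candidate instructions for the task.
    prompt = f"Write an instruction for the following task:\n{task_description}"
    candidates = call_llm(prompt)
    # Step 2: rank the candidates with the scoring model and keep the best one.
    return max(candidates, key=lambda c: score_instruction(c, task_examples))
```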

2021

Enhancing Factual Consistency of Abstractive Summarization
Chenguang Zhu | William Hinthorn | Ruochen Xu | Qingkai Zeng | Michael Zeng | Xuedong Huang | Meng Jiang
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Automatic abstractive summaries are found to often distort or fabricate facts in the source article. This inconsistency between the summary and the original text seriously limits the applicability of abstractive summarization. We propose a fact-aware summarization model, FASum, which extracts factual relations and integrates them into the summary generation process via graph attention. We then design a factual corrector model, FC, to automatically correct factual errors in summaries generated by existing systems. Empirical results show that fact-aware summarization produces abstractive summaries with higher factual consistency than existing systems, and that the correction model improves the factual consistency of given summaries by modifying only a few keywords.
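
The graph-attention ingredient mentioned above can be sketched as a single attention layer over the nodes of an extracted fact graph. The minimal PyTorch layer below illustrates the general mechanism only; the `GraphAttention` class, its shapes, and the self-loop assumption are illustrative choices, not the exact FASum architecture.

```python
# A minimal single-head graph-attention layer over nodes of an extracted fact
# graph (entities/relations from triples). Illustrative only, not FASum itself.
import torch
import torch.nn as nn

class GraphAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.attn = nn.Linear(2 * dim, 1)

    def forward(self, nodes: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # nodes: (N, dim) node embeddings; adj: (N, N) 0/1 adjacency matrix,
        # assumed to include self-loops so every row has at least one neighbor.
        h = self.proj(nodes)
        n = h.size(0)
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        scores = self.attn(pairs).squeeze(-1)              # (N, N) pair scores
        scores = scores.masked_fill(adj == 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1)            # attend over neighbors
        return weights @ h                                 # aggregated fact features
```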

Technical Question Answering across Tasks and Domains
Wenhao Yu | Lingfei Wu | Yu Deng | Qingkai Zeng | Ruchi Mahindru | Sinem Guven | Meng Jiang
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers

Building an automatic technical support system is an important yet challenging task. Conceptually, to answer a user question on a technical forum, a human expert first has to retrieve relevant documents and then read them carefully to identify the answer snippet. Despite the huge success researchers have achieved in general-domain question answering (QA), much less attention has been paid to technical QA. Specifically, existing methods face several unique challenges: (i) the question and answer rarely overlap substantially, and (ii) the available data is very limited. In this paper, we propose a novel deep transfer learning framework to effectively address technical QA across tasks and domains. To this end, we present an adjustable joint learning approach for the document retrieval and reading comprehension tasks. Our experiments on the TechQA dataset demonstrate superior performance compared with state-of-the-art methods.
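
A minimal sketch of what an "adjustable joint learning" objective could look like, assuming a single trade-off weight between the retrieval and reading comprehension losses; `alpha` and the two loss tensors are illustrative placeholders rather than the paper's exact formulation.

```python
# Illustrative joint objective with an adjustable trade-off weight between the
# document retrieval and reading comprehension losses.
import torch

def joint_loss(retrieval_loss: torch.Tensor,
               reading_loss: torch.Tensor,
               alpha: float = 0.5) -> torch.Tensor:
    """Combine the two task losses; `alpha` controls how much retrieval dominates."""
    return alpha * retrieval_loss + (1.0 - alpha) * reading_loss
```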

Validating Label Consistency in NER Data Annotation
Qingkai Zeng | Mengxia Yu | Wenhao Yu | Tianwen Jiang | Meng Jiang
Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems

Data annotation plays a crucial role in ensuring that named entity recognition (NER) models are trained on correct labels. Producing accurate labels is challenging because of the complexity involved in annotation. Label inconsistency between multiple subsets of data annotation (e.g., training set and test set, or multiple training subsets) is an indicator of label mistakes. In this work, we present an empirical method to explore the relationship between label (in-)consistency and NER model performance. It can be used to validate the label consistency (or catch the inconsistency) across multiple sets of NER data annotation. In experiments, our method identified label inconsistency in the test data of the SCIERC and CoNLL03 datasets (with 26.7% and 5.4% label mistakes, respectively), and it validated the consistency of the corrected versions of both datasets.
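
A minimal sketch of the kind of cross-subset check the abstract describes: train on one annotation subset, then compare in-subset and cross-subset performance, where a large gap hints at label inconsistency. `train_ner` and `evaluate_f1` are hypothetical placeholders for any NER trainer and scorer.

```python
# Compare a model's performance on held-out data from its own training subset
# versus a second annotation subset; a large gap flags potential label
# inconsistency. `train_ner` and `evaluate_f1` are hypothetical placeholders.

def train_ner(train_sentences: list) -> object:
    """Placeholder: train any NER model on one annotation subset."""
    raise NotImplementedError

def evaluate_f1(model: object, sentences: list) -> float:
    """Placeholder: entity-level F1 of the model on a labeled set."""
    raise NotImplementedError

def consistency_gap(subset_a: list, subset_b: list, holdout_ratio: float = 0.1) -> float:
    split = int(len(subset_a) * (1 - holdout_ratio))
    model = train_ner(subset_a[:split])
    f1_in = evaluate_f1(model, subset_a[split:])   # held out from the same subset
    f1_cross = evaluate_f1(model, subset_b)        # the other annotation subset
    return f1_in - f1_cross                        # large positive gap: likely inconsistency
```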

2020

A Technical Question Answering System with Transfer Learning
Wenhao Yu | Lingfei Wu | Yu Deng | Ruchi Mahindru | Qingkai Zeng | Sinem Guven | Meng Jiang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

In recent years, the need for community technical question-answering sites has increased significantly. However, it is often expensive for human experts to provide timely and helpful responses on those forums. We develop TransTQA, a novel system that offers automatic responses by retrieving proper answers from similar questions that were correctly answered in the past. TransTQA is built upon a siamese ALBERT network, which enables it to respond quickly and accurately. Furthermore, TransTQA adopts a standard deep transfer learning strategy to improve its capability of supporting multiple technical domains.
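
A minimal sketch of siamese retrieval with a shared ALBERT encoder, assuming the public `albert-base-v2` checkpoint from Hugging Face Transformers plus simple mean pooling and cosine similarity; this illustrates the general siamese setup rather than TransTQA's actual training procedure or checkpoints.

```python
# Siamese retrieval sketch: the question and each candidate answer are embedded
# by the same ALBERT encoder and matched by cosine similarity.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("albert-base-v2")
encoder = AutoModel.from_pretrained("albert-base-v2")

def embed(texts: list[str]) -> torch.Tensor:
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state      # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)         # mean-pool over real tokens
    return (hidden * mask).sum(1) / mask.sum(1)

def retrieve(question: str, answers: list[str]) -> str:
    q, a = embed([question]), embed(answers)
    sims = torch.nn.functional.cosine_similarity(q, a)   # one score per candidate
    return answers[int(sims.argmax())]
```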

Tri-Train: Automatic Pre-Fine Tuning between Pre-Training and Fine-Tuning for SciNER
Qingkai Zeng | Wenhao Yu | Mengxia Yu | Tianwen Jiang | Tim Weninger | Meng Jiang
Findings of the Association for Computational Linguistics: EMNLP 2020

The training process of scientific NER models is commonly performed in two steps: (i) pre-training a language model with self-supervised tasks on huge amounts of data and (ii) fine-tuning it on small labeled data. The success of this strategy depends on the relevance between the data domains and between the tasks. However, gaps arise in practice when the target domains are specific and small. We propose a novel framework that introduces a “pre-fine tuning” step between pre-training and fine-tuning. It constructs a corpus by selecting sentences from unlabeled documents that are most relevant to the labeled training data. Instead of predicting tokens in random spans, the pre-fine tuning task is to predict tokens in entity candidates identified by text mining methods. Pre-fine tuning is automatic and lightweight because the corpus can be much smaller than the pre-training data while still improving performance. Experiments on seven benchmarks demonstrate its effectiveness.
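
The two ingredients named above, selecting relevant unlabeled sentences and masking entity candidates rather than random spans, can be sketched roughly as follows. Vocabulary overlap is used here as an illustrative relevance proxy, and the entity candidates are assumed to come from an external phrase miner; neither choice is the paper's exact procedure.

```python
# Sketch of "pre-fine tuning" data construction: (1) pick unlabeled sentences
# most relevant to the labeled training data (vocabulary overlap as a proxy),
# (2) mask entity-candidate spans instead of random spans for the MLM objective.

def select_relevant(unlabeled: list[str], train_vocab: set[str], k: int) -> list[str]:
    def overlap(sent: str) -> int:
        return len(set(sent.lower().split()) & train_vocab)
    return sorted(unlabeled, key=overlap, reverse=True)[:k]

def mask_entity_candidates(sentence: str, candidates: list[str], mask: str = "[MASK]") -> str:
    # `candidates` would come from a text-mining phrase miner; hypothetical here.
    for phrase in candidates:
        sentence = sentence.replace(phrase, " ".join([mask] * len(phrase.split())))
    return sentence

print(mask_entity_candidates("we train a conditional random field tagger",
                             ["conditional random field"]))
# -> "we train a [MASK] [MASK] [MASK] tagger"
```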

Crossing Variational Autoencoders for Answer Retrieval
Wenhao Yu | Lingfei Wu | Qingkai Zeng | Shu Tao | Yu Deng | Meng Jiang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Answer retrieval aims to find the best-aligned answer from a large set of candidates given a question. Learning vector representations of questions and answers is the key factor. Question-answer alignment and question/answer semantics are two important signals for learning these representations. Existing methods learn semantic representations with dual encoders or dual variational auto-encoders, where the semantic information comes from language models or question-to-question (answer-to-answer) generative processes. However, alignment and semantics are modeled too separately to capture the aligned semantics between question and answer. In this work, we propose to cross variational auto-encoders by generating questions from aligned answers and generating answers from aligned questions. Experiments show that our method outperforms the state-of-the-art answer retrieval method on SQuAD.
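
A minimal sketch of the "crossed" training objective implied by the abstract: each variational auto-encoder reconstructs the other side of a QA pair, with the usual KL regularizers added. The individual loss terms and the `beta` weight are illustrative assumptions, not the paper's exact objective.

```python
# Crossed-VAE objective sketch: reconstruct the answer from the question's
# latent code and the question from the answer's latent code, plus KL terms.
import torch

def crossed_vae_loss(recon_a_given_q: torch.Tensor,   # -log p(answer | z_question)
                     recon_q_given_a: torch.Tensor,   # -log p(question | z_answer)
                     kl_q: torch.Tensor,              # KL term for the question encoder
                     kl_a: torch.Tensor,              # KL term for the answer encoder
                     beta: float = 1.0) -> torch.Tensor:
    return recon_a_given_q + recon_q_given_a + beta * (kl_q + kl_a)
```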

2019

Faceted Hierarchy: A New Graph Type to Organize Scientific Concepts and a Construction Method
Qingkai Zeng | Mengxia Yu | Wenhao Yu | JinJun Xiong | Yiyu Shi | Meng Jiang
Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-13)

In a scientific concept hierarchy, a parent concept may have a few attributes, each of which has multiple values that form a group of child concepts. We call these attributes facets: classification has a few facets such as application (e.g., face recognition), model (e.g., svm, knn), and metric (e.g., precision). In this work, we aim to build faceted concept hierarchies from scientific literature. Hierarchy construction methods heavily rely on hypernym detection; however, the faceted relations are direct parent-to-child links, whereas the hypernym relation is a multi-hop, i.e., ancestor-to-descendant, link with the specific facet “type-of”. We use information extraction techniques to find synonyms, sibling concepts, and ancestor-descendant relations in a data science corpus, and we propose a hierarchy growth algorithm to infer parent-child links from these three types of relationships. The algorithm resolves conflicts by maintaining the acyclic structure of the hierarchy.
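
A minimal sketch of growing a hierarchy from candidate parent-child links while preserving acyclicity, the constraint mentioned above; how candidate links are scored from synonym, sibling, and ancestor-descendant evidence is simplified here to a single confidence number per link.

```python
# Greedy hierarchy growth sketch: accept candidate parent->child links from
# most to least confident, skipping any link that would introduce a cycle.

def creates_cycle(parent: str, child: str, children: dict[str, set[str]]) -> bool:
    # Adding parent -> child creates a cycle iff parent is already reachable from child.
    stack, seen = [child], set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(children.get(node, ()))
    return False

def grow_hierarchy(candidate_links: list[tuple[str, str, float]]) -> dict[str, set[str]]:
    children: dict[str, set[str]] = {}
    for parent, child, _score in sorted(candidate_links, key=lambda x: -x[2]):
        if not creates_cycle(parent, child, children):
            children.setdefault(parent, set()).add(child)
    return children
```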