Kun Wang


2022

Prompt for Extraction? PAIE: Prompting Argument Interaction for Event Argument Extraction
Yubo Ma | Zehao Wang | Yixin Cao | Mukai Li | Meiqi Chen | Kun Wang | Jing Shao
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In this paper, we propose an effective yet efficient model PAIE for both sentence-level and document-level Event Argument Extraction (EAE), which also generalizes well when there is a lack of training data. On the one hand, PAIE utilizes prompt tuning for extractive objectives to take full advantage of Pre-trained Language Models (PLMs). It introduces two span selectors based on the prompt to select start/end tokens among input texts for each role. On the other hand, it captures argument interactions via multi-role prompts and conducts joint optimization with optimal span assignments via a bipartite matching loss. Also, with a flexible prompt design, PAIE can extract multiple arguments with the same role instead of conventional heuristic threshold tuning. We have conducted extensive experiments on three benchmarks, including both sentence- and document-level EAE. The results present promising improvements from PAIE (3.5% and 2.3% F1 gains on average on three benchmarks, for PAIE-base and PAIE-large respectively). Further analysis demonstrates the efficiency, generalization to few-shot settings, and effectiveness of different extractive prompt tuning strategies. Our code is available at https://github.com/mayubo2333/PAIE.

MMEKG: Multi-modal Event Knowledge Graph towards Universal Representation across Modalities
Yubo Ma | Zehao Wang | Mukai Li | Yixin Cao | Meiqi Chen | Xinze Li | Wenqi Sun | Kunquan Deng | Kun Wang | Aixin Sun | Jing Shao
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

Events are fundamental building blocks of real-world happenings. In this paper, we present a large-scale, multi-modal event knowledge graph named MMEKG. MMEKG unifies different modalities of knowledge via events, which complement and disambiguate each other. Specifically, MMEKG incorporates (i) over 990 thousand concept events with 644 relation types to cover most types of happenings, and (ii) over 863 million instance events connected through 934 million relations, which provide rich contextual information in texts and/or images. To collect billion-scale instance events and relations among them, we additionally develop an efficient yet effective pipeline for textual/visual knowledge extraction system. We also develop an induction strategy to create million-scale concept events and a schema organizing all events and relations in MMEKG. To this end, we also provide a pipeline enabling our system to seamlessly parse texts/images to event graphs and to retrieve multi-modal knowledge at both concept- and instance-levels.

ERGO: Event Relational Graph Transformer for Document-level Event Causality Identification
Meiqi Chen | Yixin Cao | Kunquan Deng | Mukai Li | Kun Wang | Jing Shao | Yan Zhang
Proceedings of the 29th International Conference on Computational Linguistics

Document-level Event Causality Identification (DECI) aims to identify event-event causal relations in a document. Existing works usually build an event graph for global reasoning across multiple sentences. However, the edges between events have to be carefully designed through heuristic rules or external tools. In this paper, we propose a novel Event Relational Graph TransfOrmer (ERGO) framework for DECI, to ease the graph construction and improve it over the noisy edge issue. Different from conventional event graphs, we define a pair of events as a node and build a complete event relational graph without any prior knowledge or tools. This naturally formulates DECI as a node classification problem, and thus we capture the causation transitivity among event pairs via a graph transformer. Furthermore, we design a criss-cross constraint and an adaptive focal loss for the imbalanced classification, to alleviate the issues of false positives and false negatives. Extensive experiments on two benchmark datasets show that ERGO greatly outperforms previous state-of-the-art (SOTA) methods (12.8% F1 gains on average).

R2F: A General Retrieval, Reading and Fusion Framework for Document-level Natural Language Inference
Hao Wang | Yixin Cao | Yangguang Li | Zhen Huang | Kun Wang | Jing Shao
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Document-level natural language inference (DOCNLI) is a new challenging task in natural language processing, aiming at judging the entailment relationship between a pair of hypothesis and premise documents. Current datasets and baselines largely follow sentence-level settings, but fail to address the issues raised by longer documents. In this paper, we establish a general solution, named Retrieval, Reading and Fusion (R2F) framework, and a new setting, by analyzing the main challenges of DOCNLI: interpretability, long-range dependency, and cross-sentence inference. The basic idea of the framework is to simplify document-level task into a set of sentence-level tasks, and improve both performance and interpretability with the power of evidence. For each hypothesis sentence, the framework retrieves evidence sentences from the premise, and reads to estimate its credibility. Then the sentence-level results are fused to judge the relationship between the documents. For the setting, we contribute complementary evidence and entailment label annotation on hypothesis sentences, for interpretability study. Our experimental results show that R2F framework can obtain state-of-the-art performance and is robust for diverse evidence retrieval methods. Moreover, it can give more interpretable prediction results. Our model and code are released at https://github.com/phoenixsecularbird/R2F.

2021

A Comparison between Pre-training and Large-scale Back-translation for Neural Machine Translation
Dandan Huang | Kun Wang | Yue Zhang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

What Have We Achieved on Text Summarization?
Dandan Huang | Leyang Cui | Sen Yang | Guangsheng Bao | Kun Wang | Jun Xie | Yue Zhang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Deep learning has led to significant improvement in text summarization with various methods investigated and improved ROUGE scores reported over the years. However, gaps still exist between summaries produced by automatic summarizers and human professionals. Aiming to gain more understanding of summarization systems with respect to their strengths and limits on a fine-grained syntactic and semantic level, we consult the Multidimensional Quality Metric (MQM) and quantify 8 major sources of errors on 10 representative summarization models manually. Primarily, we find that 1) under similar settings, extractive summarizers are in general better than their abstractive counterparts thanks to strength in faithfulness and factual-consistency; 2) milestone techniques such as copy, coverage and hybrid extractive/abstractive methods do bring specific improvements but also demonstrate limitations; 3) pre-training techniques, and in particular sequence-to-sequence pre-training, are highly effective for improving text summarization, with BART giving the best results.

2019

Code-Switching for Enhancing NMT with Pre-Specified Translation
Kai Song | Yue Zhang | Heng Yu | Weihua Luo | Kun Wang | Min Zhang
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Leveraging user-provided translation to constrain NMT has practical significance. Existing methods can be classified into two main categories, namely the use of placeholder tags for lexicon words and the use of hard constraints during decoding. Both methods can hurt translation fidelity for various reasons. We investigate a data augmentation method, making code-switched training data by replacing source phrases with their target translations. Our method does not change the NMT model or decoding algorithm, allowing the model to learn lexicon translations by copying source-side target words. Extensive experiments show that our method achieves consistent improvements over existing approaches, improving translation of constrained words without hurting unconstrained words.

2015

Well-Formed Dependency to String translation with BTG Grammar
Xiaoqing Li | Kun Wang | Dakun Zhang | Jie Hao
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation

2014

Dynamically Integrating Cross-Domain Translation Memory into Phrase-Based Machine Translation during Decoding
Kun Wang | Chengqing Zong | Keh-Yih Su
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Knowledge Sharing via Social Login: Exploiting Microblogging Service for Warming up Social Question Answering Websites
Yang Xiao | Wayne Xin Zhao | Kun Wang | Zhen Xiao
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

Integrating Translation Memory into Phrase-Based Machine Translation during Decoding
Kun Wang | Chengqing Zong | Keh-Yih Su
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2012

Integrating Surface and Abstract Features for Robust Cross-Domain Chinese Word Segmentation
Xiaoqing Li | Kun Wang | Chengqing Zong | Keh-Yih Su
Proceedings of COLING 2012

2010

A Character-Based Joint Model for CIPS-SIGHAN Word Segmentation Bakeoff 2010
Kun Wang | Chengqing Zong | Keh-Yih Su
CIPS-SIGHAN Joint Conference on Chinese Language Processing

A Character-Based Joint Model for Chinese Word Segmentation
Kun Wang | Chengqing Zong | Keh-Yih Su
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

2009

Which is More Suitable for Chinese Word Segmentation, the Generative Model or the Discriminative One?
Kun Wang | Chengqing Zong | Keh-Yih Su
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 2