Jun Ma


2023

PivotFEC: Enhancing Few-shot Factual Error Correction with a Pivot Task Approach using Large Language Models
Xingwei He | A-Long Jin | Jun Ma | Yuan Yuan | Siu Ming Yiu
Findings of the Association for Computational Linguistics: EMNLP 2023

Factual Error Correction (FEC) aims to rectify false claims by making minimal revisions to align them more accurately with supporting evidence. However, the lack of datasets containing false claims and their corresponding corrections has impeded progress in this field. Existing distantly supervised models typically employ the mask-then-correct paradigm, where a masker identifies problematic spans in false claims, followed by a corrector that predicts the masked portions. Unfortunately, accurately identifying errors in claims is challenging, leading to issues like over-erasure and incorrect masking. To overcome these challenges, we present PivotFEC, a method that enhances few-shot FEC with a pivot task approach using large language models (LLMs). Specifically, we introduce a pivot task called factual error injection, which leverages LLMs (e.g., ChatGPT) to intentionally generate text containing factual errors under few-shot settings; the generated text with factual errors can then be used to train the FEC corrector. Our experiments on a public dataset demonstrate the effectiveness of PivotFEC in two significant ways: first, it improves the widely adopted SARI metric by 11.3 points over the best-performing distantly supervised methods; second, it outperforms its few-shot counterpart (i.e., LLMs used directly to solve FEC) by 7.9 points in SARI, validating the efficacy of our proposed pivot task.
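A minimal sketch of the factual-error-injection pivot idea described above: prompt an LLM with a few demonstrations to corrupt a correct claim, then reuse the (false claim, correct claim) pair to train the corrector. The prompt wording, the `call_llm` helper, and the demonstration claims are hypothetical stand-ins, not the authors' actual prompts or data.

```python
# Sketch of the factual-error-injection pivot task: an LLM corrupts correct
# claims, yielding (false claim, correction) pairs for training an FEC
# corrector. All prompt text and `call_llm` are hypothetical placeholders.

FEW_SHOT_DEMOS = [
    ("The Eiffel Tower is in Paris.", "The Eiffel Tower is in Berlin."),
]

def build_injection_prompt(correct_claim: str) -> str:
    """Assemble a few-shot prompt asking the LLM to inject a factual error."""
    lines = ["Rewrite each claim so it contains one factual error.", ""]
    for original, corrupted in FEW_SHOT_DEMOS:
        lines += [f"Claim: {original}", f"Corrupted: {corrupted}", ""]
    lines += [f"Claim: {correct_claim}", "Corrupted:"]
    return "\n".join(lines)

def call_llm(prompt: str) -> str:
    """Placeholder for any chat-completion API (e.g., ChatGPT)."""
    raise NotImplementedError

def make_training_pair(correct_claim: str) -> tuple[str, str]:
    # The generated false claim becomes the corrector's input;
    # the original correct claim becomes its target output.
    false_claim = call_llm(build_injection_prompt(correct_claim)).strip()
    return false_claim, correct_claim
```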

2022

Few-shot Named Entity Recognition with Entity-level Prototypical Network Enhanced by Dispersedly Distributed Prototypes
Bin Ji | Shasha Li | Shaoduo Gan | Jie Yu | Jun Ma | Huijun Liu | Jing Yang
Proceedings of the 29th International Conference on Computational Linguistics

Few-shot named entity recognition (NER) enables us to build an NER system for a new domain using very few labeled examples. However, existing prototypical networks for this task suffer from roughly estimated label dependency and closely distributed prototypes, often causing misclassifications. To address these issues, we propose EP-Net, an Entity-level Prototypical Network enhanced by dispersedly distributed prototypes. EP-Net builds entity-level prototypes and treats text spans as candidate entities, so it no longer requires modeling label dependencies. In addition, EP-Net trains the prototypes from scratch so that they are dispersedly distributed, and aligns spans to prototypes in the embedding space via a space projection. Experimental results on two evaluation tasks and the Few-NERD settings demonstrate that EP-Net consistently outperforms previous strong models in overall performance. Extensive analyses further validate its effectiveness.
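A toy sketch of two ingredients the abstract names: learned entity-level prototypes with a dispersion penalty, and a projection that aligns span embeddings to the prototype space. The dimensions, Euclidean distance, and margin value are illustrative assumptions, not EP-Net's exact configuration.

```python
import torch
import torch.nn as nn

class EntityPrototypeHead(nn.Module):
    """Toy span classifier: project span embeddings into the prototype
    space and score each span by (negative) distance to each prototype
    (one per entity type). Distance choice and shapes are illustrative."""

    def __init__(self, span_dim: int, proto_dim: int, num_types: int):
        super().__init__()
        # Prototypes are trained from scratch; the dispersion loss below
        # pushes them apart in the embedding space.
        self.prototypes = nn.Parameter(torch.randn(num_types, proto_dim))
        self.project = nn.Linear(span_dim, proto_dim)  # space projection

    def forward(self, span_emb: torch.Tensor) -> torch.Tensor:
        z = self.project(span_emb)           # (num_spans, proto_dim)
        d = torch.cdist(z, self.prototypes)  # (num_spans, num_types)
        return -d                            # higher score = closer prototype

    def dispersion_loss(self, margin: float = 1.0) -> torch.Tensor:
        # Penalize pairs of prototypes that sit closer than `margin`.
        d = torch.cdist(self.prototypes, self.prototypes)
        d = d + torch.eye(len(self.prototypes)) * 1e9  # ignore self-pairs
        return torch.relu(margin - d).sum()
```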

2021

End-to-End Conversational Search for Online Shopping with Utterance Transfer
Liqiang Xiao | Jun Ma | Xin Luna Dong | Pascual Martínez-Gómez | Nasser Zalmout | Chenwei Zhang | Tong Zhao | Hao He | Yaohui Jin
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Successful conversational search systems can present a natural, adaptive, and interactive shopping experience for online shopping customers. However, building such systems from scratch faces real-world challenges from both imperfect product schema/knowledge and a lack of training dialog data. In this work, we first propose ConvSearch, an end-to-end conversational search system that deeply combines the dialog system with search. It leverages text profiles to retrieve products, which is more robust against imperfect product schema/knowledge than using product attributes alone. We then address the data-scarcity challenge with an utterance transfer approach that generates dialog utterances by using existing dialogs from other domains and leveraging search behavior data from an e-commerce retailer. With utterance transfer, we introduce a new conversational search dataset for online shopping. Experiments show that our utterance transfer method significantly improves the availability of training dialog data without crowd-sourcing, and that the conversational search system significantly outperforms the best tested baseline.
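As a rough illustration of retrieving products by free-text profiles rather than rigid attribute matching (one idea in the abstract), here is a generic TF-IDF retrieval sketch; the product profiles and query are invented, and ConvSearch's actual retriever is not specified here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical free-text product profiles; matching whole descriptions is
# more tolerant of a missing or imperfect schema than attribute lookups.
profiles = [
    "red running shoes lightweight breathable mesh",
    "leather office chair ergonomic lumbar support",
    "wireless noise cancelling over-ear headphones",
]

vectorizer = TfidfVectorizer()
profile_vecs = vectorizer.fit_transform(profiles)

def retrieve(query: str, top_k: int = 2) -> list[int]:
    """Rank product profiles by cosine similarity to the dialog query."""
    q = vectorizer.transform([query])
    scores = cosine_similarity(q, profile_vecs)[0]
    return scores.argsort()[::-1][:top_k].tolist()

print(retrieve("comfortable chair for my desk"))  # -> indices of best matches
```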

2020

Contextualized Emotion Recognition in Conversation as Sequence Tagging
Yan Wang | Jiayu Zhang | Jun Ma | Shaojun Wang | Jing Xiao
Proceedings of the 21st Annual Meeting of the Special Interest Group on Discourse and Dialogue

Emotion recognition in conversation (ERC) is an important topic for developing empathetic machines in a variety of areas, including social opinion mining and health-care. In this paper, we propose a method that models the ERC task as sequence tagging, where a Conditional Random Field (CRF) layer is leveraged to learn the emotional consistency in the conversation. We employ LSTM-based encoders that capture the self- and inter-speaker dependencies of interlocutors to generate contextualized utterance representations, which are fed into the CRF layer. To capture long-range global context, we use a multi-layer Transformer encoder to enhance the LSTM-based encoder. Experiments show that our method benefits from modeling emotional consistency and outperforms the current state-of-the-art methods on multiple emotion classification datasets.
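A compact sketch of the sequence-tagging view of ERC: a BiLSTM contextualizes per-utterance embeddings, and a CRF decodes an emotion-label sequence over the whole conversation. It assumes the third-party `pytorch-crf` package and precomputed utterance embeddings; layer sizes are illustrative, and the paper's Transformer enhancement is omitted.

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # third-party package: pytorch-crf

class ERCTagger(nn.Module):
    """Tag each utterance in a conversation with an emotion label, using a
    BiLSTM over utterance embeddings and a CRF decoder whose transition
    scores capture emotional consistency between neighboring utterances."""

    def __init__(self, utt_dim: int, hidden: int, num_emotions: int):
        super().__init__()
        self.encoder = nn.LSTM(utt_dim, hidden, batch_first=True,
                               bidirectional=True)
        self.to_tags = nn.Linear(2 * hidden, num_emotions)
        self.crf = CRF(num_emotions, batch_first=True)

    def loss(self, utt_embs: torch.Tensor, labels: torch.Tensor):
        h, _ = self.encoder(utt_embs)        # (batch, turns, 2*hidden)
        emissions = self.to_tags(h)
        return -self.crf(emissions, labels)  # negative log-likelihood

    def predict(self, utt_embs: torch.Tensor):
        h, _ = self.encoder(utt_embs)
        return self.crf.decode(self.to_tags(h))  # best label sequence per dialog
```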

TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories
Giannis Karamanolakis | Jun Ma | Xin Luna Dong
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Extracting structured knowledge from product profiles is crucial for various applications in e-Commerce. State-of-the-art approaches for knowledge extraction were each designed for a single category of product, and thus do not apply to real-life e-Commerce scenarios, which often contain thousands of diverse categories. This paper proposes TXtract, a taxonomy-aware knowledge extraction model that applies to thousands of product categories organized in a hierarchical taxonomy. Through category conditional self-attention and multi-task learning, our approach is both scalable, as it trains a single model for thousands of categories, and effective, as it extracts category-specific attribute values. Experiments on products from a taxonomy with 4,000 categories show that TXtract outperforms state-of-the-art approaches by up to 10% in F1 and 15% in coverage across all categories.
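A tiny sketch of the category-conditioning idea behind the abstract: token representations are gated by a learned embedding of the product category, so one shared model can behave category-specifically across thousands of categories. The sigmoid gating below is a plausible simplification, not TXtract's exact category conditional self-attention.

```python
import torch
import torch.nn as nn

class CategoryConditioner(nn.Module):
    """Condition shared token representations on the product category, so a
    single extraction model produces category-specific attribute features."""

    def __init__(self, num_categories: int, dim: int):
        super().__init__()
        self.cat_emb = nn.Embedding(num_categories, dim)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, tokens: torch.Tensor, category: torch.Tensor):
        # tokens: (batch, seq, dim); category: (batch,) of category ids
        c = self.cat_emb(category).unsqueeze(1).expand_as(tokens)
        g = torch.sigmoid(self.gate(torch.cat([tokens, c], dim=-1)))
        return g * tokens  # category-gated token features for the tagger
```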

Span-based Joint Entity and Relation Extraction with Attention-based Span-specific and Contextual Semantic Representations
Bin Ji | Jie Yu | Shasha Li | Jun Ma | Qingbo Wu | Yusong Tan | Huijun Liu
Proceedings of the 28th International Conference on Computational Linguistics

Span-based joint extraction models have shown their effectiveness on entity recognition and relation extraction. These models regard text spans as candidate entities and span tuples as candidate relation tuples. Span semantic representations are shared between entity recognition and relation extraction, yet existing models cannot capture the semantics of these candidate entities and relations well. To address these problems, we introduce a span-based joint extraction framework with attention-based semantic representations. Specifically, attention mechanisms are used to calculate semantic representations, including span-specific and contextual ones. We further investigate the effects of four attention variants in generating contextual semantic representations. Experiments show that our model outperforms previous systems and achieves state-of-the-art results on ACE2005, CoNLL2004 and ADE.
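Below is a generic sketch of attention-based span pooling, the kind of span-specific semantic representation the abstract refers to: a learned scorer weights the tokens inside a span before summing them. The single-linear scorer is an illustrative assumption, not one of the paper's four attention variants specifically.

```python
import torch
import torch.nn as nn

class AttentiveSpanPool(nn.Module):
    """Build a span representation as an attention-weighted sum of the token
    vectors inside the span (a common span-specific encoding choice)."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # per-token attention logit

    def forward(self, token_vecs: torch.Tensor, start: int, end: int):
        # token_vecs: (seq_len, dim); span covers tokens [start, end] inclusive
        span = token_vecs[start:end + 1]                  # (span_len, dim)
        weights = torch.softmax(self.score(span), dim=0)  # (span_len, 1)
        return (weights * span).sum(dim=0)                # (dim,)
```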

2018

LinkNBed: Multi-Graph Representation Learning with Entity Linkage
Rakshit Trivedi | Bunyamin Sisman | Xin Luna Dong | Christos Faloutsos | Jun Ma | Hongyuan Zha
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Knowledge graphs have emerged as an important model for studying complex multi-relational data. This has given rise to the construction of numerous large-scale but incomplete knowledge graphs encoding information extracted from various resources. An effective and scalable approach to jointly learning over multiple graphs and eventually constructing a unified graph is a crucial next step for the success of knowledge-based inference in many downstream applications. To this end, we propose LinkNBed, a deep relational learning framework that learns entity and relationship representations across multiple graphs. We identify entity linkage across graphs as a vital component for achieving our goal. We design a novel objective that leverages entity linkage and build an efficient multi-task training procedure. Experiments on link prediction and entity linkage demonstrate substantial improvements over state-of-the-art relational learning approaches.
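A schematic sketch of the two ingredients the abstract names: a relational score for triples within each graph, plus a linkage term that pulls together the embeddings of entities linked across graphs. The TransE-style distance scoring and the loss weighting are stand-ins, not LinkNBed's actual model.

```python
import torch
import torch.nn as nn

class MultiGraphEmbedder(nn.Module):
    """Learn entity/relation embeddings over multiple graphs jointly; a
    linkage loss ties together entities known to refer to the same
    real-world object. TransE-style scoring is an illustrative stand-in."""

    def __init__(self, num_entities: int, num_relations: int, dim: int):
        super().__init__()
        self.ent = nn.Embedding(num_entities, dim)  # entities from all graphs
        self.rel = nn.Embedding(num_relations, dim)

    def triple_loss(self, h, r, t):
        # Lower translation distance = more plausible (h, r, t) triple.
        return (self.ent(h) + self.rel(r) - self.ent(t)).norm(dim=-1).mean()

    def linkage_loss(self, e1, e2):
        # Pull together embeddings of cross-graph linked entity pairs.
        return (self.ent(e1) - self.ent(e2)).norm(dim=-1).mean()

    def loss(self, triples, links, alpha: float = 0.5):
        h, r, t = triples   # tensors of head, relation, tail ids
        e1, e2 = links      # tensors of linked entity-pair ids
        return self.triple_loss(h, r, t) + alpha * self.linkage_loss(e1, e2)
```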

2016

A Redundancy-Aware Sentence Regression Framework for Extractive Summarization
Pengjie Ren | Furu Wei | Zhumin Chen | Jun Ma | Ming Zhou
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Existing sentence regression methods for extractive summarization usually model sentence importance and redundancy in two separate processes. They first evaluate the importance f(s) of each sentence s and then select sentences to generate a summary based on both the importance scores and the redundancy among sentences. In this paper, we propose to model importance and redundancy simultaneously by directly evaluating the relative importance f(s|S) of a sentence s given a set of already-selected sentences S. Specifically, we present a new framework that regresses against the relative gain of s given S as calculated by the ROUGE metric. Besides single-sentence features, additional features derived from sentence relations are incorporated. Experiments on the DUC 2001, 2002 and 2004 multi-document summarization datasets show that the proposed method outperforms state-of-the-art extractive summarization approaches.
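A short sketch of selection by relative gain f(s|S): greedily add the sentence whose marginal score improvement over the current summary is largest. The `rouge` argument is a placeholder for a real scorer; at inference time the paper's learned regressor f(s|S) would stand in for this oracle gain.

```python
def greedy_select(sentences, reference, rouge, budget=3):
    """Greedy summary construction by relative gain: at each step pick the
    sentence s maximizing rouge(S + [s], ref) - rouge(S, ref). `rouge` is a
    caller-supplied scorer; a trained f(s|S) model replaces it at test time."""
    selected, remaining = [], list(sentences)
    while remaining and len(selected) < budget:
        base = rouge(selected, reference)
        # Relative gain of each remaining sentence given the selected set S.
        gains = [(rouge(selected + [s], reference) - base, s) for s in remaining]
        best_gain, best_sent = max(gains, key=lambda g: g[0])
        selected.append(best_sent)
        remaining.remove(best_sent)
    return selected
```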