Lei Lin


2023

Learning to Compose Representations of Different Encoder Layers towards Improving Compositional Generalization
Lei Lin | Shuangtao Li | Yafang Zheng | Biao Fu | Shan Liu | Yidong Chen | Xiaodong Shi
Findings of the Association for Computational Linguistics: EMNLP 2023

Recent studies have shown that sequence-to-sequence (seq2seq) models struggle with compositional generalization (CG), i.e., the ability to systematically generalize to unseen compositions of seen components. There is mounting evidence that one of the reasons hindering CG is that the representation of the encoder’s uppermost layer is entangled, i.e., the syntactic and semantic representations of sequences are entangled. However, we argue that the previously identified representation entanglement problem is not comprehensive enough. Additionally, we hypothesize that the source key and value representations passed into different decoder layers are also entangled. Starting from this intuition, we propose CompoSition (Compose Syntactic and Semantic Representations), an extension to seq2seq models that learns to compose representations of different encoder layers dynamically for different tasks, since recent studies reveal that the bottom layers of the Transformer encoder contain more syntactic information and the top ones contain more semantic information. Specifically, we introduce a composed layer between the encoder and decoder to compose different encoder layers’ representations to generate specific keys and values passed into different decoder layers. CompoSition achieves competitive results on two comprehensive and realistic benchmarks, which empirically demonstrates the effectiveness of our proposal. Code is available at https://github.com/thinkaboutzero/COMPOSITION.
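
The idea of composing encoder layers can be illustrated with a minimal sketch. It assumes the composition is a softmax-weighted sum over per-layer encoder outputs, with one set of scores per decoder layer for keys and another for values. The abstract does not specify the composition function, so the weighting scheme and the names `softmax` and `compose_layers` are hypothetical and are not taken from the authors' code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def compose_layers(layer_outputs, logits):
    """Weighted sum of encoder layer outputs (assumed composition rule).

    layer_outputs: list of L matrices, each shaped [seq_len][d_model].
    logits: L unnormalised scores; in a real model these would be
            learned parameters (hypothetical here).
    """
    weights = softmax(logits)
    seq_len = len(layer_outputs[0])
    d_model = len(layer_outputs[0][0])
    composed = [[0.0] * d_model for _ in range(seq_len)]
    for weight, layer in zip(weights, layer_outputs):
        for t in range(seq_len):
            for j in range(d_model):
                composed[t][j] += weight * layer[t][j]
    return composed


# Toy example: two encoder layers, one source token, d_model = 2.
bottom_layer = [[1.0, 0.0]]
top_layer = [[3.0, 2.0]]
encoder_layers = [bottom_layer, top_layer]

# Each decoder layer gets its own key and value mixture (toy scores).
key_logits_per_decoder_layer = [[0.0, 0.0], [2.0, -2.0]]
value_logits_per_decoder_layer = [[0.0, 0.0], [-2.0, 2.0]]

keys = [compose_layers(encoder_layers, l) for l in key_logits_per_decoder_layer]
values = [compose_layers(encoder_layers, l) for l in value_logits_per_decoder_layer]
```

With equal scores the composed representation is the plain average of the layers, while skewed scores let a decoder layer draw mostly on the bottom or mostly on the top encoder layers.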

2021

ITNLP at SemEval-2021 Task 11: Boosting BERT with Sampling and Adversarial Training for Knowledge Extraction
Genyu Zhang | Yu Su | Changhong He | Lei Lin | Chengjie Sun | Lili Shan
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

This paper describes the winning system in the End-to-end Pipeline phase of the NLPContributionGraph task. The system is composed of three BERT-based models, which are used to extract sentences, entities, and triples, respectively. Experiments show that sampling and adversarial training can greatly boost the system. In the End-to-end Pipeline phase, our system achieved an average F1 of 0.4703, significantly higher than the second-placed system, which achieved an average F1 of 0.3828.

KuiLeiXi: a Chinese Open-Ended Text Adventure Game
Yadong Xi | Xiaoxi Mao | Le Li | Lei Lin | Yanjiang Chen | Shuhan Yang | Xuhan Chen | Kailun Tao | Zhi Li | Gongzheng Li | Lin Jiang | Siyan Liu | Zeng Zhao | Minlie Huang | Changjie Fan | Zhipeng Hu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations

There is a long history of research related to automated story generation, dating back as far as the 1970s. Recently, the rapid development of pre-trained language models has spurred great progress in this field. Equipped with GPT-2 and the latest GPT-3, AI Dungeon has been seen as a famous example of the powerful text generation capabilities of large-scale pre-trained language models, and a possibility for future games. However, as a game, AI Dungeon lacks incentives for players and relies entirely on players to explore on their own. This makes players’ enthusiasm decline rapidly. In this paper, we present an open-ended text adventure game in Chinese, named KuiLeiXi. In KuiLeiXi, players need to interact with the AI until the pre-determined plot goals are reached. With the plot goals introduced, players have a stronger incentive to explore ways to reach them, while the AI’s abilities are not abused to generate harmful content. This limited freedom allows the game to be integrated as a part of a romance simulation mobile game, Yu Jian Love. Since KuiLeiXi was launched, it has received a lot of positive feedback from more than 100,000 players. A demo video is available at https://youtu.be/DyYZhxMRrkk.

2018

ITNLP-ARC at SemEval-2018 Task 12: Argument Reasoning Comprehension with Attention
Wenjie Liu | Chengjie Sun | Lei Lin | Bingquan Liu
Proceedings of the 12th International Workshop on Semantic Evaluation

Reasoning is an important topic with many applications in natural language processing. Semantic Evaluation (SemEval) 2018 Task 12, “The Argument Reasoning Comprehension”, is devoted to research on natural language reasoning. For this task, we propose a novel argument reasoning comprehension system, ITNLP-ARC, which uses neural networks to solve this problem. In our system, an LSTM model encodes both the premise sentences and the warrant sentences. An attention model merges the two premise sentence vectors. By comparing the similarity between the attention vector and each of the two warrant vectors, we choose the warrant with the higher similarity as our system’s final answer.
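
The selection step described in this abstract can be sketched as follows. The sketch takes the sentence vectors as given (standing in for LSTM encodings) and assumes dot-product attention against a query vector and cosine similarity as the comparison. The abstract names neither choice, so both, along with the function names, are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot(a, b) / (norm_a * norm_b)


def attend(vectors, query):
    """Merge vectors with softmax weights from dot products with `query`
    (assumed attention form; the query would be learned in a real model)."""
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def choose_warrant(premise_vecs, warrant_vecs, query):
    """Return the index of the warrant most similar to the merged premise
    vector, together with all similarity scores."""
    merged = attend(premise_vecs, query)
    scores = [cosine(merged, w) for w in warrant_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy vectors standing in for LSTM sentence encodings.
premises = [[1.0, 0.0], [0.0, 1.0]]
warrants = [[1.0, 1.0], [1.0, -1.0]]
best_index, similarities = choose_warrant(premises, warrants, query=[0.0, 0.0])
```

Here a zero query gives both premise vectors equal attention weight, so the merged vector points along the first warrant and that warrant is selected.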

2017

ITNLP-AiKF at SemEval-2017 Task 1: Rich Features Based SVR for Semantic Textual Similarity Computing
Wenjie Liu | Chengjie Sun | Lei Lin | Bingquan Liu
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

Semantic Textual Similarity (STS) aims to measure the degree of equivalence in the underlying semantics of a sentence pair. We propose a new system, ITNLP-AiKF, which we applied to Track 5 (English monolingual pairs) of SemEval-2017 Task 1, Semantic Textual Similarity. Our system uses rich features, including ontology-based, word-embedding-based, corpus-based, alignment-based, and literal-based features. We use these features to predict sentence pair similarity with a Support Vector Regression (SVR) model. Our system achieves a Pearson correlation of 0.8231, a competitive result in this track.

2015

Computing Semantic Text Similarity Using Rich Features
Yang Liu | Chengjie Sun | Lei Lin | Xiaolong Wang | Yuming Zhao
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation

yiGou: A Semantic Text Similarity Computing System Based on SVM
Yang Liu | Chengjie Sun | Lei Lin | Xiaolong Wang
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2010

CRF tagging for head recognition based on Stanford parser
Yong Cheng | Chengjie Sun | Bingquan Liu | Lei Lin
CIPS-SIGHAN Joint Conference on Chinese Language Processing