OpenUE: An Open Toolkit of Universal Extraction from Text

Natural language processing covers a wide variety of tasks with token-level or sentence-level understanding. In this paper, we provide a simple insight that most tasks can be represented in a single universal extraction format. We introduce a prototype model and provide an open-source and extensible toolkit called OpenUE for various extraction tasks. OpenUE allows developers to train custom models to extract information from text and supports quick model validation for researchers. Besides, OpenUE provides various functional modules to maintain sufficient modularity and extensibility. Beyond the toolkit, we also deploy an online demo with restful APIs to support real-time extraction without training or deployment. The online system can extract information for various tasks, including relational triple extraction, slot & intent detection, event extraction, and so on. We release the source code, datasets, and pre-trained models to promote future research at http://github.com/zjunlp/openue.


Introduction
A large number of natural language processing (NLP) tasks exist to analyze various aspects of human language. Most of them focus on token-level classification (e.g., named entity recognition, slot filling, and argument role classification) or sentence-level understanding (e.g., relation classification, intent detection, and event classification). Previous researchers usually use specifically designed neural network architectures for those tasks. Note that most of those tasks share similar encoder and decoder modules (Jiang et al., 2019). It would be beneficial to achieve a unified model for all diverse information extraction tasks without task-specific architectures.
Intuitively, we rethink most of the previous tasks and find that they fall into two categories: token-oriented tasks, where the goal is to predict labeled spans (e.g., named entities, slots, aspects), and sentence-oriented tasks, where the goal is to predict labels regarding the semantic understanding of sentences (e.g., relations, intents, sentiments). The commonality of these tasks inspires us to ask whether there is a universal framework. Moreover, in the domain of efficient human annotation interfaces, it is already standard to use unified representations for a wide variety of NLP tasks. Taking the BRAT (Stenetorp et al., 2012) annotation as an example, this framework has a single unified format which consists of spans (e.g., the span of an entity) and labeled relations between the spans (e.g., "born-in" and "live-in").
Motivated by this, we formulate those tasks regarding both tokens and sentences as universal extraction and design a simple unified model. Our prototype model studies the possibility of bridging the gap between tasks with a single architecture and provides insight toward unified natural language understanding. Furthermore, as there is a lack of a practical and stable toolkit to support the implementation, deployment, and evaluation of those tasks, we develop a toolkit which is a complement to existing toolsets such as spaCy for named entity recognition (NER), TagMe (Ferragina and Scaiella, 2010) for entity linking (EL), OpenKE (Han et al., 2018) for knowledge embedding, Stanford OpenIE (Angeli et al., 2015) for open information extraction, and OpenNRE for relation extraction.
[Figure 1: Example application scenarios, including event extraction and relational triple extraction, for the sentence "Jack is married to the microbiologist known as Dr. Germ in the USA."]

To be specific, we develop an open and extensible toolkit named "OpenUE". The toolkit prioritizes operational efficiency based on TensorFlow and PyTorch (the PyTorch version is under development), which supports quick model training and validation. Besides, OpenUE is able to meet individual requirements of incorporating new models through system encapsulation and model extensibility. OpenUE provides interfaces for developers aiming at custom models; thus, it is convenient to start up an extraction system based on OpenUE without writing tedious glue code or knowing too many technical details. We provide an online system to extract structured relational facts, slots, intents, and events from text with friendly interactive interfaces and fast reaction speed. We will provide maintenance to meet new requests, add new tasks, and fix bugs in the future. This toolkit may benefit both researchers and industry developers. We highlight our contributions as follows:

• We provide a simple prototype implementation of one single model to perform various NLP tasks.
• We provide an open and extensible toolkit to train, evaluate, and serve with multilingual support for universal extraction.
• We open-source our code and release dataset, as well as pre-trained models with open restful APIs for future researchers.

Application Scenarios
OpenUE is designed for various tasks, including relational triple extraction, slot filling, intent detection, event extraction, and knowledge extraction from the Web. As shown in Figure 1, we give some examples of these application scenarios.

Relational Triple Extraction
Relational Triple Extraction is an essential task in Information Extraction (IE) for Natural Language Processing (NLP) and Knowledge Graphs (KG) (Zhang et al., 2018b; Yu et al., 2017; Nan et al., 2020; Zhang et al., 2020a; Ye et al., 2020; Zhang et al., 2020b), which aims to detect a pair of entities along with their relation from unstructured text. For instance, given the sentence "Paris is known as the romantic capital of France.", an ideal relational triple extraction system should extract the relational triple (Paris, Capital of, France), in which Capital of is the relation between Paris and France. In this paper, we provide a simple implementation which firstly classifies relations with the sentence and then conducts sequence labeling to extract entities. The relation-first approach is beneficial in the real-world setting, as most sentences contain NA relations; therefore, OpenUE can filter out noisy candidates that do not have relations to improve computational efficiency.
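A minimal sketch of this relation-first pipeline is shown below. The names `extract_triples` and the toy `classify`/`tag` stand-ins are illustrative, not the actual OpenUE API; in the toolkit, the two callables would be the trained classification and sequence labeling modules.

```python
def extract_triples(sentence, classify_relations, tag_entities):
    """Relation-first extraction: classify candidate relations for the
    sentence, skip NA, then run sequence labeling per surviving relation."""
    triples = []
    for relation in classify_relations(sentence):
        if relation == "NA":  # filter out noisy no-relation candidates early
            continue
        for head, tail in tag_entities(sentence, relation):
            triples.append((head, relation, tail))
    return triples

# Toy stand-ins for the two trained modules:
classify = lambda s: ["Capital of"] if "capital" in s else ["NA"]
tag = lambda s, r: [("Paris", "France")]

extract_triples("Paris is known as the romantic capital of France.",
                classify, tag)
# → [("Paris", "Capital of", "France")]
```

Because most sentences are classified as NA, the (more expensive) sequence labeling step never runs for them, which is the source of the efficiency gain described above.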
Moreover, we also provide a simple implementation of knowledge extraction from the Web. We implement a crawler to obtain raw web pages and apply our approach to extract factual knowledge. Note that recent knowledge graphs are far from complete, while vast numbers of facts exist in web pages. OpenUE is suitable to serve as a schema-based never-ending learner from the Web.

Event Extraction
Extracting events from natural language text is an essential yet challenging task for natural language understanding. When given a document, event extraction systems need to recognize event triggers with their specific types and their corresponding arguments with the roles. In real-world settings, classifying documents with specific event types and extracting arguments with role types is necessary. We integrate event extraction into OpenUE (without trigger identification).

Slot Filling and Intent Detection
Natural language understanding (NLU) is critical to the performance of goal-oriented spoken dialogue systems. NLU typically includes the intent detection and slot filling tasks, which aim to form a semantic parse for user utterances. For example, given an utterance from the user, slot filling annotates the utterance at the word level, indicating the slot type mentioned by a specific word, such as the slot artist mentioned by the word westbam as shown in Figure 1. At the same time, intent detection works at the utterance level to assign intent label(s) to the whole sentence. As slot filling and intent detection rely on both token-level and sentence-level understanding, we integrate this task into OpenUE.
Note that OpenUE can also be applied to more related tasks such as aspect-based sentiment analysis (Pontiki et al., 2016), semantic role labeling (Carreras and Màrquez, 2005), and so on.

Toolkit Design and Implementation
To implement a single prototype model for all tasks, we introduce our OpenUE approach, as Figure 2 shows. We design the prototype implementation with separated sentence classification and sequence labeling modules based on the following three empirical observations: 1) joint optimization of sequence labeling and sentence classification requires labor-intensive hyper-parameter fine-tuning; 2) performing sentence classification first can filter out vast numbers of instances, which reduces computation for sequence labeling; 3) sentence labels with additional information (in a machine reading comprehension style) can provide more signals for sequence labeling (Li et al., 2019).
To design the toolkit, we build a unified underlying platform. OpenUE encapsulates various data processing and training strategies, which implies that developers can maximize the reuse of code to avoid redundant and unnecessary model implementations. We design OpenUE based on TensorFlow and PyTorch, enabling developers to train models on GPUs for operational efficiency. We introduce the model and design details in the following sections.

Tokenization
The tokenization module is designed for tokenizing input text into several tokens. In OpenUE, we implement both word-level tokenization and subword-level tokenization. These two kinds of tokenization can satisfy most tokenization demands; thus, developers can avoid spending too much time writing glue code for data processing. Developers can also build a custom tokenizer by extending the BasicTokenizer class and implementing specific tokenization operations.
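For example, a custom tokenizer can be obtained by overriding the tokenization hook. The sketch below is illustrative: the real BasicTokenizer ships with the toolkit, and its exact interface is an assumption here.

```python
class BasicTokenizer:
    """Simplified stand-in for the toolkit's base tokenizer interface
    (the actual class is provided by OpenUE; only the extension pattern
    is shown here)."""
    def tokenize(self, text):
        # Default: whitespace word-level tokenization.
        return text.split()

class CharTokenizer(BasicTokenizer):
    """Custom tokenizer obtained by overriding the tokenize hook,
    e.g. character-level splitting for Chinese text."""
    def tokenize(self, text):
        return [ch for ch in text if not ch.isspace()]

BasicTokenizer().tokenize("Paris is nice")  # → ["Paris", "is", "nice"]
CharTokenizer().tokenize("北京 欢迎")        # → ["北", "京", "欢", "迎"]
```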

Classification
The classification module is designed for the sentence-level task. We adopt pre-trained language models as default instance encoders in OpenUE. For each sentence x = {w_1, w_2, . . . , w_n} in the training set, where w_i ∈ x is the i-th word token in sentence x, we first construct the input sequence in the form {[CLS], w_1, w_2, . . . , w_n, [SEP]}. Then we leverage the output representation of [CLS] to encode the entire sentence. We apply an MLP layer with a cross-entropy loss to perform sentence classification. In OpenUE, we have also implemented other common encoders such as XLNet (Yang et al., 2019).
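The input construction and the cross-entropy objective can be sketched as follows. The function names are illustrative, not the toolkit's API; in the real module, the logits come from an MLP over the [CLS] hidden state of a pre-trained encoder.

```python
import math

def build_classification_input(words):
    """Wrap the token sequence as {[CLS], w_1, ..., w_n, [SEP]};
    the [CLS] position's hidden state later encodes the whole sentence."""
    return ["[CLS]"] + list(words) + ["[SEP]"]

def cross_entropy(logits, gold):
    """Cross-entropy loss over the class logits produced by the MLP head,
    computed with the numerically stable log-sum-exp trick."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[gold]

build_classification_input(["Paris", "is", "nice"])
# → ["[CLS]", "Paris", "is", "nice", "[SEP]"]
```

With uniform logits the loss reduces to log of the class count (e.g., `cross_entropy([0.0, 0.0], 0)` equals `log 2`), and it shrinks as the gold class's logit grows, which is what training drives.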

Sequence Labeling
The sequence labeling module is designed for the token-level task. We utilize the same encoder from the previous section to represent instances. We concatenate the output of the classification module (e.g., relations, event types, or intents) with the raw sentence as input for sequence labeling. Specifically, taking relational triple extraction as an example, the input is {[CLS], relation, [SEP], w_1, w_2, ..., w_n}. To perform sequence labeling, we provide different kinds of implementations. Traditionally, once the hidden states of the words in the sentence are learned, it is convenient to apply the softmax function to obtain final logits. Moreover, we also provide sequence labeling implementations such as CRF (Ye et al., 2009) to model the dependencies between adjacent tags via transition patterns.
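A sketch of the conditioned input and the softmax-style decoding is given below (names are illustrative; a CRF implementation would replace the per-token argmax with Viterbi decoding over transition scores).

```python
def build_labeling_input(relation, words):
    """Condition the tagger on the classifier's output by prepending it:
    {[CLS], relation, [SEP], w_1, ..., w_n}."""
    return ["[CLS]", relation, "[SEP]"] + list(words)

def greedy_decode(logits_per_token, tags):
    """Per-token argmax over the tag logits (equivalent to taking the
    softmax-maximizing tag at each position, independently)."""
    return [tags[max(range(len(l)), key=l.__getitem__)]
            for l in logits_per_token]

build_labeling_input("Capital of", ["Paris", "is", "nice"])
# → ["[CLS]", "Capital of", "[SEP]", "Paris", "is", "nice"]
greedy_decode([[2.0, 0.1], [0.3, 1.5]], ["B-ENT", "O"])
# → ["B-ENT", "O"]
```

Prepending the predicted relation is what lets a single tagger produce different entity spans for different relations over the same sentence.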

Extractor
To obtain final results, we implement an extractor module to combine the outputs of classification and sequence labeling. For entity and relation extraction, we utilize greedy methods to combine the final results. For other tasks such as slot filling and intent detection, we group those outputs as final predictions.
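Assuming a BIO tagging scheme for the sequence labeling output, the greedy span merging can be sketched as follows (`bio_to_spans` is an illustrative name, not the toolkit's actual extractor API):

```python
def bio_to_spans(tokens, tags):
    """Greedily merge BIO tags into labeled spans, in the spirit of the
    extractor combining sequence-labeling output into final predictions."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags + ["O"]):        # sentinel flushes last span
        if tag.startswith("B-") or tag == "O":
            if start is not None:                 # close the open span
                spans.append((label, " ".join(tokens[start:i])))
                start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        elif tag.startswith("I-") and start is None:
            start, label = i, tag[2:]             # tolerate stray I- tags
    return spans

bio_to_spans(["Paris", "is", "in", "France"],
             ["B-HEAD", "O", "O", "B-TAIL"])
# → [("HEAD", "Paris"), ("TAIL", "France")]
```

For relational triple extraction, the resulting head/tail spans are then paired with the relation predicted by the classification module; for slot filling, the spans are grouped with the detected intent.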

Experiment and Evaluation
In this section, we evaluate our toolkit OpenUE on several datasets in different tasks. The experimental results illustrate that our implementation with OpenUE can achieve comparable or even better performance compared to some state-of-the-art results.

Relational Triple Extraction
We carry out experiments on four datasets of relational triple extraction: NYT (Riedel et al., 2010), WebNLG (Gardent et al., 2017), SKE, and ChMedIE. The NYT dataset was originally produced by the distant supervision method. It consists of 1.18M sentences with 24 predefined relation types. The WebNLG dataset was originally created for Natural Language Generation (NLG) tasks and adapted by (Zeng et al., 2018) for the relational triple extraction task. It contains 246 predefined relation types. Different from the two previous English datasets, SKE is a Chinese dataset for information extraction, which was released in the 2019 Language and Intelligence Challenge. SKE contains 50 relation types, and its training texts exceed 200,000. We build our training set, development set, and test set by randomly selecting 50,000, 5,000, and 5,000 texts. ChMedIE is also a Chinese dataset for information extraction in the medical domain. We crawl the corpus from a Chinese health website and build this dataset via distant supervision. It contains 4 relation types.
We compare our OpenUE with four baselines. Tagging (Zheng et al., 2017) is an end-to-end method with a novel tagging scheme. CopyR (Zeng et al., 2018) is a Seq2Seq learning framework with a copy mechanism. HRL (Takanobu et al., 2019) applies hierarchical reinforcement learning to relational triple extraction.

Event Extraction

We compare our OpenUE with three baselines. DMCNN (Chen et al., 2015) uses dynamic multi-pooling to keep multiple events' information. dbRNN (Sha et al., 2018) adds dependency bridges over Bi-LSTM for event extraction. JMEE (Liu et al., 2018) proposes an approach which jointly extracts multiple event triggers and arguments by introducing syntactic shortcut arcs to enhance information flow and attention-based graph convolution networks to model graph information. From Table 3, we observe that OpenUE can achieve comparable results with JMEE.

Slot Filling and Intent Detection
We conduct experiments on two benchmark NLU datasets: the SNIPS Natural Language Understanding benchmark (SNIPS-NLU) and the Airline Travel Information Systems (ATIS) dataset (Tur et al., 2010). The SNIPS-NLU dataset is collected from the Snips personal voice assistant. There are 72 slot labels and 7 intent types in the SNIPS dataset. The ATIS dataset is widely used in NLU research and includes audio recordings of people making flight reservations. There are 120 slot labels and 21 intent types in the ATIS dataset.
We compare OpenUE with six baselines, including CNN TriCRF (Xu and Sarikaya, 2013). From Table 1, we observe that OpenUE can achieve comparable performance with Capsule-NLU.
In summary, we conclude that there exist general architectures for diverse tasks and OpenUE can achieve comparable performance compared with baselines.

Online System
Besides the toolkit, we also release an online system at http://openue.top. As shown in Figure 3, we train models in different scenarios with multilingual support (English and Chinese) and deploy them for online access. The online system can be directly applied to extract structured facts, events, and slots & intents from plain text. We also visualize the graph of relational triples and the probabilities of sentence logits (e.g., relation probabilities) to help analyze model performance.
Additionally, we deploy a schema-based never-ending learner that can extract factual knowledge from the Web. Our system has already obtained millions of facts. More details can be found at https://openue-docs.readthedocs.io/en/latest/.
Moreover, we provide open restful APIs for diverse tasks via OpenUE. More tasks, such as aspect-based sentiment analysis and semantic role labeling, as well as more domains, will be supported in the future.

Conclusion
We provide a simple insight that many NLP tasks can be represented in a single format. To this end, we provide a prototype model implementation of universal extraction and introduce an open and extensible toolkit, namely OpenUE. We conduct extensive experiments which demonstrate that the models implemented with OpenUE are efficient and effective, achieving comparable performance to state-of-the-art results. Furthermore, we also provide an online system with restful APIs to support real-time extraction without training and deployment. In the future, we plan to utilize multi-task learning or meta-learning algorithms to enhance extraction performance. We will provide long-term maintenance to fix bugs and meet new requests.