NeuronBlocks – Building Your NLP DNN Models Like Playing Lego

When building deep neural network models for natural language processing tasks, engineers often spend a great deal of effort on coding details and debugging, instead of focusing on model architecture design and hyper-parameter tuning. In this paper, we introduce NeuronBlocks, a deep neural network toolkit for natural language processing tasks. In NeuronBlocks, a suite of neural network layers is encapsulated as building blocks, which can easily be used to build complicated deep neural network models by configuring a simple JSON file. NeuronBlocks empowers engineers to build and train various NLP models in seconds, even without a single line of code. A series of experiments on real NLP datasets such as GLUE and WikiQA has been conducted, demonstrating the effectiveness of NeuronBlocks.


Introduction
Deep neural network (DNN) models have been widely used for solving various natural language processing (NLP) tasks, such as text classification, slot tagging, and question answering. However, when engineers try to address specific NLP tasks with DNN models, they often face the following challenges: 1) Many DNN frameworks to choose from, each with a high learning cost, such as TensorFlow, PyTorch, and Keras. 2) Heavy coding cost. Too many low-level details make DNN model code hard to debug. 3) Fast model architecture evolution. It is difficult for engineers to keep up with new architectures and understand the mathematical principles behind them. With the above challenges, engineers often spend too much time on code writing and debugging, which limits the efficiency of model building. Our practice shows that in real applications, it is usually more important for engineers to focus on data preparation than on model implementation, since high-quality data often brings more improvement than small variations of the model itself. Moreover, simple and standard models are often more robust than complicated variants, and are also easier to maintain, share, and reuse. For tasks that do require complex models, the agility of model iteration becomes the key: once training data has been collected, it is critical to be able to try different architectures and parameters with little effort.
Through analyzing NLP jobs submitted to a commercial centralized GPU cluster, we found that about 87.5% of NLP-related jobs belong to common tasks, such as sentence classification and sequence labelling. By further analysis, we found that more than 90% of models can be composed of common components such as an embedding layer, CNN/RNN, or Transformer. Based on this finding, we asked whether it is possible to provide a suite of reusable and standard components, as well as a set of popular models for common tasks. This would greatly boost model-training productivity by saving lots of duplicated effort.
Based on the above motivations, we developed NeuronBlocks, a DNN toolkit for NLP tasks, which provides a suite of reusable and standard components as well as a set of popular models for common tasks. In Block Zoo, we abstract and encapsulate the most commonly used components of deep neural networks into standard and reusable blocks. These blocks act like building blocks to form various complicated neural network architectures for different NLP tasks. In Model Zoo, we provide a rich set of model architectures covering popular NLP tasks, in the form of simple JSON configuration files. With Model Zoo, users can easily navigate these model configurations, select an appropriate model for their target task, modify a few configurations if needed, and then start training immediately without any coding effort.
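To make the block-composition idea concrete, a model configuration in this style might chain blocks as in the following sketch; the keys and block names here (e.g. "layer", "BiLSTM") are illustrative rather than NeuronBlocks' exact schema:

```json
{
  "architecture": [
    { "layer": "Embedding", "conf": { "dim": 300 },
      "inputs": ["sentence"], "output": "emb" },
    { "layer": "BiLSTM", "conf": { "hidden_dim": 256 },
      "inputs": ["emb"], "output": "encoded" },
    { "layer": "Linear", "conf": { "num_classes": 2 },
      "inputs": ["encoded"], "output": "output" }
  ]
}
```

Each entry names a block from Block Zoo, its hyper-parameters, and how its inputs and outputs wire into the rest of the graph, so swapping an encoder is a one-entry change.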

Overall Framework
The overall framework of NeuronBlocks is shown in Figure 1, which consists of two major components: Block Zoo and Model Zoo. In Block Zoo, we provide commonly used neural network components as building blocks for model architecture design. Model Zoo consists of various DNN models for common NLP tasks, in the form of JSON configuration files.

Model Zoo
NeuronBlocks supports four major NLP tasks (more will be added):
 Text classification, including single-sentence and sentence-pair classification, such as query domain classification, intent classification, natural language inference, etc.
 Sequence tagging. Given a sentence, classify each token in the sequence into predefined categories. Common tasks include NER, POS tagging, slot tagging, etc.
 Regression. Unlike classification, the output of a regression task is a continuous real number.
 Extractive machine reading comprehension. Given a question and a passage, predict a span of the passage as the answer to the question.
"loss": { "losses": [ { "type": "CrossEntropyLoss", "conf": { "size_average": true }, "inputs": ["output", "label"] } ] }  Metrics: Task metrics are defined here.Users can define a metric list that they want to check during model testing and validation stages.The first metric in the metric list is what our toolkit uses to select the best model during training.

Workflow

Model Visualizer
To check the correctness of model configuration and visualize model architecture, NeuronBlocks provides a visualization tool.For example, the model graph of the configuration file we mentioned in Section 3.4 is visualized as Figure 3.

Experiments
To verify the performance of NeuronBlocks, we conducted experiments on common NLP benchmarks, including the GLUE benchmark (Wang et al., 2018) and the WikiQA corpus (Yang, Yih, & Meek, 2015). Results show that models built with NeuronBlocks obtain reliable and competitive results on various tasks, while greatly boosting productivity.

GLUE Benchmark
The General Language Understanding Evaluation (GLUE) benchmark (Wang et al., 2018) is a collection of various natural language understanding tasks. We performed experiments on the GLUE benchmark tasks using BiLSTM and attention-based models. We report NeuronBlocks' results on the development sets, since GLUE does not distribute labels for the test sets. The detailed results are shown in Table 1.

Platform compatibility requirement. It requires extra coding work for a model to run on different platforms, such as Linux/Windows or GPU/CPU.

For model hyper-parameter tuning or model structure modification, users only need to change the JSON configuration file. Experienced users can even add new customized blocks to Block Zoo, and these new blocks can then be easily used in model architecture design. NeuronBlocks also supports model training on GPU management platforms such as PAI.
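For instance, adjusting a learning rate or dropout value is a one-line edit in the configuration file; the section and key names below are illustrative of such an edit rather than NeuronBlocks' exact schema:

```json
"training_params": {
  "batch_size": 32,
  "learning_rate": 0.001,
  "dropout": 0.3
}
```

Because the whole experiment is captured in one JSON file, each tuning run is just a copy of the file with a few values changed.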

Figure 3. Model visualizer result of a sample model.
NeuronBlocks minimizes duplicated coding effort during DNN model training. With NeuronBlocks, users only need to define a simple JSON configuration file to design and train models, which enables engineers to focus on high-level model design instead of programming language and framework details.
To train a DNN model or run inference with an existing model, users need to define the model architecture, training data, and hyper-parameters in a JSON configuration file. In this subsection, we introduce the details of the configuration file through an example for a question-answer matching task. The configuration file consists of Inputs, Outputs, Training Parameters, Model Architecture, Loss, Metrics, etc.
 Inputs: This part defines the data input configuration, including training data location, data schema, model inputs, model targets, etc.
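An inputs section in this spirit might look like the sketch below; all file paths and column names are hypothetical, and the keys are illustrative rather than NeuronBlocks' exact schema:

```json
"inputs": {
  "data_paths": {
    "train_data_path": "./data/train.tsv",
    "valid_data_path": "./data/valid.tsv"
  },
  "file_header": ["question", "answer", "label"],
  "model_inputs": { "question": ["question"], "answer": ["answer"] },
  "target": ["label"]
}
```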
The whole workflow of leveraging NeuronBlocks for model building is quite simple. Users only need to write a JSON configuration file, either by using existing models from Model Zoo or by building new models with blocks from Block Zoo. This configuration file is shared across training, testing, and prediction. The commands are as follows:
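The three stages are driven by the same configuration file. The script names below follow the NeuronBlocks repository, while the configuration path is a placeholder:

```shell
# Training, testing, and prediction all take the same JSON configuration.
python train.py --conf_path=model.json
python test.py --conf_path=model.json
python predict.py --conf_path=model.json
```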

Table 1: NeuronBlocks' results on the GLUE benchmark's development sets. As described in (Wang et al., 2018), for CoLA we report Matthews correlation; for QQP, accuracy and F1; for MNLI, accuracy averaged over the matched and mismatched development sets; for all other tasks, accuracy. All values have been scaled by 100.

Table 2. NeuronBlocks' results on WikiQA.

As future work, we plan to add AutoML support for automatic neural network architecture search. Currently, NeuronBlocks provides users with an easy-to-use experience for building models on top of Model Zoo and Block Zoo. With AutoML support, automatic model architecture design could be achieved for a specific task and dataset.

Conclusion
In this paper, we introduced NeuronBlocks, a DNN toolkit for NLP tasks built on PyTorch. NeuronBlocks provides an easy-to-use model training and inference experience. Model building time can be significantly reduced from days to hours, or even minutes. We also provide a suite of pre-built DNN models for popular NLP tasks in Model Zoo, so the majority of users can simply pick one from Model Zoo to start model training. Experienced users can also contribute new blocks to Block Zoo, which greatly benefits advanced users in model building. Finally, a series of experiments on real NLP tasks has been conducted to verify the effectiveness of the toolkit.