jhan014 at SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media

In this paper, we present two methods to identify and categorize offensive language on Twitter. In the first method, we establish a probabilistic model that evaluates the offensiveness level and target level of a sentence according to the different sub-tasks. In the second method, we develop a deep neural network consisting of bidirectional recurrent layers with Gated Recurrent Unit (GRU) cells and fully connected layers. Comparing the two methods, we find that each has its own advantages and drawbacks while achieving similar accuracy.


Introduction
With the popularity of social media platforms like Twitter, offensive language has become a serious problem (Zampieri et al., 2019b) on these platforms. Users face abusive behavior from others on social media from time to time. To address this problem, finding a method to identify and categorize offensive language is an urgent need.
In this paper, two different methods, a deep learning method and a modified sentence offensiveness calculation (MSOC) method, are used to categorize the type and target of offensive language, and the differences in their results are revealed and analyzed.

Related Work
Deep learning method: Deep learning methods are widely used in natural language processing (Liu et al., 2016). Models such as recursive neural networks are commonly used to identify whether a sentence contains a certain emotion. In our work, we build a deep neural network with GRU layers and fully connected layers.
Offensive Content Filtering: The targets of offensive language can be understood through sentence structure (Silva et al., 2016) or lexical analysis (ElSherief et al., 2018). We take both sentence structure and the offensiveness level of words into consideration. Furthermore, we also pay attention to the special punctuation (such as @ and #) used in online social media.

Deep Neural Network
In the offensive language detection task, we developed a deep neural network based system with a binary cross-entropy output. System Design The system consists of bidirectional recurrent layers with Gated Recurrent Unit (GRU) cells and fully connected layers (Chung et al., 2014). Because the output of the last time-step is used as the embedding of a sentence, we apply zero padding at the beginning of each sequence when constructing the feature matrix. The system architecture is shown in Table 1.
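For concreteness, a minimal Keras sketch of such an architecture is given below; the layer widths, vocabulary size, and sequence length are illustrative placeholders, since the actual configuration is specified in Table 1.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

MAX_LEN = 50        # assumed maximum tweet length in tokens
VOCAB_SIZE = 20000  # placeholder vocabulary size
EMBED_DIM = 100     # matches the 100-d GloVe vectors used later

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    # Embedding lookup; in our system this is initialized with pretrained GloVe.
    layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    # Bidirectional GRU: the output at the last time-step serves as the
    # sentence embedding fed to the fully connected layers.
    layers.Bidirectional(layers.GRU(64)),
    layers.Dense(32, activation="relu"),
    # Single sigmoid unit with binary cross-entropy for OFF vs. NOT.
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Zero-pad at the *beginning* of each sequence ("pre" padding) so the last
# time-step always corresponds to the end of the actual sentence.
toy_ids = [[3, 17, 42], [8, 5]]  # toy token-id sequences
X = tf.keras.preprocessing.sequence.pad_sequences(toy_ids, maxlen=MAX_LEN, padding="pre")
print(model.predict(X).shape)  # (2, 1)
```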
Optimization Steps Parameters in both the RNN layers and the dense layers are initialized by the Xavier initialization method (Glorot and Bengio, 2010). When training the neural network, an early stopping method with a tolerance of 2 iterations is applied to monitor the process. Once early stopping is triggered, we manually lower the learning rate to 1/10 of its value to dampen oscillation and search for a smaller minimum loss.
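A minimal sketch of this training schedule using standard Keras callbacks follows; using ReduceLROnPlateau to realize the 1/10 learning-rate cut is our assumption about how the manual step could be automated, not the paper's stated implementation.

```python
import tensorflow as tf

# Glorot (Xavier) uniform initialization is already the Keras default for
# the GRU and Dense kernels, matching the initialization described above.
callbacks = [
    # Stop once validation loss fails to improve for 2 consecutive epochs.
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=2,
                                     restore_best_weights=True),
    # Cut the learning rate to 1/10 when progress stalls, standing in for
    # the manual reduction performed after early stopping triggers.
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.1,
                                         patience=2, min_lr=1e-6),
]
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=50, callbacks=callbacks)
```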

Modified Sentence Offensiveness Calculation
Based on the sentence offensiveness calculation method of Chen et al. (2012), we develop a model to evaluate sentence offensiveness.
Offensiveness Dictionary Construction We can always find pejoratives, profanities, or obscenities in offensive tweets. Strong profanities are almost always offensive when aimed directly at users or related to certain topics (marked with #), but many other weak pejoratives and obscenities may also be offensive. Word offensiveness is defined (Chen et al., 2012) as follows: for each offensive word $w$, its offensiveness is $O_w = a_2$ if $w$ is strongly offensive and $O_w = a_1$ if $w$ is weakly offensive, where $0 < a_1 < a_2 < 1$, since strongly offensive words are more offensive than weakly offensive ones.
Syntactic Intensifier Detection We also build syntactic features from intensifiers (Zhang et al., 2009). In a sentence, the words syntactically related to an offensive word $w$ are collected in an intensifier set $i_w = \{c_1, \dots, c_k\}$. Each word $c_j$ has an intensify value $d_j$, which is larger when $c_j$ refers to a user than when it modifies another offensive word, because offensive words used to describe users are more offensive than those used to describe other offensive words. The intensifier value $I_w$ for the offensive word $w$ is then calculated as $I_w = \sum_{j=1}^{k} d_j$.
Sentence Level Offensiveness Value Consequently, the offensiveness value of a sentence $s$ becomes a linear combination of the words' offensiveness weighted by their intensifiers, $O_s = \sum_{w \in s} I_w \, O_w$ (with $I_w = 1$ when $w$ has no intensifiers). From the training data, we learn two thresholds $\theta_1$ and $\theta_2$ ($\theta_1 > \theta_2$) and apply them to each sentence $s$: if the offensiveness value is greater than $\theta_1$, the language is classified as offensive, while if it is smaller than $\theta_2$ it is classified as not offensive. Otherwise, the result follows a probabilistic distribution.
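A simplified sketch of this scoring rule is shown below; the dictionary entries, intensify values $d_j$, and thresholds are illustrative placeholders rather than the learned values, and the syntactic relations are assumed to be given.

```python
import random

# Illustrative dictionary: strongly offensive words get a2, weak ones a1,
# with 0 < a1 < a2 < 1 as in the definition above.
A1, A2 = 0.5, 0.9
OFFENSIVE = {"idiot": A1, "scum": A2}  # placeholder entries

# Intensify value d_j: larger when the related word is a user mention than
# when it modifies another offensive word.
D_USER, D_WORD = 2.0, 1.2

THETA1, THETA2 = 0.9, 0.4  # thresholds learned from training data


def sentence_offensiveness(tokens, relations):
    """relations maps each offensive word to the words syntactically tied to it."""
    score = 0.0
    for w in tokens:
        if w not in OFFENSIVE:
            continue
        # I_w = sum of intensify values over the intensifier set i_w,
        # defaulting to 1 when w has no syntactically related words.
        i_w = sum(D_USER if c.startswith("@") else D_WORD
                  for c in relations.get(w, [])) or 1.0
        score += i_w * OFFENSIVE[w]
    return score


def classify(tokens, relations):
    o_s = sentence_offensiveness(tokens, relations)
    if o_s > THETA1:
        return "OFF"
    if o_s < THETA2:
        return "NOT"
    # Between the two thresholds the label follows a probabilistic rule.
    p_off = (o_s - THETA2) / (THETA1 - THETA2)
    return "OFF" if random.random() < p_off else "NOT"


print(classify(["@user", "is", "an", "idiot"], {"idiot": ["@user"]}))  # OFF
```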
When solving the other sub-tasks, this method can also be used by changing the dictionary and redefining the target word list.

Data
We use the datasets of Zampieri et al. (2019a) and apply the following methods to preprocess and transform the data.

Preprocessing
The raw Twitter data is preprocessed by a data pipeline. All information irrelevant to the word vectors, such as stop words and emojis, is stripped, and the output of the pipeline is a sequence of lower-case stemmed words.
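A minimal version of such a pipeline, assuming NLTK's English stop-word list and Porter stemmer as concrete stand-ins (the exact tools are not specified above):

```python
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

nltk.download("stopwords", quiet=True)

STOP = set(stopwords.words("english"))
STEM = PorterStemmer()

def preprocess(tweet):
    # Lower-case and strip URLs and non-alphanumeric symbols (incl. emojis),
    # keeping @ and # since they carry signal on social media.
    tweet = tweet.lower()
    tweet = re.sub(r"https?://\S+", " ", tweet)
    tweet = re.sub(r"[^a-z0-9@# ]", " ", tweet)
    # Tokenize, drop stop words, and stem what remains.
    return [STEM.stem(t) for t in tweet.split() if t and t not in STOP]

print(preprocess("Loving this! But @USER is SO annoying... 😡 #drama"))
# e.g. ['love', '@user', 'annoy', '#drama']
```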

Word Embedding
A word embedding step is applied to transform the text into numeric input for the deep neural network. 100-dimensional Global Vectors (GloVe) word embeddings trained on Twitter data are applied in this study, considering the trade-off between performance and training efficiency (Pennington et al., 2014). We also explored trainable embedding layers, but the pretrained embeddings outperform them because of the immense amount of information in GloVe's Twitter training set of 27 billion tokens.
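Building the embedding matrix from the pretrained vectors might look like the following sketch; the file name follows the public GloVe Twitter release, and `word_index` is assumed to be the tokenizer's word-to-id mapping.

```python
import numpy as np

EMBED_DIM = 100

def load_glove(path, word_index):
    """Fill an embedding matrix with pretrained GloVe vectors."""
    # Row 0 is reserved for the zero-padding token.
    matrix = np.zeros((len(word_index) + 1, EMBED_DIM))
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, vec = parts[0], parts[1:]
            if word in word_index:
                matrix[word_index[word]] = np.asarray(vec, dtype="float32")
    return matrix

# File name from the public GloVe Twitter release (27B tokens, 100-d):
# embedding_matrix = load_glove("glove.twitter.27B.100d.txt", word_index)
# model.layers[0].set_weights([embedding_matrix])
```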

Sub-task A - Offensive language identification
When identifying whether a sentence is offensive or not, the two methods show great differences even though their accuracy and F1-scores are close (see Table 2). The RNN method produces more type I errors (see Figure 1), meaning the model classifies some non-offensive sentences as offensive. Since the original dataset is unbalanced, the neural network may not have enough non-offensive training examples to learn from; consequently, it cannot capture the features and structure of non-offensive sentences.
In the MSOC method, this problem is alleviated. Owing to the fixed, human-defined offensiveness dictionary, non-offensive sentences are not easily misclassified as offensive. However, since some offensive words appearing in the dataset are not defined in the dictionary, there are still many type II errors (see Figure 2).

Sub-task B - Automatic categorization of offense types
The MSOC method outperforms the RNN method in every respect (see Table 3 and Figures 3, 4) when categorizing the type of offense. This is because targeted offensive language usually has a different sentence structure from untargeted offensive language, which makes sentence structure a highly accurate cue for categorizing the offense type. In detail, a targeted sentence usually contains third-person pronouns such as "him", "her", "it", and "them". Moreover, most targeted tweets contain special punctuation such as @ mentions and also refer to trending topics marked with #.

Sub-task C - Offense target identification
In offense target identification, the RNN method, although it has accuracy and F1 score similar to the MSOC method (see Table 4), fails to classify any of the test sentences into the 'OTH' class (see Figure 5). The main reason is that the 'OTH' class is less distinctive than the other two classes and also has the smallest share of the data. In contrast, the MSOC method successfully classifies some test sentences into the 'OTH' class (see Figure 6), which may be attributed to the predefined dictionary and sentence-structure features.

Conclusion
RNN is an easy-to-implement and efficient method for classification problems in natural language processing. In this task, the RNN shows acceptable results but has some obvious drawbacks, such as an inflated recall at the cost of precision when handling unbalanced data, and a failure to classify a class that lacks distinctive features. The MSOC method, on the contrary, gives classification results of similar quality. Even though MSOC does not improve the accuracy or F1 score of classification to a great extent, we believe that combining it with the deep learning method can yield better results on similar problems in the future.