CIC at SemEval-2019 Task 5: Simple Yet Very Efficient Approach to Hate Speech Detection, Aggressive Behavior Detection, and Target Classification in Twitter

In recent years, the use of social media has increased enormously. Social media offers Internet users a friendly platform to express their views and opinions. Alongside these distinct communication opportunities, it also enables harmful behavior such as hate speech. Automatic online hate speech detection in its various aspects is a significant scientific problem. This paper presents the Instituto Politécnico Nacional (Mexico) approach for the SemEval 2019 Task 5 [HatEval 2019] (Basile et al., 2019) competition on Multilingual Detection of Hate Speech on Twitter. The goal of this paper is to detect (A) hate speech against immigrants and women and (B) aggressive behavior and target classification, both for English and Spanish. In the proposed approach, we used a bag-of-words model with preprocessing (stemming and stop word removal). We submitted two different systems, (i) CIC-1 and (ii) CIC-2, to the HatEval 2019 shared task. We used TF values in the first system and TF-IDF values in the second. The first system, CIC-1, got 2nd rank in subtask B for both the English and Spanish languages, with an EMR score of 0.568 for English and 0.675 for Spanish. The second system, CIC-2, was ranked 4th in subtask A and 1st in subtask B for the Spanish language, with a macro-F1 score of 0.727 and an EMR score of 0.705, respectively.


Introduction
Social media applications enable users to discover, create, and share content easily, without specific expertise. This has remarkably boosted the amount of user-generated data, within a process that some call the "democratization" of the web (Silva et al., 2016). Still, this liberty also permits the publication of content that is insulting and hurtful both to the ethics of democracy and to the rights of some categories of people: hate speech (HS). The term hate speech is defined in the literature as an expression "that is abusive, insulting, intimidating, harassing, and incites to violence, hatred, or discrimination. It is directed against people by their race, ethnic origin, religion, gender, age, physical condition, disability, sexual orientation, political conviction, and so forth." (Erjavec and Kovacic, 2012). HS has become a major issue for every kind of online platform where user-generated content appears: from the comments on any post to live chat in online games. Such material can isolate users and incite violence (Allan, 2013). Website operators such as Facebook and Twitter, and gaming companies like Runic Games, recognize that hateful content creates both practical and ethical problems, and they have attempted to discourage it by introducing changes to their platforms or policies.
As stated by Pew 1 , women experience more sexualized forms of abuse than men. Platforms such as Twitter are failing to act immediately against real-time misogyny and take a long time to delete hateful content 2 . Researchers have begun to focus on this problem and are building techniques to detect misogyny in real time (Fersini et al., 2018; Hewitt et al., 2016; Poland, 2016). Real-time HS against groups of people such as asylum seekers and migrants is common all over the world, but it is rarely investigated.
In this article, we worked on the detection of (A) hate speech against immigrants and women and (B) aggressive behavior and target classification, both for the English and Spanish languages at HatEval 2019. For this task, we submitted two systems: (i) CIC-1 and (ii) CIC-2. We used the bag-of-words model (plus stemming) with TF and TF-IDF as feature values, and then classified these vectors using various machine learning classifiers. Subtask A is ranked by macro-F1 score, whereas subtask B is ranked by EMR score. Our system CIC-1 got 2nd rank in subtask B for both the English (2nd out of 42 teams) and Spanish (2nd out of 25 teams) languages, with an EMR score of 0.568 for English and 0.675 for Spanish (accuracy of 0.766 for English and 0.787 for Spanish). The second system, CIC-2, was ranked 4th (out of 39 teams) in subtask A and 1st (out of 23 teams) in subtask B for the Spanish language, with a macro-F1 score of 0.727 and an EMR score of 0.705, respectively (accuracy of 0.727 in subtask A and 0.791 in subtask B).

Related work
A wide range of work has been devoted to HS detection. Xu et al. (2012) applied sentiment analysis to classify bullying in tweets, using Latent Dirichlet Allocation (LDA) topic models (Blei et al., 2003) to recognize relevant topics in these texts.
HS detection has been improved by a diverse range of features such as word n-grams (Nobata et al., 2016), character n-grams, paragraph embeddings (Nobata et al., 2016; Djuric et al., 2015), and average word embeddings. Silva et al. (2016) proposed to detect target groups with respect to their class and background on Twitter by looking for sentence structures like "I <intensity> hate <targeted group>".
Currently, interest is increasing in the identification of HS against women on the web (Ging et al., 2018). Initially, Hewitt et al. (2016) worked on the identification of HS against women in social media. Fox et al. (2015) studied the roles of anonymity and interactivity in responses to sexist content posted on social media, observing that reactions to hateful content posted against women differ between anonymous and known accounts. They concluded that content from anonymous accounts promotes more hostile sexism than content from known accounts.

Corpora and task description
The shared task on multilingual detection of hate speech on Twitter at HatEval 2019 provided two datasets, one for English and one for Spanish. We participated in both subtasks for both languages.

Corpora
The training corpus for each language consists of 9,000 labeled tweets, and the development dataset includes 1,000 unlabeled tweets. The statistics of the different labels are given in Table 1 for English and in Table 2 for Spanish. The corpora were manually labeled by different annotators along three dimensions:
• Hate speech (present vs. not present),
• Target range (whole group vs. individual),
• Aggressiveness (present vs. not present).
We describe these types in the following section.

Description of the subtasks
Subtask A: hate speech detection against immigrants and women. This is a binary classification problem, where the system must predict whether a given tweet with a given target (women or immigrants) expresses hatred or not. The systems are evaluated using standard metrics, including accuracy, precision, recall, and macro-F1 score. The submissions are ranked by macro-F1 score.

Subtask B: aggressive behavior and target classification. Here the system must first identify a hateful tweet (i.e., a tweet previously marked as containing HS against women or immigrants) as aggressive or not, and second recognize the harassment target, i.e., whether the tweet is directed against an individual or a group. The evaluation of subtask B was carried out using partial match and exact match ratio (EMR) (Basile et al., 2019). The submissions are ranked by EMR score. Each tweet is annotated along the following dimensions:
1. Hateful: an expression of dislike, something very unpleasant or filled with hatred.
2. Target range: the tweet contains offensive messages intentionally sent to a particular individual or to a group.
3. Aggressiveness: based on the author's intention to be aggressive, harmful, or even to provoke.
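The exact match criterion can be illustrated with a short sketch (our own illustration, not the official HatEval evaluation script): a tweet counts as correct only when all labels (HS, target range, aggressiveness) match the gold standard.

```python
import numpy as np

def exact_match_ratio(y_true, y_pred):
    """Fraction of instances whose complete label vector is predicted exactly."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(np.all(y_true == y_pred, axis=1)))

# Toy example: three tweets with (HS, target range, aggressiveness) labels.
# The third prediction gets the target range wrong, so only 2 of 3 match.
gold = [[1, 1, 0], [0, 0, 0], [1, 0, 1]]
pred = [[1, 1, 0], [0, 0, 0], [1, 1, 1]]
print(exact_match_ratio(gold, pred))
```

A system that predicts two of the three labels correctly on every tweet would score well under partial match but 0 under EMR, which is why subtask B is harder than its per-label accuracy suggests.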

Baselines
HatEval 2019 set up the following two baselines:
• SVC baseline: a linear Support Vector Machine (SVM) based on a TF-IDF representation.
• MFC baseline: a trivial model that assigns the most frequent label (estimated on the training set) to all instances in the test set.
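The MFC baseline can be reproduced in a few lines with scikit-learn's DummyClassifier (a sketch with toy data; this strategy ignores the feature matrix entirely and only looks at the training label distribution):

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Toy training data: features are irrelevant for this baseline,
# only the label frequencies matter (here, label 0 is most frequent).
X_train = np.zeros((6, 3))
y_train = [0, 0, 0, 0, 1, 1]

mfc = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
print(mfc.predict(np.zeros((2, 3))))  # every test instance gets the majority label 0
```

On an imbalanced dataset such as HatEval, this baseline can achieve deceptively high accuracy while its macro-F1 remains low, which is why subtask A is ranked by macro-F1.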

Description of our approach
In this section, we describe the two submitted systems, with the features and machine learning models used for this shared task.

Features
The pre-processed text was used to generate features for the machine learning (ML) algorithms. We used the well-known bag-of-words model; see, for example, (Sidorov, 2013; Sidorov, 2019). The first system uses TF values, and the second system uses TF-IDF values.
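The feature extraction pipeline can be sketched as follows. The stemmer and stop-word list below are stand-ins (the concrete implementations are not specified above), but the TF vs. TF-IDF distinction between CIC-1 and CIC-2 is as described:

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

def naive_stem(token):
    # Crude suffix stripping, a placeholder for a real stemmer
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

# Tokenization + English stop-word removal via scikit-learn, then stemming
base_analyzer = CountVectorizer(stop_words="english").build_analyzer()
analyzer = lambda doc: [naive_stem(t) for t in base_analyzer(doc)]

tweets = [  # toy stand-ins for the HatEval training tweets
    "they are invading our towns",
    "she posted another hateful comment",
    "great weather for walking today",
]

tf_vectorizer = CountVectorizer(analyzer=analyzer)      # TF features (CIC-1)
tfidf_vectorizer = TfidfVectorizer(analyzer=analyzer)   # TF-IDF features (CIC-2)

X_tf = tf_vectorizer.fit_transform(tweets)
X_tfidf = tfidf_vectorizer.fit_transform(tweets)
print(X_tf.shape, X_tfidf.shape)  # same vocabulary, different weighting
```

Both vectorizers share the same analyzer, so they produce matrices over the same vocabulary; only the term weights differ.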

Machine learning algorithms
In our two systems, we used four different classifiers across subtasks A and B. CIC-1 uses logistic regression for subtask A and majority voting for subtask B; CIC-2 uses multinomial Naive Bayes for subtask A and classifier chains for subtask B. For all classifiers, we used the implementations available in scikit-learn 3 .
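A minimal sketch of the four classifier setups on toy data follows. The real feature matrices and labels come from the HatEval corpora; here we use random stand-ins, and the base estimators inside the voting ensemble are our assumption, since the concrete voting components are not listed above:

```python
import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)
X = rng.randint(0, 5, size=(40, 20))    # toy TF-style count features
y_a = rng.randint(0, 2, size=40)        # subtask A: hateful vs. not
Y_b = rng.randint(0, 2, size=(40, 3))   # subtask B: HS, target range, aggressiveness

# CIC-1, subtask A: logistic regression
clf_a1 = LogisticRegression(max_iter=1000).fit(X, y_a)

# CIC-1, subtask B: hard majority voting, applied per label column
voting = VotingClassifier(
    [("lr", LogisticRegression(max_iter=1000)),
     ("nb", MultinomialNB()),
     ("dt", DecisionTreeClassifier(random_state=0))],
    voting="hard",
)
clf_b1 = MultiOutputClassifier(voting).fit(X, Y_b)

# CIC-2, subtask A: multinomial Naive Bayes
clf_a2 = MultinomialNB().fit(X, y_a)

# CIC-2, subtask B: classifier chain linking the three labels, so each
# label prediction is fed as a feature to the next classifier in the chain
clf_b2 = ClassifierChain(MultinomialNB(), order=[0, 1, 2], random_state=0).fit(X, Y_b)

print(clf_b1.predict(X[:2]).shape, clf_b2.predict(X[:2]).shape)
```

The classifier chain is a natural fit for subtask B because the target range and aggressiveness labels are only meaningful when the hate speech label is positive, so conditioning later labels on earlier predictions exploits that dependency.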

Results and analysis
The results of both our systems, CIC-1 and CIC-2, are presented in Table 3 for both shared subtasks A and B, together with our ranks in the HatEval 2019 competition. In Table 3 (subtask A is ranked by macro-F1 and subtask B by EMR), we use the following conventions. In the first column, "Team" refers to the two systems (CIC-1 and CIC-2) submitted for the shared task. "Task" denotes the two subtasks A and B (AF1 means that the scores of subtask A are ranked by macro-F1, and BEMR means that the scores of subtask B are ranked by EMR); see Section 3.2. "Classifier" states the classifiers we used in this competition. "English" and "Spanish" indicate the scores for English and Spanish, respectively. "Rankeng." and "Rankspa." give our team's rank in the competition for both subtasks.
The system CIC-1 got 2nd rank in subtask B for both the English and Spanish languages, with an EMR score of 0.568 for English and 0.675 for Spanish; we used the majority voting classifier for both languages. Our approach performed well in subtask B (classifying aggressive behavior and target), but was not able to perform well in subtask A (detecting hate speech against immigrants and women) for the English language, although we obtained the 2nd position in subtask A for the Spanish language. For Spanish subtask B, we tried to reproduce the SVM baseline provided by the organizers, but failed: our SVM baseline only reached an accuracy of 0.550.
We also ran experiments without stop word removal and stemming; in this case, accuracy drops by 2-3%. We found that imbalanced data was the main reason for the poor performance on English in subtasks A and B. We noticed that most of the submitted systems achieved poor results on subtask A.

Conclusion and future work
In this article, we described our approach to detect (1) hate speech against immigrants and women and (2) aggressive behavior and target on the Twitter corpus. We submitted two different systems, namely (i) CIC-1 and (ii) CIC-2. We used a bag-of-words model with TF and TF-IDF values. The vectors are then used as features for classifiers such as multinomial Naive Bayes, majority voting, logistic regression, and classifier chains. Our CIC-1 system ranked 2nd in subtask B for both the English and Spanish languages. Our system CIC-2 ranked 1st in subtask B for Spanish and 4th in subtask A for the same language.
In future work, we plan to consider embeddings with TF-IDF weighting (Arroyo-Fernández et al., 2019) and learning document embeddings as in (Gómez-Adorno et al., 2018). We also plan to consider syntactic n-grams, i.e., n-grams obtained by following paths in syntactic dependency trees (Sidorov, 2013; Sidorov, 2019).
We have also made the winning model public 4 for other researchers to use.