Hope at SemEval-2019 Task 6: Mining social media language to discover offensive language

The amount of user-generated content shared through social media has reached huge proportions. However, along with the free expression of thoughts on social media, people risk being exposed to various aggressive statements. In this paper, we present a system able to identify and classify offensive user-generated content.


Introduction
With the constant spread of social media, users are spending increasing amounts of time on various social networking sites to connect with peers and to share information or common interests. While users benefit from social media by interacting with and learning from others, they are also at risk of being exposed to large amounts of offensive content.
Considering that people are negatively affected by harmful content, detecting offensive language online to protect users' safety has become an urgent task. To address concerns about people's access to offensive content over the internet, social media administrators often need to manually review online texts to detect and delete offensive material. However, manually reviewing and identifying offensive messages is a labor-intensive and time-consuming task. Some automatic content filtering software packages have been developed to detect and filter offensive web pages or paragraphs, mostly using word-based approaches.
The "OffensEval: Identifying and Categorizing Offensive Language in Social Media" task at the SemEval 2019 competition (Zampieri et al., 2019a) focuses on detecting and classifying offenses, pervasive in social media.
In this paper, we present a system able to identify whether a tweet contains abusive language and, if so, whether it is offensive. We trained a model to differentiate between these categories and then analyzed the results to better understand how we can improve the system.
The rest of the paper is organized as follows: Section 2 presents related work on offensive language identification, Section 3 presents the data set and our methods, Section 4 presents the results we obtained together with a short analysis, and Section 5 presents our conclusions.

Related Work
This topic has attracted significant attention in recent years, as evidenced by the increasing number of publications and by several scientific events such as the ALW and TRAC workshops.
Offensive language is often subdivided into various intertwined categories, since different subtasks have been grouped under this label. One of the most analyzed categories is "hate speech", i.e. discriminatory remarks, such as racist or sexist ones (Nobata et al., 2016).
Based on work on hate speech, cyberbullying and online abuse, Waseem et al. (2017) propose a typology that captures central similarities and differences between subtasks and discuss its implications for data annotation and feature construction. Additionally, they emphasize the practical actions that researchers can take to best approach their abusive language detection subtask of interest.
Lexical detection methods for offensive language tend to have low precision because they flag every message that contains a listed term regardless of context, and they also miss offensive messages that contain none of the listed terms. On the other hand, various machine learning methods are used in the literature, from Logistic Regression, Naïve Bayes, Decision Trees, Random Forests and SVMs to neural networks. Previous analysis of hate speech modeling (Schmidt and Wiegand, 2017) shows that a very wide range of features has been used and that a more advanced analysis of feature relevance is needed (Waseem et al., 2017).
A first shared task on aggression identification, aiming to classify aggressive speech into overt, covert or no aggression, was held at the TRAC Workshop co-located with COLING 2018 (Kumar et al., 2018). 130 teams registered to participate in the task, 30 teams submitted their test runs and 20 teams sent system description papers, which are included in the TRAC workshop proceedings.
The problem of distinguishing general profanity from hate speech is not trivial and requires features that capture a deeper understanding of the text, which is not always possible with surface n-grams.

Data set and Methods
The data set for SemEval-2019 Task 6 consists of 14,100 tweets retrieved from social media: 13,240 training instances, distributed in tab-separated format, and 860 tweets for testing (Zampieri et al., 2019b). Using this data set, we were able to identify offense, aggression and hate speech in user-generated content.
This section presents our approach to each sub-task, for each submission we uploaded.

Sub-task A: Offensive language identification
Submission 1. We analyzed the training data to identify words or expressions specific to offensive and to non-offensive tweets. Based on these expressions, we crafted a set of rules consisting of exact or partial matches of these expressions in the test corpus. Tweets that matched these rules were annotated as offensive. Tweets containing such expressions only in a negated form were annotated as non-offensive. The remaining tweets were randomly classified as offensive or non-offensive. The application code was written in the Java programming language, and the results are presented in Table 1.1.
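The rule set of Submission 1 can be illustrated with a minimal Python sketch (the original system was written in Java, and its code is not reproduced here). The cue expressions, the negation words, the one-token negation window and the label strings `OFF`/`NOT` are all assumptions made for illustration; the paper does not list the expressions that were actually mined from the training data.

```python
import random
import re

# Hypothetical cue expressions and negation words (not taken from the paper).
OFFENSIVE_CUES = ["idiot", "stupid"]
NEGATIONS = {"not", "no", "never"}


def is_negated(tokens, index):
    """Assume a cue is negated when the token right before it is a negation."""
    if index == 0:
        return False
    previous = tokens[index - 1]
    return previous in NEGATIONS or previous.endswith("n't")


def classify_by_rules(tweet, rng):
    """Return 'OFF' or 'NOT' following the three cases described in the text."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    # Partial match: a cue may appear inside a longer token ("stupidity").
    hits = [i for i, tok in enumerate(tokens)
            if any(cue in tok for cue in OFFENSIVE_CUES)]
    if not hits:
        # No rule applies: fall back to a random label.
        return rng.choice(["OFF", "NOT"])
    if all(is_negated(tokens, i) for i in hits):
        # Cues occur only in negated form.
        return "NOT"
    return "OFF"
```

Passing the random generator explicitly (for example `random.Random(0)`) keeps the random fallback reproducible between runs.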

Submission 2.
We created a lexicon based on two lists of offensive words 1, freely available online, along with the list resulting from the analysis of the training tweets, as described above. Using these offensive words or expressions, we developed patterns and classified the tweets as offensive or non-offensive. If a tweet contained at least one word from the lists, it was considered offensive; otherwise it was considered non-offensive. The results are presented in Table 1.2.
Submission 3: For this submission, we used the same lists of offensive words obtained from external sources, along with the list of offensive words found in the training data, but we kept only words of more than 4 letters. We added this constraint because we noticed that shorter words introduced noise by matching non-offensive tweets. Additionally, we used WordNet to obtain the synonyms of the words in our lists. The results are presented in Table 1.3.
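A minimal Python sketch of this lexicon lookup is given below, assuming that matching is done on lower-cased word tokens. The function names and the example word lists are hypothetical, and the patterns actually developed for the submission may have been richer than a plain token match.

```python
import re


def build_lexicon(*word_lists):
    """Merge several word lists into one lower-cased set."""
    lexicon = set()
    for words in word_lists:
        lexicon.update(word.lower() for word in words)
    return lexicon


def is_offensive(tweet, lexicon):
    """A tweet is offensive if at least one of its tokens is in the lexicon."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return any(token in lexicon for token in tokens)
```

For example, `build_lexicon(external_list_1, external_list_2, training_list)` would combine the two external lists with the list mined from the training tweets, where the three argument names are placeholders.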
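The length restriction and the synonym expansion can be sketched in Python as follows. To keep the example self-contained, a small hand-written dictionary stands in for WordNet; the function name, the dictionary and the choice not to apply the length filter to the synonyms are assumptions, since the text does not specify these details.

```python
def refine_lexicon(words, synonyms, min_length=5):
    """Keep words with more than 4 letters, then add their synonyms.

    `synonyms` is a hypothetical stand-in for a WordNet lookup: it maps a
    word to a list of synonyms.
    """
    kept = {word.lower() for word in words if len(word) >= min_length}
    expanded = set(kept)
    for word in kept:
        expanded.update(s.lower() for s in synonyms.get(word, []))
    return expanded
```

Short words that were filtered out do not contribute synonyms either, because the expansion only runs over the words that survive the length filter.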

Sub-task B: Automatic categorization of offense types
Submission 1: We tokenized the tweets annotated as targeted offenses and collected different lists of cue words. Additionally, we noticed that tweets containing a proper name toward the middle of the sentence tended to be targeted. We therefore marked a tweet as targeted if it contained such a proper name and as untargeted otherwise, and made the first submission with this rule. The results are presented in Table 2.1.

Submission 2:
For the second submission, we tokenized the test tweets and checked whether any of the resulting words appeared in a list of pronouns 2. If a tweet contained a pronoun from that list, it was marked as a targeted offensive tweet; otherwise it was marked as an untargeted offensive tweet. The results are presented in Table 2.2.
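As an illustration, the pronoun rule could look like the following Python sketch. The pronoun list is a short hypothetical sample rather than the list used in the submission, and the label strings `TIN` (targeted) and `UNT` (untargeted) are assumed for illustration.

```python
import re

# Hypothetical sample; the list used in the submission is not reproduced here.
PRONOUNS = {"you", "he", "she", "they", "him", "her", "them",
            "your", "his", "their"}


def categorize_by_pronoun(tweet):
    """Mark a tweet as targeted if any of its tokens is a listed pronoun."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return "TIN" if any(token in PRONOUNS for token in tokens) else "UNT"
```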

Submission 3:
We split the tweets into words and counted how many words began with a capital letter. We did not take hashtags ("#") and user mentions ("@USER") into consideration, because the vast majority of them were written with a capital letter. If a tweet contained at least 2 words beginning with a capital letter, it was marked as a targeted offensive tweet; otherwise it was marked as an untargeted offensive tweet. The results are presented in Table 2.3.
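The capital-letter heuristic can be sketched in Python as below. Splitting on whitespace and the label strings `TIN`/`UNT` are assumptions; the threshold of 2 capitalized words comes from the description above.

```python
def categorize_by_capitals(tweet, min_capitalized=2):
    """Count capitalized words, skipping hashtags and user mentions."""
    count = 0
    for token in tweet.split():
        if token.startswith("#") or token.startswith("@"):
            continue  # hashtags and mentions are ignored
        if token[0].isupper():
            count += 1
    return "TIN" if count >= min_capitalized else "UNT"
```

Note that with this simple rule a sentence-initial capital also counts, so a tweet needs only one additional capitalized word to be marked as targeted.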

Sub-task C: Offense target identification
We created two lists of pronouns: one for the personal pronouns in singular form and one for the personal pronouns in plural form. This gave us 3 scenarios:
-If the tweet contains a personal pronoun from the singular pronoun list, then the tweet is marked IND.
-If the tweet contains a personal pronoun from the plural pronoun list, the tweet is marked GRP.
-If the tweet does not contain any pronouns from the above lists, then the tweet is marked as OTH.
The results are presented in Table 3.
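The three scenarios can be written as a short Python sketch. The two pronoun lists are hypothetical samples, and checking the singular list before the plural one (so that a tweet containing both kinds is marked IND) is an assumption based on the order in which the scenarios are listed. The ambiguous pronoun "you" is left out of both sample lists.

```python
import re

# Hypothetical sample lists; the lists used in the submission are not given.
SINGULAR_PRONOUNS = {"i", "me", "he", "him", "she", "her"}
PLURAL_PRONOUNS = {"we", "us", "they", "them"}


def identify_target(tweet):
    """Return IND, GRP or OTH depending on which pronoun list matches."""
    tokens = set(re.findall(r"[a-z']+", tweet.lower()))
    if tokens & SINGULAR_PRONOUNS:
        return "IND"
    if tokens & PLURAL_PRONOUNS:
        return "GRP"
    return "OTH"
```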

Results
Below are the results for each individual level using the test set. We report Precision (P), Recall (R), and F-measure (F) for each baseline on all classes, along with weighted averages and Macro-F1. The results for sub-task A are presented in Table 1, the results for sub-task B in Table 2, and the results for sub-task C in Table 3.
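For reference, the per-class measures and Macro-F1 can be computed as in the following Python sketch. It uses the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F1 as their harmonic mean, Macro-F1 as the unweighted mean of per-class F1); the function names are hypothetical, and this is not the official evaluation script of the task.

```python
from collections import Counter


def per_class_scores(gold, predicted):
    """Return {label: (precision, recall, f1)} for every label seen."""
    true_pos, false_pos, false_neg = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            true_pos[g] += 1
        else:
            false_pos[p] += 1  # p was predicted but is wrong
            false_neg[g] += 1  # g was missed
    scores = {}
    for label in sorted(set(gold) | set(predicted)):
        p_den = true_pos[label] + false_pos[label]
        r_den = true_pos[label] + false_neg[label]
        precision = true_pos[label] / p_den if p_den else 0.0
        recall = true_pos[label] / r_den if r_den else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-class F1 values."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)
```

Because Macro-F1 gives every class the same weight, a rule that does well only on the majority class is penalized more than it would be under a weighted average.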

Conclusions
Offensive language in social media commonly comes from an unpleasant condition or from something that is disgusting or forbidden. We discussed the challenges in detecting offensive language, including the writing patterns of abusive words in social media.
This paper presents our system participating in SemEval-2019 Task 6. We present simple baseline scores on all classes in all three sub-tasks.
In the future, we would like to compare our system against the annotated data sets of similar tasks, such as aggression identification, abusive language identification and hate speech detection.
As further work, we have already started to study how to use the data sets with deep learning techniques based on word embeddings to improve our results, similar to the work presented in (Badjatiya et al., 2017).