Inducing a Lexicon of Abusive Words – a Feature-Based Approach

We address the detection of abusive words. The task is to identify such words among a set of negative polar expressions. We propose novel features employing information from both corpora and lexical resources. These features are calibrated on a small manually annotated base lexicon which we use to produce a large lexicon. We show that the word-level information we learn cannot be equally derived from a large dataset of annotated microposts. We demonstrate the effectiveness of our (domain-independent) lexicon in the cross-domain detection of abusive microposts.


Introduction
Abusive or offensive language is commonly defined as hurtful, derogatory or obscene utterances made by one person to another person. Examples are (1)-(3). In the literature, closely related terms include hate speech (Waseem and Hovy, 2016) or cyber bullying (Zhong et al., 2016). While there may be nuanced differences in meaning, they are all compatible with the general definition above for abusive language.
(1) stop editing this, you dumbass.
(2) Just want to slap the stupid out of these bimbos!!!
(3) Go lick a pig you arab muslim piece of scum.
Due to the rise of user-generated web content, in particular on social media networks, the amount of abusive language is also steadily growing. NLP methods are required to focus human review efforts towards the most relevant microposts.
In this paper, we address the task of detecting abusive words (e.g. dumbass, bimbo, scum). Our main assumption is that abusive words form a subset of negative polar expressions. The classification task is to filter the abusive words from a given set of negative polar expressions. We proceed as follows. On a base lexicon that is a small subset of negative polar expressions where the abusive words among them have been marked via crowdsourcing ( §3), we calibrate a supervised classifier by examining various novel features ( §4). A classifier trained on that base lexicon, which contains 551 abusive words, is then applied to a very large list of unlabeled negative polar expressions (from Wiktionary) to extract an expanded lexicon of 2989 abusive words ( §5).
We extrinsically evaluate our new lexicon in the novel task of cross-domain classification of abusive documents (§6) where we use it as a high-level feature. In this work, we consider microposts as documents. Supervised classifiers trained on generic features, such as bag of words or word embeddings, usually score very well on in-domain classification, but they perform poorly on cross-domain classification since they latch on to domain-specific information. In subjectivity, polarity and emotion classification, high-level features based on predictive domain-independent word lists have been proposed to bridge the domain mismatch (Dias et al., 2009; Mohammad, 2012; Wiegand et al., 2013).
New abusive words constantly enter natural language. For example, according to Wiktionary, the word gimboid, which refers to an incompetent person, was coined in the British television series Red Dwarf, possibly from the word gimp and the suffix -oid. According to Urban Dictionary, the word twunt, which is a portmanteau of the swearwords twat and cunt, was invented by humourist Chris Morris for the Channel 4 series 'Jam' in 2000. One of the most recent abusive words is remoaner, which describes someone who complains about or rejects the outcome of the 2016 EU referendum on the UK's membership of the European Union. It is a blend of moan and remainer. Wiktionary states that this word has a pejorative connotation.
These examples show that the task of creating a lexicon of abusive words cannot be reduced to a one-time manual annotation effort. Recent web corpora and crowdsourced dictionaries (e.g. Wiktionary) should be ideal resources to find evidence of such words.
Our contribution is the first work that systematically describes the automatic construction of a lexicon of abusive words. We examine novel features derived from various textual resources. We show that the information we learn cannot be equally derived from a large dataset with labeled microposts. The effectiveness of our expanded lexicon is demonstrated on cross-domain detection of abusive microposts. This is also the first work to address this task in general. The supplementary material to this paper includes all resources newly created for our research.
We frame our task as a binary classification problem. Each given expression is to be classified as either abusive or not. We study this problem on English. However, many of our features should also be applicable to other languages.

Related Work
Lexical knowledge for the detection of abusive language has received little attention in previous work. Most approaches consider it as one feature among many. Very often existing word lists from the web are employed (Xiang et al., 2012; Burnap and Williams, 2015; Nobata et al., 2016). Their limited effectiveness may be due to the fact that they were not built for the task of abusive language detection. Only the manually-compiled lexicon from Razavi et al. (2010) and the lexicon of hate verbs from Gitari et al. (2015) have been compiled for this specific task. Since the latter lexicon is not publicly available, we can only consider the former in our evaluation. In both publications, very little is said on the creation of these resources.
Previous work focused on in-domain classification, a setting where generic features (e.g. bag of words) work well and word lists are less important. There have been investigations examining features on various datasets (Nobata et al., 2016; Samghabadi et al., 2017). However, these studies always trained and tested on the same domain. We show that a lexicon-based approach is effective in cross-domain classification.
For a more detailed overview on previous work on the detection of abusive language in general, we refer the reader to Schmidt and Wiegand (2017).

Data
Base Lexicon. Our base lexicon exclusively comprises negative polar expressions. It is a small set which we have annotated via crowdsourcing. We consider abusive words to be a proper subset of negative polar expressions. By just focusing on these types of words, we are more likely to obtain a significant amount of abusive words than just considering a sample of arbitrary words. This lexicon will be used as a gold standard for calibrating features of a classifier. That classifier will be run on a large set of unlabeled negative polar expressions to produce our expanded lexicon ( §5).
We sampled 500 negative nouns, verbs and adjectives each from the Subjectivity Lexicon (Wilson et al., 2005). We chose that lexicon since we have extra information available for its entries that we want to examine, namely polar intensity ( §4.1.1) and sentiment views ( §4.1.2). However, since we noted that the Subjectivity Lexicon misses some prototypical abusive words (e.g. nigger, slut, cunt) we added another 10% (i.e. 150 words) which are abusive words frequently occurring in the word lists mentioned in Schmidt and Wiegand (2017).
Each of the negative polar expressions was judged by 5 annotators from the crowdsourcing platform Prolific Academic. Each annotator had to be a native speaker of English and possess a task approval rate of at least 90%. For our base lexicon (Table 1), we considered a binary word categorization: abusive or non-abusive. A word was only classified as abusive if at least 4 out of the 5 raters judged the word to be abusive. This threshold should prevent many ambiguous words from being classified as abusive, a general problem of existing resources.
Corpora. In our experiments we employ three unlabeled corpora (Table 2). The two larger corpora, the Amazon Review Corpus (AMZ; Jindal and Liu, 2008) and the Web As Corpus (WAC; Baroni et al., 2009), are used for inducing word embeddings (§4.2). AMZ and the smallest corpus, rateitall.com (RIA), are used for computing polar word intensity (§4.1.1) from star ratings.
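The agreement threshold described above can be sketched in a few lines. This is only an illustration: the function name `is_abusive` and the example votes are hypothetical and not taken from the annotated data.

```python
def is_abusive(votes, min_agreement=4):
    """Return True if at least `min_agreement` raters judged the word abusive.

    votes: list of booleans, one per rater (True = judged abusive).
    """
    return sum(1 for v in votes if v) >= min_agreement


# Hypothetical votes from 5 raters per word (invented for illustration).
ratings = {
    "dumbass": [True, True, True, True, False],
    "dirty": [True, False, False, True, False],
}
base_lexicon = {word: is_abusive(votes) for word, votes in ratings.items()}
```

With a threshold of 4 out of 5, a word on which raters are split stays in the non-abusive class, which is the intended effect of the threshold.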

Feature Calibration
In the following, we describe the two types of features of our feature-based approach: novel linguistic features and generic word embeddings. They will be examined against some baselines on our base lexicon. As a classifier we use an SVM as implemented in SVMlight (Joachims, 1999). We chose that classifier since it is most commonly used for the detection of abusive language (Schmidt and Wiegand, 2017). For all classifiers in this paper, the supplementary material contains information regarding (hyper)parameter settings.

Polar Intensity (INT)
Intuitively, abusive language should coincide with high polar intensity. We inspect 3 different types.
Binary Intensity (INT_bin). Our first feature is a simple binary intensity feature we obtain from the Subjectivity Lexicon. In that resource, each entry is categorized as either a weak polar expression (e.g. dirty) or a strong polar expression (e.g. filthy). Table 3 (left half), which shows the distribution of intensity on the intersection of our base lexicon and the Subjectivity Lexicon, confirms that abusive words are rarely weak polar expressions and more frequently strong polar expressions.
Fine-grained Intensity (INT_fine). We also investigate a more fine-grained feature which assigns a real-valued intensity score to polar expressions. It is computed by leveraging the star ratings assigned to the reviews comprising the AMZ corpus (Table 2). A review is awarded between 1 and 5 stars where 1 is the most negative score. We infer the polar intensity of a word from the distribution of star ratings associated with the reviews in which it occurs. We assume negative polar expressions with a very high polar intensity to occur significantly more often in reviews assigned few stars (i.e. 1 or 2). Ruppenhofer et al. (2014) established that the most effective method to derive such polar intensity is by ranking words by their weighted mean of star ratings (Rill et al., 2012). All words of our base lexicon are ranked according to that score. As a feature we use the rank of a word.
Not all negative polar expressions with a high intensity are equally likely to be abusive. The high intensity expressions should also be words typically directed towards persons. Most polar statements in AMZ, however, are directed towards a movie, book or some electronic product. In order to extract negative polar intensity directed towards persons, we replace the AMZ corpus with the RIA corpus (Table 2). RIA contains reviews on arbitrary entities rather than just commercial products as in the case of AMZ. Each review has a category label (e.g. computer, person, travel) that very easily allows us to extract from RIA just those reviews that concern persons. Table 4 compares a typical 1-star review from AMZ with one from RIA. We consider the RIA review an abusive comment. It contains many words predictive of abusive language (e.g. self-absorbed, loser, arrogant or loud-mouthed).
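A minimal sketch of ranking words by a mean of star ratings follows. The exact weighting of Rill et al. (2012) is not given in this text, so the sketch assumes the simplest variant: each review's star rating is weighted by how often the word occurs in that review. The function name, the whitespace tokenization and the toy reviews are all hypothetical.

```python
from collections import defaultdict


def star_intensity_ranks(reviews, lexicon):
    """Rank lexicon words by the mean star rating of the reviews they occur in.

    reviews: iterable of (text, stars) pairs with stars in 1..5.
    Assumption: a review's rating is weighted by the word's count in it.
    Returns (ranks, means); rank 1 is the lowest mean, i.e. the most
    intense negative word.
    """
    weighted_sum = defaultdict(float)
    weight = defaultdict(float)
    for text, stars in reviews:
        tokens = text.lower().split()
        for word in lexicon:
            count = tokens.count(word)
            if count:
                weighted_sum[word] += count * stars
                weight[word] += count
    means = {w: weighted_sum[w] / weight[w] for w in weight}
    ordered = sorted(means, key=lambda w: (means[w], w))
    ranks = {w: r for r, w in enumerate(ordered, start=1)}
    return ranks, means


# Toy reviews, invented for illustration.
reviews = [
    ("horrible horrible plot", 1),
    ("bad plot", 2),
    ("bad but fine", 4),
    ("horrible", 2),
]
ranks, means = star_intensity_ranks(reviews, {"horrible", "bad"})
```

Words that never occur in any review receive no rank here; how such words are handled in the actual feature is not specified in the text.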

Sentiment Views (VIEW)
Wiegand et al. (2016b) define sentiment views as the perspective of the opinion holder of polar expressions. They distinguish between expressions conveying the view of the implicit speaker of the utterance, typically referred to as speaker views (e.g. cheating in (4); ugly and stinks in (5)), and expressions conveying the view of event participants, typically referred to as actor views (e.g. disappointed and horrified in (6); protested in (7)).
Table 2: Information about unlabeled corpora used (by size we mean the number of tokens).
Table 4: 1-star reviews in different corpora.
AMZ on Halloween 5: this movie is horrible with a bad plot a disappointment to the halloween series.
RIA on Bill Maher: Self-absorbed loser who tries to pretend to be fair. He is rude, arrogant, loud-mouthed...

Emotion Categories (NRC)
We also examine whether knowledge of emotion categories associated with words is helpful. Potentially negative emotions, such as disgust or anger, should correlate with abusive words. We use the NRC lexicon (Mohammad and Turney, 2013) and employ the categories associated with the words contained in that resource as a feature.

Patterns (PAT)
Noun Pattern (PAT_noun). We found that the noun pattern (8) can be used to extract abusive nouns. Since this pattern is very sparse even on our largest corpus (i.e. WAC), we also ran our pattern as a query on Twitter and extracted all matching tweets posted within a time period of 14 days. (We observed that by then we had reached a saturation point.)
(8) pattern: called {me|him|her} a(n) <noun>
(9) pattern match example: He called me a bitch.
Table 5 compares the most frequent matches for that pattern. Our pattern matches much more frequently on Twitter than on WAC. The quality of the matches on Twitter is also much better than on WAC, where we still find many false positives (e.g. name or saint). We assume that tweets, in general, are much more negative in tone than arbitrary web documents (as represented by WAC), which could explain the fewer false positives on Twitter. Note that the ranking from Twitter is not restricted to just prototypical abusive words (as Table 5 might suggest). The entire ranking also contains many less common words, such as weaboo, dudebro or butterface. The frequency ranks of the nouns extracted from Twitter are used as a feature.
Adjective Pattern (PAT_adj). Abusive adjectives often modify an abusive noun as in brainless idiot, smarmy liar or gormless twat. Therefore, we mined Twitter for adjectives modifying mentions of our extracted nouns (PAT_noun). (We were not able to find a construction identifying abusive verbs, so our output from PAT includes no verbs.)
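The noun pattern (8) can be approximated with a regular expression, as in the sketch below. It is a simplification under stated assumptions: it takes the single token after the article to be the noun and does no part-of-speech tagging, so it would also capture adjectives (e.g. in "called me a nice person"). The function name and the example texts are hypothetical.

```python
import re
from collections import Counter

# "called {me|him|her} a(n) <noun>"; the word after the article is captured.
# Assumption: no POS tagging, so the captured token is only presumed a noun.
PATTERN = re.compile(r"\bcalled (?:me|him|her) an? (\w+)", re.IGNORECASE)


def rank_pattern_nouns(texts):
    """Count pattern matches and return {word: frequency rank}, rank 1 = most frequent."""
    counts = Counter()
    for text in texts:
        for match in PATTERN.finditer(text):
            counts[match.group(1).lower()] += 1
    ordered = [word for word, _ in counts.most_common()]
    return {word: rank for rank, word in enumerate(ordered, start=1)}


# Toy inputs, invented for illustration.
texts = [
    "He called me a bitch.",
    "She called him an idiot",
    "they called her a bitch!",
    "I called a taxi",
]
pattern_ranks = rank_pattern_nouns(texts)
```

The resulting frequency ranks correspond to the kind of value that is used as the PAT_noun feature; words that never match the pattern simply have no rank.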

WordNet (WN) and Wiktionary (WK)
We compare WordNet (Miller et al., 1990) and Wiktionary as two general-purpose lexical resources. Unlike WordNet, Wiktionary is produced collaboratively by volunteers rather than linguistic experts. It contains more abusive words from our base lexicon, i.e. 97% (WK) vs. 87% (WN).
A common way to harness a general-purpose lexicon for induction tasks in sentiment analysis is by using its glosses (Choi and Wiebe, 2014; Kang et al., 2014). Assuming that the explanatory texts of glosses are similar among abusive words, we treat glosses as a bag-of-words feature.
We also exploit information on word usage. Many abusive words are marked with tags such as pejorative, derogatory or vulgar. Both WordNet and Wiktionary contain such information. However, in Wiktionary more than 6 times as many of our entries include a tag compared to WordNet.
In order to incorporate a semantic representation more general than individual words, we employ supersenses. Supersenses are only contained in WordNet. They represent a set of 45 classes into which entries are categorized. They have been found effective for sentiment analysis (Flekova and Gurevych, 2016). Some categories correlate with abusive words. For example, 76% of the words of our base lexicon that belong to the supersense person (e.g. loser, idiot) are abusive words.

FrameNet (FN)
FrameNet (Baker et al., 1998) is a semantic resource which provides over 1200 semantic frames that comprise words with similar semantic behaviour. We use the frame memberships of a word as features, expecting that abusive and non-abusive words occur in separate frames.

Generic Features: Word Embeddings
We induce word embeddings from the two largest corpora, i.e. AMZ and WAC (Table 2), using Word2Vec (Mikolov et al., 2013) in default configuration (i.e. 200 dimensions; cbow). The best performance was obtained by concatenating for each word the vectors induced from the two corpora.

Baselines to Feature-based Approach
In addition to a majority-class classifier we consider the following baselines.
Weak Supervision (WSUP). With this baseline we want to build a lightweight classifier that does not require proper labeled training data. It is inspired by previous induction approaches for sentiment lexicons, such as Hatzivassiloglou and McKeown (1997) or Velikovich et al. (2010), which heuristically label some seed instances and then apply graph-based propagation to label the remaining words of a dataset. On the basis of word embeddings (§4.2), we build a word-similarity graph, where the nodes represent our negative polar expressions and each edge denotes the semantic similarity between two arbitrary words. We compute it as the cosine of their word-embedding vectors. The output of PAT from Twitter (§4.1.4) is considered as positive class seed instances. We chose PAT since it is an effective feature that does not depend on a lexical resource. As negative class seeds, we use the most frequent words in the WAC corpus (Table 2). Our rationale is that high-frequency words are unlikely to be abusive. We chose WAC instead of Twitter since the evidence of PAT (Table 5) suggested less abusive language in that corpus. This word-similarity graph is illustrated in Figure 1. In order to propagate the labels to the unlabeled words from the seeds, we use the Adsorption algorithm (Talukdar et al., 2008).
Note on embeddings (§4.2): we also ran experiments with pretrained embeddings from GoogleNews but they did not improve classification.
Using Labeled Microposts (MICR). With our last baseline we examine to what extent we can detect abusive words by only using information from labeled microposts rather than labeled words. These experiments are driven by the fact that labeled microposts already exist. We consider two methods using the largest dataset comprising manually labeled microposts, Wulczyn (Table 8). The class labels of the microposts and our base lexicon (§3) are the same. Our aim is to produce a ranking of words where the high ranks represent words more likely to be abusive. Since we want to produce a strong baseline, we consider the best possible cut-off rank (see supplementary material). Every word ranked above this cut-off is considered abusive and all other words not abusive.
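The weakly supervised setup can be illustrated with a small sketch: a cosine-similarity graph over word vectors, seed labels, and an iterative neighbour average. Note that the propagation step here is a plain weighted-average label propagation used as a stand-in; it is not the Adsorption algorithm of Talukdar et al. (2008). The two-dimensional vectors, the seed choice and the function names are invented for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def propagate(vectors, pos_seeds, neg_seeds, iterations=20):
    """Spread seed labels (+1 abusive, -1 non-abusive) over a similarity graph.

    Stand-in for Adsorption: each unlabeled node repeatedly takes the
    similarity-weighted mean of its neighbours' scores; seeds stay fixed.
    """
    words = list(vectors)
    seeds = set(pos_seeds) | set(neg_seeds)
    scores = {w: 0.0 for w in words}
    scores.update({w: 1.0 for w in pos_seeds})
    scores.update({w: -1.0 for w in neg_seeds})
    # Edge weights: cosine similarity, with negative values clipped to zero.
    weights = {
        w: {u: max(0.0, cosine(vectors[w], vectors[u])) for u in words if u != w}
        for w in words
    }
    for _ in range(iterations):
        updated = dict(scores)
        for w in words:
            total = sum(weights[w].values())
            if w not in seeds and total > 0:
                updated[w] = sum(weights[w][u] * scores[u] for u in weights[w]) / total
        scores = updated
    return scores


# Toy 2-d "embeddings", invented for illustration.
vectors = {
    "bitch": [1.0, 0.1], "slut": [0.9, 0.2], "scumbag": [0.95, 0.15],
    "disagree": [0.1, 1.0], "doubt": [0.2, 0.9], "hesitate": [0.15, 0.95],
}
scores = propagate(vectors, pos_seeds=["bitch"], neg_seeds=["disagree"])
```

Words with a positive final score would be treated as abusive in this sketch; the decision rule used with Adsorption in the actual experiments may differ.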
The first method, MICR:pmi, ranks the words of our base lexicon by their Pointwise Mutual Information with the class label abusive that is assigned to microposts. To be even more competitive, we introduce a second method, MICR:proj, that learns a projection of embeddings. MICR:proj has the advantage over MICR:pmi that it ranks not only words observed in the labeled microposts but all words represented by embeddings. Since our embeddings (§4.2) are induced on the combination of AMZ and WAC corpora, which together are about 360 times the size of the Wulczyn dataset, MICR:proj is likely to cover more abusive words. Let $M = [w_1, \ldots, w_n]$ denote a labeled micropost of $n$ words. Each column $w \in \{0, 1\}^v$ of $M$ represents a word in one-hot form. Our aim is to learn a one-dimensional projection $S \cdot E$, where $E \in \mathbb{R}^{e \times v}$ represents our unsupervised embeddings of dimensionality $e$ over the vocabulary size $v$ (§4.2) and $S \in \mathbb{R}^{1 \times e}$ represents the learnt projection matrix. We compute a projected micropost $h = S \cdot E \cdot M$, which is an $n$-dimensional vector. Each component represents a word from the micropost. Its value represents the predictability of the word towards being abusive. We then apply a bag-of-words assumption to use that projected micropost to predict the binary class label $y$: $p(y \mid M) \propto \exp(h \cdot \mathbf{1})$, where $\mathbf{1} \in \{1\}^n$. This model is a feed-forward network trained using Stochastic Gradient Descent (Rumelhart et al., 1986). On the basis of the projected embeddings we rank our negative polar expressions.
Figure 1: Illustration of word-similarity graph as used for weakly-supervised baseline (WSUP); seeds for abusive words (e.g. bitch) are obtained by the output of feature PAT (§4.1.4); seeds for non-abusive words (e.g. disagree) are high-frequency negative polar expressions.
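The projection model can be sketched in plain Python. Because each column of the micropost matrix is one-hot, multiplying the embeddings by it simply selects the embedding of each word, so the summed projection equals the projection vector applied to the sum of the word vectors. The sketch assumes a logistic (sigmoid) output with log loss and no bias term, which the text does not specify. The toy embeddings, posts and hyperparameters are invented.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_projection(posts, embeddings, dim, lr=0.5, epochs=50):
    """Learn a projection vector S with SGD on labeled posts.

    posts: list of (tokens, label) with label 1 = abusive, 0 = not abusive.
    Assumption: p(abusive | post) = sigmoid(S . sum of word vectors).
    """
    S = [0.0] * dim
    for _ in range(epochs):
        for tokens, label in posts:
            vecs = [embeddings[t] for t in tokens if t in embeddings]
            # Sum of the selected embedding columns (bag-of-words assumption).
            x = [sum(col) for col in zip(*vecs)] if vecs else [0.0] * dim
            p = sigmoid(dot(S, x))
            # Gradient step for the log loss of a logistic model.
            S = [s - lr * (p - label) * xi for s, xi in zip(S, x)]
    return S


def rank_words(S, embeddings):
    """Rank every word that has an embedding by its projected score."""
    return sorted(embeddings, key=lambda w: dot(S, embeddings[w]), reverse=True)


# Toy 2-d embeddings and labeled posts, invented for illustration.
embeddings = {
    "idiot": [1.0, 0.2], "moron": [0.9, 0.1], "jerk": [0.95, 0.15],
    "sad": [0.1, 1.0], "rain": [0.0, 0.9], "you": [0.3, 0.3],
}
posts = [
    (["you", "idiot"], 1), (["you", "moron"], 1),
    (["sad", "rain"], 0), (["you", "sad"], 0),
]
S = train_projection(posts, embeddings, dim=2)
ranking = rank_words(S, embeddings)
```

In this toy setting the word jerk never occurs in a labeled post but still receives a high score because its embedding lies close to those of idiot and moron, which is the coverage advantage the text attributes to MICR:proj over MICR:pmi.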

Evaluation of Features on Base Lexicon
We conduct experiments on our base lexicon (Table 1) and report macro-average precision, recall and F-score. SVMs are evaluated using 10-fold cross-validation. Table 7 shows the performance of SVMs using different linguistic features (§4.1). Among the three intensity types, the most effective one is the person-based intensity (INT_person). However, it can be effectively combined with the remaining types. Among the lexical sentiment resources used (i.e. NRC, INT_bin and VIEW), VIEW is most effective. Their combination also results in an improvement. The surface patterns (PAT) are surprisingly predictive. Of the general-purpose lexical resources (i.e. WN, WK and FN), WN and WK are both very effective resources. Glosses from WN are the strongest individual feature. Combining WK, WN and FN results in significant improvement. The best feature set combines all features.
Our results also suggest that for languages other than English, there are some very strong features, such as PAT, WK or embeddings, that could be easily adopted since they do not depend on a resource which is only available in English.
Negative polar expressions are identified by applying to the vocabulary of Wiktionary an SVM trained on the words from the Subjectivity Lexicon with their respective polarities. As features, we use word embeddings (§4.2). In order to produce the feature-based lexicon of abusive words, another SVM is trained on our base lexicon (Table 1) using the best feature set from Table 6. With 2989 abusive words, our expanded lexicon is 5 times as large as the base lexicon. In order to measure the impact of our proposed features on the quality of the resulting lexicon, we devised an alternative expansion which just employs word embeddings. For this, we used SentProp, the most effective induction method from the SocialSent package (Hamilton et al., 2016).

Cross-domain Classification
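For reference, macro-averaged precision, recall and F-score can be computed as in the sketch below. It assumes that the macro F-score is the mean of the per-class F-scores; the text does not say whether this or the F-score of the macro-averaged precision and recall is meant. The function name and toy labels are hypothetical.

```python
def macro_prf(gold, pred, labels=(0, 1)):
    """Macro-average precision, recall and F-score over the given class labels.

    Assumption: macro F is the unweighted mean of per-class F-scores.
    """
    precisions, recalls, fscores = [], [], []
    for c in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, pred) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, pred) if g == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        fscores.append(f)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(fscores) / n


# Toy example: 1 = abusive, 0 = non-abusive.
gold = [1, 1, 0, 0]
pred = [1, 0, 0, 0]
macro_p, macro_r, macro_f = macro_prf(gold, pred)
```

Macro-averaging gives both classes equal weight, which matters here because abusive words are the minority class in the base lexicon.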

Motivation and Set Up
We now apply our expanded lexicon (§5) to the classification of abusive microposts, i.e. we classify entire comments rather than words out of context. Table 8 shows the datasets of labeled microposts that we use. The difference between these datasets is the source from which they originate. Consequently, different topics are represented in the different datasets. Still, we find similar types of abusive language (e.g. racism, sexism). For example, both (10)-(11) from Waseem and (12) from Wulczyn are sexist comments, but (10)-(11) discuss the role of women in sports while (12) addresses women's hygiene in Slavic countries.
Table 8: Datasets of labeled microposts (size: total number of microposts in the dataset).
dataset                         size     abusive   source
(Warner and Hirschberg, 2012)   3438     14.3%     diverse
(Waseem and Hovy, 2016)         16165    35.3%     Twitter
(Razavi et al., 2010)           1525     31.9%     UseNet
(Wulczyn et al., 2017)          115643   11.6%     Wikipedia
Note on the SentProp expansion (§5): since SentProp produces a ranking rather than a classification, we consider 2989 as a cut-off value to separate the instances into 2 classes. This corresponds to the number of abusive words predicted by our feature-based lexicon (Table 9).
(10) from Waseem dataset: maybe that's where they should focus? Less cunts on football.
(11) from Waseem dataset: I would rather brush my teeth with sandpaper then watch football with a girl!!
(12) from Wulczyn dataset: slavic women don't like to wash ... Their pussy stinks.
Since our aim is to produce the best possible cross-domain classifier, all classifiers are trained on one dataset and tested on another. This is a real-life scenario. Often when a classifier for abusive microposts is needed, sufficient labeled data is only available for other text domains.
Having different topics in training and test data makes cross-domain classification difficult. For example, since a large proportion of sexist comments in Waseem relate to sports, traditional supervised classifiers (using bag of words or word embeddings) will learn correlations between words of that domain and the class labels. For instance, the domain-specific word football occurs frequently in Waseem (i.e. 90 occurrences) with a strong correlation towards abusive language (precision: 95%). Other words, such as sports and commentator, display a similar behaviour. A supervised classifier will assign a high weight to such words. While such domain-specific words may aid in-domain classification and enable a correct classification of microposts, such as (11), we will show that they have a detrimental effect on cross-domain classification. We claim that the predictive words that abusive comments share across different domains are abusive words, just of the sort that our expanded lexicon contains, e.g. cunts in (10) and pussy in (12).
Our proposed classifier for labeling microposts is an SVM trained on features derived from our expanded lexicon (§5). We do not use a binary feature encoding the presence of abusive words. Instead, we rank all abusive words of our lexicon according to the confidence score that the classifier (§5) assigned to them and use their ranks as features.
Note on example (12): it is also a racist comment.
As baseline classifiers we consider publicly available word lists (Table 9). We include the resource from Razavi et al. (2010), henceforth referred to as Ottawa, the entries of Hatebase, which has been used in Nobata et al. (2016), and the derogatory words from Wiktionary (Derogatory). Finally, we also include our base lexicon (Table 1) in order to evaluate the expansion process of our two expanded lexicons (§5). For all lists, we train on a single feature indicating the frequency of abusive words in a micropost to be classified. Ottawa also contains weights assigned to abusive words. We weight the observed frequency with these weights.
We further evaluate 3 classifiers representing the state of the art of in-domain evaluations: FastText (Joulin et al., 2017), Recurrent Neural Networks with Gated Recurrent Units (RNN), which have been reported to work best on English microposts (Pavlopoulos et al., 2017), and a classifier, henceforth Yahoo, trained on the sophisticated feature set proposed by Nobata et al. (2016). Next to character and token n-grams, Yahoo includes word and comment embeddings, syntactic features and some linguistic diagnostics.
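The kind of statistic quoted for football can be computed as in the following sketch. It assumes that occurrences are counted per micropost and that precision means the share of those microposts labeled abusive; the text does not state which counting was used. The function name and the toy posts are invented.

```python
def word_precision(posts, word):
    """Return (number of posts containing `word`, share of them labeled abusive).

    posts: iterable of (tokens, is_abusive) pairs.
    Assumption: occurrences are counted once per post.
    """
    labels = [abusive for tokens, abusive in posts if word in tokens]
    if not labels:
        return 0, 0.0
    return len(labels), sum(labels) / len(labels)


# Toy labeled posts, invented for illustration.
toy_posts = [
    (["football", "girls"], True),
    (["football", "again"], True),
    (["nice", "football", "game"], False),
    (["hello"], False),
]
count, precision = word_precision(toy_posts, "football")
```

A word with high precision in one dataset but no inherent abusive meaning is exactly the kind of domain-specific cue that does not transfer to another dataset.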
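The single frequency feature used for the word-list baselines can be sketched as follows. The sketch assumes a raw (unnormalized) count over pre-tokenized, lowercased text, with an optional per-word weight as in the Ottawa list. The function name, the lexicon entries and the weights are hypothetical.

```python
def lexicon_feature(tokens, lexicon, weights=None):
    """Frequency of lexicon words in a micropost, optionally weighted.

    tokens: lowercased tokens of the micropost.
    weights: optional {word: weight}; words without a weight count as 1.0.
    Assumption: raw count, not normalized by post length.
    """
    weights = weights or {}
    return sum(weights.get(t, 1.0) for t in tokens if t in lexicon)


# Toy lexicon and weights, invented for illustration.
tokens = "you stupid stupid idiot".split()
lexicon = {"stupid", "idiot"}
plain = lexicon_feature(tokens, lexicon)
weighted = lexicon_feature(tokens, lexicon, weights={"idiot": 2.0})
```

A classifier trained on this one value can only learn a threshold on how many (weighted) abusive words a post contains, which is why the quality of the word list matters so much in this setting.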

Results
In Table 10, we list the performance of the 3 state-of-the-art classifiers along with our proposed classifier using our expanded lexicon on in-domain 10-fold cross-validation. Due to space limitations, we cannot list the other classifiers. We only provide this list to demonstrate the strength of the state-of-the-art classifiers on in-domain evaluation. In this setting, a lexicon-based approach is not competitive since domain-specific information is not included. However, as we show in Table 11, for cross-domain classification, it is exactly that property that ensures that our feature-based lexicon provides the best performance. Compared to the in-domain setting, FastText, RNN and Yahoo display a huge drop in performance. They all suffer from overfitting to domain-specific knowledge. Of all lexicons, our proposed feature-based lexicon performs best. We were surprised by the poor performance of Hatebase but attribute this to its small size and the high number of ambiguous (and debatable) entries, such as Charlie, pancake, Pepsi. Although our feature-based lexicon is the largest of all tested (i.e. 2989 words), our experiments do not support the general rule that larger lexicons always outperform smaller ones. For instance, even our base lexicon with 551 abusive words is much better than the lexicons Derogatory or Ottawa, which are about 3 times larger (Table 9). Each word in our base lexicon was only included if at least 4 out of 5 raters judged it to be abusive. This ensured a fairly reliable annotation. In contrast, Derogatory and Ottawa suffer from many ambiguous entries (e.g. bag, Tim, yellow). The high precision of our base lexicon is what ensures that our expanded lexicon does not include much noise.
Another shortcoming of most of the other existing lexicons is that they overwhelmingly focus on nouns. While nouns undoubtedly represent the most frequent abusive terms, there is a substantial number of abusive words that belong to other parts of speech, particularly adjectives (e.g. vile, sneaky, slimy, moronic). In our base lexicon, more than 30% of the abusive words are adjectives. Our expanded lexicon, which roughly preserves that ratio, includes about 800 adjectives in total. Since abusive adjectives often co-occur with abusive nouns (§4.1.4), they may compensate for abusive nouns that are missing from the lexicon. Such unknown nouns often occur when authors of microposts try to obfuscate their abusive language, e.g. sneaky assh0le, slimy b*st*rd. Interestingly, the modifying adjectives are not obfuscated, probably because they are considered slightly less offensive in tone. Given that among the newly created lexicons our feature-based expanded lexicon performs best, we conclude that the expansion is effective (since we improve over the base lexicon), and that the features are more effective than a generic induction approach (i.e. SentProp).

Explicitly vs. Implicitly Abusive Microposts
The results in Table 11 also show that the cross-domain performance of our proposed feature-based lexicon is lower on the two datasets Warner and Waseem. We observed that while on the other two datasets almost all abusive microposts can be considered explicitly abusive posts, i.e. they contain abusive words, a large proportion of microposts labeled abusive in Warner and Waseem are implicitly abusive (Waseem et al., 2017), i.e. the abuse is conveyed by other means, such as sarcasm or metaphorical language, as in (11). We asked raters from Prolific Academic to identify explicitly abusive microposts by marking abusive words in those posts. The annotators were not given access to any lexicon of abusive words. We then conducted cross-domain classification on those subsets where the abusive instances were only those rated as explicit. The results are displayed in Table 12. The table shows that our feature-based lexicon is much better on this subset, while the most sophisticated supervised classifier (Yahoo) still performs worse. From that we conclude that only explicitly abusive microposts can be reliably detected in cross-domain classification.

Conclusion
We examined the task of inducing a lexicon of abusive words. We presented novel features including surface patterns, sentiment views, polar intensity and general-purpose lexical resources, particularly Wiktionary. The information we thus acquire cannot be learnt as effectively from labeled microposts, not even with a projection-based classifier. While a lexicon of abusive words can only aid the detection of explicit abuse, its effectiveness was demonstrated on the novel task of cross-domain detection of abusive microposts, where our domain-independent lexicon outperforms previous supervised classifiers which suffer from overfitting to domain-specific features.