What A Sunny Day ☔: Toward Emoji-Sensitive Irony Detection

Irony detection is an important task with applications in the identification of online abuse and harassment. Given the ubiquitous use of non-verbal cues such as emojis in social media, we study the role of these cues in irony detection. Since existing irony detection datasets contain fewer than 10% ironic tweets with emoji, classifiers trained on them are insensitive to emojis. We propose an automated pipeline for creating a more balanced dataset.


Introduction
Social media text often contains non-verbal cues, such as emojis, that users employ to convey their intentions. Statistics have shown that more than 45% of internet users in the United States have used an emoji in social media. Owing to this prevalent usage, some works exploit the occurrences of emoji to tackle NLP tasks, such as sentiment analysis (Chen et al., 2019), emotion detection, and sarcasm detection (Felbo et al., 2017), since an emoji can carry a positive or negative tone and thus change the meaning of a text.
We are interested in analyzing the role of emoji in irony since this linguistic phenomenon is related to sentiment analysis and opinion mining (Pang et al., 2008). Irony can also relate to more serious issues, such as criticism or online harassment. Based on our analysis of the existing irony dataset from SemEval 2018, only 9.2% of the ironic tweets contain an emoji. Furthermore, the dataset creators crawled tweets using irony-related hashtags (i.e., #irony, #sarcasm, #not). This does not capture all variations of irony, especially those caused by emojis.
The following example illustrates how an emoji can change the meaning of a tweet. The tweet "What a sunny day " does not sound ironic. However, "What a sunny day ☔" is ironic. These examples show that we cannot ignore emoji when identifying irony.
Due to the sparsity of ironic tweets containing emoji, our goal is to augment the existing dataset such that the model requires both textual and emoji cues for irony detection.
We first analyze the behavior of emojis in ironic and non-ironic expressions. We find that the presence of emojis can convert a non-ironic text to an ironic text by causing sentiment polarity contrasts. We develop heuristics for data augmentation and evaluate the results. Then, we propose a simple method for generating ironic/non-ironic texts using sentiment polarities and emojis.

Related Work
A common definition of verbal irony is saying things opposite to what is meant (McQuarrie and Mick, 1996; Curcó, 2007). Studies disagree on whether sarcasm and irony are different phenomena (Sperber and Wilson, 1981; Grice, 1978, 1975) or the same (Reyes et al., 2013; Attardo et al., 2003). In this work, we do not make a distinction between sarcasm and irony.
Previous work on irony detection relied on hand-crafted features such as punctuation and smiles (Veale and Hao, 2010) or lexical features, such as the gap between rare and common words, the intensity of adverbs and adjectives, sentiments, and sentence structure (Barbieri and Saggion, 2014).

Dataset Analysis
We analyze the SemEval 2018: Irony Detection in English Tweets dataset. Table 1 shows the data statistics for both ironic and non-ironic tweets. Row 2 shows the tweet distribution with respect to the presence of emojis. Only 11% of all tweets contain an emoji, of which 46% are ironic. Because ironic tweets with emojis are so scarce, studying the robustness of current irony detection models to ironic text containing emoji requires augmenting the existing dataset with additional tweets containing emojis.
We hypothesize that the emojis used in ironic tweets may differ from the emojis used in non-ironic tweets. Table 2 shows the ten emojis that appear most frequently in ironic tweets and in non-ironic tweets in the English dataset. The same emoji appears most often in both ironic tweets (42 times) and non-ironic tweets (49 times). For the other frequent emojis, with one exception, the emojis in ironic tweets differ from those in non-ironic tweets. The emojis in the ironic tweets mostly do not carry positive sentiment, even if they are not outright negative, while the most frequent emojis in the non-ironic tweets are dominated by positive emojis. Moreover, some tweets contain multiple emojis. Of the 175 ironic tweets that contain emoji, 45% contain multiple emojis. We consider following this distribution when building our ironic tweet generation pipeline.
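The per-class counts described above can be reproduced with a short script. The following is a minimal sketch, not the analysis code used for Tables 1 and 2: the function names are hypothetical, and emoji are detected with a rough heuristic (Unicode category "So") that also matches a few non-emoji symbols.

```python
from collections import Counter
import unicodedata


def extract_emojis(text):
    """Rough heuristic (an assumption): treat 'Symbol, other' characters as emoji."""
    return [ch for ch in text if unicodedata.category(ch) == "So"]


def emoji_stats(tweets, top_k=10):
    """Summarize emoji use per class.

    tweets: iterable of (text, is_ironic) pairs, with is_ironic a bool.
    Returns, for each class, the top_k emojis, the number of tweets that
    contain at least one emoji, and the share of those with several emojis.
    """
    counts = {True: Counter(), False: Counter()}
    with_emoji = {True: 0, False: 0}
    multi_emoji = {True: 0, False: 0}
    for text, is_ironic in tweets:
        found = extract_emojis(text)
        if not found:
            continue
        counts[is_ironic].update(found)
        with_emoji[is_ironic] += 1
        if len(found) > 1:
            multi_emoji[is_ironic] += 1
    return {
        label: {
            "top": counts[label].most_common(top_k),
            "with_emoji": with_emoji[label],
            "multi_ratio": (
                multi_emoji[label] / with_emoji[label] if with_emoji[label] else 0.0
            ),
        }
        for label in (True, False)
    }
```

Calling `emoji_stats` on a list of labeled tweets yields the quantities discussed in this section: the most frequent emojis per class and the fraction of emoji-bearing tweets that contain more than one emoji.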

Manual Data Augmentation
To further analyze the role of emoji in ironic expressions, we conduct qualitative analysis while controlling the effect of the text content. Concretely, we generate ironic and non-ironic texts by manipulating emoji without changing the texts. The resulting texts give us an insight about emoji use and can also be used as an evaluation resource for developing emoji-sensitive irony detection.
Our manual inspection focuses on the three cases of emoji manipulation below.  emojis from the original dataset and inspect whether replacing/removing the emojis converts these ironic tweets to non-ironic tweets.

Case 3 - Irony without emoji → non-irony:
For another set of 50 randomly sampled ironic tweets that originally contain no emojis, we inspect whether adding any emojis converts these ironic tweets to non-ironic tweets.
For each original tweet in each case, three annotators assign a label '1' in case a conversion is possible and '0' otherwise. Additionally, for tweets that can be converted, each of the annotators provides one transformed tweet. Table 3 shows some example tweets. After this annotation step, we obtained 171 transformed tweets in total.
We calculate the inter-annotator agreement for each case in Table 4. Case 2 has the worst agreement. This is possibly because it is difficult to convert non-ironic tweets to ironic tweets only by adding emoji.
For instance, two of the three annotators felt that the non-ironic tweet "@MiriamMockbill must b in the #blood lol x" can be transformed into the ironic tweet "@MiriamMockbill must b in the #blood lol x " by adding emojis. However, the irony in the transformed tweet is not very evident.
Next, we validate the quality of the generated texts. Each generated example is given to two annotators, who must rate it as 'ironic' or 'non-ironic'. The agreement was moderately high. We achieved 100% agreement on 100 out of 171 tweets (58.4%). We call the dataset consisting of these 100 generated tweets plus their 60 original tweets the Imoji dataset and use it in a subsequent analysis. To the best of our knowledge, this is the first dataset which contains multiple ironic/non-ironic expressions with the same text body and different emojis.
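The validation step can be read as a filter that keeps only generated tweets on which the validators fully agree. The sketch below illustrates one plausible reading, assuming that "agreement" means every validator's rating matches the label the transformation was intended to produce; the function name and the dictionary keys (`intended`, `ratings`) are hypothetical.

```python
def filter_by_agreement(examples):
    """Keep generated tweets whose validator ratings all match the intended label.

    examples: list of dicts with keys
      'text'     - the transformed tweet,
      'intended' - label the transformation aimed for ('ironic' or 'non-ironic'),
      'ratings'  - labels assigned independently by the validators.
    Returns the kept examples and the fraction of examples kept.
    """
    kept = [
        ex for ex in examples
        if all(rating == ex["intended"] for rating in ex["ratings"])
    ]
    rate = len(kept) / len(examples) if examples else 0.0
    return kept, rate
```

Under this reading, the reported figure corresponds to `rate` for the 171 transformed tweets.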

Automatic Data Augmentation
Analysis of the Imoji dataset suggests that emojis tend to be used to cause "irony by clash" in most cases. A positive emoji is likely to be paired with negative text in ironic expressions, and vice versa.

Method
Following the insight drawn from the Imoji dataset, we propose a simple data augmentation method that uses a sentiment analysis dataset, so that we can build an irony detector that is more robust to emoji.
1. Collect the emoji-sentiment lexicon from Emoji Sentiment Ranking (Kralj Novak et al., 2015). This resource contains each emoji's frequency and sentiment polarity. We preprocess the emojis from this ranking: we filter out low-frequency emoji (bottom 50% by frequency), ignore non-emotional symbols (e.g., arrows), and extract the top 10% of emoji in terms of normalized sentiment scores for each of the positive and negative sentiments. This results in 48 strongly positive emojis and 48 strongly negative emojis.
2. Collect tweets with positive and negative sentiments from the SemEval 2018 Affect in Tweets Task 3 dataset (Mohammad et al., 2018). This dataset contains a total of 2,600 tweets: tweets with negative emotions (such as sadness, anger, and fear), joy tweets, and sarcastic tweets. Crowdworkers were asked to annotate them as positive or negative tweets.
3. Generate ironic/non-ironic tweets by adding emoji at the end of texts.
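The three steps above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the released pipeline: the lexicon is assumed to be a list of `(emoji, frequency, score)` tuples with scores in [-1, 1], the top 10% is computed over the filtered lexicon as a whole, an emoji is sampled uniformly at random, and all function names are hypothetical.

```python
import random


def select_emojis(lexicon, top_fraction=0.10, non_emotional=frozenset()):
    """Pick strongly positive and strongly negative emojis from a lexicon.

    lexicon: list of (emoji, frequency, sentiment_score) tuples.
    """
    # Drop the bottom 50% of emojis by frequency.
    by_freq = sorted(lexicon, key=lambda e: e[1], reverse=True)
    entries = by_freq[: max(1, len(by_freq) // 2)]
    # Ignore non-emotional symbols such as arrows.
    entries = [e for e in entries if e[0] not in non_emotional]
    # Keep the extremes of the sentiment ranking for each polarity.
    k = max(1, int(len(entries) * top_fraction))
    by_score = sorted(entries, key=lambda e: e[2], reverse=True)
    positive = [e[0] for e in by_score[:k] if e[2] > 0]
    negative = [e[0] for e in by_score[-k:] if e[2] < 0]
    return positive, negative


def augment(tweets, positive, negative, rng):
    """Append an emoji to each tweet to create ironic (1) and non-ironic (0) pairs.

    tweets: list of (text, polarity) with polarity 'positive' or 'negative'.
    """
    generated = []
    for text, polarity in tweets:
        if polarity == "positive":
            same, opposite = positive, negative
        else:
            same, opposite = negative, positive
        # Polarity clash between text and emoji: labeled ironic.
        generated.append((f"{text} {rng.choice(opposite)}", 1))
        # Matching polarity: labeled non-ironic.
        generated.append((f"{text} {rng.choice(same)}", 0))
    return generated
```

Passing a seeded `random.Random` instance as `rng` keeps the generated data reproducible across runs.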

Evaluation of Automatic Generation
We conduct manual analysis of the generated tweets. Table 5 displays the generated ironic and non-ironic tweets.
The first example is generated by combining a positive-sentiment tweet with a negative-sentiment emoji, and we agree that it is an ironic text. For the second example, it is unclear whether the text is ironic. The added emoji may not make the text ironic if the writer's purpose is really not to see the other person frown, even though the sentiment of the text without the emoji is slightly negative. The third example is not ironic although it is generated by combining a negative tweet with a positive emoji. "Bogum" is a Korean actor, and "oppa" is commonly used by fangirls to address an older Korean male. Thus, using the emoji in the text makes sense and does not make it ironic. The last example is a non-ironic text generated by adding a positive emoji to a positive tweet. Based on this analysis, we decided to use only tweets with positive sentiment as seeds to generate accurate ironic/non-ironic tweets.

Preprocessing
To normalize special strings in tweets, such as URLs, mentions, and hashtags, we run ekphrasis 2 (Baziotis et al., 2017). We also correct non-standard spellings. We collect sentiment analysis datasets for automatic data augmentation from the SemEval 2018 Shared Task (Mohammad et al., 2018). From these, we obtained 768 additional irony detection instances.
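To make the normalization step concrete, the sketch below replaces URLs, mentions, and hashtags with placeholder tokens using plain regular expressions. It is a simplified stand-in and does not call ekphrasis; the patterns, the placeholder tokens, and the function name are assumptions chosen for illustration, and spelling correction is not covered.

```python
import re

# Simplified patterns (assumptions); real tweets need more careful handling.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")


def normalize_tweet(text):
    """Replace URLs and mentions with placeholders and mark hashtags."""
    text = URL_RE.sub("<url>", text)
    text = MENTION_RE.sub("<user>", text)
    # Keep the hashtag word, since it may carry sentiment, but mark it.
    text = HASHTAG_RE.sub(r"<hashtag> \1 </hashtag>", text)
    # Collapse runs of whitespace left over from the substitutions.
    return re.sub(r"\s+", " ", text).strip()
```

Emoji are left untouched by this step, which matters here because they are the signal the augmented data is meant to expose.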

Baseline Model
We use the NTUA-SLP system (Baziotis et al., 2018) from SemEval 2018. It uses standard two-layer biLSTMs and a self-attention mechanism to encode a tweet into a fixed-sized vector, and makes a prediction with a logistic regression classifier that takes the encoded tweet as input. Embedding layers are initialized with 300D pre-trained word embeddings from a word2vec model trained on English tweets (Baziotis et al., 2017).
2 https://github.com/cbaziotis/ekphrasis
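The encoding and prediction steps can be illustrated with a small, dependency-free sketch. It assumes the biLSTM hidden states have already been computed and uses a single scoring vector for attention, which is a simplification; it is not the NTUA-SLP implementation, and all names and parameters are hypothetical.

```python
import math


def attention_pool(hidden_states, attn_weights):
    """Collapse per-token hidden states into one fixed-size vector.

    hidden_states: list of T vectors (lists of floats), e.g. biLSTM outputs.
    attn_weights: scoring vector of the same dimension (assumed learned).
    Returns the pooled vector and the attention distribution over tokens.
    """
    scores = [sum(w * h for w, h in zip(attn_weights, vec)) for vec in hidden_states]
    # Numerically stable softmax over token scores.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    dim = len(hidden_states[0])
    pooled = [
        sum(a * vec[d] for a, vec in zip(alphas, hidden_states)) for d in range(dim)
    ]
    return pooled, alphas


def predict_irony(pooled, clf_weights, bias):
    """Logistic regression on the pooled tweet vector; returns P(ironic)."""
    z = sum(w * x for w, x in zip(clf_weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-z))
```

The attention distribution `alphas` is also useful for inspection, for example to check whether an emoji token at the end of a tweet receives noticeable weight.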

Result
We train the model on our augmented data and test it on the Imoji dataset, as shown in Figure 1. To make sure that the performance change from our augmented data (Ours) does not come only from the increased number of training instances, we also collect the same number of irony detection instances as the generated instances from another dataset containing irony annotations (Ghosh et al., 2015). Interestingly, the classifier trained on our augmented dataset achieves much higher recall.
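For reference, accuracy, precision, and recall for the ironic class can be computed as in the following sketch. The function name and the label encoding (1 for ironic, 0 for non-ironic) are assumptions for illustration.

```python
def binary_metrics(gold, pred, positive=1):
    """Accuracy, precision, and recall for the positive (ironic) class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    correct = sum(1 for g, p in pairs if g == p)
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        # Share of predicted-ironic tweets that are truly ironic.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of truly ironic tweets that the classifier finds.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```

Recall is the quantity most directly tied to emoji sensitivity here: a classifier that ignores emoji will miss ironic tweets whose only ironic cue is the emoji.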

Conclusion
In this work, we presented an automatic pipeline for generating ironic data using sentiment analysis. We observe that our method works well for irony based on polarity contrast. In summary, the experimental results show that our augmented data helps classifiers improve their sensitivity to emojis in irony detection without damaging overall irony detection performance on the whole datasets. An interesting future direction is to apply our method to multilingual irony datasets.