Cross-Lingual Emotion Lexicon Induction using Representation Alignment in Low-Resource Settings

Emotion lexicons provide information about associations between words and emotions. They have proven useful in analyses of reviews, literary texts, and posts on social media, among other things. We evaluate the feasibility of deriving emotion lexicons cross-lingually, especially for low-resource languages, from existing emotion lexicons in resource-rich languages. For this, we start out from very small corpora to induce cross-lingually aligned vector spaces. Our study empirically analyses the effectiveness of the induced emotion lexicons by measuring translation precision and correlations with existing emotion lexicons, along with measurements on a downstream task of sentence emotion prediction.


Introduction
Two main forms of classifying emotions are often distinguished: representing them along continuous dimensions, or breaking them into discrete categories (Stevenson et al., 2007; Calvo and Kim, 2013). A prominent instance of the former approach is the PAD model by Russell and Mehrabian (1977), which represents affect along three dimensions: pleasure, arousal, and dominance. An example of the latter is the Wheel of Emotions by Plutchik (1980), who argued that most emotions can be derived from a set of eight basic ones: anger, fear, sadness, disgust, surprise, anticipation, trust, and joy.
There have been efforts to create emotion lexicons, where each word is assigned either scores or discrete classes reflecting the associated emotions. Such lexicons are useful in emotional analyses of product reviews, literary texts, or posts on social media, inter alia. Bradley et al. (1999) solicited human affective norm ratings to create such a dataset for English based on the PAD model. Mohammad and Turney (2013) relied on crowdsourcing to annotate words with Plutchik's eight basic emotions, providing binary labels. The recent NRC Emotion Intensity Lexicon (Mohammad, 2018) reconciles the notion of discrete emotions, corresponding to commonly invoked emotion names, with the benefits of continuous scoring in accounting for degrees of emotion intensity. Again relying on crowdsourcing, the lexicon provides intensity scores for Plutchik's eight basic emotions.
Affective norm ratings have likewise been collected for certain other languages. An alternative route is to draw on automated techniques such as machine translation, as has been done for the NRC Emotion Intensity Lexicon, where the English words are translated to other languages using Google Translate while retaining the original scores. Buechel et al. (2020) used Google Translate to translate a source emotion lexicon into a target lexicon that serves as training data, based on which valence/arousal/dominance or five basic emotions are predicted for a range of resource-rich languages. However, at the time of writing, Google Translate serves only around 100 languages. This raises the question of whether similar resources can be induced for resource-poor languages using minuscule amounts of data.
In this paper, we investigate simple means of deriving emotion ratings for resource-poor languages. In particular, we consider the case of drawing on very small corpora, focusing on partial translations of the Bible. We explore different cross-lingual embedding alignment techniques that allow us to transfer English emotion ratings to over 350 languages, assessing the accuracy of the translations and of our induced emotion ratings. We have made the resulting induced emotion lexicons freely available.

Monolingual Emotion Lexicon Construction
Ground-truth emotion lexicons are typically constructed by manually annotating words with associated emotions. Bradley et al. (1999) aggregated results of a questionnaire to create an emotion lexicon with ratings for the PAD model, Warriner et al. (2013) compiled a similar dataset with larger coverage, and Shoeb and de Melo (2020) solicited emotion ratings for emojis. Crowd-sourcing platforms such as Amazon's Mechanical Turk can be used to expedite the annotation process (Mohammad and Turney, 2013), with techniques such as best-worst scaling to better account for the variance between crowd workers (Kiritchenko and Mohammad, 2016).
Apart from manual compilation, different strategies can be invoked to construct monolingual emotion lexicons automatically. For instance, the DepecheMood lexicon (Staiano and Guerini, 2014) was derived using statistical measures based on emotionally tagged text crawled from specific Web sites. Raji and de Melo (2020) revealed that unsupervised distributional semantics can outperform such supervised techniques. Leveau et al. (2012) showed that word translations across languages are strongly correlated in emotion. As machine translation gradually increased in accuracy, inducing affect-related resources cross-lingually became more feasible (Mihalcea et al., 2007). Lexicons for sentiment polarity have been induced cross-lingually using various forms of supervision (Chen and Skiena, 2014; Abdalla and Hirst, 2017; Barnes et al., 2018; Dong and de Melo, 2018b; Dong and de Melo, 2018a). In terms of emotion, Buechel et al. (2020) induced fine-grained emotion lexicons for the 91 languages for which Google Translate was available. However, machine translation tools are limited by the amount of available training data.

Cross-Lingual Emotion Lexicon Induction
In recent years, induction has thus often been achieved by means of cross-lingual word embeddings. While numerous approaches for bilingual embedding training (Gouws and Søgaard, 2015) have been explored, it can be more convenient to draw on potentially larger amounts of monolingual data for embedding training and then achieve a post-hoc alignment of the embedding spaces. Mikolov et al. (2013) showed that word vectors in different languages can often be aligned with reasonably high accuracy using simple linear transformations. Xing et al. (2015) showed that enforcing orthogonality on the linear transformation matrix may result in better translation accuracy. There are now also several unsupervised alignment algorithms seeking to identify orthogonal transformations of embedding vector spaces (Lample et al., 2018;Artetxe et al., 2018;Grave et al., 2019). In this paper, we investigate such approaches for cross-lingual emotion lexicon induction.
Work so far has been limited in at least one of the following ways: 1) polarity lexicon induction as opposed to fine-grained emotion lexicons, 2) induction dependent on supervised data, or 3) unsupervised induction but with languages for which resources like Google Translate or pre-trained fastText embeddings are available. In the following sections, we present a method of emotion lexicon induction that works with resource-poor languages for which such tooling is unavailable.

Proposed Method
In this section, we introduce some pertinent definitions and provide a brief overview of our methodology to induce emotion ratings for resource-poor languages.
We consider a target language L_T that is typically resource-poor, for which no emotion ratings are available, and a source language L_S, for which emotion ratings are available. We define an emotion rating σ_e(w) ∈ [0, 1] as an emotion intensity score, i.e., the degree of emotional association of word w with emotion e ∈ E for a set of target emotions E. Accordingly, an emotion lexicon can be regarded as a function of the form V × E → [0, 1] that maps words w from a vocabulary V, paired with emotions e ∈ E, to word-emotion ratings σ_e(w).
Our method requires three resources: a monolingual text corpus for each of L_T and L_S, and an emotion lexicon E_S for the source language. We induce a target-language lexicon E_T in a two-step process: First, we induce a cross-lingual word embedding space covering both L_T and L_S by drawing on the monolingual corpora as well as unsupervised cross-lingual alignment (Section 4). Subsequently, we derive emotion ratings for L_T using this vector space, based on the source lexicon E_S (Section 5).
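As a minimal illustration of these definitions, an emotion lexicon can be represented as a mapping from (word, emotion) pairs to intensity scores. The words and scores below are hypothetical examples, not entries from any actual lexicon:

```python
# A toy source-language emotion lexicon: (word, emotion) -> intensity in [0, 1].
# All words and scores here are invented for illustration.
EMOTIONS = ["anger", "anticipation", "disgust", "fear",
            "joy", "sadness", "surprise", "trust"]

lexicon = {
    ("fury", "anger"): 0.92,
    ("fury", "fear"): 0.35,
    ("delight", "joy"): 0.88,
}

def sigma(word, emotion, lex=lexicon):
    """Emotion rating sigma_e(w); unlisted pairs default to 0.0."""
    return lex.get((word, emotion), 0.0)

print(sigma("fury", "anger"))    # 0.92
print(sigma("delight", "anger"))  # 0.0
```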
Our empirical investigations focus mainly on the first step. We evaluate three algorithms to induce cross-lingual word embeddings in such low-resource settings. We also explore additional supervision when the input corpora possess sentence-level alignments, that is, information about which sentences in L T are translations of which sentences in L S . This is the case for the Bible translations considered in this study, due to the presence of verse identifiers.

Cross-Lingual Embedding Induction
In this section, we explain our overall approach to obtain cross-lingually aligned word embeddings, and then briefly outline three of the algorithms we use for alignment along with our modifications.

Approach
For each input corpus, we first invoke the fastText skip-gram algorithm to learn monolingual word embeddings (see Section 6.2 for details). The text in each monolingual corpus is preprocessed by eliminating all Unicode punctuation and converting it to lower case. We obtain two embedding matrices X_S and X_T for the source and target languages, respectively, with corresponding vocabularies V_S and V_T.
Our goal is to induce a cross-lingual embedding matrix X_C that covers both V_S and V_T in a single space. For this, we explore three algorithms to align X_S and X_T: Wasserstein-Procrustes (Grave et al., 2019), Unsupervised Orthogonal Refinement (Artetxe et al., 2018), and a neural language model (Wada et al., 2019). We also consider modifications of the latter two algorithms and evaluate these modified variants alongside the original ones. Note that the neural language model does not require word embeddings to have already been trained on monolingual corpora, as it jointly trains on two corpora to produce embeddings that already reside in a common space. Thus, only the preprocessing steps are performed for it. In the following, we describe each of these techniques in more detail.
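The preprocessing step described above (lower-casing and removing all Unicode punctuation) can be sketched as follows, assuming simple whitespace tokenization:

```python
import unicodedata

def preprocess(line):
    """Remove all Unicode punctuation (categories P*) and lower-case the
    text, returning whitespace-separated tokens."""
    cleaned = "".join(
        ch for ch in line
        if not unicodedata.category(ch).startswith("P")
    )
    return cleaned.lower().split()

print(preprocess("«Sí, señor!» He said so."))  # ['sí', 'señor', 'he', 'said', 'so']
```

Using Unicode categories rather than a fixed punctuation list matters here, since the corpora span over 350 languages and scripts.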

Wasserstein-Procrustes
Given two matrices X_S and X_T containing word embeddings in two different languages, the Wasserstein-Procrustes technique by Grave et al. (2019) calculates a projection matrix such that the Euclidean distances of the projected embeddings are minimized:

min_{W ∈ O_d, P ∈ P_n} ‖X_S W − P X_T‖²_F,

where O_d denotes the set of orthogonal d × d matrices and P_n the set of n × n permutation matrices. This is done in an iterative fashion by alternately a) finding a permutation of X_T that minimizes the above objective, and then b) using stochastic gradient descent to move to a more optimal value of W, followed by singular value decomposition to obtain the nearest orthogonal matrix. Grave et al. also use an initialization wherein they employ a convex relaxation of the objective optimized in the iterative phase, allowing them to solve for an approximation of the orthogonal matrix W in the above equation. Ultimately, we obtain the final cross-lingual embedding matrix X_C by combining the projected source embeddings X_S W with the target embeddings X_T.
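The Procrustes projection in step b) has a closed form via SVD. The numpy sketch below is our own illustration, not the authors' implementation; it recovers a known rotation from row-aligned embedding matrices:

```python
import numpy as np

def orthogonal_procrustes(X_s, X_t):
    """Return the orthogonal W minimizing ||X_s W - X_t||_F for row-aligned
    embedding matrices X_s, X_t: with U S V^T = SVD(X_s^T X_t), the
    nearest orthogonal solution is W = U V^T."""
    U, _, Vt = np.linalg.svd(X_s.T @ X_t)
    return U @ Vt

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))                  # toy "source" embeddings
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))  # a random orthogonal matrix
W = orthogonal_procrustes(X, X @ Q)           # align X to its rotated copy
print(np.allclose(W, Q))  # True: the rotation is recovered
```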

Unsupervised Orthogonal Refinement
Artetxe et al. (2018) presented another algorithm for unsupervised alignment. The goal is to compute orthogonal transformation matrices W_S and W_T that align the embedding matrices X_S and X_T in the same embedding space, while also building a bidirectional translation mapping between the words of either language. This is achieved in four steps:

1. Normalization. The embedding matrices are length-normalized, then centered around the mean dimension-wise, then normalized again (to obtain unit vectors for each embedding).

2. Unsupervised initialization. In this step, π(√M_S) and π(√M_T) are computed, where M_S = U S² Uᵀ for U S Vᵀ = SVD(X_S) (making M_S = X_S X_Sᵀ), and similarly for M_T. Here, π sorts each row of its operand in descending order. The idea is that π(X_S X_Sᵀ) and π(X_T X_Tᵀ), unlike X_S and X_T, are approximately identical up to a permutation of their rows. This relies on the assumption that the embedding spaces of different languages are at least approximately isometric, an assumption that is already implicit in seeking orthogonal mapping matrices, as without it that attempt would be futile. The sorted matrices are then used to compute an initial bilingual dictionary using step b) of the next phase.

3. Iterative refinement. The orthogonal mapping matrices and the bilingual dictionary are iteratively refined by repeating two steps until convergence: a) Compute the optimal orthogonal mapping matrices W_S and W_T such that the similarities of words that translate to each other in the bilingual dictionary are maximized. b) Compute the optimal bilingual dictionary by using a variation of nearest neighbors to identify the words in the other language that are closest in the aligned embedding space; the exact scoring mechanism for computing the nearest neighbors is discussed in Section 5. This phase employs an annealing dropout-like mechanism that randomly deletes entries from the bilingual dictionary to help escape poor local optima.

4. Final refinement. After the previous iterative phase converges on a solution, the mapping matrices are re-weighted according to the cross-correlation in each component, increasing the relevance of those dimensions that best match across languages.
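The invariance underlying the unsupervised initialization can be verified in a small numpy sketch (our own illustration, not the reference implementation): rotating and permuting the embeddings leaves the row-sorted √M matrix unchanged up to the same row permutation.

```python
import numpy as np

def sorted_sqrt_sim(X):
    """Compute pi(sqrt(M)) for M = X X^T: with X = U S V^T we have
    sqrt(M) = U S U^T; each row is then sorted in descending order."""
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    sqrt_M = (U * S) @ U.T
    return -np.sort(-sqrt_M, axis=1)

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 5))
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))  # "other language": rotated space
perm = rng.permutation(20)                    # unknown word order
A = sorted_sqrt_sim(X)
B = sorted_sqrt_sim(X[perm] @ Q)
print(np.allclose(A[perm], B))  # True: identical up to the row permutation
```

Because each row of B equals the sorted row perm[i] of the original matrix, matching rows across the two languages yields the initial bilingual dictionary.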

Orthogonal Refinement with Sentence Alignment Initialization
We modified the technique from Section 4.3 for settings in which sentence-level alignments are available, as is the case for the Bible translations considered in this study. To exploit this auxiliary source of supervision, we modified the unsupervised initialization phase, the second of the four phases described in Section 4.3. Normally, this step hinges on the assumption that words that are translations of each other have similar statistical distributions. Starting from matrices X_S, X_T whose rows contain word embeddings trained on monolingual corpora, an initial bilingual dictionary is induced, which is then iteratively refined in the subsequent phase.
Rather than use the word embedding matrices, we modified this phase to align term-sentence matrices D_S, D_T. These are sparse matrices whose rows correspond to words and whose columns correspond to sentences. Each entry reflects how often the given word occurs in the given sentence. Thus, we compute U S Vᵀ = SVD(D_S), such that M_S = U S² Uᵀ = D_S D_Sᵀ, and likewise for M_T based on D_T. In our experiments described in Section 7.1, we find that this greatly enhances the robustness of the approach.
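Building the term-sentence matrices from aligned verses is straightforward. The sketch below uses a dense numpy array and toy verses for brevity (the actual matrices would be sparse):

```python
import numpy as np

def term_sentence_matrix(sentences, vocab):
    """D: rows = words, columns = sentences, entries = number of times the
    word occurs in the sentence. Verse alignment means column j of D_S and
    column j of D_T describe translations of the same verse."""
    index = {w: i for i, w in enumerate(vocab)}
    D = np.zeros((len(vocab), len(sentences)))
    for j, sent in enumerate(sentences):
        for tok in sent.split():
            if tok in index:
                D[index[tok], j] += 1
    return D

verses = ["in the beginning", "the word was the word"]
vocab = ["in", "the", "beginning", "word", "was"]
D = term_sentence_matrix(verses, vocab)
print(D[vocab.index("the")])  # [1. 2.] -- "the" occurs once, then twice
```

The product D Dᵀ then plays the role of X Xᵀ in the initialization, while the shared column order supplies the cross-lingual signal.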

Neural Language Model
Finally, we consider a neural language model for unsupervised joint representation induction, as proposed by Wada et al. (2019). The idea is to use jointly trained forward and backward LSTMs trained on monolingual corpora from multiple languages. Different word embedding layers and decoders are used for each language, but the weights in the hidden layers are shared, along with the embeddings for the beginning- and end-of-sentence tokens and the weights for calculating the probability of the end-of-sentence token. The shared weights encourage the word embeddings of different languages to be encoded in roughly the same space. After training, the initial word embedding layer weights are used to project word tokens into the same aligned embedding space.
We also investigated a variant of this technique, replacing the LSTMs in the model with QRNNs (Bradbury et al., 2017), and adopting one-cycle learning rate scheduling (Smith, 2018) to reduce the training time and improve the model's precision.

Cross-Lingual Emotion Rating Induction
Equipped with our cross-lingual embedding space X_C, we are now able to induce emotion ratings cross-lingually based on the source-language emotion lexicon E_S. For each target-language word w ∈ V_T and each emotion e ∈ E, we compute a score

σ_e(w) = (1/k) Σ_{w′ ∈ N(w)} σ_e(w′),

where σ_e(w′) is the emotion rating of a word w′ from L_S according to the source emotion lexicon E_S, and

N(w) = arg max⁽ᵏ⁾_{w′ ∈ V_S} µ(v_w, v_{w′}),

i.e., the set of k = 3 words w′ from the source language vocabulary V_S that are most related to w in terms of the corresponding cross-lingual word vectors v_w, v_{w′} from X_C. To compute the relatedness µ(v_w, v_{w′}), we adopt Cross-Domain Similarity Local Scaling (CSLS) scores. CSLS assesses the relatedness between two word embeddings v_1 and v_2 from two different languages L_1 and L_2 as follows:

CSLS(v_1, v_2) = 2 cos(v_1, v_2) − R_{L_2}(v_1) − R_{L_1}(v_2).

The advantage of CSLS over simple cosine similarity is that it compensates for hubness, the property that some vectors in an embedding space reside near overly many other vectors (Lazaridou et al., 2015). It achieves this by subtracting the hubness factors R_{L_2}(v_1) and R_{L_1}(v_2) for v_1, v_2, where R_{L_i}(v) yields the average cosine similarity of the K = 10 nearest neighbors of v in the other language L_i.
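A minimal numpy sketch of this transfer step follows. It reflects our reading of the procedure: length-normalized vectors, CSLS-based neighbor retrieval, and a plain average of the k source ratings (the paper's exact aggregation may differ in detail):

```python
import numpy as np

def csls(V_t, V_s, K=10):
    """CSLS relatedness between all target rows (V_t) and source rows (V_s).
    Vectors are assumed length-normalized, so cosine = dot product."""
    cos = V_t @ V_s.T
    k_s = min(K, V_s.shape[0])
    k_t = min(K, V_t.shape[0])
    r_t = np.sort(cos, axis=1)[:, -k_s:].mean(axis=1)  # hubness of target words
    r_s = np.sort(cos, axis=0)[-k_t:, :].mean(axis=0)  # hubness of source words
    return 2 * cos - r_t[:, None] - r_s[None, :]

def induce_ratings(V_t, V_s, source_ratings, k=3):
    """For each target word, average sigma_e over the k most CSLS-related
    source words. source_ratings: (n_s,) array for a single emotion e."""
    top_k = np.argsort(-csls(V_t, V_s), axis=1)[:, :k]
    return source_ratings[top_k].mean(axis=1)

# Toy example: one target word aligned with the first of three source words.
V_s = np.eye(3)
V_t = np.array([[1.0, 0.0, 0.0]])
print(induce_ratings(V_t, V_s, np.array([0.9, 0.1, 0.2]), k=1))  # [0.9]
```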

Experimental Setup
In the following, we present an empirical analysis of the feasibility of inducing emotion ratings using the above methods when drawing on very small monolingual corpora. We first present our data sources (Section 6.1) and algorithmic parameters (Section 6.2), and then discuss various methods of measurement to verify the effectiveness of our methods (Section 6.3). The results follow in Section 7.
We used English as our resource-rich language L_S. We selected our resource-poor languages L_T in two groups. In one group, we picked nine languages for which the full 31K verses are present. This group includes Spanish, Hindi, Dutch, Greek, and Russian; while these are not actually resource-poor, we included them to have a useful point of reference against which to compare the performance of our methods on other languages. This group also includes Yoruba, Scots Gaelic, Sinhala, and Maori, languages that have fewer speakers and less data available on the Internet.
In our second group, we picked six languages that had around 10K or fewer verses available. We picked Spanish and Hindi as reference languages again, this time with Bible translations including only the New Testament (around 8K verses). We also picked Corsican, Estonian, Kyrgyz, and Luxembourgish, for which the only Bibles we obtained were ones with around 10K or fewer verses.
Source Lexicon. For the emotion lexicon in English (E_S), we used the NRC Emotion Intensity Lexicon (EIL) by Mohammad (2018). The NRC EIL contains English words with real-valued intensity scores for eight basic emotions: anger, anticipation, disgust, fear, joy, sadness, surprise, and trust.
Ground Truth. The NRC EIL also includes emotion lexicons for around 100 other languages obtained by translating the English words using Google Translate (note that we have fully translated Bibles for over 350 languages, so we are able to cover many more languages than the NRC EIL does). The NRC EIL's machine-translated emotion lexicons serve as a silver standard ground truth against which we compare the emotion ratings we induce using our methods.

Settings and Parameters
When creating fastText skip-gram embeddings for Wasserstein-Procrustes and Orthogonal Refinement, we trained, for each language, for 25 epochs with a learning rate of 0.1 and learned 100-dimensional embeddings. These were created only for words with a frequency count of 5 or greater when training on Bibles with 31K verses, while the frequency cutoff was set to 2 for smaller Bibles with fewer translated sentences.
For Orthogonal Refinement, we used the same settings as the original version by Artetxe et al. (2018). For our variant from Section 4.4, we modified the initialization phase. We picked the common verses from the Bibles in L_S and L_T and used those to create the term-sentence frequency matrix. We also trained the initial fastText embeddings only on the common verses. The remaining hyperparameters for the alignment were the same as for the original version.

[Figure 1: Correlation of induced emotion ratings with the ground truth, per method and emotion. The top plot shows ratings averaged across the large dataset languages (from Table 1); the bottom one shows the same, except with ratings averaged across the smaller dataset languages (from Table 2).]
For the neural language model, we used the same settings as in the original paper by Wada et al. (2019), except for an increase in the number of epochs from 10 to 20. This was to match the number of epochs used in our modified model, so as to provide a fair comparison. For our modified variant of the neural language model, we used SGD optimization and set the maximum learning rate for the one-cycle scheduling to 0.2, training the model for 20 epochs. We used similar frequency count cutoffs as with the fastText embeddings, except for setting the threshold for English to 3, as this worked better empirically. We trained a model for each language pair L S , L T .

Measurement Methods
Cross-Lingual Embedding Quality. To assess the quality of the cross-lingual embeddings, we used the bilingual dictionaries with 5k word translations from Lample et al. (2018) for the languages for which they are available as a gold standard. For others, we used the NRC EIL, as it contains English words that are machine-translated to other languages to assign them emotion ratings. We report two metrics for each language L T : a) We take each word in L T present in the gold standard dictionary, but we remove out-of-vocabulary words not present in our corpus vocabulary V T , as these are irrelevant for our later downstream emotion ratings task. On this set, we calculate the fraction of words for which our cross-lingual embeddings X C yield the correct translations according to Eq. 3, in terms of precision at k = 3. b) For comparison, we also report the same precision at k = 3 scores as above, but without eliminating out-of-vocabulary words. Here, if a word in the gold standard dictionary is not present in our induced dictionary, we simply count it as incorrectly translated.
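The two precision metrics can be sketched as one function with an optional vocabulary filter. The dictionaries below are toy stand-ins for the gold standard and our induced translations:

```python
def precision_at_k(gold, predicted, vocab=None, k=3):
    """gold: {target_word: set of acceptable source translations};
    predicted: {target_word: ranked source candidates}. With vocab given,
    out-of-vocabulary target words are skipped (metric a); without it,
    words missing from `predicted` count as wrong (metric b)."""
    hits = total = 0
    for word, refs in gold.items():
        if vocab is not None and word not in vocab:
            continue
        total += 1
        if any(c in refs for c in predicted.get(word, [])[:k]):
            hits += 1
    return hits / total if total else 0.0

gold = {"perro": {"dog"}, "gato": {"cat"}, "raro": {"rare"}}
pred = {"perro": ["dog", "hound", "wolf"], "gato": ["pet", "cat", "kitten"]}
print(precision_at_k(gold, pred, vocab={"perro", "gato"}))  # 1.0 (metric a)
print(round(precision_at_k(gold, pred), 3))                 # 0.667 (metric b)
```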
Emotion Rating Induction. To evaluate the accuracy of our emotion ratings for each language L_T, we take the intersection of the words in the NRC EIL and in the respective target corpus vocabulary V_T, and calculate the Pearson correlation coefficient for each language and each emotion. Unlike with translation precision, we do not also consider results without eliminating out-of-vocabulary words, as very few words per emotion (typically less than 100) are shared by both our induced dictionary and the NRC EIL, while thousands of words per emotion are often present in either the NRC EIL or in our induced emotion ratings individually. Thus, calculating the correlation on the entire set does not yield meaningful results.

[Table 3: Induced emotion ratings using our variant of Orthogonal Refinement, for the nine large dataset languages. The bottom row of the header gives the size of the induced emotion lexicon, calculated by counting the number of word-emotion pairs for a given language.]

Results
We present the evaluation of cross-lingual embeddings in Section 7.1 and of the resulting emotion ratings in Section 7.2. Additionally, we provide a qualitative error analysis in Section 7.3 and conduct a case study on sentence-level emotion ratings in Section 7.4.

Cross-Lingual Embedding Induction
In Tables 1 and 2, we provide the evaluation of our cross-lingual embedding induction phase in terms of translation precision. Table 1 considers the set of languages with the full 31K verses of translated Bible text. We observe that Wasserstein-Procrustes is frequently outperformed by Orthogonal Refinement, although the latter fails entirely for a greater number of languages. Exploiting parallel information, our modified Orthogonal Refinement is substantially more robust and obtains the best results for most of the languages, losing out on just a few to the original Orthogonal Refinement. However, the original method is not as robust, failing to arrive at embedding alignments for 4 out of 9 languages. Our initialization procedure, while not affecting precision much where alignments could already be found without it, appears to aid in bootstrapping the alignment process. While our procedure is clearly less scalable than operating on the word embedding matrices, on our datasets with just 31K verses or fewer, the computations could be performed on a single GPU in just a few minutes. Hence, we conclude that our variant is best suited for small aligned corpora, whereas for large corpora the original method is likely to work well enough.
Our variant of the Neural Language Model (NLM) performs significantly better than the original by Wada et al. (2019), and is also more robust than the original Orthogonal Refinement. However, it does not prevail over our variant of Orthogonal Refinement. Table 2 provides the results for languages with around 10K or fewer verses translated. Across the board, all algorithms fail to achieve satisfactory results. Our algorithm variants show slightly better results than the original methods, but the absolute precision remains low. It appears that such neural representation learning methods require more data in order to arrive at robust embeddings suitable for accurate translation induction.

Emotion Ratings
The correlation of the induced emotion ratings with those in the ground truth is presented in the graphs in Figure 1, reported separately for each method and emotion. Figure 1a considers the set of nine languages with large datasets (as listed in Table 1); each reading was obtained by averaging the coefficients across the nine languages. Our variant of Orthogonal Refinement attains the best scores across all emotions. Figure 1b is similar to Figure 1a but presents the correlations for the smaller dataset languages (those listed in Table 2). As expected, the correlation scores here are generally lower than for the languages with larger datasets, confirming that such minuscule amounts of training data are insufficient to induce emotion ratings using our methods.

Qualitative Analysis
To better understand in what ways our induced emotion ratings deviated from the NRC EIL, we performed a qualitative analysis. We took the 50 Spanish, 30 Yoruba, and 30 Sinhala words whose induced emotion ratings deviated the most from the NRC EIL's and labeled each of them with the cause of error, based on our inspection of the nearest source language neighbors and their corresponding emotion ratings. The results of this analysis are presented in Table 4. The following discussion focuses on Spanish, as the error categories are essentially the same for Yoruba and Sinhala.
While 21 of the Spanish word errors appeared to be due to random mistranslations with no identifiable patterns, we were able to categorize the remaining 29. Ten of these seemed to be exaggerated translations (falso to murderer instead of just false, engañar to evil instead of just cheat, cambio to turmoil instead of change). An interesting theme here is the exaggeration of words for deceit, cunning, and falsehoods (astucia, supposed to be cunning or craftiness, was translated to evil, hatred, and slander). Such shifts may stem from the biblical narrative in our source corpora, which may diverge from common use.
The next most frequent issue is ambiguity, where a word has multiple meanings and the NRC EIL picked one while our methods picked another. We also observed words being translated to their antonyms (calma to madness instead of calm, contento to ruin and sad instead of happiness) and to adjacent ideas (ayuda to distress instead of aid, médico to disease instead of doctor). This is an expected result when drawing on distributional semantics, as antonyms and adjacent concepts appear in similar contexts as the original words. Finally, we encountered cases in which the top translations for a word were correct, but incorrect translations were also included at the end of the list, skewing the final emotion rating.
A notable deviation from this error category frequency pattern is that of questionable NRC ratings for Yoruba. Inspecting Yoruba literature corpora for translations in context, we noticed that the NRC translations were incorrect surprisingly frequently, which led to our emotion ratings deviating significantly from those of the NRC for these mistranslated words.

Sentence-Level Evaluation
As an additional case study, we also evaluate our induced emotion ratings on the downstream task of predicting the emotion of sentences in an unsupervised manner.
Data. Due to the scarcity of emotion-labeled corpora for low-resource languages, we here rely on the Spanish-language LiSSS corpus (Torres-Moreno and Moreno-Jiménez, 2020), but again induce our Spanish ratings using our corpora of just 31K / 7.9K Bible verses. LiSSS provides around 500 sentences from the literary domain, each manually annotated with one or more of five emotions: love, fear, happiness, anger, and sadness. We dropped the sentences labeled exclusively with love, as that emotion is not present in the NRC EIL. We were left with 428 sentences, which we used for evaluation.
Method. Given a sentence S, we predict its emotion as

arg max_{e ∈ E} Σ_{w ∈ S} λ_w σ_e(w),

where E is the set of four candidate emotions. We consider two different weighting schemes: the first simply sets λ_w = 1, while the second sets it to the IDF score of w in the Spanish Bible corpus. For sentences labeled with a single emotion, the predicted emotion must match it to be counted as correct. For sentences labeled with multiple emotions, the predicted emotion must be among the true emotions.
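The prediction rule above can be sketched in a few lines; the mini-lexicon and IDF values below are invented for illustration, not drawn from the induced lexicons:

```python
def predict_emotion(sentence, lexicon, emotions, idf=None):
    """argmax_e sum_{w in S} lambda_w * sigma_e(w), with lambda_w = 1 by
    default or lambda_w = IDF(w) when an idf table is supplied.
    lexicon: {(word, emotion): intensity score}."""
    def total(e):
        return sum((idf.get(w, 0.0) if idf else 1.0) * lexicon.get((w, e), 0.0)
                   for w in sentence.split())
    return max(emotions, key=total)

# Hypothetical mini-lexicon over the four candidate emotions.
lex = {("dread", "fear"): 0.9, ("grim", "fear"): 0.6, ("grim", "sadness"): 0.4}
emotions = ["fear", "happiness", "anger", "sadness"]
print(predict_emotion("a grim dread fell", lex, emotions))  # fear
```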
Results. Table 5a shows the results of the evaluation against the LiSSS corpus when our method is trained on the full 31K-verse Spanish Bible. For reference, the expected precision that random guessing would achieve is 0.271. The NRC EIL, as expected, does best, as it used Google Translate, while our methods had only small Bible corpora as training data. The NRC EIL also shows a slight improvement upon adding IDF weighting. Interestingly, the three unmodified alignment methods produce emotion ratings that actually do better without IDF weighting. We conjecture that this is because the translation precision for rarely seen words is too low in these methods for IDF to be effective. Another interesting observation is that while our modified Orthogonal Refinement and NLM methods attained comparable translation precision, this does not translate into comparable precision on the LiSSS corpus. In fact, NLM does hardly better than chance, while our modified Orthogonal Refinement does almost twice as well as chance. Table 5b shows the results of the LiSSS evaluation when training only on the 7.9K-verse Bible version. Here, none of the methods do much better than chance.

Conclusion
In this paper, we investigate approaches to cross-lingually induce emotion ratings based on very small training corpora. This is achieved by taking an emotion lexicon for a resource-rich language and inducing a cross-lingual embedding space to transfer the source language emotion ratings to words in the resource-poor target languages. We compare several strategies to achieve this and evaluate them in terms of both translation precision and the final correlation of the induced emotion ratings with existing emotion lexicons. Generally, we find that our modified variants of the original algorithms yield important gains in such low-resource settings. We also evaluate them on the downstream task of unsupervised sentence-level emotion prediction on a human-annotated literary corpus. Overall, while the methods do not work sufficiently well with 10K or fewer verses, we attained satisfactory results on languages for which translated Bibles with at least 31K verses exist. This still leaves us with the ability to induce cross-lingual emotion ratings for over 350 languages, available online at http://emotionlexicon.org/.