A Multidimensional Lexicon for Interpersonal Stancetaking

The sociolinguistic construct of stancetaking describes the activities through which discourse participants create and signal relationships to their interlocutors, to the topic of discussion, and to the talk itself. Stancetaking underlies a wide range of interactional phenomena, relating to formality, politeness, affect, and subjectivity. We present a computational approach to stancetaking, in which we build a theoretically-motivated lexicon of stance markers, and then use multidimensional analysis to identify a set of underlying stance dimensions. We validate these dimensions intrinsically and extrinsically, showing that they are internally coherent, match pre-registered hypotheses, and correlate with social phenomena.


Introduction
What does it mean to be welcoming or standoffish, light-hearted or cynical? Such interactional styles are performed primarily with language, yet little is known about how linguistic resources are arrayed to create these social impressions. The sociolinguistic concept of interpersonal stancetaking attempts to answer this question, by providing a conceptual framework that accounts for a range of interpersonal phenomena, subsuming formality, politeness, and subjectivity (Du Bois, 2007).¹ This framework has been applied almost exclusively through qualitative methods, using close readings of individual texts or dialogs to uncover how language is used to position individuals with respect to their interlocutors and readers.
¹ Stancetaking is distinct from the notion of stance which corresponds to a position in a debate (Walker et al., 2012). Similarly, Freeman et al. (2014) correlate phonetic features with the strength of such argumentative stances.
We attempt the first large-scale operationalization of stancetaking through computational methods. Du Bois (2007) formalizes stancetaking as a multi-dimensional construct, reflecting the relationship of discourse participants to (a) the audience or interlocutor; (b) the topic of discourse; (c) the talk or text itself. However, the multidimensional nature of stancetaking poses problems for traditional computational approaches, in which labeled data is obtained by relying on annotator intuitions about scalar concepts such as politeness (Danescu-Niculescu-Mizil et al., 2013) and formality (Pavlick and Tetreault, 2016).
Instead, our approach is based on a theoretically-guided application of unsupervised learning, in the form of factor analysis, applied to lexical features. Stancetaking is characterized in large part by an array of linguistic features ranging from discourse markers such as actually to backchannels such as yep (Kiesling, 2009). We therefore first compile a lexicon of stance markers, combining prior lexicons from Biber and Finegan (1989) and the Switchboard Dialogue Act Corpus (Jurafsky et al., 1998). We then extend this lexicon to the social media domain using word embeddings. Finally, we apply multi-dimensional analysis of co-occurrence patterns to identify a small set of stance dimensions.
To measure the internal coherence (construct validity) of the stance dimensions, we use a word intrusion task (Chang et al., 2009) and a set of preregistered hypotheses. To measure the utility of the stance dimensions, we perform a series of extrinsic evaluations. A predictive evaluation shows that the membership of online communities is determined in part by the interactional stances that predominate in those communities. Furthermore, the induced stance dimensions are shown to align with annotations of politeness and formality.
Contributions We operationalize the sociolinguistic concept of stancetaking as a multidimensional framework, making it possible to measure at scale. Specifically,
• we contribute a lexicon of stance markers based on prior work and adapted to the genre of online interpersonal discourse;
• we group stance markers into latent dimensions;
• we show that these stance dimensions are internally coherent;
• we demonstrate that the stance dimensions predict and correlate with social phenomena.

Related Work
From a theoretical perspective, we build on prior work on interactional meaning in language. Methodologically, our paper relates to prior work on lexicon-based analysis and contrastive studies of social media communities.

Linguistic Variation and Social Meaning
In computational sociolinguistics (Nguyen et al., 2016), language variation has been studied primarily in connection with macro-scale social variables, such as age (Argamon et al., 2007; Nguyen et al., 2013), gender (Burger et al., 2011; Bamman et al., 2014), race (Eisenstein et al., 2011; Blodgett et al., 2016), and geography (Eisenstein et al., 2010). This parallels what Eckert (2012) has called the "first wave" of language variation studies in sociolinguistics, which also focused on macro-scale variables. More recently, sociolinguists have dedicated increased attention to situational and stylistic variation, and the interactional meaning that such variation can convey (Eckert and Rickford, 2001). This linguistic research can be aligned with computational efforts to quantify phenomena such as subjectivity (Riloff and Wiebe, 2003), sentiment (Wiebe et al., 2005), politeness (Danescu-Niculescu-Mizil et al., 2013), formality (Pavlick and Tetreault, 2016), and power dynamics (Prabhakaran et al., 2012). While linguistic research on interactional meaning has focused largely on qualitative methodologies such as discourse analysis (e.g., Bucholtz and Hall, 2005), these computational efforts have made use of crowdsourced annotations to build large datasets of, for example, polite and impolite text. These annotation efforts draw on the annotators' intuitions about the meaning of these sociolinguistic constructs.
The key idea, as articulated by Du Bois (2007), is that stancetaking captures the speaker's relationship to (a) the topic of discussion, (b) the interlocutor or audience, and (c) the talk (or writing) itself. Various configurations of these three legs of the "stance triangle" can account for a range of phenomena. For example, epistemic stance relates to the speaker's certainty about what is being expressed, while affective stance indicates the speaker's emotional position with respect to the content (Ochs, 1993).
The framework of stancetaking has been widely adopted in linguistics, particularly in the discourse analytic tradition, which involves close reading of individual texts or conversations (Kärkkäinen, 2006; Keisanen, 2007; Precht, 2003; White, 2003). But despite its strong theoretical foundation, we are aware of no prior efforts to operationalize stancetaking at scale. Since annotators may not have strong intuitions about stance (in the way that they do about formality and politeness), we cannot rely on the annotation methodologies employed in prior work. We take a different approach, performing a multidimensional analysis of the distribution of likely stance markers.

Lexicon-based Analysis
Our operationalization of stancetaking is based on the induction of lexicons of stance markers. The lexicon-based methodology is related to earlier work from social psychology, such as the General Inquirer (Stone, 1966) and LIWC (Tausczik and Pennebaker, 2010). In LIWC, the basic categories were identified first, based on psychological constructs (e.g., positive emotion, cognitive processes, drive to power) and syntactic groupings of words and phrases (e.g., pronouns, prepositions, quantifiers). The lexicon designers then manually constructed lexicons for each category, augmenting their intuitions by using distributional statistics to suggest words that may have been missed (Pennebaker et al., 2015). In contrast, we follow the approach of Biber (1991), using multidimensional analysis to identify latent groupings of markers based on co-occurrence statistics. We then use crowdsourcing and extrinsic comparisons to validate the coherence of these dimensions.

Multicommunity Studies
Social media platforms such as Reddit, Stack Exchange, and Wikia can be considered multicommunity environments, in that they host multiple subcommunities with distinct social and linguistic properties. Such subcommunities can be contrasted in terms of topics (Adamic et al., 2008; Hessel et al., 2014) and social networks (Backstrom et al., 2006). Our work focuses on Reddit, emphasizing community-wide differences in norms for interpersonal interaction. In the same vein, Tan and Lee (2015) attempt to characterize stylistic differences across subreddits by focusing on very common words and parts-of-speech; Tran and Ostendorf (2016) use language models and topic models to measure similarity across threads within a subreddit. One distinction of our approach is that the use of multidimensional analysis gives us interpretable dimensions of variation. This makes it possible to identify the specific interpersonal features that vary across communities.

Data
Reddit, one of the internet's largest social media platforms, is a collection of subreddits organized around various topics of interest. As of January 2017, there were more than one million subreddits and nearly 250 million users, discussing topics ranging from politics (r/politics) to horror stories (r/nosleep). Although Reddit was originally designed for sharing hyperlinks, it also provides the ability to post original textual content, submit comments, and vote on content quality (Gilbert, 2013).
For example, the following are two comments from the subreddit r/malefashionadvice, posted in response to a picture posted by a user asking for fashion advice.
U1: "I think the beard looks pretty good. Definitely not the goatee. Clean shaven is always the safe option."
U2: "Definitely the beard. But keep it trimmed."
The phrases in bold face are markers of stance, indicating an evaluative stance. The following example is part of a thread in the subreddit r/photoshopbattles where users discuss an edited image posted by the original poster (OP). The phrases in bold face are markers of stance, indicating an involved and interactional stance.
U3: "Ha ha awesome!"
U4: "are those..... furries?"
OP: "yes, sir. They are!"
U4: "Oh cool. That makes sense!"
We used an archive of 530 million comments posted on Reddit in 2014, retrieved from the public archive of Reddit comments. This dataset consists of each post's textual content, along with metadata that identifies the subreddit, thread, author, and post creation time. More statistics about the full dataset are shown in Table 1.

Stance Lexicon
Interpersonal stancetaking can be characterized in part by an array of linguistic features such as hedges (e.g., might, kind of), discourse markers (e.g., actually, I mean), and backchannels (e.g., yep, um). Our analysis focuses on these markers, which we collect into a lexicon.

Seed lexicon
We began with a seed lexicon of stance markers from Biber and Finegan (1989), who compiled an extensive list by surveying dictionaries, previous studies on stance, and texts in several genres of English. This list includes certainty adverbs (e.g., actually, of course, in fact), affect markers (e.g., amazing, thankful, sadly), and hedges (e.g., kind of, maybe, something like) among other adverbial, adjectival, verbal, and modal markers of stance. In total, this list consists of 448 stance markers.
The Biber and Finegan (1989) lexicon is primarily based on written genres from the pre-social media era. Our dataset, like much of the recent work in this domain, consists of online discussions, which differ significantly from printed texts (Eisenstein, 2013). One difference is that online discussions contain a number of dialog act markers that are characteristic of spoken language, such as oh yeah, nah, wow. We accounted for this by adding 74 dialog act markers from the Switchboard Dialog Act Corpus (Jurafsky et al., 1998). The final seed lexicon consists of 517 unique markers, from these two sources. Note that the seed lexicon also includes markers that contain multiple tokens (e.g., kind of, I know).

Lexicon expansion
Online discussions differ not only from written texts, but also from spoken discussions, due to their use of non-standard vocabulary and spellings. To measure stance accurately, these genre differences must be accounted for. We therefore expanded the seed lexicon using automated techniques based on distributional statistics. This is similar to prior work on the expansion of sentiment lexicons (Hatzivassiloglou and McKeown, 1997;Hamilton et al., 2016).
Our lexicon expansion approach used word embeddings to find words that are distributionally similar to those in the seed set. We trained word embeddings on a corpus of 25 million Reddit comments and a vocabulary of the 100K most frequent words on Reddit using the structured skip-gram models of both WORD2VEC (Mikolov et al., 2013) and WANG2VEC (Ling et al., 2015) with default parameters. The WANG2VEC method augments WORD2VEC by accounting for word order information. We found the similarity judgments obtained from WANG2VEC to be qualitatively more meaningful, so we used these embeddings to construct the expanded lexicon.

[Table 2: columns "Seed term" and "Expanded terms"; example seeds from Biber and Finegan (1989).]
To perform lexicon expansion, we constructed a dictionary of candidate terms, consisting of all unigrams that occur with a frequency rate of at least $10^{-7}$ in the Reddit comment corpus. Then, for each single-token marker in the seed lexicon, we identified all terms from the candidate set whose embedding has cosine similarity of at least 0.75 with respect to the seed marker. Table 2 shows examples of seed markers and related terms we extracted from word embeddings. Through this procedure, we identified 228 additional markers based on similarity to items in the seed list from Biber and Finegan (1989), and 112 additional markers based on the seed list of dialog acts. In total, our stance lexicon contains 812 unique markers.
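The expansion step can be illustrated with a minimal Python sketch. This is not the code used in the study: the function names (`cosine`, `expand_lexicon`), the toy embeddings, and the toy relative frequencies are all hypothetical, and only the two thresholds ($10^{-7}$ and 0.75) come from the description above.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def expand_lexicon(seeds, embeddings, rel_freq, min_freq=1e-7, min_sim=0.75):
    """Return candidate unigrams whose embedding is close to any single-token seed."""
    # Candidate set: sufficiently frequent unigrams that have an embedding.
    candidates = [w for w, f in rel_freq.items() if f >= min_freq and w in embeddings]
    expanded = set()
    for seed in seeds:
        # Multi-token seeds (e.g., "kind of") are skipped, as in the text.
        if " " in seed or seed not in embeddings:
            continue
        for w in candidates:
            if w not in seeds and cosine(embeddings[seed], embeddings[w]) >= min_sim:
                expanded.add(w)
    return expanded

# Toy data (hypothetical): 3-dimensional embeddings and relative frequencies.
toy_embeddings = {
    "yeah": [1.0, 0.1, 0.0],
    "yea": [0.9, 0.2, 0.0],
    "maybe": [0.0, 1.0, 0.1],
    "table": [0.0, 0.0, 1.0],
}
toy_freq = {"yeah": 1e-3, "yea": 1e-4, "maybe": 1e-3, "table": 1e-4}
toy_seeds = {"yeah", "maybe", "kind of"}
new_markers = expand_lexicon(toy_seeds, toy_embeddings, toy_freq)
```

In this toy example only "yea" is close enough to a seed ("yeah") to be added; "table" is frequent enough to be a candidate but is not similar to any seed.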

Linguistic Dimensions of Stancetaking
To summarize the main axes of variation across the lexicon of stance markers, we apply a multidimensional analysis (Biber, 1992) to the distributional statistics of stance markers across subreddit communities. Each dimension of variation can then be viewed as a spectrum, characterized by the stance markers and subreddits that are associated with the positive and negative extremes. Multidimensional analysis is based on singular value decomposition, which has been applied successfully to a wide range of problems in natural language processing and information retrieval (e.g., Landauer et al., 1998). While Bayesian topic models are an appealing alternative, singular value decomposition is fast and deterministic, with a minimal number of tuning parameters.

Extracting Stance Dimensions
Our analysis is based on the co-occurrence of stance markers and subreddits. This is motivated by our interest in comparisons of the interactional styles of online communities within Reddit, and by the premise that these distributional differences reflect socially meaningful communicative norms. A pilot study applied the same technique to the co-occurrence of stance markers and individual authors, and the resulting dimensions appeared to be less stylistically coherent.
Singular value decomposition is often used in combination with a transformation of the co-occurrence counts by pointwise mutual information (Bullinaria and Levy, 2007). This transformation ensures that each cell in the matrix indicates how much more likely a stance marker is to co-occur with a given subreddit than would happen by chance under an independence assumption. Because negative PMI values tend to be unreliable, we use positive PMI (PPMI), which involves replacing all negative PMI values with zeros (Niwa and Nitta, 1994). Therefore, we obtain stance dimensions by applying singular value decomposition to the matrix constructed as follows:
$$X_{m,s} = \left( \log \frac{\Pr(\text{marker} = m, \text{subreddit} = s)}{\Pr(\text{marker} = m) \, \Pr(\text{subreddit} = s)} \right)_{+},$$
where $(x)_{+} = \max(x, 0)$.
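As a hedged illustration of the PPMI transformation, the sketch below estimates the probabilities by maximum likelihood from raw co-occurrence counts. The estimator, the function name `ppmi_matrix`, and the toy counts are assumptions made for the example; the text does not specify these details.

```python
import math
from collections import defaultdict

def ppmi_matrix(counts):
    """Map (marker, subreddit) -> PPMI, given (marker, subreddit) -> count."""
    total = sum(counts.values())
    marker_totals = defaultdict(int)
    sub_totals = defaultdict(int)
    for (m, s), c in counts.items():
        marker_totals[m] += c
        sub_totals[s] += c
    ppmi = {}
    for (m, s), c in counts.items():
        if c == 0:
            # log(0) is undefined; an unseen pair has PPMI zero.
            ppmi[(m, s)] = 0.0
            continue
        # PMI = log( P(m, s) / (P(m) P(s)) ) with maximum-likelihood estimates.
        pmi = math.log(c * total / (marker_totals[m] * sub_totals[s]))
        ppmi[(m, s)] = max(pmi, 0.0)  # clip negative PMI to zero
    return ppmi

# Toy counts (hypothetical).
toy_counts = {
    ("lol", "r/funny"): 8, ("lol", "r/askscience"): 2,
    ("however", "r/funny"): 2, ("however", "r/askscience"): 8,
}
toy_ppmi = ppmi_matrix(toy_counts)
```

Here the pair ("lol", "r/funny") occurs 1.6 times more often than independence would predict, so its PPMI is log 1.6, while ("lol", "r/askscience") has negative PMI and is clipped to zero.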
Truncated singular value decomposition performs the approximate factorization $X \approx U \Sigma V^{\top}$, where each row of the matrix $U$ is a $k$-dimensional description of each stance marker, and each row of $V$ is a $k$-dimensional description of each subreddit. We included the 7,589 subreddits that received at least 1,000 comments in 2014.
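The factorization can be sketched with NumPy as follows. This is an illustrative example on a small dense toy matrix, assuming NumPy is available; the helper name `truncated_svd` is hypothetical, and a matrix of the size used in the study would more plausibly call for a sparse solver.

```python
import numpy as np

def truncated_svd(X, k):
    """Return (U_k, S_k, V_k) such that X is approximated by U_k @ diag(S_k) @ V_k.T."""
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    # Singular values are returned in descending order, so keep the first k.
    return U[:, :k], S[:k], Vt[:k, :].T

# Toy PPMI-like matrix (hypothetical): 4 stance markers x 3 subreddits.
X = np.array([
    [0.5, 0.0, 0.4],
    [0.6, 0.0, 0.3],
    [0.0, 0.7, 0.0],
    [0.0, 0.8, 0.1],
])
U_k, S_k, V_k = truncated_svd(X, k=2)
X_hat = U_k @ np.diag(S_k) @ V_k.T

# Rows of U_k describe markers; rows of V_k describe subreddits.
# Sorting one column orders the markers from one extreme of a dimension to the other.
marker_order_dim1 = np.argsort(U_k[:, 0])
```

Note that the sign of each singular vector is arbitrary, so which extreme of a dimension is called "positive" or "negative" carries no meaning by itself.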

Results: Stance Dimensions
From the SVD analysis, we extracted the six principal latent dimensions that explain the most variation in our dataset.⁷ The decision to include only the first six dimensions was based on the strength of the singular values corresponding to the dimensions. Table 3 shows the top five stance markers for each extreme of the six dimensions. The stance dimensions convey a range of concepts, such as involved versus informational language, narrative versus dialogue-oriented writing, standard versus non-standard variation, and positive versus negative affect. Figure 1 shows the distribution of subreddits along two of these dimensions.
⁷ Similar to factor analysis, the top few dimensions of SVD explain the most variation, and tend to be most interpretable. A scree plot (Cattell, 1966) showed that the amount of variation explained dropped after the top six dimensions, and qualitative interpretation showed that the remaining dimensions were less interpretable.
Figure 1: Mapping of subreddits in dimension two and dimension three, highlighting especially popular subreddits. Picture-oriented subreddits r/gonewild and r/aww map high on dimension two and low on dimension three, indicating an involved and informal style of discourse. Subreddits dedicated to knowledge-sharing discussions, such as r/askscience and r/space, map low on dimension two and high on dimension three, indicating an informational and formal style.

Construct Validity
Evaluating model output against gold-standard annotations is appropriate when there is some notion of a correct answer. As stancetaking is a multidimensional concept, we have taken an unsupervised approach. Therefore, we use evaluation techniques based on the notion of validity, which is the extent to which the operationalization of a construct truly captures the intended quantity or concept. Validation techniques for unsupervised content analysis are widely found in the social science literature (Weber, 1990; Quinn et al., 2010) and have also been recently used in the NLP and machine learning communities (e.g., Chang et al., 2009; Murphy et al., 2012; Sim et al., 2013). We used several methods to validate the stance dimensions extracted from the corpus of Reddit comments. This section describes intrinsic evaluations, which test whether the extracted stance dimensions are linguistically coherent and meaningful.

Word Intrusion Task
A word intrusion task is used to measure the coherence and interpretability of a group of words. Human raters are presented with a list of terms, all but one of which are selected from a target concept; their task is to identify the intruder. If the target concept is internally coherent, human raters should be able to perform this task accurately; if not, their selections should be random. Word intrusion tasks have previously been used to validate the interpretability of topic models (Chang et al., 2009) and vector space models (Murphy et al., 2012). We deployed a word intrusion task on Amazon Mechanical Turk (AMT), in which we presented the top four stance markers from one end of a dimension, along with an intruder marker selected from the top four markers of the opposite end of that dimension. In this way, we created four word intrusion tasks for each end of each dimension. The main reason for including only the top four words in each dimension is the expense of conducting crowd-sourced evaluations. In the most relevant prior work, Chang et al. (2009) used the top five words from each topic in their evaluation of topic models.
Worker selection We required that the AMT workers ("turkers") have completed a minimum of 1,000 HITs and have at least a 95% approval rate. Furthermore, because our task is based on analysis of English language texts, we required the turkers to be native speakers of English living in one of the majority English speaking countries. As a further requirement, we required the turkers to obtain a qualification which involves an English comprehension test similar to the questions in standardized English language tests. These requirements are based on best practices identified by Callison-Burch and Dredze (2010).
Task specification Each AMT human intelligence task (HIT) consists of twelve word intrusion tasks, one for each end of the six dimensions. We provided minimal instructions regarding the task, and did not provide any examples, to avoid introducing bias. As a further quality control, each HIT included three questions which ask the turkers to pick the best synonym for a given word from a list of five answers, where one answer was clearly correct; turkers who gave incorrect answers were to be excluded, but this situation did not arise in practice. Altogether each HIT consists of 15 questions, and was paid US$1.50. Five different turkers performed each HIT.

Results
We measured the interrater reliability using Krippendorff's α (Krippendorff, 2007) and the model precision metric of Chang et al. (2009). Results on both metrics were encouraging. We obtained a value of α = 0.73, on a scale where α = 0 indicates chance agreement and α = 1 indicates perfect agreement. The model precision was 0.82; chance precision is 0.20. To offer a sense of typical values for this metric, Chang et al. (2009) report model precisions in the range 0.7-0.83 in their analysis of topic models. Overall, these results indicate that the multi-dimensional analysis has succeeded at identifying dimensions that reflect natural groupings of stance markers.
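To make the model precision metric concrete, the following sketch computes it as the fraction of rater selections that match the true intruder, averaged over tasks. The task identifiers, rater responses, and function name are hypothetical and serve only to illustrate the calculation.

```python
def model_precision(responses, intruders):
    """Mean over tasks of the fraction of raters who picked the true intruder.

    responses: dict mapping task id -> list of words chosen by the raters
    intruders: dict mapping task id -> the true intruder word
    """
    per_task = []
    for task_id, chosen in responses.items():
        hits = sum(1 for word in chosen if word == intruders[task_id])
        per_task.append(hits / len(chosen))
    return sum(per_task) / len(per_task)

# Toy data (hypothetical): two tasks, five raters each.
toy_intruders = {"dim1_high": "however", "dim1_low": "lol"}
toy_responses = {
    "dim1_high": ["however", "however", "however", "haha", "however"],
    "dim1_low": ["lol", "lol", "lol", "lol", "lol"],
}
precision = model_precision(toy_responses, toy_intruders)

# With five words shown per task (four markers plus one intruder),
# a rater guessing uniformly at random is correct one time in five.
chance = 1 / 5
```

In the toy data, four of five raters find the intruder in the first task and all five in the second, so the model precision is 0.9.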

Pre-registered Hypotheses
Content validity was also assessed using a set of pre-registered hypotheses. The practice of pre-registering hypotheses before an analysis and then testing their correctness is widely used in the social sciences; it was adopted by Sim et al. (2013) to evaluate the induction of political ideological models from text. Before performing the multidimensional analysis, we used our prior linguistic knowledge to identify two groups of hypotheses that are expected to hold with respect to the latent stancetaking dimensions:
• Hypothesis I: Stance markers that are synonyms should not appear on the opposite ends of a stance dimension.
• Hypothesis II: If at least one stance marker from a predefined stance feature group (defined below) appears on one end of a stance dimension, then other markers from the same feature group will tend not to appear at the opposite end of the same dimension.

Synonym Pairs
For each marker in our stance lexicon, we extracted synonyms from WordNet, focusing on markers that appear in only one WordNet synset, and not including pairs in which one term was an inflection of the other.⁹ Our final list contains 73 synonym pairs (e.g., eventually/finally, grateful/thankful, yea/yeah). Of these pairs, there were 59 cases in which both terms appeared in either the top or bottom 200 positions of a stance dimension. In 51 of these cases (86%), the two terms appeared on the same side of the dimension. The chance rate would be 50%, so this supports Hypothesis I and further validates the stance dimensions. More details of the results are shown in Table 4. Note that synonym pairs may differ in aspects such as formality (e.g., said/informed, want/desire), which is one of the main dimensions of stancetaking. Therefore, perfect support for Hypothesis I is not expected.
⁹ It is possible that inflections are semantically similar, because by definition they are changes in the form of a word to mark distinctions such as tense, person, or number. However, different inflections of a single word form might be used to mark different stances (e.g., some stances might be associated with the past while others might be associated with the present or future).
Table 4: Results for pre-registered hypothesis that stance dimensions will not split synonym pairs.
                    Number of synonym pairs
Stance Dimension    On same end    On opposite ends
DIMENSION 1         6              3
DIMENSION 2         12             2
DIMENSION 3         2              1
DIMENSION 4         11             0
DIMENSION 5         10             2
DIMENSION 6         10             0
Total               51/59          8/59

Stance Feature Groups
Biber and Finegan (1989) group stance markers into twelve "feature groups", such as certainty adverbs, doubt adverbs, affect expressions, and hedges. Ideally, the stance dimensions should preserve these groupings. To test this, for each of the seven feature groups with at least ten stance markers in the lexicon, we counted the number of terms appearing among the top 200 positions in both ends (high/low) of each dimension. Under the null hypothesis, the stance dimensions are random with respect to the feature groups, so we would expect roughly an equal number of markers on both ends. As shown in Table 5, for five of the seven feature groups, it is possible to reject the null hypothesis at p < .007, which is the significance threshold at α = 0.05, after correcting for multiple comparisons using the Bonferroni correction. This indicates that the stance dimensions are aligned with predefined stance feature groups.
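The Bonferroni threshold quoted above is simply 0.05 / 7 ≈ 0.007. The sketch below shows that arithmetic together with one possible test of the "equal number of markers on both ends" null hypothesis, an exact two-sided binomial test. The choice of test is an assumption made for illustration, since the specific test is not named here, and the counts in the example are hypothetical rather than taken from Table 5.

```python
from math import comb

def binomial_two_sided_p(k, n):
    """Exact two-sided binomial p-value under the null that each end is equally likely.

    Sums the probabilities of all outcomes that are no more likely than observing k.
    """
    pmf = [comb(n, i) / 2 ** n for i in range(n + 1)]
    # Small tolerance so that outcomes tied with pmf[k] are always included.
    return min(1.0, sum(q for q in pmf if q <= pmf[k] * (1 + 1e-12)))

# Bonferroni correction for seven feature groups at alpha = 0.05.
alpha = 0.05
n_groups = 7
threshold = alpha / n_groups  # roughly 0.00714

# Hypothetical feature group: 18 of its 20 ranked markers fall on the same end.
p_value = binomial_two_sided_p(18, 20)
reject_null = p_value < threshold
```

For these hypothetical counts the p-value is about 0.0004, which is below the corrected threshold, so the null hypothesis would be rejected for that group.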

Extrinsic Evaluations
The evaluations in the previous section test internal validity; we now describe evaluations testing whether the stance dimensions are relevant to external social and interactional phenomena.

Predicting Cross-posting
Online communities can be considered as communities of practice (Eckert and McConnell-Ginet, 1992), where members come together to engage in shared linguistic practices. These practices evolve simultaneously with membership, coalescing into shared norms. The memberships of multiple subreddits on the same topic (e.g., r/science and r/askscience) often do not overlap considerably. Therefore we hypothesize that users of Reddit have preferred interactional styles, and that participation in subreddit communities is governed not only by topic interest, but also by these interactional preferences. The proposed stancetaking dimensions provide a simple measure of interactional style, allowing us to test whether it is predictive of community membership decisions.
Table 5: Results for preregistered hypothesis that stance dimensions will align with stance feature groups of Biber and Finegan (1989).
Classification task We design a classification task, in which the goal is to determine whether a pair of subreddits is high-crossover or low-crossover. In high-crossover subreddit pairs, individuals are especially likely to participate in both.
For the purpose of this evaluation, individuals are considered to participate in a subreddit if they contribute posts or comments. We compute the pointwise mutual information (PMI) with respect to cross-participation among the 100 most popular subreddits. For each subreddit s, we identify the five highest and lowest PMI pairs (s, t), and add these to the high-crossover and low-crossover sets, respectively. Example pairs are shown in Table 6. After eliminating redundant pairs, we identify 437 unique high-crossover pairs, and 465 unique low-crossover pairs. All evaluations are based on multiple random training/test splits over this dataset.
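One way to compute cross-participation PMI is sketched below. The probability estimates are an assumption made for illustration: P(s) is taken to be the fraction of users who participate in s, and P(s, t) the fraction who participate in both. The function name and the toy users are hypothetical.

```python
import math
from itertools import combinations

def crossover_pmi(user_subs):
    """Map each subreddit pair (s, t), with s < t, to its cross-participation PMI.

    user_subs: dict mapping user -> set of subreddits the user posted or commented in.
    Pairs with no shared users get negative infinity.
    """
    n_users = len(user_subs)
    sub_users = {}
    for user, subs in user_subs.items():
        for s in subs:
            sub_users.setdefault(s, set()).add(user)
    pmi = {}
    for s, t in combinations(sorted(sub_users), 2):
        both = len(sub_users[s] & sub_users[t])
        if both == 0:
            pmi[(s, t)] = float("-inf")
        else:
            # log( P(s, t) / (P(s) P(t)) ) with user-level frequency estimates
            pmi[(s, t)] = math.log(both * n_users / (len(sub_users[s]) * len(sub_users[t])))
    return pmi

# Toy participation data (hypothetical users).
toy_users = {
    "a": {"r/politics", "r/technology"},
    "b": {"r/politics", "r/technology"},
    "c": {"r/nosleep"},
    "d": {"r/nosleep", "r/politics"},
}
toy_pmi = crossover_pmi(toy_users)
ranked_pairs = sorted(toy_pmi, key=toy_pmi.get, reverse=True)
```

In this toy example the pair (r/politics, r/technology) has the highest PMI, and (r/nosleep, r/technology), which shares no users, has the lowest.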
Classification approaches A simple classification approach is to predict that subreddits with similar text will have high crossover. We measure similarity using TF-IDF weighted cosine similarity, using two possible lexicons: the 8,000 most frequent words on Reddit (BOW), and the stance lexicon (STANCE MARKERS). The similarity threshold between high-crossover and low-crossover pairs was estimated on the training data. We also tested the relevance of multi-dimensional analysis, by applying SVD to both lexicons. For each pair of subreddits, we computed a feature set of the absolute difference across the top six latent dimensions, and applied a logistic regression classifier. Regularization was tuned by internal cross-validation. Table 7 shows average accuracies for these models. The stance-based SVD features are considerably more accurate than the BOW-based SVD features, indicating that interactional style does indeed predict cross-posting behavior. Both are considerably more accurate than the bag-of-words models based on cosine similarity.
Table 6: Cross-community participation (example pairs).
High-Scoring Pairs                   Low-Scoring Pairs
r/blog, r/announcements              r/gonewild, r/leagueoflegends
r/pokemon, r/wheredidthesodago       r/soccer, r/nosleep
r/politics, r/technology             r/programming, r/gonewild
r/LifeProTips, r/dataisbeautiful     r/nfl, r/leagueoflegends
r/Unexpected, r/JusticePorn          r/Minecraft, r/personalfinance
Table 7: Accuracy for prediction of subreddit cross-participation.
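Two pieces of the classification setup can be sketched briefly: the absolute-difference features used by the SVD-based models, and a simple way to estimate a similarity threshold for the cosine baselines. The threshold search shown here (choosing the cutoff that maximizes training accuracy) is an assumption, since the estimation procedure is not spelled out; all names and values are hypothetical, and the logistic regression step is omitted.

```python
def pair_features(sub_vectors, s, t, k=6):
    """Absolute difference between two subreddits across the top k latent dimensions."""
    return [abs(a - b) for a, b in zip(sub_vectors[s][:k], sub_vectors[t][:k])]

def fit_threshold(similarities, labels):
    """Pick the similarity cutoff that maximizes training accuracy.

    labels: True for high-crossover pairs, False for low-crossover pairs.
    A pair is predicted high-crossover when its similarity is >= the cutoff.
    """
    best_cut, best_acc = None, -1.0
    for cut in sorted(set(similarities)):
        correct = sum((sim >= cut) == lab for sim, lab in zip(similarities, labels))
        acc = correct / len(labels)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut, best_acc

# Toy subreddit vectors (hypothetical rows of V restricted to six dimensions).
toy_vectors = {
    "r/a": [0.5, -0.25, 0.0, 1.0, 0.75, -0.5],
    "r/b": [0.25, 0.25, 0.0, -1.0, 0.5, -0.5],
}
features = pair_features(toy_vectors, "r/a", "r/b")

# Toy training similarities and labels (hypothetical).
cutoff, train_acc = fit_threshold([0.9, 0.8, 0.4, 0.3], [True, True, False, False])
```

The feature vector would then be passed to a classifier; small absolute differences indicate that two communities sit close together on a given stance dimension.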

Politeness and Formality
The utility of the induced stance dimensions depends on their correlation with social phenomena of interest. Prior work has used crowdsourcing to annotate texts for politeness and formality. We now evaluate the stancetaking properties of these annotated texts.
Data We used the politeness corpus of Wikipedia edit requests from Danescu-Niculescu-Mizil et al. (2013), which includes the textual content of the edit requests, along with scalar annotations of politeness. Following the original authors, we compare the text for the messages ranked in the first and fourth quartiles of politeness scores. For formality, we used the corpus from Pavlick and Tetreault (2016), focusing on the blogs domain, which is most similar to our domain of Reddit. Each sentence in this corpus was annotated for formality levels from −3 to +3. We considered only the sentences with mean formality score greater than +1 (more formal) or less than −1 (less formal).
Stance dimensions For each document in the above datasets, we compute the stance properties, as follows: for each dimension, we compute the total frequency of the hundred most positive terms and the hundred most negative terms, and then take the difference. Instances containing no terms from either list are excluded. We focus on stance dimensions two and five (summarized in Table 3), because they appeared to be most relevant to politeness and formality. Dimension two contrasts informational and argumentative language against emotional and non-standard language. Dimension five contrasts positive and formal language against non-standard and somewhat negative language.
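The per-document stance score described above can be sketched as follows. The sketch assumes raw token counts and single-token markers only, and the marker lists are short hypothetical placeholders standing in for the hundred most positive and most negative terms of a dimension.

```python
def stance_score(tokens, positive_terms, negative_terms):
    """Count of positive-end markers minus count of negative-end markers.

    Returns None when the document contains no marker from either list,
    mirroring the exclusion rule described in the text.
    """
    pos = sum(1 for tok in tokens if tok in positive_terms)
    neg = sum(1 for tok in tokens if tok in negative_terms)
    if pos + neg == 0:
        return None
    return pos - neg

# Placeholder marker lists (hypothetical; not the actual ends of any dimension).
toy_positive = {"alpha", "beta"}
toy_negative = {"gamma"}

score_a = stance_score("alpha then beta and gamma".split(), toy_positive, toy_negative)
score_b = stance_score("no markers here".split(), toy_positive, toy_negative)
```

The first toy document contains two positive-end markers and one negative-end marker, giving a score of 1; the second contains none and is excluded.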
Results A kernel density plot of the resulting differences is shown in Figure 2. The effect sizes of the resulting differences are quantified using Cohen's d statistic (Cohen, 1988). Effect sizes for all differences are between 0.3 and 0.4, indicating small-to-medium effects, except for the evaluation of formality on dimension five, where the effect size is close to zero. The relatively modest effect sizes are unsurprising, given the short length of the texts. However, these differences lend insight into the relationship between formality and politeness, which may seem to be closely related concepts. On dimension two, it is possible to be polite while using non-standard language such as hehe and awww, so long as the sentiment expressed is positive; however, these markers are not consistent with formality. On dimension five, we see that positive sentiment terms such as lovely and stunning are consistent with politeness, but not with formality. Indeed, the distribution of dimension five indicates that both ends of dimension five are consistent only with informal texts.
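For reference, Cohen's d with a pooled standard deviation can be computed as in the sketch below. The pooled-variance form is the standard textbook definition and is assumed here; the two toy samples are hypothetical and do not come from the corpora above.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(a, b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

# Toy stance scores for two groups of documents (hypothetical).
polite_scores = [2, 4, 6]
impolite_scores = [1, 3, 5]
d = cohens_d(polite_scores, impolite_scores)
```

Both toy groups have a sample variance of 4 and their means differ by 1, so d is 0.5.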
Overall, these results indicate that interactional phenomena such as politeness and formality are reflected in our stance dimensions, which are induced in an unsupervised manner. Future work may consider the utility of these stance dimensions to predict these social phenomena, particularly in cross-domain settings where lexical classifiers may overfit.

Conclusion
Stancetaking provides a general perspective on the various linguistic phenomena that structure social interactions. We have identified a set of several hundred stance markers, building on previously identified lexicons by using word embeddings to perform lexicon expansion. We then used multidimensional analysis to group these markers into stance dimensions, which we show to be internally coherent and extrinsically useful. Our hope is that these stance dimensions will be valuable as a convenient building block for future research on interactional meaning.