Condolences and Empathy in Online Communities

Offering condolence is a natural reaction to hearing of someone's distress. Individuals frequently express distress on social media, where some communities can provide support. However, not all condolence is equal: trite responses offer little actual support despite their good intentions. Here, we develop computational tools to create a massive dataset of 11.4M expressions of distress and 2.8M corresponding offerings of condolence in order to examine the dynamics of condolence online. Our study reveals widespread disparity in what types of distress receive supportive condolence rather than just engagement. Building on studies from social psychology, we analyze the language of condolence and develop a new dataset for quantifying the empathy in a condolence using appraisal theory. Finally, we demonstrate that the features of condolence individuals find most helpful online differ substantially from those seen in interpersonal settings.


Introduction
Millions of individuals experience emotional distress each year from diverse circumstances such as personal loss or abuse. After such experiences, people often turn to their social circle on social media to convey their experiences and seek out emotional support (Brubaker et al., 2012; Brubaker and Hayes, 2011; De Choudhury and Kiciman, 2017). Often, support comes in the form of condolence, where individuals connect with the distressed person and express forms of sympathy, empathy, advice, and social connection, among others (Burleson, 2003). However, not all expressions of distress receive emotional support, nor do all condolence messages offer equal levels of support (Davidowitz and Myrick, 1984). Given the widespread use of social media for seeking social support, what makes for an effective supportive message? Here, we perform the first major study of condolence in social media, examining what types of distress individuals seek support for, what linguistic factors are more likely to elicit condolence, and what types of condolence are viewed as more helpful.
Distress and emotional support have long been explored in social psychology and counseling (Burleson et al., 2009; Rack et al., 2008), frequently around bereavement and helping victims of abuse. NLP work has only recently examined emotional support in online spaces for mental and physical health (Biyani et al., 2014; Navindgi et al., 2016; Wang et al., 2015) and in communities oriented around goals like weight loss (Manikonda et al., 2014); however, these studies focus on the general concept of supportiveness. In this work, we examine distress as a universal phenomenon (not just related to health and death) and examine the strategies and helpfulness of responses to this distress.
This study aims to computationally identify mechanisms and strategies for delivering effective and impactful condolence on social media. Conveying condolence is difficult for many people (Cameron et al., 2019), who fall back on common responses to distress such as "thoughts and prayers" or "I'm so sorry for your loss" due to the emotional and mental effort required to relate to the distressed person. To identify effective strategies of condolence, we construct a dataset of 11.4M expressions of distress from Reddit by developing computational models for recognizing distress and condolence. We then use this dataset to analyze how the community embraces the individual and which condolence responses were found helpful.
This work offers the following three contributions. First, we introduce a new massive dataset of 11.4M public expressions of distress and 2.8M expressions of condolence, labeled using two deep learning models for identifying each, and show that our data mirrors known trends in seasonality and theme. Second, using an analysis of the 11.4M expressions of distress, we demonstrate that the community selectively engages in condolence; not all distress messages that attract attention actually receive support. Third, we introduce a new dataset and model for identifying empathy in condolences and, using the empathy estimates, find that distressed individuals less frequently offer gratitude for deeply empathetic condolences and instead prefer compassionate, positive messages, which runs counter to observations from in-person settings.

Recognizing Distress and Condolence
Distress and condolence are expressed in a variety of ways. As no standard dataset exists for detecting these constructs, we first create one for training models using distant supervision to heuristically label data. Then, two classifiers are trained to recognize each in expressions on social media and finally fine-tuned to attain high precision. For both, we use Reddit comments as our base data. Additional details for classification and training are reported in supplemental section A.
Recognizing Distress A set of stereotypical condolence expressions, e.g., "sorry for your loss" or "my heart goes out to you," is first manually identified. Due to their ubiquitous use in the face of distress, such expressions act as heuristics to identify posts containing a variety of circumstances and topics. All Reddit comments from the year 2017 that receive at least one of these stereotypical-condolence replies are treated as positive examples of distress. An equivalent number of randomly selected comments that do not receive any of these stereotypical-condolence replies are sampled from the same communities in the same month, which ensures the corpus is topically and temporally balanced. In total, 229,204 comments are collected as training data.
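The distant-supervision heuristic above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the record keys (`id`, `parent_id`, `subreddit`, `month`, `body`) and the function names are hypothetical, and only the two phrases quoted in the text are used as seeds.

```python
import random
from collections import defaultdict

# Two stereotypical phrases quoted in the text; the real seed list is longer.
SEED_PHRASES = ("sorry for your loss", "my heart goes out to you")


def has_condolence(text):
    """True if the text contains one of the seed condolence phrases."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in SEED_PHRASES)


def build_training_set(comments, seed=0):
    """Label parents of stereotypical-condolence replies as distress (1) and
    draw one matched negative (0) per positive from the same subreddit/month.

    `comments` is a list of dicts with hypothetical keys:
    id, parent_id, subreddit, month, body.
    """
    by_id = {c["id"]: c for c in comments}
    positive_ids = sorted(
        {c["parent_id"] for c in comments
         if c["parent_id"] in by_id and has_condolence(c["body"])}
    )
    # Candidate negatives: comments that received no seed-phrase reply.
    pool = defaultdict(list)
    for c in comments:
        if c["id"] not in positive_ids:
            pool[(c["subreddit"], c["month"])].append(c)

    rng = random.Random(seed)
    labeled = []
    for pid in positive_ids:
        positive = by_id[pid]
        labeled.append((positive, 1))
        candidates = pool[(positive["subreddit"], positive["month"])]
        if candidates:  # sample without replacement within the stratum
            labeled.append((candidates.pop(rng.randrange(len(candidates))), 0))
    return labeled
```

Sampling negatives within the same (subreddit, month) stratum is what keeps the resulting corpus topically and temporally balanced.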
Two classifiers are trained from this balanced dataset. The first is an SVM classifier using unigrams and bigrams, which is known to be a robust baseline (Wang and Manning, 2012); the second is a fine-tuned BERT model (supplemental section A.2). SVM and BERT-based classifiers were also tested to recognize condolence in comments, using the same setup as those for recognizing distress. Performance at recognizing condolence, shown in Table 1 (bottom), was even higher than that for recognizing distress. Since condolence expressions share relatively common strategies (e.g., expressing sympathy with phrases like "I'm so sorry for your loss"), we suspect these condolence comments are easier to recognize.
Tuning for Precision Decision thresholds were set at 0.9 for both classifiers to focus on precision after a manual review of a subset of classifications found this to produce sufficiently correct results.
Dataset Description Our final condolence and distress datasets were collected by running the respective classifiers on a random sample of 2018 Reddit comments made in the top ten thousand most popular safe-for-work subreddits. Condolence comments have a length centered around a median of 21 words, with a long right tail (mean of 47.7 words, standard deviation of 79.8 words). Distress comments have a similarly shaped distribution, with a median of 25 words, mean of 41.3 words, and standard deviation of 57.8 words.

Condolence Behavior in Social Media
As an initial demonstration of the model, we label a random sample of Reddit comments from 2018 made in the top ten thousand most popular safe-for-work subreddits and examine where and when distress and condolence are exhibited.
Distress and condolence communities Figure 1 (left) shows that while health topics are prominent, individuals frequently seek out communities based around bereavement (e.g., r/Miscarriage) and abuse (e.g., r/domesticviolence). This result confirms that our model is able to identify a diverse set of circumstances in which individuals experience distress, mirroring some of those highlighted in prior work for online support of distress (Krysinska and Andriessen, 2013; Huh et al., 2014; Döveling, 2017). Surprisingly, the location of condolence behavior (Figure 1, right) does not mirror that of distress. Instead, condolence is frequently offered to those suffering from the loss of a pet and, less frequently, those experiencing the death of a loved one. Many people find the death of a pet more relatable compared with other circumstances like domestic violence, lessening the effort required to relate to the person experiencing the loss and offer condolence (Lim and DeSteno, 2016). Indeed, to express effective condolence, an empathetic response requires effort to relate on a personal level to the feelings of the affected person (Cameron et al., 2019), which many may find more challenging emotionally in circumstances like abuse.
Seasonal effects in distress Changes in seasons and holidays are both known to increase distress and anxiety levels (Cattell, 1955; Rosenthal et al., 1984; Harmatz et al., 2000). As Figure 2 shows, expressions of distress in Reddit mirror these trends with a substantial increase around commonly celebrated holidays. There are spikes around Valentine's Day (February) and increases leading to Thanksgiving (November), Christmas and New Year's (December). Surprisingly, the rate of a community's support of these individuals, expressed through condolence, largely does not mirror this trend. Instead, we observe that spikes in condolence were associated with significant events, including school shootings and celebrity deaths; these self-contained events triggered mass outpourings of condolence.
Figure 2: Relative rates for distress (right axis) and condolence (left axis) show that while distress mirrors expected seasonal trends, condolence does not; instead, condolence trends are partially driven by response to events, e.g., mass shootings. Throughout the paper, error bars show 95% confidence intervals.

What Distress Receives Condolence?
As individuals turn to social platforms for emotional support for a variety of reasons, which types of distress messages receive condolence? We contrast whether an expression of distress receives condolence with receiving any reply.
Methods To understand what factors lead to a distress message receiving a reply or a condolence, we fit separate mixed-effects logistic regression models on the dependent variable of receipt of the respective type. To capture thematic trends across messages, we train a 20-topic LDA model and manually label each topic with its prominent theme (topics are shown in supplemental section C). Offering a condolence can require empathetic alignment with another person (Trobst et al., 1994; Cameron et al., 2019), which could be difficult for certain emotions; therefore, we include estimates of the emotions expressed in a distress message using the NRC-emotion lexicon (Mohammad and Turney, 2013). Pronouns reflect the narrative focus of the distress, e.g., frequent mentions of "I" center the content on the distressed person, whereas "he" focuses on what was done to the distressed person; therefore, we include counts of how many times first-, second-, and third-person pronouns appear using LIWC categories (Pennebaker et al., 2001). Individuals on Reddit are known to be sensitive to the perceived gender of the author when providing support (Wang and Jurgens, 2018), so we include a variable for the user's estimated gender using genderperformr. As controls, we include comment length in space-delimited words, the comment age in hours after the post was created, the depth of the comment, the score of the post as a measure of popularity, and temporal factors for hour of day, day of week, and month. To control for differences within specific subreddits and posts, we include nested random effects for subreddit and the post in which the distress comment is made; for computational tractability, we include random effects only for posts with 30 or more distress comments. The Reddit-based models were fit using a random sample of 1M comments from the 2018 data identified as distress expressions.
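One way to write down the model described above is the following sketch. The notation is our own assumption rather than taken from the paper: $y_i$ indicates whether distress comment $i$ receives a reply (or a condolence), $\mathbf{x}_i$ collects the fixed-effect features (topics, emotions, pronoun counts, estimated gender, and controls), $s(i)$ is the subreddit of comment $i$, and $p(i)$ is its post, nested within that subreddit.

```latex
\operatorname{logit}\,\Pr(y_i = 1)
  = \beta_0 + \mathbf{x}_i^{\top}\boldsymbol{\beta} + u_{s(i)} + v_{p(i)},
\qquad
u_s \sim \mathcal{N}(0, \sigma_u^2),
\quad
v_p \sim \mathcal{N}(0, \sigma_v^2)
```

Under this reading, the random intercepts $u_s$ and $v_p$ absorb baseline differences in reply and condolence rates across subreddits and posts, so the coefficients $\boldsymbol{\beta}$ describe effects of message content relative to those baselines.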

Results
The factors affecting whether a distress comment receives a reply differed substantially from those receiving condolence. Whereas distress comments relating to politics, dieting, or sports are likely to receive a reply, such comments are far less likely to receive condolence. Differences in topical effects show that while the Reddit community is likely to engage with distress in all topics, the community selectively supports only a few of these. While the model for receiving a reply is similar to De Choudhury and De (2014,   as not all replies are actually supportive.

The Structure of Condolence
Individuals regularly employ a common set of strategies in condolence (e.g., Davidowitz and Myrick, 1984; Lehman et al., 1986; Burleson, 2003), from trope-like expressions ("sorry for your loss") to thoughtful and empathetic statements that validate the other's experience. These statements often fall along a spectrum of person-centeredness (High and Dillard, 2012) with respect to their acknowledgment, understanding, and legitimization of the distressed person's state. Here, we analyze the structure of Reddit condolences to examine regularities in the strategies individuals employ in crafting their responses. We take a data-driven approach, fitting a 20-topic LDA model to identify broad themes; to test for structure, we measure the probability of each topic in the sequence of sentences for condolences of different lengths.
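The positional analysis can be illustrated with a small sketch. It assumes each condolence has already been split into sentences and each sentence has been assigned a topic distribution (for instance, by running inference with the fitted LDA model); that input format and the function name are assumptions, not details from the paper.

```python
from collections import defaultdict


def topic_by_position(condolences, num_topics):
    """Average topic probability at each sentence position, grouped by length.

    `condolences` is a list of condolences; each condolence is a list of
    per-sentence topic distributions (lists of `num_topics` floats).
    Returns {length: [[mean prob of topic t at position i, ...], ...]}.
    """
    sums = {}
    counts = defaultdict(int)
    for sentences in condolences:
        n = len(sentences)
        if n == 0:
            continue
        if n not in sums:
            sums[n] = [[0.0] * num_topics for _ in range(n)]
        counts[n] += 1
        for pos, dist in enumerate(sentences):
            for topic, prob in enumerate(dist):
                sums[n][pos][topic] += prob
    return {
        n: [[value / counts[n] for value in row] for row in rows]
        for n, rows in sums.items()
    }
```

Grouping by the number of sentences before averaging is what lets short, medium, and long condolences be compared position by position.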
Results Condolences follow regular patterns in their strategies for support. Figure 3 shows the presence of different topics by position in the sentence across condolences of different lengths; the most probable words for each topic are listed in Table 3. Three notable trends occur, showing increasing focus on the person experiencing distress. First, sympathy features prominently in shorter condolences, which focus largely on acknowledging the person's suffering as a result of the distress. These comments serve as bookends to the overall statement, but largely disappear in longer condolences. The use of swearing in these contexts acts not only as an intensifier in expressing the speaker's perception of unpleasantness but also as a way of expressing solidarity through emphasizing in-group membership by transgressing social norms (Fägersten, 2012; Stapleton, 2010).
Second, as condolences become longer, individuals begin adding their own experience within the response (PERSONAL EXPERIENCE). This behavior features prominently in middle-length condolences that still begin with sympathy and then try to relate the condoler's own personal experience to that of the person suffering. At a high level, these experiences aim to help the person experiencing distress reframe their own mindset and correspond to a higher level of person-centeredness (Servaty-Seib and Burleson, 2007; High and Dillard, 2012).
Finally, the longest condolences contain significant amounts of advice and reframing, with less focus on the condolence giver. These condolences can correspond to even higher levels of person-centeredness by trying to engage with the other's experience through advice.

Empathy in Condolence
At a high level, empathy requires a person to imagine the experience of another as they felt it, that is, to put themselves in the other's shoes. In condolence, empathy provides a powerful, person-centered framing for validating and connecting with those in distress. Distressed individuals have found empathetic condolences more supportive than sympathetic messages (Davidowitz and Myrick, 1984).
Empathy itself has many varying definitions in social psychology (Basch, 1983; Cuff et al., 2016), and the limited computational work employing empathy has largely focused only on mirroring emotional state as a way of empathizing (Collins, 2014; Litvak et al., 2016; Fung et al., 2016; Khanpour et al., 2017). More recently, Abdul-Mageed et al. (2017) and Buechel et al. (2018) have gone beyond these simple models to develop and use a corpus for distress and empathy in reactions to news stories. These works adopt a broader definition drawn from multiple sources, which mixes empathy with the related concepts of compassion, altruism, and prosocial behavior (Batson et al., 1987; Sober and Wilson, 1999; Goetz et al., 2010; Mikulincer and Shaver, 2010). In this work, we adopt a stricter definition of empathy based on appraisal theory (Lamm et al., 2007; Wondra and Ellsworth, 2015). Here, empathy occurs when an observer appraises a person's situation in the same way as the person experiencing the distress. This definition more closely mirrors the person-centeredness of the response in terms of how the observer acknowledges and validates different aspects of the distressed person's mental state. Following this definition, we create a new corpus around appraisal-based empathy and develop a classifier that can be used to label condolences for their empathy.
Figure 3: Presence of topics by position across condolences of different lengths (topics are listed in Table 3). Shorter condolences focus on expressing sympathy, middle-length condolences include more personal experience, and longer condolences offer substantial amounts of advice.

Data and Annotation
lengths followed a log-log-normal distribution, and shorter condolences tended to be trite or repetitive, e.g., "so sorry to hear." To introduce diversity in the annotated condolence data, we binned comment pairs by condolence length using Jenks optimization, then reweighted the probability of sampling from each bin to flatten the distribution of lengths. Two annotators identified a set of 1000 distress comments with a self-contained message, without being shown the condolence to avoid bias. Individuals may express their distress over multiple comments in a discussion thread, so this process was aimed at reducing the prior context needed to estimate appraisal to a single distress comment.
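The reweighting step can be sketched as below. This is an illustrative reading, not the authors' code: the Jenks break points are assumed to be computed elsewhere and passed in as `bin_edges`, and the weighted draw without replacement uses the Efraimidis-Spirakis key trick, which is our choice rather than something stated in the text.

```python
import bisect
import random
from collections import Counter


def bin_index(length, bin_edges):
    """Bin number for a length; lengths below the first edge fall in bin 0."""
    return bisect.bisect_right(bin_edges, length)


def flattening_weights(lengths, bin_edges):
    """Per-item sampling weights that give every non-empty bin equal mass."""
    bins = [bin_index(n, bin_edges) for n in lengths]
    sizes = Counter(bins)
    return [1.0 / (len(sizes) * sizes[b]) for b in bins]


def weighted_sample(weights, k, seed=0):
    """Draw k distinct indices with probability increasing in weight."""
    rng = random.Random(seed)
    # Each item gets key u ** (1 / w); the k largest keys form the sample.
    keys = [(rng.random() ** (1.0 / w), i) for i, w in enumerate(weights)]
    return [i for _, i in sorted(keys, reverse=True)[:k]]
```

With these weights, a bin holding many short condolences contributes the same total probability mass as a bin holding a few long ones, which flattens the length distribution of the annotated sample.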
Annotators were shown a condolence reply to a comment and asked to rate, on a five-point Likert scale, the degree to which the observer appraised the other person's situation in the same way along the following dimensions: (1) pleasantness, (2) anticipated effort in dealing with the situation, (3) situational control, (4) how much oneself or another person was responsible for the situation, (5) attentional activity, and (6) certainty about what was happening in the situation or what would happen next. High-scoring comments acknowledge and validate the distressed person's experience.
Prior to annotating the full dataset, annotators collaboratively developed guidelines, completed five rounds of training on 100 items of held-out data per round, and discussed each case of disagreement. Annotators attained Krippendorff's α ≈ 0.6 for the final two rounds. Following training and adjudication, the final 1,000 condolence replies were annotated. After an initial pass, Krippendorff's α was 0.359. While this initial value seems low, α is strongly affected by the large class skew from most condolences not being empathetic (score 1). A second pass was made across the 25 comment pairs where annotators disagreed by 3 or more points; annotators discussed their disagreements and updated their individual ratings, after which α=0.431. These disagreements were largely due to unintentional mistakes or misinterpretations, rather than substantive disagreements on empathy. In the final dataset, annotators differed by at most one scale point on 91.2% of the items (Pearson r=0.58). While the agreement value is moderate, it matches similar agreement levels seen when annotation requires inferring mental states and intentions from text (e.g., Card et al., 2015; Rashkin et al., 2016; Rashid and Blanco, 2017; Breitfeller et al., 2019). The difficulty of annotation stems from interpreting the intentions, appraisals, and alignment between the distress comment and the observer's comment. Further, the choice to diversify the data by sampling across longer replies likely depressed agreement, as shorter replies are often low-empathy (e.g., trite messages), which annotators readily agreed on. The final empathy rating is the mean of the two annotations.
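Two of the simpler quantities reported above, the share of items on which annotators are within one scale point and the Pearson correlation, along with the final averaged rating, can be computed as in this sketch. The function names are our own, and Krippendorff's α is deliberately left out because it needs a dedicated implementation.

```python
from math import sqrt


def within_one_point(a, b):
    """Share of items where the two ratings differ by at most one point."""
    return sum(abs(x - y) <= 1 for x, y in zip(a, b)) / len(a)


def pearson_r(a, b):
    """Pearson correlation of two equal-length rating lists.

    Assumes neither list is constant (otherwise the variance is zero).
    """
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def final_ratings(a, b):
    """Final empathy rating per item: the mean of the two annotations."""
    return [(x + y) / 2 for x, y in zip(a, b)]
```

Because the final rating is a mean of two integer scores, it can take half-point values such as 2.5, which matters later when ratings are rounded for stratification.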
Recognizing Empathy Two types of regression models were trained for predicting the empathy rating of a condolence using our dataset, each using either both the target's and observer's texts or just the observer's text. The first type of model is a random forest regressor trained on unigrams and bigrams of the target's and observer's comments, using separate feature spaces for each. The second type of model uses RoBERTa (Liu et al., 2019) as a base, starting from the pretrained roberta-base parameters. When using the target and observer text as inputs, the texts are separated by the [SEP] token. The [CLS] representations of each input were concatenated and passed through a fully-connected linear layer, using sigmoid activation to bound the output value in [1,5]. Due to the empathy rating imbalance in the data, we construct randomized stratified partitions for training (80%), validation (10%), and test (10%) using the rounded value of the empathy rating. Models are compared against a baseline that predicts the mean empathy rating.
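Two details of this setup lend themselves to a short sketch: bounding a raw model output to [1, 5] with a sigmoid, and building stratified 80/10/10 partitions on the rounded rating. Both are illustrations under assumptions: the affine rescaling 1 + 4·σ(z) is one natural way to bound the output and is not confirmed by the text, and ratings are rounded half up here because the rounding rule is not specified.

```python
import math
import random
from collections import defaultdict


def bounded_rating(logit, low=1.0, high=5.0):
    """Map an unbounded score into [low, high] with a sigmoid (assumed form)."""
    if logit >= 0:
        s = 1.0 / (1.0 + math.exp(-logit))
    else:  # numerically stable branch for large negative inputs
        e = math.exp(logit)
        s = e / (1.0 + e)
    return low + (high - low) * s


def stratified_split(ratings, seed=0):
    """Return (train, val, test) index lists, split 80/10/10 within each
    stratum defined by the rating rounded half up (an assumption)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for idx, rating in enumerate(ratings):
        strata[math.floor(rating + 0.5)].append(idx)
    train, val, test = [], [], []
    for key in sorted(strata):
        members = strata[key]
        rng.shuffle(members)
        cut_a = int(0.8 * len(members))
        cut_b = int(0.9 * len(members))
        train += members[:cut_a]
        val += members[cut_a:cut_b]
        test += members[cut_b:]
    return train, val, test
```

Splitting inside each stratum keeps the rare high-empathy ratings represented in all three partitions rather than leaving them to chance.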
Both models surpassed the baseline of predicting the mean value from the training data, as seen in Table 4, with the RoBERTa models performing best (see footnote 2). For both the RoBERTa and Random Forest models, knowledge of the target's comments improved performance, suggesting that models benefit from being able to align the two inputs in determining empathy. Nonetheless, performance of the best model is moderate at best, and we view these results as a preliminary step toward identifying appraisal-based empathy in text.
As a follow-up analysis, we used the Target & Observer model to rate unlabeled condolence replies and manually examined a random sample of 100 responses rated with empathy ≥ 2, which signals more than the minimal empathetic alignment. Of these replies, 84% contained at least two empathetic alignments (e.g., aligning with the target's perception of pleasantness and situational control), suggesting the model is effective at recognizing empathetic speech and that any misclassifications are more likely to be underestimates of empathy.
As a further comparison, we computed the empathy scores for the model of Buechel et al. (2018) on our data; the two scores had a Pearson r=0.343, indicating that, while related, both are capturing substantially different notions of empathy.
Footnote 2: Additional RoBERTa models were trained using a language model that had first been fine-tuned using masked language modeling on the distress and condolence comments for 10 epochs; however, these models resulted in slightly worse performance: the Observer-only had MSE=0.561

Model and Features
Condolence effectiveness is modeled using a nested-effects logistic regression with the dependent variable of whether the condolence was responded to with gratitude. Random effects are added for the subreddit with a nested effect for condolences made to posts receiving 30 or more replies; posts receiving fewer are modeled with a common nested effect. Note that these random effects control for relative differences in the level of gratitude and behavioral norms in each subreddit, allowing more accurate estimates of which content features contribute to effective condolences. Three groups of regression features were selected: two from theory for known helpful and unhelpful strategies, with an additional group of data-driven controls, all described next.

Helpful Strategy Features
In the first group, we include the macro-empathy estimates of Buechel et al. (2018) and our appraisal-based empathy estimate of the comment, as person-centered empathetic responses are known to be more helpful in clinical therapy (Nienhuis et al., 2018). As a third test, we include uses of first-, second-, and third-person pronominal referents from LIWC (Pennebaker et al., 2001). Increased use of each pronoun category reflects narrative focus on the condoler, distressed person, or the situation being described, respectively; in particular, mentions of the distressed person are more aligned with a person-centered message. Fourth, individuals will mirror the language as a way of decreasing social distance which can increase trust (Scissors et al., 2008); Wang et al. (2015) found that lexical alignment is associated with increased emotional support. Therefore, we include a feature for lexical alignment as the % of the condolence's words that were also used in the distress comment.
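The lexical alignment feature can be sketched as follows. The tokenizer is a simple stand-in of our own, and counting every condolence token (rather than unique word types) is an assumption; the text only specifies the percentage of the condolence's words that also appear in the distress comment.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a simple stand-in for the paper's tokenizer."""
    return re.findall(r"[a-z0-9']+", text.lower())


def lexical_alignment(condolence, distress):
    """Percentage of condolence tokens that also occur in the distress text."""
    condolence_tokens = tokenize(condolence)
    if not condolence_tokens:
        return 0.0
    distress_vocab = set(tokenize(distress))
    shared = sum(token in distress_vocab for token in condolence_tokens)
    return 100.0 * shared / len(condolence_tokens)
```

For example, a condolence that reuses the distressed person's own words for their loss would score higher than a generic reply of the same length.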
Unhelpful Strategy Features Some well-intentioned responses may include strategies that are unhelpful in practice. Lehman et al. (1986) note that forced positivity in the face of distress is often viewed poorly; therefore, to test this effect, we include a sentiment estimate of the condolence using VADER (Hutto and Gilbert, 2014). Similarly, minimizing phrases such as "it's not that bad" or "I'm sorry you feel sad" invalidate the experience and emotions of the distressed person (Lehman et al., 1986; Hogan et al., 1994); to test for these effects, we include the presence of a list of such phrases drawn from observational studies and matched using regular expressions. Third, we include a separate minimizing feature for trivializing "just" (Kiesling, 2011), e.g., "it's just an exam," which is modeled by identifying the presence of an adverbial use in the text.
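A rough sketch of the two phrase-based features follows. The pattern list contains only the two examples quoted above (the full list is not given in the text), and the check for "just" is a plain word match standing in for the adverbial-use detection described in the paper, which would need part-of-speech information; both simplifications are assumptions.

```python
import re

# Only the two example phrases from the text; the real list is longer.
MINIMIZING = re.compile(
    r"\b(?:it['’]?s not that bad|i['’]?m sorry you feel)\b",
    re.IGNORECASE,
)

# Crude stand-in: matches any use of "just", including non-adverbial ones
# such as "a just cause", which a part-of-speech tagger would exclude.
JUST = re.compile(r"\bjust\b", re.IGNORECASE)


def unhelpful_features(condolence):
    """Binary indicators for minimizing phrases and trivializing 'just'."""
    return {
        "minimizing_phrase": bool(MINIMIZING.search(condolence)),
        "trivializing_just": bool(JUST.search(condolence)),
    }
```

For instance, `unhelpful_features("It's just an exam")` would flag the trivializing feature but not the minimizing-phrase feature.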
Control Features As controls, we include (i) the topics of the condolence (Table 3), which act as coarse proxies of the strategy and content, (ii) the score of the comment containing the distress and the time between the distress comment and condolence reply (minutes), (iii) the length of the condolence, and (iv) temporal factors for the month, day of week, and hour of day. Finally, multiple studies have reported gender differences in strategies of support, with women typically offering more emotionally complex and empathetic condolences (Knight et al., 1998; Rack et al., 2008; Burleson et al., 2009); to test for this effect, we include the gender prediction from genderperformr.
Table 5: Coefficients for predicting whether a condolence will receive gratitude; for simplicity, coefficients for temporal controls and topics corresponding to experiential themes (e.g., sports) are omitted and provided in supplemental section D. Note: *p<0.1; **p<0.05; ***p<0.01.

Results
The linguistic factors associated with helpful condolences largely followed expectations from observational studies, with one significant exception. As predicted from observational studies, condolences with markers of person-centered responses were rated as more helpful, which included lexical alignment and narrative focus on the other person (second-person pronouns). In annotation, we observed that condolences shift between the "personal you" of the distressed person and use of the "generic you," which is known to be evoked in meaning making (Orvell et al., 2017, 2019); given the positive coefficient for second-person pronouns, future work may attempt to distinguish between these uses to test whether such meaning-making comments contribute to more effective condolence. Also as predicted, advice is strongly negatively associated with good condolence, despite being the most commonly used strategy (cf. Figure 3). Replies with the ADVICE 2 topic contained more references to third parties than ADVICE; some of these included popular supportive quotes, not actually condolence, or assessment of and advice for a third party outside of the interaction being modeled in this regression. Similarly, sympathy and invocations of religious language (which we found often contains minimizing tropes) are known to be found less helpful and have negative coefficients here as well. Last, our study confirms the expected disparity for men and women in condolence helpfulness. However, our results disagree with prior observations on empathy: we find that, while the compassion-like empathy of Buechel et al. (2018) is found helpful, condolences with the more person-centered appraisal-based empathy were less likely to receive gratitude. We speculate that people may turn to Reddit for lighter, less personal forms of support in times of distress, whereas the more compassion-like empathy of Buechel et al. (2018) is helpful when more personal responses are not licensed by the relative anonymity of the platform (see footnote 3). Our results also disagree with expectations around forced positivity (Lehman et al., 1986), as positive-sentiment replies are consistently more helpful. We interpret this result as pointing to a different goal of support among Reddit users, who seek out positive reinforcement rather than comments that require emotional effort to engage with complex emotions.
While we are only able to speculate on the negative impact of appraisal-based empathy, the effect could be due to different goals for the desired support received online, where individuals seek out information instead of empathy (Yao et al., 2015). Alternatively, here, we have modeled condolence helpfulness using a fixed set of phrases to identify thanks in replies; it could be that the more empathetic responses generate replies that, while not containing these thanks-expressions, still signal the condolence's positive utility. Our results motivate future work to understand online users' preferences for empathy in support: as millions of people already respond to distress with good intentions each year, improving these supportive efforts has the potential to better the lives of millions.
Footnote 3: As a follow-up analysis, we also tested whether a binary encoding of higher appraisal empathy (score ≥2) instead of a continuous marker would be found to be more helpful; after re-running the regressions, the appraisal-based empathy still had a negative coefficient.

Ethics
Distress is inherently personal and computational studies on such matters warrant ethical consideration. In weighing the risks and benefits of our studies, the largest risk has been the loss of privacy, as individuals expressing their distress may have contextual expectations of privacy or anonymity (Fiesler and Proferes, 2018). To mitigate this risk, we report only paraphrased examples and aggregate statistics. Further, we release this data only to researchers upon request and provided they follow similar privacy practices. As a counterbalance, this study has considerable benefit by providing better information on what makes for effective condolences; the insights from this study can be distilled into practical advice that can make for more supportive online communities.

Conclusion
Distress is an omnipresent part of life, and individuals turn to their social circle and social platforms for support when experiencing it. In this paper, we have developed new computational models for recognizing distress, condolences to that distress, and empathy within condolence. Applying those models, we examine the dynamics of distress and condolence, showing that not all distress is treated equally online, and there exist regular structures within condolence. Through analyzing millions of condolence responses, we test what makes for effective condolence online, showing that while some features predicted from observational studies hold true online, e.g., increasing person-centeredness of the message (High and Dillard, 2012), distressed individuals did not find empathetic comments more helpful, suggesting different goals from online support. Our results have important implications for (i) individuals by providing concrete suggestions of how to express one's distress to make it more likely to receive support, (ii) site operators by allowing them to observe the emotional health and responsiveness of their community, potentially reaching out to underserved individuals who have yet to receive support, and (iii) the general public for authoring more effective supportive messages. Models and reproducible code are available at https://blablablab.si.umich.edu/projects/condolence/ and data is made available upon request.

A.1 Dataset
Both the condolence and distress datasets are collected using the heuristic method detailed in the paper. An initial set of stereotypical "seed" condolence phrases is augmented by retrieving sibling comments of comments containing these condolence phrases and then performing an n-gram analysis to discover other common phrases. The final list of 21 phrases, shown in Table 6, was used to identify distress comments as described in the main paper. When training, the raw text was extracted from markdown, code blocks were removed, links were stripped, and only ASCII characters were kept. Newlines were replaced with a single space.
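The preprocessing steps listed above might look roughly like the sketch below. The regular expressions are our own approximations of Reddit markdown: backtick-delimited code is removed, the visible text of a markdown link is kept while its URL is dropped (an assumption about what "stripped" means), and indented code blocks are not handled.

```python
import re

FENCE = "`" * 3  # markdown code fence delimiter
FENCED_CODE = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)
INLINE_CODE = re.compile(r"`[^`\n]*`")
MARKDOWN_LINK = re.compile(r"\[([^\]]*)\]\([^)]*\)")
BARE_URL = re.compile(r"https?://\S+")


def clean_comment(markdown_text):
    """Approximate the cleaning described in the text for one comment."""
    text = FENCED_CODE.sub(" ", markdown_text)   # remove fenced code blocks
    text = INLINE_CODE.sub(" ", text)            # remove inline code spans
    text = MARKDOWN_LINK.sub(r"\1", text)        # keep link text, drop URL
    text = BARE_URL.sub("", text)                # drop bare URLs
    text = text.encode("ascii", "ignore").decode("ascii")  # ASCII only
    return text.replace("\n", " ")               # newline -> single space
```

Because only newlines are replaced, runs of spaces left behind by removed spans are preserved; a downstream tokenizer that splits on whitespace would treat them as a single separator.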

A.2 BERT Models
Both deep learning classifiers were fine-tuned on a pretrained BERT model with 12 heads and 110M parameters, trained on lower case English text (the HuggingFace bert-base-uncased model), and share the same architecture and training method.
For both models, we feed the 768-dimensional hidden output into a fully-connected layer with 2 outputs, which are then fed through a softmax activation function.
During training, dropout with probability 0.5 was added between the BERT output and the fully-connected layer. The fully-connected layer was initialized using Xavier initialization (Glorot and Bengio, 2010). The Adam optimizer was used to minimize cross-entropy loss, with an initial learning rate of 0.001 for the fully-connected layer and 0.00001 for the BERT parameters; both were decreased by a factor of 10 every three epochs. The training set was shuffled every epoch. The models were trained overnight with batches of 16 comments on a single NVIDIA GTX 1080 Ti.
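As a worked illustration of the learning-rate schedule, the sketch below computes the rate for each parameter group at a given epoch. It assumes that epochs are counted from zero and that the first decay happens at the start of the fourth epoch; the function name `step_decay` and the number of epochs printed are hypothetical, since the paper does not state how many epochs were run.

```python
def step_decay(base_lr: float, epoch: int, every: int = 3, factor: float = 10.0) -> float:
    """Learning rate at a zero-indexed epoch, divided by `factor` every `every` epochs."""
    return base_lr / (factor ** (epoch // every))


HEAD_LR = 1e-3  # initial rate for the fully-connected layer (from the text)
BERT_LR = 1e-5  # initial rate for the BERT parameters (from the text)

# Hypothetical 9-epoch run: each group keeps its rate for 3 epochs, then drops 10x.
for epoch in range(9):
    print(epoch, step_decay(HEAD_LR, epoch), step_decay(BERT_LR, epoch))
```

Under these assumptions the two groups always stay a factor of 100 apart, so the freshly initialized layer keeps adapting faster than the pretrained encoder throughout training.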

A.3 SVM Classifiers
SVM classifiers are trained as a baseline for comparison. The inputs are preprocessed in the same way (text extracted from markdown, links and code blocks stripped, and Unicode symbols removed). Again, both the condolence and distress classifiers were trained the same way with the same hyperparameters.
The same random seed is set as when training the deep learning models, so the training, validation, and test datasets are the same between the BERT and SVM classifiers. We trained the linear SVM on comments count-encoded with the 50,000 most common uni- and bigrams. Each classifier took a few minutes to train. Table 7 shows test and validation accuracies for all four models, and Table 8 shows test and validation F1 scores for all four models.

Table 6: The 21 condolence phrases used to identify distress comments.
"made me tear up"
"you dodged a bullet"
"take care of yourself"
"even begin to imagine"
"my heart goes out"
"not beat yourself up"
"please take care of"
"keep your head up"
"heart goes out to"
"can not even begin"
"do not blame yourself"
"hope you find peace"
"my thoughts and prayers"
"there are no words"
"this made me cry"
"remember the good times"
"my deepest condolences"
"can not imagine losing"
"can not even imagine"
"god bless you and"
"sorry for your loss"
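To make the count encoding concrete, the following sketch builds a vocabulary of the most frequent unigrams and bigrams and maps a comment to sparse counts over it. It is an illustration under stated assumptions rather than the authors' code: tokenization here is lowercase whitespace splitting, and the helper names (`uni_and_bigrams`, `build_vocab`, `count_encode`) are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def uni_and_bigrams(tokens: List[str]) -> List[str]:
    """Return all unigrams followed by all bigrams of a token list."""
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


def build_vocab(docs: Iterable[str], max_features: int = 50_000) -> Dict[str, int]:
    """Index the `max_features` most common uni- and bigrams in the training comments."""
    totals: Counter = Counter()
    for doc in docs:
        totals.update(uni_and_bigrams(doc.lower().split()))
    return {gram: idx for idx, (gram, _) in enumerate(totals.most_common(max_features))}


def count_encode(doc: str, vocab: Dict[str, int]) -> Dict[int, int]:
    """Sparse count vector: feature index -> number of occurrences in the comment."""
    counts = Counter(uni_and_bigrams(doc.lower().split()))
    return {vocab[gram]: n for gram, n in counts.items() if gram in vocab}
```

In practice a library vectorizer would typically be used instead, for example scikit-learn's `CountVectorizer` with `ngram_range=(1, 2)` and `max_features=50000`; the paper does not say which implementation was used.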

B.1 Dataset
The dataset was collected as detailed in the paper, then cleaned by stripping markdown, links, and images.

B.2 Random Forest Regressor
A random forest regressor is trained to predict empathy (as an average of the two annotator scores) given unigram and bigram features of either (i) only the Observer's condolence reply as input or (ii) the Target's comment and Observer's reply. When both the Observer's and Target's texts are used, separate features are used to record the presence of unigrams and bigrams in each. The random forest has 100 estimators and otherwise uses the default parameters from Scikit-learn.
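The separate feature spaces for the two texts can be sketched as below. This is a hedged illustration: the `obs::` and `tgt::` prefixes, the whitespace tokenization, and the function names are assumptions chosen for clarity, not details taken from the paper.

```python
from typing import Dict, List, Optional


def text_ngrams(text: str) -> List[str]:
    """Lowercased unigrams and bigrams of a text (whitespace tokenization assumed)."""
    tokens = text.lower().split()
    return tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def presence_features(observer_reply: str,
                      target_comment: Optional[str] = None) -> Dict[str, int]:
    """Binary n-gram presence features, namespaced by which text they came from."""
    feats = {f"obs::{gram}": 1 for gram in text_ngrams(observer_reply)}
    if target_comment is not None:
        # Setting (ii): the Target's n-grams get their own, non-overlapping features.
        feats.update({f"tgt::{gram}": 1 for gram in text_ngrams(target_comment)})
    return feats


def empathy_label(score_a: float, score_b: float) -> float:
    """Regression target: the average of the two annotator scores."""
    return (score_a + score_b) / 2
```

Namespacing the features this way lets the regressor treat a phrase in the reply differently from the same phrase in the original distress comment.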

B.3 Deep Learning Model
Two RoBERTa (Liu et al., 2019) models were trained on the same dataset as the random forest model, initialized from the roberta-base parameters. Models were trained providing either (i) only the Observer's condolence reply as input or (ii) the Target's comment and Observer's reply. In the latter case, the two texts are separated by the [SEP] token. In both cases, classification is done using the [CLS] token. Both RoBERTa models were implemented using the simpletransformers package with its default hyperparameters, including learning rate 4e-5, batch size 8, and Adam ε = 1e-8. Each model was trained for 20 epochs on a single NVIDIA GTX 1080 Ti graphics card, which took about 30 minutes. No hyperparameter tuning was performed and performance is reported over a single run using a fixed seed.
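The two input settings and the reported hyperparameters can be summarized in a short sketch. The dictionary keys follow simpletransformers' argument names as we understand them, and the values are those stated above. The helper `make_input`, the choice to place the Target's comment first, and the literal separator string are assumptions; in practice the tokenizer inserts its own separator token when given a text pair.

```python
from typing import Optional

# Hyperparameters reported in the text, keyed by simpletransformers argument names.
ROBERTA_ARGS = {
    "learning_rate": 4e-5,
    "train_batch_size": 8,
    "adam_epsilon": 1e-8,
    "num_train_epochs": 20,
}


def make_input(observer_reply: str,
               target_comment: Optional[str] = None,
               sep: str = "[SEP]") -> str:
    """Build the model input for setting (i) reply only, or (ii) comment plus reply."""
    if target_comment is None:
        return observer_reply
    # Assumption: Target's comment first, then the separator, then the reply.
    return f"{target_comment} {sep} {observer_reply}"
```

For example, `make_input("I'm so sorry.", "My dog died.")` produces a single string with the separator between the two texts, while `make_input("I'm so sorry.")` returns the reply unchanged.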

C Topic Modeling
For both distress and condolence comments, we trained LDA topic models with MALLET, using its default hyperparameters for all options and 20 topics to reflect high-level themes in the data. To preprocess, we stripped markdown, images, and links. We show the top 20 words associated with each topic, as well as the topic label we assigned, in Table 9 for distress topics and Table 10 for condolence topics.

D Regression Experiments
We run mixed effects regressions for several experiments: predicting whether a distress comment receives any response, predicting whether a distress comment receives a condolence response, and predicting whether a condolence comment receives an appreciative response from the distressed individual. In measuring helpful condolences, the expressions described in Table 11 were used to recognize minimizing condolences.