A Computational Approach to Understanding Empathy Expressed in Text-Based Mental Health Support

Empathy is critical to successful mental health support. Empathy measurement has predominantly occurred in synchronous, face-to-face settings, and may not translate to asynchronous, text-based contexts. Because millions of people use text-based platforms for mental health support, understanding empathy in these contexts is crucial. In this work, we present a computational approach to understanding how empathy is expressed in online mental health platforms. We develop a novel unifying theoretically-grounded framework for characterizing the communication of empathy in text-based conversations. We collect and share a corpus of 10k (post, response) pairs annotated using this empathy framework with supporting evidence for annotations (rationales). We develop a multi-task RoBERTa-based bi-encoder model for identifying empathy in conversations and extracting rationales underlying its predictions. Experiments demonstrate that our approach can effectively identify empathic conversations. We further apply this model to analyze 235k mental health interactions and show that users do not self-learn empathy over time, revealing opportunities for empathy training and feedback.


Introduction
Approximately 20% of people worldwide suffer from a mental health disorder (Holmes et al., 2018). Still, access to mental health care remains a global challenge, with widespread workforce shortages (Olfson, 2016). Facing limited in-person treatment options and other barriers such as stigma (White and Dorman, 2001), millions of people turn to online peer-to-peer support platforms such as TalkLife (talklife.co) to express emotions, share stigmatized experiences, and receive peer support (Eysenbach et al., 2004). However, while peer supporters on these platforms are motivated and well-intentioned to help others seeking support (henceforth, seekers), they are untrained and typically unaware of best practices in therapy.

Figure 1: Our framework of empathic conversations contains three empathy communication mechanisms: Emotional Reactions, Interpretations, and Explorations. We differentiate between no communication, weak communication, and strong communication of these factors. Our computational approach simultaneously identifies these mechanisms and the underlying rationale phrases (highlighted portions). All examples in this paper have been anonymized using best practices in privacy and security (Matthews et al., 2017).
In therapy, interacting empathically with seekers is fundamental to success (Bohart et al., 2002; Elliott et al., 2018). The lack of training and feedback for layperson peer supporters results in missed opportunities to offer empathic textual responses. NLP systems that understand conversational empathy could empower peer supporters with feedback and training. However, the current understanding of empathy is limited to traditional face-to-face, speech-based therapy (Gibson et al., 2016; Pérez-Rosas et al., 2017), due to a lack of resources and methods for the newer asynchronous, text-based interactions (Patel et al., 2019). Also, while previous NLP research has focused predominantly on empathy as reacting with emotions of warmth and compassion (Buechel et al., 2018), a separate but key aspect of empathy is communicating a cognitive understanding of others (Selman, 1980).
In this work, we present a novel computational approach to understanding how empathy is expressed in text-based, asynchronous mental health conversations. We introduce EPITOME, a conceptual framework for characterizing the communication of empathy in conversations that synthesizes and adapts the most prominent empathy scales from speech-based, face-to-face contexts to text-based, asynchronous contexts (§3). EPITOME consists of three communication mechanisms of empathy: Emotional Reactions, Interpretations, and Explorations (Fig. 1).
To facilitate computational modeling of empathy in text, we create a new corpus based on EPITOME. We collect annotations on a dataset of 10k (post, response) pairs from extensively trained crowdworkers with high inter-rater reliability (§4). We develop a RoBERTa-based bi-encoder model for identifying empathy communication mechanisms in conversations (§5). Our multi-task model simultaneously extracts the supporting evidence, or rationales (DeYoung et al., 2020), for its predictions (spans of the input post; e.g., the highlighted portions in Fig. 1), which serve the dual role of (1) explaining the model's decisions, thus minimizing the risk of deploying harmful technologies in sensitive contexts, and (2) enabling rationale-augmented feedback for peer supporters.
We show that our computational approach can effectively identify empathic conversations with underlying rationales (∼80% acc., ∼70% macro-f1) and outperforms popular NLP baselines by a 4-point gain in macro-f1 (§6). We apply our model to a dataset of 235k supportive conversations on TalkLife and demonstrate that empathy is associated with positive feedback from seekers and the forming of relationships. Importantly, our results suggest that most peer supporters do not self-learn empathy with time. This points to critical opportunities for training and feedback for peer supporters to increase the effectiveness of mental health support (Miner et al., 2019; Imel et al., 2015). Specifically, NLP-based tools could give actionable, real-time feedback to improve expressed empathy, and we demonstrate this idea in a small-scale proof-of-concept (§7).

Background
How to measure empathy?
Empathy is a complex, multi-dimensional construct with two broad aspects related to emotion and cognition (Davis et al., 1980). The emotion aspect relates to the emotional stimulation in reaction to the experiences and feelings expressed by a user. The cognition aspect is a more deliberate process: understanding and interpreting the experiences and feelings of the user and communicating that understanding to them (Elliott et al., 2018).
Here, we study expressed empathy in text-based mental health support: empathy expressed or communicated by peer supporters in their textual interactions with seekers (cf. Barrett-Lennard (1981)). Table 1 lists existing empathy scales in psychology and psychotherapy research. Truax and Carkhuff (1967) focus only on communicating a cognitive understanding of others, while Davis et al. (1980) and Watson et al. (2002) also make use of expressing stimulated emotions.
These scales, however, have been designed for in-person interactions and face-to-face therapy, often leveraging audio-visual signals like expressive voice.In contrast, in text-based support, empathy must be expressed using textual response alone.Also, they are designed to operate on long, synchronous conversations and are unsuited for the shorter, asynchronous conversations of our context.
In this work, we adapt these scales to text-based, asynchronous support. We develop a new comprehensive framework for text-based, asynchronous conversations (Table 1; §3), use it to create a new dataset of empathic conversations (§4), build a computational approach for identifying empathy (§5; §6), and gain insights into mental health platforms (§7).

Computational Approaches for Empathy
Computational research on empathy has centered on speech-based settings, exploiting audio signals such as pitch that are unavailable on text-based platforms (Gibson et al., 2016; Pérez-Rosas et al., 2017). Moreover, previous NLP research has predominantly treated empathy as reacting with emotions of warmth and compassion (Buechel et al., 2018). For mental health support, however, communicating a cognitive understanding of the feelings and experiences of others is more valued (Selman, 1980). Recent work also suggests that grounding conversations in emotions implicitly makes them empathic (Rashkin et al., 2019). Research in therapy, however, highlights the importance of explicitly expressing empathy in interactions (Truax and Carkhuff, 1967). In this work, we present a computational approach that (1) understands empathy expressed in textual, asynchronous conversations and (2) addresses both the emotional and cognitive aspects of empathy.

Framework of Expressed Empathy
To understand empathy in text-based, asynchronous, peer-to-peer support conversations, we develop EPITOME, a new conceptual framework of expressed empathy (Fig. 1). In collaboration with clinical psychologists, we adapt and synthesize existing empathy definitions and scales to the text-based, asynchronous context. EPITOME consists of three communication mechanisms that together provide a comprehensive outlook on empathy: Emotional Reactions, Interpretations, and Explorations. For each mechanism, we differentiate between (0) peers not expressing it at all (no communication), (1) peers expressing it to some weak degree (weak communication), and (2) peers expressing it strongly (strong communication).
Here, we describe our framework in detail:

Emotional Reactions. Expressing emotions such as warmth, compassion, and concern experienced by the peer supporter after reading the seeker's post. Expressing these emotions plays an important role in establishing empathic rapport and support (Robert et al., 2011). A weak communication of emotional reactions alludes to these emotions without labeling them explicitly (e.g., Everything will be fine). A strong communication, on the other hand, specifies the experienced emotions (e.g., I feel really sad for you).

Interpretations. Communicating an understanding of feelings and experiences inferred from the seeker's post. Such a cognitive understanding in responses is helpful in increasing awareness of hidden feelings and experiences, and essential for developing an alliance between the seeker and the peer supporter (Watson, 2007). A weak communication of interpretations contains a mention of the understanding (e.g., I understand how you feel), while a strong communication specifies the inferred feeling or experience (e.g., This must be terrifying) or communicates understanding through descriptions of similar experiences (e.g., I also have anxiety attacks at times, which makes me really terrified).
Explorations. Improving understanding of the seeker by exploring feelings and experiences not stated in the post. Showing an active interest in what the seeker is experiencing and feeling, and probing gently, is another important aspect of empathy (Miller et al., 2003; Robert et al., 2011). A weak exploration is generic (e.g., What happened?), while a strong exploration is specific and labels the seeker's experiences and feelings that the peer supporter wants to explore (e.g., Are you feeling alone right now?).
Consistent with existing scales, responses that only give advice (Try talking to friends), only provide factual information (mindful meditation overcomes anxiety), or are offensive or abusive (shut the f**k up) are not empathic and are characterized as no communication of empathy in our framework.
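For computational bookkeeping, an EPITOME annotation can be represented as a small data structure. The sketch below is our own illustration, not the released dataset schema: the seeker post text is made up, though the highlighted phrases echo the framework examples above, and the 0-6 total score mirrors the sum used in our platform analysis (§7).

```python
from dataclasses import dataclass

# The three EPITOME mechanisms and their communication levels (0/1/2).
MECHANISMS = ("emotional_reactions", "interpretations", "explorations")
LEVELS = {0: "no communication", 1: "weak communication", 2: "strong communication"}

@dataclass
class EmpathyAnnotation:
    """One (seeker post, response post) pair labeled under EPITOME.
    `rationales` maps a mechanism to its highlighted evidence phrases."""
    seeker_post: str
    response_post: str
    levels: dict          # mechanism -> 0/1/2
    rationales: dict      # mechanism -> list of rationale phrases

ann = EmpathyAnnotation(
    seeker_post="Everything is falling apart and no one gets it.",
    response_post="This must be terrifying. Are you feeling alone right now?",
    levels={"emotional_reactions": 0, "interpretations": 2, "explorations": 2},
    rationales={"interpretations": ["This must be terrifying"],
                "explorations": ["Are you feeling alone right now?"]},
)
# Total empathy score across the three mechanisms, out of 6.
total_score = sum(ann.levels[m] for m in MECHANISMS)
print(total_score)  # 0 + 2 + 2 = 4
```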

Data Collection
To facilitate computational methods for empathy, we collect data based on EPITOME.

Data Source
We use conversations on the following two online support platforms as our data source: (1) TalkLife. TalkLife (talklife.co) is the largest global peer-to-peer mental health support network. It enables seekers to have textual interactions with peer supporters through conversational threads. The dataset contains 6.4M threads and 18M interactions ((seeker post, response post) pairs).
We use the entire dataset for in-domain pre-training (§5) and annotate a subset of 10k interactions with empathy labels. We further analyze empathy on a carefully filtered dataset of 235k mental health interactions on TalkLife (§7).

Annotation Task and Process
Empathy is conceptually nuanced and linguistically diverse, so annotating it accurately is difficult with short-term crowdwork approaches. This is also reflected in prior work that found therapeutic constructs challenging to annotate (Lee et al., 2019). To ensure high inter-rater reliability, we designed a novel training-based annotation process.

Crowdworker Recruiting and Training. We recruited and trained eight crowdworkers to identify the empathy mechanisms in EPITOME. We leveraged Upwork (upwork.com), a freelancing platform that allowed us to hire and work interactively with crowdworkers. Each crowdworker was trained on EPITOME before annotating (Table 2).

Highlighting Rationales. Along with the categorical annotations, crowdworkers were also asked to highlight the portions of the response post that formed the rationale behind their annotation. For example, in the post That must be terrible! I'm here for you, the portion That must be terrible is the rationale for it being a strong communication of interpretations.

Data Quality. Overall, our corpus has an average inter-annotator agreement of 0.6865 (average over pairwise Cohen's κ of all pairs of crowdworkers; each pair annotated >50 posts in common), which is higher than previously reported values for the annotation of empathy in face-to-face therapy (∼0.60 in Pérez-Rosas et al., 2017; Lord et al., 2015). Our ground-truth corpus contains 10,143 (seeker post, response post) pairs with empathy labels from trained crowdworkers (Table 2).

Privacy and Ethics. The TalkLife dataset was sourced with license and consent from the TalkLife platform. All personally identifiable information (user and platform identifiers) in both datasets was removed. This study was approved by the University of Washington's Institutional Review Board. In addition, we tried to minimize the risks of annotating mental health related content by providing crisis management resources to our annotators, following Sap et al. (2020). This work does not make any treatment recommendations or diagnostic claims.
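The pairwise agreement statistic above can be sketched in a few lines. This is a generic illustration of Cohen's κ for two annotators over 0/1/2 labels (function name and toy data are our own), not the exact evaluation script:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators' categorical labels (e.g., 0/1/2)."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items on which the annotators agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum((count_a[c] / n) * (count_b[c] / n)
              for c in set(labels_a) | set(labels_b))
    return (p_o - p_e) / (1 - p_e)

# Two annotators labeling eight responses on the 0/1/2 empathy scale.
a = [0, 1, 2, 2, 0, 1, 1, 2]
b = [0, 1, 2, 1, 0, 1, 2, 2]
print(round(cohens_kappa(a, b), 3))  # 0.619
```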

Model
With our collected dataset, we develop a computational approach for understanding empathy.

Problem Definition
Let $S_i = (s_{i1}, \dots, s_{im})$ be a seeker post and $R_i = (r_{i1}, \dots, r_{in})$ be a corresponding response post. For the pair $(S_i, R_i)$, we want to perform two tasks:

Task 1: Empathy Identification. Identify how empathic $R_i$ is in the context of $S_i$. For each of the three communication mechanisms in EPITOME (Emotional Reactions, Interpretations, Explorations), we want to identify its level of communication $l_i$ in $R_i$: no communication (0), weak communication (1), or strong communication (2).
Task 2: Rationale Extraction. Extract the rationale underlying the identified level $l_i \in \{\text{no}, \text{weak}, \text{strong}\}$ of each of the three communication mechanisms in EPITOME. The extracted rationale is a subsequence of words $x_i$ in $R_i$. We represent this subsequence as a mask $m_i = (m_{i1}, \dots, m_{in})$ over the words in $R_i$, where $m_{ij} \in \{0, 1\}$ is a boolean variable: 1 for a rationale word, 0 for a non-rationale word. Correspondingly, $x_i = m_i \odot R_i$.
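Concretely, applying the mask to a response can be sketched as follows (a minimal illustration with a made-up response; not the model code):

```python
def apply_rationale_mask(response_tokens, mask):
    """x_i = m_i ⊙ R_i: keep only the tokens whose mask bit is 1."""
    assert len(response_tokens) == len(mask)
    return [tok for tok, m in zip(response_tokens, mask) if m == 1]

# Response R_i with a boolean rationale mask m_i over its words.
r_i = ["That", "must", "be", "terrible", "!", "I'm", "here", "for", "you"]
m_i = [1, 1, 1, 1, 0, 0, 0, 0, 0]
print(apply_rationale_mask(r_i, m_i))  # ['That', 'must', 'be', 'terrible']
```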

Bi-Encoder Model with Attention
We propose a multi-task bi-encoder model based on RoBERTa (Liu et al., 2019) for identifying empathy and extracting rationales (Fig. 2). We multi-task over empathy identification and rationale extraction, and train three independent but identical architectures, one for each empathy communication mechanism in EPITOME (§3). The bi-encoder architecture (Humeau et al., 2019) facilitates joint modeling of $(S_i, R_i)$ pairs. Moreover, the use of attention provides context from the seeker post $S_i$. We find this approach more effective than methods that concatenate $S_i$ and $R_i$ with a [SEP] token into a single input sequence (§6).
Two Encoders. Our model uses two independently pre-trained transformer encoders from RoBERTa-BASE, the S-Encoder and the R-Encoder, for encoding the seeker post and the response post respectively. The S-Encoder encodes context from the seeker post, whereas the R-Encoder is responsible for understanding empathy in the response post.

The input to the S-Encoder is $[\mathrm{CLS}]\ s_{i1}, \dots, s_{im}\ [\mathrm{SEP}]$ and the input to the R-Encoder is $[\mathrm{CLS}]\ r_{i1}, \dots, r_{in}\ [\mathrm{SEP}]$, where [CLS] and [SEP] are special start and end tokens adapted from BERT (Devlin et al., 2019).

Domain-Adaptive Pre-training. Both the S-Encoder and the R-Encoder are initialized using the weights learned by RoBERTa-BASE. We further perform domain-adaptive pre-training (Gururangan et al., 2020) of the two encoders to adapt them to the conversational and mental health context. For this additional pre-training, we use the datasets of 6.4M seeker posts (182M tokens) and 18M response posts (279M tokens) sourced from TalkLife (§4). We use the masked language modeling task for pre-training (3 epochs, batch size of 8).

Attention Layer. We use single-head attention over the two encodings to generate a seeker-context-aware representation of the response post.
Using the terminology of transformers (Vaswani et al., 2017), our query is the response post encoding $e_i^{(R)}$, and the keys and values are the seeker post encoding $e_i^{(S)}$. Our attention is computed as
$$a_i = \mathrm{softmax}\!\left(\frac{e_i^{(R)}\, (e_i^{(S)})^{\top}}{\sqrt{d}}\right) e_i^{(S)},$$
where $d = 768$ (the hidden size in RoBERTa-BASE). We sum the encoded response $e_i^{(R)}$ with its attention-transformed representation $a_i$, which forms the final seeker-context-aware representation of the response post.
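The attention computation and residual sum can be sketched in plain Python at toy dimensions (the actual model uses RoBERTa's 768-dimensional encodings and batched tensor operations; all names here are our own):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def seeker_context_attention(e_R, e_S, d):
    """Single-head attention: queries are the response encodings e_R (n x d);
    keys and values are the seeker encodings e_S (m x d). Returns e_R + a."""
    h = []
    for q in e_R:
        # Scaled dot-product scores of this response token against all seeker tokens.
        scores = [sum(qk * kk for qk, kk in zip(q, k)) / math.sqrt(d) for k in e_S]
        w = softmax(scores)
        # Attention output: weighted sum of the seeker encodings (the values).
        a = [sum(wj * vj[t] for wj, vj in zip(w, e_S)) for t in range(d)]
        # Residual sum, forming the seeker-context-aware representation.
        h.append([qt + at for qt, at in zip(q, a)])
    return h

# Tiny example: d = 2, two response tokens, two seeker tokens.
e_R = [[1.0, 0.0], [0.0, 1.0]]
e_S = [[1.0, 1.0], [0.0, 2.0]]
h = seeker_context_attention(e_R, e_S, d=2)
print([[round(x, 3) for x in row] for row in h])
```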
Empathy Identification. For the task of identifying empathy, we use the final representation of the [CLS] token of the response post, $h_i[\mathrm{CLS}]$, and pass it through a linear layer to obtain the prediction $\hat{l}_i$ (0, 1, or 2) of the empathy level of each communication mechanism. Note that we train three independent models for the three communication mechanisms in EPITOME (§3).
Extracting Rationales. For extracting the rationales underlying the predictions, we use the final representations of the individual tokens in $R_i$, $h_i[r_{i1}], \dots, h_i[r_{in}]$, and pass them through a linear layer to make the boolean predictions $\hat{m}_i$.
Loss Function. We use cross-entropy between the true and predicted labels as the loss function for each of our two tasks. The overall loss of our multi-task architecture is
$$\mathcal{L} = \lambda_{EI}\,\mathcal{L}_{EI} + \lambda_{RE}\,\mathcal{L}_{RE}.$$

Experimental Setup. We split both datasets into train, dev, and test sets (75:5:20). We train our model for 4 epochs using a learning rate of 2e-5, batch size of 32, $\lambda_{EI} = 1$, and $\lambda_{RE} = 0.5$ (see Appendix B for fine-tuning details).
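The weighted objective can be illustrated with a minimal cross-entropy computation (a sketch with made-up probabilities, not the training code; averaging the rationale loss over tokens is one common choice we assume here):

```python
import math

def cross_entropy(probs, true_idx):
    """Negative log-likelihood of the true class under predicted probabilities."""
    return -math.log(probs[true_idx])

def multitask_loss(ei_probs, ei_label, re_probs, re_mask, lam_ei=1.0, lam_re=0.5):
    """L = lam_ei * L_EI + lam_re * L_RE, with L_RE averaged over tokens."""
    l_ei = cross_entropy(ei_probs, ei_label)
    l_re = sum(cross_entropy(p, m) for p, m in zip(re_probs, re_mask)) / len(re_mask)
    return lam_ei * l_ei + lam_re * l_re

# Predicted distribution over the three empathy levels; true level = 2 (strong).
ei_probs = [0.1, 0.2, 0.7]
# Per-token [non-rationale, rationale] probabilities and the gold boolean mask.
re_probs = [[0.2, 0.8], [0.9, 0.1], [0.3, 0.7]]
re_mask = [1, 0, 1]
print(round(multitask_loss(ei_probs, 2, re_probs, re_mask), 5))  # 0.47087
```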

Results
Next, we analyze how effectively we can identify empathy with underlying rationales using our computational approach.
Empathy Identification Task. Table 3 reports the accuracy and macro-f1 scores for the three communication mechanisms (the random baseline for each is 33% accurate; three levels). Logistic regression, RNN, and HRED struggle to identify empathy, with noticeably low macro-f1 scores indicative of failures to distinguish between the three levels of communication.
Among the baseline transformer architectures, we obtain the best performance using RoBERTa, but observe substantial gains over them with our approach (+1.73 acc., +4.02 macro-f1 over RoBERTa). We analyze the sources of these gains in §6.2.

Rationale Extraction Task. We perform both token-level and span-level evaluation for this task, using two metrics commonly used in discrete rationale extraction tasks (DeYoung et al., 2020): (1) T-f1 (token-level f1); (2) IOU-f1 (intersection-over-union overlap of predicted spans with ground-truth spans, with a threshold of 0.5 on the overlap for counting true positives, and the corresponding f1). We find that GPT-2 and DialoGPT perform better than BERT and RoBERTa, likely due to their appropriateness for the related task of generating free-text rationales (Table 4). Our approach obtains gains of +2.58 T-f1 and +6.45 IOU-f1 over DialoGPT, potentially due to the use of attention and the seeker post (§6.2).
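The IOU-f1 span metric can be sketched as follows (our own minimal implementation of the metric described in DeYoung et al. (2020), with spans as half-open (start, end) token offsets; not the exact evaluation script):

```python
def iou(span_a, span_b):
    """Intersection-over-union of two half-open token spans (start, end)."""
    inter = max(0, min(span_a[1], span_b[1]) - max(span_a[0], span_b[0]))
    union = (span_a[1] - span_a[0]) + (span_b[1] - span_b[0]) - inter
    return inter / union if union else 0.0

def iou_f1(pred_spans, gold_spans, threshold=0.5):
    """A predicted span is a true positive if it overlaps some gold span
    with IOU >= threshold; F1 over the resulting TP/FP/FN counts."""
    tp = sum(any(iou(p, g) >= threshold for g in gold_spans) for p in pred_spans)
    fp = len(pred_spans) - tp
    fn = sum(not any(iou(p, g) >= threshold for p in pred_spans) for g in gold_spans)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

pred = [(0, 4), (10, 12)]   # predicted rationale spans
gold = [(1, 4), (20, 25)]   # ground-truth rationale spans
print(round(iou_f1(pred, gold), 3))  # 0.5
```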

Ablation Study
We next analyze the components and training strategies in our approach through an ablation study.
No Attention. Instead of using attention, we concatenate the seeker post encoding $e_i^{(S)}$ with the response post encoding $e_i^{(R)}$ and use the concatenated representation as input to the linear layer.
No Seeker Post. We train without the S-Encoder, i.e., we encode only the response post with the R-Encoder.
No Rationales. We set $\lambda_{RE}$ to 0 and train only on the empathy identification task.
No Domain-Adaptive Pre-training. We initialize using only the model weights of RoBERTa-BASE.
Results. Our most significant gains come from using attention and the seeker post (Table 5), which particularly benefits the rationale extraction task (+4.88 T-f1, +5.74 IOU-f1). Using rationales and domain-adaptive pre-training leads to smaller performance improvements.

Error Analysis
We qualitatively analyze the sources of our errors. We found that the model sometimes failed to identify short expressions of emotions in responses that otherwise contained many instructions (e.g., Sorry to hear that! Try doing ...). Also, certain responses trying to universalize the situation (e.g., You are not alone) were incorrectly identified as strong interpretations. Furthermore, a source of error for explorations was confusion caused by questions that were not explorations of the seeker's feelings or experiences (e.g., offers to talk: Do you want to talk?).

Model-based Insights into Mental Health Platforms
We apply our model to study how empathy impacts online peer-to-peer support dynamics. To focus only on conversations related to significant mental health challenges and to filter out common social media interactions (e.g., Merry Christmas), we carefully select 235k mental health related interactions on TalkLife using a seeker-reported indicator. We investigate (1) the levels of empathy on the platform and their variation over time, and examine the relationship of empathy with (2) conversation outcomes, (3) relationship forming, and (4) gender.

Example interactions (from Table 6):

Seeker Post: I cannot do anything without getting blamed today. This day is getting worse and worse.
Original Response: Days end, tomorrow is a fresh start.
Re-written Response: I'm sorry that today sucks, but tomorrow is a fresh start.

Seeker Post: An hour ago i was happy an hour later i'm sad. Am i getting mad now?
Original Response: Try mindful meditation which can control anxiety
Re-written Response: That's something I've struggled with too, and it really pains me to hear that you're dealing with the same thing. Have you considered trying meditation? I've found it to be very helpful.
(1) Peer supporters do not self-learn empathy over time. Overall, we observe that the empathy expressed by peer supporters on the platform is low (avg. total score of 1.09 out of 6). In addition, we find that the emotional reactivity of users decreases over time (36% decrease over three years), while their levels of interpretations and explorations remain practically constant (Fig. 3a). This is consistent with prior work on therapy showing that, without deliberate practice and specific feedback, even trained therapists often diminish in skills over time (Goldberg et al., 2016). We find this trend robust to potential confounding factors (new users, user dropout) and across users of different groups (low- vs. high-activity users, moderators; Appendix C). This indicates that most users do not self-learn empathy and highlights the need to provide them feedback.
(2) High-empathy interactions are received positively by seekers. We analyze the correlation of empathic conversations with positive feedback, concretely with the seeker "liking" the response. We find that strong communications of empathy receive 45% more likes from seekers than no communication (Fig. 3b). Strong explorations get 44% fewer likes but receive 47% more replies than no explorations, leading to higher engagement.
(3) Relationship forming is more likely after empathic conversations. Psychology research emphasizes the importance of empathy in forming an alliance and relationship with seekers (Watson, 2007).
Here, we operationalize relationship forming as the seeker "following" the peer supporter after a conversation (within 24 hours). We find that seekers are 79% more likely to follow peer supporters after an empathic conversation (total score of 1+ vs. 0) than after a non-empathic one (Fig. 3c).
(4) Females are more empathic with females than males are with males. Previous work has shown that seekers identifying as female receive more support in online communities (Wang and Jurgens, 2018). Here, we ask whether empathic interactions are affected by the self-reported gender of seekers and peer supporters. We find that female peer supporters are 32% more empathic towards female seekers than males are towards male seekers (Fig. 3d). Also, females are 6% more empathic towards males than males are towards females.

Implications for empathy-based feedback. These results suggest that our approach not only successfully measures empathy according to a principled framework (§3), but also that the measured empathy components are important to online supportive conversations, as indicated by the positive reactions from seekers and meaningful reflections of social theories. However, peer supporters on the platform express empathy rarely, and this does not improve over time. This points to critical opportunities for empathy-based feedback that makes peer supporters' interactions with seekers more effective. Here, we demonstrate the potential of feedback in a simple proof-of-concept. When we provided three participants (none of whom are co-authors) simple feedback (Appendix D) based on EPITOME and our best-performing model, they were able to increase the empathy of responses from 0.8 to 3.0 (total empathy across the three mechanisms). Table 6 shows two such examples of re-written responses that improve in communicating cognitive understanding (today sucks) and are also better in emotional reactions (I'm sorry, it pains me) and explorations (Have you considered trying mindful meditation?).
Related Work

Previous work in NLP for mental health has focused on the analysis of effective conversation strategies (Althoff et al., 2016; Pérez-Rosas et al., 2019; Zhang and Danescu-Niculescu-Mizil, 2020), the identification of therapeutic actions (Lee et al., 2019), and the language development of counselors (Zhang et al., 2019). Researchers have also analyzed linguistic accommodation (Sharma and De Choudhury, 2018), cognitive restructuring (Pruksachatkun et al., 2019), and self-disclosure (Yang et al., 2019). We extend these studies and analyze empathy, which is key in counseling and mental health support. Recent work has also developed proof-of-concept prototypes, such as ClientBot (Huang et al., 2020), for training users in counseling. Our approach is aimed at developing empathy-based feedback and training systems for peer supporters, consistent with calls to action for improved treatment access and training (Miner et al., 2019; Imel et al., 2015; Kazdin and Rabbitt, 2013).

Conclusion
We developed a new framework, dataset, and computational method for understanding expressed empathy in text-based, asynchronous conversations on mental health platforms. Our computational approach effectively identifies empathy along with the underlying rationales. Moreover, the identified components are found to be important to mental health platforms and helpful in improving peer-to-peer support through model-based feedback.

B Reproducibility

B.1 Implementation Details
Code. Our code is based on the huggingface library (https://huggingface.co/). We make it publicly available at https://github.com/behavioral-data/Empathy-Mental-Health.

Seed Value. For all our experiments, we used a seed value of 12.

B.4 Train, Dev, Test Splits
We split both datasets into train, dev, and test sets in the ratio 75:5:20. Table 7 contains the statistics of the train, dev, and test splits.

B.5 Number of Parameters
The total number of parameters of our model = 2 × (number of parameters of RoBERTa-BASE) + parameters in the linear layers ≈ 2 × 125M + 2 × 0.5M ≈ 251M.

B.6 Reddit dataset
The entire Reddit dataset can be accessed through its archive on Google BigQuery at https://bigquery.

C Potential confounding factors in analysis of variation of empathy over time

We note that such an analysis can be affected by several confounding factors, such as old vs. new users, user dropout, and the low activity of many users. To account for these factors, we stratify users by the year in which they started supporting on the platform (2015, 2016, 2017) and analyze the average levels of empathy during subsequent years in each stratum. We further filter out users with < 10 posts and only consider users who stay on the platform for at least a year.
In addition, we analyze various user groups but observe similar trends (Fig. 4).

D Proof-of-Concept Feedback

We work with three computer science students with no training in counseling and give them (seeker post, response post) pairs identified as low in empathy by our approach (total empathy score ≤ 1). We show them (1) the levels of empathy predicted by our model, (2) the extracted rationales, and (3) templated feedback explaining where the response falls short and how it can be made more empathic (based on the predicted levels, extracted rationales, and the definitions and examples in EPITOME). A sample feedback is shown below:

• Seeker Post: I'm hurt so much that I don't really have feelings anymore
• Response Post: Yeah, I felt it once
• Feedback: 1. The response communicates an understanding of the seeker's post to a weak degree in the portion I felt it once. The communication can be made stronger by talking about the seeker's feelings or experiences that you interpret after reading the post. Typically, they are expressed by saying This must be terrible or I know you are in a tough situation. 2. It also lacks expressions of emotions of warmth, compassion, or concern, and does not attempt to explore the seeker's emotions or feelings. Typically, these are expressed by saying I am feeling sorry for you or What makes you feel depressed?
We ask them to re-write the response post making use of the templated feedback. Overall, the participants were comfortable re-writing the responses, with an average difficulty rating of 1.92 out of 5 (5 being most difficult), and found the feedback useful in the re-writing process, with an average usefulness rating of 3.5 out of 5 (5 being highly useful).

Figure 2: We use two independently pre-trained RoBERTa-based encoders for encoding the seeker post and the response post respectively. We leverage attention between them to generate a seeker-context-aware representation of the response post, which is used to perform the two tasks of empathy identification and rationale extraction.

Figure 3: (a) Peer supporters do not self-learn empathy over time. Only users who joined in 2015 are shown, but similar trends hold for other user groups; (b) Stronger communications of emotional reactions and interpretations are received positively by seekers. Stronger explorations get 47% more replies; (c) Many more seekers follow peer supporters after empathic interactions; (d) Females are more empathic towards females.
Pre-training Time. We conducted domain-adaptive pre-training on four RTX 2080 Ti GPUs. Pre-training the S-Encoder took around 22 hours; pre-training the R-Encoder took around 38 hours. Both were pre-trained for three epochs.

Model Training Time. We trained our model on one RTX 2080 Ti GPU. Training takes approximately five minutes. Our model is trained for four epochs.

Figure 4: Empathy-over-time analysis of various user groups. We find similar trends across multiple groups.

Table 1: EPITOME incorporates both emotional and cognitive aspects of empathy that were previously only studied in face-to-face therapy and never computationally in text-based, asynchronous conversations. *Rashkin et al. (2019) implicitly enable empathic conversations through grounding in emotions instead of communication.

Table 2: Statistics of the collected empathy dataset. The crowdworkers were trained on EPITOME through a series of phone calls and manual/automated feedback on sample posts to ensure high-quality annotations.
Crowdworkers annotated the three mechanisms (Emotional Reactions, Interpretations, and Explorations) one at a time. For each mechanism, they annotated whether the response post contained no communication, weak communication, or strong communication of empathy in the context of the seeker post.

Table 4: Rationale extraction task results. We evaluate both at the level of tokens (T-f1) and spans (IOU-f1).

Table 5: Ablation results. Most of our gains are due to the context provided through attention and the seeker post, with higher gains for the rationale extraction task. *Note that rationales cannot be predicted after removing them from training.

Table 6: Example re-written responses with our model-based feedback. Participants increased empathy from 0.8 to 3.0. blue = Strong emo. reactions, light red/dark red = Weak/Strong Interpretations, green = Strong explorations.