A Computational Linguistic Study of Personal Recovery in Bipolar Disorder

Mental health research can benefit increasingly from computational linguistics methods, given the abundant availability of language data on the internet and advances in computational tools. This interdisciplinary project will collect and analyse social media data of individuals diagnosed with bipolar disorder with regard to their recovery experiences. Personal recovery - living a satisfying and contributing life alongside symptoms of severe mental health issues - has so far only been investigated qualitatively with structured interviews and quantitatively with standardised questionnaires, with mainly English-speaking participants in Western countries. Complementary to this evidence, computational linguistic methods allow us to analyse first-person accounts shared online in large quantities, representing unstructured settings and a more heterogeneous, multilingual population, to draw a more complete picture of the aspects and mechanisms of personal recovery in bipolar disorder.


Introduction and background
Recent years have witnessed increased performance in many computational linguistics tasks such as syntactic and semantic parsing (Collobert et al., 2011;Zeman et al., 2018), emotion classification (Becker et al., 2017), and sentiment analysis (Barnes et al., 2017, 2018a), especially concerning the applicability of such tools to noisy online data. Moreover, the field has made substantial progress in developing multilingual models and extending semantic annotation resources to languages beyond English (Pianta et al., 2002;Boas, 2009;Piao et al., 2016;Boot et al., 2017).
Concurrently, it has been argued for mental health research that it would constitute a 'valuable critical step' (Stuart et al., 2017) to analyse first-hand accounts by individuals with lived experience of severe mental health issues in blog posts, tweets, and discussion forums. Several severe mental health difficulties, e.g., bipolar disorder (BD) and schizophrenia, are considered chronic, and clinical recovery, defined as being relapse and symptom free for a sustained period of time (Chengappa et al., 2005), is considered difficult to achieve (Forster, 2014;Heylighen et al., 2014; U.S. Department of Health and Human Services: The National Institute of Mental Health, 2016). Moreover, clinically recovered individuals often do not regain full social and educational/vocational functioning (Strakowski et al., 1998;Tohen et al., 2003). Therefore, research originating from initiatives by people with lived experience of mental health issues has been advocating emphasis on the individual's goals in recovery (Deegan, 1988;Anthony, 1993). This movement gave rise to the concept of personal recovery (Andresen et al., 2011;van Os et al., 2019), loosely defined as a 'way of living a satisfying, hopeful, and contributing life even with limitations caused by illness' (Anthony, 1993). The aspects of personal recovery have been conceptualised in various ways (Young and Ensing, 1999;Mansell et al., 2010;Morrison et al., 2016). According to the frequently used CHIME model (Leamy et al., 2011), its main components are Connectedness, Hope and optimism, Identity, Meaning and purpose, and Empowerment.
Here, we focus on BD, which is characterised by recurring episodes of depressed and elated (hypomanic or manic) mood (Jones et al., 2010;Forster, 2014). Bipolar spectrum disorders were estimated to affect approximately 2% of the UK population (Heylighen et al., 2014) with rates ranging from 0.1%-4.4% across 11 other European, American and Asian countries (Merikangas et al., 2011). Moreover, BD is associated with a high risk of suicide (Novick et al., 2010), making its prevention and treatment important tasks for society. BD-specific personal recovery research is motivated by mainly two facts: First, the pole of positive/elevated mood and ongoing mood instability constitute core features of BD and pose special challenges compared to other mental health issues, such as unipolar depression (Jones et al., 2010). Second, unlike for some other severe mental health difficulties, return to normal functioning is achievable given appropriate treatment (Coryell et al., 1998;Tohen et al., 2003;Goldberg and Harrow, 2004).
A substantial body of qualitative and quantitative research has shown the importance of personal recovery for individuals diagnosed with BD (Mansell et al., 2010;Jones et al., 2010, 2012, 2015;Morrison et al., 2016). Qualitative evidence mainly comes from (semi-)structured interviews and focus groups and has been criticised for small numbers of participants (Stuart et al., 2017), lacking complementary quantitative evidence from larger samples (Slade et al., 2012). Some quantitative evidence stems from the standardised bipolar recovery questionnaire (Jones et al., 2012) and a randomised controlled trial for recovery-focused cognitive-behavioural therapy (Jones et al., 2015). Critically, previous research has taken place only in structured settings.
What is more, the recovery concept emerged from research primarily conducted in English-speaking countries, mainly involving researchers and participants of Western ethnicity. This might have led to a lack of non-Western notions of wellbeing in the concept, such as those found in indigenous peoples (Slade et al., 2012), limiting its applicability to a general population. Indeed, the variation in BD prevalence rates from 0.1% in India to 4.4% in the US is striking. It has been shown that culture is an important factor in the diagnosis of BD (Mackin et al., 2006), as well as in the causes attributed to mental health difficulties in general and the treatments considered appropriate (Sanches and Jorge, 2004;Chentsova-Dutton et al., 2014). While approaches to mental health classification from texts have long ignored the cultural dimension (Loveys et al., 2018), first studies show that online language of individuals affected by depression or related mental health difficulties differs significantly across cultures (De Choudhury et al., 2017;Loveys et al., 2018).
Hence, it seems timely to take into account the wealth of accounts of mental health difficulties and recovery stories from individuals of diverse ethnic and cultural backgrounds that are available in a multitude of languages on the internet. Corpus and computational linguistic methods are explicitly designed for processing large amounts of linguistic data (Jurafsky and Martin, 2009;O'Keeffe and McCarthy, 2010;McEnery and Hardie, 2011;Rayson, 2015), and as discussed above, recent advances have made it feasible to apply them to noisy user-generated texts from diverse domains, including mental health (Resnik et al., 2014;Benton et al., 2017b). Computer-aided analysis of public social media data enables us to address several shortcomings in the scientific underpinning of personal recovery in BD by overcoming the small sample sizes of lab-collected data and including accounts from a more heterogeneous population.
In sum, our research questions are as follows: (1) How is personal recovery discussed online by individuals meeting criteria for BD? (2) What new insights do we get about personal recovery and factors that facilitate or hinder it? We will investigate these questions in two parts, looking at English-language data by westerners and at multilingual data by individuals of diverse ethnicities.

Data
Previous work in computational linguistics and clinical psychology has tended to focus on the detection of mental health issues as classification tasks (Arseniev-Koehler et al., 2018). Datasets have been collected for various conditions including BD using publicly available social-media data from Twitter (Coppersmith et al., 2015) and Reddit (Sekulić et al., 2018;Cohan et al., 2018). Unfortunately, the Twitter dataset is unavailable for further research. 1 In both Reddit datasets, mental health-related content was deliberately removed. This allows the training of classifiers that try to predict the mental health of authors from excerpts that do not explicitly address mental health, yet it renders the data useless for analyses on how mental health is talked about online. Due to this lack of appropriate existing publicly accessible datasets, we will create such resources and make them available to subsequent researchers.
We plan to collect data relevant for BD in general as well as for personal recovery in BD from three sources varying in their available amount versus depth of the accounts we expect to find: 1) Twitter, 2) Reddit (focusing on mental health-related content unlike previous work), 3) blogs authored by affected individuals. Twitter and Reddit users with a BD diagnosis will be identified automatically via self-reported diagnosis statements, such as 'I was diagnosed with BD-I last week'.
To do so, we will extend the diagnosis patterns and terms for BD provided by Cohan et al. (2018) 2 . Implicit consent is assumed from users on these platforms to use their public tweets and posts. 3 Relevant blogs will be manually identified, and their authors will be contacted to obtain informed consent for using their texts.
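The identification of self-reported diagnosis statements can be illustrated with a minimal sketch. The condition terms and phrasing patterns below are hypothetical placeholders written for this illustration; they are not the patterns published by Cohan et al. (2018), and a real system would additionally need to handle negation, hypotheticals, and reported speech more carefully.

```python
import re

# Hypothetical condition terms, for illustration only (not the published patterns).
CONDITION = r"(?:bipolar(?:\s+(?:disorder|[12]|i{1,2}))?|bd(?:-i{1,2})?|manic\s+depression)"

# First-person diagnosis statement, e.g. "I was diagnosed with BD-I".
DIAGNOSIS_PATTERN = re.compile(
    r"\bi\s+(?:was|am|have\s+been|got)\s+"
    r"(?:recently\s+|just\s+|officially\s+)?"
    r"diagnosed\s+with\s+" + CONDITION + r"\b",
    flags=re.IGNORECASE,
)


def has_self_reported_diagnosis(text: str) -> bool:
    """Return True if the text contains a first-person diagnosis statement."""
    return DIAGNOSIS_PATTERN.search(text) is not None


def users_with_diagnosis(posts):
    """posts: iterable of (user_id, text) pairs; returns the ids of matching users."""
    return {user for user, text in posts if has_self_reported_diagnosis(text)}
```

Requiring the first-person subject excludes statements about third parties (e.g., 'My sister was diagnosed with ...'), at the cost of missing less canonical phrasings; matched users would still need manual spot checks.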
Since language and culture are important factors in our research questions, we need information on the language of the texts and the country of residence of their authors 3 , which is not provided in a structured format in the three data sources. For language identification, Twitter employs an automatic tool (Trampus, 2015), which can be used to filter tweets according to 60 language codes, and there are free, fairly accurate tools such as the Google Compact Language Detector 4 , which can be applied to Reddit and blog posts. The location of Twitter users can be automatically inferred from their tweets (Cheng et al., 2010) or the (albeit noisy) location field in their user profiles (Hecht et al., 2011). Only one attempt to classify the location of Reddit users has been published so far (Harrigian, 2018) showing meagre results, indicating that the development of robust location classification approaches on this platform would constitute a valuable contribution.
Some companies collect mental health-related online data and make them available to researchers subject to approval of their internal review boards, e.g., OurDataHelps 5 by Qntfy or the peer-support forum provider 7 Cups 6 . Unlike 'raw' social media data, these datasets have richer user-provided metadata and explicit consent for research usage. On the other hand, less data is available, the process to obtain access might be tedious within the short timeline of a PhD project, and it might be impossible to share the used portions of the data with other researchers. Therefore, we will pursue the possibilities of obtaining access to these datasets, but in parallel also collect our own datasets to avoid dependence on external data providers.

Methodology and Resources
As explained in the introduction, the overarching aim of this project is to investigate to what extent information conveyed in social media posts can complement more traditional research methods in clinical psychology to gain insights into the recovery experience of individuals with a BD diagnosis. Therefore, we will first conduct a systematic literature review of qualitative evidence to establish a solid base of what is already known about personal recovery experiences in BD for the subsequent social media studies.
Our research questions, which regard the experiences of different populations, lend themselves to several subprojects. First, we will collect and analyse English-language data from westerners. Then, we will address ethnically diverse English-speaking populations and finally multilingual accounts. This has the advantage that we can build data processing and methodological workflows along an increase in complexity of the data collection and analysis throughout the project.
In each project phase, we will employ a mixed-methods approach to combine the advantages of quantitative and qualitative methods (Tashakkori and Teddlie, 1998;Creswell and Plano Clark, 2011), which is established in mental health research (Steckler et al., 1992;Baum, 1995;Sale et al., 2002;Lund, 2012) and specifically recommended to investigate personal recovery (Leonhardt et al., 2017). Quantitative methods are suitable to study observable behaviour such as language and yield more generalisable results by taking into account large samples. However, they fall short of capturing the subjective, idiosyncratic meaning of socially constructed reality, which is important when studying individuals' recovery experience (Russell and Browne, 2005;Mansell et al., 2010;Morrison et al., 2016;Crowe and Inder, 2018). Therefore, we will apply an explanatory sequential research design (Creswell and Plano Clark, 2011), starting with statistical analysis of the full dataset followed by a manual investigation of fewer examples, similar to 'distant reading' (Moretti, 2013) in digital humanities.
Since previous research mainly employed (semi-)structured interviews and we do not expect to necessarily find the same aspects emphasised in unstructured settings, even less so when looking at a more diverse and non-English speaking population, we will not derive hypotheses from existing recovery models for testing on the online data. Instead, we will start off with exploratory quantitative research using comparative analysis tools such as Wmatrix (Rayson, 2008) to uncover important linguistic features, e.g., keywords and key concepts that occur with unexpected frequency in our collected datasets relative to reference corpora. The underlying assumption is that keywords and key concepts are indicative of certain aspects of personal recovery, such as those specified in the CHIME model (Leamy et al., 2011), other previous research (Mansell et al., 2010;Morrison et al., 2016;Crowe and Inder, 2018), or novel ones. Comparing online sources with transcripts of structured interviews or subcorpora originating from different cultural backgrounds might uncover aspects that were not prominently represented in the accounts studied in prior research.
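To make the notion of 'unexpected frequency' concrete, the following minimal sketch computes keyness with the log-likelihood statistic that is widely used in corpus linguistics for comparing a study corpus against a reference corpus. We assume this statistic purely for illustration; the sketch does not reproduce the implementation of Wmatrix or any other tool, and the function names are our own.

```python
import math
from collections import Counter


def log_likelihood(a: int, b: int, c: int, d: int) -> float:
    """Keyness of a word seen a times in a study corpus of c tokens
    and b times in a reference corpus of d tokens."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the study corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(study_tokens, reference_tokens, top_n=10):
    """Rank words that are overused in the study corpus by log-likelihood.
    Both token lists are assumed to be non-empty."""
    study, ref = Counter(study_tokens), Counter(reference_tokens)
    c, d = sum(study.values()), sum(ref.values())
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        # Keep only words with a higher relative frequency in the study corpus.
        if a / c > b / d:
            scored.append((word, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

The same computation applies to key concepts if the tokens are replaced by semantic tags before counting.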
A specific challenge will be to narrow down the data to parts relevant for personal recovery, since there is no control over the discussed topics compared to structured interviews. Investigating how individuals discuss personal recovery online and what (potentially unrecorded) aspects they associate with it, without a priori narrowing down the search space to specific known keywords, seems like a chicken-and-egg problem. We propose to address this challenge by an iterative approach similar to the one taken in a corpus linguistic study of cancer metaphors (Semino et al., 2017). Drawing on results from previous qualitative research (Leamy et al., 2011;Morrison et al., 2016), we will compile an initial dictionary of recovery-related terms. Next, we will examine a small portion of the dataset manually, which will be partly randomly sampled and partly selected to contain recovery-related terms. Based on this, we will be able to expand the dictionary and additionally automatically annotate semantic concepts of the identified relevant text passages using a semantic tagging approach such as the UCREL Semantic Analysis System (USAS) (Rayson et al., 2004). Crucially for the multilingual aspect of the project, USAS can tag semantic categories in eight languages (Piao et al., 2016). Then, semantic tagging will be applied to the full corpus to retrieve all text passages mentioning relevant concepts. Furthermore, distributional semantics methods (Lenci, 2008;Turney and Pantel, 2010) can be used to find terms that frequently co-occur with words from our keyword dictionary. Occurrences of the identified keywords or concepts can be quantified in the full corpus to identify the importance of the related personal recovery aspects.
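One step of this iterative dictionary expansion could look like the sketch below. As a deliberately simple stand-in for full distributional semantic models, it ranks words by pointwise mutual information (PMI) between 'the word occurs in a post' and 'the post contains a seed term'. The tokeniser, the thresholds, and all names are assumptions made for this illustration, and the returned candidates are meant for manual review, not for automatic inclusion in the dictionary.

```python
import math
import re
from collections import Counter


def tokenise(text):
    """Very rough lowercase tokeniser (an assumption; real data needs more care)."""
    return re.findall(r"[a-z']+", text.lower())


def expansion_candidates(posts, seed_terms, min_count=2, top_n=10):
    """Rank non-seed words by document-level PMI with the seed dictionary."""
    seeds = set(seed_terms)
    n_posts = 0
    n_seed_posts = 0
    doc_freq = Counter()  # number of posts containing the word
    co_freq = Counter()   # number of seed-containing posts containing the word
    for text in posts:
        types = set(tokenise(text))
        if not types:
            continue
        n_posts += 1
        has_seed = bool(types & seeds)
        n_seed_posts += has_seed
        for word in types - seeds:
            doc_freq[word] += 1
            if has_seed:
                co_freq[word] += 1
    scored = []
    for word, joint in co_freq.items():
        if joint < min_count:
            continue
        # PMI = log2( P(word, seed) / (P(word) * P(seed)) )
        pmi = math.log2(joint * n_posts / (doc_freq[word] * n_seed_posts))
        scored.append((word, pmi))
    # Highest PMI first; ties broken alphabetically for reproducibility.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]
```

Terms accepted during manual review would be added to the seed dictionary before the next iteration; the `min_count` threshold guards against the known tendency of PMI to favour rare words.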

Linguistic Inquiry and Word Count (LIWC) (Pennebaker et al., 2015) is a frequently used tool in social-science text analysis to analyse emotional and cognitive components of texts and derive features for classification models (Cohan et al., 2018;Sekulić et al., 2018;Tackman et al., 2018;Wang and Jurgens, 2018). LIWC counts target words organised in a manually constructed hierarchical dictionary without contextual disambiguation in the texts under analysis and has been psychometrically validated and developed for English exclusively. While translations for several languages exist, e.g., Dutch (Boot et al., 2017), it is questionable to what extent LIWC concepts can be transferred to other languages and cultures by mere translation. We therefore aim to apply and develop methods that require less manual labour and are applicable to many languages and cultures. One option is unsupervised methods, such as topic modelling, which has already been applied to explore cultural differences in mental health-related online data (De Choudhury et al., 2017;Loveys et al., 2018). The Differential Language Analysis ToolKit (DLATK) (Schwartz et al., 2017) facilitates social-scientific language analyses, including tools for preprocessing, such as emoticon-aware tokenisers, filtering according to metadata, and analysis, e.g., via robust topic modelling methods.
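The limitation of context-free dictionary counting can be seen in a toy sketch of the LIWC-style approach. The categories and entries below are invented for this illustration and are not taken from the LIWC lexicon; the trailing asterisk as a prefix wildcard is likewise a convention we assume here.

```python
import re
from collections import Counter

# Toy dictionary: hypothetical categories and entries, NOT the LIWC lexicon.
TOY_DICTIONARY = {
    "posemo": ["hope*", "happy", "good"],
    "social": ["friend*", "family", "talk*"],
}


def compile_entry(entry):
    """Turn a dictionary entry into a regex; a trailing '*' matches any suffix."""
    if entry.endswith("*"):
        return re.compile(re.escape(entry[:-1]) + r"[a-z']*")
    return re.compile(re.escape(entry))


COMPILED = {
    category: [compile_entry(entry) for entry in entries]
    for category, entries in TOY_DICTIONARY.items()
}


def category_proportions(text):
    """Proportion of tokens in the text that fall into each category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for category, patterns in COMPILED.items():
            if any(p.fullmatch(token) for p in patterns):
                counts[category] += 1
    total = len(tokens)
    return {c: (counts[c] / total if total else 0.0) for c in COMPILED}
```

Because each token is looked up in isolation, the input 'not good' still contributes to the positive-emotion category, which illustrates the lack of contextual disambiguation discussed above.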
Furthermore, emotion and sentiment analysis constitute useful tools to investigate the emotions involved in talking about recovery and identify factors that facilitate or hinder it. There are many annotated datasets to train supervised classifiers (Bostan and Klinger, 2018;Barnes et al., 2017) for these actively researched NLP tasks. Machine learning methods were found to usually outperform rule-based approaches based on look-ups in dictionaries such as LIWC. Again, most annotated resources are English, but state-of-the-art approaches based on multilingual embeddings allow transferring models between languages (Barnes et al., 2018a).

Ethical considerations
Ethical considerations are established as an essential part of planning mental health research, and most research projects undergo approval by an ethics committee. In contrast, the computational linguistics community has started only recently to consider ethical questions (Hovy and Spruit, 2016;Hovy et al., 2017). Likely, this is because computational linguistics was traditionally concerned with publicly available, impersonal texts such as newspapers or texts published with some temporal distance, which left a distance between the text and author. Conversely, recent social media research often deals with highly personal information of living individuals, who can be directly affected by the outcomes (Hovy and Spruit, 2016). Hovy and Spruit (2016) discuss issues that can arise when constructing datasets from social media and conducting analyses or developing predictive models based on these data, which we review here in relation to our project.
Demographic bias in sampling the data can lead to exclusion of minority groups, resulting in overgeneralisation of models based on these data. As discussed in the introduction, personal recovery research suffers from a bias towards English-speaking Western individuals of white ethnicity. By studying multilingual accounts of ethnically diverse populations we explicitly address the demographic bias of previous research.
Topic overexposure, where certain groups are perceived as abnormal when research repeatedly finds that their language is different or more difficult to process, is tricky to address. Unlike previous research (Coppersmith et al., 2015;Cohan et al., 2018;Sekulić et al., 2018) our goal is not to reveal particularities in the language of individuals affected by mental health problems. Instead, we will compare accounts of individuals with BD from different settings (structured interviews versus informal online discourse) and of different backgrounds. While the latter bears the risk of overexposing certain minority groups, we will pay special attention to this in the dissemination of our results.
Lastly, most research, even when conducted with the best intentions, suffers from the dual-use problem (Jonas, 1984), in that it can be misused or have consequences that affect people's lives negatively. For this reason, we refrain from publishing mental health classification methods, which could be used, for example, by health insurance companies for the risk assessment of applicants based on their social media profiles.
If and how informed consent needs to be obtained for research on social media data is a debated issue (Eysenbach and Till, 2001;Beninger et al., 2014;Paul and Dredze, 2017), mainly because it is not straightforward to determine if posts are made in a public or private context. From a legal point of view, the privacy policies of Twitter 7 and Reddit 8 explicitly allow analysis of user content by third parties, but it is unclear to what extent users are aware of this when posting to these platforms (Ahmed et al., 2017). However, in practice it is often infeasible to seek retrospective consent from hundreds or thousands of social media users. According to current ethical guidelines for social media research (Benton et al., 2017a;Williams et al., 2017) and practice in comparable research projects (O'Dea et al., 2015;Ahmed et al., 2017), it is regarded as acceptable to waive explicit consent if the anonymity of the users is preserved. Therefore, we will not ask the account holders of Twitter and Reddit posts included in our datasets for their consent. Benton et al. (2017a) formulate guidelines for ethical social media health research that pertain especially to data collection and sharing. In line with these, we will only share anonymised and paraphrased excerpts from the texts, as it is often possible to recover a user name via a web search for the verbatim text of a post. However, we will make the original texts available as datasets for subsequent research under a data usage agreement. Since the (automatic) annotation of demographic variables in parts of our dataset constitutes especially sensitive information on minority status in conjunction with mental health, we will only share these annotations with researchers who demonstrate a genuine need for them, i.e. to verify our results or to investigate certain research questions.
7 https://cdn.cms-twdigitalassets.com/content/dam/legal-twitter/site-assets/privacy-policy-new/Privacy-Policy-Terms-of-Service_EN.pdf
8 www.redditinc.com/policies/privacy-policy
Another important question is in which situations of encountering content indicative of a risk of self-harm or harm to others it would be appropriate or even required by duty of care for the research team to pass on information to authorities. Surprisingly, we could only find two mentions of this issue in social media research (O'Dea et al., 2015;Young and Garett, 2018). Acknowledging that suicidal ideation fluctuates (Prinstein et al., 2008), we follow the ethical review board's requirement in O'Dea et al. (2015) to only analyse content posted at least three months ago. If the research team, which includes clinical psychologists, still perceives users to be at risk, we will make use of the reporting facilities of Twitter and Reddit.
As a central component we consider the involvement of individuals with lived experience in our project, an aspect which is missing in the discussion of ethical social media health research so far. The proposal has been presented to an advisory board of individuals with a BD diagnosis and was received positively. The advisory board will be consulted at several stages of the project to inform the research design, analysis, and publication of results. We believe that board members can help to address several of the raised ethical problems, e.g., shaping the research questions to avoid feeding into existing biases or overexposing certain groups and highlighting potentially harmful interpretations and uses of our results.

Impact and conclusion
The importance of the recovery concept in the design of mental health services has recently been prominently reinforced, suggesting recovery-oriented social enterprises as a key component of the integrated service (van Os et al., 2019). We think that a recovery approach as a leading principle for national or global health service strategies should be informed by voices of individuals as diverse as those it is supposed to serve. Therefore, we expect the proposed investigations of views on recovery by previously under-researched ethnic, language, and cultural groups to yield valuable insights on the appropriateness of the recovery approach for a wider population. The datasets collected in this project can serve as useful resources for future research. More generally, our social media data-driven approach could be applied to investigate other areas of mental health if it proves successful in leading to relevant new insights.
Finally, this project is an interdisciplinary endeavour, combining clinical psychology, input from individuals with lived experience of BD, and computational linguistics. While this comes with the challenges of cross-disciplinary research, it has the potential to apply and develop state-of-the-art NLP methods in a way that is psychologically and ethically sound as well as informed and approved by affected people to increase our knowledge of severe mental illnesses such as BD.

Acknowledgments
I would like to thank my supervisors Steven Jones, Fiona Lobban, and Paul Rayson for their guidance in this project. My heartfelt thanks go also to Chris Lodge, service user researcher at the Spectrum Centre, and the members of the advisory panel he coordinates that offer feedback on this project based on their lived experience of BD. Further, I would like to thank Masoud Rouhizadeh for his helpful comments during pre-submission mentoring and the anonymous reviewers. This project is funded by the Faculty of Health and Medicine at Lancaster University as part of a doctoral scholarship.