On the Impact of Twitter-based Health Campaigns: A Cross-Country Analysis of Movember

Health campaigns that aim to raise awareness and subsequently raise funds for research and treatment are commonplace. While many local campaigns exist, very few attract the attention of a global audience. One of those global campaigns is Movember, an annual campaign held during the month of November that is directed at men’s health, with special foci on cancer and mental health. Health campaigns routinely use social media portals to capture people’s attention. Recently, researchers began to consider to what extent social media is effective in raising the awareness of health campaigns. In this paper we expand on those works by conducting an investigation across four different countries, considering not only the impact on awareness but also the impact on fundraising. To that end, we analyze the 2013 Movember Twitter campaigns in Canada, Australia, the United Kingdom and the United States.


Introduction
The rise of social media portals (and thus access to vast amounts of user-generated data) has not gone unnoticed within the health care domain. Existing works have, amongst others, exploited social media data to track and predict the spread of diseases (Achrekar et al., 2011; Culotta, 2010; Chew and Eysenbach, 2010; Diaz-Aviles and Stewart, 2012), to analyse the effects of drug interactions (Segura-Bedmar et al., 2014), and to examine trends for cardiac arrest and resuscitation communication (Bosley et al., 2013).
Social media portals have also been employed to distribute health information on diseases and treatment options. Scanfeld et al. (2010) and Vance et al. (2009), for instance, have shown that effective dissemination of such information can be achieved through Twitter and YouTube. At the same time though, Moorhead et al. (2013) argue that social health communication research is still in its infancy and large gaps in our understanding remain.
While the usage of social media for health campaigns is ever-growing, very few works have considered how effective these campaigns are in achieving their goals. While Thackeray et al. (2013) and Bravo and Hoffman-Goetz (2015) investigated the change of people's awareness during social media health campaigns, to our knowledge no research so far has considered the second goal of many health campaigns: raising funds for research and treatment.
In this paper, we contribute to closing this gap, (1) by conducting an awareness-based large-scale analysis across several countries, and (2) by investigating the extent to which a global social-media based health campaign is successful in terms of fund-raising. We investigate the particular use case of Movember, an annual health campaign conducted (amongst others) through social media channels that has two goals 1 : (1) to gather "funding for the Movember Foundation's men's health programs", and, (2) to start "conversations about men's health". In both cases, the main foci are on various types of cancer that typically occur in men and on men's mental health.
Movember is a world-wide campaign that aims to raise funds through a number of social activities, chief among them the growing of a moustache in the month of November. Although a global event, the Movember campaigns are localized; each participating country runs its own campaign. In our analysis we focus on the four English-language local campaigns that yield the most donations via Twitter: the United States, the United Kingdom, Canada, and Australia 2 . Globally, Movember can be considered a success, as in 2013 alone (the year we investigate) funds in excess of 123 million AU$ were raised world-wide 3 .
In our work we investigate whether social media activities can explain the success of the campaign (both in terms of raising awareness and financially) by correlating Twitter usage with Movember website visits and received donations. We chose Twitter as our social media channel due to its popularity and ubiquitous nature in the English-speaking world. We investigate the differences and similarities between the Movember Twitter campaigns running in different countries, and aim to analyze to what extent those factors can explain awareness and fund-raising metrics.
In the remainder of this paper we first discuss previous findings concerning social media-based health campaigns (§2), before introducing the research hypotheses we focus on in this work and the necessary data sources (§3). Our results are discussed in §4. Lastly, we outline potential avenues for future work in §5.

Health Campaigns & Social Media
In this section we provide an overview of existing health campaign research across social media channels. Almost all research conducted in this area investigates the social media portal Twitter. An overview of the data employed in past works is presented in Table 1.
Thackeray et al. (2013) analyzed the impact of the Breast Cancer Awareness month (an international campaign held annually in October) on Twitter users. They focused on engagement metrics and found that tweets discussing breast cancer issues spiked dramatically in the beginning of October but quickly tapered off. In terms of topical aspects, organizations and celebrities posted more often than individuals about fundraisers, early detection and diagnoses, while individuals focused more on wearing pink 4 .
Similarly, a topic analysis was conducted by Bravo and Hoffman-Goetz (2015) on the 2013 Canadian Movember campaign. The authors categorized 4,222 sampled tweets related to the campaign into four different categories (health information, campaign, participation and opinion). Due to the small number of identified health information tweets in the sample (considered to be the main signal of increased awareness), the authors concluded that the goal of raising awareness had not been met.
Lovejoy et al. (2012) investigated how non-profit organizations use Twitter by analyzing more than 70 different organizations, among them 19 health care organizations, along various basic aspects including the number of followers, tweets, retweets, etc. Importantly, the authors found that most organizations use Twitter as a one-way communication channel instead of making full use of its potential for multi-way communication.
Smitko (2012) developed two theories of how non-profit organizations can build and strengthen their relationships with donors on Twitter: the Social Network Theory (SNT) and the Social-Judgement Theory (SJT). According to SNT, organizations need to strengthen their network of trust by engaging more with their followers, while in SJT, organizations need to tailor the content of their tweets to match the interests of their followers. Due to the small-scale nature of the empirical analysis (based on 300 tweets), we consider it an open question to what extent those theories hold.
While, to our knowledge, no existing work has considered the financial success of health campaigns, we note that Sylvester et al. (2014) studied the relationship between social media activities (on Twitter and news streams) and donations to a large non-profit organization during Hurricane Irene, a tropical cyclone that hit the US in 2011. A spatial analysis revealed that donors living in states affected by Irene donated more than donors in non-affected states.
To summarize, past works have shown that (i) various types of social media users behave differently during health campaigns (celebrities vs. individuals vs. organizations), and (ii) sufficient content related to health campaigns is created on Twitter. What we are lacking is a large-scale analysis of the impact these social media health campaigns have across countries and on fund-raising.

Tweets & Donations of Movember
One goal of our work is to establish whether we can explain donations the local Movember campaigns received through Twitter 5 . We are thus conducting an exploratory analysis on two distinct data sources:
Twitter Corpus TwMov: This corpus contains all tweets 6 published during the month of November 2013 that contain the keyword Movember (1,113,534 tweets in total, posted by 688,488 unique Twitter users across the world). Twenty-one local Movember campaign accounts are active, such as @MovemberUK, @MovemberAUS and @MovemberCA. To enable a country-by-country analysis, we estimated the country each tweet was sent from, according to the machine learning approach described by Van der Veen et al. (2015). In this manner, we were able to label all tweets in our data set with the (likely) country of origin. The approach has been shown to have a country-level accuracy above 80%, a level we consider sufficiently high for our purposes. In total, tweets from 125 different countries were found. The geographic distribution of these tweets is presented in Figure 1, normalized with respect to each country's population, to allow a comparison across countries. It is evident that the Movember campaign is most popular in North America, Australia and Europe. Most activity (relative to the population) is generated by Twitter users in the UK, followed by those in Canada. Thus, the four countries we focus our analysis on are not only among the most active in terms of fund-raising, but also among the most active in terms of Movember-related Twitter usage. This is a limiting factor to our work, but at the same time allows us to be certain that all of our Movember website visitors and donors were exposed to Twitter activities related to Movember.
5 Defined as donations received from users that clicked on a donation link on Twitter.
6 Twitter provided access to their firehose for this study.
Our data set has a single day resolution with all of the following information being available for each individual national campaign website: (1) the Table 3.
As already indicated, Movember is a social event; members of the campaign are called Mo Bros (men) and Mo Sistas (women). Every member can register on the Movember website and collect donations through that site (localized per country). Mo Bros and Mo Sistas can join to form teams and fund-raise together. While growing a moustache is the most common activity, Mo Bros and Mo Sistas can also use alternative social activities for fund-raising. At the end of the one-month campaign cycle, the teams and individuals raising the most donations within their country receive awards and prizes.

Research Hypotheses
Based on our research goal, we developed three research hypotheses:
H1: The more well-known Twitter users (celebrities and organizations) support a Movember campaign, the more awareness and funds the campaign will raise.
H2: Movember campaigns that emphasize the social and fun aspect of the campaign engage users better and thus will raise more awareness and funds.
H3: Movember campaigns that focus on health topics raise more awareness of the campaign and thus will raise more funds.
H2 and H3 are competing hypotheses, as prior works have not offered conclusive evidence to emphasize one direction (health vs. social) over another.

From Hypotheses to Measurements
Having presented the research hypotheses that guide our work, we now describe how to empirically measure to what extent they hold.
Based on the Movember data set, we can directly measure the impact on donations. At the same time though, we cannot directly measure awareness; we chose to approximate this metric by the number of visitors the Movember website receives.
To examine H1 we require a definition for what constitutes a well-known Twitter user (a "celebrity"). We start with the definition posed by Thackeray et al. (2013), according to which celebrities have more than f_USA = 100,000 followers and are verified by Twitter. As this definition was derived for tweets originating in the United States, we normalize f_Country according to the country's population and remove the requirement of being verified. Specifically, for the remaining three countries we employ the following cutoffs: f_Canada = 11,000, f_UK = 20,000, and f_Australia = 7,000.
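The population-based normalization of the follower cutoff can be illustrated with a small sketch. The population figures below are rounded 2013 estimates that we assume for illustration only (they are not taken from the paper), and rounding to the nearest thousand is likewise an assumption; the names `celebrity_threshold` and `is_celebrity` are hypothetical.

```python
# Sketch: scale the US celebrity cutoff by relative population.
# ASSUMPTION: approximate 2013 populations in millions (illustrative values).
POPULATION_MILLIONS = {"USA": 316, "Canada": 35, "UK": 64, "Australia": 23}

US_THRESHOLD = 100_000  # f_USA, following Thackeray et al. (2013)


def celebrity_threshold(country):
    """Return the follower cutoff f_Country, scaled by population.

    ASSUMPTION: the result is rounded to the nearest thousand.
    """
    ratio = POPULATION_MILLIONS[country] / POPULATION_MILLIONS["USA"]
    scaled = US_THRESHOLD * ratio
    return int(round(scaled / 1000.0)) * 1000


def is_celebrity(followers, country):
    """A user counts as well-known if they exceed the country's cutoff."""
    return followers > celebrity_threshold(country)


if __name__ == "__main__":
    for name in POPULATION_MILLIONS:
        print(name, celebrity_threshold(name))
```

Under these assumed population figures, the sketch yields the cutoffs stated above for Canada, the UK and Australia.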
To investigate the impact of health (related) organizations on Twitter, we define health organizations as those Twitter accounts with more than 5,000 followers and at least one of the following keywords in their Twitter profile (an approach borrowed from Thackeray et al. (2013)): {cancer, health, pharmacy, pharmaceutical, campaign, government, firm, company, companies, news, group, society, committee, volunteer, we, official, marketing, promotions, forum}. The overlap between both types of users (well-known vs. organizations) is between 2.2% (US) and 30.7% (Australia).
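The organization rule above can be sketched as a simple profile filter. The text does not say how keywords are matched against the profile; the sketch assumes case-insensitive whole-word matching (so that "we" does not match "tweets"), and the function name `is_health_organization` is hypothetical.

```python
import re

# Keywords listed in the text (approach borrowed from Thackeray et al., 2013).
ORG_KEYWORDS = {
    "cancer", "health", "pharmacy", "pharmaceutical", "campaign",
    "government", "firm", "company", "companies", "news", "group",
    "society", "committee", "volunteer", "we", "official", "marketing",
    "promotions", "forum",
}

MIN_ORG_FOLLOWERS = 5000  # accounts need MORE than this many followers


def is_health_organization(followers, profile_text):
    """Return True if the account satisfies the organization rule.

    ASSUMPTION: keywords are matched as whole, lower-cased words.
    """
    if followers <= MIN_ORG_FOLLOWERS:
        return False
    tokens = set(re.findall(r"[a-z]+", profile_text.lower()))
    return not ORG_KEYWORDS.isdisjoint(tokens)


if __name__ == "__main__":
    print(is_health_organization(12000, "Official account of a cancer charity"))
    print(is_health_organization(12000, "Dad, runner, tweets about football"))
```

Whole-word matching is only one possible reading; substring matching would label many more accounts as organizations.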

Manual Annotation Efforts
Hypotheses H2 and H3 require a content analysis of the Twitter messages. For this purpose, one of the authors manually annotated 2,000 randomly drawn English-language tweets (with 500 tweets each drawn from the UK, Canada, the United States and Australia) from TwMov into several categories, inspired by the work of Bravo and Hoffman-Goetz (2015). We distinguish five main categories: health, campaign, participation, social and other, with each one (except other) containing between two and three sub-categories (e.g. health tweets are further categorized as cancer, general and mental). Overall, we distinguish 12 different categories/sub-categories. Tweets can belong to multiple categories or sub-categories; tweets that are not found to belong to any of the first four categories are classified as other. An overview of the categories and the resulting annotations (including examples of categorized tweets) is shown in Table 2. Across all countries, we find the social aspect to be the most pronounced in our sample: 51% of the sampled tweets are categorized as such. Less than 5% of the tweets mention health issues and, even more strikingly, the second pillar of Movember's campaign (mental health) is almost completely absent in our sample. These results are largely in line with Bravo and Hoffman-Goetz (2015)'s findings for the Canadian Movember campaign, where cancer-related tweets were found in only 0.6% of the sample. This manual annotation effort not only serves as a confirmation of (Bravo and Hoffman-Goetz, 2015), it also shows that these findings hold across countries.

Automatic Classification
Due to the small number of manually annotated tweets in the individual sub-categories, we decided to automatically classify all tweets of TwMov according to the most opposing ends of the spectrum: health vs. social. This was done separately for each country. Concretely, we aim to classify each tweet into one of four categories: (1) health, (2) social, (3) health & social or (4) other. In order to add robustness to the classifier, we use the insights gained during the manual annotation process to enlarge our training set by automatically selecting additional positive training tweets. For the health classifier, tweets containing one of the following key phrases were used: {prostate, testicular, cancer, mental, health}. Similarly, for the social classifier, we relied on tweets containing at least one of: {gala, party, event, contest, competition, stach, handlebar, facial hair, shave, instagram, twitter.*photo} as positive training data. Recall that all tweets in our corpus contain the term Movember by definition, thus ensuring topicality. Overall, in this manner we labelled 406,709 tweets across all countries, consisting of 120,601 health tweets and 286,108 social tweets. A total of 35,489 tweets were identified as being both social and health-related. These simple rules have thus allowed us to categorize 36.5% of all tweets in TwMov; the remaining 63.5% of tweets are categorized according to our classifier output.
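The keyword rules above can be sketched as follows. The matching procedure is not spelled out in the text; the sketch assumes case-insensitive substring matching, treating each key phrase as a regular expression (which the entry twitter.*photo suggests). The name `seed_label` is hypothetical.

```python
import re

# Key phrases from the text, interpreted as regular expressions.
HEALTH_PATTERNS = ["prostate", "testicular", "cancer", "mental", "health"]
SOCIAL_PATTERNS = [
    "gala", "party", "event", "contest", "competition", "stach",
    "handlebar", "facial hair", "shave", "instagram", r"twitter.*photo",
]

# ASSUMPTION: case-insensitive substring search over the raw tweet text.
HEALTH_RE = re.compile("|".join(HEALTH_PATTERNS), re.IGNORECASE)
SOCIAL_RE = re.compile("|".join(SOCIAL_PATTERNS), re.IGNORECASE)


def seed_label(tweet_text):
    """Return the rule-based label for a tweet, or None if no rule fires."""
    is_health = bool(HEALTH_RE.search(tweet_text))
    is_social = bool(SOCIAL_RE.search(tweet_text))
    if is_health and is_social:
        return "health & social"
    if is_health:
        return "health"
    if is_social:
        return "social"
    return None  # left for the trained classifier


if __name__ == "__main__":
    print(seed_label("Get checked for prostate cancer #Movember"))
    print(seed_label("Movember gala tonight!"))
```

Substring matching is deliberately permissive: "stach" covers both "stache" and "moustache", but "event" would also fire on "prevent", so a stricter tokenization may be preferable in practice.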
We train separate classifiers for each country. We randomly draw 5,000 labelled health (social) tweets as positive training examples of the health (social) classifier. We draw the same number of non-health (non-social) tweets as negative training examples for balanced training 8 . We performed basic data cleaning steps, removing stopwords (which in this case includes the term "Movember") and employing stemming. As the classification algorithm we selected Naïve Bayes with terms as features 9 . We assigned each tweet in TwMov to zero, one or both categories (health/social) depending on the confidence threshold of the individual classifier (a tweet classified with confidence ≥ 0.5 is assigned to the classifier's category).
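A minimal sketch of this setup is given below, assuming a multinomial Naïve Bayes model with add-one smoothing; the exact variant, stopword list and stemmer are not specified in the text, so these choices (and the omission of stemming) are assumptions. All class and function names are hypothetical.

```python
import math
import re
from collections import Counter

# ASSUMPTION: tiny illustrative stopword list; includes "movember" as in the text.
STOPWORDS = {"the", "a", "an", "and", "for", "to", "of", "in", "movember"}


def tokenize(text):
    """Lower-case, split into terms, drop stopwords (stemming omitted)."""
    return [t for t in re.findall(r"[a-z0-9']+", text.lower())
            if t not in STOPWORDS]


class BinaryNaiveBayes:
    """Multinomial Naive Bayes over terms, with add-one smoothing."""

    def fit(self, positives, negatives):
        self.counts = {1: Counter(), 0: Counter()}
        for label, docs in ((1, positives), (0, negatives)):
            for doc in docs:
                self.counts[label].update(tokenize(doc))
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        self.totals = {c: sum(self.counts[c].values()) for c in (0, 1)}
        n_docs = len(positives) + len(negatives)
        self.log_prior = {1: math.log(len(positives) / n_docs),
                          0: math.log(len(negatives) / n_docs)}
        return self

    def _log_joint(self, tokens, label):
        score = self.log_prior[label]
        denom = self.totals[label] + len(self.vocab)
        for tok in tokens:
            if tok in self.vocab:  # unseen terms are ignored
                score += math.log((self.counts[label][tok] + 1) / denom)
        return score

    def predict_proba(self, text):
        """Posterior probability of the positive class."""
        tokens = tokenize(text)
        pos, neg = self._log_joint(tokens, 1), self._log_joint(tokens, 0)
        top = max(pos, neg)  # subtract the max for numerical stability
        e_pos, e_neg = math.exp(pos - top), math.exp(neg - top)
        return e_pos / (e_pos + e_neg)


def assign_categories(text, health_clf, social_clf, threshold=0.5):
    """Assign a tweet to zero, one or both categories."""
    labels = set()
    if health_clf.predict_proba(text) >= threshold:
        labels.add("health")
    if social_clf.predict_proba(text) >= threshold:
        labels.add("social")
    return labels
```

In such a setup, one `BinaryNaiveBayes` instance would be fitted per category and per country on balanced positive and negative samples, and `assign_categories` would then be applied to every remaining tweet.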

Results
To determine the influence on the number of donations and visitors, we correlate (using Pearson's correlation coefficient r) the Twitter-based metrics (e.g. number of tweets) with the donation and visitor data from the Movember data set on a day-by-day basis for the month of November.
Further Insights
In Figure 2 we visualize the relationship between the number of visitors/donations and the number of health/social tweets in the form of scatter plots. While the visitor data shows few outliers (corresponding to the first and last day of the campaign) and has a clear linear trend, the donation plot is evidently nonlinear without a clear pattern emerging. Finally, in Figure 3, we plot, using the United Kingdom as an example, the overall trends in the number of tweets, the number of Movember visitors and the number of Movember donations between the end of October 2013 and early December 2013. We observe that over time, the overall tweet volume declines slightly (apart from the final day of the campaign), while the number of visitors and the number of donations are in a reverse relationship: the number of visitors steadily declines over the month of the campaign while the number of donations steadily increases. Twitter activity related to Movember quickly ceases after the end of November.
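The day-by-day correlation used in this section can be sketched as follows. The significance cutoffs are the ones quoted in the Table 5 caption for N = 30 days; the function names and the string labels returned are hypothetical, and the input series are assumed to be aligned lists of daily counts.

```python
import math

# Thresholds quoted in the Table 5 caption (valid for N = 30 days only).
R_P05 = 0.37  # |r| at or above this: p < 0.05
R_P01 = 0.47  # |r| at or above this: p < 0.01


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two aligned daily series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def significance_label(r):
    """Map r to the significance level, using the N = 30 thresholds."""
    if abs(r) >= R_P01:
        return "p<0.01"
    if abs(r) >= R_P05:
        return "p<0.05"
    return "n.s."


if __name__ == "__main__":
    # ASSUMPTION: made-up toy counts, not data from the study.
    tweets_per_day = [10, 20, 30, 40]
    visitors_per_day = [15, 22, 41, 48]
    r = pearson_r(tweets_per_day, visitors_per_day)
    print(round(r, 3))
```

Since Pearson's r only captures linear association, it suits the visitor data better than the donation data, for which the scatter plots show no clear linear pattern.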

Conclusions
In this paper, we investigated the impact of different social media strategies on a health campaign's ability to raise awareness and attract funds. We investigated the specific use case of Movember, a global campaign which enjoys widespread popularity in many countries. We focused our analyses on the four most active English-language countries of the Movember campaign. Our findings partially corroborate previous findings on raising awareness, especially those in (Bravo and Hoffman-Goetz, 2015), while expanding on them across several dimensions, most importantly the number of countries investigated and the size of the investigated social media sample. We find that across countries Twitter users mostly focus on the social aspect of the Movember campaign, with relatively few tweets focusing on the health aspect of Movember. Additionally, those users that do mention health-related issues often use generic statements, instead of focusing on the two specific health issues that Movember aims to address (cancer and mental health). Surprisingly, the mental health aspect of Movember is hardly discussed at all.
Table 5: Overview of the number of tweets classified according to their health and/or social intent as well as their correlation (day-by-day) with Movember donation and visitor data. The thresholds for statistical significance (for N = 30 days) are † r = 0.37 (p < 0.05) and ‡ r = 0.47 (p < 0.01) respectively.
To explore the impact of social media strategies on awareness and fund-raising, we analysed the relationship between Movember website visitor and donation data and Twitter activities. We found significant correlations between Movember visitors and the Movember-related activities of well-known Twitter users. We also found clear evidence that social tweets have a higher impact on visitors than health tweets. While the observed correlations were moderate to strong for the United Kingdom and Australia, we only found weak to non-significant correlations for Canada and the United States. Across all countries, we did not find significant correlations between donations and Twitter activities.
Based on these findings, we plan to investigate on a more fine-grained and semantic level in what aspects the Twitter-based Movember activities differ between Australia/UK and Canada/US. We will also consider a temporal analysis of the donation/visitor data, comparing trends across several years of Movember donation data and Twitter activities. We also intend to incorporate more fine-grained information about the Twitter users in our analyses, such as their motivations to participate in the campaign (Nguyen et al., 2015).