Identifying and Tracking Sentiments and Topics from Social Media Texts during Natural Disasters

We study the problem of identifying topics and sentiments in social media texts and tracking their shifts across different geographical regions during emergencies and disasters. We propose a location-based dynamic sentiment-topic model (LDST) which jointly models topic, sentiment, time and geolocation information. The experimental results demonstrate that LDST performs very well at discovering topics and sentiments from social media and tracking their shifts in different geographical regions during emergencies and disasters. We will release the data and source code after this work is published.


Introduction
Social media has become pervasive in our daily life, and it is a great way to spread important information efficiently. Using social media (e.g., Twitter, Facebook, Pinterest), people can conveniently inform others and express support during emergencies and disasters.
Social media now continuously produces a huge amount of information. People are no longer interested only in identifying static topics and sentiments from given texts; they are increasingly concerned with tracking how topics and sentiments evolve across different geographical regions. On the one hand, this capability can be very helpful in emergencies, for instance in pre-disaster preparation and post-disaster relief during local natural disasters (Beigi et al., 2016). For example, people's sentiments on topics related to medical care and rescue may guide the management and distribution of emergency supplies. On the other hand, existing models do not take into account the temporal evolution of topics and sentiments or the impact of location on them, so they cannot track how topics and sentiments evolve in different geographical places.
In this paper, we aim to identify the topics and sentiments and track their shifts in different geographical regions during emergencies and disasters. We are inspired by several observations. First, people are interested not only in the overall sentiment or topic distribution of the documents but also in the sentiments towards specific topics. For example, a person may be happy that the disaster has passed but, at the same time, unsatisfied with the post-disaster relief. Second, most existing sentiment-topic models ignore the temporal evolution of topics and sentiments in a time-variant corpus such as the Twitter stream. There is strong evidence that people's attitudes toward a disaster gradually change over time with the distribution of emergency supplies (Beigi et al., 2016; Caragea et al., 2014; Mandel et al., 2012). Third, people in different places tend to have different opinions towards particular topics. This motivates us to study the influence of specific topics and the relationships between topics in different regions, which may improve people's awareness and ability to help themselves during disasters.
We propose a location-based dynamic sentiment-topic model (LDST) which generalizes latent Dirichlet allocation (LDA) (Blei et al., 2003), by jointly modeling topic, sentiment, time and geolocation information. After learning the LDST model, we can identify the topics and sentiments held by people in different locations over time. Our model works in an unsupervised way, and we learn the model according to the frequency of terms co-occurring in different contexts. To leverage the prior knowledge, we construct a small set of seed words for each topic of interest to enable the model to group semantically related terms into the same topic. Consequently, the topic words will be more related to the seed words of the same topic.
We conduct experiments using a Hurricane Sandy Twitter corpus which consists of 159,880 geotagged Twitter posts from the geographic area and time period of the 2012 Hurricane Sandy. We show how people's topics and sentiments evolve during Hurricane Sandy, changing not only with the time at which the disaster unfolds but also with people's locations.

Related work
Sentiment analysis is widely applied in many fields, such as business intelligence, politics, and sociology. Pang and Lee (2008) and Liu (2012) describe most of the existing techniques for sentiment analysis and opinion mining, which can be categorized into lexicon-based approaches (Kennedy and Inkpen, 2006; Turney, 2002; Yang et al., 2014a,b) and corpus-based approaches (Pang et al., 2002; Yang et al., 2015; Wan, 2009).
Recently, researchers have turned their attention to sentiment analysis of the social media posts of individuals during natural disasters and emergencies (Beigi et al., 2016; Buscaldi and Hernandez-Farias, 2015; Caragea et al., 2014; Kryvasheyeu et al., 2015; Mandel et al., 2012; Shalunts et al., 2014). For example, Buscaldi and Hernandez-Farias (2015) applied a sentiment analysis system for Italian to a set of tweets posted during the Genoa flooding. They attempted to identify trending topics, toponyms and sentiments that might be relevant from a disaster management perspective. However, the existing studies focus only on document-level sentiment analysis, without considering the specific topics in the document.
Meanwhile, topic models, such as LDA (Blei et al., 2003), have become popular for extracting interesting topics. Some recent work incorporates context information into LDA, such as time (Wang and McCallum, 2006; Zhao et al., 2014) and authorship (Steyvers et al., 2004; Yang et al., 2016), to make topic models fit expectations better. Some studies also attempt to detect sentiment and topic simultaneously from documents (Dermouche et al., 2015; Mukherjee et al., 2014). Nevertheless, none of the existing methods takes advantage of temporal and geographical information to identify and track people's topics and sentiments during emergencies and disasters.

Model
In this section, we first introduce the generative process of the LDST model. We then present the inference algorithm for estimating the model parameters.

Model Description
We assume the corpus consists of a set of authors, a set of locations, and a collection of documents with specific timestamps. Formally, we use V and U to denote the sets of locations and authors, respectively. A document d ∈ D is a short text written by an author u ∈ U in location v ∈ V at time t. Also, let S be the number of distinct sentiment labels, and T be the total number of topics, where S and T are predefined constants. Since each tweet is a short text, studying tweets individually is not very informative. We thus use pooling methods to construct aggregated documents for each location or each author. For a location (venue) v, we use A_v to denote the set of all authors that have written documents in location v, and d_v (location document) to refer to the union of all the documents written in location v. N_{d_v} is the number of words in location document d_v. Figure 1 shows the graphical model of LDST. The formal definition of the generative process of LDST is as follows:
1. For each sentiment label s,
a. For each topic z under sentiment s,
- Draw a word distribution φ_{s,z} ~ Dir(β_{s,z}).

For each author
In the model, α, β, ρ, σ, τ and γ are hyperparameters. The latent sentiments and topics depend on the document venues and personalities of the author. We use ω to control the influence from the venue and the author. In particular, ω is the parameter of a Bernoulli distribution, from which a binary variable c is generated to determine whether the document is influenced by venue or user.
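The role of the switch variable c can be illustrated with a minimal sketch of drawing a single token. This is only an illustration under stated assumptions, not the authors' implementation: the function name draw_word, the convention that c = 1 selects the venue and c = 0 the author, and the toy distributions are all hypothetical, and the time variable ψ is left out.

```python
import random


def draw_word(omega, chi_author, chi_venue, theta_author, theta_venue, phi, rng):
    """Draw (c, sentiment, topic, word) for one token.

    Assumption (not stated in the text): c = 1 means the venue drives the
    token and c = 0 means the author does. The time variable is omitted.
    chi_*   : list of sentiment probabilities
    theta_* : theta[s] is a list of topic probabilities under sentiment s
    phi     : phi[s][z] is a dict mapping word -> probability
    """
    # c ~ Bernoulli(omega) decides which source influences the token.
    c = 1 if rng.random() < omega else 0
    chi = chi_venue if c == 1 else chi_author
    theta = theta_venue if c == 1 else theta_author

    # Sentiment from the selected sentiment distribution.
    s = rng.choices(range(len(chi)), weights=chi)[0]
    # Topic from the selected topic distribution under sentiment s.
    z = rng.choices(range(len(theta[s])), weights=theta[s])[0]
    # Word from the sentiment-topic word distribution phi[s][z].
    words = list(phi[s][z])
    w = rng.choices(words, weights=[phi[s][z][x] for x in words])[0]
    return c, s, z, w


# Toy, hand-made distributions (hypothetical values for illustration only).
phi = [[{"a": 1.0}, {"b": 1.0}], [{"c": 1.0}, {"flood": 1.0}]]
chi_author, chi_venue = [1.0, 0.0], [0.0, 1.0]
theta_author = [[1.0, 0.0], [1.0, 0.0]]
theta_venue = [[1.0, 0.0], [0.0, 1.0]]
rng = random.Random(0)
venue_draw = draw_word(1.0, chi_author, chi_venue, theta_author, theta_venue, phi, rng)
author_draw = draw_word(0.0, chi_author, chi_venue, theta_author, theta_venue, phi, rng)
```

With ω = 1 every token is attributed to the venue, and with ω = 0 every token is attributed to the author; intermediate values mix the two sources.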

Inference Algorithm
We use collapsed Gibbs sampling (Porteous et al., 2008) to estimate the unknown latent variables {φ, ω, θ^a, θ^v, χ^a, χ^v, ψ}. The posterior distribution of the hidden variables for each word w_{d_v,i} (the i-th word in venue document d_v) is given by Equation (1) (to simplify, we use Θ to refer to s, z), where n_{u,c}(0) and n_{u,c}(1) are the numbers of times that c = 0 and c = 1 are sampled for user u, respectively, and n_{u,c} = n_{u,c}(0) + n_{u,c}(1). n_{u,s} is the number of times that sentiment s is sampled from the distribution χ^a_u specific to user u, and n_{v,s} is the number of times that sentiment s is sampled from the distribution χ^v_{d_v} specific to document venue v. n_{u,s,z} is the number of times that topic z is sampled from the distribution θ^a_{u,s} specific to user u and sentiment s, and n_{v,s,z} is the number of times that topic z is sampled from the distribution θ^v_{d_v,s} specific to document venue v and sentiment s. n_{s,z,w} is the number of times that word w is sampled from the distribution φ_{s,z} specific to sentiment s and topic z. The superscript −(d_v, i) denotes a quantity that excludes the current word w_{d_v,i}.

Defining the Prior Knowledge
In our model, prior knowledge is employed to guide the generative process of topics. The prior knowledge can be obtained from natural disaster experts. We collect a small set of seed words for each topic of interest during emergencies and disasters. For each topic, the LDST model draws the word distribution from a biased Dirichlet prior Dir(β). Each vector β_{·,z} ∈ R^V is constructed from the sets of seed words, where
β_{·,z,w} = λ_1 (1 − Λ_{·,z,w}) + λ_2 Λ_{·,z,w}.
Here, Λ_{·,z,w} = 1 if and only if word w is a seed word for topic z; otherwise Λ_{·,z,w} = 0. The scalars λ_1 and λ_2 are hyperparameters. Intuitively, when λ_1 < λ_2, the biased prior makes the seed words more likely to be drawn from their associated topic.
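The construction of the biased prior can be sketched in a few lines. The sketch assumes that a seed word receives weight λ_2 and every other word receives weight λ_1 in the prior vector of its topic, which matches the description of Λ and the condition λ_1 < λ_2; the function name build_biased_prior and the example seed words are hypothetical, not taken from the authors' seed lists.

```python
def build_biased_prior(vocab, seed_words_by_topic, num_topics, lam1=0.1, lam2=0.8):
    """Return one Dirichlet prior vector per topic.

    Assumption: beta[z][w] = lam2 if w is a seed word of topic z, else lam1.
    Topics without seed words get a symmetric prior with value lam1.
    """
    prior = []
    for z in range(num_topics):
        seeds = seed_words_by_topic.get(z, set())
        # Lambda is 1 for seed words of topic z and 0 otherwise.
        prior.append([lam2 if w in seeds else lam1 for w in vocab])
    return prior


# Hypothetical vocabulary and seed words, for illustration only.
vocab = ["power", "outage", "water", "shelter"]
seeds = {0: {"power", "outage"}, 1: {"shelter"}}
beta = build_biased_prior(vocab, seeds, num_topics=3)
```

Here topic 0 favors "power" and "outage", topic 1 favors "shelter", and topic 2 has no seed words and therefore keeps a uniform prior.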

Hurricane Sandy Twitter Datasets
This dataset contains nearly 15 million tweets posted on Twitter while Hurricane Sandy was hitting the United States. Tweets were collected from October 25, 2012 to November 4, 2012, using the keywords 'hurricane' and 'sandy' (Zubiaga and Ji, 2014). In this paper, we keep only the geotagged tweets. The final experimental dataset consists of 159,880 geotagged tweets. The original geographical information is expressed as longitude and latitude in decimal degrees. We map each coordinate pair to a state using the Google Maps Geocoding API (see footnote 2), so that the location granularity is the state, and analyze the tweets within the United States.
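The filtering and state-level pooling described above can be sketched as follows. This is a minimal illustration under assumptions: the record fields (user, text, lat, lon, state) are hypothetical, the state is assumed to have been resolved from the coordinates beforehand (the geocoding call itself is not shown), and whitespace tokenization stands in for whatever preprocessing was actually used.

```python
from collections import defaultdict


def pool_by_state(tweets):
    """Build one pooled location document and one author set per state.

    Assumption: each tweet is a dict with hypothetical keys
    'user', 'text', 'lat', 'lon' and a pre-resolved 'state'.
    """
    docs = defaultdict(list)
    authors = defaultdict(set)
    for t in tweets:
        # Keep only geotagged tweets.
        if t.get("lat") is None or t.get("lon") is None:
            continue
        # Skip tweets whose coordinates did not resolve to a state.
        state = t.get("state")
        if state is None:
            continue
        # Naive whitespace tokenization, used here only for illustration.
        docs[state].extend(t["text"].lower().split())
        authors[state].add(t["user"])
    return dict(docs), dict(authors)


# Made-up records, for illustration only.
tweets = [
    {"user": "u1", "text": "Power is out", "lat": 40.7, "lon": -74.0, "state": "NY"},
    {"user": "u2", "text": "Stay safe", "lat": None, "lon": None, "state": None},
    {"user": "u3", "text": "Need shelter", "lat": 40.0, "lon": -74.5, "state": "NJ"},
    {"user": "u1", "text": "Still dark", "lat": 40.8, "lon": -73.9, "state": "NY"},
]
docs, authors = pool_by_state(tweets)
```

Each entry of docs plays the role of a location document d_v, and each entry of authors plays the role of the author set A_v.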

Baseline Methods
We evaluate and compare our model with the following baseline methods:
LDA: We use the gensim toolkit to perform inference for the LDA model (Blei et al., 2003).
ToT: Topics over Time, a non-Markov continuous-time model proposed by Wang and McCallum (2006).
JST: The first Joint Sentiment-Topic model for identifying sentiment-topic pairs (Lin and He, 2009).
TS: The Topic-Sentiment model proposed by Dermouche et al. (2015).
LDST-w/oS: The LDST model without prior knowledge (seed words). We use this method to evaluate the influence of seed words.
2 https://developers.google.com/maps/

Implementation details
In our implementation, we set the number of topics to T = 50 and the prior hyperparameters to γ = 0.5, τ = σ = 0.1, and ρ = α = 50/T. β_{s,z} is calculated from the set of seed words with λ_1 = 0.1 and λ_2 = 0.8. As described in Section 3.3, we use a small set of seed words as our topic prior knowledge. Specifically, the seed word list contains 5 to 10 seed words for each of the five topics of interest: hurricane impact, public utility, food, shelter, and medical. We choose these five topics based on the actual requirements of our project. However, it is important to note that the model is flexible and does not need seed words for every topic.

Quantitative evaluation
We first compare our model with the baseline models in terms of perplexity, which is a widely used measure of how well a probability model predicts a sample. The lower the perplexity, the better the model. We calculate the average perplexity (log-likelihood) using 1000 held-out documents which are randomly selected from the test data. The average per-word test perplexity is calculated as exp{−(1/N) Σ_w log p(w)}, where N is the total number of words in the held-out test documents. Table 1 shows the perplexity results for the Hurricane Sandy dataset. Our model outperforms the baseline models. In particular, the perplexity of our model is 1122, which is 40 and 116 lower than that of JST and TS, respectively. The perplexity of LDST is 35 lower than that of LDST-w/oS, which indicates that the seed words further improve the performance of our model.
Following the same evaluation as in prior work, we also present and discuss the experimental results of sentiment classification. The document sentiment is classified based on the probability of a sentiment label given the document, which can be approximated by χ^a_u and χ^v_{d_v}. Similar to prior work, we only consider the probabilities of the positive and negative labels given the document, ignoring the probability of the neutral label. A document d is classified as a positive-sentiment document if its probability of the positive sentiment label is greater than its probability of the negative sentiment label, and vice versa. The ground-truth sentiment labels of tweets are obtained by human annotation. Specifically, we randomly select 1000 documents from the dataset and manually label each document as positive, negative or neutral. We measure the performance of our model using the tweets with positive or negative labels. The classification accuracies are summarized in Table 2. LDST
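The two evaluation rules above can be written down directly. The following is a minimal sketch: the function names are hypothetical, the per-word log-probabilities are assumed to come from an already trained model, and breaking ties towards the negative label is an assumption made only so that the function is total.

```python
import math


def perplexity(word_log_probs):
    """Per-word perplexity exp(-(1/N) * sum of log p(w)) over held-out words."""
    n = len(word_log_probs)
    if n == 0:
        raise ValueError("need at least one held-out word")
    return math.exp(-sum(word_log_probs) / n)


def classify_sentiment(p_pos, p_neg):
    """Label a document from its positive and negative label probabilities.

    The neutral probability is ignored. Assumption: ties go to 'negative'.
    """
    return "positive" if p_pos > p_neg else "negative"


# If every held-out word has probability 1/4, the perplexity is 4.
toy_perplexity = perplexity([math.log(0.25)] * 4)
toy_label = classify_sentiment(0.6, 0.3)
```

A model that assigns each held-out word probability 1/k has perplexity k, which is why lower values indicate a better fit.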

Conclusion
In this paper, we propose a location-based dynamic sentiment-topic (LDST) model, which can jointly model sentiment, topic, temporal, and geolocation information.