News Sentiment and Cross-Country Fluctuations

What is the information content of news-based measures of sentiment? How are they related to economic fluctuations? I construct a sentiment index by measuring the net amount of positive expressions in the full corpus of Economic news articles produced by Reuters covering 12 countries over the period 1987-2013. The index successfully tracks fluctuations in GDP at the country level, is a leading indicator of GDP growth and contains information on future GDP growth which is not captured by consensus forecasts. This suggests that forecasters do not appropriately incorporate available information in predicting future states of the economy.

* The author gratefully acknowledges financial support from the IMF Research Department.

Introduction
To date, there is limited cross-country evidence for the role that sentiment might play in explaining aggregate economic fluctuations. In this paper, I show how measures of aggregate beliefs extracted from news articles are related to GDP growth. I build upon a recent and fast-growing literature which relates information extracted from text to economic and financial variables (Tetlock, 2007; Baker et al., 2016; Garcia, 2013). The approach commonly used in this literature measures sentiment using preexisting dictionaries. An alternative approach, employed in Choi and Varian (2012), uses Google search results to forecast near-term values of economic indicators.
I build my sentiment index by measuring the net amount of positive expressions in the collection of Economic news articles from Reuters covering 12 countries over the period 1987-2013. The index successfully tracks GDP growth over time and across countries.
Is sentiment a leading indicator of GDP growth? I estimate an autoregressive model of GDP growth to which I add news-based sentiment measures. Coefficients on news-based sentiment measures are jointly significant at the country level for 10 out of 12 countries in my sample. Sentiment variables reduce in-sample forecast errors of GDP growth by 9.1% on average across countries compared to an autoregressive process. This indicates that news sentiment is a leading indicator of GDP growth.
Do news-based sentiment measures simply aggregate other well-established leading indicators? I test whether news-based sentiment measures contain information which is not reflected in professional forecasters' expectations. I run predictive regressions of annual GDP growth on consensus forecast data at different forecasting horizons. I then add to the regressions my news sentiment index measured prior to the release of the consensus forecasts. Including sentiment reduces in-sample forecast errors by 19% on average across countries. News-based sentiment measures therefore contain information which is not included in forecasters' expectations. Reductions in forecast errors are larger for longer forecasting horizons, which reflects the fact that long-term forecasts are inherently hard. Reductions in forecast errors are also larger during bad times, which indicates that forecasters might be underreacting to bad news.

News-based Sentiment

Data Description
My dataset contains news articles extracted from Factiva.com, an online database which provides access to news archives from around the world. One can retrieve articles by querying a set of tags such as the source, the main topics and the locations associated with an article. A proprietary algorithm assigns topic and location tags to articles and is applied consistently across the database.

Text Processing
I combine dictionaries of positive and negative words compiled by Loughran and McDonald (2011) for financial texts and by Young and Soroka (2012) for political and economic texts. I search for inflections of each word in these dictionaries which are present in my corpus. Given a root tonal word (e.g. "lose"), I retrieve all the inflected words in the news corpus ("losing", "loser", "lost", "loss", etc.) and add them to the dictionaries. I check the relevance of the most frequent words and eliminate the ones which are irrelevant. My dictionary of positive words contains 3,527 items and the one with negative words contains 7,109 items. Table 2 shows that the most frequent positive and negative words indeed reflect the sentiment typically associated with economic and financial outcomes.
Here is an example of an article in which the main location tag is Argentina (in bold) and one of the topic tags is "Economic news" (in bold):
The dictionary-based approach is straightforward and transparent, yet some words are not properly classified. To improve accuracy, I normalize 373 negative forms such as "no longer", "neither", "not having", etc. to "not" as proposed in Young and Soroka (2012). I then build a second pair of lists of positive and negative expressions which appear preceded by a "not". A positive (negative) word preceded by a "not" is classified as negative (positive). Finally, I normalize 783 ambiguous expressions to correctly account for their tone. For instance, the expression "lose support" would be classified as neutral, so I normalize it to be counted as negative.
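The negation handling described above can be illustrated with a minimal Python sketch. The word lists below are tiny hypothetical stand-ins for the actual dictionaries, the function names are invented for illustration, and the normalization of ambiguous expressions is not shown. Here "pos" and "neg" count all tonal words, while "not_pos" and "not_neg" count the subset immediately preceded by a "not".

```python
import re

# Hypothetical toy lists; the dictionaries in the text contain thousands of entries.
POSITIVE = {"gain", "growing", "eased"}
NEGATIVE = {"crisis", "loss", "lose"}
# A few negation forms mapped to "not" (the text normalizes 373 such forms).
NEGATIONS = ["no longer", "not having", "neither", "never"]


def normalize_negations(text):
    """Lower-case the text and replace each negation form with 'not'."""
    text = text.lower()
    # Longest forms first, so multi-word forms are replaced as a whole.
    for form in sorted(NEGATIONS, key=len, reverse=True):
        text = re.sub(r"\b" + re.escape(form) + r"\b", "not", text)
    return text


def count_tonal_words(text):
    """Count tonal words and the subset directly preceded by 'not'."""
    tokens = re.findall(r"[a-z]+", normalize_negations(text))
    counts = {"total": len(tokens), "pos": 0, "neg": 0, "not_pos": 0, "not_neg": 0}
    for prev, word in zip([""] + tokens, tokens):
        negated = prev == "not"
        if word in POSITIVE:
            counts["pos"] += 1
            counts["not_pos"] += negated  # True counts as 1
        elif word in NEGATIVE:
            counts["neg"] += 1
            counts["not_neg"] += negated
    return counts


example = "Exports gain, but output is no longer growing and the crisis has not eased."
print(count_tonal_words(example))
```

In this toy sentence, "growing" and "eased" are both preceded by a normalized "not", so two of the three positive words would be reclassified as negative. A negation that is separated from the tonal word by other tokens is not caught by this simple adjacency rule.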

Sentiment Index
Using this classification of tonal expressions, a simple measure of sentiment is the difference between the fraction of positive expressions and the fraction of negative expressions in each article. This measure is unlikely to capture all the nuances of a text, but it is likely to give a good indication of how news tone varies across countries and over time.
Let $t_{ij}$ be the number of occurrences of word $i$ in article $j$. Let $n_{ij}$ ($p_{ij}$) be the number of occurrences of negative (positive) word $i$ in document $j$. Correspondingly, let $\bar{p}_{ij}$ ($\bar{n}_{ij}$) be the number of occurrences of negative (positive) word $i$ in document $j$ preceded by a "not".
The positivity of article $j$ is given by equation (1). In the numerator, the first term corresponds to the weighted sum of all the positive words. The second term corresponds to the weighted sum of negative words preceded by a "not". The last term corresponds to the weighted sum of positive words preceded by a "not".
Similarly, the negativity of article $j$ is given by equation (2), and the net positivity of article $j$ is given by equation (3).
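One reading of equations (1) to (3) that is consistent with the verbal description is sketched below. Several elements are assumptions rather than the author's stated formulas: the per-word weights $w_i$ (setting $w_i = 1$ gives plain counts), the bar notation for counts preceded by a "not", the minus sign on the last term of each numerator, the weighted word count in the denominator, and the symbol $s_j$ for net positivity.

```latex
\begin{align}
\pi_j &= \frac{\sum_i w_i\, p_{ij} + \sum_i w_i\, \bar{p}_{ij} - \sum_i w_i\, \bar{n}_{ij}}
              {\sum_i w_i\, t_{ij}} \tag{1} \\
\nu_j &= \frac{\sum_i w_i\, n_{ij} + \sum_i w_i\, \bar{n}_{ij} - \sum_i w_i\, \bar{p}_{ij}}
              {\sum_i w_i\, t_{ij}} \tag{2} \\
s_j   &= \pi_j - \nu_j \tag{3}
\end{align}
```

As a worked example under unit weights, take a 13-word article with three positive words, two of which are preceded by a "not", and one negative word that is not negated. Then $\pi_j = (3 + 0 - 2)/13 = 1/13$, $\nu_j = (1 + 2 - 0)/13 = 3/13$, and $s_j = -2/13 \approx -0.15$.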

Sentiment and GDP Growth
Figure (1) shows that my sentiment index successfully tracks fluctuations in GDP growth at the country level. The first natural question is whether or not sentiment is a leading indicator of GDP growth.

Granger Causality Tests
To answer this question, I estimate the autoregressive distributed lag (ADL) model described by equation (4), where $y_{t,c}$ is the log GDP growth between $t$ and $t+3$ months in country $c$ and $\epsilon_{t,c}$ is the error term. I first estimate an autoregressive process of GDP growth at a quarterly frequency and at the country level by choosing the number of lags p which minimizes the Akaike information criterion (AIC). I then add monthly lags of positive and negative sentiment (averaged at a monthly frequency), again choosing the number of lags q using the AIC. Table (3) shows that lags of negative sentiment are a leading indicator of GDP growth at the country level for 9 out of the 12 countries in my sample. Lags of positive sentiment are a leading indicator of GDP growth for half of the countries in my sample. This evidence is consistent with previous literature using news-based measures of sentiment, which finds that most of the textual information is contained in negative words (Loughran and McDonald, 2011). In the case of India, however, while I cannot reject the hypothesis that lags of negative sentiment are jointly equal to zero, I can reject the hypothesis that lags of positive sentiment are jointly equal to zero. This suggests that positive sentiment measures might also be worth considering as a leading indicator of GDP growth.
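A possible form of the ADL(p,q) model in equation (4), reconstructed from the description above, is sketched here. The coefficient names $\alpha_c$, $\beta_{i,c}$, $\gamma_{j,c}$ and $\delta_{j,c}$ are hypothetical, time is indexed in months so that the $i$-th quarterly lag of GDP growth is $y_{t-3i,c}$, and $\pi_{t-j,c}$ and $\nu_{t-j,c}$ denote monthly averages of positive and negative sentiment; the exact timing convention is an assumption.

```latex
\begin{equation}
y_{t,c} = \alpha_c
        + \sum_{i=1}^{p} \beta_{i,c}\, y_{t-3i,c}
        + \sum_{j=1}^{q} \gamma_{j,c}\, \pi_{t-j,c}
        + \sum_{j=1}^{q} \delta_{j,c}\, \nu_{t-j,c}
        + \epsilon_{t,c} \tag{4}
\end{equation}
```

Under this notation, the Granger causality test for positive sentiment corresponds to the joint null hypothesis $\gamma_{1,c} = \dots = \gamma_{q,c} = 0$, the test for negative sentiment to $\delta_{1,c} = \dots = \delta_{q,c} = 0$, and the combined test to both sets of restrictions holding at once.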

Figure (2) shows that on average across countries, in-sample forecast errors of next quarter GDP growth diminish by 9.1% when news-based sentiment measures are included in the ADL(p,q) model described by equation (4) compared to an AR(p) process.
Table (3): The ADL(p,q) model described by equation (4) is estimated at a quarterly frequency and at the country level. The numbers of lags p and q are chosen using the AIC criterion. I test for the joint significance of lags of positive sentiment π (columns 1 and 2), lags of negative sentiment ν (columns 3 and 4), and the union of lags of positive and negative sentiment (π, ν) (columns 5 and 6). For each test of joint significance, I report F-statistics and p-values. ** and * indicate that coefficients are jointly significantly different from zero at the 0.05 and 0.10 levels or better, respectively. News articles come from Factiva.com, GDP growth comes from the International Financial Statistics Database (IFS).
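The percentage reduction in in-sample forecast error can be computed from the root mean square errors (RMSE) of the two fitted models. The following is a minimal Python sketch with hypothetical function names and made-up numbers; it is not the author's code and does not reproduce the reported 9.1% figure.

```python
import math


def rmse(actual, fitted):
    """In-sample root mean square error of a fitted model."""
    if len(actual) != len(fitted) or not actual:
        raise ValueError("actual and fitted must be non-empty and of equal length")
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(actual, fitted)) / len(actual))


def rmse_reduction(actual, fitted_baseline, fitted_augmented):
    """Relative RMSE reduction of the augmented model versus the baseline."""
    base = rmse(actual, fitted_baseline)
    return (base - rmse(actual, fitted_augmented)) / base


# Illustrative values only: growth rates and fitted values from two models.
growth = [1.0, 2.0, 3.0, 4.0]
fitted_ar = [1.5, 1.5, 3.5, 3.5]    # baseline, e.g. an AR(p) fit
fitted_adl = [1.4, 1.6, 3.4, 3.6]   # augmented, e.g. an ADL(p,q) fit
print(f"{100 * rmse_reduction(growth, fitted_ar, fitted_adl):.1f}% reduction")
```

Because the augmented model nests the baseline, its in-sample RMSE cannot be larger when both are estimated by least squares on the same sample, so this measure says nothing about out-of-sample performance.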

News Sentiment and Consensus Forecast
Several aggregate time series (e.g. weekly initial jobless claims, monthly payroll employment, etc.) are well known for containing information to help measure current economic conditions (Aruoba et al., 2009). Does my sentiment index simply combine information already contained in these well-known leading indicators? Obtaining data on leading indicators of GDP growth across countries is challenging, but these leading indicators should presumably be included in professional forecasters' information set. Since 1989, Consensus Economics Inc. has provided a monthly survey of professional forecasters who are asked to forecast annual GDP growth across countries. For each realization of yearly GDP growth, the dataset contains GDP growth forecasts made by public and private economic institutions for each horizon h = 1, ..., 24 months. Fildes and Stekler (2002) show that survey-based consensus forecasts are most of the time more accurate than those generated by time series models. Another advantage of the forecasts produced by Consensus Economics is their common format for a large cross section of emerging market countries. If professional forecasters use all available information in producing their forecasts, the information contained in my news-based sentiment measures should not reduce the forecast errors of predictive regressions of GDP growth using consensus forecasts.
Predictive regressions of GDP growth using consensus forecasts and news-based sentiment measures are described by equation (5), where $y_{t,c}$ is the log GDP growth between $t$ and $t+12$ months in country $c$ and $\epsilon_{t,c}$ is the error term. First, I estimate predictive regressions of GDP growth on consensus forecasts at the country level for each horizon h = 1, ..., 24. All of the regressions' forecast errors are measured in sample by computing the regressions' root mean square errors (RMSE). Because sample sizes are small, estimating coefficients for each lagged measure of sentiment would lead to large standard errors. I instead include moving averages of my positive and negative sentiment measures (averaged at a monthly frequency); the moving average horizon q is chosen by minimizing the regressions' forecast errors. On average across countries and horizons, forecast errors diminish by 19% when news-based sentiment measures are included in predictive regressions of GDP growth on consensus forecasts. The top right panel of figure (3) shows that, on average across horizons, forecast errors diminish for each country in my sample. The top left panel shows that this reduction is larger for longer horizons: the average reduction in forecast error goes from 12% for horizons up to 12 months to 25% for horizons longer than 12 months. This evidence supports a model of information frictions where forecasters slowly incorporate textual information in forming their forecasts.
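A possible form of the predictive regression in equation (5), reconstructed from the description above, is sketched here. The notation is hypothetical: $F^{h}_{t,c}$ stands for the consensus forecast of $y_{t,c}$ made at horizon $h$, and $\bar{\pi}^{h,q}_{t,c}$ and $\bar{\nu}^{h,q}_{t,c}$ stand for $q$-month moving averages of positive and negative sentiment measured before that forecast is released. The coefficients are allowed to differ by country and horizon, which is an assumption based on the regressions being run at the country level for each horizon.

```latex
\begin{equation}
y_{t,c} = \alpha^{h}_{c}
        + \beta^{h}_{c}\, F^{h}_{t,c}
        + \gamma^{h}_{c}\, \bar{\pi}^{h,q}_{t,c}
        + \delta^{h}_{c}\, \bar{\nu}^{h,q}_{t,c}
        + \epsilon_{t,c} \tag{5}
\end{equation}
```

In this form, the baseline regression imposes $\gamma^{h}_{c} = \delta^{h}_{c} = 0$, and the reported reductions in forecast error compare the RMSE of the unrestricted regression with that of the baseline.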
It is well established that forecast errors tend to be larger during bad times. Does the reduction in forecast errors resulting from the inclusion of sentiment measures differentially improve forecasts of good and bad times? I fit an H-P filter to quarterly GDP growth time series at the country level (Hodrick and Prescott, 1997). Good (bad) times are defined to be the periods when the realized annual GDP growth is above (below) the trend produced by the H-P filter. I use the estimates of the model defined by equation (5) and I separately compute the forecast errors measured during good and bad times. The middle column of figure (3) presents forecast errors of good times and the right column presents forecast errors of bad times.
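The split into good and bad times can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's implementation: the function names are hypothetical, the smoothing parameter defaults to 1600 (the conventional value for quarterly data, which the text does not state), the trend is obtained by solving the H-P first-order conditions with a dense linear solve suitable only for short series, and each observation is compared with the trend at the same date, ignoring the mix of quarterly and annual frequencies.

```python
import math


def hp_trend(y, lam=1600.0):
    """H-P trend: solve (I + lam * D'D) tau = y, D = second-difference operator."""
    n = len(y)
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for r in range(n - 2):
        stencil = ((r, 1.0), (r + 1, -2.0), (r + 2, 1.0))
        for i, vi in stencil:
            for j, vj in stencil:
                a[i][j] += lam * vi * vj
    b = [float(v) for v in y]
    # Gaussian elimination without pivoting (the matrix is symmetric positive definite).
    for k in range(n):
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
            b[i] -= f * b[k]
    # Back substitution on the resulting upper-triangular system.
    tau = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * tau[j] for j in range(i + 1, n))
        tau[i] = (b[i] - tail) / a[i][i]
    return tau


def rmse_by_regime(actual, fitted, trend):
    """RMSE of (actual - fitted), split by whether actual is above the trend."""
    squared = {"good": [], "bad": []}
    for y, f, t in zip(actual, fitted, trend):
        # Observations exactly on the trend are counted as bad times here.
        squared["good" if y > t else "bad"].append((y - f) ** 2)
    return {
        regime: math.sqrt(sum(v) / len(v)) if v else float("nan")
        for regime, v in squared.items()
    }
```

Calling `rmse_by_regime(growth, fitted, hp_trend(growth))` once with the fitted values of the consensus-only regression and once with those of the regression including sentiment gives the regime-specific errors whose relative change is being compared.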
Forecast errors of good times diminish by 13% on average as a result of the inclusion of sentiment measures in equation (5).
Figure (3): Changes in forecast errors across countries and horizons when news-based sentiment measures are included in predictive regressions of GDP growth on consensus forecasts for GDP growth (see equation (5)). In the left column, forecast errors are averaged across countries. In the right column they are averaged across horizons. The left panel shows forecast errors during both good and bad times; the middle panel shows forecast errors during good times; the right panel shows forecast errors during bad times. Good and bad times are determined with respect to an HP filter estimated on quarterly GDP growth data. A period is considered to be a good (bad) time if annual GDP growth is above (below) the trend estimated by the HP filter. Error bars represent standard errors. News articles come from Factiva.com, GDP growth comes from the International Financial Statistics Database (IFS), consensus forecasts come from Consensus Economics, Inc.
If forecasters were simply slow to incorporate information but assigned correct weights when updating their forecasts, I should not observe a difference in changes in predictive accuracy between good and bad times. The fact that reductions in forecast errors are larger in bad times than in good times suggests that forecasters tend to underreact to negative information.

Conclusion and Future Work
This paper describes the information content of news-based measures of sentiment and their relationship to fluctuations in GDP growth. Sentiment measures track fluctuations in GDP, and I show that they are a leading indicator of GDP growth at the country level for 10 out of 12 countries in my sample. Sentiment measures contain information which is not accounted for by professional forecasters. News-based sentiment measures lead to a 19% average reduction in forecast error of GDP growth relative to consensus forecasts. Reductions in forecast errors are larger for longer forecasting horizons, which suggests that forecasters slowly incorporate textual information into their forecasts. Reductions in forecast errors are also larger during bad times, which indicates that forecasters tend to underreact to bad news.
From a policy perspective, news-based measures of sentiment provide a direct, real-time, automated and inexpensive measure of aggregate sentiment about current and future economic conditions, especially for countries for which official statistics might be sparse, inaccurate or noisy. As a result, they could help policy makers react in a more efficient manner to changes in economic conditions.