Cultural Shift or Linguistic Drift? Comparing Two Computational Measures of Semantic Change

Words shift in meaning for many reasons, including cultural factors like new technologies and regular linguistic processes like subjectification. Understanding the evolution of language and culture requires disentangling these underlying causes. Here we show how two different distributional measures can be used to detect two different types of semantic change. The first measure, which has been used in many previous works, analyzes global shifts in a word's distributional semantics; it is sensitive to changes due to regular processes of linguistic drift, such as the semantic generalization of promise ("I promise."->"It promised to be exciting."). The second measure, which we develop here, focuses on local changes to a word's nearest semantic neighbors; it is more sensitive to cultural shifts, such as the change in the meaning of cell ("prison cell"->"cell phone"). Comparing measurements made by these two methods allows researchers to determine whether changes are more cultural or linguistic in nature, a distinction that is essential for work in the digital humanities and historical linguistics.


Introduction
Distributional methods of embedding words in vector spaces according to their co-occurrence statistics are a promising new tool for diachronic semantics (Gulordava and Baroni, 2011; Jatowt and Duh, 2014; Kulkarni et al., 2014; Xu and Kemp, 2015; Hamilton et al., 2016). Previous work, however, does not consider the underlying causes of semantic change or how to disentangle different types of change.
We show how two computational measures can be used to distinguish between semantic changes caused by cultural shifts (e.g., technological advancements) and those caused by more regular processes of semantic change (e.g., grammaticalization or subjectification). This distinction is essential for research on linguistic and cultural evolution. Detecting cultural shifts in language use is crucial to computational studies of history and other digital humanities projects. By contrast, for advancing historical linguistics, cultural shifts amount to noise and only the more regular shifts matter.
Our work builds on two intuitions: that distributional models can highlight syntagmatic versus paradigmatic relations with neighboring words (Schutze and Pedersen, 1993) and that nouns are more likely to undergo changes due to irregular cultural shifts while verbs more readily participate in regular processes of semantic change (Gentner and France, 1988; Traugott and Dasher, 2001). We use this noun vs. verb mapping as a proxy to compare our two measures' sensitivities to cultural vs. linguistic shifts. Sensitivity to nominal shifts indicates a propensity to capture irregular cultural shifts in language, such as those due to technological advancements (Traugott and Dasher, 2001). Sensitivity to shifts in verbs (and other predicates) indicates a propensity to capture regular processes of linguistic drift (Gentner and France, 1988; Kintsch, 2000; Traugott and Dasher, 2001).
The first measure we analyze is based upon changes to a word's local semantic neighborhood; we show that it is more sensitive to changes in the nominal domain and captures changes due to unpredictable cultural shifts. Our second measure relies on a more traditional global notion of change; we show that it better captures changes, like those in verbs, that are the result of regular linguistic drift. Our analysis relies on a large-scale statistical study of six historical corpora in multiple languages, along with case-studies that illustrate the fine-grained differences between the two measures.
Figure 1: With the global measure of change, we measure how far a word has moved in semantic space between two time-periods. This measure is sensitive to subtle shifts in usage and also global effects due to the entire semantic space shifting. For example, this captures how actually underwent subjectification during the 20th century, shifting from uses in objective statements about the world ("actually did try") to subjective statements of attitude ("I actually agree"; see Traugott and Dasher, 2001 for details). In contrast, with the local neighborhood measure of change, we measure changes in a word's nearest neighbors, which captures drastic shifts in core meaning, such as gay's shift in meaning over the 20th century.

Methods
We use the diachronic word2vec embeddings constructed in our previous work (Hamilton et al., 2016) to measure how word meanings change between consecutive decades. In these representations each word $w_i$ has a vector representation $\mathbf{w}_i^{(t)}$ (Turney and Pantel, 2010) at each time point, which captures its co-occurrence statistics for that time period. The vectors are constructed using the skip-gram with negative sampling (SGNS) algorithm (Mikolov et al., 2013) and post-processed to align the semantic spaces between years. Measuring the distance between word vectors for consecutive decades allows us to compute the rate at which the different words change in meaning (Gulordava and Baroni, 2011).
We analyzed the decades from 1800 to 1990 using vectors derived from the Google N-gram datasets (Lin et al., 2012) that have large amounts of historical text (English, French, German, and English Fiction). We also used vectors derived from the Corpus of Historical American English (COHA), which is smaller than Google N-grams but was carefully constructed to be genre balanced and contains word lemmas as well as surface forms (Davies, 2010). We examined all decades from 1850 through 2000 using the COHA dataset and used the part-of-speech tags provided with the corpora.

Measuring semantic change
We examine two different ways to measure semantic change (Figure 1).

Global measure
The first measure analyzes global shifts in a word's vector semantics and is identical to the measure used in most previous works (Gulordava and Baroni, 2011; Jatowt and Duh, 2014; Kim et al., 2014; Hamilton et al., 2016). We simply take a word's vectors for two consecutive decades and measure the cosine distance between them, i.e.,
$$d^G(w_i^{(t)}, w_i^{(t+1)}) = \text{cos-dist}(\mathbf{w}_i^{(t)}, \mathbf{w}_i^{(t+1)}). \tag{1}$$
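The global measure can be illustrated with a minimal sketch. It assumes that cosine distance is defined as one minus cosine similarity and that the two vectors already live in an aligned space; the function names `cosine_similarity` and `global_change` are hypothetical and are not taken from any released code.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def global_change(vec_t, vec_t1):
    """Global measure: cosine distance between a word's vectors
    in decade t and decade t+1.

    Assumption: cosine distance = 1 - cosine similarity, and both
    vectors come from semantic spaces that were aligned beforehand.
    """
    return 1.0 - cosine_similarity(vec_t, vec_t1)
```

Applying `global_change` to each word's vectors for every pair of consecutive decades would give one change score per word and decade pair.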
Figure 2: The global measure is more sensitive to semantic changes in verbs while the local neighborhood measure is more sensitive to noun changes. Examining how much nouns change relative to verbs (using coefficients from mixed-model regressions) reveals that the two measures are sensitive to different types of semantic change. Across all languages, the local neighborhood measure always assigns relatively higher rates of change to nouns (i.e., the right/green bars are lower than the left/blue bars for all pairs), though the results vary by language (e.g., French has high noun change-rates overall). 95% confidence intervals are shown. [Plot labels: Global measure, Local measure.]

Local neighborhood measure
The second measure is based on the intuition that only a word's nearest semantic neighbors are relevant. For this measure, we first find word $w_i$'s set of $k$ nearest-neighbors (according to cosine-similarity) within each decade, which we denote by the ordered set $N_k(w_i^{(t)})$. Next, to measure the change between decades $t$ and $t+1$, we compute a "second-order" similarity vector for $w_i^{(t)}$ from these neighbor sets with entries defined as
$$\mathbf{s}_i^{(t)}(j) = \text{cos-sim}(\mathbf{w}_i^{(t)}, \mathbf{w}_j^{(t)}) \quad \forall w_j \in N_k(w_i^{(t)}) \cup N_k(w_i^{(t+1)}), \tag{2}$$
and we compute an analogous vector for $w_i^{(t+1)}$. The second-order vector, $\mathbf{s}_i^{(t)}$, contains the cosine similarity of $w_i$ and the vectors of all $w_i$'s nearest semantic neighbors in the time-periods $t$ and $t+1$. Working with variants of these second-order vectors has been a popular approach in many recent works, though most of these works define these vectors against the full vocabulary and not just a word's nearest neighbors (del Prado Martin and Brendel, 2016; Eger and Mehler, 2016; Rodda et al., 2016).
Finally, we compute the local neighborhood change as
$$d^L(w_i^{(t)}, w_i^{(t+1)}) = \text{cos-dist}(\mathbf{s}_i^{(t)}, \mathbf{s}_i^{(t+1)}). \tag{3}$$
This measures the extent to which $w_i$'s similarity with its nearest neighbors has changed. The local neighborhood measure defined in (3) captures strong shifts in a word's paradigmatic relations but is less sensitive to global shifts in syntagmatic contexts (Schutze and Pedersen, 1993). We used $k = 25$ in all experiments (though we found the results to be consistent for $k \in [10, 50]$).
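The local neighborhood computation described above can be sketched as follows. This is an illustrative reading, not the authors' implementation: embeddings are assumed to be plain dictionaries mapping words to vectors, cosine distance is assumed to be one minus cosine similarity, and neighbors missing from either decade's vocabulary are simply dropped (an assumption, since the handling of such words is not specified here). All function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def nearest_neighbors(word, embeddings, k):
    """Return the k words most cosine-similar to `word`, excluding itself."""
    target = embeddings[word]
    scored = [
        (cosine_similarity(target, vec), other)
        for other, vec in embeddings.items()
        if other != word
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]


def local_neighborhood_change(word, emb_t, emb_t1, k=25):
    """Local measure: cosine distance between second-order vectors.

    The second-order vector for each decade holds the word's similarity
    to every word in the union of its k nearest neighbors at t and t+1.
    """
    union = set(nearest_neighbors(word, emb_t, k))
    union |= set(nearest_neighbors(word, emb_t1, k))
    # Assumption: keep only neighbors that have a vector in both decades.
    shared = sorted(n for n in union if n in emb_t and n in emb_t1)
    s_t = [cosine_similarity(emb_t[word], emb_t[n]) for n in shared]
    s_t1 = [cosine_similarity(emb_t1[word], emb_t1[n]) for n in shared]
    return 1.0 - cosine_similarity(s_t, s_t1)
```

Because each decade's similarities are computed within that decade's own space, this sketch compares neighborhoods rather than raw coordinates, which is why the measure depends mainly on who a word's neighbors are.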

Statistical methodology
To test whether nouns or verbs change more according to our two measures of change, we build on our previous work and use a linear mixed model approach (Hamilton et al., 2016). This approach amounts to a linear regression where the model also includes "random" effects to account for the fact that the measurements for individual words will be correlated across time (McCulloch and Neuhaus, 2001). We ran two regressions per dataset: one with the global $d^G$ values as the dependent variables (DVs) and one with the local neighborhood $d^L$ values. In both cases we examined the change between all consecutive decades and normalized the DVs to zero mean and unit variance. We examined nouns/verbs within the top-10000 words by frequency rank and removed all words that occurred <500 times in the smaller COHA dataset. The independent variables are word frequency, the decade of the change (represented categorically), and a variable indicating whether a word is a noun or a verb (proper nouns are excluded, as in Hamilton et al., 2016). [2]
Table 2: Case-study words and representative historical contexts.
Word | 1850s context | 1990s context
actually | "...dinners which you have actually eaten." | "With that, I actually agree."
must | "O, George, we must have faith." | "Which you must have heard ten years ago..."
promise | "I promise to pay you..." | "...the day promised to be lovely."
gay | "Gay bridals and other merry-makings of men." | "...the result of gay rights demonstrations."
virus | "This young man is...infected with the virus." | "...a rapidly spreading computer virus."
cell | "The door of a gloomy cell..." | "They really need their cell phones."
Figure 3: Examining the semantic distance between the 1850s and 1990s shows that the global measure is more sensitive to regular shifts (and vice-versa for the local measure). The plot shows the difference between the measurements made by the two methods. [Plot groups: Regular linguistic shifts, Irregular cultural shifts.]
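One way to write down the regression described in this section is given below. It is a sketch under stated assumptions rather than the exact specification: the symbols are introduced here for illustration, the random effect is assumed to be a per-word intercept, and the functional form of the frequency term is left open.

```latex
% \tilde{d}_{i,t}: normalized change score (global or local) for word w_i
%                  between decades t and t+1
% f_{i,t}:         frequency of w_i (any transform is an assumption)
% \gamma_t:        categorical fixed effect for the decade of the change
% u_i:             assumed per-word random intercept
\tilde{d}_{i,t} = \beta_0 + \beta_f \, f_{i,t} + \gamma_t
                + \beta_{\text{noun}} \, \mathbb{1}[w_i \text{ is a noun}]
                + u_i + \epsilon_{i,t},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\quad \epsilon_{i,t} \sim \mathcal{N}(0, \sigma^2)
```

Under this reading, the coefficient $\beta_{\text{noun}}$ expresses how much more (or less) nouns change than verbs once frequency and decade are accounted for, and it is estimated separately for the global and the local measure.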

Results
Our results show that the two seemingly related measures actually result in drastically different notions of semantic change.

Nouns vs. verbs
The local neighborhood measure assigns far higher rates of semantic change to nouns across all languages and datasets while the opposite is true for the global distance measure, which tends to assign higher rates of change to verbs (Figure 2). We focused on verbs vs. nouns since they are the two major parts-of-speech and previous research has shown that verbs are more semantically mutable than nouns and thus more likely to undergo linguistic drift (Gentner and France, 1988), while nouns are far more likely to change due to cultural shifts like new technologies (Traugott and Dasher, 2001). However, some well-known regular linguistic shifts include rarer parts of speech like adverbs (included in our case studies below). Thus we also confirmed that the differences shown in Figure 2 hold when adverbs and adjectives are included along with the verbs. This modified analysis showed analogous significant trends, which fits with previous research arguing that adverbial and adjectival modifiers are also often the target of regular linguistic changes (Traugott and Dasher, 2001).
The results of this large-scale regression analysis show that the local measure is more sensitive to changes in the nominal domain, a domain in which change is known to be driven by cultural factors. In contrast, the global measure is more sensitive to changes in verbs, along with adjectives and adverbs, which are known to be the targets of many regular processes of linguistic change (Traugott and Dasher, 2001; Hopper and Traugott, 2003).
[2] Frequency was included since it is known to strongly influence the distributional measures (Hamilton et al., 2016).

Case studies
We examined six case-study words grouped into two sets. These case studies show that three examples of well-attested regular linguistic shifts (set A) changed more according to the global measure, while three well-known examples of cultural changes (set B) changed more according to the local neighborhood measure. Table 2 lists these words with some representative historical contexts (Davies, 2010). Set A contains three words that underwent attested regular linguistic shifts detailed in Traugott and Dasher (2001): actually, must, and promise. These three words represent three different types of regular linguistic shifts: actually is a case of subjectification (detailed in Figure 1); must shifted from a deontic/obligation usage ("you must do X") to an epistemic one ("X must be the case"), exemplifying a regular pattern of change common to many modal verbs; and promise represents the class of shifting "performative speech acts" that undergo rich changes due to their pragmatic uses and subjectification (Traugott and Dasher, 2001). The contexts listed in Table 2 exemplify these shifts.
Set B contains three words that were selected because they underwent well-known cultural shifts over the last 150 years: gay, virus, and cell. These words gained new meanings due to uses in community-specific vernacular (gay) or technological advances (virus, cell). The cultural shifts underlying these changes in usage (e.g., the development of the mobile "cell phone") were unpredictable in the sense that they were not the result of regularities in human linguistic systems. Figure 3 shows how much the meaning of these words changed from the 1850s to the 1990s according to the two different measures on the English Google data. We see that the words in set A changed more when measurements were made using the global measure, while the opposite holds for set B.

Discussion
Our results show that our novel local neighborhood measure of semantic change is more sensitive to changes in nouns, while the global measure is more sensitive to changes in verbs. This mapping aligns with the traditional distinction between irregular cultural shifts in nominals and more regular cases of linguistic drift (Traugott and Dasher, 2001) and is further reinforced by our six case studies.
This finding emphasizes that researchers must develop and use measures of semantic change that are tuned to specific tasks. For example, a cultural change-point detection framework would be more successful using our local neighborhood measure, while an empirical study of grammaticalization would be better off using the traditional global distance approach. Comparing measurements made by these two approaches also allows researchers to assess the extent to which semantic changes are linguistic or cultural in nature.
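A comparison of the two measures for a set of words might be sketched as below. This is one plausible way to put the scores on a common scale, not necessarily the procedure behind the figures in this paper: each measure is z-scored across the word set and the local score is subtracted from the global one. The function names and the dictionary-based input format are hypothetical.

```python
import statistics


def z_scores(values):
    """Standardize a list of numbers to zero mean and unit variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0.0:
        raise ValueError("cannot standardize values with zero variance")
    return [(v - mean) / std for v in values]


def global_minus_local(global_scores, local_scores):
    """Map each word to z(global change) - z(local change).

    Positive values mean the global measure ranks the word's change
    relatively higher than the local measure does; negative values
    mean the opposite. Only words present in both inputs are compared.
    """
    words = sorted(set(global_scores) & set(local_scores))
    g = z_scores([global_scores[w] for w in words])
    loc = z_scores([local_scores[w] for w in words])
    return {w: gz - lz for w, gz, lz in zip(words, g, loc)}
```

Under the interpretation argued for here, words with clearly positive differences would be candidates for regular linguistic drift and words with clearly negative differences would be candidates for cultural shifts, though any threshold for "clearly" would need to be chosen for the task at hand.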