Studying Semantic Chain Shifts with Word2Vec: FOOD>MEAT>FLESH

Word2Vec models are used to study the semantic chain shift FOOD>MEAT>FLESH in the history of English, c. 1425-1925. The development extends over a long time, starting before 1500, and may still be continuing today. The semantic changes likely proceeded as a push chain.


Introduction
A semantic chain shift is a set of directly related semantic changes in one lexical field (Anttila 1989, 146-7). One of the best-known examples, and the object of study here, is the semantic chain shift involving MEAT 1 in the history of English. The item used to mean 'food of any kind' in Medieval English, but has acquired the more specific meaning of 'food from animal flesh' in Modern English (e.g., Bejan 2017, 82, and many other textbooks, where the phenomenon is usually discussed as an instance of 'semantic narrowing'). This development is linked to a change in the meaning of FOOD. It meant 'anything required to maintain life and growth' in the Middle Ages, as demonstrated, for instance, by early Latin-English glosses, such as Thomas Elyot's 1538 Dictionary (Stein 2014), where one reads, Alimentum, alimonia - sustynaunce, fode, or livinge. The word has come to denote 'anything to eat' at present. Likewise, the item FLESH has undergone a related semantic change from 'soft body tissue in any function' in Old and Middle English towards 'soft body tissue, usually not for eating' in Present-Day English (for a discussion of the relation between FLESH and MEAT in terms of analogy, see Bloomfield 1933, 407-8, 440-2). Hence, the innovative meanings of each item must have encroached on and supplanted their counterpart's conservative semantics, resulting in the chain shift FOOD > MEAT > FLESH. 2 Table 1 paraphrases the semantics of the three targets of the chain shift, with the new meaning at the top and the old at the bottom. It also presents actual uses of the conservative and innovative variants from the 16th and 19th centuries, respectively, in the form of KWIC concordances with a search window size of 12 words to the left and the right. The targets are shown in red, and context words likely to signal the intended interpretation in green.
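A KWIC concordance of the kind shown in Table 1 can be produced with a few lines of code. The following is only a minimal sketch, assuming a text that has already been tokenized into a list of words; the function name `kwic` and its return format are illustrative and not taken from the study.

```python
from typing import List, Tuple


def kwic(tokens: List[str], target: str, window: int = 12) -> List[Tuple[List[str], str, List[str]]]:
    """Return (left context, keyword, right context) for every hit of `target`.

    The default window of 12 words on each side follows the text;
    case-insensitive matching is an assumption made for this sketch.
    """
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == target.lower():
            left = tokens[max(0, i - window):i]    # up to `window` words before the hit
            right = tokens[i + 1:i + 1 + window]   # up to `window` words after the hit
            hits.append((left, token, right))
    return hits
```

Called on a tokenized sentence, `kwic(tokens, "meat")` yields one triple per occurrence, which can then be printed in aligned columns.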
Semantic chain shifts involve confounding factors such as archaism, fixed expressions, domain-dependent technical uses, other genre effects, creative extensions by metaphor and metonymy, noise from polysemy and homonymy, and subtle shifts in connotations. These difficulties make it hard to study macro-trends in their semantic evolution manually. However, it is possible to trace the developments with word embedding techniques (for an overview, see e.g., Tahmasebi et al. 2018).
The present study employs Word2Vec models (Mikolov et al., 2013) to investigate two questions about the FOOD>MEAT>FLESH chain.
(1) What is the general time course of the changes?
(2) Does the chain commence at the target FLESH (pull chain) or FOOD (push chain)?
Section 2 presents the data used in the study. Section 3 presents the findings. Section 4 concludes.
2 Several reviewers pointed out that the argument of this paper would be strengthened by the inclusion of additional instances of semantic chain shifts. Time constraints prevented a discussion of further examples. Other well-known cases of semantic chain shifts are the development of tree names in Ancient Greek, ASH > BEECH > OAK (e.g. Gamkrelidze and Ivanov 1995, 537-8; Ancient Greek φᾱγóς 'oak,' cognate with English beech), or the cycle of facial terms in Latin and early Romance, MOUTH > CHIN > CHEEK > MOUTH (e.g. Mallory and Adams 2006, chapter 11 'Anatomy'; French menton 'chin,' cognate with English mouth). I leave an investigation of these or similar developments to future research.

(Diller et al., 2011). It consists of a total of c. 845 million words or 4.7 GB of uncompressed running text. The material was subdivided into ten 50-year periods covering the time span 1425-1925. Table 2 summarizes the data basis.
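The subdivision into ten 50-year periods can be illustrated with a small helper. This is a sketch under two assumptions that the text does not spell out: that the periods are half-open intervals starting in 1425 (1425-1474, 1475-1524, and so on), and that each period is labeled by its midpoint (1450, 1500, ..., 1900). The name `period_of` is hypothetical.

```python
PERIOD_START = 1425   # first year covered, as stated in the text
PERIOD_END = 1925     # end of the time span (treated as exclusive here: an assumption)
PERIOD_LENGTH = 50    # length of each period in years


def period_of(year: int) -> int:
    """Return the midpoint label of the 50-year period that contains `year`."""
    if not PERIOD_START <= year < PERIOD_END:
        raise ValueError(f"{year} lies outside {PERIOD_START}-{PERIOD_END}")
    index = (year - PERIOD_START) // PERIOD_LENGTH
    return PERIOD_START + index * PERIOD_LENGTH + PERIOD_LENGTH // 2
```

Under these assumptions, a text dated 1538 is assigned to the period labeled 1550.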

Normalization
The greatest challenge in using the historical data lies in the extensive spelling variation found in earlier English. Word embedding techniques treat different orthographic forms of identical lexemes as distinct items, which might impair the quality of the models and hinder diachronic comparisons (for a study highlighting the importance of consistent pre-processing, see e.g. Camacho-Collados and Pilehvar 2018). Therefore, a large number of regular expressions were run on the texts to improve spelling consistency (a total of 830 replacements, e.g. regularizing v-u variability). Further, several lexemes were lemmatized 5, including FOOD, MEAT, FLESH and most of their closest neighbors. The Innsbruck and EEBO data was POS-tagged to aid in this task (e.g. to distinguish wine vs. win, meat vs. meet). Some word class distinctions could not be maintained as a result (e.g. DRINK now refers to both the verb and the noun).
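Regex-based normalization of the kind described above might look as follows. The two rules are hypothetical illustrations of v-u regularization and are not among the 830 replacements actually used, which the text does not list; lowercasing is likewise an assumption of this sketch.

```python
import re

# Hypothetical example rules (NOT the study's actual replacement list).
RULES = [
    # Word-initial "v" before a consonant is read as "u": vnto -> unto, vp -> up.
    (re.compile(r"\bv(?=[^aeiou\W])"), "u"),
    # "u" between two vowels is read as "v": loue -> love, haue -> have.
    (re.compile(r"(?<=[aeiou])u(?=[aeiou])"), "v"),
]


def normalize(text: str) -> str:
    """Lowercase the text and apply each spelling rule in order."""
    text = text.lower()
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text
```

Rules this coarse will over-apply in some words, which is one reason a realistic rule set needs to be much larger and checked against the corpus.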

Training
Word embeddings were created for each of the ten periods by training Word2Vec models on their respective text material with Python's Gensim library (Řehůřek and Sojka, 2010). A continuous bag-of-words (CBOW) architecture was chosen because the words of interest are of reasonably high frequency. The models used a vector size of 250, a context window size of 20, and a minimum count of 5.

Figure 1 shows the cosine similarities between FOOD-MEAT and MEAT-FLESH across the ten time periods. The former two lexemes have become increasingly similar from the earliest periods on. Their cosine rose from c. 0.4 in 1450 to c. 0.6 in 1700 and has remained stable since. In contrast, the latter two items showed some relatedness, but remained quite distinct, throughout the earliest periods. Their cosine then increased from c. 0.3 in 1600, peaking at c. 0.6 between 1700 and 1800, and declined again to c. 0.4 by 1900.
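The training setup and the per-period similarity measurements can be sketched as follows. The hyperparameters are those reported above; everything else is an assumption: the parameter names follow the Gensim 4 API (`vector_size`, where older releases used `size`), the function names are hypothetical, and the input is assumed to be a list of tokenized sentences per period.

```python
from typing import Dict, Iterable, List

# Hyperparameters as reported in the text; sg=0 selects the CBOW architecture in Gensim.
W2V_PARAMS = {"vector_size": 250, "window": 20, "min_count": 5, "sg": 0}


def train_period_model(sentences: Iterable[List[str]]):
    """Train one CBOW model on the tokenized sentences of a single period."""
    from gensim.models import Word2Vec  # imported lazily; assumes Gensim >= 4.0 is installed
    return Word2Vec(sentences=list(sentences), **W2V_PARAMS)


def similarity_series(models_by_period: Dict[int, object], word_a: str, word_b: str) -> Dict[int, float]:
    """Cosine similarity of two lemmas in each period's model, in chronological order."""
    return {
        period: float(model.wv.similarity(word_a, word_b))
        for period, model in sorted(models_by_period.items())
    }
```

Given one trained model per period, `similarity_series(models, "food", "meat")` would return the values underlying a plot such as Figure 1. Because cosine similarity is computed within each model separately, no alignment across periods is needed for this step.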

Results
These findings are compatible with a push chain interpretation: FOOD seems to have initiated the changes by first becoming more similar to MEAT. Only subsequently did MEAT associate more closely with FLESH, which then began to occupy a more distinct semantic niche.
The diachronic trajectories of the targets are visualized in Figure 2. It shows the nearest neighbors of the target words over the time studied from the semantic domains 'sustenance' (green), 'eating' (lime), 'animal food' (orange) and 'human skin' (red). The words are arranged in a two-dimensional principal component plot from the last period. The previous time points were aligned to it using a Procrustes transformation. This method is based on Li et al. (2019), which is in turn inspired by Hamilton et al. (2016). 6 The plot shows that FOOD dissociated from the meaning 'sustenance' early on. This led to a period of sustained close synonymy between MEAT and FLESH in the domain 'eating.' In fact, the two lexemes are still strongly connected context words of each other. While FOOD is now well contained within 'eating' (EAT, MEAL etc.), MEAT is not distinctively associated with 'animal food' (BEEF, ROAST etc.), but rather hovers between the two domains. FLESH was fairly polysemous, cycling around a number of different senses, like 'animal food' (PORK, BROILED etc.) or 'Christian doctrine' (SIN, CHRIST), but has recently become most closely associated with 'human skin' (SKIN, SWEAT etc.). Figure 3 contains similar information in quantitative, rather than graphical, form. It gives the average closeness of a bag of distinctive context words and the targets as a proxy for their conservative and innovative interpretations.

Figure 3: Similarity between a set of conservative / innovative context words and each target word

6 One reviewer remarked that the Procrustes transformation must be performed on identical vocabularies for every time period. This is indeed the case. The constant vocabulary consists only of the words shown in Figure 2. Several words had to be left out because they were innovated (e.g. coffee, potato) or have radically declined in currency (e.g. concupiscence, raiment) within the time period studied.
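The alignment step can be illustrated with an orthogonal Procrustes solution. This is a sketch, not the study's implementation: it assumes NumPy, assumes that both matrices hold the same constant vocabulary in the same row order, and restricts the transformation to a rotation or reflection without scaling or translation. Whether the alignment was carried out in the full embedding space or in the reduced two-dimensional space is not stated in the text; the function works for either.

```python
import numpy as np


def procrustes_align(earlier: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Rotate `earlier` onto `reference` (orthogonal Procrustes).

    Both arrays have shape (n_words, n_dims), with row i representing the same
    word in each. Returns `earlier @ R`, where the orthogonal matrix R minimizes
    the Frobenius norm of (earlier @ R - reference).
    """
    if earlier.shape != reference.shape:
        raise ValueError("both periods must share vocabulary and dimensionality")
    # With earlier^T reference = U S V^T, the optimal rotation is R = U V^T.
    u, _, vt = np.linalg.svd(earlier.T @ reference)
    return earlier @ (u @ vt)
```

Applying the function to each earlier period in turn, with the last period as `reference`, places all periods in one coordinate system so that a word's positions can be connected into a trajectory.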
FOOD consistently moves away from its old towards its new meaning from 1450 on. It thus likely triggered the semantic chain shift. In contrast, the conservative senses of MEAT and FLESH are not entirely lost, but rather fluctuate (witness archaic expressions such as meat and drink or the flesh is weak). Their modern meanings become frequent from c. 1700 on. This development may happen somewhat earlier and faster for MEAT than for FLESH. If so, this would suggest a secondary push. Here, MEAT may have spread towards semantic space previously held by FLESH, thereby pushing it into a new domain.
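The measure reported in Figure 3, the average closeness between a target and a bag of context words, can be sketched as a mean of cosine similarities. The version below uses plain Python and assumes that the vectors of one period are available as a dictionary from word to vector; the function names, and the choice to skip context words that are missing from a period, are assumptions of this sketch.

```python
import math
from typing import Dict, Iterable, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms


def bag_similarity(vectors: Dict[str, Sequence[float]], target: str, context_words: Iterable[str]) -> float:
    """Mean cosine similarity between `target` and the context words found in `vectors`."""
    present = [word for word in context_words if word in vectors]
    if not present:
        raise ValueError("none of the context words occurs in this period")
    return sum(cosine(vectors[target], vectors[word]) for word in present) / len(present)
```

Computing this value once with a conservative bag and once with an innovative bag for every period gives two curves per target word.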

Summary and outlook
The diachronic developments of the semantic chain shift FOOD > MEAT > FLESH can successfully be investigated with word embedding methods. It was shown that the semantic change of FOOD 'anything for sustenance' > 'anything for eating' can be traced back at least to the middle of the fifteenth century. The acquisition of the new senses 'anything for eating' > 'soft body tissue for eating' for MEAT and 'soft body tissue for eating' > 'soft body tissue not for eating' for FLESH advanced in particular from c. 1700 on. Furthermore, there is evidence to suggest that the semantic change developed as a push chain. FOOD approaches MEAT long before MEAT becomes more closely associated with FLESH. Similarly, MEAT may have encroached upon FLESH somewhat earlier than FLESH became disjoint from the 'animal food' domain.
A number of future research questions are raised by the present study. First, the periodization employed here is not fine-grained enough to establish beyond reasonable doubt that MEAT became specialized before FLESH. The second step of the push chain scenario thus needs to be subject to closer scrutiny. Second, it is possible to follow up the developments during the last century from c. 1900 to 2000. The target items may still be evolving. FLESH might lose its religious connotations; MEAT could move towards a meaning of 'animal body tissue' in general; FOOD is perhaps getting ever more firmly entrenched in the 'eating' domain, etc. Finally, one could investigate a curiously similar chain shift in the history of French, NOURRITURE 'food' > VIANDE 'meat' > CHAIR 'flesh'. It is conceivable that FOOD first changed its meaning under the influence of French loans, such as nourishment or sustenance. The exact relation between the French and English developments merits closer examination.
It would also be a worthwhile endeavor to compare the results obtained with Word2Vec to other methods suitable for this task. One approach could be to conduct an inter-annotator agreement experiment, in which participants use the available linguistic context to judge whether FOOD, MEAT and FLESH are used in their innovative or conservative senses in a sample of sentences from every period. The resulting scores could also function as a gold standard for evaluating the quality of the word embeddings. Another approach could involve collocation measures such as pointwise mutual information or possibly Collostructional Analysis (Stefanowitsch and Gries, 2003).
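As an illustration of the collocation-based alternative, pointwise mutual information compares the observed co-occurrence probability of two words with the probability expected if they were independent, PMI(x, y) = log2(p(x, y) / (p(x) p(y))). The sketch below estimates these probabilities from raw counts; the function name and the count-based interface are assumptions, and how co-occurrence is counted (e.g. within a fixed window) is left open.

```python
import math


def pmi(count_xy: int, count_x: int, count_y: int, total: int) -> float:
    """Pointwise mutual information (base 2) estimated from raw counts.

    count_xy: number of co-occurrences of x and y
    count_x, count_y: individual frequencies of x and y
    total: number of observations the counts are drawn from
    """
    if min(count_xy, count_x, count_y, total) <= 0:
        raise ValueError("all counts must be positive")
    # p(x,y) / (p(x) * p(y)) simplifies to count_xy * total / (count_x * count_y).
    return math.log2(count_xy * total / (count_x * count_y))
```

Tracking, say, the PMI of FLESH with PORK versus FLESH with SKIN per period would give a count-based counterpart to the embedding similarities.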
Several problematic aspects of this research remain. It is very difficult to find a set of context words that remains relatively constant in meaning over as great a time span as considered here. The optimization of the non-modern periods' dimensionality reduction on the modern coordinate space thus becomes increasingly distorted, which may account to some degree for the somewhat erratic movements of the target words in Figure 2. Even worse, some lexemes drop out of use altogether. For example, potage 'stew, dish made of a thick liquid' is an important context word of the 'eating' domain at the beginning of the change, but becomes virtually non-existent towards the later periods. Moreover, the corpus sizes of the sub-periods vary substantially. This may result in higher-quality embeddings for those periods with more, and poorer embeddings for those with less, textual material. The similarity measurements of the earliest periods of the change, in particular, might be less reliable due to the limited amount of training data. Similarly, the diverse nature of the documents found in the corpora could be problematic. Unbalanced distributions of certain text categories could considerably bias the co-occurrences of target and context words. For example, two corpora might differ by chance in terms of the frequency of religious sermons (associating, say, FLESH with LUST) or culinary recipes (associating FLESH with PORK). Consequently, the embeddings could have been influenced by a random genre effect. Lastly, there are a few minor issues that have not been resolved satisfactorily, such as language mixing in the training texts, in particular with Latin, archaic uses of words in citations, the unprincipled choice of training parameters, and the lack of an appropriate evaluation metric for the task at hand.
Word embedding technologies have advanced to a point where linguists can use them off the shelf to obtain quantitative support for their qualitative assessments (e.g. Traugott and Dasher 2004) without a profound appreciation of the mathematical complexities involved. In particular, they can yield objective measurements and visualizations of the general time course of semantic changes and of the relative sequence of related semantic changes in a chain shift. Yet the greatest advantage of word embeddings, namely abstracting over large amounts of text data and their particularities, is also a disadvantage. Linguists are often interested in specific aspects of a semantic change. Is the change more likely to manifest in the writings of a particular social class? Which genres promote or oppose the innovation? What is the role of language contact or dialect? Word embeddings cannot currently output relevant results to help answer such intricate questions. Word embedding methods can supplement but not supplant careful linguistic studies on semantic change.