Modeling a Historical Variety of a Low-Resource Language: Language Contact Effects in the Verbal Cluster of Early-Modern Frisian

Certain phenomena of interest to linguists, such as contact-induced language change, mainly occur in low-resource languages. We show that it is possible to study contact-induced language change computationally in a historical variety of a low-resource language, Early-Modern Frisian, by creating a model using features that were established to be relevant in a closely related language, modern Dutch. This allows us to test two hypotheses on two types of language contact that may have taken place between Frisian and Dutch during this time. Our model shows that Frisian verb cluster word orders are associated with different context features than Dutch verb orders, supporting the ‘learned borrowing’ hypothesis.


Introduction
If we want to use computational methods to answer linguistic research questions, a major restriction is that the data-driven methods that are popular in natural language processing today are only applicable to a tiny part of the world's language varieties. Last decade, it was estimated that significant computational resources were available for "perhaps 20 or 30 languages" (Maxwell and Hughes, 2006). Efforts to address this have been proposed, such as the Human Language Project (Abney and Bird, 2010), and to a limited degree executed (e.g. the Universal Dependencies project, Nivre et al., 2016, or SeedLing, Emerson et al., 2014). However, the reality is still that relatively few languages are being studied using quantitative methods. Many phenomena that are of interest to linguists do not occur in these 20 or 30 languages, of which the larger available corpora mainly contain modern standard varieties in common registers and within easily recorded domains of language. Specifically, certain phenomena of interest to linguists are characteristic of minority languages, which are by definition used less, and are less likely to have computational resources available. For example, in cases of language contact where there is a majority language and a lesser used language, contact-induced language change is more likely to occur in the lesser used language (Weinreich, 1979). Furthermore, certain phenomena are better studied in historical varieties of languages. Taking the example of language change, it is more interesting to study a specific language change once it has already been completed, such that one can study the change itself in historical texts as well as the subsequent outcome of the change.
For these reasons, contact-induced language change is difficult to study computationally, and we consider it a great test case for applying some insights from the recent wave of articles discussing computational linguistics for low-resource languages. In this work, we apply computational methods, to the extent that it is possible, to gain insight into the nature of language change that occurred in historical West-Frisian, a lesser-used language spoken in the Dutch province of Fryslân.

Case study
Our case study of language change focuses on word order changes in the verbal cluster. This phenomenon has been studied thoroughly in the larger West-Germanic languages such as Dutch (Coussé, 2008; Coupé, 2015), but not in the smaller Frisian language, which has been in extensive contact with Dutch for most of its history, continuing up to the present (Breuker, 1993; Ytsma, 1995). This gives us a good basis for comparison. While Frisian is a lesser-used language, its historical data is exceptionally accessible: all known West-Frisian texts written until 1800 are digitally available.
In Frisian, when there are two verbs in a cluster (an auxiliary verb and a main verb), the normative word order is the one in example 1 below, as prescribed in the reference grammar of Popkema (2006). However, both logically possible orders are being used in present-day Frisian:
(1) Anne ...
Example 1 shows the 2-1 order, so called because the syntactically higher head verb (referred to as 1) comes after the lower lexical verb (2). Example 2 shows the opposite 1-2 order. The present-day use of the 1-2 order appears to be recent, and influenced by language contact with Dutch (de Haan, 1996). It has even been found that Frisian bilingual children have similar word order preferences in their Frisian as in their Dutch (Meyer et al., 2015). However, the non-normative 1-2 order also appears in older sources: in Early-Modern texts, Hoekstra (2012) found 10% 1-2 orders, and noted that the 1-2 ordered clusters exhibit some Dutch-like properties that do not occur in 2-1 ordered clusters, suggesting a contact effect during this time period.
A particularly interesting Middle Frisian set of texts with regard to language contact are the Basle Wedding Speeches, notable for mixing in Middle Low German and Middle Dutch forms (Buma, 1957): a clear case of 'contact' Middle Frisian. Two conflicting hypotheses have been proposed in the literature regarding the nature of this language contact. Bremmer (1997, p. 383) argues that the writer was a bilingual with "a full command neither of Frisian nor Low German, certainly not in his writing, nor in all likelihood in his spoken usage". This type of contact may have resulted in this mixed-language text. Blom (2008, p. 21) instead proposes the existence of a shared written register in which using borrowed forms was normal: authors of the time show familiarity with texts written in Middle Dutch and Middle Low German, which may have influenced their written Frisian. These two proposals correspond to two kinds of language change that have been distinguished in the literature: change from below and change from above (Labov, 1965, 1994). Furthermore, they correspond to two types of language acquisition: early acquisition and late acquisition (Weerman, 2011). These theories make different usage predictions that allow us to identify which of the two hypotheses is more plausible:
1. Variation in Early-Modern Frisian texts is due to contact through bilingualism, with early acquisition of the optionality, based on Bremmer (1997) and like the present-day situation (de Haan, 1996).
2. Variation in Early-Modern Frisian texts is due to learned borrowing, with late acquisition of the optionality, based on Blom (2008).
To test these hypotheses, we compare features of verb clusters in Early-Modern Frisian texts to those in modern Dutch, as those have been studied thoroughly (De Sutter, 2009;Meyer and Weerman, 2016;Bloem et al., 2014;Augustinus, 2015;Hendriks, 2018). We are particularly interested in the contexts in which the 'Dutch' 1-2 cluster order is used in the Frisian corpus. Specifically, we test whether the Frisian 1-2 orders occur in the same contexts as modern Dutch 1-2 orders to see what type of contact is responsible for them. It has been argued that verb cluster order variation in Dutch has the function of facilitating sentence processing: the verb cluster order that is 'easier' or more economical in a particular context is used (De Sutter, 2009;Bloem et al., 2017). By studying whether the variation in the Frisian texts is predicted by the same features as the variation in modern Dutch, we can infer whether Early-Modern Frisian verb cluster order variation has the same functions as modern Dutch verb cluster order variation.
If Early-Modern Frisian 1-2 order clusters occur in similar contexts as modern Dutch clusters in the 1-2 order, this would indicate that this order has the same function in both varieties, and is part of the grammar of the writer of the Early-Modern Frisian text. In that case, the order would be used in the same way as its modern Dutch counterpart, which supports the idea that 'contact through bilingualism' is the source of the variation: hypothesis 1. If the contexts of use are not similar between Early-Modern Frisian and modern Dutch, it is likely that the 1-2 order has been borrowed in some way, but with a different function than the function it has in modern Dutch. In this case, learned borrowing would be the source of the variation: this would support hypothesis 2. There is a third option, which is that these 1-2 orders are not due to contact, but for Early-Modern Frisian we set this possibility aside with reference to the contact evidence found by Hoekstra (2012). In future work, a study of older Frisian texts is needed to investigate whether this non-contact hypothesis is plausible for older stages of Frisian.

Task description
Our task is to test the aforementioned two hypotheses by taking a model that shows what features are associated with the Dutch 1-2 order, and then creating a model from Frisian data based on those features. We first identify a suitable data source containing sufficiently annotated Early-Modern Frisian text. We then operationalize the relevant verb cluster features (as modelled for Dutch, Bloem et al., 2014) in terms of the annotation. Next, we automatically identify and extract verb clusters and their relevant features from the data. Lastly, we identify the features that are associated with the Dutch-like 1-2 order in the Frisian data, and compare them to those that are associated with the 1-2 order in Dutch. For reasons of comparability, we use logistic regression to identify the features, a method commonly used in quantitative linguistics (Speelman, 2014) and in the studies on Dutch verb clusters that we use as a basis for comparison (De Sutter, 2009;Bloem et al., 2014).
Our approach of taking a case study that is well-studied in a related language is inspired by cross-lingual learning in NLP: in studies involving low-resource languages, closely related languages that are richer in resources are used as a source of additional data. Examples of this are cross-language parse tree projection (Xia and Lewis, 2007), where structural information about a sentence in one language is transferred to parallel data in another language, and data point selection (Søgaard, 2011), where a tool for a low-resource language is trained on data from a high-resource language, while selecting the data that is most similar to the low-resource language. In both of these cases, general knowledge about a language family is also transferred to a low-resource language.
Frisian language resources
When working with a low-resource language, a brief overview of the available resources for that language can be helpful. Most Frisian resources are of the traditional kind. The Wurdboek fan de Fryske Taal, a dictionary that has been in development since 1984, currently contains about 115,000 lemmas (Sijens and Depuydt, 2010), and has an online version. Frisian grammar has been studied since at least the start of the 20th century (Collitz, 1915), leading to collections of linguistic studies. Its minority language status has been researched as well (Ytsma, 1995; Breuker, 2001; de Graaf et al., 2015).
As for digital resources, the Fryske Akademy is working on the Frisian Integrated Language Database (Taaldatabank, TDB). This corpus contains all of the attested Frisian texts from the years 1550-1800 and is planned to include modern material. The Early-Modern Frisian texts have been tokenized, lemmatized and part-of-speech tagged manually. The Fryske Akademy is also compiling a Corpus of Spoken Frisian for the purpose of developing speech technology. The aforementioned dictionary is also included in a digitization effort of Dutch historical dictionaries (Duijff and Kuip, 2018), forming a bilingual lexical-semantic database. A parallel corpus with aligned sentences from the Fryske Akademy also exists.
Besides spell-checking, the only available NLP tools appear to be the statistical machine translation system by van Gompel et al. (2014) and two text-to-speech systems: one using an existing Dutch text-to-speech system (Dijkstra et al., 2004) and one using a bilingual system capable of handling code-switching between Dutch and Frisian (Yılmaz et al., 2016). While there is a part-of-speech tagger for historical (Middle) Low German (Koleva et al., 2017), a related low-resource language, none are available for historical or modern Frisian, and neither are syntactic parsers.
The TDB corpus is the most relevant resource for the present study, as it contains annotated Early-Modern Frisian texts. The size of this section of the corpus is around 480,000 tokens and 20,000 types, though this includes repeated text and non-contemporaneous front/back matter. After selecting representative texts without duplicate material or non-Frisian material, we obtain a subcorpus containing 125,842 tokens and 10,405 types. Unfortunately, no tools are available for further annotation that would be relevant for word order phenomena.

Experiments
We automatically annotate verb clusters and extract their features from the corpus using a Python script that detects verb clusters based on the information already available in the annotation. In previous work on Dutch, verb clusters were defined using dependency structure or phrase structure, with one verb being the syntactic head of the other (Bloem et al., 2014;Augustinus, 2015). However, as no syntactic annotation is available, we must rely on part-of-speech tags. As there is no gold standard data for this task, and little data in general, a statistical modeling approach is infeasible. Therefore, the script is rule-based, and we define a verb cluster based on the occurrence of bigrams of verbs (according to the existing annotation), or trigrams containing grammatical verb cluster interruptions, as well as the verb classes in the annotation. The word order of the verb cluster is then determined based on the relative positions of its constituent verbs (a main verb and an auxiliary verb) in the linear order of the sentence. This procedure is not 100% reliable, especially in clusters with infinitival auxiliary verbs, where auxiliary verbs and main verbs may have the same form.
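The rule-based detection described above can be illustrated with a minimal sketch. It is not the script used in this study: the tag names (AUX, MAIN, PART, TE) and the set of permitted interrupting elements are hypothetical placeholders, and the actual TDB annotation scheme may label verb classes differently.

```python
from typing import List, Optional, Tuple

# Hypothetical tag names; the TDB annotation may use different labels.
AUX = "AUX"    # auxiliary verb, the syntactically higher verb ("1")
MAIN = "MAIN"  # main (lexical) verb ("2")
INTERRUPTERS = {"PART", "TE"}  # assumed tags for permitted cluster interruptions

Token = Tuple[str, str]  # (word form, tag)


def cluster_order(first_tag: str, last_tag: str) -> Optional[str]:
    """Return '1-2' if the auxiliary precedes the main verb, '2-1' if it follows."""
    if first_tag == AUX and last_tag == MAIN:
        return "1-2"
    if first_tag == MAIN and last_tag == AUX:
        return "2-1"
    return None


def find_clusters(sentence: List[Token]) -> List[Tuple[int, str]]:
    """Find two-verb clusters as verb bigrams, or trigrams with one interruption."""
    found = []
    for i in range(len(sentence) - 1):
        tag_i = sentence[i][1]
        # Verb bigram: two adjacent verbs.
        order = cluster_order(tag_i, sentence[i + 1][1])
        # Trigram: verb, permitted interrupting element, verb.
        if (order is None and i + 2 < len(sentence)
                and sentence[i + 1][1] in INTERRUPTERS):
            order = cluster_order(tag_i, sentence[i + 2][1])
        if order is not None:
            found.append((i, order))
    return found


# Placeholder tokens (w0, w1, ...) stand in for real corpus word forms.
example = [("w0", "CONJ"), ("w1", "PRON"), ("w2", "MAIN"), ("w3", "AUX")]
print(find_clusters(example))
```

Because such a procedure only sees linear order and tags, it cannot tell a subordinate-clause cluster from an adjacent auxiliary and main verb in a main clause.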
We checked the classification of a random sample of 50 1-2 order clusters and 50 2-1 order clusters, using only prose text for this evaluation because the script appears to make more mistakes there. We evaluate only for precision, not for recall, as we have no gold standard data for evaluating recall. Of the 50 automatically extracted candidate 1-2 clusters, 34 were found to be actual two-verb clusters from subordinate clauses: a precision of 68%. Of the 50 2-1 clusters, all 50 met this requirement (100% precision). Most of the erroneous candidate 1-2 order clusters were cases of a finite auxiliary verb in V2 position in a main clause, immediately followed by the main verb in final position, with no intervening objects. This looks exactly like a 1-2 order cluster consisting of a finite auxiliary verb and a main verb at the end of a subordinate clause. Main clause clusters cannot look like 2-1 order clusters, which explains the 100% precision for the 2-1 order. This evaluation shows that a statistical model based on this annotation is likely to overestimate the probability of 1-2 orders.
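As a worked illustration of the figures above, the sketch below recomputes the two precision values from the reported sample counts and shows how a precision estimate could be used to discount a raw candidate count. The function names and the candidate count of 1000 are hypothetical and serve only as an example; no such correction was applied in this study.

```python
def precision(confirmed: int, sampled: int) -> float:
    """Share of sampled candidates that were confirmed as real clusters."""
    return confirmed / sampled


def expected_true_clusters(n_candidates: int, prec: float) -> float:
    """Rough expected number of real clusters among extracted candidates."""
    return n_candidates * prec


# Counts reported in the manual check of the two samples.
p_12 = precision(34, 50)  # candidate 1-2 clusters
p_21 = precision(50, 50)  # candidate 2-1 clusters
print(f"1-2 precision: {p_12:.0%}, 2-1 precision: {p_21:.0%}")

# Hypothetical candidate count, for illustration only.
print(expected_true_clusters(1000, p_12))
```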
Due to annotation limitations, several features from Bloem et al.'s (2014) Dutch model could not be extracted from our corpus: the tree depth of the verb cluster, the definiteness of the preceding noun, extraposition of the prepositional object, multiword units and the length of the clause. Verb frequency was estimated by counting over the entire Early-Modern Frisian part of the TDB. Another factor is that Dutch 1-2 orders have a more uniform information density (Bloem, 2016). This was found by training an n-gram language model on Dutch corpus data, and then measuring its perplexity over sentences containing verb clusters that were not in its training data. A 145 million word corpus was used for this, but for Early-Modern Frisian we have less than 0.5 million words available. A model trained on such diverse texts spanning hundreds of years would require more training data to achieve reasonable perplexity rates than a model trained on newspaper text from a small range of years; thus we cannot reliably operationalize this factor. However, the Dutch result is likely to apply to Frisian as well, as the reasons for the perplexity values that were found for Dutch can equally apply to Frisian: in both languages, there are few clustering auxiliary verbs and many possible main verbs, and in both languages, the first verb of a cluster helps to predict its second verb and is highly unlikely to be followed by something that is not a verb, as verb cluster interruption rarely occurs in present-day Frisian (Barbiers et al., 2008, pp. 25-41). The main difference between the languages in this regard is that present-day Frisian shows more noun incorporation into the verb cluster's main verb (Dyk, 1997), which may increase informativity of the main verb compared to Dutch in 1-2 orders, but seems rare.
Therefore, we can transfer the knowledge gained with a Dutch language model to Frisian and assume that there is not much difference between the languages regarding verb cluster information density.
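To make the perplexity measure concrete, the following is a minimal sketch of sentence perplexity under a bigram language model. The model order and the add-one smoothing are assumptions chosen for brevity, not the configuration used for the Dutch model in Bloem (2016), and the training sentences are placeholder tokens rather than corpus data.

```python
import math
from collections import Counter
from typing import List, Tuple


def train_bigram(sentences: List[List[str]]) -> Tuple[Counter, Counter]:
    """Count unigrams and bigrams, with sentence-boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        padded = ["<s>"] + sent + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def perplexity(sent: List[str], unigrams: Counter, bigrams: Counter) -> float:
    """Perplexity of one sentence under an add-one smoothed bigram model."""
    vocab_size = len(unigrams)  # simplification: unseen words share this vocabulary
    padded = ["<s>"] + sent + ["</s>"]
    log_prob = 0.0
    for prev, word in zip(padded, padded[1:]):
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log(p)
    n_predictions = len(padded) - 1
    return math.exp(-log_prob / n_predictions)


# Placeholder training data; a real model needs far more text.
uni, bi = train_bigram([["a", "b"], ["a", "b"]])
print(perplexity(["a", "b"], uni, bi))  # seen word order: lower perplexity
print(perplexity(["b", "a"], uni, bi))  # unseen word order: higher perplexity
```

With very little training text, most bigrams are unseen and the estimates are dominated by smoothing, which illustrates why the measure could not be operationalized reliably for the Frisian corpus.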
Next, we have created a multifactorial logistic regression model using the remaining features. We model verb cluster order as a binary variable predicted by these features, in which the order can be 1-2 or 2-1. The advantage of this method over neural networks or other methods involving dimension reduction is that the contribution of each feature is transparent. The goal is after all not to make an optimal classifier for 1-2 and 2-1 order contexts, but to find out more about why language users produced a 1-2 or 2-1 order given a context. Table 1 shows the contribution of each feature to the model. The effect size of each variable is given as an odds ratio, and in line with previous work, we are reporting associations with the 1-2 order. The model has acceptable multicollinearity (VIF < 1.3). The text type and year features were not used in previous work, but are necessary control factors when working with historical text. Much of the text is rhyme, which affects word order: 1-2 orders are estimated to be 18.69 times more likely in rhyming text.
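The relation between a logistic regression coefficient and the reported odds ratios can be sketched as follows. This is an illustration under stated assumptions, not the model fitted in this study: the fitting routine is plain gradient ascent written out for self-containment (the software actually used is not specified here), and the data are toy counts for a single hypothetical binary feature.

```python
import math
from typing import List, Sequence


def fit_logistic(X: Sequence[Sequence[float]], y: Sequence[int],
                 lr: float = 0.5, steps: int = 5000) -> List[float]:
    """Fit [intercept, b1, ...] by gradient ascent on the mean log-likelihood."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for row, target in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of 1-2 order
            err = target - p
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


# Toy counts (hypothetical): y = 1 means 1-2 order, x = 1 means feature present.
X = [[1.0]] * 40 + [[0.0]] * 40
y = [1] * 30 + [0] * 10 + [1] * 10 + [0] * 30

weights = fit_logistic(X, y)
odds_ratio = math.exp(weights[1])  # odds ratio = exp(coefficient)
print(round(odds_ratio, 2))
```

For a single binary feature, the fitted odds ratio matches the one computed directly from the two-by-two table of counts, here (30/10) / (10/30) = 9. In a multifactorial model, each odds ratio is instead interpreted with the other features held constant.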

Table 1: Contribution of each feature to the model (columns: Feature, Odds ratio).

Of the auxiliary verb features, modal is the most important feature in Dutch, with an odds ratio of 148 (Bloem et al., 2014), while our model shows no evidence for an effect. We find an association between copular verbs and the 2-1 order, while Dutch shows the reverse, a difference that supports Hypothesis 2, the learned borrowing hypothesis. The aspectual and to-infinitival effects we found are consistent with the observations of Hoekstra (2012), who shows that no equivalent construction existed in Frisian, making these easy candidates for borrowing, along with the Dutch word order.
Other factors from the Dutch model are not significant in this model (priming, separable, frequency) and are all related to complexity (Bloem, 2016). The information value feature has opposite associations compared to the Dutch model. Thus, the model shows evidence for only some of the features from the Dutch model. Under Hypothesis 1, we would expect significant effects here: use of 1-2 orders in contexts that are more difficult to process, as in Dutch (Bloem et al., 2017). Instead, the only significant features are associated with borrowed constructions, or are significant in the opposite direction from Dutch and therefore associated with the other word order. These clear usage differences support hypothesis 2: the 1-2 orders appeared due to learned borrowing, and unlike in Dutch, did not have a clear function besides stylistic marking (i.e. in rhymed text). Unfortunately, a direct, number-by-number comparison to the Dutch model is not possible due to different categories (i.e. for the types of auxiliary verbs), stemming from different corpus annotation schemes used for the Dutch and Frisian data. Furthermore, the numbers cannot be compared directly because the two models include different features.

Conclusion
Our study has shown that it is possible to apply computational methods to a historical variety of a lesser used language. We investigated a case of contact-induced change, a phenomenon that is mainly found in low-resource languages, and were able to test hypotheses regarding the nature of this change. In doing so, we made use of what is known about the construction in a closely related but higher-resourced language, Dutch. This allowed us to limit the hypothesis space, reducing the problem to a comparison with Dutch and testing whether features that model the observed variation in Dutch are also relevant in Frisian, although the limited availability of data and annotation did not allow us to test all features. There was also not enough data to train a language model for estimating complexity through model perplexity. Nevertheless, by combining findings from our Frisian data and from previous studies on Dutch, we are able to get a good impression of the origin of the 1-2 order construction in Early-Modern Frisian.
As verb cluster order variation is a probabilistic phenomenon that is affected by multiple factors, we could not have found the verb cluster usage patterns described here without making use of computational models. Even when little data is available, computational methods can help supplement other types of evidence in historical linguistics, particularly on research questions involving variation, complexity and other matters that go beyond grammaticality versus ungrammaticality.