Cliticization of Serbian Personal Pronouns and Auxiliary Verbs. A Dependency-Based Account

The paper looks into cliticization of Serbian personal pronouns and auxiliary verbs. Cliticization is the operation whereby, in the process of clause construction, a clitic (= unstressed) form of a pronominal/verbal lexeme is chosen, rather than a full (= stressed) form. Cliticization of both pronouns and auxiliaries is obligatory under neutral communicative conditions (i.e., in the absence of contrast or emphasis) and unless specific syntactic/prosodic factors impose the choice of a full form. Under marked communicative conditions, cliticization is precluded. Corresponding rules are proposed within a Meaning-Text dependency framework.


1 Overview of the Problem
Personal pronouns and auxiliary verbs in Serbian (and all other languages stemming from former Serbo-Croatian) have both full (= stressed, tonic) and clitic (= unstressed) forms, the latter being so-called second-position clitics (Halpern & Zwicky, eds, 1996). In any sentence featuring pronouns and/or auxiliaries, the choice between full and clitic forms is obligatory, which means that the opposition "tonic ~ clitic" is inflectional in nature.
The operation whereby the inflectional value (= a grammeme) CLITIC is assigned to a lexical item, in the course of clause synthesis, is called cliticization.¹ Roughly speaking, cliticization of both personal pronouns and auxiliary verbs is obligatory under neutral communicative conditions (i.e., in the absence of contrast or emphasis) and unless specific syntactic/prosodic factors impose the choice of a full form. Under marked communicative conditions, cliticization is precluded. It is precisely these conditions that the paper intends to specify.
¹ The term cliticization has at least two other usages, to which I do not subscribe: 1) a diachronic process of becoming a clitic; 2) the operation of attaching a clitic to its host.
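Here are some preliminary examples of the use of clitic vs. full pronominal and verbal forms; like most examples in the paper, these are taken from the Serbian corpus (Korpus savremenog srpskog jezika: www.korpus.matf.bg.ac.rs).
(1) a. Možda me je Mira podsticala na brbljivost. Gledala me je netremice ...
       lit. 'Maybe me is Mira having.incited on volubility. [She] having.looked me is intently...'
       'Maybe Mira was inciting my volubility. She was looking at me intently...'
    b. No, bilo kako bilo, prepoznao ga jeste.
       lit. 'But, be it as it may, having.recognised him [he] is.'
       'But, be it as it may, he did recognize him.'
    c. Ali nije gledala njega, gledala je mene.
       lit. 'But [she] is.not having.looked him, having.looked [she] is me.'
       'But she wasn't looking at him, she was looking at me.'
Example (1a) illustrates a communicatively unmarked context, where clitic forms are used by default and the corresponding full forms would be inappropriate; we see here instances of the first-person singular accusative pronominal clitic, me 'me', and the 3sg past tense auxiliary clitic, je 'is'. In sentence (1b), a full form of the past tense auxiliary is used contrastively, to insist that the recognizing did take place; note also the marked word order, with the auxiliary clause-final. The corresponding clitic auxiliary is possible here if the contrast is expressed lexically: […] zaista ga je prepoznao '[…] really him is [he] having.recognized'. Finally, the use of the full personal pronouns njega 'him' and mene 'me' in sentence (1c) is warranted by the contrastive focus they bear; in this type of context clitic forms are excluded.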
While some other aspects of clitic behavior, in particular their linear placement, have been extensively researched, cliticization (in the sense intended here) has received less attention. Kayne (1975) is a seminal study of cliticization in French, which has served as a springboard for work on this phenomenon in other languages. A discussion of cliticization in Slavic languages can be found, for instance, in Dimitrova-Vulčanova (1999) and Franks (1998 and 2010); the most complete existing account of cliticization in Serbian/Croatian is the one in Browne (1975: 276-282). Some aspects of the problem were addressed in Progovac (2005: 126-136), Mrazovac (2009: 364-366), and (from a different perspective) Caink (2000); Peti-Stantić (2017 and 2018) reports on some recent research on the topic based on Croatian data.
Cliticization is theoretically interesting because it involves the interplay of the syntactic and communicative (a.k.a. information) structures in sentence production and is linked to other important phenomena such as subject ellipsis and conjunction reduction.
In the remainder of this Section, I provide some basic facts about the Serbian lexical items that can undergo cliticization (1.1) and describe the essentials of the theoretical framework adopted (1.2). Conditions under which the cliticization of personal pronouns and auxiliaries occurs are informally characterized in Section 2; their formal description, in terms of rules belonging to a Meaning-Text linguistic model, is offered in Section 3; Section 4 is reserved for a conclusion.

1.1 Full and clitic forms of personal pronouns and auxiliaries
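As indicated above, cliticizable lexical items in Serbian include personal pronouns and auxiliary verbs.² The paradigms of three personal pronouns and two auxiliary verbs follow;³ tonal accents are not shown.

Personal pronouns
           JA 'I'              ON 'he'             VI 'you [PL]'
           TONIC     CLITIC    TONIC     CLITIC    TONIC    CLITIC
NOM        ja        –         on        –         vi       –
ACC/GEN    mene      me        njega     ga        vas      vas
DAT        meni      mi        njemu     mu        vama     vam
INSTR      mnom(e)   –         njim(e)   –         vama     –
LOC        meni      –         njemu     –         vama     –
VOC        –         –         –         –         vi       –

BITI 'be' in the present, past tense aux.
           SG                  PL
           TONIC     CLITIC    TONIC     CLITIC
1          jesam     sam       jesmo     smo
2          jesi      si        jeste     ste
3          jeste     je        jesu      su

HTETI lit. 'want' in the present, future tense aux.
           SG                  PL
           TONIC     CLITIC    TONIC     CLITIC
1          hoću      ću        hoćemo    ćemo
2          hoćeš     ćeš       hoćete    ćete
3          hoće      će        hoće      će

Table 1: Full and clitic forms of some personal pronouns and auxiliary verbs

Pronominal clitic forms exist in the accusative, genitive and dative. The nominative, i.e., subject, pronouns are never cliticized; they are dropped in neutral communicative conditions (Serbian is a pro-drop language). Oblique case personal pronouns, whether full or clitic, function as objects of verbs, nouns and adjectives.
² In addition, the interrogative conjunction DA LI has a clitic form, LI_INTERR (homophonous with the emphatic particle LI_EMPHATIC, with no corresponding full form); it will not be considered in this paper.
³ There is a third auxiliary, BITI in the aorist tense, used to construct the conditional mood forms; it is currently undergoing grammaticalization and becoming a particle, just like its cognate in Russian.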
The auxiliary BITI 'be' has forms identical to those of the copula and the locative verb; all three verbs exhibit identical behavior with respect to cliticization and linear placement. A finite auxiliary, whether full or clitic, is the head of its clause (Milićević, 2009b) and the top node of the corresponding dependency tree (see immediately below).
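The opposition between full and clitic forms can be pictured as a small lookup table. The following Python sketch is only an illustration: the dictionary layout, the cell labels and the helper name `pick_form` are assumptions introduced here, and only some cells of the paradigms of JA, ON, VI and BITI are listed.

```python
# Each cell maps (lexeme, case or person/number) to a pair (tonic, clitic).
# None means that no clitic form exists for that cell.
# The layout and the name `pick_form` are assumptions of this sketch.
FORMS = {
    ("JA", "NOM"): ("ja", None),
    ("JA", "ACC/GEN"): ("mene", "me"),
    ("JA", "DAT"): ("meni", "mi"),
    ("ON", "NOM"): ("on", None),
    ("ON", "ACC/GEN"): ("njega", "ga"),
    ("ON", "DAT"): ("njemu", "mu"),
    ("VI", "NOM"): ("vi", None),
    ("VI", "ACC/GEN"): ("vas", "vas"),
    ("VI", "DAT"): ("vama", "vam"),
    ("BITI", "1SG"): ("jesam", "sam"),
    ("BITI", "3SG"): ("jeste", "je"),
}


def pick_form(lexeme, cell, grammeme):
    """Return the wordform for the requested grammeme (TONIC or CLITIC)."""
    tonic, clitic = FORMS[(lexeme, cell)]
    if grammeme == "TONIC":
        return tonic
    if grammeme == "CLITIC":
        if clitic is None:
            raise ValueError(f"{lexeme} has no clitic form for {cell}")
        return clitic
    raise ValueError(f"unknown grammeme: {grammeme}")


print(pick_form("ON", "DAT", "CLITIC"))  # mu
print(pick_form("ON", "DAT", "TONIC"))   # njemu
```

Requesting a clitic form for a nominative cell raises an error, which mirrors the observation that subject pronouns are never cliticized.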

1.2 The Framework
Within a Meaning-Text linguistic model, a semantically-driven, dependency-based, synthesis-oriented stratificational model (Mel'čuk, 2016: 41-85), cliticization happens in the transition between the Surface-Syntactic Representation [SSyntR] and the Deep-Morphological Representation [DMorphR] of a clause. Formally, the basic structure of the SSyntR, the Surface-Syntactic Structure [SSyntS], is a (linearly non-ordered) dependency tree; that of the DMorphR, the Deep-Morphological Structure [DMorphS], is a (fully ordered) string.
Cliticization is part of the operation of morphologization, whereby lexemes in the SSyntS are assigned syntactic inflectional values. Two other major operations, linearization and prosodization of the SSyntS, are part of this transition, which is guided in an essential way by the communicative structure (Mel'čuk, 2001) of the clause under synthesis.
During linearization, all lexemes of the clause that have been assigned the grammeme CLITIC (including auxiliary verbs) are gathered in a clitic cluster and linearly positioned together, according to special linearization rules (Milićević, 2009a): not with respect to their governors, but with respect to a host. The clitics are by default positioned after the first available host, which means that they often "land" clause-second (whence their name).
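The default placement of the clitic cluster can be illustrated with a deliberately simplified Python sketch. It is not the linearization rules of Milićević (2009a): the function name `place_clitic_cluster` is hypothetical, the cluster is taken as already internally ordered, and the first word of the clause is assumed to be an available host.

```python
def place_clitic_cluster(words, cluster):
    """Insert an already ordered clitic cluster after the first host.

    Simplifying assumption of this sketch: the first word of the clause
    is an available host, so the cluster lands in second position.
    """
    if not cluster:
        return list(words)
    if not words:
        raise ValueError("an enclitic cluster needs a host to its left")
    return [words[0], *cluster, *words[1:]]


# Non-clitic words are given in their final order; clitics are added afterwards.
print(" ".join(place_clitic_cluster(["Rekao"], ["sam", "mu"])))
# Rekao sam mu
print(" ".join(place_clitic_cluster(
    ["Možda", "Mira", "podsticala", "na", "brbljivost"], ["me", "je"])))
# Možda me je Mira podsticala na brbljivost
```

The point of the sketch is only that clitics are positioned relative to a host rather than relative to their syntactic governors.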
Full pronominal forms obey the same linearization rules as full-fledged nominal complements; their linear positioning is normal in that it is done taking their governor(s) as the reference point. A full finite auxiliary is the reference point for the linearization of all other clause elements, just as a finite lexical verb is.
Since our dependency trees are not linearly ordered, for two (or more) clauses containing items that differ only along the "tonic ~ clitic" opposition, the basic dependency structures are identical; their respective communicative structures are different, and so are, of course, their DMorphSs. As an illustration, the corresponding structures for sentences in (2) are given in Figure 1; an underlying question [in square brackets] is supplied for each sentence, providing a minimal communicative context in which it can felicitously be uttered.
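(2) a. [Did you tell him?]
       Rekao sam mu.
       'Having.told [I] am to.him.' = 'I told him.'
    b. [Who did you tell?]
       Rekao sam njemu.
       'To.him [I] am having.told.' = 'It's to him that I told.'
    c. [Why didn't you tell him?]
       Jesam mu rekao.
       '[I] am to.him having.told.' = 'I did tell him.'
In the SSyntS of (2a), the auxiliary BITI 'to be' and the pronoun ON 'he' bear no marked communicative values and neither of them appears within a syntactic configuration which does not allow for cliticization (for instance, in coordination or as the only word in a clause); therefore, they are both assigned the grammeme CLITIC, which appears in the DMorphS of (2a).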
The communicative value Focalized, assigned to the pronoun ON 'he' in the SSyntS of (2b), marks it as logically prominent with respect to some contextual information (cf. the corresponding underlying question); it is this communicative marking that triggers the assignment of the grammeme TONIC to the pronoun in the transition towards the morphological string. An analogous situation obtains with the auxiliary BITI 'to be' in the structures underlying (2c).
This architecture of the Meaning-Text Model determines the form of cliticization rules: they are transition rules, operating between (fragments of) SSyntRs and DMorphRs of utterances and having as conditions the communicative load and syntactic/prosodic environment of the items whose tonicity status they specify.

2 Factors Relevant for Cliticization of Personal Pronouns and Auxiliary Verbs
The use of clitic vs. full forms of pronouns and auxiliaries is determined both by communicative factors and syntactic/prosodic ones. Three cases can be distinguished.
Case 1 A full form of a PRON/V (Aux) is freely chosen to express a value of a communicative opposition (Mel'čuk 2001: 93-258):
• The value Focalized (the marked value of the Focalization opposition) and/or the value Emphatic (the marked value of the Emphasis opposition).
(3) a. […]
A pronoun used as an answer to a WH-question carries the rhematic focus and must appear in a full form. This holds not only when it is clause-initial/the only word in the clause (this environment being unavailable for an enclitic for prosodic reasons), as in (5a), but also when it appears clause-internally, as in (5b), where Kažem mu, otherwise a fully grammatical sentence, is inappropriate.

Case 2 A full form of a PRON/V (Aux) is imposed by syntactic/prosodic factors (rather than freely chosen to express some communicative opposition values).
1) The word order constraints are such that a PRON/V (Aux) must be, or preferably is, clause-initial or follows an internal prosodic break (i.e., it finds itself in a linear position unavailable for an enclitic).
The pronoun in (6a) preferably appears in the clause-initial position (because it functions as a semantic theme within a thematic progression sequence) and is therefore full; however, it could have been used in the corresponding clitic form clause-internally ([…] Veruje mu se i sada je najpopularniji …).
A full form of the auxiliary is standardly used as an elliptic (only-word) affirmative answer to a YES/NO question, as shown in (6b-i). When not clause-initial, as in (6b-ii), a V (Aux) must appear in a clitic form. This contrasts with the behavior of personal pronouns in the same syntactic environment; cf. (5b).
In (6c), a full form of the auxiliary BITI 'be' is used because it follows an internal prosodic break (marked by a comma in writing).

2) Coordination
A pronoun used in coordination (with another pronoun or with a noun) must be full, as illustrated in (7); however, this restriction does not hold for the auxiliaries, as shown in (8b).
c. I baš zato što je to istina cela stvar i (Conj) jeste tako smešna!
   'And precisely because this is true the whole thing and is so funny = is so funny in the first place.'
Pronouns as prepositional objects must appear in a full form. No communicative load is attached to the full form; to express focalization, prosody is used (symbolized by capitalization in our examples): Mislim na NJU 'It is of her that I am thinking'.
Some "focalizing" conjunctions impose the use of a full form of a PRON/V (Aux).

4) Presence of a specific dependent [for pronouns only]
A pronoun governing a restrictive modifier (baš 'precisely', samo 'only', jedino 'uniquely', isključivo 'exclusively', …) must appear in a full form. (Again, we could say that such a modifier has a focalizing effect, and that this triggers the assignment of the grammeme TONIC to the pronoun.)
If a dative and a 1st/2nd person accusative pronoun cooccur, one of them must appear in the full form; cliticizing both pronouns leads to ungrammaticality. The incompatibility of dative-accusative clitic sequences is known in other Slavic languages, for instance Bulgarian (Franks 1998: 85), as well as in Romance languages (Miller & Monachesi 2003: 87ff).
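The cooccurrence restriction on dative and 1st/2nd person accusative clitics can be pictured as a filter over a candidate set of pronouns. The Python sketch below is an illustration under stated assumptions, not the filter rules of the cited literature: the class `Pron`, its fields and the function `cluster_is_licit` are hypothetical names, and only this single restriction is checked.

```python
from dataclasses import dataclass


@dataclass
class Pron:
    case: str     # "DAT" or "ACC"
    person: int   # 1, 2 or 3
    clitic: bool  # True if the clitic form has been chosen


def cluster_is_licit(pronouns):
    """Reject a dative clitic cooccurring with a 1st/2nd person accusative clitic."""
    has_dat_clitic = any(p.clitic and p.case == "DAT" for p in pronouns)
    has_12_acc_clitic = any(
        p.clitic and p.case == "ACC" and p.person in (1, 2) for p in pronouns
    )
    return not (has_dat_clitic and has_12_acc_clitic)


# Both pronouns cliticized: ruled out.
print(cluster_is_licit([Pron("DAT", 3, True), Pron("ACC", 1, True)]))   # False
# One of the two appears in the full form: accepted.
print(cluster_is_licit([Pron("DAT", 3, True), Pron("ACC", 1, False)]))  # True
```

A 3rd person accusative clitic next to a dative clitic passes the filter, since the restriction as stated concerns only 1st/2nd person accusatives.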
Case 3 A clitic form of a PRON/V (Aux) is chosen by default, i.e., if no communicative load is attached to it and no syntactic/prosodic factors are present which preclude cliticization.
As shown above, in most cases, clitic and full forms of personal pronouns are in complementary distribution, and so are clitic and full forms of auxiliaries. There are two types of situation where this does not hold.
1) In some unmarked contexts, either a clitic or a full form is possible without any perceptible communicative difference:
Čini mi_CLITIC se da … '[It] seems to.me REFL that (Conj) …'
<Meni_FULL se čini da …> 'To.me [it] seems REFL that (Conj) …'
2) In some neutralizing contexts, the communicative load carried by a full form is also expressed by another clause element; thus, the sentence Stvarno jeste_FULL tako 'Really [it] IS like.that', in which the adverb STVARNO 'really' provides a neutralizing context, allows for a paraphrase making use of the corresponding clitic form of the auxiliary: Stvarno je_CLITIC tako 'Really [it] is like.that'. Also, interchangeability of a full and a clitic form is possible if the communicative load carried by a full form can alternatively be expressed by lexical means: Jeste_FULL tako <Stvarno je_CLITIC tako>.

3 Cliticization Rules for Personal Pronouns and Auxiliary Verbs
To account for the facts described in Section 2, two cliticization rules are needed, one for the pronouns and another one for the auxiliaries; they are given in Figures 3 and 4, respectively. (Shaded areas in the left-hand side of a rule indicate the context of its application. Both rules are a "shorthand" for several more specific rules.)

SSynt-level              DMorph-level
L (Pron.pers)     ⇔     L (Pron.pers)CLIT     | L is NOT
    1) communicatively marked
    2) placed clause-initially or after a clause-internal prosodic break
    3) the governing member of the coordinative SSyntRel
    4) the governing member of the restrictive SSyntRel
    5) the dependent member of the prepositional or conjunctional SSyntRel
Figure 3: Cliticization rule for personal pronouns

According to this rule, the cliticization of personal pronouns will take place in all cases except those illustrated in (1c), (2b), (3), (5), (6a), (7), (9a/b) and (10). As for the case illustrated in (11), it will be taken care of by filter rules presiding over the construction of the clitic cluster (Milićević, 2007: 109-114).
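The five negative conditions of the pronoun rule can be read as a single decision: the grammeme CLITIC is assigned unless at least one of them holds. The following Python sketch is only a schematic restatement under that reading; the class `PronounContext`, its field names and the function `pronoun_grammeme` are hypothetical, and the conditions are supplied as ready-made booleans rather than computed from an actual SSyntS.

```python
from dataclasses import dataclass


@dataclass
class PronounContext:
    # One flag per condition 1)-5) of the pronoun rule (names are assumptions).
    communicatively_marked: bool = False
    clause_initial_or_after_break: bool = False
    governs_coordination: bool = False
    governs_restrictive_modifier: bool = False
    depends_on_preposition_or_conjunction: bool = False


def pronoun_grammeme(ctx):
    """Return "CLITIC" by default, "TONIC" if any blocking condition holds."""
    blocked = (
        ctx.communicatively_marked
        or ctx.clause_initial_or_after_break
        or ctx.governs_coordination
        or ctx.governs_restrictive_modifier
        or ctx.depends_on_preposition_or_conjunction
    )
    return "TONIC" if blocked else "CLITIC"


# Rekao sam mu: no marked communicative value, no blocking configuration.
print(pronoun_grammeme(PronounContext()))  # CLITIC
# Rekao sam njemu: the pronoun is Focalized.
print(pronoun_grammeme(PronounContext(communicatively_marked=True)))  # TONIC
# Mislim na NJU: the pronoun depends on a preposition.
print(pronoun_grammeme(PronounContext(depends_on_preposition_or_conjunction=True)))  # TONIC
```

The sketch makes the default status of the clitic form explicit: the full form is the outcome only when something in the context calls for it.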

4 Conclusion
The use of clitic forms of Serbian personal pronouns and auxiliary verbs is the default case, while using tonic forms requires additional conditions. Tonic forms are either freely chosen to express marked values of communicative oppositions or are imposed by specific syntactic configurations/prosodic environments. This is in line with the conclusions of Peti-Stantić (2018) for Croatian; cf.: "Short, clitic forms [in Croatian] are the first (and the only) choice in informationally neutral contexts".
Tonic forms are more prominent morphologically and syntactically: unlike clitics, which are deficient, stress-lacking wordforms, they are full-fledged wordforms and full-fledged sentence elements, less restricted in their linear positioning. Thus, being tonic is a sort of promotion. It is not surprising, then, that tonic forms appear under more involved communicative/syntactic conditions.
To what extent are the conditions that license cliticization similar cross-linguistically? Are the factors identified above for Serbian 2P clitics applicable to clitics of other types? I would expect communicative factors to be more generally applicable than syntactic factors, but this has yet to be determined on a large enough sample of languages.
Given the fact that in some cases a full form of a pronoun/auxiliary is selected freely, to express a communicative opposition, we could ask whether tonicity is really (or only) a syntactic inflectional category. It looks like in this case a syntactic inflectional category has been "enlisted" to express some semantic/communicative information. This situation is similar to gender conversion as a means of expressing some derivational meanings (e.g., in Spanish) or a change of nominal class in order to express plurality (e.g., in Bantu languages), where a syntactic feature is pressed into service for word formation or inflection purposes.