A Concatenation Operation to Derive Autosegmental Graphs

Autosegmental phonology represents words with graph structures. This paper introduces a way of reasoning about autosegmental graphs as strings of concatenated graph primitives. The main result shows that the sets of autosegmental graphs so generated obey two important, putatively universal, constraints in phonological theory provided that the graph primitives also obey these constraints. These constraints are the Obligatory Contour Principle and the No Crossing Constraint. Thus, these constraints can be understood as being derived from a finite basis under concatenation. This contrasts with (and complements) earlier analyses of autosegmental representations, where these constraints were presented as axioms of the grammatical system. Empirically motivated examples are provided.


Introduction
Autosegmental phonology represents words with graph structures. This paper provides a new way of defining the set of valid autosegmental representations through concatenating a finite set of graph primitives with particular properties. This 'bottom-up' approach to formalizing autosegmental representations (henceforth APRs) contrasts with the 'top-down', axiomatic approach of previous formalizations of APRs (Goldsmith, 1976; Bird and Klein, 1990; Coleman and Local, 1991; Kornai, 1995). However, we show that APR graphs constructed in the way we define satisfy these axioms. One advantage of this perspective is that it brings out the string-like quality of APRs, in that they can be generated by the concatenation of a finite set of primitives. Furthermore, it shows that two putatively universal constraints, the Obligatory Contour Principle and the No Crossing Constraint (see below), are guaranteed to hold of autosegmental representations provided the graph primitives also obey these constraints. In other words, concatenation preserves these properties. Finally, the empirical generalization that languages may exhibit unbounded spreading but not unbounded contours is naturally expressed by this finite set of primitives, as spreading is derivable through concatenation but the only available contours are those found in the set of graph primitives. In short, important properties of autosegmental representations of words can be understood as being derived from a finite basis under concatenation.
Goldsmith (1976) originally defined APRs as graphs. Likewise, this paper models APRs using graphs representing both the associations and precedence relations of APRs. We apply established graph-theoretic methods to APRs, in particular graph concatenation, as defined by Engelfriet and Vereijken (1997). Engelfriet and Vereijken (1997) generate all graphs from concatenation and sum operations and a finite set of primitives.
What is proposed here is a much weaker version of this idea, using concatenation only to build a specific class of graphs from a set of primitives. In doing so, it is shown how the properties of structures in the generated class derive from the operation and the primitives.
As detailed in the next section, there are several properties that most researchers agree are essential to APRs. One is that their composite autosegments are divided up into disjoint strings called tiers, with associations linking autosegments on different tiers. Second, the No-Crossing Constraint (NCC) (Goldsmith, 1976; Hammond, 1988; Coleman and Local, 1991) states that these associations cannot 'cross'; i.e., they must respect the precedence relations on each tier. Finally, the Obligatory Contour Principle (OCP) (Leben, 1973) states that on the melody tier adjacent autosegments cannot be identical.
Formal treatments of these properties, starting with Goldsmith (1976), state these properties as axioms. For example, Bird and Klein (1990) provide a model-theoretic definition of APRs given a particular interpretation of association as overlap, and state axioms restricting the overlap relation. More recently, Jardine (2014) axiomatizes the NCC and one-to-one association in monadic second-order logic. Kornai (1995)'s treatment defines concatenation operations similar to the one given here, but his definition of APRs as bistrings does not derive from these operations. As a result, key properties like the NCC must be specified as axioms.
Instead, the current paper shows that the NCC and OCP can be derived by a concatenation operation alone, given a well-defined set of primitives. This paper is structured as follows. §2 details the set of properties phonologists deem important for APRs. §3 gives the relevant mathematical preliminaries, and §4 defines APRs as graphs and how the properties in §2 can be formalized as axioms. §5 defines a concatenation operation over graphs, and §6 proves how APR graphs derived using this concatenation operation obey the relevant axioms from §4. §7 then shows how to describe some common natural language phenomena using concatenation, as well as some phenomena that raise issues for concatenation. §8 reviews the advantages of viewing APRs through concatenation and discusses future work, and §9 concludes.

Basics of Autosegmental Phonology
Autosegmental phonology (AP) (Goldsmith, 1976; Goldsmith, 1979; Clements, 1976; McCarthy, 1979; McCarthy, 1985) has been a widely adopted theory of phonological representations in which phonological units, called autosegments, appear on one of some finite set of strings, or tiers, and are related to autosegments on other tiers by association. Such autosegmental representations (APRs) are usually depicted with the tiers as vertically separated strings of symbols and the association relation shown as lines drawn between autosegments, as in (1) below.
The core insight APRs express is that a single autosegment on one tier may be associated to multiple autosegments on another tier, as in (1). For purposes of exposition, this paper focuses on two-tiered APRs: a melody tier, which carries featural information, and a timing tier, which represents how features on the melody tier are pronounced in the linear speech stream. For example, in tonal phonology, APRs often comprise a melody tier over the symbols {H, L} for high and low tones and a timing tier over {µ} for morae (the timing unit most commonly associated with tone). The APR in (1c) thus represents a high-toned mora followed by a falling tone mora.
Thus, the insights of autosegmental phonology can be studied minimally with two-tier APRs, and so this paper focuses on two-tier APRs. However, in practice, APRs often use more than two tiers. As we explain at the appropriate points throughout the paper, the concepts discussed here can be straightforwardly applied to AP graphs with multiple tiers.
Two principles have been seen as crucial to constraining the theory of APRs. One is the No Crossing Constraint (NCC) (Goldsmith, 1976; Hammond, 1988; Coleman and Local, 1991), which states that if autosegment a is associated to autosegment y, no autosegment b which follows a on its tier may be associated to an autosegment x which precedes y. An example APR violating the NCC is given in (2a). The other principle is the Obligatory Contour Principle (OCP), which states that on each tier, adjacent autosegments must be different (Leben, 1973; McCarthy, 1986). The APR in (2b) violates the OCP.
Formal definitions of the NCC and OCP will be given in the following section, after we have defined APRs explicitly in terms of graphs. The NCC is usually considered to be inviolable, whereas the OCP is considered violable by some authors (Odden, 1986). This paper treats the OCP as an inviolable principle, although this point is returned to in §8.
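As a rough illustration of the two principles just described, the following Python sketch checks them on a simplified encoding. The encoding is an assumption made for illustration and is not the formalism defined later: each tier is a list of labels, and associations are pairs of list positions (melody position, timing position). The function names are hypothetical.

```python
def violates_ncc(assoc):
    """True if two association lines cross.

    assoc: set of (melody_pos, timing_pos) pairs (an assumed encoding).
    Lines (a, y) and (b, x) cross when b follows a but x precedes y.
    """
    return any(a < b and x < y
               for (a, y) in assoc
               for (b, x) in assoc)


def violates_ocp(melody):
    """True if two adjacent autosegments on the melody tier are identical."""
    return any(p == q for p, q in zip(melody, melody[1:]))


# Melody H L over two morae: H linked to both morae, L linked to the second
# (a high-toned mora followed by a falling-tone mora).
falling = {(0, 0), (0, 1), (1, 1)}
# Two lines that cross: the first tone links to the second mora and vice versa.
crossing = {(0, 1), (1, 0)}
```

Under this encoding, `violates_ncc(crossing)` holds while `violates_ncc(falling)` does not, and `violates_ocp(["H", "H", "L"])` flags the adjacent identical tones.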
It is often, but not always, assumed that the sets of autosegments which are allowed to appear on each tier are disjoint. This assumption is usually adhered to in tonal and featural APRs, but not always in morphological APRs in which separate tiers represent separate morphemes (a la McCarthy (1979)). Here, we assume that the sets of elements allowed to appear on each tier are disjoint, and leave theories of APRs which allow a particular autosegment to appear on multiple tiers for future work.

Preliminaries
Let $\mathbb{N}$ represent the natural numbers. Given a set $X$ of elements, a partition $P$ is a set $\{X_0, X_1, \ldots, X_n\}$ of nonempty subsets or blocks of $X$ such that $X$ is the union of these blocks and for all distinct $X_i, X_j \in P$, $X_i \cap X_j = \emptyset$. $P$ induces an equivalence relation $\sim_P$ over $X$ such that for all $x, y \in X$, $x \sim_P y$ iff for some $X_i \in P$, $x \in X_i$ and $y \in X_i$. We also say $\sim_P$ partitions $X$ into $P$. A partition $P$ is said to refine another partition $P'$ iff every block of $P'$ is a union of blocks of $P$. We also say $\sim_P$ is then finer than $\sim_{P'}$. If $R$ is a relation on $X$ then let $\sim_R$ denote the finest equivalence relation on $X$ containing $R$.
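The finest equivalence relation containing a relation R can be computed with a standard union-find structure. The Python sketch below is only an illustration of the definitions above; the function names are hypothetical, and partitions are represented as sets of frozensets.

```python
def finest_equivalence_classes(elements, relation):
    """Blocks of the finest equivalence relation on `elements` containing `relation`."""
    parent = {x: x for x in elements}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for x, y in relation:
        parent[find(x)] = find(y)  # put x and y in the same block

    blocks = {}
    for x in elements:
        blocks.setdefault(find(x), set()).add(x)
    return {frozenset(block) for block in blocks.values()}


def refines(p, q):
    """True if partition p refines partition q (both partitions of the same set)."""
    return all(any(block <= other for other in q) for block in p)


classes = finest_equivalence_classes({1, 2, 3, 4}, {(1, 3)})
```

Here `classes` groups 1 and 3 together and leaves 2 and 4 in singleton blocks; the discrete partition of {1, 2, 3, 4} refines it, but not conversely.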
If Σ is a finite alphabet of symbols, then Σ * denotes the set of all strings over that alphabet, including the empty string λ. We consider here alphabets structured by partitions. We refer to a partition T = {T 0 , T 1 , ..., T n } of Σ as a tier partition over Σ, and refer to some block T i in T as a tier alphabet.
A labeled mixed graph is a tuple $\langle V, E, A, \ell \rangle$ where $V$ is a set of nodes, $E$ is the set of undirected edges, $A$ is the set of directed edges (or arcs), and $\ell : V \to \Sigma$ is a total labeling function assigning each node in $V$ a label in an alphabet $\Sigma$. For elements of the set $V$ we will use early elements in $\mathbb{N}$. An undirected edge is a set $\{x, y\}$ of cardinality 2 of nodes $x, y \in V$, and a directed edge is a 2-tuple $(x, y)$ of nodes in $V$. When not obvious from context, the elements of a graph $G$ will be marked with subscripts; e.g., $V_G$. Let $G_\lambda$, the empty graph, refer to the graph $\langle \emptyset, \emptyset, \emptyset, \emptyset \rangle$.
Unless otherwise noted, all graphs in this paper are labeled mixed graphs, and thus will simply be referred to as graphs. All graphs are also assumed to be simple graphs without multiple edges; $\{x, y\} \in E$ implies $(x, y) \notin A$, and $(x, y) \in A$ implies $\{x, y\} \notin E$. Let $GR(\Sigma)$ denote the union of $\{G_\lambda\}$ with all graphs whose labels are in $\Sigma$. A graph $H$ is the subgraph of $G$ induced by a set of nodes $X \subseteq V_G$ if $V_H = X$, $E_H = \{\{x, y\} \in E_G \mid x, y \in X\}$, $A_H = \{(x, y) \in A_G \mid x, y \in X\}$, and $\ell_H$ is the restriction of $\ell_G$ to $X$. In other words, $H$ has exactly the edges in $G$ that appear between the nodes in $X$. We say $X$ induces $H$ and also write $G[X]$ for $H$. By a partition of G we refer to some set

APRs as graphs
Here we define autosegmental graphs (APGs), or explicit graph representations of APRs. In this section, the set of valid APGs is defined axiomatically based on the phonological principles discussed in §2. In §6.2 we show that these principles can all be derived from graph concatenation. For an APG $G$, $A$ represents the ordering relation on each tier, and $E$ represents the association relations between them. We first define the tiers as subgraphs of $G$ that are string graphs for which $A$ represents the successor relation (Engelfriet and Hoogeboom, 2001). Let $\leq$ be the reflexive, transitive closure of $A$. That is, for any $x, y \in V$, if $x \leq y$ then either $x = y$ or there is a directed path from $x$ to $y$.
Definition 1 A graph is a string graph if $E = \emptyset$ and its relation $\leq$ is a total order on $V$.
Let $\sim_A$ be the smallest equivalence relation containing the symmetric closure of $\leq$. The first axiom says $\sim_A$ partitions $V$ into two tiers.
Axiom 1 $V$ is partitioned by $\sim_A$ into at most two sets $V_0, V_1$ such that $G[V_0]$ and $G[V_1]$ are string graphs. $V_0$ and $V_1$ are the tiers of $G$.
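Definition 1 and Axiom 1 can be checked mechanically on small graphs. The following Python sketch is one possible reading of them, with hypothetical function names: it computes the reflexive, transitive closure of the arcs, tests whether that closure is a total order, and recovers the tiers as the blocks of the smallest equivalence relation containing the arcs.

```python
def closure(nodes, arcs):
    """Reflexive, transitive closure of the arc relation on `nodes`."""
    rel = {(v, v) for v in nodes} | set(arcs)
    while True:
        extra = {(x, w) for (x, y) in rel for (z, w) in rel if y == z} - rel
        if not extra:
            return rel
        rel |= extra


def is_string_graph(nodes, edges, arcs):
    """Definition 1: no undirected edges, and the closure is a total order."""
    if edges:
        return False
    le = closure(nodes, arcs)
    total = all((x, y) in le or (y, x) in le for x in nodes for y in nodes)
    antisymmetric = all(x == y for (x, y) in le if (y, x) in le)
    return total and antisymmetric


def tiers(nodes, arcs):
    """Blocks of the smallest equivalence relation containing the closure."""
    le = closure(nodes, arcs)
    eq = closure(nodes, le | {(y, x) for (x, y) in le})
    return {frozenset(y for y in nodes if (x, y) in eq) for x in nodes}


# Node 1 is a lone melody node; nodes 2 and 3 form a timing tier 2 -> 3.
example_tiers = tiers({1, 2, 3}, {(2, 3)})
```

A graph would then satisfy Axiom 1 under this reading when `tiers` returns at most two blocks and each block, with the arcs restricted to it, passes `is_string_graph`.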
The second axiom, related to Axiom 1, is that the partition of G into tiers respects some partition of Σ.

Axiom 2 There is some tier partition
Axiom 2 corresponds to the principle discussed in §2 that each kind of autosegment may only appear on a particular tier. Note that a tier in $G$ thus corresponds to a tier alphabet in $T$. For notational brevity, we mark this with matching subscripts; e.g., the tier $V_i$ corresponds to the tier alphabet $T_i$. Axiom 3 governs the general form of associations.
This simply states that the undirected edges, which again represent associations, must have one end in each tier. Thus, as noted by Coleman and Local (1991), the set of associations between two tiers in an APG forms a bipartite undirected graph $\langle V, E, \ell \rangle$ where the two parts are the tiers $V_0$ and $V_1$.
Having defined the structure of APGs in Axioms 1 through 3, we now define the NCC and OCP.
Finally, Axiom 5 defines the OCP. Recall that the OCP only holds at the melodic level, so we choose only one of the tiers V m for the OCP to hold.

Axiom 5 (OCP) For one tier
This concludes the axioms for APGs. For an alphabet $\Sigma$ and tier partition $T = \{T_m, T_t\}$ over $\Sigma$, let $APG(\Sigma, T)$ denote the class of APGs obeying the tier partition $T$ of $\Sigma$, where for each $G \in APG(\Sigma, T)$, $\ell$ maps elements in the tier $V_m$ adhering to Axiom 5 to $T_m$. §6 shows how to derive these axioms from the concatenation, as defined in the following section, of an alphabet of graph primitives with certain properties.
These axioms can be extended to graphs with more than two tiers. Instead of binary partitions, $\Sigma$ and $V$ could be partitioned into $\{T_0, T_1, \ldots, T_n\}$ and $\{V_0, V_1, \ldots, V_n\}$, respectively. In this case, Axiom 3 would specify a single tier in which all undirected edges must have one end. Axiom 5 would then hold for all tiers besides this tier. This results in 'paddle-wheel' APRs, like those defined by Pulleyblank (1986). Theories of feature geometry (Archangeli and Pulleyblank, 1994; Clements and Hume, 1995; Sagey, 1986) could also be accommodated by positing additional structure on $T$. This, however, shall be left for future work.

Concatenation
This section defines a concatenation operation ($\bullet$) based on that of Engelfriet and Vereijken (1997). Engelfriet and Vereijken (1997)'s operation merges nodes of graphs with specified beginning and end points; here, we use the tier structure to determine how the graphs are concatenated. We thus define $G_1 \bullet G_2$ for two graphs $G_1, G_2$ in $GR(\Sigma)$ given a tier partition $T = \{T_m, T_t\}$ over $\Sigma$. The basic idea is to connect, if they exist, the last node of the first graph and the first node of the second graph for each tier. Such 'end nodes' with identical labels in the $T_m$ tier alphabet are merged, whereas end nodes with labels in the timing tier alphabet and nodes with nonidentical labels in the melody tier alphabet are connected via a directed edge. As shown in §6.2 and §7, it is this 'merging' that derives both the OCP and spreading for APGs constructed this way. As the concatenation operation is defined over graphs in $GR(\Sigma)$, it is at first very general and not of any phonological interest. However, we show in §6 that concatenation can be used to define a set of APGs that follow the axioms in §4.

Definition
We assume that $G_1$ and $G_2$ are disjoint (i.e., that $V_1$ and $V_2$ are disjoint sets); if $G_2$ is not disjoint with $G_1$, then we replace it with a graph isomorphic to $G_2$ that is disjoint with $G_1$. We use two partial functions $\mathit{first} : GR(\Sigma) \times T \to \mathbb{N}$ and $\mathit{last} : GR(\Sigma) \times T \to \mathbb{N}$ which pick out the first and last nodes on a particular tier in a graph.
Example 1 Consider two graphs $G_1$ and $G_2$ with edges and labeling as in Figure 2. Node indices are given as subscripts on the node labels. For instance, $\mathit{last}(G_1, T_m) = 1$.
[Figure 2: Two graphs in $GR(\Sigma)$]
The concatenation operation combines the graphs, either merging or drawing arcs between the first and last nodes on each tier, depending on their labels. The operation can be broken down into multiple steps as follows. First, we define the graph $G_{1,2}$ as the pairwise union of $G_1$ and $G_2$. We denote $V_1 \cup V_2$ with $V_{1,2}$ and so on.
Next, two binary relations over the nodes of $G_{1,2}$ are defined. $R$ pairs the last element in $G_1$ and the first element in $G_2$ of each tier. $R_{ID}$ is a restriction on $R$ to pairs that share identical labels, excluding nodes whose labels are in $T_t$.
We also often refer to the complement of $R_{ID}$ with respect to $R$; $\overline{R_{ID}} \stackrel{\mathrm{def}}{=} R - R_{ID}$. We can then use Engelfriet and Vereijken (1997)'s merging operation, which reduces a graph $G$ with any relation $R \subseteq V \times V$ over its nodes. Informally, nodes which stand in the relation are merged; everything else stays the same. Given any such relation $R$, we consider $\sim_R$, the finest equivalence relation on $V$ containing $R$. In the usual way, let $[v]_R$ denote the equivalence class of $v$ under $\sim_R$. Here, we use $\sim_{R_{ID}}$, which assigns each node its own equivalence class, except for pairs $(v, v') \in R_{ID}$ of last and first nodes with identical labels, which are lumped together.
Example 2 Continuing with G 1 and G 2 from Example 1, G 1,2 is given in Figure 3a.
Given a graph $G$ and a relation $R \subseteq V \times V$, Engelfriet and Vereijken (1997) define $V/R$ as the set of equivalence classes of $\sim_R$. This simply 'merges' the nodes of $V$ based on the equivalence relation $\sim_R$. $G/R$ can then be defined as the graph reduced by this merged set of nodes; $\langle V/R, E, A, \ell \rangle$.
The final step is to add precedence arcs to connect the pairs in $\overline{R_{ID}} = R - R_{ID}$, the unmerged last and first nodes in $G_{1,2}/R_{ID}$. Again, $\overline{R_{ID}}$ is the pairs of last/first nodes on the melody tier that are not identical and the last/first pair on the timing tier, which are never merged.
Definition 2 (Concatenation of APGs). The concatenation $G_1 \bullet G_2$ of graphs $G_1$ and $G_2$ in $GR(\Sigma)$ is the graph $G_{1,2}/R_{ID}$ with the pairs in $\overline{R_{ID}} = R - R_{ID}$ added to its set of arcs.
Example 3 The concatenation of $G_1$ and $G_2$ is given in Figure 4. The node numbered 1, 3 represents the nodes from Fig. 3 which have been merged. Note also the added directed edge (2, 4) from $\overline{R_{ID}}$ in Example 2.
Technically, the resulting set $V_{1,2}/R_{ID}$ is a set of sets of nodes representing the equivalence classes of $\sim_{R_{ID}}$. Represented strictly in this way, successive concatenations will yield sets of sets of sets of nodes, ad infinitum. For example, concatenating a third graph, such as $G_3$ in Figure 5 below, to $G_1 \bullet G_2$ would further merge node $\{1, 3\}$ with node 5 in $G_3$. Strictly speaking, the resulting node is $\{\{1, 3\}, \{5\}\}$. For clarity, we instead represent each node in this case as the union of the elements of each member of its equivalence class, e.g. $\{1, 3, 5\}$ for the concatenation $(G_1 \bullet G_2) \bullet G_3$ in Figure 5. This convenient renaming 'flattens out' the nested sets. It does not result in any loss of generality because union is associative. Also, it will be useful later when showing concatenation is associative for the particular class of graphs described in §6.
Importantly, the relations $R$ and $R_{ID}$ do not depend on a binary partition over $\Sigma$; they only require that one tier alphabet $T_t$ for the timing tier be specified. Thus, while the examples given here focus on two tiers, this operation is defined for graphs representing APRs with multiple melody tiers.
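The steps above (union, the relations R and R_ID, merging, and adding arcs for the unmerged pairs) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: graphs are dictionaries with keys "V", "E", "A", and "label"; nodes are frozensets of integers so that merging is a flattened union; the input graphs are assumed to be disjoint with string-graph tiers; and the two sample graphs are hypothetical, not necessarily those of Figure 2.

```python
def first_last(g, tier):
    """(first, last) node of g on the given tier alphabet, or None if the tier is empty."""
    nodes = [v for v in g["V"] if g["label"][v] in tier]
    if not nodes:
        return None
    has_pred = {y for (_, y) in g["A"]}
    has_succ = {x for (x, _) in g["A"]}
    first = next(v for v in nodes if v not in has_pred)
    last = next(v for v in nodes if v not in has_succ)
    return first, last


def concat(g1, g2, tiers, timing):
    """Concatenate two disjoint graphs, merging identical melody end nodes."""
    # Step 1: pairwise union of the two graphs.
    V = g1["V"] | g2["V"]
    E = g1["E"] | g2["E"]
    A = g1["A"] | g2["A"]
    label = {**g1["label"], **g2["label"]}
    # Step 2: R pairs last(g1) with first(g2) on each tier present in both.
    R = set()
    for tier in tiers:
        ends1, ends2 = first_last(g1, tier), first_last(g2, tier)
        if ends1 and ends2:
            R.add((ends1[1], ends2[0]))
    # R_ID: pairs with identical labels outside the timing tier alphabet.
    R_id = {(x, y) for (x, y) in R
            if label[x] == label[y] and label[x] not in timing}
    R_rest = R - R_id
    # Step 3: merge the pairs in R_ID (flattened union of node sets).
    rename = {v: v for v in V}
    for x, y in R_id:
        rename[x] = rename[y] = x | y
    # Step 4: rebuild the graph and add arcs for the unmerged pairs.
    return {
        "V": {rename[v] for v in V},
        "E": {frozenset(rename[v] for v in e) for e in E},
        "A": {(rename[x], rename[y]) for (x, y) in A | R_rest},
        "label": {rename[v]: label[v] for v in V},
    }


def n(i):
    return frozenset({i})


MELODY, TIMING = {"H", "L"}, {"μ"}
# Two hypothetical graphs, each a single H associated to a single mora.
G1 = {"V": {n(1), n(2)}, "E": {frozenset({n(1), n(2)})}, "A": set(),
      "label": {n(1): "H", n(2): "μ"}}
G2 = {"V": {n(3), n(4)}, "E": {frozenset({n(3), n(4)})}, "A": set(),
      "label": {n(3): "H", n(4): "μ"}}
G12 = concat(G1, G2, [MELODY, TIMING], TIMING)
```

With these inputs the two H nodes merge into the node {1, 3}, and the arc (2, 4) is added between the two morae, mirroring the description in Example 3.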

Properties
This section proves two important properties of concatenation: that $G_\lambda$ is the identity for $\bullet$, and that for any tier in both $G_1$ and $G_2$, $G_1 \bullet G_2$ contains a string graph corresponding to those tiers.
Theorem 1 G λ is the identity element for the • operation. That is, for any G ∈ GR(Σ), G • G λ = G λ • G = G.
Proof: Let $G = \langle V, E, A, \ell \rangle$. We first consider $G_\lambda \bullet G$. Recall that the concatenation of two graphs is a modification of their disjoint union. From the properties of the union operation, we know that the disjoint union of $G_\lambda$ and $G$ is $G$. Note that $\mathit{first}(G_\lambda, T_i)$ and $\mathit{last}(G_\lambda, T_i)$ are undefined for all $T_i \in T$, because the set of nodes is empty in $G_\lambda$. Thus, $R = \emptyset$, and so $R_{ID} = \overline{R_{ID}} = \emptyset$. Because $R_{ID} = \emptyset$, $V/R_{ID} = V$, because the smallest equivalence relation containing $\emptyset$ is $=$. Thus, $G_\lambda \bullet G = G$. The case of $G \bullet G_\lambda$ is symmetric.
The next lemma shows that concatenation preserves the string graph properties of any tiers in $G_1$ and $G_2$. This is important for showing the associativity of concatenation under certain graph classes, as will be discussed in §6.
Lemma 1 Let $U_i$ and $V_i$ denote the set of all nodes in $G_1$ and $G_2$, respectively, with labels in some $T_i \in T$. If $G_1[U_i]$ and $G_2[V_i]$ are string graphs, or if one of them is a string graph and the other node set is empty, then $(G_1 \bullet G_2)[W_i]$ is a string graph, where $W_i$ is the set of all nodes in $G_1 \bullet G_2$ whose labels are in $T_i$. Furthermore, for any $T_i$, if $v = \mathit{first}(G_1, T_i)$, then $\mathit{first}(G_1 \bullet G_2, T_i)$ is the unique node in $G_1 \bullet G_2$ which contains $v$, and likewise for $\mathit{last}(G_2, T_i)$.
Proof: This follows immediately from the definition of concatenation if $G_1[U_i]$ is a string graph and $V_i$ is empty, because then $\mathit{first}(G_2, T_i)$ will be undefined and no member of $U_i$ will appear in $R$, and thus all will appear in $G_1 \bullet G_2$ unmodified and with no new arcs associated with them. Thus, $(G_1 \bullet G_2)[W_i] = G_1[U_i]$, and so both are string graphs. The proof for the case in which $U_i$ is empty and $G_2[V_i]$ is a string graph is very similar.
For the final case, recall that a graph $G$ is a string graph iff its set of arcs $A$ forms a total order on its nodes $V$. For the case in which $G_1[U_i]$ and $G_2[V_i]$ are string graphs, $v_1 = \mathit{last}(G_1, T_i)$ and $v_2 = \mathit{first}(G_2, T_i)$, the pair $(v_1, v_2)$ appears in either $R_{ID}$ or $\overline{R_{ID}}$. If the pair is in $R_{ID}$, $v_1$ and $v_2$ are merged into a node $v_{1,2}$ and no new arcs are introduced to the set $A_i$ of the arcs in $(G_1 \bullet G_2)[W_i]$; the arcs that ended in $v_1$ and the arcs that began in $v_2$ now end and begin in $v_{1,2}$, respectively, which maintains the total orders of both $U_i$ and $V_i$.
If the pair is instead in $\overline{R_{ID}}$, then the arcs of $G_1[U_i]$, the arcs of $G_2[V_i]$, and the new arc $(v_1, v_2)$ are all in $A_i$, which also maintains the total order.
That for v = first(G 1 , T i ), first(G 1 • G 2 , T i ) is the unique node which contains v follows directly from the fact that the total order on U i is maintained. Likewise for v = last(G 2 , T i ) and V i .
These properties allow us to treat sets of graphs parallel to sets of strings, as the next section shows.

Alphabets of graph primitives
As Engelfriet and Vereijken (1997) observe, given a concatenation operation a class of graphs can be seen as an interpretation of a set of strings, where each symbol in the string corresponds to a graph primitive. We now define an APG graph primitive.

Definition 3
Over an alphabet $\Sigma$ and tier partition $T = \{T_t, T_m\}$, an APG graph primitive is a graph $G \in GR(\Sigma)$ which has the following properties:
We can then treat a finite set of primitives like an alphabet of symbols:
Definition 4 An alphabet of graph primitives over $GR(\Sigma)$ is a finite set $\Gamma$ of symbols and a naming function $g : \Gamma \to GR(\Sigma)$.
An alphabet of APG graph primitives is thus a $\Gamma$ such that for all $\gamma \in \Gamma$, $g(\gamma)$ satisfies Definition 3. The strings in $\Gamma^*$ thus represent a class of graphs, which we will call $APG(\Gamma)$. We define $APG(\Gamma)$ by extending $g$ to strings in $\Gamma^*$.

Definition 5
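For an alphabet of graph primitives $\Gamma$ with naming function $g$, extend $g$ to strings in $\Gamma^*$ as $g(\lambda) = G_\lambda$ and $g(w\gamma) = g(w) \bullet g(\gamma)$ for $w \in \Gamma^*$ and $\gamma \in \Gamma$. $APG(\Gamma)$ is then the set $\{g(w) \mid w \in \Gamma^*\}$.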
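Extending a naming function from symbols to strings amounts to a left fold of concatenation over the string, starting from the empty graph. The Python sketch below illustrates this on a simplified encoding that is an assumption made for brevity: a two-tier graph is a triple (melody labels, timing labels, associations as pairs of positions), which is adequate when both tiers are string graphs. The primitive names "H", "L", and "F" are hypothetical.

```python
from functools import reduce

EMPTY = ((), (), frozenset())  # stands in for the empty graph


def concat(g1, g2):
    """Concatenate two list-encoded graphs, merging identical melody end nodes."""
    m1, t1, a1 = g1
    m2, t2, a2 = g2
    merge = bool(m1) and bool(m2) and m1[-1] == m2[0]
    shift_m = len(m1) - (1 if merge else 0)  # new position of g2's first melody node
    shift_t = len(t1)                        # timing nodes are never merged
    melody = m1 + (m2[1:] if merge else m2)
    timing = t1 + t2
    assoc = a1 | {(i + shift_m, j + shift_t) for (i, j) in a2}
    return (melody, timing, frozenset(assoc))


def g(word, primitives):
    """Extend the naming function from symbols to strings by folding concat."""
    return reduce(concat, (primitives[symbol] for symbol in word), EMPTY)


# Hypothetical primitives: a high, a low, and a falling timing unit.
PRIMS = {
    "H": (("H",), ("σ",), frozenset({(0, 0)})),
    "L": (("L",), ("σ",), frozenset({(0, 0)})),
    "F": (("H", "L"), ("σ",), frozenset({(0, 0), (1, 0)})),
}
```

On this encoding `EMPTY` behaves as an identity for `concat`, and regrouping a three-way concatenation of the sample primitives does not change the result.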

Derived properties
We now show that if Γ is an alphabet of APG graph primitives, then AP G(Γ) has a number of desirable properties. The following assumes Γ is an alphabet of APG graph primitives. First, we prove the following theorem stating that all graphs in AP G(Γ) follow Axioms 1 through 3 from §4 regarding the general structure of APGs.
Theorem 2 For any G ∈ AP G(Γ), G satisfies Axiom 1 (that ∼ A partitions V into at most two sets V 0 and V 1 such that G[V 0 ] and G[V 1 ] are string graphs), Axiom 2 (that the tiers of G correspond to the partition T ), and Axiom 3 (that the ends of all undirected edges are between different tiers).
Proof: That G satisfies Axioms 1 and 2 follows directly from parts (a) and (b) of Definition 3 and the fact that concatenation only adds arcs between nodes whose labels are in the same T i ∈ T . That G[V 0 ] and G[V 1 ] are string graphs follows from parts (a) and (b) of Definition 3 and Lemma 1.
That G follows Axiom 3 follows directly from Part (c) of Definition 3 and the fact that concatenation adds no new undirected edges to E.
Next, concatenation is associative over AP G(Γ) . The following lemma allows one to prove Theorem 3 (associativity) below.
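Lemma 2 For any $u, v \in \Gamma^*$ denote $g(u), g(v) \in APG(\Gamma)$ with $G_u$ and $G_v$ respectively. Then for any $\gamma \in \Gamma$, with $G_\gamma = g(\gamma)$, $(G_u \bullet G_v) \bullet G_\gamma = G_u \bullet (G_v \bullet G_\gamma)$.
Proof: Let $\langle V, E, A, \ell \rangle = (G_u \bullet G_v) \bullet G_\gamma$ and $\langle V', E', A', \ell' \rangle = G_u \bullet (G_v \bullet G_\gamma)$. That $E = E'$ and $\ell = \ell'$ follow from Definition 2 of concatenation and associativity of union.
To show $V = V'$, there are seven relevant cases to consider. Let $V_u$, $V_v$, and $V_\gamma$ denote the sets of nodes for $G_u$, $G_v$, and $G_\gamma$, respectively, and let $v_u$ denote a node in $V_u$, etc. As merging is accomplished through grouping nodes into equivalence classes, all nodes in $V$ or $V'$ thus correspond to one of seven kinds of sets: the singletons $\{v_u\}$, $\{v_v\}$, and $\{v_\gamma\}$ (Cases 1-3), the pairs $\{v_u, v_v\}$, $\{v_v, v_\gamma\}$, and $\{v_u, v_\gamma\}$ (Cases 4-6), or the triple $\{v_u, v_v, v_\gamma\}$ (Case 7) (recall that we do not distinguish between nodes representing sets and nodes representing sets of sets).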
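Cases 1-3. We first establish that when $v \in V$ corresponds to a singleton set, $v \in V'$. Consider the case when $v \in V$ corresponds to $\{v_v\}$, i.e., when $v_v$ has not been merged. For $V$, this is exactly the case in which $v_v$ merges neither with a node of $G_u$ when forming $G_u \bullet G_v$ nor with a node of $G_\gamma$ afterwards. If $v_v$ does not merge with any $v_\gamma$, then either there is no compatible $v_\gamma$ or $v_v$ is not the last node in $G_v$ for any $T_i$, as by Theorem 2 and Lemma 1 the last node for $T_i$ in $G_u \bullet G_v$ must be the unique set which includes the last node for $T_i$ in $G_v$. If $v_v$ does not merge with any $v_u$, then either $v_v$ is not the first node in $G_v$ or there is no $v_u$ with which it can merge. Thus, either $\{v_v\}$ is not the first node in $G_v \bullet G_\gamma$ (again by Lemma 1) or there is no node $v_u$ to merge with $\{v_v\}$, and so $\{v_v\}$ is also unmerged in $V'$. The proofs for when $v$ corresponds to $\{v_u\}$ and $\{v_\gamma\}$ are very similar. The proofs that $v' \in V'$ implies $v' \in V$ for all three cases are identical.
The remaining cases deal with merged nodes.
Cases 4-6. Consider the case in which $v \in V$ is $\{v_u, v_v\}$, corresponding to merged nodes from $V_u$ and $V_v$. This is the case in which $v_u$ and $v_v$ merge when forming $G_u \bullet G_v$ and the resulting node does not merge with any $v_\gamma$. The cases of $\{v_v, v_\gamma\}$ and $\{v_u, v_\gamma\}$ are parallel. The latter is a special case in which $V_v$ has no nodes for some $T_i$, but $v_u$ and $v_\gamma$ are compatible to merge.
Case 7. The remaining case is a node $\{v_u, v_v, v_\gamma\}$ merged from all three graphs.
That $A = A'$ is very similar to the proof for $V = V'$. Let $A_i$ denote the set of arcs in $g(\gamma_i)$.
Let $R_{ID-u,v}$ and $\overline{R_{ID-u,v}}$ denote the relations $R_{ID}$ and $\overline{R_{ID}}$ used in forming $G_u \bullet G_v$; the relations for the other concatenations are defined parallel to these. As union is associative, it is sufficient to show that every pair connected by a new arc in forming $(G_u \bullet G_v) \bullet G_\gamma$ is also connected by a new arc in forming $G_u \bullet (G_v \bullet G_\gamma)$ and vice versa, and that every pair merged in one is merged in the other and vice versa. Both of these follow from the fact that $V = V'$ and Lemma 1 in the same way as merging nodes above.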
Next it is shown that graph concatenation is associative over arbitrary graphs in AP G(Γ) with the same kind of inductive argument which establishes concatenation is associative over strings.
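Theorem 3 The $\bullet$ operation is associative over graphs in $APG(\Gamma)$. For any $u, v, w \in \Gamma^*$ denote $g(u), g(v), g(w) \in APG(\Gamma)$ with $G_u, G_v, G_w$ respectively. Then $(G_u \bullet G_v) \bullet G_w = G_u \bullet (G_v \bullet G_w)$.
Proof: The proof is by induction on the size of $w$. For the base case, when $w = \lambda$, $G_w = G_\lambda$. Then $G_u \bullet (G_v \bullet G_\lambda) = G_u \bullet G_v$ by Theorem 1. It follows, again by Theorem 1, that $G_u \bullet G_v = (G_u \bullet G_v) \bullet G_\lambda$. Hence the base case is proved.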
Next we assume the inductive hypothesis that associativity holds for strings of length n and we consider any w ∈ Γ * of length n + 1. Clearly there exists x ∈ Γ * of length n and γ ∈ Γ so that w = xγ.
Then, by the induction hypothesis and Lemma 2, we have $(G_u \bullet G_v) \bullet G_w = G_u \bullet (G_v \bullet G_w)$.
The next theorem states that any $G \in APG(\Gamma)$ follows the NCC.

Theorem 4 For any G ∈ AP G(Γ), G satisfies the NCC (Axiom 4).
Proof: The proof is by induction on the length of $w \in \Gamma^*$. $G_\lambda$ trivially satisfies the NCC because it has no nodes. For $g(\gamma)$ for any $\gamma \in \Gamma$, Definition 3 states that there is only one node $v_t$ in $V_t$ and this node must be one of the endpoints for each edge in $E$. Thus for any two edges $\{x, y\}$ and $\{x', y'\}$ in $g(\gamma)$ where $x \leq x'$, it must be the case that $y = y' = v_t$, because directed edges only occur between nodes in tier $V_m$. Thus, any $g(\gamma)$ satisfies the NCC.
Next we assume it holds for w ∈ Γ * of length n and consider any w ∈ Γ * , γ ∈ Γ. Then g(wγ) satisfies the NCC because the graph concatenation operation does not add any undirected edges and because, by Lemma 1 concatenation preserves the order of each tier in g(w) and g(γ).
The final theorem states that any $G \in APG(\Gamma)$ follows the OCP if the graph primitives do.
Theorem 5 For any $G \in APG(\Gamma)$, $G$ satisfies the OCP (Axiom 5) if $g(\gamma)$ satisfies the OCP for all $\gamma \in \Gamma$.
Proof: The proof is again by induction on the length of $w \in \Gamma^*$. The OCP is trivially satisfied for $G_\lambda$ since it contains no nodes or arcs. The case when $|w| = 1$ is given as the condition of the theorem.
Assume that $g(w)$ satisfies the OCP for every $w \in \Gamma^*$ of length $n$. Now consider $G = g(w\gamma)$ with $w$ of length $n$ and $\gamma \in \Gamma$, and write $G_1 = g(w)$ and $G_2 = g(\gamma)$. To see that $G_1 \bullet G_2$ satisfies the OCP, recall from Definition 2 of graph concatenation that the set of arcs for $G_1 \bullet G_2$ is equal to $A_{1,2} \cup \overline{R_{ID}}$; i.e., the union of $A_1$ and $A_2$ and $\overline{R_{ID}}$. By definition, the only pairs of melody tier nodes $(x, y)$ in $\overline{R_{ID}}$ are those with $\ell(x) \neq \ell(y)$, so if $G_1$ satisfies the OCP and $G_2$ satisfies the OCP, $\overline{R_{ID}}$ will not add any arcs on $V_m$ which violate the OCP (recall that the OCP only holds for tier $V_m$), and so $G_1 \bullet G_2$ will also satisfy the OCP.
Thus, the merging part of the concatenation preserves the OCP. One may wonder why the OCP is built in to the concatenation operation this way, instead of using string-like concatenation and then invoking a constraint that merges adjacent, like nodes in the resulting graph. Such a method, though, cannot capture violations of the OCP-all would be merged. The next section shows that the concatenation operation defined here can capture violations by concatenating OCP-violating graph primitives.
This section has thus proved the important properties of AP G(Γ). We now show how such an AP G(Γ) can be used to model autosegmental phenomena in natural language phonology.

Analysis of natural language phenomena
In this section we examine the extent to which the analysis presented here accounts for common and uncommon phenomena in phonological theory. The first two subsections examine spreading and contour tones, respectively, and demonstrate how both phenomena can be effectively represented with an $APG(\Gamma)$ for some $\Gamma$. It is also shown that the empirical generalization that there are only finitely many contour tones present in any given language is an automatic consequence of the finite alphabet $\Gamma$ and the concatenation operation.
The third subsection addresses the few cases where OCP violations may be necessary to properly describe the language. We sketch how these cases could be accounted for by using special graph primitives or a second concatenation operation. Similarly, the fourth subsection addresses underspecification and floating tones. We conclude that these concepts can be represented in this approach. The caveat is that, as a consequence, gapped structures are also permitted. We note that such gapped structures are also permitted under the axiomatic approach given in §4, and we discuss how a different concatenation operation may address this.

Spreading
The 'merging' of nodes on the melody tier models autosegmental spreading, in which one melody unit is associated to more than one timing tier unit. A classic example is Mende (Leben, 1973). Mende nouns separate into tone categories, three of which are shown in Table 1. The first rows show words whose syllables are all high-toned, the second rows show words whose syllables are all low-toned, and the third rows show words whose syllables start high and end low. In the following [á] transcribes a high tone, [à] a low tone, [â] a falling tone.

Table 1
Monosyllables      Disyllables        Trisyllables
kɔ́ 'war'           pɛ́lɛ́ 'house'       háwámá 'waist'
kpà 'debt'         bɛ̀lɛ̀ 'pants'       kpàkàlì 'three-legged chair'
mbû 'owl'          ngílà 'dog'        félàmà 'junction'

An autosegmental analysis for this pattern is that a set number of melodies spread left-to-right over the tone-bearing units (TBUs; we assume that for Mende the TBU is the syllable, σ) of a word, as in Table 2.
The APRs in Table 2 can be generated with the alphabet of APG graph primitives $\Gamma$ given in Figure 7. The alphabet is $\Sigma = \{H, L, \sigma\}$ and the tier partition $T = \{T_t, T_m\}$ where $T_t = \{\sigma\}$ and $T_m = \{H, L\}$. Note that for these APGs, we abstract away from consonants and vowels and focus on the TBU, σ. The APGs corresponding to the trisyllabic forms are thus g(σ́σ́σ́) and g(σ́σ̀σ̀), as in Figure 8.
These spreading effects are achieved by, for example in g(σ́σ́σ́), the like H nodes from each g(σ́) merging during concatenation, resulting in a single H associated to multiple σ nodes (which are not merged, because $\sigma \in T_t$). Note that given $\Sigma$, $T$, $\Gamma$ and $g$, we are able to generate APGs directly from the linear string of toned syllables.
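The Mende forms can be reproduced with a small Python sketch. It rests on assumptions made for illustration only: a two-tier graph is encoded as (melody labels, timing labels, associations as position pairs), and the primitive names "H", "L", and "F" are hypothetical ASCII stand-ins for a high, a low, and a falling syllable; they are not the symbols of Figure 7.

```python
from functools import reduce

EMPTY = ((), (), frozenset())


def concat(g1, g2):
    """Concatenate list-encoded graphs; identical melody end nodes merge."""
    m1, t1, a1 = g1
    m2, t2, a2 = g2
    merge = bool(m1) and bool(m2) and m1[-1] == m2[0]
    shift_m = len(m1) - (1 if merge else 0)
    shift_t = len(t1)
    melody = m1 + (m2[1:] if merge else m2)
    assoc = a1 | {(i + shift_m, j + shift_t) for (i, j) in a2}
    return (melody, t1 + t2, frozenset(assoc))


def g(word, primitives):
    """Build the graph named by a string of primitive symbols."""
    return reduce(concat, (primitives[s] for s in word), EMPTY)


MENDE = {
    "H": (("H",), ("σ",), frozenset({(0, 0)})),               # high syllable
    "L": (("L",), ("σ",), frozenset({(0, 0)})),               # low syllable
    "F": (("H", "L"), ("σ",), frozenset({(0, 0), (1, 0)})),   # falling syllable
}

hawama = g("HHH", MENDE)   # háwámá: one H spread over three syllables
felama = g("HLL", MENDE)   # félàmà: H on the first syllable, L spread over two
mbu = g("F", MENDE)        # mbû: H and L on a single syllable
```

In `hawama` the three H nodes collapse into one melody node linked to all three syllables, while `felama` keeps two melody nodes with the L linked to the last two syllables.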

Contours
Concatenation allows for unbounded spreading, as a single node on the melody tier may 'merge' any number of times. In contrast, concatenation does not allow for unbounded contours, as timing tier nodes do not merge. Figure 9 shows how concatenation obtains APGs corresponding to the APRs for forms with contour tones.
Importantly, any set of graphs is going to have a bound on the number of melody units a contour can have, which follows directly from the fact that $\Gamma$ is finite, that each element of $\Gamma$ has exactly one node on $V_t$, and that concatenation never creates new contours. Thus, for the example $\Gamma$ we have been using for Mende, the graph in Figure 10 is not in $APG(\Gamma)$. While this is a natural property of graphs in $APG(\Gamma)$, the axiomatic approach to defining APRs requires a further axiom stating that for any language, the number of contours must be bounded by some $n$. To our knowledge, the only explicit formalizations of such a constraint are by Jardine (2014) and Yli-Jyrä (2013) (the latter requiring that $n = 2$).
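The bound on contours can be checked empirically on small cases. The Python sketch below enumerates all strings of up to four primitives and compares the largest contour found with the largest contour among the primitives themselves. The list-based encoding and the primitive names are assumptions made for illustration, and an exhaustive check of short strings is of course not a proof.

```python
from collections import Counter
from functools import reduce
from itertools import product

EMPTY = ((), (), frozenset())


def concat(g1, g2):
    """Concatenate list-encoded graphs; identical melody end nodes merge."""
    m1, t1, a1 = g1
    m2, t2, a2 = g2
    merge = bool(m1) and bool(m2) and m1[-1] == m2[0]
    shift_m = len(m1) - (1 if merge else 0)
    shift_t = len(t1)
    melody = m1 + (m2[1:] if merge else m2)
    assoc = a1 | {(i + shift_m, j + shift_t) for (i, j) in a2}
    return (melody, t1 + t2, frozenset(assoc))


def g(word, primitives):
    return reduce(concat, (primitives[s] for s in word), EMPTY)


def max_contour(graph):
    """Largest number of melody nodes associated to a single timing node."""
    _, _, assoc = graph
    counts = Counter(t for (_, t) in assoc)
    return max(counts.values(), default=0)


MENDE = {
    "H": (("H",), ("σ",), frozenset({(0, 0)})),
    "L": (("L",), ("σ",), frozenset({(0, 0)})),
    "F": (("H", "L"), ("σ",), frozenset({(0, 0), (1, 0)})),
}

bound = max(max_contour(p) for p in MENDE.values())
words = (w for length in range(5) for w in product(MENDE, repeat=length))
worst = max(max_contour(g(w, MENDE)) for w in words)
```

For these primitives both `bound` and `worst` come out as 2: no string of primitives yields a timing unit with more melody nodes than the falling primitive already has.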

Violations of the OCP
As discussed in Odden (1986) and Meyers (1997), the OCP may not be an absolute universal. For example, Odden lists the contrasting APRs in Figure 11 for two nouns in Kishambaa (Odden, 1986). This is partially motivated by the different surface pronunciations of the two forms: the first, Figure 11 (a) 'snake', is pronounced with two level H tones, nyóká, and the second, Figure 11 (b) 'sheep', is pronounced with a H followed by a downstepped H, ngó!tó.
The corresponding graphs for these APRs, assuming the mora as the TBU, are given in Figure 12. Given an alphabet of graph primitives obeying the OCP, such as the Γ for Mende in Figure 7, the OCP-violating graph in Figure 12 (b) cannot be generated. There are at least two solutions to admitting graphs like those in Kishambaa. One is to introduce OCP-violating graph primitives, as in Figure 13. Given this alphabet of graph primitives, the spreading Kishambaa graph in Figure 12 (a) is g(γ_1γ_1), and the OCP-violating graph in (b) is g(γ_1γ_2). The graph primitives follow the linear pronunciation of the morae; g(γ_1γ_1) represents a sequence HH of two high-toned morae, and g(γ_1γ_2) a sequence H!H of a high followed by a downstepped high.
Another option is to define a second concatenation operation, in which there is no merging and directed edges are drawn between all last/first pairs. The spreading Kishambaa graph in Figure 12 (a) would be concatenated by the operation defined in this paper, and the OCP-violating graph in Figure 12 (b) would be concatenated by this second no-merging operation. We shall leave it up to future work to compare the theoretical and empirical benefits of these approaches to OCP violations.
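The second option can be sketched as follows, under the assumption of a simplified model in which each tier is a tuple of labels whose order encodes precedence, so that drawing a precedence edge between the last and first nodes amounts to appending. The flag merge_like, the primitive H_MORA, and the function names are hypothetical; the sketch only contrasts the two operations and does not cover the alternative based on OCP-violating primitives.

```python
def concat(g1, g2, merge_like=True):
    """Concatenate two graphs given as (melody, timing, association pairs).

    With merge_like=True, like melody nodes at the seam are merged;
    with merge_like=False, nodes are only placed in sequence.
    """
    (m1, t1, a1), (m2, t2, a2) = g1, g2
    skip = 1 if merge_like and m1 and m2 and m1[-1] == m2[0] else 0
    shifted = {(m + len(m1) - skip, t + len(t1)) for m, t in a2}
    return (m1 + m2[skip:], t1 + t2, frozenset(a1 | shifted))


def violates_ocp(graph):
    """True if two adjacent melody nodes carry the same label."""
    melody = graph[0]
    return any(a == b for a, b in zip(melody, melody[1:]))


# Assumed primitive: one H associated with one mora.
H_MORA = (("H",), ("μ",), frozenset({(0, 0)}))

snake = concat(H_MORA, H_MORA)                    # merging: one H, two morae
sheep = concat(H_MORA, H_MORA, merge_like=False)  # no merging: two H nodes

print(snake, violates_ocp(snake))
print(sheep, violates_ocp(sheep))
```

In this model the merging operation yields the spreading structure for 'snake', while the no-merging operation yields two adjacent H nodes, as in the structure for 'sheep'.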

Underspecification and floating tones
Some graph primitives in Γ may not have any nodes in V_m; these represent underspecified timing units. However, such underspecified graph primitives can give rise to 'gapped structures' via concatenation, as in g(γ_1γ_2γ_1) in Figure 14. This can be seen as an unwelcome consequence, as some researchers have argued against gapped structures (Archangeli and Pulleyblank, 1994). One solution could be to use a second concatenation operation which does not merge nodes, instead only drawing directed edges between the end nodes on each tier. This appears identical to the operation proposed in §7.3 for dealing with OCP violations. Again, studying additional concatenation operations will be left for future work. Finally, graph primitives with more melody tier nodes than timing tier nodes can be used to generate floating tones, as in Figure 15.
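A small sketch can show how a gapped structure and a floating tone arise in a simplified model of concatenation. The primitives below are assumptions chosen for illustration and are not taken from Figures 14 and 15: TONED is an H linked to one timing node, BARE is a timing node with no melody node, and FLOAT has an H linked to its timing node plus an unlinked L. All names are hypothetical.

```python
def concat(g1, g2):
    """Merge like melody nodes at the seam; timing nodes are only appended."""
    (m1, t1, a1), (m2, t2, a2) = g1, g2
    skip = 1 if m1 and m2 and m1[-1] == m2[0] else 0
    shifted = {(m + len(m1) - skip, t + t1) for m, t in a2}
    return (m1 + m2[skip:], t1 + t2, frozenset(a1 | shifted))


def has_gap(graph):
    """True if some melody node skips over a timing node between its associates."""
    melody, _, assoc = graph
    for i in range(len(melody)):
        linked = sorted(t for m, t in assoc if m == i)
        if linked and linked[-1] - linked[0] + 1 != len(linked):
            return True
    return False


def floating(graph):
    """Labels of melody nodes with no association line."""
    melody, _, assoc = graph
    linked = {m for m, _ in assoc}
    return [melody[i] for i in range(len(melody)) if i not in linked]


# Graphs are (melody labels, number of timing nodes, association pairs).
TONED = (("H",), 1, frozenset({(0, 0)}))        # assumed: H linked to one timing node
BARE = ((), 1, frozenset())                     # assumed: underspecified timing node
FLOAT = (("H", "L"), 1, frozenset({(0, 0)}))    # assumed: linked H, unlinked L

gapped = concat(concat(TONED, BARE), TONED)
with_floating = concat(TONED, FLOAT)

print(gapped, has_gap(gapped))
print(with_floating, floating(with_floating))
```

Because the underspecified primitive contributes no melody node, the two H nodes on either side of it are adjacent on the melody tier and merge, leaving a single H linked to the first and third timing nodes but not the second.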

Discussion
The examples in the previous section show several advantages to considering APRs through concatenation. First, as seen in Mende, simple cases allow direct translation of strings into APRs. Second, concatenation allows for unbounded spreading, as a single node on the melody tier may 'merge' any number of times. However, concatenation does not allow for unbounded contours, as timing tier nodes do not merge in this way. Thus, the number of contours is bounded by the number of graph primitives. This reflects the fact that languages exhibit unbounded spreading, but no language (to our knowledge) has an unbounded number of contour segments.
There are several avenues for future work. It was already mentioned that the set of valid autosegmental representations may be expanded by allowing a second concatenation operation. Also, while we have shown that every element of APG(Γ) obeys the axioms in §4, it remains to be shown that for every graph which obeys those axioms, there is a finite alphabet which generates it.
Future work can also study the nature of transformations from underlying APGs with one alphabet to surface APGs with another (for instance, it is known that surface APGs can admit more contours than underlying APGs through association rules).
Another line of development concerns extending the analysis to feature geometry (Clements and Hume, 1995; Sagey, 1986), in which association lines also link featural autosegments and 'organizational' nodes, such as PLACE. Deriving a set of such operations would require more complex primitives and additional marking on the tier partition T, to denote timing tier nodes, organizational nodes, and melody nodes. The concatenation operation would then need to be revised to be sensitive to this marking. A more serious challenge would be adopting a concatenation-based framework for autosegmental morphology, which, as mentioned in §2, disposes of the requirement that autosegments of a particular type must appear on a particular tier.

Conclusion
In this paper we addressed the question of what the set of valid autosegmental representations looks like. In contrast to previous research, which explored this question axiomatically, we showed that autosegmental representations can be generated recursively and constructively from a finite set of graph primitives, a concatenation operation, and an identity element for concatenation, much in the same way that strings can be so generated. Hence, the theory of free monoids may be fruitfully applied to APRs.
The advantages we wish to highlight are as follows. First, we proved that provided the finite set of primitives obeys the NCC and the OCP, the autosegmental representations generated from it will as well. Second, we showed that it also follows naturally from the nature of the alphabet and concatenation that new contour tones cannot be generated ad infinitum. Finally, this method makes clear the stringlike nature of autosegmental representations, and that their properties can be viewed as a consequence of this nature.