Unification of Drags and Confluence of Drag Rewriting

Drags are a recent, natural generalization of terms which admit arbitrary cycles. A key aspect of drags is that they can be equipped with a composition operator, so that rewriting amounts to replacing one drag by another in a composition. In this paper, we develop a unification algorithm for drags that allows us to check the local confluence property of a set of drag rewrite rules. © 2022 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Rewriting with graphs has a long history in computer science, graphs being used to represent data structures, but also program structures, and even concurrent and distributed computational models. They therefore play a key rôle in program evaluation, transformation, and optimization, and more generally in program analysis; see, for example, [4].
Our work is based on a recent, purely combinatorial view of labeled graphs [9]. Drags are labeled graphs equipped with roots and sprouts, the latter being vertices without successors that are labeled by variables. Drags generalize terms: they admit roots at arbitrary vertices, sharing, and cycles. Rewrite rules are then pairs of drags that preserve variables and number of roots, hence avoiding the creation of dangling edges when rewriting. A key aspect of drags is that they can be equipped with a composition operator, so that matching the left-hand side L of a rule against an input drag D amounts to writing D as the composition of a context drag C with L, and rewriting D with the rule L → R amounts to replacing L with R in that composition. Composition indeed plays the rôle of both context grafting and substitution in the case of terms.
To assess our claim that drags are a natural generalization of terms, we extend the most useful term rewriting techniques to drags: the recursive path ordering [8], unification (Section 3) and local confluence (Section 4).
Our first main result here is that unification is unitary and can be performed in quadratic time and space, a complexity bound that we do not show to be sharp, and which possibly is not. In the case of terms, unification is based on overlapping two terms at a non-variable subterm, from which a recursive propagation process takes place that identifies the labels of both term fragments as long as no label is a variable. The unification process for drags is similar, starting at a set of partner vertices at which the overlap takes place. Propagation operates on pairs of vertices which have not been propagated yet, provided no vertex in a pair is a sprout. Propagation may of course fail, for example at a pair of vertices labeled by different function symbols. When it succeeds, a most general unifier can be extracted from the propagation's result.
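The propagation step just described can be made concrete with a small sketch. The code below is ours, not the paper's algorithm: it ignores roots and unifier extraction, and represents each drag as a dictionary mapping a vertex to its label and successor list, variable labels being flagged with a leading '?'.

```python
from collections import deque

def propagate(g1, g2, start_pairs):
    """Breadth-first identification of vertex pairs; returns the set of
    identified pairs, or None if two internal labels (or arities) clash.
    Vertices map to (label, successor_list); variables start with '?'."""
    seen = set()
    todo = deque(start_pairs)
    while todo:
        u, v = todo.popleft()
        if (u, v) in seen:
            continue
        lu, su = g1[u]
        lv, sv = g2[v]
        seen.add((u, v))
        # propagation stops at sprouts: a variable pairs with anything
        if lu.startswith('?') or lv.startswith('?'):
            continue
        if lu != lv or len(su) != len(sv):
            return None  # label or arity clash: no unifier exists
        todo.extend(zip(su, sv))
    return seen

# the drags U = f(h(x)) and V = g(h(a)) of Example 8, roots omitted
g1 = {'f': ('f', ['h']), 'h': ('h', ['x']), 'x': ('?x', [])}
g2 = {'g': ('g', ['h2']), 'h2': ('h', ['a']), 'a': ('a', [])}
```

Starting from the partner pair (h, h), propagation identifies the pairs (h, h) and (x, a), while starting from the clashing pair (f, a) it fails.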
Our second main result is that local confluence of a set of drag rewrite rules can be checked by the usual joinability test of their critical pairs. This is so because local confluence follows easily in the non-overlapping case, since the rewritten drag is then the composition of two drags that are both rewritten independently of each other. The so-called disjoint and ancestor cases that pop up in the case of terms are therefore both captured here by the same case, showing the advantage of packaging context grafting and substitution within a single composition mechanism. As a result, confluence is decidable for terminating finite sets of drag rewrite rules, as is the case for term rewrite rules. Comparisons with the literature are addressed in Section 5. An interesting relationship between unification of drags and of rational dags is pointed out in conclusion.

The drag model [9]
To ease the notational burden, we use vertical bars | • | to denote various quantities, such as the length of lists and the size of sets or expressions. We use ∅ for an empty list, set, or multiset, ∪ for set and multiset union, • for list concatenation, and \ for set or multiset difference. We mix these, too, and denote by K \ V the sublist of a list K obtained by filtering out those elements belonging to a set V. [1..n] is the set (or list) of natural numbers from 1 to n. We will also identify a singleton list, set, or multiset with its contents to avoid unnecessary clutter.
Drags are finite directed rooted labeled multi-graphs. We presuppose: a set Ξ of nullary variable symbols, used to label some vertices without outgoing edges, called sprouts; and a set Σ of function symbols, disjoint from Ξ, whose elements, equipped with a fixed arity, are used as labels for all other vertices, called internal.

Definition 1 (Drags).
A drag is a tuple ⟨V, R, L, X, S⟩, where:
1. V is a finite set of vertices;
2. R : [1..|R|] → V is a finite list of vertices, called roots;
3. S ⊆ V is a set of sprouts, leaving V \ S to be the internal vertices;
4. L : V → Σ ∪ Ξ is the labeling function, mapping internal vertices V \ S to labels from the vocabulary Σ and sprouts S to labels from the vocabulary Ξ, writing v : f for f = L(v);
5. X : V → V* is the successor function, mapping each vertex v ∈ V to a list of vertices in V whose length equals the arity of its label.
The pair (R, S) is called the interface of the drag D.
We use R for both the function itself, of domain Dom(R) = [1..n], and its resulting list [R(1) .. R(n)] of length |R| = n.
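As an illustration, Definition 1 transcribes directly into a small data structure. The following Python sketch is ours (field names included); it represents the leftmost drag of Fig. 1, f^{2,3}(x^1, x^4), and checks arities.

```python
from dataclasses import dataclass

@dataclass
class Drag:
    """A sketch of Definition 1: V, R, L, X, S as plain Python data."""
    vertices: set      # V
    roots: list        # R, a list of vertices (positions 1..|R|)
    label: dict        # L: vertex -> symbol
    succ: dict         # X: vertex -> list of successor vertices
    sprouts: set       # S, the variable-labeled vertices

    def internal(self):
        return self.vertices - self.sprouts

    def check(self, arity):
        """Successor lists must match label arities; sprouts are leaves."""
        for v in self.internal():
            assert len(self.succ[v]) == arity[self.label[v]]
        for s in self.sprouts:
            assert self.succ.get(s, []) == []

# the leftmost drag of Fig. 1: one internal vertex f, two sprouts labeled x,
# and the list of roots (x, f, f, x)
d = Drag(vertices={'f', 'x1', 'x2'},
         roots=['x1', 'f', 'f', 'x2'],
         label={'f': 'f', 'x1': 'x', 'x2': 'x'},
         succ={'f': ['x1', 'x2']},
         sprouts={'x1', 'x2'})
d.check({'f': 2})
```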
The labeling function extends elementwise to lists, sets, and multisets of vertices. We introduce below some classical vocabulary, mostly originating from graph theory.

Definition 2. Let D = ⟨V, R, L, X, S⟩ be a drag. If v ∈ X(u), then (u, v) is a directed edge with source u and target v. We also write u X v. We also say that v is a successor of u, and u a predecessor of v. The reflexive-transitive closure X* of the relation X is called accessibility. A vertex v is said to be accessible (or reachable) from vertex u, and likewise u is said to access v, if u X* v; otherwise v is unreachable from u. u is a true ancestor of v if v is reachable from u but u is unreachable from v. Vertex v is accessible (or reachable) if it is accessible from some root, and unreachable otherwise. A sprout is isolated if it has no predecessor. A vertex u is rooted if it occurs, possibly several times, in the list R; otherwise it is rootless. A drag is clean if all its vertices are accessible, and linear if no two sprouts have the same label.
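The accessibility and cleanliness notions of Definition 2 amount to a plain graph traversal from the roots. A minimal sketch (ours), with the successor function given as a dictionary; note that vertices on a cycle reachable from a root are accessible.

```python
def accessible(roots, succ):
    """Vertices reachable from some root through the successor relation X*."""
    seen, stack = set(), list(roots)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(succ.get(v, []))
    return seen

def is_clean(vertices, roots, succ):
    """A drag is clean when every vertex is accessible from a root."""
    return accessible(roots, succ) == set(vertices)

# a two-vertex cycle rooted at f: both vertices are accessible
loop_succ = {'f': ['h'], 'h': ['f']}
```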
It will be convenient, in particular in examples, such as Examples 1 and 2, and Figs. 1, 2, and 6, to identify a sprout of a linear drag with the variable symbol that is its label. If a drag is non-linear, with n sprouts sharing the same variable label x, we will then denote these sprouts by x_1, x_2, ..., x_n. Similarly, the label of a vertex will be used as its name when non-ambiguous. Upper indices will be used to denote roots, the notation u^{n,m} telling us that u appears at positions n and m in the list R. Any vertex, including sprouts, can be rooted, possibly several times. This is essential for having a nice algebra of drags. A further convention is that the n successors of an internal vertex whose label has arity n are drawn from left to right. An internal vertex labeled by a constant (of arity 0) has of course no successors; the same holds for sprouts.
Terms as ordered trees, sequences of terms (forests), terms with shared subterms (dags) and sequences of dags (jungles [18]) are all particular kinds of drags, which are clean when rooted.The drag having no vertex, called the empty drag (which is also the empty tree), is clean too.
Example 1. Four drags are depicted in Fig. 1. The leftmost one represents a term equipped with roots, namely f^{2,3}(x^1, x^4), made of one internal vertex labeled f, two sprouts labeled x, and the list of roots (x, f, f, x). In the graphical representation, roots become arrows going from distinct integers in the interval [1, N] to each rooted vertex, where N is the number of roots in the drag. Notice also that we are implicitly assuming that the arity of f is 2.
The second drag, called D', is another term, while the third drag, D'', is a dag, which has no roots. Finally, D''' is a drag including two loops.

Fig. 1. Four drags.
Definition 3. Given a drag D = ⟨V, R, L, X, S⟩, we make use of the following notations: Ver(D) for its set of vertices; Int(D) for its set of internal vertices; S(D) for its set of sprouts; X_D for its successor function; R(D) for its roots (list or set, depending on context); L_D for its labeling function; s : x when sprout s has variable x as label; Var(D) for the set of variables labeling its sprouts; |D|, its size, for the number of accessible internal vertices plus the size of R; and in(v, D), the in-degree of vertex v, for the number of predecessors of v in D plus the number of roots of v in D.

Equality of drags
Drags are particular graphs: the names of their vertices are not relevant. The order of roots in their list is not relevant either, as we shall see when defining rewriting. What matters is whether two sprouts are labeled by the same or by different variables.
Equality of drags will of course play a key role when it comes to unification. We define a vertex renaming to be a bijection between two finite sets of vertices that restricts to internal vertices and sprouts, and a variable renaming to be a bijection between two finite sets of variables.

Definition 4. Two drags D = ⟨V, R, L, X, S⟩ and D' = ⟨V', R', L', X', S'⟩, in this order, are equal modulo renaming, namely a vertex renaming ι : V → V', a variable renaming α : Var(D) → Var(D'), and a permutation σ of [1..n] (we also say that D' is a renaming of D), iff (extending ι to lists of vertices in the natural way): L'(ι(v)) = L(v) for every internal vertex v, and L'(ι(s)) = α(L(s)) for every sprout s; X'(ι(v)) = ι(X(v)) for every vertex v; and R'(i) = ι(R(σ(i))) for all i ∈ [1..n]. We then write D =^ι_{α,σ} D'. The drags D and D' are said to be equal modulo variable renaming if ι is the identity, and identical if α is the identity as well.
The subscripts α, σ and superscript ι are usually omitted when equal to an identity.They may also be omitted when no ambiguity arises with definitional equality (which actually corresponds to identity with identical lists of roots).In particular, in the absence of annotations, equality should always be interpreted as definitional in definitions.
Equality of drags modulo renaming is an equivalence relation, since the identity is a bijection, the inverse of a bijection is a bijection, and bijections compose. The notion of injection, injective on internal vertices only, is specific to drags, which have variables: different sprouts sharing the same variable label must be mapped to the same vertex, and that vertex can even be the image of some (unique) internal vertex. Property (ii) implies that D' has three kinds of edges: those between internal vertices of D, those between vertices which are not the image of vertices in D, and those which are the image of roots in D. This property is directly related to the definition of composition to come later.
Example 2. Let D = f(x, y, x) and D' be the middle two drags of Fig. 1. The map is the only possibility here, but there would be others if f were also rooted in D'. Let now D'' be like D', except that it has no sprout and three edges. Morphisms ignore names. If o is a drag morphism from C to D and C' is a renaming of C, then composing this renaming with o yields a morphism from C' to D. This remark will be used without saying later, o being then an injection.
As expected, morphisms and injections are closed under composition.

Composition of drags
In this section we introduce a main operation on drags that generalizes the notion of substitution for trees.
A variable in a drag should be understood as a potential connection to a root of another drag, as specified by a connection device called a switchboard. A switchboard ξ is a pair of partial injective functions, one for each drag, whose domain Dom(ξ) and image Im(ξ) are a set of sprouts of one drag and a set of positions in the list of roots of the other, respectively. Three examples of well-behaved switchboards are given in Fig. 2: {x → 1} for the first, {x → 1, y → 2} for the second, and {x → 3, y → 2} for the third.
Sprouts labeled by the same variable should be connected by ξ to the same vertex (unless ξ is undefined for them all), which must then occur several times in the list of roots, as required by the first two conditions and the injectivity of switchboard components. These two conditions are of course automatically satisfied by linear switchboards.
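The basic requirements on a single switchboard component (a partial injective map from sprouts to valid root positions) can be checked mechanically. The following sketch is ours and deliberately leaves aside the coherence and well-behavedness conditions discussed above.

```python
def is_switchboard_component(xi, sprouts, other_roots):
    """xi: dict sprout -> root position (1-based) in the other drag.
    Checks partiality over sprouts, valid positions, and injectivity."""
    return (set(xi) <= set(sprouts)
            and all(1 <= n <= len(other_roots) for n in xi.values())
            and len(set(xi.values())) == len(xi))
```

For instance, the component {x → 3} of the third switchboard of Fig. 2 is legal with respect to the root list [g, y, y], while {x → 4} points outside it and {x → 1, y → 1} fails injectivity.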
Note that we could define ξ as a partial function from Dom(ξ_D) ∪ Dom(ξ_{D'}), these domains being disjoint sets; we have actually implicitly used this property in the above explanation and will use it in the sequel whenever convenient. Defining its value would however require us to consider ξ as a pair of functions using R and R' respectively.
Rewriting extensions play a key rôle for defining rewriting later, in which case D will stand for the left-hand side of a rule and D' for its context. The conditions mean that all sprouts and roots of the left-hand side of the rule must disappear in the composition.
We now move to the composition operation on drags induced by a switchboard. The essence of this operation is that the union of the two drags is formed, but with sprouts in the domain of the switchboard merged with the roots to which the switchboard images refer. Merging sprouts with their images requires one to worry about the case where multiple sprouts are merged successively, when switchboards map sprout to rooted sprout to rooted sprout, until, eventually, thanks to well-behavedness, a vertex of one of the two drags must be reached which is not a sprout in the domain of the switchboard. That vertex is called the target.

Fig. 2. Different forms of composition: substitution, formation of a cycle, and transfer of roots. (For interpretation of the colors in the figure(s), the reader is referred to the web version of this article.)

Definition 9 (Target). Let D = ⟨V, R, L, X, S⟩ and D' = ⟨V', R', L', X', S'⟩ be drags such that V ∩ V' = ∅, and ξ be a switchboard for (D, D'). The target ξ*(_) is a mapping from sprouts in S ∪ S' to vertices in V ∪ V' defined as follows, where n = ξ(s): for s ∈ S ∩ Dom(ξ_D), ξ*(s) = R'(n) if R'(n) ∉ Dom(ξ_{D'}), and ξ*(s) = ξ*(R'(n)) otherwise; and symmetrically for s ∈ S' ∩ Dom(ξ_{D'}).
The target mapping ξ*(_) is extended to all vertices of D and D' by letting ξ*(v) = v for vertices not in Dom(ξ).

Example 3. Consider the last of the three examples in Fig. 2, in which a drag D, whose list of roots is R = [f, h, x] (identifying vertices with their label), is composed with a second drag whose list of roots is R' = [g, y, y], via the switchboard {x → 3, y → 2}. We calculate the target of the two sprouts: ξ*(y) = R(2) = h, and ξ*(x) = ξ*(R'(3)) = ξ*(y) = h.

We are now ready for defining the composition of two drags. Its set of vertices will be the union of two components: the internal vertices of both drags, and their sprouts which are not in the domain of the switchboard. The labeling is inherited from that of the components.
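The chasing process of Definition 9 can be sketched as follows (our code; termination relies on ξ being well-behaved). On Example 3, both sprouts get target h, in accordance with the cycle on h created in Fig. 2.

```python
def target(s, side, xb, roots):
    """Chase switchboard links from sprout s (belonging to drag `side`,
    0 or 1) until reaching a vertex that is not a sprout in the domain
    of the switchboard. xb = (xi_D, xi_D'): dicts sprout -> 1-based root
    position in the other drag; roots = (R, R'). Assumes ξ well-behaved,
    which guarantees termination."""
    v, k = s, side
    while v in xb[k]:
        n = xb[k][v]             # n = ξ(v): a position in the other root list
        v = roots[1 - k][n - 1]  # the rooted vertex it designates
        k = 1 - k                # continue the chase in the other drag
    return v

# Example 3 (Fig. 2, third composition): R = [f, h, x], R' = [g, y, y]
R_pair = (['f', 'h', 'x'], ['g', 'y', 'y'])
xi = ({'x': 3}, {'y': 2})
```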

Definition 10 (Composition).
Let D = ⟨V, R, L, X, S⟩ and D' = ⟨V', R', L', X', S'⟩ be drags such that V ∩ V' = ∅, and let ξ be a switchboard for (D, D'). Their composition is the drag D ⊗_ξ D' whose vertices are the internal vertices of D and D' together with their sprouts not in Dom(ξ), whose labeling is inherited from D and D', and whose successor function is obtained by replacing every successor v with ξ*(v). If ξ_D is surjective and total, then all sprouts of D disappear in the composed drag, while all vertices of D' which are roots become rootless vertices in the composed drag.
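The vertex and edge part of Definition 10 can be sketched as follows (our code; the computation of the resulting list of roots, which involves root transfer, is deliberately omitted). On the first example of Fig. 2, it performs the term substitution f(x){x → a} = f(a).

```python
def compose(verts, succs, xb, roots):
    """Vertices and successor function of D ⊗_ξ D' (root list omitted).
    verts = (V, V'), succs = (X, X'), xb = (xi_D, xi_D'), roots = (R, R')."""
    def tgt(v, k):
        # chase the switchboard as in Definition 9
        while v in xb[k]:
            v = roots[1 - k][xb[k][v] - 1]
            k = 1 - k
        return v
    # sprouts in the domain of the switchboard disappear
    new_verts = {v for k in (0, 1) for v in verts[k] if v not in xb[k]}
    new_succ = {}
    for k in (0, 1):
        for v, ws in succs[k].items():
            if v in xb[k]:
                continue  # merged sprout: it has no successors of its own
            new_succ[v] = [tgt(w, k) for w in ws]
    return new_verts, new_succ

# Fig. 2, first example: f(x) composed with a via {x -> 1} yields f(a)
V_pair = ({'f', 'x'}, {'a'})
X_pair = ({'f': ['x'], 'x': []}, {'a': []})
xi = ({'x': 1}, {})
R_pair = (['f'], ['a'])
```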
Example 4. We show in Fig. 2 three examples of compositions. The first is a substitution of terms. The second induces a cycle. In that example, the remaining root is the first (red) root of the first drag, which has two roots, the first red, the other black. The third example shows how sprouts that are also roots connect to roots in the composition (colors black and blue indicate roots' origin, while red indicates a root that disappears in the composition). Since x points at y and y at the second root of the first drag, a cycle is created on the vertex of the resulting drag which is labeled by h. Further, the third root of the first drag has become the second root of the result, while the first (resp., second) root of the second drag has become the third (resp., fourth) root of the result. This agrees of course with the definition.

The third example illustrates the fact that composition impacts the list of roots in complex ways. Here, the second root of the left drag (in the composition), pointing at vertex x, becomes the second root in the result, pointing now at vertex h, while vertex x has become vertex h. Likewise, the first root of the right drag has become the third root of the result, both pointing at vertex g, and the third, pointing at vertex y, has become the fourth, pointing now at h.

The definition of composition does not assume any property of the input drags. Composing a single-rooted clean drag D having at least one internal vertex with a non-clean drag C consisting of a single non-rooted sprout labeled x has an observable effect on D: the result of the composition C ⊗_{x→1} D is the drag D', which is D deprived of its root, hence is non-clean since D has internal vertices. In other words, any clean drag D can be sent by an appropriate composition to a drag whose set of accessible vertices is empty. This also implies that D can be sent to any drag U, once cleaned, by an appropriate composition.

We conclude this section by showing that drag equality is observational:

Lemma 1. Let D, E be drags that are equal modulo renaming, and ⟨C, ξ⟩ an extension of D. Then, there exists an extension ⟨C', ζ⟩ of E such that C ⊗_ξ D and C' ⊗_ζ E are equal modulo renaming.
It suffices to take for ζ the switchboard induced by ξ through the renaming, and to extend the bijections ι and σ as the respective identities on the vertices of C which do not belong to Dom(ξ_C), and on the roots of C which do not belong to Im(ξ_C).
The important observation is that σ becomes the identity in case ξ_C is surjective, hence explaining why the order of roots in drags is irrelevant as far as rewriting is concerned.

Drag algebra
Composition has important algebraic properties, existence of identities and associativity [7].We recall the second which will be needed later on, and describe a particular case for which composition is commutative.
Lemma 2 (Associativity). Let U, V, W be three drags sharing no vertices nor variables. Then, there exist two switchboards ζ and ξ for respectively (V, W) and (U, V ⊗_ζ W) iff there exist two switchboards θ and γ for respectively (U, V) and (U ⊗_θ V, W) such that U ⊗_ξ (V ⊗_ζ W) = (U ⊗_θ V) ⊗_γ W. Furthermore, γ is a rewriting switchboard if ξ, ζ are rewriting switchboards, and ξ is a rewriting switchboard if γ, θ are rewriting switchboards.
Lemma 2 is proved in a particular case in [9]. We give here the proof for the general case, which is needed later in the proof of Lemma 6. We will need restrictions of ξ, ζ, γ, θ to some subsets of their domain and target, such as ξ_{V→U}, whose domain is the subset of sprouts of V which are sprouts of V ⊗_ζ W and whose image is the list of roots whose corresponding vertices belong to U. Likewise, ξ_{U→W} is the restriction of ξ_U whose image is the list of roots whose corresponding vertices belong to W.

Proof. We carry out one direction of these statements, the other having the same proof. We define θ and γ so that they define the same sets of switchboard components as ξ and ζ, hence ensuring that both compositions are identical, as we shall show. A difficulty is to show that these definitions are well-behaved injective maps, as required for switchboards. The property (*) of ξ and ζ that makes it all true follows from the fact that the expression V ⊗_ζ W occurs inside a pair of parentheses. The definition of θ, γ, which can easily be followed on Fig. 3, is by cases on the domains of the switchboards ξ, ζ. It is then easy to verify that θ, γ satisfy (*). We first show that both compositions define the same drag, that is, that U ⊗_ξ (V ⊗_ζ W) = (U ⊗_θ V) ⊗_γ W, and we are done.
We show now that θ and γ are switchboards. By their definition by (disjoint) cases, they are maps. θ is injective since so is ξ. For γ, injectivity results from injectivity of ξ and ζ and the assumption that U, V, W do not share vertices. The coherence conditions (1) and (2) follow from the coherence conditions for ξ, ζ and the assumption that the sets of variables of the drags U, V, W are pairwise disjoint. We are left with well-behavedness.
Assume there exists a cycle among the sprouts of U, V, W for either γ or θ. Then, there would exist a cycle among those sprouts involving ξ and ζ. Since ξ and ζ are well-behaved, this cycle must alternate ξ and ζ sequences. By property (*), the only possible sequences of ξ and ζ are of the form s ζ* ξ* t (using ξ and ζ relationally). But again, (*) imposes that s = t.
So, no cycle using ξ and ζ is possible, and therefore θ and γ must be well-behaved.
We are left showing that γ is a rewriting switchboard if so are ξ, ζ. By its definition, and since ξ_U and ζ_V must be linear, so is γ_{U⊗_θV}. And since surjectivity of ζ_{V→W} implies surjectivity of γ_{U⊗_θV}, we are done.
Remark 1. Note that we do not claim that θ is a rewriting switchboard when so are ξ, ζ. We will not need it, fortunately, since it is not true: ξ_{U→V} is surjective on the roots of V which are not already eaten by ζ_{W→V}, but not on all roots of V.

The composition of two drags D, D' is obviously commutative (modulo a circular permutation of their respective lists of roots):

Lemma 3 (Commutativity). Let D, D' be two drags sharing no vertices and ξ a switchboard for (D, D'). Then D ⊗_ξ D' and D' ⊗_ξ D are equal up to a permutation σ of their lists of roots, where σ is a circular permutation which is the identity if ξ is a rewriting switchboard.

Drag rewriting
Rewriting with drags is similar to rewriting with trees: we first select an instance of (some renaming of) the left-hand side L of a rule in a drag D by exhibiting an extension ⟨W, ξ⟩ such that D = W ⊗_ξ L (this is drag matching), then replace L by the corresponding right-hand side R in the composition. First, we define what kind of drags is allowed in rules.

Definition 11. A pattern is a clean drag containing no isolated sprout and all of whose vertices have at most one root, i.e., each vertex u occurs at most once in R. A renaming of a pattern L away from a drag D is a renaming L' of the pattern L such that Var(D) ∩ Var(L') = ∅.
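The three conditions defining patterns are straightforward to check. A sketch (ours), with drags given as in the previous sketches:

```python
def is_pattern(vertices, roots, succ, sprouts):
    """Clean, no isolated sprout, and at most one root per vertex."""
    # accessibility from the roots
    seen, stack = set(), list(roots)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(succ.get(v, []))
    if seen != set(vertices):
        return False                       # not clean
    preds = {w for ws in succ.values() for w in ws}
    if any(s not in preds for s in sprouts):
        return False                       # isolated sprout
    return all(roots.count(v) <= 1 for v in vertices)
```

For instance, the left-hand side g(f(x)) of Example 6, with roots at g and f, is a pattern, while duplicating the root at g violates condition (i).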

(ii) |R(L)| = |R(R)|, and (iii) Var(R) ⊆ Var(L).
A set of drag rewrite rules is called a drag rewriting system.
Condition (i) does not show up in [9]. Although it seems restrictive, it is not. Following Fig. 4, assume we need to match a drag D = h(a), which has two roots at vertex h, named 1 and 2, with the left-hand side L = h(x) of a rule h(x) → x, which has a single root at h, named 1. Matching would be straightforward if h (in h(x)) had two roots, but is nevertheless possible with a single root: take the extension ⟨z ⊕ a, {z → 1, x → 3}⟩, where z has two roots, 1 and 2, and a a single root, numbered 3. Then, root transfer will ensure that the result is indeed D (up to drag equality). By exploiting the root transfer mechanism, condition (i) will slightly simplify unification of patterns as well as the confluence section.
Conditions (ii) and (iii) ensure that L and R fit with any extension ⟨C, ξ⟩ of L, since switchboards map sprouts to positions in a list of roots. Both lists being of the same length ensures that any position in the list R(L) is a position in the list R(R). There is however a difficulty to be faced later: the switchboard ξ does not necessarily satisfy well-behavedness with respect to R.

Rewriting drags uses a specific kind of switchboard, which allows one to "encompass" a pattern L within a drag D, so that all roots and sprouts of L disappear from the composition. We now need an important observation absent from [9]: the composition C ⊗_ξ L' yields a drag whose internal vertices are those of C and L', which explains the need for renaming L, since D and L are both given. Note that the injection o plays the same rôle as a position in the case of trees. We will use this facility when defining rewriting.
Usage of term rewriting systems has sanctified the "match" from a term L to a term D as being the substitution ξ identifying Lξ with a subterm of D, the context C obtained by removing Lξ from D being ignored. Here, we insist that the match is made of both the context C and the switchboard ξ.
Proof. Since D and L have no variable in common, we can assume w.l.o.g. that all vertices of D are internal.

Given a rewriting extension ⟨C, ξ⟩ of L such that D = C ⊗_ξ L, we define o(v) = ξ*(v) for every vertex v of L.
The obtained map o is the identity, hence injective, on internal vertices of L, preserves the successor function, and forces sharing, since two sprouts labeled by the same variable x are mapped by ξ* to the same vertex by the compatibility property of a switchboard. We are left showing that o satisfies property (ii) of injections, that is, that an internal vertex v of L is rooted if there exists a new edge in D whose target is o(v). Since a new edge can only be the result of the composition, this can happen in two different ways, v being necessarily rooted in both cases: some sprout t, successor in C of some u, is mapped to v by ξ*, resulting in the new edge (u, o(v)); or some sprout s, successor in L of some u, is mapped successively to a sprout t of C by ξ_L and then to v by ξ*, resulting in the new edge (o(u), o(v)).
Conversely, we construct the rewriting extension ⟨C, ξ⟩ of L from the given map o, as shown in Fig. 5. An edge of D may connect two vertices of A, two vertices of B, or a vertex of A and one of B, going one way or the other. In all these three cases, the corresponding edges in D will have to be reconstructed by the composition C ⊗_ξ L. This requires to appropriately define the sprouts of C and the switchboard ξ. Let W be a set of fresh sprouts t_v, with v ∈ Int(L), hence o(v) ∈ A, such that one of the following two conditions holds. We define the set of vertices of C to be B ∪ W. Before defining the successor function and roots of the drag C, we define the switchboard ξ (we abuse our notations for simplicity). We then define the successor function of C. Finally, we define the roots of C as follows.
- w ∈ B has n_w roots in C, where n_w is the number of sprouts of L mapped by o to w;
- t_v ∈ W has n_v roots in C, where n_v is the number of sprouts of L mapped by o to v.
We can now show that ξ is a switchboard, implying easily that ⟨C, ξ⟩ is a clean rewriting extension, the difficult part being that sprouts must be mapped injectively to rooted vertices. Since the rôle of composition is to build new edges, there are three different situations. Blending two of them, we get two cases: 1. totality of ξ_L holds by definition of the number of roots defined for the vertices in C; for ξ_C, we claim that v has at least one root for mapping t_v to that root, which is a consequence of the new-edge property of o. 2. there exists an edge o(w) X_D w' in D, with o(w) in A and w' in B, and sprouts s mapped to w' (both must occur for a given w'). Indeed, w' is the i-th successor of o(w) in D iff the i-th successor of w in L is a sprout s such that o(s) = w'. We are left showing that w' has enough roots for mapping to w' all sprouts s such that o(s) = w', which follows from the definition of the number of roots for w' in C.
Compatibility follows from the property that morphisms force sharing; well-behavedness is trivial, as is totality of ξ_L and surjectivity of ξ_C. The verification that C ⊗_ξ L = D can be read on Fig. 5.

Example 5 (Example 2 continued).
This example, depicted in Fig. 6, illustrates the correspondence between matching and injections described in Lemma 4. Let C be the drag z ⊕ h(z^4), z having 3 roots, named 1, 2, 3. Conversely, let o be that injection. Using the notations of the proof of Lemma 4, we get A = {f} and B = {g, a}. Vertices in {x_1, y, x_2} are all mapped to a, and f is mapped to f by o. We get W = {t_f} (note that only the first condition is satisfied for generating t_f). Verification that the composition yields D is left to the reader.

Definition 14 (Rewriting). Given a drag rewrite system R, a drag D rewrites at position o to a drag D', written D −→ D', if D = C ⊗_ξ L' and D' = C ⊗_ξ R' for some rule L → R ∈ R with renaming L' → R' away from D, and some rewriting extension ⟨C, ξ⟩ of both L' and R'.

Fig. 7. Rewriting and cycles. Fig. 8. Rewriting and connected components.
Notice that, according to the previous definition, if D rewrites to D', then ⟨C, ξ⟩ is a rewriting extension of both L' and R', which means that D' is the result of replacing L' by R' in D.

Remark 2.
The assumption that ⟨C, ξ⟩ is a rewriting extension for R' does not always follow from the assumption that it is one for L', as one could expect. The point is that the switchboard ξ of the extension does not necessarily satisfy all properties needed to be a switchboard for R'. Take for example the rule f(x) → x, with one root on each side. Take now for D a loop on a rooted vertex labeled by f, for C the context reduced to the sprout y with two roots, and ξ = {x → 1, y → 1}. ⟨C, ξ⟩ is well-behaved for f(x) but ill-behaved for x, since x is mapped by ξ to y and y to x. It turns out that there exists no switchboard ξ well-behaved for both L and R that permits rewriting this cycle with this rule, a degenerate case that went unnoticed in [9]. This is why we need to assume in the definition that ⟨C, ξ⟩ is a switchboard for both L' and R'. Checking whether the switchboard ξ for L' is also a switchboard for R' is only needed for rules whose right-hand side is a variable, hence is not really painful in practice.
Because ξ is a rewriting switchboard, ξ_C must be linear, implying that the variables labeling the sprouts of C that are not already sprouts of D must all be different. Then, ξ_C must be surjective, implying that the roots of L', hence those of R', disappear in the composition, a case where the composition is commutative (we shall mostly write the context on the left, though). Further, ξ_{L'} must be total, implying that the sprouts of L' (hence those of R') disappear in the composition. Finally, D and C being clean, it is easy to show that D' is clean as well, which is therefore a property rather than a requirement.
In the sequel, we adopt the convention that L and R are renamed appropriately, whatever D is, that is, that rules in R are defined up to renaming of their vertices.We also use −→ * , as is usual, for the transitive closure of the rewriting relation.
Example 6. In Fig. 7, the (red) rewrite rule g(f(x)) → h(x), whose roots are g and f on the left-hand side and h and x on the right-hand side, applies within a blue context; these colors are reflected in the input term (showing that the rule applies across its cycle) and in the output term.
Example 7. This time, the rooted term f(a, b) in Fig. 8 is rewritten to the drag made of two components, the rooted terms g(a) and b. Note that allowing the non-clean right-hand side made of the rooted drag g(x) and the non-rooted term y, as in [9], would result in the clean rooted term g(a), the component b being then rootless and thrown away.
Lemma 4 is important, since it implies that the result of rewriting a drag at some position o is unique, as it is for trees.
Rewriting drags is of course monotonic with respect to composition, which subsumes monotonicity and stability of rewriting terms:

Lemma 5 (Monotonicity). Assume that D −→_{L→R} D', and let ⟨C, ξ⟩ be an extension of D such that ⟨C, ξ⟩ is also an extension of D'. Then C ⊗_ξ D −→_{L→R} C ⊗_ξ D'.
We are now finished with the material from [9] needed for the rest of this paper.

Unification
Unification of two terms s, t is somewhat simple: a substitution applied to both identifies them (makes them identical). Assuming s, t share no variable, this substitution is simply the union of two substitutions, one for s and one for t. A substitution is just a particular case of composition, as we have seen in Example 7, using a switchboard one of whose components is empty; hence our definition of unification will be based on composition: two patterns U, V are unified by composing them with some rewriting extensions ⟨C, ξ⟩ and ⟨D, ζ⟩, resulting in the same drag W, "same" referring here to drag equality modulo renaming.
We could be satisfied with that definition, but we also want to take care of our particular use of unification to characterize drags, called overlaps, that can be rewritten in two different ways with two rules L → R and G → D. In the case of terms, one of L, G stands above the other in the overlap, that is, G is unified with the subterm of L at some position p, or vice versa. If σ is a unifier, the overlap is then either Lσ or Gσ (or both if p is the root position). The situation is different with graphical structures: none is above the other, they just share some common subdrag. Two drags U, V are therefore unified at partner vertices (u, v), the solution being a pair of extensions ⟨C, ξ⟩ of U and ⟨D, ζ⟩ of V that identifies C ⊗_ξ U and D ⊗_ζ V at these partner vertices.
Definition 15. Given two drags U, V sharing no vertices, we call partner vertices two lists L_U, L_V of equal length of internal vertices of U and V, respectively, such that no two vertices u, u′ ∈ L_U (resp., v, v′ ∈ L_V) are in relationship with X_U (resp., X_V). The two lists of partner vertices u and v can also be organized as a set of unordered pairs {(u_i, v_i)}_i. The order between the elements of a pair is not important, since one must be in U and the other in V, and U, V share no vertex, hence eliminating any potential ambiguity.
Definition 16. A drag unification problem, written U[u] = V[v], is a pair (U, V) of patterns that have been renamed apart, along with partner vertices P = (u, v). A solution (or unifier) of the drag unification problem is … ; we denote by … the (possibly empty) set of all its solutions.
The overlap drags W = C ⊗_ξ U and W′ = D ⊗_ζ V witness the property that U and V are embedded in W and W′ respectively, and that these two embeddings, o and o′, coincide at a list of partner vertices (condition (ii)) and recursively at their successors (condition (i)), but not at their ancestors which are unreachable from either u or v (conditions (iii) and (iv)). Note that we could have allowed W = C ⊗_ξ U and W′ = D ⊗_ζ V to be equal modulo an arbitrary renaming, including a variable renaming: these two definitions are actually equivalent.

Solutions of a unification problem are defined with the context drag coming first in the products, which is of course consistent with our definitions of rewriting and rewriting extensions. We will stick to this convention in the sequel, even though it does not actually matter, since composition is commutative.
Example 8 (Figs. 9 and 10). Let U = f(h(x)) and V = g(h(a)) in Fig. 9, in which U has two roots, f and h in this order, and V has two roots, g and h in this order (root numbers of U, V being written in bold face on the figure). Let the partner vertices be {(h, h)}.
Consider the rewriting extension C, ξ such that C = z_1 ⊕ g(z_2) ⊕ a with three roots at z_1, g and a in this order, and ξ = {z_1 → 1, z_2 → 2, x → 3}. Then C ⊗_ξ U is the drag with two roots at f and g in this order, sharing the subdrag h(a).
Consider now the rewriting extension D, ζ such that D = f(y_1) ⊕ y_2 with two roots at f and y_1 in this order, and ζ = … .

Let us now consider Fig. 10, with drags U and V as in the previous case, except that V = f(h(a)), with partner vertices {(h, h)}, as before. In this case, the drag W = f(h(a)), with two roots on f, would not be an overlap of U and V below {(h, h)}, i.e., it would not define a correct solution to this unification problem. The reason is that conditions (iii) and (iv) of Definition 16 would not be satisfied, because the two vertices labeled f, which are above the partner vertices, would be identified. The correct solution, depicted in Fig. 10, therefore does not identify these vertices. Indeed, we want unification to be minimal, that is, to capture all possible extensions that identify U and V. In a first subsection, we define the subsumption order on drags (and drag extensions) and show that it is well-founded; this order makes precise the notion of minimality of solutions. In a second subsection, we show that unification of drags is unitary, as it is for terms and dags.

Subsumption
Definition 17. We say that a clean drag U is an instance of a clean drag V, or that V subsumes U, and write U ≥ V, if there exists a rewriting extension C, ξ of V such that U = C ⊗_ξ V.
Note that V being clean, U must be clean as well by definition of a rewriting extension.
In the following, we assume for convenience that the sprouts of U, V are labeled by different sets of variables.

Lemma 6. ≥ is a quasi-order on clean drags, called subsumption, whose strict part is a well-founded order. Two clean drags are equivalent modulo subsumption iff they are equal modulo variable renaming and the numbers of roots of their rooted vertices coincide.

The subsumption quasi-order for drags, despite its name, does not generalize the subsumption quasi-order for terms, which takes only the substitution into account, not the context. The existence of cycles in drags makes it impossible, in general, to separate the substitution from the context. Our subsumption quasi-order therefore corresponds to what is called encompassment of terms, that is, a subterm of one is an instance of the other. On the other hand, its equivalence generalizes the case of terms, since encompassment and subsumption for terms have the same equivalence.
Given a clean drag D one of whose vertices u is rooted, it is always possible to add new roots at u by composition with a rewriting extension, namely the rewriting extension z, {z → u}, where z is a fresh many-rooted sprout. It is likewise possible to remove a root (provided u is accessible from some other vertex in D in case it has a single root) by composition with the rewriting extension z, {z → u}, where z is a fresh rootless sprout. On the other hand, a single-rooted vertex u that is not accessible from any other vertex cannot lose its only root by composition with a rewriting extension. So, there may be many equivalent rewriting extensions of a pattern L, whose compositions with L will only differ in the number of roots of the vertices of the resulting drags. This will be used to ease the construction of most general unifiers, by choosing the number of roots that makes unification easiest.

Marking algorithm for unification
Since subsumption is well-founded, the set of solutions of a unification problem U[u] = V[v] has minimal elements when non-empty. What is yet unclear is how to compute them, and whether there are several, or just one as for terms. This is the problem we address now. More precisely, we describe a marking algorithm that computes an equivalence relation between the vertices of two drags U, V to be unified, from which their most general unifier is extracted in Section 3.6 if no failure occurs. The algorithm consists of a set of transformation rules operating on the drag U ⊕ V, where some pairs of vertices may already hold a mark, meaning that they are equivalent and should be identified by any solution of the given unification problem. The rules construct this equivalence by marking new pairs of vertices of U ⊕ V, in the style of the Paterson and Wegman unification algorithm [31]. A related idea appears even earlier in [22]. Our treatment is very close to the latter. The algorithm also includes failure rules that return the special expression ⊥.
Identifying C ⊗_ξ U and D ⊗_ζ V at a pair of vertices (u, v) requires that u and v have the same label, and that the property can be recursively propagated to their corresponding pairs of successors. Since C, D are yet unknown, this propagation takes place on vertices of U and V, hence on the drag U ⊕ V. To organize the propagation, we shall mark the pair (u, v) with a fresh red natural number before the propagation has taken place (the initial partner vertices will hold marks 1, . . ., |u|), and turn this mark blue once the propagation has taken place. In case one of u, v is a sprout, no propagation occurs; it is enough to turn the red mark blue (in practice, we can mark it blue from the beginning). To ensure freshness, we shall memorize the number c of pairs of vertices that have been marked so far, and increment c by one at each use of a mark. The drag U ⊕ V in which some pairs of vertices hold the same mark is called a marked unification problem. Two vertices u, v of a marked unification problem U ⊕ V (sometimes denoted U ⊕ V [u][v]) are on the same side if they both belong to either U or V, and on opposite sides otherwise.
Propagation (rule Propagate) therefore computes a succession of marked unification problems, denoted by U ⊕_m V, starting with the marked unification problem U ⊕_0 V whose marked vertices are exactly the partner vertices. Propagation stops when there are no more pairs of internal vertices holding a red mark, unless one of the following two situations occurs: (i) two sprouts v, w hold the same variable; (ii) some vertex u marked with both i and j provides a link between two other different vertices v and w marked i and j, respectively.
In both cases, the pair of vertices (v, w) must now be marked, by rules Variable case and Merge, respectively, if not marked already, possibly giving rise to new propagation steps.
When no rule is applicable, the procedure stops at some step k, where some of the vertices of U ⊕ V are marked and the others unmarked. At that point, an internal vertex u of U ⊕ V is said to be singular if it does not share a mark with another internal vertex. Vertices that are unreachable from the partner vertices are particular singular vertices, but some reachable internal vertices may also be singular. Note that singular vertices may share marks with sprouts.
Failure rules detect situations where unification of U ⊕_k V is not possible. There are three of them: (i) two internal vertices sharing a red mark hold different labels (rule Symbol conflict); (ii) two internal vertices of the same drag share a red mark, since a unifying solution cannot identify them (rule Internal conflict); (iii) the absence of a root at a given vertex makes it impossible to build a unifying solution from the resulting marked unification problem (rule Lack of root). The latter case arises when a sprout s and an internal vertex u of the same drag share the same red mark. We then need to identify them, implying that all edges incoming to s should be transferred to u, hence requiring that u be rooted. The other case is when no root is available to mimic singular vertices on the side where they are missing.
We assume that vertices singled out in the precondition of a rule are pairwise different, and that a pair of vertices sharing a mark is never marked again.
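The marking discipline described above (red marks awaiting propagation, blue marks once processed, failure on conflicts) can be approximated in code. The sketch below is a toy analogue on term-dags only, with union-find classes playing the rôle of shared marks; the encoding (a dict {vertex: (label, successor-list)}, variables as labels starting with '?') and all names are ours, and the root-sensitive rules (Internal conflict, Lack of root) are deliberately omitted, so this is not the paper's algorithm.

```python
# Toy analogue of the marking algorithm on term-dags (not the paper's full
# algorithm): classes of identified vertices are kept in a union-find; the
# pending list plays the role of red marks, completed unions of blue ones.
def unify_dag(graph, u, v):
    """graph: {vertex: (label, [successors])}; '?'-labels are variables.
    Returns the equivalence classes as sorted lists, or None on conflict."""
    parent = {w: w for w in graph}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]      # path halving
            w = parent[w]
        return w

    pending = [(u, v)]                         # "red" marks to propagate
    while pending:
        a, b = pending.pop()
        a, b = find(a), find(b)
        if a == b:
            continue
        (la, sa), (lb, sb) = graph[a], graph[b]
        if la.startswith('?'):                 # Variable case: just merge
            parent[a] = b
            continue
        if lb.startswith('?'):
            parent[b] = a
            continue
        if la != lb or len(sa) != len(sb):     # Symbol conflict
            return None
        parent[a] = b                          # the mark turns "blue"
        pending.extend(zip(sa, sb))            # Propagate to successors
    classes = {}
    for w in graph:
        classes.setdefault(find(w), []).append(w)
    return sorted(classes.values())

# f(h(x)) against f(h(a)), encoded side by side as u- and v-vertices:
g = {'u0': ('f', ['u1']), 'u1': ('h', ['u2']), 'u2': ('?x', []),
     'v0': ('f', ['v1']), 'v1': ('h', ['v2']), 'v2': ('a', [])}
print(unify_dag(g, 'u0', 'v0'))
```

Because identification goes through union-find rather than structural recursion, a cyclic successor entry such as {'v': ('f', ['v'])} is handled without any occur-check, in the spirit of Remark 4 below.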

Definition 18 (Marking algorithm). Given a unification problem U[u] = V[v], the marking algorithm computes a sequence of marked unification problems U ⊕_0 V, . . ., U ⊕_m V, where U ⊕_0 V is the result of marking u_1, v_1 with 1, . . ., and u_n, v_n with n. Then, U ⊕_{m+1} V (m ≥ 0) is defined by the application to U ⊕_m V of one of the 7 rules given below.
We assume that c is the number of pairs of vertices that have been marked so far, and that u : h • i_1 • · · · • i_j denotes the vertex u labeled with h in the drag U ⊕ V and holding the marks i_1, . . ., i_j, painted either blue or red, in the drag U ⊕_m V.

Propagate: …

Internal conflict: if two internal vertices u, v share a red mark, and u and v are on the same side, then U ⊕_{m+1} V is ⊥.

Lack of root: if a rootless internal vertex u shares a red mark with a vertex v such that
- v is a sprout on the same side as u, or
- u and v are on opposite sides and there is a singular vertex w in U ⊕_m V one of whose successors is v,
then U ⊕_{m+1} V is ⊥.

Remark 3. Some rules are reminiscent of the unification rules for terms [7], although we do not use the same rule names, except for Merge. For example, we use Propagate here rather than Decompose to stress the fact that drags cannot be treated as terms. The failure rules also depend on the roots present in an equivalence class, since drag equality checks their number at all pairs of corresponding vertices.

Remark 4. Unification of finite terms differs from unification of infinite rational terms by only one rule, called occur-check. Since terms and rational terms are two particular cases of drags, one might expect that the occur-check rule applies in case the occur-check cannot be solved by forming a cycle. This is indeed a particular case of the first alternative of Lack of root.
Example 9. In our example of Fig. 11, unification of the initial two drags proceeds in eleven steps and succeeds. Propagation steps are labeled by the red mark processed. For instance, in the first step, we apply the Propagation rule to the pair of vertices g • 1. As a consequence, their successors labeled h are marked 2 and the marks in the two vertices f • 1 are now blue. The same happens with the second step, where Propagation is applied to the pair of vertices h • 2, so that their successors, the two vertices labeled f on the one hand, and the vertices labeled z, g on the other hand, are marked 3 and 4, respectively. In the case of Transitivity, steps are labeled by the generated mark. This explains why some steps have the same label. For instance, this happens with the two steps labeled 9, where the first one is the application of a Transitivity step to vertices f • 3 • 7, f • 3, and y • 7. As a consequence, mark 9 is added to f • 3 and y • 7.
When a red mark labels a sprout, we apply the Variable case rule, as in step 4, and the mark is simply turned blue.

Example 10. An example of failure is given in Fig. 12. The first 5 steps are all Propagation or Variable case steps. Step 6 is a Transitivity step and step 7 an Internal conflict, since the two vertices marked 7 are on the same side and both are internal.
Note that we violate our definition of a unification problem by having a common variable z across the "=" sign. We could of course have two different variables z, z′, and a third successor to the f vertex, a sprout labeled z on the left and a sprout labeled z′ on the right. We would then satisfy the constraint, at the price of a few more steps before finding the failure. Carrying out the precise calculation in this case is left as an exercise.
An important, immediate property of the unification rules is termination: Lemma 7. Unification rules terminate.
Proof. Since a pair of identical vertices is never marked, a pair of different vertices is never marked twice, and added sprouts take the place of unreachable vertices, which are never marked, the number of marked vertices of a unification problem is bounded, hence the rules terminate.

Unification congruences
Correctness of a set of unification rules is the property that the solutions of a unification problem are preserved by application of the rules, until some normal form is obtained which contains them all.Defining precisely what preservation means is the problem we tackle now.
As a general fact, congruences are at the heart of unification and of unification algorithms.In our case, solutions define congruences, and markings define congruences as well.Preservation then relates both kinds of congruences, those defined by markings being coarser than those defined by solutions.
The notion of congruence on terms applies directly to drags:

Definition 19. An equivalence relation ≡ on the set of vertices of a drag U ⊕ V is a congruence if it satisfies the following properties:
1. any two equivalent internal vertices u, v have identical labels;
2. the successors of equivalent internal vertices are pairwise equivalent;
3. sprouts with identical labels are equivalent.
The main difference between terms and drags is that the latter may have cycles, hence a sprout can be equivalent to any other vertex in a given drag while it cannot in a term.
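The three conditions of Definition 19 are directly checkable on a finite encoding. The snippet below is a plain checker under illustrative assumptions of our own, not the paper's: the drag U ⊕ V is given as a dict {vertex: (label, successor-list)}, variables are labels starting with '?', and the candidate equivalence is a list of classes.

```python
# Check the three congruence conditions of Definition 19 on a finite drag
# given as {vertex: (label, [successors])} ('?'-labels mark sprouts) and an
# equivalence given as a list of classes (lists of vertices).
def is_congruence(graph, classes):
    cls = {v: i for i, c in enumerate(classes) for v in c}
    # 3. sprouts with identical labels must be equivalent
    sprouts = [v for v in graph if graph[v][0].startswith('?')]
    for a in sprouts:
        for b in sprouts:
            if graph[a][0] == graph[b][0] and cls[a] != cls[b]:
                return False
    for c in classes:
        internal = [v for v in c if not graph[v][0].startswith('?')]
        for a in internal:
            for b in internal:
                # 1. equivalent internal vertices carry identical labels
                if graph[a][0] != graph[b][0]:
                    return False
                # 2. their successors are pairwise equivalent
                if [cls[s] for s in graph[a][1]] != [cls[s] for s in graph[b][1]]:
                    return False
    return True
```

For instance, on f(h(x)) and f(h(a)) encoded side by side, the equivalence pairing the two f vertices, the two h vertices, and x with a passes all three conditions, while splitting the two h vertices into distinct classes violates condition 2. A cycle poses no difficulty: the successor lists may mention any vertex, including one in the same class.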
We now define the congruence associated with the solutions of a given unification problem: given a solution S, we associate with it the least equivalence =_S on the vertices of U ⊕ V such that: … Note that, if w, w′ are both sprouts of U (or of V) holding the same variable, then they must be sent to the same vertex by ξ (or by ζ), hence they are equivalent.

Lemma 8 (Unification congruence). Given a solution S of a unification problem U[u] = V[v], ∀i ∈ [1..|u|] : u_i =_S v_i, and =_unif is a congruence generated by the set of partner vertices.
Proof. First, =_S is an equivalence associated with a solution; it is therefore a congruence. Further, ξ*(u_i) = u_i and ζ*(v_i) = v_i since u_i, v_i are internal, and v_i = ι(u_i) since S is a solution. It follows that =_unif is the least congruence satisfying this same property, hence is generated by the partner vertices.
Since unreachable ancestors of partner vertices cannot be identified by a solution, it follows that the unification congruence of a solvable unification problem does not contain unreachable ancestors of its partner vertices.
We now define the congruence computed by the unification rules:

Definition 21. Given a marked unification problem U ⊕_k V, we denote by ≡_k the binary relation on the vertices of the drag U ⊕ V generated by all pairs of vertices that share a common mark. When unification succeeds, we define the marking equivalence ≡_unif as ⋃_k ≡_k.
Since the unification rules never remove markings, ≡_k is monotonically increasing with k:

Lemma 9. ≡_k ⊆ ≡_l for all l ≥ k such that ≡_l is defined.
It follows that ≡_unif coincides with ≡_n, defined by U ⊕_n V, the normal form of U ⊕_0 V obtained at step n. We believe that this normal form is unique, a property not needed here.

Lemma 10 (Marking congruence). ≡_unif is a congruence on the vertices of U ⊕ V generated by (u, v).
Proof. By definition, ≡_k is symmetric, hence is an equivalence thanks to Transitivity. Hence ≡_unif is an equivalence. Since unification has terminated with success, Propagation and Lemma 9 ensure the first two properties of a congruence, and Merge and Lemma 9 the third. Finally, Initialization and Lemma 9 ensure the last required property.
It should by now be clear that, although they are defined quite differently, =_unif and ≡_unif are nevertheless the same congruence on the vertices of U ⊕ V, which is why we adopted very similar notations. The proof of this key property is the matter of the next two sections.
Proof. Follows directly from Definition 22.
The equivalence classes of a congruence in solved form can therefore contain any number of sprouts, but at most one internal vertex from each drag U , V .
We now show that the unification rules deliver solved forms:

Lemma 14. Assuming the unification problem U[u] = V[v] does not fail, the equivalence ≡_unif defined on the vertices of U ⊕ V by a marked unification problem in normal form is the least congruence in solved form generated by (u, v).
Proof. By Lemma 7, ≡_unif is well defined, and by Lemma 10, it is a congruence generated by (u, v). It is the least such congruence, since it contains the partner vertices and any congruence is closed under Propagate, Transitivity, and Merge.
We are left with showing that a failure rule applies when ≡_unif is not in solved form, contradicting the assumption of a successful unification.
for some i, hence Internal conflict applies at all steps from k + 1.

Assume the equivalence class of u contains a single other vertex
By assumption, the class of v contains the same two elements at all steps k′ ≥ k. Hence Root conflict applies to the result of unification. 3. Let u ∈ Int(U) be rootless. We proceed by contradiction, showing in both sub-cases that a failure rule applies to the result of unification, contradicting the assumption of success. Assume there exists a vertex v equivalent to u which is either a sprout of U or the successor in V of a vertex w which is singular. By definition of ≡_unif, there exists some … for some i, hence Lack of root applies at step k by Lemma 12. Since U ⊕ V remains unchanged during the monotone marking process, Lack of root applies at all steps k′ ≥ k, hence to the result of unification.

Construction of the most general unifying extensions
We now show that a solved form is always solvable, hence the name. Here, the input is a congruence in solved form, which can be seen as a specific unification problem. We therefore construct a most general unifying extension for that solved form.

Definition 23 (mgu). Given a unification problem U[u] = V[v] and an equivalence ≡ on the vertices of U ⊕ V which is in solved form, we define the unifying extensions C, ξ of U and D, ζ of V, as well as the renaming ι. Let S = {u, s_1, . . ., s_{m_S}; t_1, . . ., t_{n_S}, v} be an equivalence class containing internal vertices u from U and v from V (each possibly absent), m_S ≥ 0 sprouts {s_i}_i originating from U and n_S ≥ 0 sprouts {t_j}_j originating from V. The construction is by case on the form of S.
At step 1, we set up an infrastructure of fresh sprouts in C and D that will serve to connect C, D with U, V and to ensure that each vertex in the composition has zero or one root. At step 2, we define the mapping ι. At step 3, we define the successor functions for C and D.
1. For each rooted internal vertex u in U (resp., v in V) belonging to some class S, we include a fresh sprout s_S : x_S in C (resp., t_S : y_S in D), with i + m_S roots (resp., i + n_S roots), where i is the number of roots of the internal vertex v from V (resp., u from U) belonging to S if there is one, and otherwise 1.
For each rooted sprout r in U (resp., in V) such that r belongs to the equivalence class of an internal vertex in U (resp., in V), we include a fresh rootless sprout s_r in C (resp., t_r in D).
Define ξ_C(s_S) = u, ξ_C(s_r) = r, and, for each sprout s_i in S, ξ_U(s_i) = s_S (resp., for each sprout t_j in S, ζ_V(t_j) = t_S).
(The root of s_S will disappear in the composition, while the edges of C ending in s_S will then end up in u.)
2. - For each class S containing internal vertices u, v from U, V respectively, define ι(u) = v.
- For each class S containing a (necessarily single) singular internal vertex u from U (resp., v from V), include in D (resp., C) a fresh internal vertex u_S, equipped with n_S + r roots, where r is the number of roots, zero or one, of u, with L_D(u_S) = L_U(u) (resp., v_S equipped with m_S + r roots, where r is the number of roots of v, and L_C(v_S) = L_V(v)).
3. For each internal vertex v_S in C (resp., u_S in D) associated with the class S, we define X_C(v_S) (resp., X_D(u_S)) as the tuple w_1 . . . w_k, where, denoting by S_i the class of v_i (resp., u_i): …

Before showing that we have defined a solution, we develop two examples. In both cases, the given congruence in solved form is obtained from the marking congruence resulting from applying the unification algorithm. The solution obtained is therefore the most general one for the starting unification problem, not only for the solved form.
Example 11. Fig. 13 shows the two marked drags obtained in Fig. 11 by our unification algorithm, as well as the context drag, switchboard, and overlapping drags obtained by composition with the two marked drags.
The equivalence on vertices is in solved form and has 5 classes, whose elements are given by the name of their drag (U or V , we assume that U is the drag on the left and V on the right), their label and their marks.
At step 2, we include in C an internal vertex, which mirrors the singular vertex labeled a in V, with 1 root, since there is a single sprout in U. Accordingly, we define ξ_U(x) = 1.
Example 12. Fig. 14 shows how cycles may result from unifying non-linear drags. The congruence obtained by unifying the input drags at the pair of roots labeled by f has 3 classes: … At step 1, corresponding to the rooted internal vertices in U ⊕ V, sprouts x_1, x_2 are included in C and sprouts y_1, y_2 in D, all of them with 1 root each. Accordingly, ξ_C(x_1) = … Finally, at step 3, the successor of the vertex labeled h in C is defined to be x_2 and the successor of the vertex labeled h in D is defined to be y_2.
Note that, in the figure, the left-hand side h vertex of the unified drag is the mirror of the h vertex of the right-hand side input drag in the left overlap, while the right-hand side h vertex is the mirror of the h vertex of the left-hand side input drag in the right overlap. So the two overlaps are not really identical as drags, although their drawings are the same.

Proof. We show first that C, ξ and D, ζ are rewriting extensions. We carry out the proof for C, ξ, the other being similar.
-First, the switchboard ξ is clearly well-defined.
-Totality of ξ_U: each sprout s of U belongs to some class S, hence is mapped by ξ to s_S.
-Surjectivity and linearity of ξ_C: by construction, for every internal rooted vertex, say u, belonging to a class S, there exists s_S in C such that ξ(s_S) = u. And for every rooted sprout r in U, there is a sprout s_r in C such that ξ(s_r) = r.
Finally, C is linear by construction.
-Cleanliness: let u be a vertex in C ⊗_ξ U. By totality, it cannot be a sprout of U.
If u is an internal vertex of C, then none of its ancestors can be vertices of U, hence all are mirror vertices of vertices in V with the same number of roots by construction, which ensures that u is accessible in C ⊗_ξ U. If u is an internal vertex of U, then u is accessible in U, hence in C ⊗_ξ U, from some vertex u′ which is rooted in U. If u′ is rooted in C ⊗_ξ U, we are done. Otherwise, u′ must have a predecessor in C, which is accessible, as we have already proved.
If u is a sprout of C, then u is a fresh sprout s_S, the equivalence class S being a set of sprouts of U, V. In that case, depending on whether all sprouts are on the V side or not, u has an ancestor in either U or C, which must be accessible by the two previous cases, hence u is accessible.

Let us now show that C ⊗_ξ U and D ⊗_ζ V are equal modulo renaming:
-By definition, the labels of internal vertices in C ⊗_ξ U and D ⊗_ζ V coincide. Moreover, for each pair of sprouts s_S and t_S, their label is x_S in both drags (having x_S and y_S ≠ x_S instead would require an additional variable renaming to show that both drags are equal modulo renaming).
-For each w …
-Finally, the number of roots at each vertex u in C ⊗_ξ U is equal to the number of roots at ι(u) in D ⊗_ζ V.

As an important consequence, we have:

Corollary 1. Unification congruence =_unif and marking congruence ≡_unif coincide.
Proof. Let U[u] = V[v] be a unification problem. The result is clear if it is unsolvable. Otherwise, let ≡ be a congruence in solved form for that problem. By Lemma 15, ≡ is the equivalence associated with the unifying extensions introduced in Definition 23, hence is coarser than =_unif, which is the intersection of all equivalences associated with the solutions of a given unification problem. Now, it is easy to see that =_unif is itself a congruence in solved form, hence is coarser than ≡ by Lemma 14.

Completeness of the unification algorithm
Theorem 1. Let ≡ be the equivalence returned by the unification algorithm for the problem U[u] = V[v] when no failure occurs. Then, mgu(U ⊕ V, ≡) is a most general unifier.
A particular case worth mentioning, as suggested to us by Jan-Willem Klop, is orthogonality. Orthogonal term rewriting systems are confluent, whether terminating or not. The same holds for drag rewriting systems, with the exact same definition of orthogonality:

Definition 26. A drag rewriting system is said to be orthogonal if it does not possess critical pairs.

Note that left-linearity is not needed here: a non-left-linear rule and the linear rule obtained by sharing all sprouts labeled by the same variable define the same rewriting relation on terms. Hence our definition of drag rewriting is inherently linear, as we have already remarked.

Proof. Lemma 16 shows that rewriting has the so-called diamond property under the assumption of orthogonality, hence can be shown confluent by the standard pasting technique.
We are currently developing a new version of drags for which linearity is not built into the definition of composition as it is here. This new model would require the assumption of left-linearity for orthogonal systems.
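The diamond property invoked in the proof above is easy to visualize in the term case, which drags generalize. The sketch below uses two non-overlapping (hence orthogonal) collapsing rules, f(x) → x and g(x) → x, with an illustrative tuple encoding of our own; contracting the two disjoint redexes in either order yields the same result.

```python
# Diamond property for two non-overlapping rules s(x) -> x: contracting the
# disjoint redexes in either order joins immediately (terms as nested tuples).
def step(t, s):
    """Contract all outermost s-redexes in t (s is a unary symbol)."""
    if isinstance(t, tuple):
        if t[0] == s:
            return t[1]                    # s(x) -> x
        return (t[0],) + tuple(step(a, s) for a in t[1:])
    return t

t = ('f', ('g', ('a',)))                   # the term f(g(a))
left = step(step(t, 'f'), 'g')             # contract the f-redex first
right = step(step(t, 'g'), 'f')            # contract the g-redex first
print(left == right == ('a',))             # both orders converge
```

Since the two left-hand sides f(x) and g(x) admit no overlap, the residual of one redex after contracting the other is still a redex, which is exactly the one-step diamond closed by the standard pasting argument.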

Related work
The first, dominant model for graph rewriting was introduced in the mid-seventies by Hartmut Ehrig, Michael Pfender and Jürgen Schneider [14]. Referred to as DPO (Double Push-Out), this purely categorical model was then extended in various ways, but also specialized to specific classes of concrete graphs, namely those that do not admit cycles [35]. In particular, termination and confluence techniques have been elaborated for various generalizations of trees, such as rational trees, directed acyclic graphs, jungles, term-graphs, lambda-graphs, as well as for graphs in general. See [19] for a survey of various forms of graph rewriting and of available analysis techniques.
DPO applies to any category of graphs that has pushouts and unique pushout complements [12]. A rule is a span L ← I → R, where I is the interface specifying which elements (vertices and edges) of the input graph G matching the left-hand side L by an injective morphism m are preserved by the transformation, the elements in m(L \ I) being removed from G while the elements in R \ I are added to G. The term DPO refers to the two pushouts generated by the span that define the result of a rewrite step. DPO suffers from two drawbacks: applying a rewriting rule fails in case it results in dangling edges, and rules do not have variables, except in the case of symbolic graphs [29], where variables may just denote values.
One might argue that the first drawback has not completely disappeared with drags: a left-hand side of a rule may match a drag D with a switchboard which is ill-behaved with respect to the right-hand side R of the rule, hence forbidding its application. However, this can only happen in a very restricted case: D must contain a loop on a vertex labeled f, and the rule must be of the form f(. . ., x, . . .) → x with one root on each side, the variable x matching the loop of D.
Categorical approaches are very general: they apply to many different kinds of graph structures. Besides DPO, the most popular one, they include many variations: matching by a non-injective morphism [12], arbitrary adhesive graph categories [12], single-pushout transformation (SPO [13,36]), Sesqui-Pushout transformation (SqPO [6]), AGREE [5], and Hyperedge Replacement Systems [11]. A detailed comparison of the approach based on drags with all these approaches is not obvious and is carried out in [10].
Drag rewriting aims instead at providing a faithful generalization of term rewriting techniques to a certain class of graphs named drags, by generalizing to drags all constructions underlying term rewriting, i.e., subterm, substitution, matching, replacement and unification. This is done constructively by providing a composition operator for drags, which does not pop up in the other approaches, which aim at describing subgraph replacement abstractly. As a consequence, for a long time neither graphs nor rules included variables that can be substituted in the transformation process. An old work that has similarities with drag rewriting, in particular the objective of generalizing term rewriting in a natural way, is the hypergraph model of Bauderon and Courcelle [2]. Like drags, it has symbols with arities as well as a list of roots, called sources there. It also has an algebraic theory based on the same sum operation, as well as operations on sources which are quite different from our composition operator, since there is no notion of variable in their model. Rewriting is done by exhibiting an injective morphism first, and then using gluing for constructing the right-hand side, in a way which resembles DPO very much. A recent approach that also has some similarities with drag rewriting is port graph rewriting [17], where graphs include ports and roles, which, in a way, play a rôle similar to that of roots and variables in drags. However, the transformation process remains similar to DPO graph rewriting with interfaces [3].
Since most of these general approaches lack variables, most works that study graph unification concentrate on the specific case of directed acyclic graphs (dags), which are used to represent terms with shared subterms (see, e.g., [31]). A preliminary attempt at handling variables in graph unification is [30], where variables are used to represent labels equipping the vertices or edges. A quite more general approach is [35], where variables represent hyperedges that may be substituted by pointed hypergraphs, but unification is solved there for a very restrictive case only. More recently, Hristakiev and Plump consider graph unification for their graph programming language GP2 [21]. Graphs in GP2 are symbolic graphs whose attributes' values are given by variables satisfying some set of constraints [29]. Variables are not substituted by graphs, but by constrained values.
In contrast, drag variables are real variables as in terms, and drag unification is shown here to be unitary, and decidable in quadratic time and space, a bound which we believe is not tight. This major result does not only subsume unification of trees, dags and jungles, but also of rational trees, dags and jungles, as we shall discover in the concluding remarks. The complexity analysis exploits the fact that the successors of a vertex are ordered and their number is fixed by the symbol labeling that vertex. Relaxing these constraints would blow up the number of most general unifiers, resulting in a non-polynomial complexity of matching and unification.
Confluence of graph transformation systems was first studied by Plump, who defined the notion of graph critical pairs and proved their completeness, but also showed that local confluence is undecidable already for terminating systems [32][33][34]. He also considered the case of symbolic graphs [20]. A main problem with Plump's notion of critical pair is that there are too many of them. More precisely, according to Plump's definition, the set of critical pairs of two rules r1, r2 consists of all pairs of transformations H1 ⟵_{r1} G ⟶_{r2} H2 that are parallel independent (see, e.g., [12]) and such that G is an overlap of L1 and L2. This means that, in principle, to compute all possible critical pairs, we need to compute all possible overlaps of L1 and L2 and check whether they are parallel independent. Moreover, even if it is difficult to estimate the exact number of critical pairs, since it is difficult to estimate how many of these pairs of transformations are parallel independent, we know that many of them are useless. Less prolific notions of critical pairs have been introduced in [1,26,27]. For instance, [26] includes an example showing how much the number of critical pairs may vary depending on the approach considered. The example considers the definition of finite automata in terms of graph transformation. More precisely, a finite automaton is represented by a graph including: a) the state/transition diagram of the automaton; b) a cursor (represented by a loop) on the vertex denoting the current state of the automaton; and c) a queue of symbols representing the word to be recognized. The transformation rules then describe how the given automaton works, i.e., when the first symbol in the queue is recognized by the automaton, the movement of the cursor and the deletion of the symbol. In this example, computing the critical pairs of that rule with itself gave the following results: the number of all overlaps of the left-hand side of the rule with itself
was 51602; the number of critical pairs according to Plump's definition was 21478; the number of critical pairs computed using the method presented in [27] was 49; finally, the number of critical pairs computed using the method presented in [1,26].

Recently, local confluence was shown to be decidable for terminating graph rewriting with interfaces [3], where an interface is a subset of the items of the given graph that is used to define an operation of graph composition by connecting the interfaces of the given graphs. Rewriting a graph with an interface, according to [3], then means rewriting the graph while leaving the interface invariant: the interface restricts the application of rules, since it must be preserved. For instance, a rule may not be applied if, as a consequence, a vertex in the interface would be deleted, or two vertices in the interface would be merged. With respect to confluence, a main difference between standard DPO rewriting and this variation is that, in DPO rewriting, two graphs G1, G2 are considered joinable if they can be rewritten into isomorphic graphs H1, H2, respectively, whereas when the graphs have an interface I, the existence of an isomorphism h : H1 → H2 such that h(v) = v for every v ∈ I is required. This difference is the reason why joinability of critical pairs in standard DPO graph transformation does not imply local confluence, while that implication does hold for graphs with interfaces, implying the decidability of confluence of terminating DPO rewriting of graphs with interfaces. Let us see an example from [3]. Consider two rewrite rules on directed graphs with labeled edges, where ⟶a represents an edge labeled by a and vertices are subindexed with the numbers 1 and 2 to identify them and make the morphisms explicit (the rules are displayed below). Among the possible critical pairs, only two have a non-trivial overlap, and both are trivially joinable. However, the two rules are not confluent, as the diverging derivations displayed below show.
Let us see what happens when we work with graphs with interfaces. If we associate an interface, consisting of the two vertices 1 and 2, with the first graph that gave rise to the first critical pair above, we obtain the situation displayed below.
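The interface-sensitive notion of joinability can be made concrete: with an interface I, two derived graphs count as equal only if some isomorphism between them fixes I pointwise. The following brute-force sketch (ours, purely illustrative and exponential; graphs are sets of labeled edges over integer vertices, and all names are hypothetical) checks both notions:

```python
from itertools import permutations

def isomorphic(g1, g2, interface=()):
    """g1, g2: sets of (src, label, dst) edges over integer vertices.
    With a non-empty interface, the isomorphism must be the identity on it."""
    v1 = {v for (s, _, d) in g1 for v in (s, d)}
    v2 = {v for (s, _, d) in g2 for v in (s, d)}
    if len(v1) != len(v2):
        return False
    for perm in permutations(v2):
        h = dict(zip(sorted(v1), perm))
        if any(h.get(v) != v for v in interface):
            continue                      # must fix the interface pointwise
        if {(h[s], l, h[d]) for (s, l, d) in g1} == g2:
            return True
    return False

# Single-edge graphs 1 --a--> 2 and 2 --a--> 1:
g1, g2 = {(1, 'a', 2)}, {(2, 'a', 1)}
plain = isomorphic(g1, g2)                      # swap 1 and 2
fixed = isomorphic(g1, g2, interface=(1, 2))    # no identity-on-{1,2} iso
```

On these two graphs a plain isomorphism exists (swap the two vertices), but none fixing the interface {1, 2}, mirroring the distinction between the two joinability notions discussed above.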

Definition 5. The disjoint union of two drags D, E, written D ⊕ E, is the drag obtained by first renaming D and E apart, and then forming the union of their labeled vertices and edges, and the concatenation of their roots, those of D coming first. In case D and E don't share vertices and/or variables, their vertices and/or variables are kept identical so as to facilitate technicalities: D ⊕ E is then the juxtaposition of D and E (in this order). Since juxtaposition is clearly associative (up to vertex renaming), we denote by ⊕_i D_i the juxtaposition of several drags.

Definition 6. Given drags D = ⟨V, R, L, X, S⟩ and D′ = ⟨V′, R′, L′, X′, S′⟩, whose respective internal vertices are I and I′, a (drag) morphism o : D → D′ is a map from V to V′ such that, in particular: 3. o preserves the successor function: ∀u ∈ I : X′(o(u)) = o(X(u)); 4. o forces sharing: ∀s : x, t : x ∈ S : o(s) = o(t).

Definition 7. A morphism o : D → D′ is an injection if (i) its restriction to I is injective, and (ii) a vertex v ∈ I is rooted whenever there exists an edge (u′, o(v)) in D′, called a new edge, such that either u′ ∈ I′ \ Im(o), or u′ = o(u) for some vertex u of D and (u, v) is not an edge of D.
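A minimal data-structure sketch of the tuple ⟨V, R, L, X, S⟩ and of juxtaposition may help fix intuitions (field and function names are ours, not the paper's notation):

```python
from dataclasses import dataclass

# Minimal sketch of a drag <V, R, L, X, S>: vertices, an ordered list of
# roots, a labeling, an ordered successor function for internal vertices,
# and sprouts (variable-labeled vertices without successors).
@dataclass
class Drag:
    vertices: set
    roots: list      # ordered; a vertex may occur several times
    label: dict      # vertex -> function symbol or variable name
    succ: dict       # internal vertex -> ordered tuple of successors
    sprouts: set     # subset of vertices labeled by variables

def juxtapose(d, e):
    """Disjoint union D ⊕ E for drags already renamed apart:
    union of vertices, labels, and edges, with the roots of D coming first."""
    assert not d.vertices & e.vertices, "rename apart first"
    return Drag(d.vertices | e.vertices,
                d.roots + e.roots,            # roots of D first
                {**d.label, **e.label},
                {**d.succ, **e.succ},
                d.sprouts | e.sprouts)

# A one-vertex drag for constant a, and a single rooted sprout labeled x:
d = Drag({'u'}, ['u'], {'u': 'a'}, {'u': ()}, set())
e = Drag({'x'}, ['x'], {'x': 'x'}, {}, {'x'})
j = juxtapose(d, e)
```

The `assert` makes explicit the "renamed apart" precondition; concatenating `d.roots + e.roots` reflects that roots form a list, not a set, with those of D first.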

Definition 8 (Switchboard). Let D = ⟨V, R, L, X, S⟩ and D′ = ⟨V′, R′, L′, X′, S′⟩ be drags. A switchboard ξ for D, D′, equivalently an extension ⟨D′, ξ⟩ of D, is a pair of partial injective functions ξ_D : S → Dom(R′) and ξ_{D′} : S′ → Dom(R) such that
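Although the conditions on switchboards are elided here, the effect of composition can be sketched: each wired sprout disappears, and every edge pointing at it is redirected to the vertex its variable is wired to. The following toy function (ours, purely illustrative; root bookkeeping, i.e., consuming one root per wired sprout, is deliberately omitted) acts only on successor maps:

```python
def compose(succ1, succ2, wiring):
    """succ1, succ2: vertex -> ordered successor tuple, for two drags
    already renamed apart; wiring: sprout -> target vertex across the drags.
    Drops the wired sprouts and redirects every edge into them."""
    succ = {**succ1, **succ2}
    return {v: tuple(wiring.get(w, w) for w in ws)
            for v, ws in succ.items() if v not in wiring}

# Wiring sprout x of f(x) to the constant a yields f(a):
fa = compose({'f': ('x',)}, {'a': ()}, {'x': 'a'})
# Wiring x back to f itself creates a cycle f -> f:
loop = compose({'f': ('x',)}, {}, {'x': 'f'})
```

The second example shows why composition subsumes both substitution and context grafting: wiring a sprout back into its own image creates a cycle, something plain term substitution can never do.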

Lemma 4. Given a clean drag D and a pattern L, there exists an injection o : L → D iff there exist a renaming L′ of L and a clean rewriting extension ⟨C, ξ⟩ of L′, called a match of L in D at o, such that D = C ⊗_ξ L′.
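Lemma 4 reduces matching to finding such an injection. A naive, exponential search (ours, for illustration only; we identify each sprout with its variable, so the sharing condition of Definition 6 amounts to requiring equal variables to receive equal images) can be sketched as follows:

```python
from itertools import permutations

def match(p_succ, p_label, p_sprouts, d_succ, d_label):
    """Return a map o embedding the pattern into the subject, or None.
    p_succ/d_succ: vertex -> ordered successor tuple.
    Internal pattern vertices map injectively, preserving labels and
    ordered successors; sprouts (identified with their variables) may map
    anywhere, but a repeated variable must keep a single image."""
    internal = sorted(v for v in p_label if v not in p_sprouts)
    d_vs = sorted(d_label)
    for image in permutations(d_vs, len(internal)):
        o = dict(zip(internal, image))            # injective by construction
        if any(p_label[v] != d_label[o[v]] for v in internal):
            continue                              # label clash
        ok = True
        for v in internal:
            for pv, dv in zip(p_succ[v], d_succ[o[v]]):
                if o.setdefault(pv, dv) != dv:    # forces sharing
                    ok = False
        if ok and all(len(p_succ[v]) == len(d_succ[o[v]]) for v in internal):
            return o
    return None

# Pattern f(x, x) matches subject f(a, a) but not f(a, b):
yes = match({'f1': ('x', 'x')}, {'f1': 'f', 'x': 'x'}, {'x'},
            {'f': ('a', 'a'), 'a': ()}, {'f': 'f', 'a': 'a'})
no = match({'f1': ('x', 'x')}, {'f1': 'f', 'x': 'x'}, {'x'},
           {'f': ('a', 'b'), 'a': (), 'b': ()},
           {'f': 'f', 'a': 'a', 'b': 'b'})
```

The non-linear pattern f(x, x) illustrates the sharing constraint: both occurrences of x must land on the same subject vertex, so the second subject is rejected.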

Fig. 5.

Let A = o(Int(L)) be the image by o of the internal vertices of L, and B = Ver(D) \ A be its complement, the set of vertices of D which are not the image by o of an internal vertex of L. Vertices in B will be the internal vertices of C, so that o can be extended to the internal vertices of C by the identity. Vertices in A are the renamings by o of the internal vertices of L. Edges in D between vertices of B are edges from C; and edges in D between image vertices of L are edges from L, which may involve a sprout of L mapped by o to the image of an internal vertex of L, a first difficulty. Another difficulty arises with edges in D between a vertex of A and a vertex of B.
where x1, x2 denote the two occurrences of x in L. Then, D = C ⊗_ξ L. The injection embedding L into D maps the vertex f of L to the vertex f of D, and all three sprouts of L to the vertex a of D, as defined in Example 2.
coincide in the figure), and the pair of rewriting extensions ⟨C, ξ⟩, ⟨D, ζ⟩ is therefore a solution. Note that flipping the two roots of V would give a different unification problem, whose solution would simply require a slight change to ζ: changing the order of roots in either U or V does not alter unifiability.

Proof.
The relation ⊒ being reflexive, we show transitivity. Let U, V, W be three clean drags whose sprouts are labeled by pairwise disjoint sets of variables, such that U ⊒ V ⊒ W. Then, U = C ⊗_ξ V and V = D ⊗_ζ W, for rewriting extensions ⟨C, ξ⟩ of V and ⟨D, ζ⟩ of W. By Lemma 2, U = E ⊗_θ W for some rewriting extension ⟨E, θ⟩ of W, hence U ⊒ W.

Assume that U ⊒ V ⊒ U, hence U = C ⊗_ξ V and V = D ⊗_ζ U, using the same notations as above. It follows that C and D have no internal vertex, hence are bunches of sprouts. Therefore, U and V have the same internal vertices, while their sprouts correspond bijectively. Further, U and V must have the same (modulo renaming) rooted vertices, since a rootless vertex cannot become rooted by composition, although the number of roots of a rooted vertex can be increased by composition, or decreased down to zero.

Assume now that U ⊐ V. Then, either |U| > |V| or |U| = |V|. In the latter case, U and V cannot have the same number of variables labeling their sprouts, since otherwise U and V would be identical up to variable names, contradicting our assumption. Since U ⊐ V, we have U = C ⊗_ξ V, where C cannot have internal vertices since |U| = |V|, and ξ cannot be bijective, since otherwise U and V would be equivalent. Hence ξ maps at least two variables of V to a same (rooted) variable of C, which becomes a variable of U in the composition. Well-foundedness follows, since the number of variables labeling the sprouts of a drag of a given size is bounded.
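The termination argument above can be condensed into an explicit measure (our formulation, not stated in this form in the text, with B(n) a bound on the number of variables of any drag of size n):

```latex
\mu(U) \;=\; \bigl(\,|U|\,,\; B(|U|) - \#\mathrm{Var}(U)\,\bigr)
\;\in\; \mathbb{N}^{2}
\qquad \text{(ordered lexicographically).}
```

Along a strictly descending chain U_1 ⊐ U_2 ⊐ ⋯, either the size strictly decreases, or it stays equal while composition merges at least two variables, so #Var(U_i) < #Var(U_{i+1}) ≤ B(|U_i|); in both cases μ decreases lexicographically in ℕ², which is well-founded.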
figure. The five classes are:

According to step 1, we include variables x1 in C and y1, y2 in D, corresponding to the rooted vertices U[g•1•8], V[g•1•4], and V[f•3•9], respectively. Moreover, x1 has three roots, since the class of U[g•1•8] includes one rooted vertex in V and one sprout in U. Similarly, y1 has one root and y2 has two. Since there are no rooted sprouts in U or V, no additional sprouts are added to C or D. Accordingly, we include in C a vertex labeled h, mirroring vertex V[h•2], with three roots, since V[h•2] has one root and there are two sprouts from U in the corresponding equivalence class. Similarly, we include in D a vertex labeled h, mirroring vertex V[h•3], with three roots. In this case, ξ_U sends both x's to the two roots of the vertex labeled h in C, and ζ_V sends both x's to the two roots of the vertex labeled h in D.
We now prove that ⟨C, ξ⟩ and ⟨D, ζ⟩ are rewriting extensions and a solution of the given unification problem.

Lemma 15. Let ≡ be an equivalence in solved form for the unification problem U[u] = V[v]. Then, the most general unifying extensions ⟨C, ξ⟩ and ⟨D, ζ⟩ are a solution of the unification problem.

Fig. 14. Building a most general unifier from a solved congruence.
and both sprouts have one root in C ⊗_ξ U and D ⊗_ζ V, by definition. If u is an internal vertex in U whose equivalence class S does not include any internal vertex of V, then, by definition, u and ι(u) have the same number of roots in C ⊗_ξ U and D ⊗_ζ V. Finally, if S includes internal vertices u, v with ι(u) = v, then, if both u and v are rooted in U and in V, by definition u and v are rooted in C ⊗_ξ U and D ⊗_ζ V; but if one of the vertices u or v is unrooted in U or V, respectively, then both vertices are unrooted in C ⊗_ξ U and D ⊗_ζ V.
where f has arity k, s1, …, sk are the k successors of u, and t1, …, tk are those of v; then U ⊕_{m+1} V is obtained from U ⊕_m V by marking the pairs (s1, t1) with c + 1, …, (sk, tk) with c + k, and turning the mark i to blue. 2. Variable case. If s

and for each sprout tj in S, ζ_V(tj) = u_S (resp., for each sprout si in S, ξ_U(si) = v_S);
- For each class S containing no internal vertex, include two sprouts s_S in C and t_S in D, both labeled x_S, equipped with 1 + m_S and 1 + n_S roots respectively. Define ι(s_S) = t_S, and for each sprout si in S, ξ_U(si) = s_S (resp., for each sprout tj in S, ζ_V(tj) = t_S).