Simplicity in Eulerian Circuits: Uniqueness and Safety

An Eulerian circuit in a directed graph is one of the most fundamental Graph Theory notions. Detecting whether a graph $G$ has a unique Eulerian circuit can be done in polynomial time via the BEST theorem by de Bruijn, van Aardenne-Ehrenfest, Smith and Tutte, 1941-1951 (involving counting arborescences), or via a tailored characterization by Pevzner, 1989 (involving computing the intersection graph of simple cycles of $G$), both of which thus rely on overly complex notions for the simpler uniqueness problem. In this paper we give a new linear-time checkable characterization of directed graphs with a unique Eulerian circuit. This is based on a simple condition of when two edges must appear consecutively in all Eulerian circuits, in terms of cut nodes of the underlying undirected graph of $G$. As a by-product, we can also compute in linear time all maximal $\textit{safe}$ walks appearing in all Eulerian circuits, for which Nagarajan and Pop proposed in 2009 a polynomial-time algorithm based on Pevzner's characterization.


Background
Finding an Eulerian circuit in a graph, namely a closed walk passing through every edge exactly once, is arguably the most famous problem in Graph Theory. Euler's theorem from 1741 [5] states:

A graph has an Eulerian circuit if and only if every node has the same number of in-neighbors and out-neighbors.
In this paper all graphs are directed, and we further assume without loss of generality that they are also weakly connected, in the sense that their underlying undirected graph is connected. For simplicity of presentation, we also assume that they have neither parallel edges nor self-loops, otherwise we can replace these by paths of length two.
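The replacement of parallel edges and self-loops by paths of length two can be sketched as follows. This is a minimal illustration, not code from the paper: the function name and the labels `("sub", i)` of the new middle nodes are hypothetical, and we assume these labels do not clash with existing node names.

```python
def subdivide_parallel_edges_and_self_loops(edges):
    """Return an edge list without parallel edges or self-loops.

    Every self-loop and every repeated copy of an edge (u, v) is replaced
    by a path (u, m), (m, v) through a fresh node m. The labels ("sub", i)
    are an assumption of this sketch and must not occur in the input.
    """
    seen, result, fresh = set(), [], 0
    for u, v in edges:
        if u == v or (u, v) in seen:
            mid = ("sub", fresh)          # hypothetical fresh node label
            fresh += 1
            result.extend([(u, mid), (mid, v)])
        else:
            seen.add((u, v))
            result.append((u, v))
    return result
```

In-degrees and out-degrees of the original nodes are unchanged, and each new node has exactly one in-neighbor and one out-neighbor, so the transformed graph is Eulerian whenever the input is.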
The above characterization implies not only that we can check in linear time if a graph is Eulerian (i.e., it has an Eulerian circuit), but also that we can find an Eulerian circuit in linear time: when arriving with an in-coming edge (u, v) at a node v, there is at least one unused out-going edge (v, w), and any such out-going edge can be used to continue constructing the circuit, by a suitable representation of circuits using doubly-linked lists and keeping track of nodes with unused edges (see e.g., Hierholzer's algorithm [7,6]).
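As an illustration, here is a minimal Python sketch of the degree check and of circuit construction. It uses the common stack-based variant of Hierholzer's algorithm rather than the doubly-linked list representation mentioned above. The function names are ours, and the input is assumed to be weakly connected.

```python
from collections import defaultdict

def is_eulerian(nodes, edges):
    """Euler's condition for a weakly connected directed graph."""
    indeg, outdeg = defaultdict(int), defaultdict(int)
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
    return all(indeg[x] == outdeg[x] for x in nodes)

def eulerian_circuit(edges, start):
    """Return one Eulerian circuit as a list of nodes (first == last).

    Assumes the graph is Eulerian and weakly connected.
    """
    out = defaultdict(list)
    for u, v in edges:
        out[u].append(v)
    stack, circuit = [start], []
    while stack:
        v = stack[-1]
        if out[v]:
            stack.append(out[v].pop())   # follow any unused out-going edge
        else:
            circuit.append(stack.pop())  # no unused edge left at v: emit v
    circuit.reverse()
    return circuit
```

Each edge is pushed and popped once, so the running time is linear in the number of edges.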
This choice among out-going edges also means that the graph may admit multiple Eulerian circuits. Indeed, another classical result (for directed graphs) is that the number of Eulerian circuits can be computed in polynomial time, with the BEST theorem by de Bruijn, van Aardenne-Ehrenfest, Smith and Tutte [16,15], from 1941-1951. This theorem states that the number of Eulerian circuits of G equals $t(G) \cdot \prod_{v \in V} (d(v) - 1)!$, where d(v) is called the degree of v and equals the out-degree (equivalently, in-degree) of v, and t(G) denotes the number of arborescences of G (spanning directed trees of the directed graph G) rooted at any fixed arbitrary node of G and directed towards that node. The quantity t(G) can be computed via Kirchhoff's matrix-tree theorem [11] by computing matrix determinants.
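The counting formula of the BEST theorem, t(G) times the product of (d(v) − 1)! over all nodes v, can be evaluated with a short Python sketch. The function names are ours. We assume the input is a weakly connected Eulerian graph and use exact rational arithmetic for the determinant. Circuits are counted up to rotation, i.e., as cyclic sequences of edges.

```python
from fractions import Fraction
from math import factorial

def determinant(M):
    """Determinant by Gaussian elimination over exact fractions."""
    M = [row[:] for row in M]
    n, det = len(M), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            det = -det
        det *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return det

def count_eulerian_circuits(nodes, edges):
    """Number of Eulerian circuits of a connected Eulerian digraph."""
    nodes = list(nodes)
    idx = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    outdeg = [0] * n
    # Laplacian L = D_out - A of the directed graph.
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        outdeg[idx[u]] += 1
        L[idx[u]][idx[u]] += 1
        L[idx[u]][idx[v]] -= 1
    # t(G): delete the row and column of the root (here the first node).
    t = determinant([row[1:] for row in L[1:]])
    product = 1
    for d in outdeg:
        product *= factorial(d - 1)
    return int(t) * product
```

For instance, on the graph with edges in both directions between every pair of three nodes, the sketch returns 3, and the graph has a unique Eulerian circuit exactly when the returned value is 1.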
In some applications, where some unknown object to be reconstructed can be modeled as an Eulerian circuit, one is also interested in checking whether the graph has a unique Eulerian circuit, in order to be certain that the reconstructed object is indeed the correct one. For example, in Bioinformatics, Eulerian circuits are a theoretical model of genome assembly, see e.g., the introductory textbook by Waterman [17]. While the BEST theorem can be applied to check whether the number of Eulerian circuits equals 1, Pevzner proved in 1989 [13] a direct characterization of graphs with a unique Eulerian circuit. We cite below this characterization as stated by Waterman in the textbook [17, Theorem 7.5]:

Graph G has a unique circuit if and only if the intersection graph G_I of simple cycles from G is a tree.
The undirected graph G_I is obtained by decomposing G into simple cycles c_i = (v_i, ..., v_i) (i.e., cycles with all nodes distinct except for v_i). Nodes in G might belong to several such cycles, but each edge can be used in at most one cycle c_i. We add a node C_i to G_I for each such cycle c_i obtained from G. We then add an undirected edge between two nodes C_i and C_j in G_I for each node contained in both c_i and c_j in G. See Fig. 1 for an example and [17, Section 7.2] for more details. Note that if G_I is a tree, then such a decomposition of G is also unique.
Even though this does not rely on the BEST theorem, ultimately it is not very different, since this tree will then correspond to the unique arborescence of G (i.e., giving t(G) = 1). Indeed, assuming G_I is a tree, we can take an arbitrary node v of G, and observe that there is a unique path from any other node u to v, implying a unique arborescence rooted at v and directed towards v. This holds since any node in the same cycle c_i as v has a unique path to it in c_i, and for any node u not in c_i there is a unique path from any node in its cycle c_j to any node in cycle c_i, as G_I is a tree.
Since in practice the input graph may have more than one Eulerian circuit, we may settle for less when reconstructing the unknown object. Namely, we may report instead those walks that are subwalks of all Eulerian circuits. By definition these are also part of the unknown Eulerian circuit, and thus correct for the application at hand. The idea of finding such partial solutions common to all solutions to a problem has appeared concurrently in several fields, such as Bioinformatics (see e.g. [9], and almost all state-of-the-art genome assembly programs), and combinatorial optimization (see e.g. [4] for edges common to all maximum bipartite matchings). Recently, such partial solutions have been called safe [14], and a series of papers proposed algorithms finding all safe partial solutions for other problems. For example, [14,2,3,1] gave characterizations and optimal algorithms for the walks appearing in all edge-covering circuits of a strongly connected graph (that cover each edge at least once, not exactly once). Note that safe walks for edge-covering circuits are also safe for Eulerian circuits (by definition), but they are not all the safe walks. Recently, [10] characterized the paths appearing in all flow decompositions of a flow in a directed acyclic graph. Note that an Eulerian circuit in a graph also induces a flow if we assign flow value 1 to every edge; however, the result of [10] is restricted to acyclic graphs.
In 2009, Nagarajan and Pop [12] proposed the first algorithm for finding safe walks for Eulerian circuits, in a brief note on page 901: "In the case of Eulerian tours [in our terminology: circuits], reconstructing sub-tours [in our terminology: sub-walks] that are part of every Eulerian tour is feasible in polynomial time (see Theorem 7.5 in Waterman [17]) based on finding acyclic subgraphs in the cycle-graph decomposition of the original graph."
In the above, the cycle-graph decomposition is the graph G_I from [17, Theorem 7.5]. Even though the intuition of the authors is the correct one, we argue that this note is incomplete. First, since G_I may not be a tree, it may not be unique, a fact which is overlooked in the above note. Second, not every acyclic subgraph corresponds to a walk appearing in all Eulerian circuits, which we illustrate in Fig. 1. We believe that the authors meant those acyclic subgraphs of G_I with the additional property that none of their edges are contained in a bi-connected component in G_I. Both of these issues suggest that this may not be the "right" characterization of such walks. Moreover, even though this characterization can be fixed, it is still based on the intersection graph G_I of simple cycles of G, thus leading to an algorithm more complex than necessary.

Our contribution
In this paper we simplify both Pevzner's characterization of graphs with a unique Eulerian circuit [13] (also presented in Waterman's textbook [17]), and the incomplete characterization by Nagarajan and Pop of safe walks for Eulerian circuits from [12]. Our idea is to characterize when an edge (u, v) is followed by an edge (v, w) in all Eulerian circuits. Clearly, if d(v) = 1, this is always the case, and if d(v) ≥ 3, it can be easily proved that this is never the case, see Fig. 3(a) (which is also a simple idea behind the BEST theorem). The interesting case is thus when d(v) = 2. Let U(G) denote the underlying undirected graph of G, namely the graph obtained from G by removing the orientation of the edges; note that U(G) is connected since we assume that G is weakly connected. For any node v, denote by U(G) \ v the graph obtained from U(G) by removing v together with all incident edges of v. We say that v is a cut node of U(G) if U(G) \ v is not connected, and we denote by A(G) the set of nodes v of G such that d(v) = 1, or d(v) = 2 and v is a cut node of U(G). We prove the following result:

Theorem. Let G = (V, E) be an Eulerian graph, and let (u, v), (v, w) ∈ E. The walk (u, v, w) is safe for Eulerian circuits (i.e., it is a subwalk of all Eulerian circuits) if and only if it is a subwalk of an Eulerian circuit of G, and d(v) = 1 or v has d(v) = 2 and is a cut node of U(G).

From this theorem we obtain a very simple characterization of graphs with a unique Eulerian circuit, not based on arborescences (as in the BEST theorem), nor on the intersection graph of simple cycles (as in Pevzner's characterization):

Corollary 1. Let G = (V , E) be an Eulerian graph. We have that G admits a unique Eulerian circuit if and only if A(G) = V . Moreover, we can detect if this is the case in time O (|E|).
Indeed, if A(G) = V , once we enter a node v with some edge, we are forced to leave v with a prescribed other edge, and since an Eulerian circuit visits every edge exactly once, the Eulerian circuit is uniquely determined.
The complexity bound follows from the fact that we can compute cut nodes in linear-time [8], and this is the only non-degree based condition to be checked.
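The check of Corollary 1 can be sketched in Python as follows. Here A(G) is taken to be the set of nodes v with d(v) = 1, or with d(v) = 2 and v a cut node of U(G). The function names are ours, the cut nodes are found with a standard iterative low-link DFS (not necessarily the procedure of [8]), and the input is assumed to be Eulerian, weakly connected, and free of parallel edges and self-loops.

```python
def cut_nodes(nodes, edges):
    """Cut nodes of the underlying undirected graph, by iterative DFS."""
    adj = {v: [] for v in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    disc, low, cuts = {}, {}, set()
    for root in nodes:
        if root in disc:
            continue
        disc[root] = low[root] = len(disc)
        root_children = 0
        stack = [(root, None, iter(adj[root]))]
        while stack:
            v, parent, it = stack[-1]
            for w in it:
                if w not in disc:                 # tree edge: descend to w
                    disc[w] = low[w] = len(disc)
                    if v == root:
                        root_children += 1
                    stack.append((w, v, iter(adj[w])))
                    break
                if w != parent:                   # non-tree edge
                    low[v] = min(low[v], disc[w])
            else:                                 # all neighbors of v handled
                stack.pop()
                if parent is not None:
                    low[parent] = min(low[parent], low[v])
                    if parent != root and low[v] >= disc[parent]:
                        cuts.add(parent)
        if root_children >= 2:
            cuts.add(root)
    return cuts

def nodes_in_A(nodes, edges):
    """Nodes with d(v) = 1, or with d(v) = 2 that are cut nodes of U(G)."""
    outdeg = {v: 0 for v in nodes}
    for u, _ in edges:
        outdeg[u] += 1
    cuts = cut_nodes(nodes, edges)
    return {v for v in nodes
            if outdeg[v] == 1 or (outdeg[v] == 2 and v in cuts)}

def has_unique_eulerian_circuit(nodes, edges):
    """Corollary 1: the circuit is unique if and only if A(G) = V."""
    return nodes_in_A(nodes, edges) == set(nodes)
```

Both the degree computation and the DFS touch every edge a constant number of times, so the sketch runs in time O(|V| + |E|).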
While at the beginning of the paper we mentioned that parallel edges and self-loops can be assumed to be absent (because they can be replaced by paths of length two), their presence has a very simple effect on Corollary 1. Parallel edges in an Eulerian graph clearly imply the presence of at least two Eulerian circuits, and self-loops are allowed only for nodes having exactly one other outgoing edge.
In a similar way we can also obtain a simple and linear-time algorithm reporting all maximal safe walks (i.e., not subwalks of other safe walks).

Corollary 2. Let G = (V , E) be an Eulerian graph. We can compute all maximal safe walks for the Eulerian circuits of G (i.e., maximal subwalks of all Eulerian circuits of G) in time O (|E|).
The above corollary is obtained by simply computing an Eulerian circuit C, in time O(|E|), and cutting C at any node v not in A(G) by keeping a copy of v as an endpoint in each resulting segment (see Fig. 2 for an example). As argued above, each such maximal segment with internal nodes in A(G) is a maximal safe walk because Eulerian circuits visit each edge exactly once. Another immediate consequence is that maximal safe walks do not overlap on edges and, since every edge is safe, have total length exactly |E|, which is harder to observe without our theorem. This fact also makes the algorithm run in both input- and output-linear time.
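The cutting procedure can be illustrated by the following Python sketch. The function names are ours, and the input is assumed to be Eulerian, weakly connected, and free of parallel edges and self-loops. For brevity, cut nodes are tested naively by deleting a node and checking connectivity, so this sketch is not linear-time; replacing `is_cut_node` by a DFS-based cut node computation would restore the O(|E|) bound.

```python
from collections import defaultdict

def eulerian_circuit(edges, start):
    """One Eulerian circuit as a node list (stack-based Hierholzer)."""
    out = defaultdict(list)
    for u, v in edges:
        out[u].append(v)
    stack, circuit = [start], []
    while stack:
        if out[stack[-1]]:
            stack.append(out[stack[-1]].pop())
        else:
            circuit.append(stack.pop())
    return circuit[::-1]

def is_cut_node(v, nodes, edges):
    """Naive test: does removing v disconnect the undirected graph?"""
    adj = {x: set() for x in nodes if x != v}
    for a, b in edges:
        if a != v and b != v:
            adj[a].add(b)
            adj[b].add(a)
    if not adj:
        return False
    seen, todo = set(), [next(iter(adj))]
    while todo:
        x = todo.pop()
        if x not in seen:
            seen.add(x)
            todo.extend(adj[x] - seen)
    return len(seen) < len(adj)

def maximal_safe_walks(nodes, edges):
    """Cut one Eulerian circuit at every occurrence of a node not in A(G)."""
    nodes = list(nodes)
    outdeg = defaultdict(int)
    for u, _ in edges:
        outdeg[u] += 1
    A = {v for v in nodes
         if outdeg[v] == 1
         or (outdeg[v] == 2 and is_cut_node(v, nodes, edges))}
    outside = [v for v in nodes if v not in A]
    if not outside:
        # A(G) = V: the unique Eulerian circuit is never cut.
        return [eulerian_circuit(edges, nodes[0])]
    circuit = eulerian_circuit(edges, outside[0])
    walks, current = [], [circuit[0]]
    for v in circuit[1:]:
        current.append(v)
        if v not in A:            # cut here, keeping v in both segments
            walks.append(current)
            current = [v]
    return walks
```

Starting the circuit at a node outside A(G) guarantees that the last segment closes exactly at the end of the circuit, so no segment has to be stitched across the starting point.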

Notation
Let G = (V, E) be a directed graph. We denote V by V(G) and E by E(G). Given v ∈ V(G), we denote by d(v) the number of out-neighbors of v in G (if G is Eulerian, d(v) also equals the number of in-neighbors of v). We say that W = (v_0, ..., v_k) is a walk in G if there is an edge from v_i to v_{i+1}, for all i ∈ {0, ..., k − 1} (the nodes of W do not have to be distinct). Given two walks Q and W, we say that Q is a subwalk of W, or that Q appears in W, if Q, viewed as a string of edges, is a substring of W, viewed as a string of edges. Similarly, if W is a circuit, we say that Q appears in W if Q is such a substring when W is read cyclically. Since we assume G has neither parallel edges nor self-loops (since we can replace them by paths of length two), we can state the above two notions of "appearance" more simply just in terms of nodes.
These four nodes define the cutting points for maximal safe walks in any Eulerian circuit of G. Fig. 2(b): one Eulerian circuit of G, the occurrences of these four nodes, and the maximal safe walks of G (in green).
Fig. 3. Illustration of the two cases in the proof of the theorem. Fig. 3a: Node v has degree 3, and the Eulerian circuit C visits v with edge (u_0, v) (dashed blue), then follows the walk C_0 (solid blue), then the walk C_1 (solid violet) and then the edge (v, w_1) (dashed violet). Fig. 3b: By swapping C_0 and C_1 one obtains an Eulerian circuit C* in which (u, v, w) does not appear. Fig. 3c: Graph G' constructed as in Case 2. Node v is not a cut node and its degree is 2. The original edges are in black, and the newly added length-two paths are in violet; walk (u, v, w) is not safe.
For a directed or undirected graph H, and a node v ∈ V(H), we denote by H \ v the graph obtained from H by removing v together with all incident edges of v. If H is undirected and connected, we say that v is a cut node of H if H \ v is not connected. For a set K ⊆ V(H), we denote by H[K] the subgraph of H induced by K, that is, the subgraph of H obtained by deleting all nodes not in K.

Proof of the theorem
In order to be self-contained and to show that our arguments do not rely on complex results, we prove our theorem without using the BEST theorem. The only arguments are based on simply swapping parts of an Eulerian circuit (Figs. 3a and 3b), and a local construction when removing a non-cut node (Fig. 3c).
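Proof. First assume that (u, v, w) is a subwalk of all Eulerian circuits in an Eulerian graph G. Let C be an Eulerian circuit of G, which thus contains (u, v, w). We will show that d(v) = 1, or d(v) = 2 and v is a cut node of U(G) (equivalently, that v ∈ A(G)). Assume for a contradiction that v ∉ A(G); we have two cases.
Case 1. Node v has d(v) ≥ 3. From C, we construct another Eulerian circuit C* which does not contain (u, v, w), as follows. Denote by v_0, v, and v_1 three consecutive occurrences of v in C, which exist because d(v) ≥ 3. Let u_0, w_0, and u_1, w_1 be the nodes before, and after v_0 and v_1, respectively, on C. Let C_0 = (v_0, w_0, ..., u, v) and C_1 = (v, w, ..., u_1, v_1) be the subwalks of C between v_0 and v, and between v and v_1, respectively. We can obtain the circuit C* by swapping the two occurrences of C_0 and C_1 in C (see Figs. 3a and 3b). Since C* is still a circuit and has the same set of edges as C, it is still Eulerian. Moreover, notice that in C* the walk (u, v, w) does not appear, because edge (u, v) is followed by edge (v, w_1). This contradicts the fact that (u, v, w) appears in all Eulerian circuits of G.
Case 2. The degree of v is d(v) = 2 and v is not a cut node of U(G). Let u_1 = u and u_2 be the in-neighbors of v and let w_1 = w and w_2 be the out-neighbors of v. Consider the subgraph G[V \ {v}] and note that, since v is not a cut node of U(G), we have that U(G[V \ {v}]) is connected. Let G' be the graph obtained from G[V \ {v}] by adding the two paths (u_1, v_1, w_2) and (u_2, v_2, w_1), where v_1 and v_2 are new nodes (see Fig. 3c). Note that in G' there are neither parallel edges nor self-loops, every node has an equal number of in-neighbors and out-neighbors, and since U(G[V \ {v}]) is connected, G' contains an Eulerian circuit C'. Note that C' can be transformed into an Eulerian circuit C* of G, as follows: whenever C' passes through a new path (u_i, v_i, w_{3−i}), we make C* pass through (u_i, v, w_{3−i}). The Eulerian circuit C* thus constructed does not pass through (u = u_1, v, w_1 = w). Thus, (u, v, w) does not appear in all Eulerian circuits of G, a contradiction.
In order to prove the converse direction, assume that v ∈ A(G) and (u, v, w) is a subwalk of an Eulerian circuit in an Eulerian graph G. We will show that (u, v, w) is safe, i.e., that it is a subwalk of all Eulerian circuits in G. If d(v) = 1, the claim obviously holds. Assume that d(v) = 2 and v is a cut node of U(G). Thus, U(G) \ v has exactly two connected components, K_1 and K_2, and v has exactly one in-neighbor and one out-neighbor in each of K_1 and K_2. The only way for an Eulerian circuit C to visit edges of K_1 or K_2 is using node v.
Therefore, immediately after reaching v from its in-neighbor in K_t (for t ∈ {1, 2}), any Eulerian circuit C must go to the other component K_{3−t}, because it is the only point when it can traverse it. This means that u and w do not belong to the same connected component K_t (because (u, v, w) appears in an Eulerian circuit), and any Eulerian circuit visits the edge (v, w) immediately after visiting the edge (u, v).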
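On small graphs, the statement of the theorem can be sanity-checked by brute force: enumerate all Eulerian circuits and collect the walks (u, v, w) common to all of them. The Python sketch below does this by backtracking. The function names are ours, the enumeration takes exponential time, and the input is assumed to be Eulerian, weakly connected, and free of parallel edges (edges are identified with their endpoint pairs).

```python
def all_eulerian_circuits(edges):
    """All Eulerian circuits as edge lists, each starting with edges[0].

    Fixing the first edge lists every circuit once, up to rotation.
    """
    out = {}
    for e in edges:
        out.setdefault(e[0], []).append(e)
    first = edges[0]
    used, path, result = {first}, [first], []

    def extend():
        if len(path) == len(edges):
            if path[-1][1] == first[0]:       # the walk closes up
                result.append(list(path))
            return
        for e in out.get(path[-1][1], []):
            if e not in used:
                used.add(e)
                path.append(e)
                extend()
                path.pop()
                used.remove(e)

    extend()
    return result

def safe_triples(edges):
    """Walks (u, v, w) that appear in every Eulerian circuit."""
    common = None
    for c in all_eulerian_circuits(edges):
        # Pair each edge with its cyclic successor in the circuit.
        triples = {(a, b, d) for (a, b), (_, d) in zip(c, c[1:] + c[:1])}
        common = triples if common is None else common & triples
    return common or set()
```

For example, on three 2-cycles sharing node 0, the middle node of every safe walk has d(v) = 1, while no walk through node 0 (with d(0) = 3) is safe, in line with Case 1 of the proof.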

Conclusion
In this paper we considered the problem of detecting when a graph has a unique Eulerian circuit, and the related problem of finding safe walks for Eulerian circuits. We simplified both existing characterizations, which are based on the intersection graph of simple cycles of G. Inspired by the safety paradigm, we characterized when two edges must be consecutive in all Eulerian circuits, by a condition whose single non-trivial part involves cut nodes of the underlying undirected graph.
This leads to very simple and linear-time detection of graphs with a unique Eulerian circuit, and linear-time computation of maximal safe walks for Eulerian circuits. The latter result also fits a line of research on walks safe for related objects, such as the walks appearing in all edge-covering circuits, or in all flow decompositions of a flow in a directed acyclic graph. The resulting linear-time complexity bounds are optimal and match the one for computing an Eulerian circuit.

Acknowledgment
We are very grateful to the anonymous reviewers who helped improve the presentation of this paper. This work was partially funded by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 851093, SAFEBIO) and partially by the Academy of Finland (grants No. 322595, 328877, 314284 and 335715).

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1. The acyclic subgraph of G_I induced by {C_2, C_3, C_4} (in red) has the property that no walk passing through cycles C_2, C_3 and C_4 in G (i.e., using edges from each of them) is contained in all Eulerian circuits in G: the only such walk (v_1, v_2, v_3, v_4), marked in red, is not contained in the Eulerian circuit which starts with v_1, v_2, then goes instead to v_0 and v_1 and continues to visit the rest of the graph, and finally back to v_1. However, the acyclic subgraph of G_I induced by {C_3, C_4, C_5, C_6} indeed gives a walk in G (marked in green) contained in all Eulerian circuits of G. (For interpretation of the colors in the figure(s), the reader is referred to the web version of this article.)
