Journal of Graph Algorithms and Applications

Symmetry Breaking Constraints for the Minimum Deficiency Problem

An edge-coloring of a graph G = (V, E) is a function c that assigns an integer c(e) (called color) in {0, 1, 2, ...} to every edge e ∈ E so that adjacent edges receive different colors. An edge-coloring is compact if the colors of the edges incident to every vertex form a set of consecutive integers. The minimum deficiency problem is to determine the minimum number of pendant edges that must be added to a graph such that the resulting graph admits a compact edge-coloring. Because of symmetries, an instance of the minimum deficiency problem can have many equivalent optimal solutions. We present a way to generate a set of symmetry breaking constraints, called gamblle constraints, that can be added to a constraint programming model. The gamblle constraints are inspired by the Lex-Leader ones, based on automorphisms of graphs, and act on families of permutable variables. We analyze their impact on the reduction of the number of optimal solutions as well as on the speed-up of the constraint programming model.


Introduction
An edge-coloring of a graph G = (V, E) is a function c : E → {0, 1, 2, ...} that assigns a color c(e) to every edge e ∈ E such that c(e) ≠ c(e') whenever e and e' share a common endvertex. Let $E_v$ denote the set of edges incident to vertex v ∈ V. An edge-coloring of G = (V, E) is compact if $\{c(e) : e \in E_v\}$ is a set of consecutive non-negative integers for all vertices v ∈ V.
The problem of determining a compact k-edge-coloring (if any) of a graph was introduced by Asratian and Kamalian [3]. Determining whether or not a given graph admits a compact edge-coloring is known to be an NP-complete problem [23], even for bipartite graphs. Given an edge-coloring c of a graph G and a vertex v, the deficiency of c at v, denoted $d_v(G, c)$, is the minimum number of integers that must be added to $\{c(e) : e \in E_v\}$ to form a set of consecutive integers. The deficiency of c is then defined as the sum $d(G, c) = \sum_{v \in V} d_v(G, c)$. Hence, c is compact if and only if d(G, c) = 0. The deficiency of a graph G, denoted d(G), is the minimum deficiency d(G, c) over all edge-colorings c of G. This concept, which was introduced by Giaro et al. [10], provides a measure of how close G is to being compactly colorable. Indeed, d(G) is the minimum number of pendant edges that must be added to G such that the resulting graph is compactly colorable. The Minimum Deficiency Problem is to determine d(G).
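As a small worked illustration (ours, not taken from the paper): in an edge-coloring, the colors at a vertex are pairwise distinct, so the deficiency at a vertex can be written in closed form from the smallest and largest color seen at that vertex.

```latex
d_v(G, c) \;=\; \max_{e \in E_v} c(e) \;-\; \min_{e \in E_v} c(e) \;+\; 1 \;-\; |E_v| .
% Example: the triangle K_3 with colors 0, 1, 2 on its three edges.
% The vertices seeing {0, 1} and {1, 2} have deficiency 1 - 0 + 1 - 2 = 0 and 2 - 1 + 1 - 2 = 0;
% the vertex seeing {0, 2} has deficiency 2 - 0 + 1 - 2 = 1.
% Hence d(K_3, c) = 0 + 0 + 1 = 1 for this coloring.
```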
As observed in [1], the problem of determining the deficiency of a small graph is surprisingly hard. The main difficulty is not to generate an optimal solution, but rather to prove its optimality. This is mainly due to the existence of many equivalent optimal solutions. The objective of this paper is to introduce symmetry-breaking constraints, in order to eliminate as many redundant solutions as possible.
Various authors have identified and defined symmetries in different ways, depending on their research context. This paper adopts the terminology of [5,7], which consists of a very general classification into solution symmetries and problem (or constraint) symmetries. These are permutations of the set of (variable, value) pairs that preserve, respectively, the solutions or the constraints of the problem. Moreover, the group of constraint symmetries is a subgroup of the group of solution symmetries. It is also worth noting that identifying solution symmetries usually requires finding all the solutions first, whereas constraint symmetries can be derived from the structure and expression of the problem. Finally, both of these types of symmetries allow two special cases, variable and value symmetries, which only permute variables or values, respectively.
The Lex-Leader Method proposed by Crawford, Ginsberg, Luks and Roy [6], and later improved in [5,14,19], forms the basis of the research presented here. It adds constraints so as to allow only one member of each equivalence class. Such a method can produce a huge number of constraints, and adding them to a model can sometimes be counterproductive. We propose to generate only a subset of these constraints, called gamblle constraints, and analyze their impact on the reduction of the number of optimal solutions.
In [1], the authors compared the performance of four models for the minimum deficiency problem. Constraint programming models clearly perform significantly better than integer programming ones. The constraint programming model defined in [1] is described in Section 2. Graph automorphisms play an important role in creating symmetries for the minimum deficiency problem. This is illustrated in Section 3, and two methods to identify some or all of these automorphisms are proposed in Section 4. We then define gamblle constraints in Section 5. These are added to the constraint programming model. Computational experiments are reported in Section 6, where we compare eight different ways of adding symmetry breaking constraints for the minimum deficiency problem.

Model
The symmetries encountered when solving a problem clearly depend on the model used to solve it. Following the previous experiments in [1], a constraint programming model will be used to solve the minimum deficiency problem. It appears to be much faster than integer programming models and provides a simple correspondence to the graph formulation. Consider a graph G = (V, E), with vertex set V and edge set E. As mentioned in the introduction, $E_v$ is the set of edges incident to vertex v. We denote by $\deg_v = |E_v|$ the degree of vertex v, and by $\Delta = \max_{v \in V} \deg_v$ the maximum degree of G. Also, C represents a set of colors {0, 1, ..., K − 1} where K ≥ ∆, $c_e \in C$ is the color assigned to edge e, and $\underline{c}_v$ and $\overline{c}_v$ respectively denote the minimal and maximal color assigned to an edge incident to vertex v. For the domains of the last two kinds of variables, rather than using the entire set C, it is possible to shrink them a little, simply by taking into account the degree of the associated vertex. Finally, $d_v$ is the deficiency at vertex v and $\sum_{v \in V} d_v$ is the deficiency of the coloring. We will use the constraint programming model proposed in [1].

Constraints (2) to (5) are the usual constraints for the minimum deficiency problem and are sufficient to model it accurately. Constraint (6) is added to help break the value symmetries due to shifting all the colors up or down, by simply enforcing that color 0 is used at least once. That single extra constraint already improves the performance of the model immensely, and does not interfere with the later constraints dealing with variable symmetries based on graph automorphisms, which are the subject of the rest of this paper.
It was proved in [1] that if G is a graph with n vertices, then all edge-colorings with minimum deficiency d(G) use at most 2n − 4 + d(G) colors. A fixed number of colors K = 3n − 4 can therefore be used for all practical purposes, based on the conjecture that the minimum deficiency d(G) of G is always at most equal to n. If the conjecture is proven wrong and the model is applied to a graph with minimum deficiency greater than n, then the optimal value $D = \sum_{v \in V} d_v$ produced by the model will be larger than n, and we can then run the model a second time, using K = 2n − 4 + D instead of 3n − 4 to determine the minimum deficiency of the considered graph.
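To make the ingredients of the model concrete, here is a minimal brute-force sketch in Python. It is not the constraint programming model of [1]: it simply enumerates every assignment of K = 3n − 4 colors, keeps the proper edge-colorings that use color 0, and returns the smallest deficiency found. The function names and the vertex and edge encodings are our own assumptions, and the approach is only practical for very small graphs.

```python
from itertools import product

def deficiency(edges, colors):
    """Deficiency of a coloring, or None if two adjacent edges share a color."""
    at_vertex = {}
    for (u, v), col in zip(edges, colors):
        at_vertex.setdefault(u, []).append(col)
        at_vertex.setdefault(v, []).append(col)
    total = 0
    for cols in at_vertex.values():
        if len(set(cols)) < len(cols):
            return None  # not an edge-coloring
        # integers missing between the smallest and largest color at this vertex
        total += max(cols) - min(cols) + 1 - len(cols)
    return total

def min_deficiency(edges, n):
    """Exhaustive search with K = 3n - 4 colors, forcing color 0 to be used."""
    K = 3 * n - 4
    best = None
    for colors in product(range(K), repeat=len(edges)):
        if 0 not in colors:
            continue  # value-symmetry breaking: color 0 must appear
        d = deficiency(edges, colors)
        if d is not None and (best is None or d < best):
            best = d
    return best

p4 = [(1, 2), (2, 3), (3, 4)]   # chain on four vertices
k3 = [(1, 2), (1, 3), (2, 3)]   # triangle
print(min_deficiency(p4, 4), min_deficiency(k3, 3))
```

The chain admits a compact coloring, whereas three pairwise adjacent edges cannot all differ by exactly one, so the triangle has positive deficiency.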

Graph automorphisms
An automorphism of a graph G = (V, E) is a permutation σ of its vertex set V such that (u, v) ∈ E ⇔ (σ(u), σ(v)) ∈ E. This reordering of the vertices of G thus preserves the adjacency matrix. The set of all automorphisms of a graph G forms the permutation group Aut(G). This group acts naturally on V, by its definition. More interestingly, it can act on the edge set E by using the edge action $h_E$ that maps a permutation σ ∈ Aut(G) of vertices to a permutation $\sigma_E = h_E(\sigma)$ of the edges, where $\sigma_E((u, v)) = (\sigma(u), \sigma(v))$ for every edge (u, v) ∈ E.

In what follows, we use the standard cycle notation for permutations [21]. It expresses a permutation as a product of cycles corresponding to the orbits of the permutation; since distinct orbits are disjoint, this is referred to as a decomposition into disjoint cycles. Cycles of length 1 are commonly omitted from the cycle notation. The identity permutation, which consists only of 1-cycles, will be denoted by Id. Two solutions to the minimum deficiency problem that can be obtained one from the other by a permutation in Aut(G) are called equivalent solutions.
Consider for example the chain $P_4$ on four vertices in Figure 1(a). Permutation $(v_1, v_4)(v_2, v_3)$ ∈ Aut($P_4$) translates into permutation $(e_1, e_3)$ of the edge set, and these two permutations both indicate that the two bottom optimal edge-colorings $s_3$ and $s_4$ are equivalent. As another example, consider the clique $K_3$ on three vertices in Figure 1(b). The two permutations $(v_1, v_2)$ and $(v_2, v_3)$ are generators for the permutation group Aut($K_3$), and these two permutations translate into permutations $(e_2, e_3)$ and $(e_1, e_2)$ of the edge set. In this case, the permutations in Aut($K_3$) indicate that all optimal edge-colorings are equivalent.

JGAA, 21(2) 195-218 (2017)

[Figure 1: (a) The chain $P_4$ on four vertices and its four optimal edge-colorings; solutions $s_3$ and $s_4$ are equivalent. (b) The clique $K_3$ on three vertices and its six equivalent optimal edge-colorings.]

Aut(G) also acts on families of vertex (resp. edge) variables which have a one-to-one mapping with the set of vertices (resp. edges) of the graph, such as $\underline{c}_v$, $\overline{c}_v$ and $d_v$ (resp. $c_e$). For example, consider again the chain on four vertices in Figure 1(a). Permutation $(v_1, v_4)(v_2, v_3)$ in Aut($P_4$) defines the four following variable permutations: $(\underline{c}_{v_1}, \underline{c}_{v_4})(\underline{c}_{v_2}, \underline{c}_{v_3})$, $(\overline{c}_{v_1}, \overline{c}_{v_4})(\overline{c}_{v_2}, \overline{c}_{v_3})$, $(d_{v_1}, d_{v_4})(d_{v_2}, d_{v_3})$, and $(c_{e_1}, c_{e_3})$.
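The edge action can be sketched in a few lines of Python. The helper below is illustrative only: the name `edge_action`, the dictionary encoding of a permutation, and the edge labelling of $K_3$ ($e_1 = v_1v_2$, $e_2 = v_1v_3$, $e_3 = v_2v_3$, inferred from the permutations quoted above) are our assumptions.

```python
def edge_action(edges, sigma):
    """Edge permutation induced by a vertex permutation, as 1-based disjoint cycles.

    edges: list of vertex pairs; edges[i] is the edge labelled e_(i+1).
    sigma: dict mapping each moved vertex to its image (fixed vertices may be omitted).
    """
    index = {frozenset(e): i + 1 for i, e in enumerate(edges)}
    image = {}
    for i, (u, v) in enumerate(edges, start=1):
        mapped = frozenset((sigma.get(u, u), sigma.get(v, v)))
        image[i] = index[mapped]  # a KeyError means sigma is not an automorphism
    cycles, seen = [], set()
    for start in sorted(image):
        if start in seen or image[start] == start:
            continue  # skip 1-cycles, as in the usual cycle notation
        cycle, j = [], start
        while j not in seen:
            seen.add(j)
            cycle.append(j)
            j = image[j]
        cycles.append(tuple(cycle))
    return cycles

p4 = [(1, 2), (2, 3), (3, 4)]
k3 = [(1, 2), (1, 3), (2, 3)]
print(edge_action(p4, {1: 4, 4: 1, 2: 3, 3: 2}))  # (v1, v4)(v2, v3) on P4
print(edge_action(k3, {1: 2, 2: 1}))              # (v1, v2) on K3
print(edge_action(k3, {2: 3, 3: 2}))              # (v2, v3) on K3
```

Under the stated labelling, the three calls reproduce the edge permutations $(e_1, e_3)$, $(e_2, e_3)$ and $(e_1, e_2)$ given in the text.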
Constructing Aut(G) is at least as difficult (in terms of computational complexity) as solving the graph isomorphism problem. Just counting the automorphisms is polynomial-time equivalent to graph isomorphism [15]. It is therefore unknown whether there is a polynomial time algorithm for constructing Aut(G).

Methods for identifying automorphisms

nauty
Given an input graph G, the nauty library created by Brendan McKay [16,17] outputs a description of Aut(G) in terms of a generating set. The permutations that are part of this set tend to be fairly simple (when possible, a combination of disjoint transpositions). However, the set is not necessarily minimal for generating Aut(G). For example in Figure 2, nauty outputs generators $(v_2, v_3)$, $(v_3, v_4)$, $(v_5, v_6)$, $(v_7, v_8)$ and $(v_5, v_7)(v_6, v_8)$; $(v_7, v_8)$ is redundant since it can be obtained by applying $(v_5, v_7)(v_6, v_8)$ followed by $(v_5, v_6)$ and then $(v_5, v_7)(v_6, v_8)$ again.
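The redundancy claim can be checked mechanically. The following sketch composes permutations stored as dictionaries of moved points; the helper name `compose` and this encoding are our own choices, and no nauty call is involved.

```python
def compose(*perms):
    """Apply the given permutations from left to right; each is a dict of moved points."""
    points = set().union(*perms)
    result = {}
    for x in points:
        y = x
        for p in perms:
            y = p.get(y, y)
        if y != x:
            result[x] = y  # keep only the points that actually move
    return result

swap_pairs = {5: 7, 7: 5, 6: 8, 8: 6}  # (v5, v7)(v6, v8)
swap_56 = {5: 6, 6: 5}                 # (v5, v6)
redundant = compose(swap_pairs, swap_56, swap_pairs)
print(redundant == {7: 8, 8: 7})       # True: the product is the transposition (v7, v8)
```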

clusters
We define two vertices u and v as twins if all vertices w ≠ u, v are either adjacent to both u and v or to neither of them. A stable set of twins is a set of at least two pairwise non-adjacent twins, and a clique of twins is a set of at least two pairwise adjacent twins. For illustration, vertices $v_2$, $v_3$, $v_4$ in Figure 2 form a clique of twins, while both sets $\{v_5, v_6\}$ and $\{v_7, v_8\}$ are stable sets of twins. The following theorem shows that there is a partition of the vertex set of a graph so that every block of the partition is a maximal stable set of twins, a maximal clique of twins, or a singleton.

Theorem 1
The intersection of a stable set of twins and a clique of twins is empty.
Proof: Suppose, by contradiction, that there is a stable set S of twins and a clique K of twins such that I = S ∩ K ≠ ∅. Clearly, I contains at most one vertex since vertices in S are non-adjacent, while those in K are adjacent. So let {w} = I and let u ≠ w be a second vertex in S. All vertices v ≠ w in K must be adjacent to u, since v is adjacent to w and the twins u and w have the same set of adjacent vertices. But no vertex v ≠ w in K can be adjacent to u, since u is not adjacent to w and the twins v and w have the same neighborhood outside {v, w}. Hence K contains only one vertex, a contradiction.
Finding a partition of the vertex set into maximal stable sets of twins, maximal cliques of twins and singletons is an easy task. Indeed, it is sufficient to observe that every maximal stable set of twins corresponds to a maximal set of identical rows of the adjacency matrix, and every maximal clique of twins corresponds to a maximal set of identical rows of the matrix obtained from the adjacency matrix by changing every element of its diagonal to 1. All vertices that are not in a stable set of twins or in a clique of twins are singletons of the partition. For example, the partition of the vertex set for the graph in Figure 2 contains the clique of twins $\{v_2, v_3, v_4\}$ and the stable sets of twins $\{v_5, v_6\}$ and $\{v_7, v_8\}$, every remaining vertex being a singleton.

Note that a permutation of a subset of vertices in a stable set of twins or in a clique of twins corresponds to an automorphism. More precisely, let $T = \{v_{i_1}, v_{i_2}, \ldots, v_{i_p}\}$ be a stable set of twins or a clique of twins. The p − 1 permutations $(v_{i_1}, v_{i_2}), (v_{i_2}, v_{i_3}), \ldots, (v_{i_{p-1}}, v_{i_p})$ define a set of generators for all possible permutations of a subset of vertices in T. We call clusters the procedure that produces these generators. For the example of Figure 2, clusters would produce the set $\{(v_2, v_3), (v_3, v_4), (v_5, v_6), (v_7, v_8)\}$ of generators. Observe that permutation $(v_5, v_7)(v_6, v_8)$ (i.e., swapping the two stable sets of twins) found by nauty cannot be obtained from these generators.
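A possible implementation of the row-comparison idea is sketched below in Python. It is our reading of the procedure, not the authors' code: vertices with identical open neighborhoods (identical rows of the adjacency matrix) form stable sets of twins, and vertices with identical closed neighborhoods (rows after setting the diagonal to 1) form cliques of twins. The test graph is hypothetical; it was built only to have the twin structure described for Figure 2, and need not be the graph of that figure.

```python
from collections import defaultdict

def clusters(vertices, edges):
    """Return (twin classes, generators) obtained by comparing neighborhoods."""
    nbrs = {v: set() for v in vertices}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    stable, clique = defaultdict(list), defaultdict(list)
    for v in vertices:
        stable[frozenset(nbrs[v])].append(v)        # identical rows of A
        clique[frozenset(nbrs[v] | {v})].append(v)  # identical rows of A with unit diagonal
    classes = sorted(c for groups in (stable, clique)
                     for c in groups.values() if len(c) >= 2)
    # p - 1 consecutive transpositions per class of size p
    generators = [(c[i], c[i + 1]) for c in classes for i in range(len(c) - 1)]
    return classes, generators

# Hypothetical 8-vertex graph: v1 joined to the triangle {2, 3, 4}, which is joined
# to {5, 6, 7, 8}; in addition {5, 6} and {7, 8} are completely joined to each other.
vertices = range(1, 9)
edges = ([(1, t) for t in (2, 3, 4)] + [(2, 3), (2, 4), (3, 4)]
         + [(t, w) for t in (2, 3, 4) for w in (5, 6, 7, 8)]
         + [(a, b) for a in (5, 6) for b in (7, 8)])
print(clusters(vertices, edges))
```

On this graph the sketch finds one clique of twins and two stable sets of twins, and four generating transpositions, matching the generator set quoted above for clusters.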
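The union of the generators produced by clusters defines the cluster-automorphism group $\mathrm{Aut}_C(G)$, which is a subgroup of Aut(G). Hence, if $\mathcal{C}$ is the set containing all maximal stable sets of twins and all maximal cliques of twins, we have

$$|\mathrm{Aut}_C(G)| = \prod_{T \in \mathcal{C}} |T|!$$

and the total number of generators for $\mathrm{Aut}_C(G)$ is $\sum_{T \in \mathcal{C}} (|T| - 1)$.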
It is known that for a graph G on n vertices, Aut(G) can be specified by no more than n − 1 generators. However, as mentioned in [18], nauty possibly requires exponential time to provide such a set of generators. For comparison, we have observed above that the partition of the vertex set into maximal stable sets of twins, maximal cliques of twins and singletons can be obtained in polynomial time, which means that clusters produces generators for $\mathrm{Aut}_C(G)$ in polynomial time.

gamblle constraints
The Lex-Leader constraints [6] deal with variable symmetries. They use a vector representation of the variables of a solution, and constrain this vector to be lexicographically smaller than or equal to each of its permutations under a set of symmetries. This amounts to reducing the solution space to one element for each equivalence class defined by the symmetries. Such a method can produce a huge number of constraints, while a subset of these constraints can already be useful, for example when limited to the symmetries due to graph automorphisms.

Consider for example the graph $K_{2,3}$ of Figure 3 and the following ordering of the variables of the model: $(\underline{c}_{v_1}, \ldots, \underline{c}_{v_5}, \overline{c}_{v_1}, \ldots, \overline{c}_{v_5}, d_{v_1}, \ldots, d_{v_5}, c_{e_1}, \ldots, c_{e_6})$. Both clusters and nauty produce the set $\{(v_1, v_2), (v_3, v_4), (v_4, v_5)\}$ of generators. Permutation $(v_3, v_4)$ imposes the following constraint:

$$(\underline{c}_{v_1}, \underline{c}_{v_2}, \underline{c}_{v_3}, \underline{c}_{v_4}, \underline{c}_{v_5}, \ldots, c_{e_1}, c_{e_2}, c_{e_3}, c_{e_4}, c_{e_5}, c_{e_6}) \le_{lex} (\underline{c}_{v_1}, \underline{c}_{v_2}, \underline{c}_{v_4}, \underline{c}_{v_3}, \underline{c}_{v_5}, \ldots, c_{e_2}, c_{e_1}, c_{e_3}, c_{e_5}, c_{e_4}, c_{e_6}).$$

When the head of the solution vector comprises all the elements of a family of permutable variables, part of the effect of a permutation of the vertices or of the edges is a rearrangement of this head. In the above example, the effect of permutation $(v_3, v_4)$ is to change $(\underline{c}_{v_1}, \underline{c}_{v_2}, \underline{c}_{v_3}, \underline{c}_{v_4}, \underline{c}_{v_5}, \ldots)$ into $(\underline{c}_{v_1}, \underline{c}_{v_2}, \underline{c}_{v_4}, \underline{c}_{v_3}, \underline{c}_{v_5}, \ldots)$. The first simplification (called trimming) is to keep only the first non-trivial comparison of each lexicographical constraint. In the previous example, the constraint becomes $\underline{c}_{v_3} \le \underline{c}_{v_4}$. If we rearrange the variables so that $(c_{e_1}, c_{e_2}, c_{e_3}, c_{e_4}, c_{e_5}, c_{e_6}, \ldots)$ is the head of a solution vector, then it is changed to $(c_{e_2}, c_{e_1}, c_{e_3}, c_{e_5}, c_{e_4}, c_{e_6}, \ldots)$ by permutation $(v_3, v_4)$, and the constraint resulting from the trimming is $c_{e_1} \le c_{e_2}$.

In some cases, it is possible to use original constraints of the model to further strengthen a constraint to a strict inequality, as with the edge color variables when the considered edges have an endvertex in common. This results from the fact that such variables appear together in an allDifferent constraint. This special case makes the trimmed constraint equivalent to the full original lexicographical one. In the considered example, the constraint programming model imposes that $c_{e_1}$, $c_{e_2}$ and $c_{e_3}$ must be all different, and permutation $(v_3, v_4)$ therefore gives $c_{e_1} < c_{e_2}$.
The second simplification consists in adding constraints for only a few permutations from the automorphism group. Consider a permutation in Aut(G) and let π be its effect on a family of permutable variables. Permutation π can be written as a product of disjoint cycles. Let C be the cycle in π that contains the variable with smallest index, say $u_i$. For every $u_j$ in C with j ≠ i we do the following: if the model does not imply $u_i \ne u_j$, we add inequality $u_i \le u_j$ to the set of constraints; otherwise, we add the strict inequality $u_i < u_j$. Every permutation π thus produces |C| − 1 constraints, to account for the whole orbit of the variable $u_i$ created by repeated application of π. We name the resulting inequalities Graph AutoMorphism-Based Lex-Leader Enforcing (gamblle) constraints. This is summarized in Algorithm 1. Line 3 guarantees that the inequality compares the first differing pair of the underlying lexicographical constraint based on the solution vector headed by the family F of permutable variables.
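The rule just described can be sketched as follows for the col family (edge color variables). This is an illustrative Python reading of the text, not a transcription of Algorithm 1: the function name, the encoding of permutations as dictionaries, and the use of "the two edges share an endvertex" as the test for an implied difference are our assumptions. The usage example takes a complete binary tree on seven vertices with root $v_1$ and edges ordered lexicographically by their pair of endvertices.

```python
def gamblle_constraints(edges, sigma):
    """Constraints (i, op, j), read as c_e_i op c_e_j, for one vertex automorphism.

    edges: ordered list of vertex pairs; edges[k] carries the variable c_e(k+1).
    sigma: dict mapping each moved vertex to its image.
    """
    index = {frozenset(e): k for k, e in enumerate(edges)}
    image = [index[frozenset((sigma.get(u, u), sigma.get(v, v)))] for u, v in edges]
    moved = [k for k, img in enumerate(image) if img != k]
    if not moved:
        return []            # every edge is fixed: nothing to add
    first = moved[0]         # smallest-index variable that is actually permuted
    constraints, j = [], image[first]
    while j != first:        # walk the cycle containing that variable
        # adjacent edges appear in a common allDifferent, so the inequality is strict
        adjacent = bool(set(edges[first]) & set(edges[j]))
        constraints.append((first + 1, "<" if adjacent else "<=", j + 1))
        j = image[j]
    return constraints

tree = [(1, 2), (1, 3), (2, 4), (2, 5), (3, 6), (3, 7)]
swap_subtrees = {2: 3, 3: 2, 4: 6, 6: 4, 5: 7, 7: 5}   # (v2, v3)(v4, v6)(v5, v7)
for sigma in (swap_subtrees, {4: 5, 5: 4}, {6: 7, 7: 6}):
    print(gamblle_constraints(tree, sigma))
```

Reversing the edge list changes the outcome for the first permutation: the smallest-index edge and its image no longer share an endvertex, so the inequality is no longer strict.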
For illustration, consider the graph G in Figure 4 with |Aut(G)| = 3. It is the smallest graph with all permutations π ≠ Id in Aut(G) having no transposition (i.e., cycle with only two elements). nauty generates permutation $(v_1, v_4, v_7)(v_2, v_5, v_8)(v_3, v_6, v_9)$, and Aut(G) = {Id, $(v_1, v_4, v_7)(v_2, v_5, v_8)(v_3, v_6, v_9)$, $(v_1, v_7, v_4)(v_2, v_8, v_5)(v_3, v_9, v_6)$}. If we are interested in the edge colors, we first have to order the edges. One possible way is to order them using the lexicographical ordering of their pair of endvertices. Hence $e_1$ is the edge linking $v_1$ with $v_2$, $e_2$ is the edge linking $v_1$ with $v_4$, and so on. Such an ordering is shown on the left side of Figure 4. Permutation $(v_1, v_4, v_7)(v_2, v_5, v_8)(v_3, v_6, v_9)$ translates into a permutation of the $c_e$ variables whose cycle with smallest index is $C = (c_{e_1}, c_{e_9}, c_{e_{14}})$, and we therefore add constraints $c_{e_1} \le c_{e_9}$ and $c_{e_1} \le c_{e_{14}}$.
Note that the ordering of the variables has an impact on the generated constraints. Indeed, for the same graph, we give on the right side of Figure 4 a different labelling. The cycle with smallest index is then $C = (c_{e_1}, c_{e_2}, c_{e_3})$ and we therefore obtain the following strengthened constraints: $c_{e_1} < c_{e_2}$ and $c_{e_1} < c_{e_3}$.

As a second example, consider the complete binary tree of Figure 5. nauty produces permutations $(v_2, v_3)(v_4, v_6)(v_5, v_7)$, $(v_4, v_5)$ and $(v_6, v_7)$ of the vertex set, which correspond to permutations $(e_1, e_2)(e_3, e_5)(e_4, e_6)$, $(e_3, e_4)$ and $(e_5, e_6)$ of the edges when they are ordered as shown on the left side of Figure 5, according to the lexicographical ordering of their pair of endvertices. The gamblle constraints associated with the $c_e$ variables are $c_{e_1} < c_{e_2}$, $c_{e_3} < c_{e_4}$ and $c_{e_5} < c_{e_6}$. The reverse ordering of the edges gives different gamblle constraints since one of them is not a strict inequality. Indeed, the three permutations of the vertex set translate into permutations $(e_1, e_3)(e_2, e_4)(e_5, e_6)$, $(e_3, e_4)$ and $(e_1, e_2)$ of the edge set, and the associated gamblle constraints are $c_{e_1} \le c_{e_3}$, $c_{e_3} < c_{e_4}$ and $c_{e_1} < c_{e_2}$.

In the experiments reported in the next section, we consider all permutations obtained using nauty and clusters, and the edges are ordered according to the lexicographical ordering of their pair of endvertices. For example, consider again the $P_4$ of Figure 1(a): for the edge color variables, permutation $(e_1, e_3)$ yields the single gamblle constraint $c_{e_1} \le c_{e_3}$, which is not strict since $e_1$ and $e_3$ have no common endvertex.

Computational experiments

Experimental setup
To generate gamblle constraints, two parameters come into play. The first one is the method used to obtain a set of permutations, either through the nauty library (N) or the clusters method (C). The second is the family of permutable variables used, which can be the color $c_e$ of the edges (col), the minimum color $\underline{c}_v$ at the vertices (min), the maximum color $\overline{c}_v$ at the vertices (max), or the deficiency $d_v$ at the vertices (def). The set of extra constraints and the resulting algorithm when using them are both denoted by putting the letters of the options together in the format G<method><family>. Also, none denotes the original model solved without any gamblle constraints.
The following two datasets are considered in our experiments: D1, the complete set of connected simple graphs on 4 to 9 vertices, and D2, a series of random connected simple graphs, 8 for each pair (n, p) with 4 ≤ n ≤ 100 vertices and density in (p − 0.05, p + 0.05] with p ∈ {0.1, 0.2, ..., 0.9}. The tests were run on a Lenovo Thinkpad X300 laptop, with an Intel Core 2 Duo CPU at 1.2 GHz and 4 GB of RAM. Given a graph and a family of permutable variables, the pre-processing step generates the gamblle constraints, using either nauty (v2.4r2) or clusters. Using IBM/ILOG's optimization suite, the basic model is expressed in OPL (Optimization Programming Language). It is then instantiated with the graph and augmented with the additional constraints. This object is solved with either cp optimizer (v12.2) to competitively find the deficiency, or CPLEX (v12.2) to find the list of optimal solutions. For the latter, the generation of graph images in post-processing relies on Graphviz. This whole process and the batch processing are both orchestrated by programs written in Ruby.

clusters versus nauty
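Given a graph G, it may happen that $\mathrm{Aut}_C(G)$ is strictly contained in Aut(G). Also, we possibly have $\mathrm{Aut}_C(G)$ = {Id}, which means that G does not contain any stable set of twins or clique of twins. In order to justify the use of clusters, we first show that most graphs have $\mathrm{Aut}_C(G)$ = Aut(G). For this purpose, we distinguish the following four cases, denoted A1, A2, A3 and A4.

A1: Aut(G) = {Id}, and hence $\mathrm{Aut}_C(G)$ = {Id};
A2: $\mathrm{Aut}_C(G)$ = Aut(G) ≠ {Id};
A3: $\mathrm{Aut}_C(G)$ = {Id} ≠ Aut(G);
A4: {Id} ≠ $\mathrm{Aut}_C(G)$ ≠ Aut(G).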
The graph $K_{2,3}$ of Figure 3 is in A2 since both clusters and nauty produce the 3 generators for the 12 permutations in Aut($K_{2,3}$). The graph in Figure 4 belongs to A3 since clusters does not produce any permutation while nauty does. The graph in Figure 2 is in A4 since nauty produces permutation $(v_5, v_7)(v_6, v_8)$, which does not belong to $\mathrm{Aut}_C(G)$, and $|\mathrm{Aut}_C(G)|$ = 24. An example of a graph in A1 is shown in Figure 6. When G is in class A1 or A3, GC<family> is equivalent to none, and when G is in A1 or A2, GC<family> and GN<family> produce the same results.
As shown in Figure 7, the proportion of graphs in D1 that belong to A1 increases monotonically with the number of vertices. This becomes even clearer in Figure 8 for the random graphs in D2. When |Aut(G)| > 1, class A2 seems to dominate, which suggests that most automorphisms are due to stable sets of twins and cliques of twins.

In D1, when fixing the number n of vertices and varying the density d, the middle range of density tends to have a bigger proportion of graphs with no automorphism, contrary to the extremes that mostly have some. Figure 7 shows the case of n = 8. This observation seems to stay true for larger graphs (see Figure 8 with n = 20). Let $g_N(G)$ (respectively $g_C(G)$) denote the number of generators produced by nauty (respectively clusters) when applied to G. Figures 7 and 8 clearly show that most graphs belong to A1 and A2, in which case we have $g_N(G) = g_C(G)$. If G belongs to A3 or A4, $g_C(G)$ tends to be only slightly smaller than $g_N(G)$, as shown in Table 1, where we indicate the number of graphs G with n = 8 vertices for every pair $(g_N(G), g_C(G))$. Indeed, among the 11117 graphs, only 228 (i.e., 2%) have $g_N(G) - g_C(G) > 1$. By summing the numbers on the diagonal, we obtain that 8565 graphs out of 11117 (i.e., 77%) belong to A1 or A2. These observations justify the use of clusters since $\mathrm{Aut}_C(G)$ = Aut(G) in a majority of cases.
Using again D1, we analyze in Table 2 the relationship between the minimum deficiency d(G) and the automorphism group class a(G) ∈ {A1, A2, A3, A4} to which G belongs, as well as the relationship between d(G) and the number $g_N(G)$ of generators produced by nauty. We indicate the percentage of graphs with n = 8 vertices for every pair $(d(G), g_N(G))$ and every pair (d(G), a(G)). Notice first that most graphs (97.7%) have minimum deficiency 0. Among the 257 graphs (2.31%) with d(G) > 0, only 6 (0.05% of 11117, which is also 2.33% of 257) have no automorphism (i.e., belong to A1). Also, it appears that the average value of $g_N(G)$ tends to increase when the minimum deficiency increases: it is equal to 1.11 for graphs with d(G) = 0, to 2.39 for graphs with d(G) = 1, and to 4.5 for graphs with d(G) = 2.

Impact on the size of the optimal solution space
In this section, we analyze the impact of adding symmetry breaking constraints on the size of the set of optimal solutions. For this purpose, let $L_{none}(G)$ be the set of optimal solutions to the minimum deficiency problem when no extra constraints are added to the constraint programming model of Section 2. Such a set can be represented by a graph $H_{none}(G)$ where each vertex is a solution in $L_{none}(G)$, and two solutions are adjacent if one can be transformed into the other by swapping the two colors along a path or cycle of alternating colors. These are the two most fundamental neighborhoods used in heuristics for solving the minimum deficiency problem [4]. They leave the deficiency at interior vertices unaltered, so there can only be a potential effect on the deficiency at the two ends of a path. Consider for example the graph in Figure 9. The coloring on the left side is optimal since the deficiency is zero. A swapping of colors 1 and 2 on the cycle containing vertices $v_1$, $v_2$, $v_5$, $v_6$ produces a new optimal solution. These two solutions are equivalent, one being obtained from the other by permuting vertices $v_2$ and $v_5$, which belong to a stable set of twins. The two solutions are therefore linked by an edge in $H_{none}(G)$. A swapping of colors 1 and 2 on a path is shown on the right side of Figure 9. In this case, we obtain a deficiency at vertex $v_8$. Hence, this third coloring is not optimal and therefore not represented in $H_{none}(G)$.
[Figure 9: an optimal edge-coloring with no deficiency; colors 1 and 2 are swapped on the cycle with bold lines.]

Let alg be any of the proposed algorithms. Since alg adds constraints to the original constraint programming model, it ideally forbids some solutions in $L_{none}(G)$ to produce a subset $L_{alg}(G)$ of optimal solutions. Therefore, the graph $H_{alg}(G)$ representing the links between the optimal solutions when the extra constraints produced by alg are taken into account is always an induced subgraph of the original graph $H_{none}(G)$.
One way to compare the performance of the algorithms is to observe the impact of their respective extra constraints on the solution space, and in particular on the sets $L_{alg}(G)$ of optimal solutions. In the best case, a set of symmetry breaking constraints can at most divide the size of the original optimal solution space by |Aut(G)|. Hence,

$$\frac{|L_{none}(G)|}{|\mathrm{Aut}(G)|} \le |L_{alg}(G)| \le |L_{none}(G)|.$$

These bounds are not tight. Indeed, consider for example the $P_4$ in Figure 1(a). It has two automorphisms, the identity permutation and $(v_1, v_4)(v_2, v_3)$, and four optimal solutions. As already mentioned, the last solution is equivalent to the third one and is removed by our algorithms. Hence, every proposed algorithm alg breaks all symmetries while we have

$$|L_{alg}(P_4)| = 3 > 2 = \frac{|L_{none}(P_4)|}{|\mathrm{Aut}(P_4)|}.$$

This means that when the lower bound is reached, we have the guarantee that all symmetries due to graph automorphisms have been broken, which is our objective. But it is also possible to attain that goal and still not reach the lower bound.

We first consider the four graphs of Figure 10 with five vertices. The first one is the $K_{2,3}$ of Figure 3. The second one is a clique on 5 vertices, and hence a clique of twins. The third one is obtained from $G_2$ by removing one edge, and contains a clique of twins and a stable set of twins. The fourth one is obtained from $G_2$ by deleting two disjoint edges, and contains two stable sets of twins. The results of our algorithms are shown in Table 3. On the left part of the table, we indicate for every graph $G_i$ the class to which it belongs (i.e., A1, A2, A3 or A4), the sizes of the cluster-automorphism group $\mathrm{Aut}_C(G_i)$ and of the automorphism group Aut($G_i$), and the minimum deficiency $d(G_i)$. On the right part of the table, we indicate the size of the sets $L_{alg}(G_i)$. The optimal solution spaces $H_{alg}(G_i)$ are shown in Figures 11, 12, 13 and 14. Solid edges correspond to a swapping on a path, while dashed edges correspond to a swapping on a cycle. The width of each edge is proportional to the number of vertices in the path or cycle.

... that guarantee. This can be observed in Figure 11, where three optimal solution spaces contain only one solution.
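The counts for the two small graphs of Figure 1 can be reproduced by exhaustive enumeration. The Python sketch below is ours and is not the solver setup used in the experiments: it lists the minimum-deficiency colorings that use color 0, optionally filtered by extra inequalities passed as a predicate. The edge labellings ($P_4$: $e_1 = v_1v_2$, $e_2 = v_2v_3$, $e_3 = v_3v_4$; $K_3$: $e_1 = v_1v_2$, $e_2 = v_1v_3$, $e_3 = v_2v_3$) and the strict inequalities used for $K_3$ are derived by us from the rules of Section 5.

```python
from itertools import product

def optimal_colorings(edges, n_colors, extra=lambda c: True):
    """Minimum-deficiency colorings using color 0 and satisfying the predicate `extra`."""
    best, kept = None, []
    for colors in product(range(n_colors), repeat=len(edges)):
        if 0 not in colors or not extra(colors):
            continue
        seen = {}
        for (u, v), col in zip(edges, colors):
            seen.setdefault(u, []).append(col)
            seen.setdefault(v, []).append(col)
        if any(len(set(cs)) < len(cs) for cs in seen.values()):
            continue  # two adjacent edges share a color
        d = sum(max(cs) - min(cs) + 1 - len(cs) for cs in seen.values())
        if best is None or d < best:
            best, kept = d, [colors]
        elif d == best:
            kept.append(colors)
    return kept

p4 = [(1, 2), (2, 3), (3, 4)]
k3 = [(1, 2), (1, 3), (2, 3)]
# P4, K = 3n - 4 = 8 colors: no extra constraint, then c_e1 <= c_e3
print(len(optimal_colorings(p4, 8)),
      len(optimal_colorings(p4, 8, lambda c: c[0] <= c[2])))
# K3, K = 5 colors: no extra constraint, then c_e1 < c_e2 and c_e2 < c_e3
print(len(optimal_colorings(k3, 5)),
      len(optimal_colorings(k3, 5, lambda c: c[0] < c[1] < c[2])))
```

For $P_4$ the single inequality removes one of the four optimal colorings, so the lower bound 4/2 = 2 is not reached; for $K_3$ the two strict inequalities leave one coloring out of six, which meets the lower bound 6/6 = 1.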
The best illustration of the decrease of the size of the optimal solution space is given by $G_2$. The optimal solution space $H_{none}(G_2)$ on the left side of Figure 12 clearly shows that many of the 720 optimal solutions are linked with each other. For comparison, $H_{GNcol}(G_2)$ and $H_{GCcol}(G_2)$ contain only 18 vertices grouped into 7 connected components.
Graphs $G_2$, $G_3$ and $G_4$ help to illustrate how the removal of an edge can change the landscape of the solution space dramatically. Note that the bound ...

The col family of permutable variables is clearly the most efficient one for breaking symmetries. This is probably because it is the only one constraining the variables representing the color on an edge (the key decision variables of our problem), as well as the only one that sometimes has strengthened constraints. At the opposite end, using the def family of permutable variables has a much smaller impact on the reduction of the optimal solution space. The other two families, min and max, seem to have comparable performances, between the aforementioned two extremes.

Decrease of the computing time
We now consider six larger graphs for comparing the computing times needed to solve the constraint programming model with or without the extra constraints. The six graphs $G_5, \ldots, G_{10}$ are shown in Figure 15. $G_5$ and $G_6$ are cliques with 6 and 7 vertices, respectively. $G_7$ is obtained from $G_6$ by removing one edge. $G_8$ (respectively $G_9$) is obtained from $G_6$ by removing two incident (respectively disjoint) edges, while $G_{10}$ is obtained from $G_6$ by removing 3 disjoint edges. These graphs contain cliques of twins, shown using boxes with dashed lines, and stable sets of twins, shown using boxes with plain lines.

All our previous general observations on the relative performance of the algorithms for the size of the optimal solution space remain true in the case of computing times, except that the methods based on the min family of permutable variables are slightly superior to those with the max family. Again, col gives the best results, with computing times reduced by at least one order of magnitude. Also, nauty consistently outperforms clusters, but within the same order of magnitude. Thus, GNcol clearly emerges as the best algorithm, successfully combining two ideas: the choice of the family of permutable variables and the method for finding a set of generators for the automorphism group. The graph with the largest automorphism group is the clique $G_6$ on seven vertices with |Aut($G_6$)| = 5040. While the original constraint programming model does not find a proven optimal solution in 2 hours of computation, GNcol solves the problem in 11 seconds.

The impact of the various methods on the computing time is also shown in Table 5, where we analyze the total time needed to solve the minimum deficiency problem for all graphs with n = 4, 5, 6, 7, 8 vertices. We do not report the results for n = 9 since, among the 261084 graphs with 9 vertices, the cp solver has not produced proven optimal solutions within 2 hours of computation for about 100 of them.

The orientation of the less than or equal (≤) inequality constraints used in Section 5 to generate gamblle constraints has an impact on the performance of the algorithms, as it interacts with both the constraint that forces the usage of color 0 and the rule to choose the variable with the smallest index. Also, as already mentioned, the smallest index rule makes the gamblle constraints very sensitive to the labelling of the edges for the algorithms based on the col family of permutable variables. This is shown in Table 6, where we compare three different settings for graph $G_9$. The first case (called $G_9(\le)$) is the original one, where the vertices are labelled as shown on the left side of Figure 16, the edges are labelled according to the lexicographical ordering of their pair of endvertices, and a cyclic permutation C with smallest indexed variable $u_i$ leads to inequalities $u_i \le u_j$ or $u_i < u_j$ for all $u_j$ in C with j ≠ i. The second test (called $G_9(\ge)$) uses inequalities $u_i \ge u_j$ or $u_i > u_j$ instead of the original ≤ or < ones. The third test (called $G_9$(REV)) uses the ≤ or < inequalities, but considers the reverse labelling of the vertices with the corresponding labelling of the edges (according to the lexicographical ordering of their pair of endvertices), as shown on the right side of Figure 16. While the models remain valid with these changes, we observe that the performance may significantly drop when we do not use the original setting. For example, the problem is solved in 7 seconds with the original settings, while the use of the ≥ or > inequalities increases the computing time to 100 seconds.

Conclusion
The generation of gamblle constraints is a general technique that helps solve cp models of graph optimization problems. In the present paper, its potency is demonstrated through an application to the minimum deficiency problem. A straightforward cp model to find the exact deficiency is hindered by an overwhelming number of equivalent optimal colorings, in part due to the automorphisms of the considered graph. When included in the model, the gamblle constraints help to cut down the solution space by forbidding some equivalent optimal solutions, and thus improve the time performance of the solver. The total number of extra constraints generated remains very small, since it only depends on the number of generators found for the automorphism group, which is on the order of n in the worst case. These generators can be obtained using the well-known nauty library created by Brendan McKay [16,17], which may require exponential computing time. Another possibility is to use the proposed procedure clusters, which generates in polynomial time a set of generators for a subgroup $\mathrm{Aut}_C(G)$ of the automorphism group $\mathrm{Aut}(G)$.
Experiments have shown that most graphs have $\mathrm{Aut}_C(G) = \mathrm{Aut}(G)$, which means that most symmetries are due to cliques and stable sets of twins, and justifies the use of clusters. Also, four families of permutable variables have been considered, and we have noticed that the best results are obtained with gamblle constraints based on the colors of the edges (col). We have shown that the GNcol and GCcol algorithms drastically decrease the size of the optimal solution space, and improve on the computing time of the basic model by at least one order of magnitude. The proposed algorithms are particularly efficient for graphs that have many symmetries.
As a last comment, we mentioned in Section 2 the conjecture that the minimum deficiency d(G) of a graph G with n vertices is always at most n. No counterexample was found during our experiments, and the question therefore remains open.

Figure 2: A graph with clique and stable sets of twins.

with three cycles. The corresponding permutation of the $c_v$ variables is $\pi = (c_{v_1}, c_{v_4}, c_{v_7})(c_{v_2}, c_{v_5}, c_{v_8})(c_{v_3}, c_{v_6}, c_{v_9})$. The cycle with the variable of smallest index is $C = (c_{v_1}, c_{v_4}, c_{v_7})$ and we therefore add constraints $c_{v_1} \le c_{v_4}$ and $c_{v_1} \le c_{v_7}$.

Algorithm 1: Generation of gamblle constraints for a family of permutable variables
input : Graph $G$, a subset $P \subseteq \mathrm{Aut}(G)$, a family $F$ of ordered permutable variables
output: A set $L$ of gamblle inequality constraints
Let $P_F$ be the set of permutations of the elements of $F$ obtained from $P$
foreach $\pi \neq \mathrm{Id}$ in $P_F$ do
    Let $C$ be the cycle of $\pi$ with the smallest indexed variable $u_i$
    foreach $u_j \in C$ with $j \neq i$ do
        if the model implies
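The loop structure of Algorithm 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: permutations are given as dictionaries on variable indices, `must_differ` is a hypothetical predicate standing for "the model implies $u_i \neq u_j$", and the choice between a strict and a non-strict inequality is inferred from the description in the text ($u_i \le u_j$ or $u_i < u_j$), since the listing above is cut off at that test. Fixed points are ignored when looking for the smallest indexed variable, which matches the cycle notation used in the examples.

```python
from typing import Callable, Dict, List, Tuple

Perm = Dict[int, int]              # variable index -> index of its image
Constraint = Tuple[int, str, int]  # (i, "<" or "<=", j) meaning u_i < u_j or u_i <= u_j


def cycle_of(perm: Perm, start: int) -> List[int]:
    """Return the cycle of `perm` that contains `start`, beginning at `start`."""
    cycle = [start]
    nxt = perm[start]
    while nxt != start:
        cycle.append(nxt)
        nxt = perm[nxt]
    return cycle


def gamblle_constraints(
    perms: List[Perm],
    must_differ: Callable[[int, int], bool],  # hypothetical: "model implies u_i != u_j"
) -> List[Constraint]:
    """Generate one batch of inequalities per non-identity permutation."""
    constraints: List[Constraint] = []
    for perm in perms:
        moved = [i for i, j in perm.items() if i != j]
        if not moved:          # identity permutation: nothing to add
            continue
        i = min(moved)         # smallest indexed variable that is actually moved
        for j in cycle_of(perm, i):
            if j == i:
                continue
            # Assumption: strict inequality when the two variables must differ.
            op = "<" if must_differ(i, j) else "<="
            if (i, op, j) not in constraints:
                constraints.append((i, op, j))
    return constraints


# Permutation of the Figure 4 example: three cycles on nine variables.
fig4 = {1: 4, 4: 7, 7: 1, 2: 5, 5: 8, 8: 2, 3: 6, 6: 9, 9: 3}
fig4_constraints = gamblle_constraints([fig4], lambda i, j: False)
```

With the three-cycle permutation of Figure 4 and no pair of variables forced to differ, the sketch returns the two constraints $c_{v_1} \le c_{v_4}$ and $c_{v_1} \le c_{v_7}$ stated above.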

Figure 4: Illustration of the generation of gamblle constraints.

First edge ordering ($e_1, \ldots, e_6$): $c_{e_1} < c_{e_2}$, $c_{e_3} < c_{e_4}$ and $c_{e_5} < c_{e_6}$. Second edge ordering ($e_6, \ldots, e_1$): $c_{e_1} \le c_{e_3}$, $c_{e_3} < c_{e_4}$ and $c_{e_5} < c_{e_6}$.
Figure 5: gamblle constraints for two different edge orderings of a binary tree.
(a). clusters does not generate any permutation since the graph does not contain any stable set of twins or clique of twins. nauty generates permutation $(v_1, v_4)(v_2, v_3)$, which translates into permutation $(c^{\min}_{v_1}, c^{\min}_{v_4})(c^{\min}_{v_2}, c^{\min}_{v_3})$ of the $c^{\min}_v$ variables. The considered cycle $C$ is therefore $(c^{\min}_{v_1}, c^{\min}_{v_4})$, which gives the gamblle constraint $c^{\min}_{v_1} \le c^{\min}_{v_4}$ that forbids solution $s_4$. Similarly, with the $c^{\max}_v$ variables, we get the gamblle constraint $c^{\max}_{v_1} \le c^{\max}_{v_4}$, which forbids solution $s_4$. For the $d_v$ variables, we get the gamblle constraint $d_{v_1} \le d_{v_4}$, which does not break any symmetry since $d_{v_1} = d_{v_4} = 0$ in both $s_3$ and $s_4$. The permutation of the $c_e$ variables associated with $(v_1, v_4)(v_2, v_3)$ is $(c_{e_1}, c_{e_3})$, and the associated gamblle constraint $c_{e_1} \le c_{e_3}$ again forbids $s_4$.
Consider now the clique $K_3$ of Figure 1(b). Both clusters and nauty produce permutations $(v_1, v_2)$ and $(v_2, v_3)$. The corresponding gamblle constraints for the $c^{\min}_v$ variables are $c^{\min}_{v_1} \le c^{\min}_{v_2} \le c^{\min}_{v_3}$, which remove all optimal solutions except $s_1$ and $s_2$. The gamblle constraints for the $c^{\max}_v$ variables are $c^{\max}_{v_1} \le c^{\max}_{v_2} \le c^{\max}_{v_3}$, which are only satisfied by solutions $s_1$ and $s_4$. With the $d_v$ variables, we get $d_{v_1} \le d_{v_2} \le d_{v_3}$, satisfied by $s_4$ and $s_5$. The only variables that leave exactly one solution among the six equivalent ones are the $c_e$ variables. Indeed, their associated gamblle constraints are $c_{e_1} < c_{e_2} < c_{e_3}$, which forbid all optimal solutions except $s_1$.
Note that gamblle constraints associated with different sets of permutable variables cannot be combined, since all solutions to the minimum deficiency problem are then possibly forbidden. For example, for the clique $K_3$, the union of the gamblle constraints $d_{v_1} \le d_{v_2} \le d_{v_3}$ and $c_{e_1} < c_{e_2} < c_{e_3}$ forbids all solutions: since $c_{e_3} > \max\{c_{e_1}, c_{e_2}\}$, the constraint $d_{v_2} \le d_{v_3}$ is equivalent to $c_{e_3} - c_{e_1} \le c_{e_3} - c_{e_2}$, which implies $c_{e_2} \le c_{e_1}$ and contradicts $c_{e_1} < c_{e_2}$.
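The claims about $K_3$ can be checked by brute force. The sketch below assumes the edge labelling $e_1 = (v_1, v_2)$, $e_2 = (v_1, v_3)$, $e_3 = (v_2, v_3)$ (lexicographical ordering of endvertices), takes the six optimal colorings to be the permutations of colors 0, 1, 2, and uses `min`, `max`, `def` and `col` as labels for the four families, where `min` and `max` are read as the smallest and largest color at a vertex. These labels and the helper functions are illustrative assumptions, not names taken from the model.

```python
from itertools import permutations

# For v1, v2, v3: positions of the incident edges in the tuple (c_e1, c_e2, c_e3).
# Assumed labelling: e1 = (v1, v2), e2 = (v1, v3), e3 = (v2, v3).
EDGES_AT = [(0, 1), (0, 2), (1, 2)]

# The six equivalent optimal colorings of K3 (colors 0, 1, 2, deficiency 1).
OPTIMAL = list(permutations(range(3)))


def family_values(coloring):
    """Values of the permutable variables of each family, in index order."""
    at = [[coloring[k] for k in ks] for ks in EDGES_AT]
    return {
        "min": [min(cs) for cs in at],
        "max": [max(cs) for cs in at],
        # Deficiency at a vertex: integers missing from the interval of its colors.
        "def": [max(cs) - min(cs) + 1 - len(cs) for cs in at],
        "col": list(coloring),
    }


def is_chain(values, strict):
    """Check u_1 <= u_2 <= u_3 (or the strict version) on consecutive values."""
    pairs = zip(values, values[1:])
    return all(a < b if strict else a <= b for a, b in pairs)


def survivors(family):
    """Optimal colorings that satisfy the chain constraints of one family."""
    strict = family == "col"  # adjacent edges must receive different colors
    return [c for c in OPTIMAL if is_chain(family_values(c)[family], strict)]


kept = {f: survivors(f) for f in ("min", "max", "def", "col")}
# Combining the def and col constraints: colorings that satisfy both.
kept_def_and_col = [c for c in kept["def"] if c in kept["col"]]
```

Under these assumptions, the min, max and def families each keep two of the six colorings, col keeps only one, and adding the def constraints to the col constraints leaves none, in line with the discussion above.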

Figure 7: Distribution of the automorphism group classes for dataset D1.

Figure 8: Distribution of the automorphism group classes for dataset D2.

Figure 9: Illustration of swappings on cycles and paths. Colors 1 and 2 are swapped on the path with bold lines.

Figure 16: Two labellings of the vertices and of the edges of G9.

Table 2: d(G) versus gN(G) and the automorphism group classes.

Table 3: Size of the optimal solution spaces for the proposed algorithms.

Table 4: Computing times (in seconds) for six graphs.

Table 5: Total computing times (in seconds) to solve all the graphs with 4, 5, 6, 7 and 8 vertices.

Table 6: Computing times for three variants of the proposed algorithms applied to G9.