Conjure: Automatic Generation of Constraint Models from Problem Specifications

When solving a combinatorial problem, the formulation or model of the problem is critical to the efficiency of the solver. Automating the modelling process has long been of interest because of the expertise and time required to produce an effective model of a given problem. We describe a method to automatically produce constraint models from a problem specification written in the abstract constraint specification language Essence. Our approach is to incrementally refine the specification into a concrete model by applying a chosen refinement rule at each step. Any non-trivial specification may be refined in multiple ways, creating a space of models to choose from. The handling of symmetries is a particularly important aspect of automated modelling. Many combinatorial optimisation problems contain symmetry, which can lead to redundant search. If a partial assignment is shown to be invalid, we are wasting time if we ever consider a symmetric equivalent of it. A particularly important class of symmetries are those introduced by the constraint modelling process: modelling symmetries. We show how modelling symmetries may be broken automatically as they enter a model during refinement, obviating the need for an expensive symmetry detection step following model formulation. Our approach is implemented in a system called Conjure. We compare the models produced by Conjure to constraint models from the literature that are known to be effective. Our empirical results confirm that Conjure can successfully reproduce the kernels of the constraint models of 42 benchmark problems found in the literature.

© 2022 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Efficient decision-making is of central importance to a modern society. It is natural to represent and reason about decision-making problems in terms of constraints. For example, in scheduling a football league many constraints occur, such as: every team has to play every other, home and away; every match must be assigned a set of officials, and no official or team can be in two places at once; no team should be scheduled to play more than, say, four consecutive away games. Constraint programming [1] offers a means by which solutions to such problems can be found automatically. Constraint solving of a given problem proceeds in two phases. First, the problem is modelled as a set of decision variables, and a set of constraints on those variables that a solution must satisfy. A decision variable represents a choice that must be made in order to solve the problem. The domain of potential values associated with each decision variable corresponds to the options for that choice. In our football league example, one might have two decision variables per match to represent each of the home and away teams. The second phase consists of using a constraint solver to find solutions to the model: assignments of values to decision variables satisfying all constraints.

    language Essence 1.3
    given w, g, s : int(1..)
    letting Golfers be new type of size g * s
    find sched : set (size w) of
                 partition (regular, numParts g, partSize s)
                 from Golfers
    such that
      forAll g1, g2 : Golfers, g1 < g2 .
        (sum week in sched . toInt(together({g1, g2}, week))) <= 1

Fig. 1. An Essence problem specification of the Social Golfers Problem (Problem 10 at CSPLib.org). In a golf club there are a number of golfers who wish to play together in g groups of size s. Find a schedule of play for w weeks such that no pair of golfers play together more than once.

There are typically many possible models for a given problem, and the model chosen can dramatically affect the efficiency of constraint solving. This presents a serious obstacle for non-expert users, who have difficulty in formulating a good (or even correct) model from among the many possible alternatives. Modelling is therefore a critical bottleneck in the process of constraint solving, considered to be one of the key challenges facing the constraints field [2].
This paper presents the automated constraint modelling system Conjure, which serves to demonstrate the efficacy of the refinement-based approach. A problem is input to Conjure in Essence, an abstract constraint specification language. Essence's support for abstract decision variables with types such as set, multiset, relation and function, as well as nested types such as set of sets and multiset of relations, allows a problem to be specified without committing to constraint modelling decisions. To illustrate, consider the fragment of the Essence specification of the Social Golfers Problem [31] presented in Fig. 1. Given a number of weeks (w), a number of groups (g) and a group size (s), the problem is to find a schedule of play over the w weeks for the g × s golfers divided into g groups of size s, subject to a socialisation constraint stipulating that no pair of golfers play together more than once. The Social Golfers Problem is naturally conceived as finding a set of partitions of golfers subject to some constraints, which can be specified in Essence via a single abstract decision variable: the variable sched in the figure.
Since these abstract types are not supported directly by constraint solvers, an Essence specification must be transformed (refined) into a constraint model. Automating this process presents a considerable challenge and the contributions of this work are in meeting that challenge. Principal among these is a carefully designed rule-based architecture implemented in Conjure to refine an Essence specification into a constraint model. One key contribution is that Conjure can refine nested types without resorting to enumerating the values of the inner type (for example, refining a set of sets of integers without enumerating all possible values of the inner set). This capability is vital to refining many of the Essence specifications that we use in the evaluation. As we will demonstrate, different rule application pathways produce different constraint models, supporting an automated model selection process among the many possible alternatives. This approach also facilitates the automated production of channelled constraint models [32], in which a single abstract decision variable is refined in multiple ways. Channelling constraints are elegantly generated for an abstract decision variable A by creating the equality A = A and refining it with two different representations of A, thus ensuring the two representations take the same abstract value in all solutions. Channelled models have previously been created manually by experts, typically in an effort to simplify the statement of the problem constraints so as to strengthen the inference of the constraint solver and reduce search.
A further important contribution of our rule-based architecture is in the treatment of symmetry, a structure-preserving transformation. In the context of a constraint problem, given a solution to a problem instance we can obtain another symmetric solution. Symmetry can lead to redundant search: if the constraint solver reaches a dead end in its search for a solution, we are wasting time if we ever consider a symmetric equivalent of it. A particularly important class of symmetries are those introduced by the constraint modelling process, which we have called modelling symmetries [33][34][35]. Modelling symmetries occur naturally as abstract decision variables are refined into constraint models.
As a simple example consider representing a set of size n by a vector of n variables, constrained to take distinct values. Without care, this can introduce n! symmetries, since the same set is represented by the vector in every possible order. If the elements of the set are integers, there is no deep problem: we can add to the model the constraint that the integers appear in the vector in increasing order. However, this simple approach cannot be used directly if the elements of the set are themselves (for example) sets of multisets. As we will discuss, our rule-based architecture can recognise when modelling symmetries arise in the refinement of a constraint model and add symmetry breaking constraints to deal with complex symmetries of this type. This obviates the need for an expensive symmetry detection step following model formulation, as used by other approaches [36][37][38]. When a refinement performed by Conjure introduces symmetry, the symmetry is broken consistently and completely by the addition of symmetry-breaking constraints. In several cases this also allows for improved refinement of Essence expressions. Furthermore, the symmetry breaking constraints added hold for the entire parameterised problem class captured by the Essence specification, not just for a single problem instance, without the need to employ a theorem prover.
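The n! factor in the integer case can be checked by brute force. The following Python sketch is only an illustration, not part of Conjure: the helper names distinct_vectors and increasing_vectors are hypothetical, and the element domain 1..5 with n = 3 is an arbitrary choice.

```python
from itertools import product
from math import factorial


def distinct_vectors(n, values):
    """All length-n vectors over `values` whose entries are pairwise distinct."""
    return [v for v in product(values, repeat=n) if len(set(v)) == n]


def increasing_vectors(n, values):
    """The same vectors under the symmetry-breaking constraint v[0] < v[1] < ..."""
    return [v for v in product(values, repeat=n)
            if all(v[i] < v[i + 1] for i in range(n - 1))]


n, values = 3, range(1, 6)          # hypothetical instance: sets of size 3 from 1..5
unordered = distinct_vectors(n, values)
ordered = increasing_vectors(n, values)

# Without the ordering constraint every set is represented n! times ...
assert len(unordered) == factorial(n) * len(ordered)
# ... and with it every set is represented exactly once.
assert {frozenset(v) for v in unordered} == {frozenset(v) for v in ordered}
assert len({frozenset(v) for v in ordered}) == len(ordered)
```

The strict ordering also enforces distinctness, so a single constraint plays both roles in this simple case.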
Our final contribution is an empirical evaluation of the coverage of the model space provided by Conjure. In an extensive set of experiments we show that, for a wide variety of problems, the substantial majority of models crafted manually by human experts can be automatically generated by Conjure from Essence specifications of those problems. In addition, we present a simple and lightweight heuristic for choosing among the models generated by Conjure. The CompactEP heuristic is often able to select a good model for a given problem specification. We evaluate the accuracy of CompactEP on a wide range of problems. Rather than focusing on the runtime performance of models with particular solvers and instance sets (which would give a very limited picture of model quality), we performed a qualitative comparison of the generated models with previously published models in Section 7.
Our approach to the refinement of types and expressions is from the outside-in, which allows refinement rules to handle a single layer of a type, or single operator, at a time, although multiple types or operators can be handled where this can improve the refinement. Quantified expressions are handled generically in a way that is independent of which quantifier is used, by separating the gathering of values to be quantified over and application of the quantifying operator. Conjure is designed to be extended with further types, attributes and operators in the future; several types, including sequences, have been added to Essence since the first release of Conjure.
The work presented in this paper summarises and extends over fifteen years of our work on automated constraint modelling. Our earliest work on refinement-based automated constraint modelling appeared between 2002 and 2005 [39][40][41][42][24]. We introduced the Essence language in 2005 [43,44], which is the subject of a separate journal article [26]. Following the presentation of initial prototypes [24,45], the first full version of Conjure was presented in 2011 [46], then extended to handle automated symmetry breaking [34,35], and presented in detail in Akgün's thesis [47]. Herein, we give a complete overview of Conjure, including the most recent advances.

Contributions
In summary, our main contributions are as follows:
• Conjure is unique in refining problem class specifications to class-level constraint models.
• Multiple models are generated from one Essence specification by following different rule application pathways.
• Conjure is able to refine nested abstract types (for example, a set of sets of integers) without enumerating all possible values of the inner type (in this example, set of integers).
• Symmetry introduced during refinement is broken consistently and completely.
• Conjure is able to generate channelled models by representing an abstract decision variable in more than one way, with an elegant mechanism for producing channelling constraints from a simple equality constraint.
• Model selection is achieved via the simple and lightweight CompactEP heuristic, which is shown to select good models in many cases.
• The system is evaluated comprehensively on 42 problem classes from CSPLib [48], demonstrating that Conjure is able to generate models similar to models in the literature produced by experts.

CONJURE by Example
This section illustrates the operation of Conjure on a simple problem specification. It exemplifies some constructs of the input language Essence and the output language Essence Prime. There are a large number of refinements that are applied to transform a full Essence specification into a concrete Essence Prime constraint model. The goal of this example is to highlight the most important kinds of refinements before we describe them in their full generality. We include forward references to later sections where appropriate.
    1  language Essence 1.3
    2  given object new type enum
    3  given weight, value : function (total) object --> int(1..)
    4  given maxWeight: int(1..)
    5  find knapsack : set of object
    6  maximising sum i in knapsack . value(i)
    7  such that (sum i in knapsack . weight(i)) <= maxWeight

Fig. 2. The Knapsack Problem in Essence.

Fig. 2 shows an Essence specification for the Knapsack Problem. We have chosen this familiar problem to illustrate the basics of refinement. The Knapsack specification does lack some of the more sophisticated features of Essence, such as nested types, and we will explain how Conjure treats these in later sections. Lines 2-4 specify the problem class parameters: an enumerated type of objects; a weight and a value per item, represented as total functions; and the maximum weight of the knapsack. Line 5 specifies the single decision variable, the set of objects to be placed in the knapsack. Line 6 specifies the objective function, which is to maximise the value of the collection of items in the knapsack. Finally, line 7 specifies the capacity constraint.
Some features of Essence, such as the function domains in this specification, are not supported by conventional constraint modelling languages. Therefore, they need to be refined to use supported features like integer and matrix domains. Moreover, the problem constraints are stated in terms of Essence domains, which also need to be refined accordingly.
Some refinement steps are simple, such as replacing enumerated domains with isomorphic integer domains. Others are more complex, such as choosing a representation for abstract decision variables and refining abstract constraint expressions.
In the rest of this section we focus on the set decision variable knapsack and present multiple ways of refining it.

Choosing Representations
Before applying any modelling refinements, Conjure traverses the entire model and labels every reference to abstract decision variables (Section 3.1) with a representation decision (Section 4.1). In this example, the knapsack variable is an abstract decision variable and is referenced in two places, on lines 6 and 7.
We consider two representations for a set domain in this section: the Explicit representation and the Occurrence representation. The Explicit representation uses a matrix of decision variables, representing the elements of the set, together with a single integer variable, representing the cardinality of the set. Structural constraints (Section 4.1) are posted to ensure these variables represent valid set values. The Occurrence representation uses a Boolean matrix of decision variables, indexed by the domain of possible elements of the set. In this representation, a true value at a certain index of the matrix indicates set membership.
Conjure may use either of these representations for each reference to the knapsack variable. Choosing multiple representations for the same abstract decision variable leads to channelled models (Section 4.3).

The Explicit representation
Choosing the Explicit representation for both references to the knapsack variable leads to the addition of new variable declarations and structural constraints to the model, as shown in Fig. 3. The matrix knapsack_Explicit represents the elements of the set, and the integer variable knapsack_Size represents the cardinality of the set. The first structural constraint both enforces distinctness and achieves symmetry breaking by sorting the entries in the matrix. The sorting is only enforced up to the cardinality of the set, since entries after this point are not members of the set. The second structural constraint assigns the variables after the knapsack_Size marker to take an arbitrarily chosen value of their domain, as described in Section 4.1.3.
The two references to knapsack are refined using the Explicit representation. Fig. 4 shows the refinement of only one of the expressions; the other expression is refined similarly. The sum expression quantifying over the set decision variable is refined to another sum expression quantifying over a simple integer domain. We use a multiplication with the set membership condition inside the quantified expression. This allows us to exclude entries in the matrix that do not represent members of the set.
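The Explicit encoding and the refined sum can be sketched in Python. This is an illustration under stated assumptions rather than Conjure's output: objects are assumed to be numbered by integers, the helper names (to_explicit, structural_ok, refined_sum) and the weights are hypothetical, the value given to unused slots is arbitrarily taken to be 1, and indices are 0-based, so slot j holds a member exactly when j < size.

```python
def to_explicit(s, max_size, filler):
    """Encode a Python set as (matrix, size): sorted members, then filler entries."""
    members = sorted(s)
    return members + [filler] * (max_size - len(members)), len(members)


def structural_ok(matrix, size, filler):
    """Structural constraints: strictly increasing prefix, fixed-value suffix."""
    prefix_sorted = all(matrix[j] < matrix[j + 1] for j in range(size - 1))
    suffix_fixed = all(matrix[j] == filler for j in range(size, len(matrix)))
    return prefix_sorted and suffix_fixed


def refined_sum(matrix, size, weight):
    """sum i in knapsack . weight(i), rewritten as a sum over matrix indices.

    Each term is multiplied by the 0/1 membership condition (j < size), so
    entries past the size marker contribute nothing.
    """
    return sum(int(j < size) * weight[matrix[j]] for j in range(len(matrix)))


weight = {1: 4, 2: 7, 3: 2, 4: 5}                     # hypothetical instance data
matrix, size = to_explicit({3, 1}, max_size=4, filler=1)

assert (matrix, size) == ([1, 3, 1, 1], 2)
assert structural_ok(matrix, size, filler=1)
assert not structural_ok([3, 1, 1, 1], 2, filler=1)   # unsorted prefix is rejected
assert refined_sum(matrix, size, weight) == weight[1] + weight[3]
```

Fixing the unused slots to one value matters for the same reason as the sorting: otherwise one set would correspond to many assignments of the matrix.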

The Occurrence representation
Similarly, choosing the Occurrence representation for both references to the knapsack variable leads to the addition of a new variable declaration to the model. This is shown in Fig. 5. The matrix knapsack_Occurrence represents the set. A true assignment at index i of the matrix indicates that value i is in the set. This representation does not introduce any symmetry, and it does not require any structural constraints to be posted. This is because every assignment to the Boolean matrix corresponds to a unique assignment to the original set variable.
The two references to the knapsack variable are refined using the Occurrence representation. Fig. 6 shows the refinement of one of the expressions, and the other is refined similarly. The sum expression quantifying over the set decision variable is refined to another sum expression quantifying over the potential members of the set. Once again we use multiplication to exclude the values that are not members of the set.
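A corresponding Python sketch of the Occurrence encoding follows. The names to_occurrence, from_occurrence and refined_sum, the four-element universe and the weights are assumptions made for illustration; a dictionary stands in for the Boolean matrix indexed by the possible elements.

```python
from itertools import product


def to_occurrence(s, universe):
    """Encode a set as a Boolean matrix indexed by every possible element."""
    return {x: (x in s) for x in universe}


def from_occurrence(occ):
    """Decode: the set contains exactly the indices flagged True."""
    return {x for x, flag in occ.items() if flag}


def refined_sum(occ, weight):
    """sum i in knapsack . weight(i), rewritten over all potential members.

    Multiplying by the 0/1 flag excludes the values that are not in the set.
    """
    return sum(int(occ[x]) * weight[x] for x in occ)


universe = [1, 2, 3, 4]                     # hypothetical element domain
weight = {1: 4, 2: 7, 3: 2, 4: 5}           # hypothetical instance data

occ = to_occurrence({1, 3}, universe)
assert from_occurrence(occ) == {1, 3}
assert refined_sum(occ, weight) == 6

# Every Boolean assignment decodes to a different set, so the encoding
# introduces no symmetry and needs no structural constraints.
decoded = {frozenset(from_occurrence(dict(zip(universe, bits))))
           for bits in product([False, True], repeat=len(universe))}
assert len(decoded) == 2 ** len(universe)
```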

Channelled models
Suppose we chose more than one representation for a single abstract decision variable. Each representation would be generated as above, but they would also need to be connected together to ensure the representations all represent the same value of the original decision variable. Models with more than one representation are called channelled models, and the constraints connecting the representations are channelling constraints.
In a channelled model with two representations, both sets of decision variables and structural constraints are added to the model. Each reference to the decision variable is refined using its chosen representation. The channelling constraints are generated by posting an equality constraint (in this example, knapsack=knapsack) and tagging the two occurrences of the decision variable with different representations. The equality is then refined using the same refinement procedures that are applied to any constraint. For our running example, the channelling constraints are given in Fig. 7. The first constraint ensures all members of the Occurrence representation are also members of the Explicit representation, and the second constraint ensures the same holds in the opposite direction.
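The two directions of the channelling constraints can be written as a simple check in Python. This is a sketch of the semantics only, with a hypothetical function name and the same assumed encodings as before: a dictionary of flags for Occurrence, and a list plus a size marker (0-based) for Explicit.

```python
def channelled(occ, matrix, size):
    """True when both representations denote the same set.

    occ    : dict mapping element -> bool   (Occurrence representation)
    matrix : list of elements, size : int   (Explicit representation)
    """
    explicit_members = matrix[:size]
    # Every element flagged in Occurrence appears among the Explicit members.
    occ_in_explicit = all(x in explicit_members
                          for x, flag in occ.items() if flag)
    # Every Explicit member is flagged in Occurrence.
    explicit_in_occ = all(occ[x] for x in explicit_members)
    return occ_in_explicit and explicit_in_occ


occ = {1: True, 2: False, 3: True, 4: False}
assert channelled(occ, [1, 3, 1, 1], 2)          # both denote {1, 3}
assert not channelled(occ, [1, 2, 1, 1], 2)      # Explicit holds 2, Occurrence does not
assert not channelled(occ, [1, 1, 1, 1], 1)      # Occurrence holds 3, Explicit does not
```

In a constraint model both conjuncts would be posted as constraints over decision variables; here they are merely evaluated on fixed values.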

Summary
In this section we have illustrated how Conjure generates multiple diverse models from a single specification by choosing representations of the abstract decision variables. In the following sections we describe the Conjure system, its input language Essence, and the set of refinement rules and representations that allow us to generate a diverse set of models. Section 3.3 presents Conjure in the context of a pipeline of tools and languages.

Automated Modelling in CONJURE
In this section we set the scene for automated modelling by describing Conjure itself, the toolchain it sits within, and the languages produced and consumed by Conjure and the other tools. First we summarise the Essence language consumed by Conjure and highlight its most important features. We then summarise the Essence Prime language produced by Conjure, and the tool Savile Row that translates Essence Prime to the language of a target solver.

Summary of the Essence Language
This section provides a summary of the current state of the Essence language sufficient to describe the operation of Conjure. For further details the reader is referred to the original journal paper describing Essence [26] and the frequently updated documentation accompanying the Conjure release [49].
Conjure takes as input an abstract problem specification written in Essence and automatically generates Essence Prime models as output. Essence is a high-level problem specification language providing a rich set of built-in domains and domain constructors (parameterised domains), such as multi-sets, functions, and partitions. Decision variables can have these domains so as to precisely encode what they mean, and to avoid the need to model these complex domains via multiple decision variables with simpler domains. Essence domains that are not directly represented in Essence Prime are called abstract domains and domains that are shared between the two languages are called concrete domains (Boolean, int, and matrices of these). We also characterise domains as compound when they contain multiple elements (such as a tuple or matrix). Tuples and records contain a fixed number of fields. Fields in a tuple domain are identified by their position and fields in a record domain are identified by the field name. Variants are tagged unions: they contain a single value for one of the components, tagged by the name of the component. The full set of domains and domain constructors in Essence and the handling of abstract and concrete domains is given in Table 1. Domains and domain constructors may be nested arbitrarily, allowing for rich domains such as a partition of sets of integers.
Unnamed types [26] may be unfamiliar so we briefly describe them here. An unnamed type represents a set of objects that are indistinguishable, such as the golfers of the Social Golfers Problem (Fig. 1). The elements of an unnamed type are not named or numbered individually, and so cannot be referred to directly in the specification. Unnamed types exist to provide an abstraction for sets of indistinguishable objects, allowing such sets to be specified without introducing symmetry. However, the current implementation of unnamed types in Conjure (mapping to integers) introduces symmetry. An implementation that does not do so is challenging and an important area of future work, as described in Section 4.1.2.
Domains are further specified by adding attributes, and each domain constructor has its own set of attributes that may be used with it. Attributes further restrict (i.e. make precise) an abstract domain, so the user of Essence does not need to use constraints to achieve the desired effect. For instance, a set variable may have a minSize attribute attached to it, which bounds the cardinality of the set from below; see Table 2.

Table 3. Operators of abstract types and matrices in Essence. In addition, equality, disequality, and ordering operators are provided for all types, and many types may be used as generators of comprehensions and quantifiers as shown in Table 4. For a full list of operators on all types and full definitions see the Conjure documentation [49].

Essence is statically typed and Conjure completely type-checks a specification before refining it. Each decision variable or parameter has a domain, and to obtain the corresponding type Conjure strips the attributes from the domain, replaces all int(...) with the type int, and replaces all subsets of enumerated types with the corresponding full enumerated type.
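The mapping from a domain to its type can be sketched as a short recursive function. The tuple encoding of domains below is a hypothetical stand-in for Conjure's internal representation, and only integers, enumerated types, sets and multisets are covered.

```python
def type_of(domain):
    """Strip attributes and bounds from a domain to obtain its type.

    Domains are tuples in an assumed encoding:
      ("int", ranges)               e.g. ("int", [(1, 10)])
      ("enum", name, subset)        a full enumerated type or a subset of it
      ("set", attributes, inner)    attributes such as {"minSize": 2}
      ("mset", attributes, inner)
    """
    kind = domain[0]
    if kind == "int":
        return ("int",)                       # int(1..10) becomes int
    if kind == "enum":
        return ("enum", domain[1])            # a subset becomes the full enum
    if kind in ("set", "mset"):
        inner = domain[2]
        return (kind, type_of(inner))         # attributes are dropped
    raise ValueError(f"unknown domain constructor: {kind}")


a = ("set", {"minSize": 2}, ("int", [(1, 10)]))
b = ("set", {}, ("int", [(0, 99)]))

# Different domains, same type: set of int.
assert type_of(a) == type_of(b) == ("set", ("int",))
```

Two expressions can therefore be compared or combined whenever their types agree, even if their domains differ in bounds or attributes.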
Essence also has a rich collection of operators that allow concise expressions to be written on abstract types. For example, for functions there is an inverse operator, which ensures two functions are inverses of each other. For relations, relation projection lets us create a relation of smaller arity while fixing some of the components to a specific value. Excepting integer and Boolean operators, which may be found in the manual, the complete set of operators in Essence is summarised in Table 3, organised by the types to which they may be applied. Operators may be nested in any way that respects type-correctness.
Essence also provides quantifiers and comprehensions to construct complex expressions that are difficult or impossible to express using only the operators in Table 3. Quantifiers and comprehensions introduce local variables that take values from a domain or an abstract decision variable. For example, the knapsack specification in Section 2 contains the following sum quantifier, where knapsack is a decision variable of type set of int, and value is a function from objects to their monetary value. The quantifier calculates the total value of objects in the knapsack.
    sum i in knapsack . value(i)

A quantifier has a keyword (forAll, exists, or sum), the quantified variable, a domain or abstract decision variable that defines the set of values that the quantified variable will take, and finally an inner expression (of type int for sum quantifiers, otherwise bool). A quantifier can be evaluated by binding the quantified variable to each value in turn, evaluating the inner expression for each value, then aggregating the results by conjunction, disjunction, or addition for the quantifiers forAll, exists, or sum respectively. Table 4 summarises the types of expressions that may be used to generate the set of values that the quantified variable will take, and the corresponding type of the quantified variable in each case.
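The evaluation procedure just described (bind, evaluate, aggregate) can be sketched in a few lines of Python. The function eval_quantifier and the sample data are hypothetical; the sketch only evaluates a quantifier over known values and says nothing about how Conjure refines one.

```python
AGGREGATORS = {
    "forAll": all,   # conjunction of the inner results
    "exists": any,   # disjunction of the inner results
    "sum": sum,      # addition of the inner results
}


def eval_quantifier(keyword, values, inner):
    """Bind the quantified variable to each value, evaluate, then aggregate."""
    return AGGREGATORS[keyword](inner(v) for v in values)


value = {1: 10, 2: 20, 3: 30}     # hypothetical 'value' function
knapsack = {1, 3}                 # a fixed assignment to the set variable

# sum i in knapsack . value(i)
assert eval_quantifier("sum", knapsack, lambda i: value[i]) == 40
# forAll i in knapsack . value(i) >= 10
assert eval_quantifier("forAll", knapsack, lambda i: value[i] >= 10)
# exists i in knapsack . value(i) > 30
assert not eval_quantifier("exists", knapsack, lambda i: value[i] > 30)
```

Over an empty collection the three aggregators return their identities: true for forAll, false for exists, and 0 for sum.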
Medium-level constraint modelling languages (such as Essence Prime and OPL [13]) typically have the quantifiers forAll, exists, and sum (and in some cases others such as product, min, max), but the quantified variable has type int, and the values are drawn from a domain with type set of int, not from an abstract domain or abstract decision variable. Quantifiers in Essence are substantially more general than those in Essence Prime, which does not have the abstract types.
Comprehensions in Essence create a one-dimensional matrix (a list). The list may then be aggregated to a single value using a function such as and, or, xor, sum, product, min, max, or global constraints like allDifferent. Lists generated via comprehensions can be used as arguments to several operators; in contrast, quantified expressions are limited to forAll, exists, and sum. In common with quantified expressions, comprehensions have an inner expression and they introduce local variables whose values are drawn from an abstract domain or abstract decision variable. Comprehensions also have conditions: Boolean expressions that act as a filter. The condition can contain references to decision variables, which is not possible in the comprehensions found in Essence Prime, for example. Comprehensions (with aggregation functions) are more expressive than quantifiers, and they are used internally throughout Conjure in preference to quantifiers. The example above can be expressed as a comprehension as follows:

    sum([ value(i) | i <- knapsack ])

As a final example of both quantifiers and comprehensions, suppose we wished to find a multiset of integers where all elements above 10 are even numbers. The constraint on the elements of the multiset M can be expressed using a quantifier and an implication:

    such that forAll i in M . (i > 10) -> (i % 2 = 0)

The same constraint can also be expressed using a comprehension with a condition, as follows.

    such that and([ i % 2 = 0 | i <- M, i > 10 ])

Table 4 (surviving fragment). Generators and the values bound to the quantified variable: function, a mapping in the function; relation, a member of the relation; partition from τ, a part in the partition, of type set of τ.

In both quantified expressions and comprehensions, all collection types can be used as generators. The type of the quantified variable is chosen based on the type of the generator.
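The equivalence between the quantifier-with-implication form and the comprehension-with-condition form of the multiset example can be checked exhaustively on small cases. In this Python sketch the function names are hypothetical, a multiset is modelled as a tuple that may contain repeats, and the element range 8..14 is an arbitrary choice.

```python
from itertools import combinations_with_replacement


def by_quantifier(M):
    """forAll i in M . (i > 10) -> (i % 2 = 0), with a -> b read as (not a) or b."""
    return all((not i > 10) or (i % 2 == 0) for i in M)


def by_comprehension(M):
    """and([ i % 2 = 0 | i <- M, i > 10 ]): the condition filters the generator."""
    return all([i % 2 == 0 for i in M if i > 10])


# Every multiset with up to three elements drawn from 8..14.
for size in range(4):
    for M in combinations_with_replacement(range(8, 15), size):
        assert by_quantifier(M) == by_comprehension(M)

assert by_comprehension((9, 12, 12))        # 9 is filtered out, 12 is even
assert not by_comprehension((9, 13))        # 13 is above 10 and odd
```

The condition removes the elements for which the implication would hold vacuously, which is why the two forms agree.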

Summary of the Essence Prime Language
Essence Prime [50] is a medium-level solver-independent constraint modelling language with some similarities to other modelling languages such as OPL [13] and MiniZinc [12]. Essence Prime was originally conceived as a subset of Essence without the abstract types. For the purposes of this paper, Essence Prime can be considered as a subset of Essence with the following restrictions:
1. There are no abstract types (sets, multisets, sequences, functions, relations, or partitions). Essence Prime supports decision variables and problem class parameters of type int, bool, and matrix of int and bool. Matrices may have any number of dimensions, and may be indexed by any integer domain.
2. Generators and conditions within comprehensions and quantifiers are not allowed to contain decision variables.

The Pipeline
Our modelling and solving pipeline is illustrated in Fig. 8. An Essence problem specification is given to Conjure, which refines the specification into a set of concrete models in Essence Prime. Both the specification and the model typically relate to a problem class, i.e. they both have problem class parameters that need to be instantiated before instances of the class can be solved. Conjure separately translates problem class parameters expressed in Essence into Essence Prime using the representations selected when refining the problem specification. This allows the user to solve multiple instances of the same problem class while only performing refinement once.
Savile Row [16] is the second tool in the pipeline. It takes as input the model and problem class parameters in Essence Prime, and produces output for a number of different solvers. Savile Row instantiates the model and performs optimisations before translating the instance into the input language of a solver. Currently Savile Row translates to CP solvers Minion [51] and Gecode [52], the learning CP solver Chuffed [53], SAT solvers such as Glucose [54], MaxSAT solvers such as Open-WBO [55], and SMT solvers such as Yices [56], Z3 [57], and Boolector [58].
Once a solution has been found Savile Row translates the solution back into Essence Prime. Conjure then translates the Essence Prime solution back into Essence. Thus the user of Conjure can specify a problem in terms of abstract types such as partition, and receive solutions in terms of the same types.

How Essence is represented in Conjure
Problem specifications are represented internally using an abstract syntax tree (AST). A complete specification contains a language declaration line and a list of statements. Each statement is either a declaration (of parameters, decision variables, or aliases), a constraint, an objective (for optimisation problems) or a where statement. Decision variables (find), parameters (given) and aliases (letting) have names as part of their declaration statement and they can be referred to by their name in the subsequent statements. Constraints (such that) and where statements contain a list of Boolean expressions. The objective statement contains a single expression of type int or an enumerated type. A problem specification can have at most one objective statement. There is no restriction on the order of statements of different kinds; the only restriction is that declarations cannot be referred to before they are declared, so circular definitions are disallowed.
Expressions in the Conjure AST are composed of references to existing declarations, operator applications, literal values for the various types in Essence, quantified expressions, and comprehensions. Conjure implements 76 operators in its latest version. We do not give a list of all operators here; these are available in the Conjure documentation [49]. Quantified expressions and comprehensions are commonly found in many modelling languages. Internally, only comprehensions are represented and quantified expressions are converted to comprehensions directly after parsing. Conjure implements a full evaluator for Essence, which can be used to validate solutions. The full evaluator is able to compute a Boolean value for constraint expressions as long as values for the declarations referenced in the expression are fully defined. Typically values for givens come from a parameter file and values for finds come from solution files during solution validation. In addition to the full evaluator, a partial evaluator is implemented which is used to simplify expressions where possible. The partial evaluator is applied in a very similar way to the refinement rules (discussed in Section 4). The partial evaluator has the highest precedence, so expressions are always evaluated rather than refined if possible.
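The difference between full and partial evaluation can be illustrated on a toy expression tree. The following Python sketch is an assumption-laden miniature, not Conjure's evaluator: expressions are nested tuples, only three binary operators are supported, and the function name partial_eval is hypothetical.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul, "<=": operator.le}


def partial_eval(expr, env):
    """Simplify `expr`, replacing every subterm whose value is already known.

    Expressions are int/bool literals, ("ref", name), or (op, lhs, rhs).
    Names missing from `env` stay symbolic.
    """
    if isinstance(expr, (int, bool)):
        return expr
    if expr[0] == "ref":
        return env.get(expr[1], expr)
    op, lhs, rhs = expr
    lhs, rhs = partial_eval(lhs, env), partial_eval(rhs, env)
    if isinstance(lhs, (int, bool)) and isinstance(rhs, (int, bool)):
        return OPS[op](lhs, rhs)          # both sides known: fold to a literal
    return (op, lhs, rhs)                 # otherwise rebuild the simplified node


# x + 2 * 3 <= maxWeight
expr = ("<=", ("+", ("ref", "x"), ("*", 2, 3)), ("ref", "maxWeight"))

# Only the parameter is known: the expression is simplified but stays symbolic.
assert partial_eval(expr, {"maxWeight": 10}) == ("<=", ("+", ("ref", "x"), 6), 10)
# Everything is known: the same procedure acts as a full evaluator.
assert partial_eval(expr, {"x": 4, "maxWeight": 10}) is True
```

When every referenced name has a value, as with a parameter file plus a solution file, the result is a single Boolean, which is the situation described for solution validation.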

Refinement Rules in CONJURE
Conjure translates an abstract problem specification written in Essence into a concrete model in Essence Prime via a series of transformations. These transformations are written as rules in Conjure. There are two main kinds of rules: representation selection and expression refinement. Applying representation selection rules to each abstract variable in a specification corresponds to choosing a viewpoint for the problem. A viewpoint is a selection of variables with associated domains sufficient to characterise the solutions to the problem. Different viewpoints give rise to fundamentally different models of a problem [59,60]. Multiple representation selection rules may be applied to the same abstract variable to create a channelled model [32], in which a single abstract decision variable is refined in multiple ways. Expression refinement rules rewrite expressions to use one of the selected representations of an abstract variable. Thus the two types of rules correspond to modelling steps taken by human modellers: selection of a viewpoint or viewpoints, and formulating the constraints.
Refinement rules in Conjure encode known modelling transformations that are well established in the literature and are known to be correct. We do not formally prove the correctness of the refinement rules; a full and formal exposition of the rules together with proofs of correctness is out of the scope of this paper.

Representation Selection Rules
Representation selection rules operate on decision variables or parameters with abstract domains. When a representation selection rule is applied to a domain, it removes the outermost abstract type and replaces it with a concrete type such as a matrix. The output domain is not necessarily concrete; however, a concrete domain can always be reached by repeated application of representation selection rules.
In some cases the output domain of a representation selection rule may have values in its domain that do not correspond to values of the input domain. In this case, structural constraints are needed to rule out these values.
As an example, consider the Occurrence representation of a set. The original domain is set (size n) of T, where T represents an Essence domain. The new domain has one Boolean variable for each value that may be in the set, where the Boolean is assigned true if the value is in the set. The rule is represented below.
    input-declaration: find x : set (size n) of T
    output-declaration: find x_Occurrence : matrix indexed by [T] of bool
    structural-constraint: (sum i : T . toInt(x_Occurrence[i])) = n

The input-declaration part of the rule is pattern-matched against the abstract domains. The output-declaration gives the resulting domain, where the value of T is given from the input-declaration. Finally the structural-constraint requires that n of the Booleans are true, as the set is required to be size n.
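The effect of the structural constraint can be checked by brute force. The following Python sketch is an illustration only (the function name and the small domain size are assumptions): it enumerates all Boolean occurrence vectors over a domain T with 4 values and keeps those with exactly n = 2 true entries.

```python
from itertools import product
from math import comb


def occurrence_assignments(domain_size, n):
    """Boolean vectors satisfying sum(toInt(x_Occurrence[i])) = n."""
    return [bits for bits in product([False, True], repeat=domain_size)
            if sum(bits) == n]


valid = occurrence_assignments(domain_size=4, n=2)
# Without the structural constraint there are 2**4 = 16 assignments;
# with it, exactly one assignment remains per set of size 2.
print(len(valid), comb(4, 2))  # 6 6
```

The number of surviving assignments equals the number of sets of size n, so under this representation the structural constraint removes exactly the values that do not correspond to a value of the input domain.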
Whenever multiple representation selection rules match one abstract domain, one or more representations must be selected in some way. In Section 6 below we present a simple heuristic that is often able to select a good model.
Each representation selection rule has associated mapping functions that translate between values in the input domain and those in the output domain. The mapping functions are used to translate parameter values from Essence to Essence Prime, and to translate solutions expressed in Essence Prime to Essence (Fig. 8). Each representation only encodes one step of this translation and Conjure applies them successively to convert between Essence and Essence Prime.
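As a rough sketch of successive mapping, consider a set of sets of int(1..3) whose outer set uses an explicit (ordered list) representation and whose inner sets use the Occurrence representation. The function names and the sorting order used for the outer list are assumptions made for this example; they are not Conjure's API.

```python
DOMAIN = [1, 2, 3]  # assumed element domain int(1..3)


def inner_down(s):
    """Occurrence step: a set becomes a tuple of Booleans indexed by DOMAIN."""
    return tuple(v in s for v in DOMAIN)


def inner_up(bits):
    """Inverse of inner_down: recover the set from its Boolean vector."""
    return frozenset(v for v, b in zip(DOMAIN, bits) if b)


def outer_down(sets):
    """Explicit step: a set of sets becomes an ordered list of its members,
    each member already translated by the inner step."""
    return sorted(inner_down(s) for s in sets)


def outer_up(rows):
    """Inverse of outer_down: apply the inner inverse to every row."""
    return {inner_up(bits) for bits in rows}


value = {frozenset({1, 2}), frozenset({2, 3})}
concrete = outer_down(value)
assert outer_up(concrete) == value  # the round trip preserves the value
```

Each function handles one layer of nesting only, mirroring the statement that each representation encodes a single step of the translation.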

Conditional Structural Constraints
Structural constraints are essential for the correctness of representation selection rules. However, in some cases we need to condition the application of these structural constraints on other parts of the model. For example, if the Occurrence representation of a set (shown above) were contained in another set of cardinality 0 or 1, then the structural constraint would be required when the outer set has cardinality 1, otherwise the Occurrence representation is unused and its structural constraint is not required. For a further example, see Section 4.1.3.
We introduce an operator structuralCons(X) representing the structural constraints (if any) of the chosen representation of X. Concrete types have no structural constraints and by default these are treated as the true constraint. Structural constraints are always applied for the outermost type of the abstract domain of a declaration. Representation selection rules are responsible for applying structural constraints to any abstract decision variables that they declare in their output declaration section. Many representation selection rules simply apply structuralCons(X) for every X they declare, but some do not.

Modelling Symmetry
Symmetry enters constraint models in two ways. Some problems have inherent symmetries, for example the rotations of a chessboard, which if not broken are reflected in the model. Many symmetries however are introduced by the modelling process; in this case a single solution to the problem corresponds to multiple assignments to the variables of the model. For example, in the Explicit representation a set is represented as a list: reordering the members of this list does not change the set represented. Frisch et al. [33] show how each representation selection rule of Conjure can be extended to generate a description of the symmetries it introduces and how the generated descriptions can be composed to form a description of the symmetries introduced into the model. However, they do not show how to convert model symmetry descriptions into symmetry breaking constraints.
Conjure takes a different approach to generate symmetry breaking constraints: rules that introduce symmetries also generate a constraint to break those symmetries (excepting unnamed types, discussed below). A modelling symmetry is introduced whenever the application of a representation selection rule increases the number of solutions. This occurs when the output domain, constrained by the structural constraints, has more values than the input domain. Suppose we define the Explicit representation of a set as follows. In this rule the allDifferent structural constraint prevents repeated values in x_Explicit, however it does not constrain the order of the values in the matrix. The structural constraint suffices for correctness, however the rule would introduce modelling symmetry, which in turn may degrade the performance of a solver. The second line of the structural constraint section applies the structural constraints of the inner type to all elements of this set. Each representation is responsible for applying the structural constraints to the nested objects, since these are not always applied unconditionally. In Section 4.1.3 we see an example of the conditional application of the structural constraints for the elements.
To avoid modelling symmetry, in addition to ensuring the elements are all different we also impose an order on the matrix. As the elements of the matrix can be any type T we introduce two new operators, ≤ and < (also written as .<= and .<). These operators provide a total ordering (and a strict version of the same total ordering) for all types in Essence.
These orderings are not intended to be "natural" and are not available in the Essence language. As these orderings are only used to break symmetries, the specific ordering used will never change the solutions of any specification. The two arguments of ≤ and < must have the same representation. They are used only in refinement rules to generate effective symmetry-breaking constraints. Using these orderings, the Explicit rule for sets is modified to break all the symmetries it introduces, as follows.
    input-declaration: find x : set (size n) of T
    output-declaration: find x_Explicit :

Rather than introducing a chain of ≤ constraints, this rule exploits the fact that the elements of the set are required to be different and strengthens the ordering to a < constraint.
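The difference between the two versions of the Explicit rule can be seen by counting assignments. In the Python sketch below, which is an illustration and not Conjure's implementation, the ordinary integer order stands in for the total order used by Conjure, and the domain int(1..4) with n = 3 is an arbitrary small example.

```python
from itertools import product


def explicit_assignments(n, domain, break_symmetry):
    """Assignments to an n-element matrix over domain under either the
    allDifferent-only rule or the strictly-increasing rule."""
    rows = []
    for m in product(domain, repeat=n):
        if break_symmetry:
            ok = all(m[i] < m[i + 1] for i in range(n - 1))
        else:
            ok = len(set(m)) == n  # allDifferent only
        if ok:
            rows.append(m)
    return rows


domain = range(1, 5)
unbroken = explicit_assignments(3, domain, break_symmetry=False)
broken = explicit_assignments(3, domain, break_symmetry=True)
print(len(unbroken), len(broken))  # 24 4
```

There are 4 sets of size 3 over a 4-value domain. The allDifferent-only rule admits 3! = 6 matrices per set, whereas the strict ordering admits exactly one.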
As well as providing efficient and composable symmetry breaking, breaking symmetry immediately in this way has other advantages. Expression refinement rules (described below) can exploit the fact that symmetry breaking is performed immediately to produce more efficient refinements. Consider refining the constraint S = T by representing the sets S and T of the same fixed size as matrices S' and T' with the allDifferent structural constraint. To check whether S' and T' represent the same set we need to check that each element of S' is equal to some element of T', since the order of elements in the matrices can be different. However, with the .< ordering we can refine S = T to S' = T', because each assignment of S corresponds to exactly one assignment to S'. This gives a much smaller refined expression, yielding both a simpler constraint and smaller search trees.
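The two refinements of set equality can be compared in a small Python sketch. The function names are hypothetical, and tuples of integers stand in for the matrices representing the two sets.

```python
def equal_without_ordering(s, t):
    """allDifferent only: every element of s must equal some element of t.
    For matrices of the same size this needs len(s) * len(t) comparisons."""
    return all(any(a == b for b in t) for a in s)


def equal_with_ordering(s, t):
    """Strictly ordered matrices: position-wise equality suffices."""
    return all(a == b for a, b in zip(s, t))


# Both checks agree on ordered matrices ...
assert equal_without_ordering((1, 3, 5), (1, 3, 5))
assert equal_with_ordering((1, 3, 5), (1, 3, 5))
assert not equal_with_ordering((1, 3, 5), (1, 4, 5))

# ... but position-wise equality is only sound once the order is fixed.
assert equal_without_ordering((3, 1, 5), (1, 3, 5))
assert not equal_with_ordering((3, 1, 5), (1, 3, 5))
```

The last two assertions show why the cheaper check depends on symmetry having been broken already: the unordered matrix (3, 1, 5) represents the same set but fails position-wise comparison.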
Both .<= and .< are entirely removed within Conjure by translating them into lexicographic (lex) ordering constraints [61,62]. The ordering imposed by .<= and .< is allowed to differ depending on the representation chosen for each variable, to allow Conjure to use the simplest and most efficient lex ordering constraints. Removing the .<= and .< operators is achieved with a small set of rewriting rules. First, references to abstract decision variables are replaced with their representation. If the representation has multiple output declarations (e.g. a matrix and a size variable) then they are contained in a tuple. Once the arguments of the .<= or .< contain no abstract types, each matrix is flattened into a one-dimensional matrix using flatten, and each tuple is concatenated into a single one-dimensional matrix using concatenate. Finally A .<= B is replaced by A ≤lex B, and similarly for .< (or by the ordinary ≤ and < for a single integer or Boolean). The flatten and concatenate functions exist in Essence Prime so there is no need to further translate them.
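The flatten-then-compare step can be sketched in Python, where nested lists play the role of matrices and tuples the role of multi-part representations. The function names are assumptions, and Python's built-in list comparison is used as the lexicographic order.

```python
def flatten(value):
    """Flatten nested lists (matrices) and tuples into one flat list."""
    if isinstance(value, (list, tuple)):
        out = []
        for item in value:
            out.extend(flatten(item))
        return out
    return [value]


def lex_leq(a, b):
    """Stand-in for A <=lex B after flattening both arguments."""
    return flatten(a) <= flatten(b)


def lex_less(a, b):
    """Stand-in for A <lex B after flattening both arguments."""
    return flatten(a) < flatten(b)


# A (marker, matrix) pair, as produced by a variable-size representation.
a = (2, [[1, 2], [3, 4]])
b = (2, [[1, 2], [3, 5]])
print(flatten(a))      # [2, 1, 2, 3, 4]
print(lex_less(a, b))  # True
```

Because both arguments are required to have the same representation, the flattened lists have the same length, so the comparison never depends on how sequences of different lengths are ordered.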
The representation selection rules in Conjure are designed to avoid introducing modelling symmetry. Many representation selection rules have additional structural constraints to prevent modelling symmetry arising. In this way we maintain a model that is free of modelling symmetry throughout the refinement process with one exception: unnamed types.
Unnamed type symmetry cannot be handled in the same way as the other modelling symmetries introduced by Conjure. Unnamed type symmetries must be removed first, because we cannot put a complete ordering on an unnamed type, or any type which contains an unnamed type, as by definition unnamed types are not ordered. However, breaking general unnamed type symmetry is extremely difficult. Consider the type set of set (size 2) of U for an unnamed type U. This type represents an undirected graph on a set of vertices U, and checking if two graphs are the same (allowing reordering of the vertices) is the famous "Graph Isomorphism" problem, whose complexity is unknown. Extending to two unnamed types with matrix indexed by [U1,U2] of bool produces a matrix where the rows and columns can be permuted, which is known to be NP-complete and there have been several papers investigating the best way to partially deal with this symmetry group [63,64]. As a final example, the type matrix indexed by [int] of U has value symmetry which can be broken in polynomial time for some problem classes [65]. In future work we will look at general methods of dealing with unnamed type symmetry, which will cover all the different symmetries which can arise from the use of unnamed types.

Types with Variable Size
Many domains in Essence have values of different sizes. A simple example would be a set domain with no attributes restricting the size of the set. If the set is a decision variable then deciding the size of the set becomes part of the decision problem. The Explicit representation selection rule only works for fixed cardinality sets, whereas variable cardinality sets are also commonly found in combinatorial optimisation problems. We define a representation (called Explicit-VariableSize) which uses a single integer decision variable to track the cardinality of the set, and creates a matrix that has sufficient entries of type T to represent the largest possible set. In this rule, Tsize is the smaller of the size of the domain T (which is calculated automatically) and the maxsize annotation for x, if one is given. The structural constraint for Explicit-VariableSize orders the elements for the first x_Card indices of the matrix, breaking the modelling symmetry on those elements. However, the remaining elements of the matrix are now free to take any value in T, therefore the representation has conditional symmetry [66]. Solvers may search over all possible assignments, both increasing the size of the search and producing many solutions which represent the same solution of the specification. Other abstract types that have a non-trivial refinement (multiset, function, relation, and partition) may also be of variable size, so this issue of unconstrained variables occurs in many representations. In general, dontCare constraints are used to fix the values of any free variables introduced by the representations. Another example of free variables occurs in the representation of partial functions. In cases where a value is not defined in the function, the corresponding image variables are free.
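The size of the conditional symmetry can be made concrete with a small enumeration. The Python sketch below is an illustration under stated assumptions: a set of at most 2 elements over int(1..3), a cardinality marker, the integer order in place of Conjure's total order, and unused matrix entries fixed to the smallest domain value when the symmetry is broken.

```python
from itertools import product


def assignments(max_size, domain, use_dont_care):
    """Enumerate (cardinality, matrix) pairs for a marker-based representation.
    The first `card` entries are strictly increasing; the remaining entries
    are free unless use_dont_care fixes them to the smallest value."""
    lo = min(domain)
    found = []
    for card in range(max_size + 1):
        for m in product(domain, repeat=max_size):
            used, unused = m[:card], m[card:]
            ordered = all(used[i] < used[i + 1] for i in range(card - 1))
            fixed = all(v == lo for v in unused)
            if ordered and (fixed or not use_dont_care):
                found.append((card, m))
    return found


free = assignments(2, range(1, 4), use_dont_care=False)
fixed = assignments(2, range(1, 4), use_dont_care=True)
print(len(free), len(fixed))  # 21 7
```

There are 1 + 3 + 3 = 7 sets of size at most 2 over three values. Leaving the unused entries free triples the number of concrete assignments in this tiny example.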
We introduce a new operator named dontCare to break conditional symmetry caused by free variables. For any Essence decision variable x, dontCare(x) assigns all decision variables in the concrete representation of x to their smallest value. This prevents the target solver from searching on any of the decision variables in the concrete representation of x.
All dontCare operators are removed before the Essence Prime model is produced, so there is no need to extend other tools to support it. Removing dontCare operators is achieved with a small set of rewriting rules. A dontCare operator on a decision variable with an abstract domain is rewritten as dontCare on the representation of the decision variable. When dontCare is applied to a tuple or a matrix, it is rewritten to apply to each element of the tuple or matrix separately. When dontCare is applied to a Boolean or integer variable, it is rewritten to an equality constraint fixing the variable to its smallest value. These rules suffice to remove dontCare completely before an Essence Prime model is produced.
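The rewriting rules for tuples, matrices and concrete variables can be sketched as a short recursive function. The Var class, the variable names and the string form of the generated constraints are hypothetical; the sketch starts from the concrete representation, so the first rule (replacing an abstract variable by its representation) is assumed to have been applied already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A concrete Boolean or integer variable with a finite domain."""
    name: str
    values: tuple  # Booleans use (False, True)


def remove_dont_care(rep):
    """Rewrite dontCare(rep) into equality constraints on concrete variables."""
    if isinstance(rep, Var):
        # Concrete variable: fix it to the smallest value of its domain.
        return [f"{rep.name} = {min(rep.values)}"]
    # Tuple or matrix: apply dontCare to each component separately.
    constraints = []
    for part in rep:
        constraints.extend(remove_dont_care(part))
    return constraints


rep = (
    Var("x_Card", (0, 1, 2)),
    [Var("x_Matrix[1]", (1, 2, 3)), Var("x_Matrix[2]", (1, 2, 3))],
)
print(remove_dont_care(rep))
# ['x_Card = 0', 'x_Matrix[1] = 1', 'x_Matrix[2] = 1']
```

For a Boolean variable the smallest value is False, which this sketch prints in Python's spelling rather than in Essence Prime syntax.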
The assignment made by dontCare(x) may not correspond to a value in the abstract domain of x. For example, if the abstract domain is set (minSize 2) of int(1..3) and the Occurrence representation is used (as in Section 2.4), the current implementation of dontCare(x) assigns all variables to false, and therefore produces an empty set. This will conflict with the annotation minSize 2. Therefore the structural constraints of x will conflict with dontCare(x), and we ensure that Conjure avoids asserting both together.
Representation rules are required to ensure each abstract variable they introduce will have exactly one of dontCare or structuralCons placed on them in any assignment, to ensure both the removal of symmetries and correct answers. The Explicit-VariableSize rule is therefore written as follows: The dontCare constraint is refined using the standard expression refinement processes within Conjure, and it is used in some refinements of several other abstract types. In Section 5 we evaluate the impact of breaking conditional symmetry using dontCare.

Consistent Symmetry Breaking
A well known issue when using constraints to break multiple sets of symmetries in the same problem is that the constraints can conflict, leading to lost solutions (e.g. [63]). This problem does not occur when Conjure breaks symmetries and conditional symmetries introduced during refinement. The reason for this is simple: each symmetry is broken as soon as it is introduced, allowing us to handle each introduced symmetry group in isolation.
To elaborate, one important feature of Conjure is that during refinement we have a valid specification after the application of each refinement rule (these partially-refined specifications include some constructs internal to Conjure not in Essence). Therefore when we introduce a symmetry or conditional symmetry during refinement, and then immediately remove it by the addition of new constraints, at no point simultaneously are there two model symmetries that we have to break consistently. If, on the other hand, we delayed breaking symmetry until refinement was complete, we would then have to break all symmetries in a consistent manner.
The symmetry breaking constraints generated by Conjure cannot conflict with any constraints in the original specification either. Conjure only breaks the symmetry introduced by a representation selection rule. For this purpose, it posts symmetry breaking constraints on the concrete decision variables it generates. The concrete variables are not present in the original specification so it is impossible to write conflicting constraints in terms of them.
Refining any Essence specification using Conjure produces a model that has an identical number of solutions to the specification. Therefore we have broken all symmetries which would lead to one Essence solution mapping to multiple Essence Prime solutions. We only need to ensure each representation selection rule in isolation preserves exactly one assignment for each solution, and the application of any set of representation selection rules will also preserve the number of solutions.
We have focused in this paper on breaking modelling symmetry. While the abstraction of the Essence language naturally lends itself to writing Essence specifications without symmetry, we do expect that some Essence specifications will contain symmetries and conditional symmetries, for example representing the reflections and rotations of a chessboard. Assuming the symmetry in the specification has been detected (a topic not addressed in this paper) and broken consistently by adding constraints to the specification prior to refinement (for example via the lex leader method [61]) there will be no consistency issue with the way in which Conjure breaks modelling symmetry.

Viewpoint Selection
Choosing a representation selection rule to apply to a decision variable corresponds to a human modeller selecting a viewpoint. It is therefore crucially important to the efficiency of the model, affecting the ease of stating constraints, their propagation and ultimately the efficiency of the search for a solution.
Conjure makes all representation choices in one pass, separating the choice of representations from the actual application of the representation selection rules. It chooses a representation for each decision variable in the specification. Every reference to a decision variable is tagged with the name of its representation (which guides the application of expression refinement rules, as described in Section 4.2 below). For simplicity we assume here that each decision variable has one representation. However, in a channelled model a decision variable may have multiple representations. Section 4.3 describes how Conjure generates channelled models.

Representation Selection Rules in Conjure
Table 5 gives a brief description of each of Conjure's representation selection rules. There are 17 representations in total, spread across the 6 abstract domain constructors. Three representations (FunctionAsRelation, RelationAsSet, PartitionAsSet) work by converting an abstract domain to another abstract domain, which is then converted to a concrete domain by subsequent representation rule applications. We briefly explain the remaining representations in this section.
A common method shared by several representations is to use marker or flag variables to indicate the relevant members of a matrix. For example, in a variable size set representation (with a marker variable), Conjure creates a matrix with sufficient entries to represent the maximum number of elements of the set. In addition a marker variable is used to indicate the size of the set. Decision variables in the matrix that are not used are irrelevant to the final value of the abstract variable. These are fixed to break symmetry using dontCare constraints, as described in Section 4.1.3.
There are two main kinds of representations for sets: the Occurrence representation and four flavours of explicit representations. The Occurrence representation creates a Boolean variable for every potential member of the set. This representation does not introduce modelling symmetry, but it can create a prohibitively large number of variables when given a large set domain. The basic Explicit representation works for fixed cardinality set variables. For a set with cardinality n, it creates a matrix indexed by {1..n} where each element of the matrix represents one member of the set. Symmetry breaking constraints ensure the matrix is in increasing order. The ExplicitVariableSizeMarker and ExplicitVariableSizeFlags representations work for variable cardinality sets. They both have a matrix similar to Explicit but with one matrix element per potential element of the set, using the maximum cardinality of the set as the limit. The former then uses a single integer variable to denote the cardinality of the set and the latter uses a Boolean variable per element of the matrix to indicate membership. Appropriate symmetry breaking constraints are added to enforce an increasing order among the elements of the set and to fix the irrelevant variables using dontCare constraints. ExplicitVariableSizeDummy is similar to Explicit but adds a dummy value to the domain of the elements of the matrix (see footnote 3).

Representations of multisets are similar to those of sets. In contrast to sets, multisets allow repeated values. In order to accommodate this, the multiset Occurrence representation introduces an integer decision variable (instead of a Boolean) for each value that may appear in the multiset. The domain of this variable ranges from zero to the maximum number of occurrences allowed per value. The ExplicitFlags representation uses a decision variable per distinct value and a corresponding decision variable for the number of repetitions of that value. The ExplicitRepetition representation uses a matrix of decision variables bounded by the maximum cardinality of the multiset. Repeated values are allowed in this matrix and the resulting symmetry is broken by placing them in non-decreasing order.
Sequences in Essence are ordered collections of values of variable length (with an upper bound). Sequences are represented with a matrix and a length variable. Elements of the matrix which have an index greater than the length of the sequence are fixed using dontCare constraints.
There are four representations of functions. FunctionAsMatrix represents a total function τ1 → τ2 using a matrix indexed by τ1, containing τ2. The remaining three representations are used for partial functions. FunctionAsMatrixPartial is the FunctionAsMatrix representation plus a Boolean variable corresponding to every value in τ1 to indicate whether this value is defined in the function. Undefined values are fixed using dontCare constraints. FunctionAsMatrixDummy extends FunctionAsMatrix with a dummy value to indicate undefined values.

The RelationAsMatrix representation has a Boolean matrix indexed by the components of the relation, where a true value indicates relation membership.

The partition (from τ) Occurrence representation has a matrix indexed by τ to represent the cell of the partition that each value belongs to. Cells are identified using integers. The cardinality and the first element of each cell are also represented for efficiency reasons. The modelling symmetry arising from this representation is broken by its structural constraints.

3 Some solvers support decision variables with 'set of int' domains. In addition to the representation options presented here, Conjure could trivially be made to output these variables without converting them to matrices.

Expression Refinement Rules
Expression refinement rules are the second kind of rules in Conjure. They are used to translate Essence expressions to equivalent Essence Prime expressions. They may or may not depend on the representations of decision variables and parameters. Rules that do not depend on representations are called horizontal rules, and those that do are called vertical rules. Horizontal rules do not change the representation of decision variables; they merely translate Essence expressions to other Essence expressions. Because they are representation independent, they reduce the need for a very large number of representation-dependent vertical rules.

Vertical Rules
Vertical rules replace references to abstract decision variables with their representations. There must exist vertical rules for the most basic operations on the abstract types. One of the most important classes of vertical rules are the comprehension generator rules that allow comprehensions to iterate over elements contained in an abstract decision variable. Suppose we have the following comprehension containing the abstract variable S, of type set of int. All items in S must be odd. In addition we have an in operator, one of the simplest binary operators on sets. If the Explicit representation is chosen, the resulting model is quite different. A vertical rule replaces the generator i <- S with q : int(1..3), and references to i with S_Explicit[q], producing a straightforward model of the first constraint. For the second constraint, there is no vertical rule so a horizontal rule is applied first, producing or([q = 5 | q <- S]). From there, the same vertical rule is applied to the generator q <- S, producing the model below. To complete the refinement of this model fragment the .< would be refined as described in Section 4.1.2.

.3) ])
All comprehension generators i <- T have a vertical rule for every possible representation of T, and all abstract types are allowed in a generator (see Section 3.1). Other operators may have vertical rules for some representations and not others. In the example above, i in S was refined with a vertical rule when S took the Occurrence representation. Vertical rules take priority over horizontal rules.
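The effect of the generator rule under the Explicit representation can be imitated in Python. This is a sketch under assumptions: the exact form of the original constraints is not reproduced here, the list s_explicit stands in for the matrix representing S, and the 1-based index q mirrors the generator q : int(1..n).

```python
def all_odd_explicit(s_explicit):
    """'All items in S are odd' after the generator i <- S is replaced by an
    index q over the matrix and i by s_explicit[q]."""
    n = len(s_explicit)
    return all(s_explicit[q - 1] % 2 == 1 for q in range(1, n + 1))


def contains_explicit(s_explicit, value):
    """'value in S' after the horizontal rule gives or([q = value | q <- S])
    and the same generator rule is applied to q <- S."""
    n = len(s_explicit)
    return any(s_explicit[q - 1] == value for q in range(1, n + 1))


print(all_odd_explicit([1, 3, 5]))      # True
print(contains_explicit([1, 3, 5], 5))  # True
```

Both functions iterate over matrix indices only, which is the point of the vertical rule: once the generator is rewritten, no reference to the abstract set remains.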

Horizontal Rules
Horizontal rules are entirely independent of the chosen representation of the abstract decision variables. They allow Conjure to reformulate expressions, adding to the diversity of models that Conjure can produce and also avoiding the need for a huge number of vertical rules. When there is no vertical rule available for an expression, Conjure applies a horizontal rule to replace the expression with a simpler expression, often by decomposing an operator. Repeated application of horizontal rules always allows Conjure to reach a vertical rule.
For example, suppose we have two decision variables A and B of type set of int, and one constraint A = B. Refinement of equality is important for channelling constraints (as described in Section 4.3 below) and for cases where equality is part of a larger expression.

    find A, B : set (size 3) of int(1..10)
    such that A = B
Suppose the Occurrence representation is chosen for A and Explicit is chosen for B in the constraint A = B. There is no vertical rule for equality between these two distinct representations of sets. A horizontal rule is applied to decompose A = B into A subsetEq B /\ B subsetEq A. However subsetEq also has no vertical rule. Another horizontal rule is applied to each of the subsetEq operators, resulting in the following specification.

3) . BExplicit[q]=i
Thanks to being representation-oblivious, horizontal rules allow Conjure to achieve full coverage of the Essence language using a manageable number of rules and without having to repeat similar rules for each new representation.
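The decomposition of A = B into two subsetEq constraints can be sketched in Python for the case discussed above, with A in an occurrence form and B in an explicit form. The function names and the dictionary and list encodings are assumptions made for illustration.

```python
def a_subset_of_b(a_occ, b_exp):
    """A subsetEq B: every value present in A equals some entry of B."""
    return all(any(b == i for b in b_exp)
               for i, present in a_occ.items() if present)


def b_subset_of_a(b_exp, a_occ):
    """B subsetEq A: every entry of B is marked present in A."""
    return all(a_occ[b] for b in b_exp)


def sets_equal(a_occ, b_exp):
    """A = B decomposed as A subsetEq B /\\ B subsetEq A."""
    return a_subset_of_b(a_occ, b_exp) and b_subset_of_a(b_exp, a_occ)


# A = {2, 5, 9} as Booleans indexed by int(1..10); B as an ordered matrix.
a_occ = {i: i in {2, 5, 9} for i in range(1, 11)}
print(sets_equal(a_occ, [2, 5, 9]))  # True
print(sets_equal(a_occ, [2, 5, 8]))  # False
```

Neither helper needs a rule written specifically for the pair (Occurrence, Explicit): each side is accessed only through iteration or membership, which is what the horizontal decomposition achieves.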

Channelling Multiple Representations
Combining multiple representations of one abstract decision variable in a channelled model can be remarkably powerful [67]. Constraints may be stated on the most appropriate of the chosen representations, allowing for more concise expression of constraints and in some cases improved propagation, both of which can improve efficiency of the search for a solution. However channelling also introduces overheads in the form of additional decision variables and constraints, which may outweigh their potential benefits.
Conjure chooses a representation for each reference to a decision variable in the specification, therefore it may choose multiple representations for one decision variable. All representation choices are made in one pass where every reference to a decision variable is tagged with the name of a suitable representation. In this way the choice of representations is separated from the actual application of the representation selection rules.
When a decision variable is represented in more than one way, channelling constraints are added to ensure consistency between the representations. A channelling constraint is simply an equality between two references to the same decision variable, where the two references are tagged with different representations. The equality is then refined using the standard refinement processes for expressions, described in Section 4.2. A channelling constraint is created for every pair of distinct representations. An example of refining a channelling constraint for the knapsack problem is given in Section 2.5.
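The pairing of representations can be sketched with a few lines of Python. The "#" tagging notation and the function name are hypothetical and used only to make the output readable.

```python
from itertools import combinations


def channelling_constraints(var, representations):
    """One equality per unordered pair of distinct representations of var."""
    return [f"{var}#{r1} = {var}#{r2}"
            for r1, r2 in combinations(representations, 2)]


print(channelling_constraints("x", ["Occurrence", "Explicit"]))
# ['x#Occurrence = x#Explicit']
```

With k representations this produces k(k-1)/2 equalities, each of which is then refined like any other equality between differently represented variables.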
By default Conjure produces multiple models by enumerating all possible ways of selecting representations (i.e. all ways of tagging every reference to a decision variable in the AST) and all possible ways of generating constraint expressions once a representation is selected. Depending on the specification, large numbers of models may be produced from one specification (as shown in Table 7). We discuss the issue of selecting an effective model in Section 6, and in Section 7 we evaluate Conjure by examining whether it can generate known good models from the literature for a wide range of specifications.

The Impact of Breaking Modelling Symmetries
In this section we evaluate the impact of breaking modelling symmetries automatically. Throughout, it is important to bear in mind that these symmetries are broken by Conjure at the problem class level, hence the benefit of symmetry breaking is automatically obtained for every instance of the problem class being refined. This approach is substantially more efficient than analysing an individual instance to identify symmetries [68]. At present there are two mechanisms for breaking modelling symmetries. The first breaks unconditional variable symmetries using ordering constraints (introduced in Section 4.1.2). In this case the number of symmetries can be represented with closed-form expressions and we give a detailed example in Section 5.1. The second mechanism breaks conditional symmetries that arise when parts of representations are unused in a solution (introduced in Section 4.1.3). Here the number of symmetries, and thus the impact of symmetry-breaking constraints, is not straightforward to represent mathematically, so we report an experiment in Section 5.2 below.

Breaking Unconditional Variable Symmetries
In order to illustrate both the importance of symmetry breaking, and the way in which the high level of abstraction of Essence allows us to avoid the expensive step of detecting modelling symmetries, we will consider the refinement of the Social Golfers Problem specification presented in Fig. 1. The single abstract decision variable in the specification is a set (representing the weeks) of partitions (representing the groups of golfers). A standard refinement of a fixed-cardinality set, particularly when its elements are themselves complex objects, is into a matrix with the same number of elements as the cardinality of the set (the Explicit representation). Of course, since matrices have indices whereas sets do not, this immediately introduces a symmetry whereby any permutation of the matrix represents the same set. In this case, there will be w! such symmetries. However, the refinement rule employed by Conjure recognises this modelling symmetry and breaks it as it enters the model, by ordering the elements of the matrix, without the need for a costly symmetry-identification process in the final model.
The partition of the golfers can be thought of as a set of sets of golfers subject to the additional constraints that the outer set contains exactly g sets, each of size s, and the intersection of any pair of inner sets is empty. A natural refinement of this nested object is into a g × s matrix, introducing a symmetry on the g! possible arrangements of the groups and the s! arrangements of golfers within those groups. Since each group can be arranged independently, this results in g!(s!)^g symmetries for each partition, which again Conjure identifies and breaks as they enter the model.
Since each partition forms one of the weeks, the final model derived as above has w!g!(s!)^g symmetries in total. This is a vast number for even relatively small instances of the social golfers problem, which, if left in the model, could have a significant adverse influence on the performance of search. Conjure's ability to deal with these symmetries automatically at the class level (as described in Section 4.1.2) is therefore very valuable.
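To give a feel for how quickly these counts grow, the following minimal Python sketch evaluates the closed-form expressions quoted above. The function names and the sample instance sizes are illustrative assumptions, not part of Conjure.

```python
from math import factorial


def partition_symmetries(g: int, s: int) -> int:
    """Symmetries of one g x s matrix representing a partition: g! * (s!)^g."""
    return factorial(g) * factorial(s) ** g


def modelling_symmetries(w: int, g: int, s: int) -> int:
    """Total count as stated in the text: w! * g! * (s!)^g."""
    return factorial(w) * partition_symmetries(g, s)


if __name__ == "__main__":
    # Hypothetical small instances (w weeks, g groups, s golfers per group).
    for w, g, s in [(2, 2, 2), (4, 3, 3), (5, 5, 3)]:
        print(w, g, s, modelling_symmetries(w, g, s))
```

Even the second sample instance, with only nine golfers, already yields tens of thousands of symmetric variants under this expression.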

Breaking Conditional Symmetries
Conditional symmetries arise when refining abstract domains where the values have distinct sizes, and so parts of the representation may be redundant in a solution (depending on the chosen representation). As described in Section 4.1.3, dontCare constraints are used to assign redundant variables in order to break the conditional symmetry. We ran an experiment to illustrate the effectiveness of automated conditional symmetry breaking in Conjure by counting the number of solutions to Essence problem specifications with and without dontCare constraints. The experiment also demonstrates that arbitrary combinations of nested types can be handled, even with conditional symmetries in each. In these experiments Savile Row and Minion were run with their default settings on a 32-core AMD Opteron 6272 at 2.1 GHz.
First, we generated 25 Essence specifications. Each contains a single decision variable with a 3-level nested domain, but no constraints. The innermost domain is always an integer domain, and we generate all combinations of 5 Essence domain constructors for the other layers. The outer two layers have a bounded size of 2, so can also be empty or size 1, meaning that both layers will require conditional symmetry breaking using dontCare constraints. Moreover, the structural constraints of the inner layer will need to be posted conditionally. Conjure contains multiple refinement options for all of the domains in this experiment. In some cases it is able to generate thousands of models for one problem. However, since the conditional symmetry breaking constraints are needed in all of these models we chose one model per problem using the CompactEP heuristic (see Section 6). Table 6 presents the number of solutions for the same problem specification with and without conditional symmetry breaking constraints. The results are as expected: models with dontCare constraints have fewer solutions than those without. The most extreme cases involve partitions, and can produce hundreds of millions of solutions when there are only ten symmetrically distinct ones. When using dontCare constraints, these symmetric solutions are avoided and the solver need not waste effort searching through them.
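The effect measured in this experiment can be reproduced in miniature by brute force. The sketch below is a simplified illustration, not Conjure's implementation: it assumes a set of at most two integers from 1..3 stored as a marker (the number of used slots) plus a two-slot matrix, with used slots kept strictly increasing, and it models a dontCare constraint as fixing each unused slot to the smallest domain value. All names and sizes are hypothetical.

```python
from itertools import product

VALUES = (1, 2, 3)  # assumed element domain int(1..3)
MAX_SIZE = 2        # assumed bound on the set's cardinality


def count_solutions(use_dont_care: bool) -> int:
    """Count assignments to (marker, slots) with or without dontCare."""
    count = 0
    for marker in range(MAX_SIZE + 1):
        for slots in product(VALUES, repeat=MAX_SIZE):
            used, unused = slots[:marker], slots[marker:]
            # Unconditional symmetry breaking: used slots strictly increase.
            if any(a >= b for a, b in zip(used, used[1:])):
                continue
            # Conditional symmetry breaking: pin every unused slot.
            if use_dont_care and any(v != VALUES[0] for v in unused):
                continue
            count += 1
    return count


if __name__ == "__main__":
    print("without dontCare:", count_solutions(False))
    print("with dontCare:   ", count_solutions(True))
```

In this toy setting there are 7 distinct sets (one empty, three singletons, three pairs). The count with dontCare matches that number, while the count without it is three times larger because the unused slots are free to take any value.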

Model Selection with the CompactEP Heuristic
Conjure is able to produce multiple models by enumerating all possible ways of selecting representations. If time is limited it is sensible to provide a rapid model selection method, avoiding both generating all models and training using instance data. In earlier work we proposed a method based on racing [34] to select a subset of the models that perform well on a given set of training instances. Racing methods allow alternative algorithms to be compared without necessarily having to run all algorithms on all instances. Even so, racing for model selection can be very computationally expensive. The focus of this paper is on refinement within Conjure, so we omit model selection methods that are essentially external to Conjure, such as racing.
Conjure contains greedy model selection heuristics that are used for making local decisions during model generation. These can be employed during both representation selection and expression refinement. The default heuristic is called CompactEP, which stands for "compact except parameters", and it is a combination of the Compact heuristic and the Sparse heuristic. We define these heuristics in the following.
The Compact heuristic favours transformations that produce simpler types of variables and smaller expressions at each point during refinement where multiple rules are applicable. We define the compact ordering on domains as follows: concrete domains (such as bool, matrix) are smaller than abstract domains; within concrete domains, bool is smaller than int and int is smaller than matrix. These rules are applied recursively, so that a one-dimensional matrix of int is smaller than any two-dimensional matrix. Abstract type constructors have the ordering set < mset < sequence < function < relation < partition, which is also applied recursively. At each stage of representation selection, the Compact heuristic will select the smallest domain according to this order. As an example, a set(size n) of int is represented as a matrix indexed by [int] of bool with the Occurrence representation, and as a matrix indexed by [int] of int with the Explicit representation. As bool is smaller than int under our ordering, Compact will always pick the Occurrence representation in this example.
During expression refinement Compact chooses the rule that produces the shallowest abstract syntax tree (AST) directly following its application. For example, an expression like a subsetEq b has a shallower AST (depth 1) than forAll i in a. exists j in b. i = j (depth 3). To break ties, an arbitrary total ordering is defined over all abstract syntax trees.
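The recursive ordering on domains can be sketched as a sort key. This is a hedged illustration only: the nested-tuple encoding of domains, the function name compact_key, and the decision to ignore matrix index domains are all assumptions made for the example and do not reflect Conjure's internal data structures.

```python
# Rankings taken from the ordering described in the text.
CONCRETE = ("bool", "int", "matrix")
ABSTRACT = ("set", "mset", "sequence", "function", "relation", "partition")


def compact_key(domain):
    """Sort key for a domain encoded as (constructor, inner_domain, ...).

    Concrete constructors sort before abstract ones; ties are resolved
    by recursively comparing the inner domains.
    """
    name, *inner = domain
    if name in CONCRETE:
        rank = (0, CONCRETE.index(name))
    else:
        rank = (1, ABSTRACT.index(name))
    return rank + (tuple(compact_key(d) for d in inner),)


if __name__ == "__main__":
    occurrence = ("matrix", ("bool",))  # matrix indexed by [int] of bool
    explicit = ("matrix", ("int",))     # matrix indexed by [int] of int
    print(min([explicit, occurrence], key=compact_key))
```

Under this encoding the Occurrence-style domain is selected for the fixed-size set example, and a one-dimensional matrix of int compares smaller than a matrix of matrices, in line with the description above.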
The Sparse heuristic is intended to enable small representations of parameter values. It employs a built-in ordering of representations that gives priority to those that take advantage of sparsity. For example, the Explicit representation would take priority over the Occurrence representation for a fixed-cardinality set because Explicit scales with the cardinality whereas Occurrence scales with the number of values potentially in the set. Consider a parameter with the domain relation of (int(1..100) * int(1..100)). A sparse member of this domain like relation((1,2), (3,4)) would require 10,000 Booleans with the RelationAsMatrix representation and only 4 integers with the RelationAsSet representation.
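The size comparison in this example is simple arithmetic, sketched below. The two helper functions are hypothetical and only count the number of values each representation would need to store for a given parameter; they are not Conjure functions.

```python
from math import prod


def relation_as_matrix_size(domain_sizes):
    """One Boolean per possible tuple: the product of the component sizes."""
    return prod(domain_sizes)


def relation_as_set_size(tuples):
    """One integer per component of each tuple actually in the relation."""
    return sum(len(t) for t in tuples)


if __name__ == "__main__":
    member = [(1, 2), (3, 4)]  # the sparse relation from the text
    print(relation_as_matrix_size((100, 100)), "Booleans")
    print(relation_as_set_size(member), "integers")
```

The matrix-based size depends only on the domain, whereas the set-based size grows with the number of tuples present, which is why a sparsity-aware ordering favours the latter for parameters.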
The default CompactEP heuristic is a combination of these two heuristics: during representation selection, Conjure uses the Sparse heuristic when representing problem class parameters and the Compact heuristic for everything else.

Evaluation: Conjure Produces Kernels of Good Models
Conjure provides full coverage of the Essence language. It has at least one variable representation rule (typically several, see Table 5) for every abstract variable type, and horizontal and vertical expression refinement rules for all the operators defined on them. In this section we test the hypothesis that the kernels of constraint models written by experts can be automatically generated by refining a problem's abstract specification. For two CP models to have the same model kernel, they need to share the same viewpoint, the same representation of decision variables and the same formulation of the problem constraints, together with symmetry breaking. Expert models can have additional features such as implied constraints or dominance breaking [69] constraints but these are not considered to be in the kernel of the CP model for this evaluation. Some expert models contain global constraints that are not present in Essence Prime. In these cases, if Conjure generates an equivalent decomposition then we consider the two models to have the same kernel.
In order to test this hypothesis, we took a diverse set of 42 benchmark problems drawn from the literature and refined them with Conjure. Our main source for these problems is CSPLib [48]. We cover the entire CSPLib problem class collection (at the time of writing), except those problems that are naturally represented using only matrices of Booleans or integers, i.e. without the facilities that Essence provides in addition to those of lower level constraint modelling languages.
In Table 7 we present the set of problem classes and the abstract types of their decision variables in Essence. We also cite the papers that contain a kernel that Conjure is able to generate. We begin by noting the variety of decision variable types involved in the benchmark problems, representing further evidence that the current collection of rules, the rewrite rule mechanism, and the Conjure system as a whole are capable of refining a wide variety of abstract problem specifications into concrete models. The number of models generated for a problem specification depends on the number of representation options for its decision variables.

Table 7: Running Conjure on 42 benchmark problems from CSPLib. We highlight the features of Essence used in the problem specification for each problem class and include a reference to at least one published model for each problem that is comparable to one of the models automatically generated by Conjure. In addition, we present the estimated number of models Conjure can produce using 6 configurations of the model selection heuristics.

Configurations of Conjure
In Table 7 we report a lower bound on the number of models that can be generated by Conjure with six configurations. One source of variation in models is the selection of different representations for decision variables, parameters, and quantified variables. We calculate the exact number of representations available for each by examining the domain. A second source of variation arises from expression refinement. We calculate a lower bound on the number of attainable models by taking the product of the number of representation options for each reference to a declaration in the model (where channelling is enabled for the declaration). For example, if a decision variable may be channelled, has two representations, and is referred to three times in the model, there are 2^3 = 8 ways of tagging the references with a representation. The result of this calculation is a lower bound because it ignores the potential for multiple expression refinement pathways after the selection of representations.
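The lower-bound calculation can be sketched as follows. The Declaration record and its field names are hypothetical. The sketch also assumes that a declaration without channelling contributes its number of representations exactly once, since a single representation is then shared by all references, and it ignores any extra representation added for search ordering.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Declaration:
    representations: int  # number of available representations
    references: int       # number of times it is referred to in the model
    channelled: bool      # whether channelling is enabled for it


def model_count_lower_bound(declarations) -> int:
    """Product over declarations of the ways to tag references."""
    return prod(
        d.representations ** d.references if d.channelled else d.representations
        for d in declarations
    )


if __name__ == "__main__":
    # The example from the text: two representations, three references.
    print(model_count_lower_bound([Declaration(2, 3, True)]))
```

Adding a second, non-channelled declaration with two representations would double the bound under these assumptions.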
The six configurations represent different trade-offs between time taken and the ability to generate diverse models. One option is to prune the set of representations:
• Pruning: Use a built-in heuristic to filter the list of representations. This heuristic only allows the use of one variable cardinality representation (ExplicitVariableSizeMarker) for sets created by the RelationAsSet and PartitionAsSet representations.
• No Pruning: Explore all applicable representations for every decision variable and parameter.
Pruning and No Pruning are combined with each of the following options for channelling:
• NoCh: Decision variables and parameters each have only one representation (no channelling).
• VarsCh: Channelling is allowed for decision variables but not for parameters.
• FullCh: Channelling is allowed for both decision variables and parameters.
The number of models Conjure can generate for a problem class depends heavily on the abstract types used in the problem specification. In particular, decision variables and parameters that have abstract domains present an opportunity for using different representations. When channelling is enabled, the number of models also depends on the number of times each decision variable (or parameter) is mentioned in the constraints: each use of a decision variable presents an opportunity for a new representation. In addition to choosing one representation for each use of decision variables, we allow the addition of one extra representation, used mainly for providing a search order [70]. In Table 7 we present the numbers of models for the six configurations. In terms of the numbers of models, it is always the case that NoCh ≤ VarsCh ≤ FullCh, and Pruning ≤ No Pruning. In some cases channelling dramatically increases the number of models (Steel Mill Slab Design for example), and similarly turning off pruning can have a dramatic impact (for example, the Water Bucket Problem).

Comparing Generated Models to Published Models
In this section we briefly compare the models generated by Conjure to published models written by expert modellers for each of the 42 problem classes.
CSPLib 1 For the car sequencing problem using the default heuristic Conjure generates a model that uses an integer matrix to represent the function variable. This is the same viewpoint as the model published in [71]. In addition, Conjure generates a 2-dimensional Boolean representation of the same function variable which is commonly used when developing MIP or SAT models.

CSPLib 2
The template design problem has two function variables in its Essence specification. These variables are represented using integer matrices via the default heuristic since they are total functions. This model has the same viewpoint as [72].
CSPLib 3 For several variations of the quasigroup existence problem, Zhang [73] used a 2-dimensional matrix to represent an integer square grid. This model is produced as the default model by Conjure.
CSPLib 5 The low autocorrelation binary sequences problem contains a function variable which is modelled in a similar way to the template design problem (CSPLib 2), see [74].

CSPLib 6
The Golomb ruler problem is naturally modelled as a set. From this set model Conjure generates an explicit representation with symmetry breaking and a Boolean occurrence representation. In the explicit model, the distinctness of the inter-tick distances is modelled using an all-different constraint (thanks to Savile Row). The explicit model is given in Section 2 of [75] and the occurrence model in Section 7 of the same paper.
CSPLib 7 The all-interval series problem is modelled as 2 bijective functions in Essence. Previous work on this problem investigates symmetry breaking [76], using a 1-dimensional array-based representation for each function variable with appropriate constraints to enforce the bijection property. The default model generated by Conjure is the same as this model.

CSPLib 8
The vessel loading problem is described by Brown [77]. He used an array based model to represent the function variables that are found in the Essence problem specification. Among other representations, Conjure generates the same viewpoint as this published model.

CSPLib 9
The perfect square placement problem is specified using function variables in Essence. The default model generated by Conjure uses the same viewpoint as the model given in [78]. This published model uses the cumulative global constraint, which is not currently generated by Conjure; instead, Conjure generates an equivalent decomposition.
CSPLib 10 The social golfers problem is specified using a set of partitions in Essence. The set represents the weeks and each partition is the schedule for a week. This abstract domain gives rise to a 3-dimensional matrix model, with appropriate constraints posted to enforce the set and the partition structure. This is the viewpoint used by the default model generated by Conjure and it corresponds to the model presented in [31].
CSPLib 13 The progressive party problem includes a set of partitions in its problem specification. This domain can be refined into representations that have the same viewpoint as both of the models presented in [79].
CSPLib 15 Schur's lemma [80] is specified using a single partition variable. Using quantification over all sets of triples of potential members of this partition and an apart operator, the problem is stated using a single top level constraint. The default model Conjure generates uses a set of sets and includes automated symmetry breaking constraints.
CSPLib 16 The traffic lights problem is used as an example in [81] as a demonstration of higher-arity constraints. In Essence this problem can be specified using functions and set membership.

CSPLib 17
The Ramsey numbers problem is modelled using a function variable and universal quantification over fixed size subgraphs. The default model generated by Conjure uses a variable to represent each edge in the graph and its colour, which corresponds to the model presented in [82].

CSPLib 18
The water bucket problem is a planning problem. It is modelled using a sequence of states and actions in Essence. The sequence type allows modelling a list with a bounded (but not fixed) length. The default model generated for this problem represents the states and the actions explicitly and breaks the conditional symmetry [35] arising from the variable length of the data structures. A comparable model is given in [83].

CSPLib 21
The crossfigures problem benefits from bounded length sequences in a similar way to the water bucket problem. In addition, it uses a variant type to represent the 8 different kinds of clues succinctly. The generated Essence Prime model is comparable to a MiniZinc model published on the CSPLib page [84].

CSPLib 22
The natural language specification of the bus driver scheduling problem starts with the following sentence: "Bus driver scheduling can be formulated as a set partitioning problem". The problem specification in Essence is very succinct and has a single decision variable that represents this partition. Starting from this problem specification, one of the models generated by Conjure uses the same core viewpoint as the model published in [85].
CSPLib 24 A channelled Essence model for the Langford's number problem is presented in [86]. The output of Conjure for this problem corresponds very closely to the models presented by Smith [87].

CSPLib 26
The sports tournament scheduling problem is specified using an arity 3 relation of week, period and a set of two teams. Each entry in the relation indicates that the two teams are timetabled to play a game on the selected week and the period. This abstract type (and using the relation projection operator) allows a succinct specification of the problem in Essence. One of the generated CP models uses a viewpoint similar to the one published in [88].
CSPLib 28 The balanced incomplete block design problem (BIBD) is typically modelled in CP using a 2-dimensional array with symmetry-breaking constraints [89]. In Essence, the problem is specified using a relation with arity 2 between two unnamed types. This domain allows us to break all of the symmetry in this problem (which is not the most computationally efficient approach), or produce the more commonly used double-lex constraints.
CSPLib 30 The balanced academic curriculum problem (BACP) is modelled using a relation to represent the prerequisites between courses and a function variable to represent the assignment of periods to courses. Using this problem specification, Conjure generates a model very similar to the one published in [90].

CSPLib 31
The rack configuration problem is modelled using a partial function that represents whether each rack is used in a solution or not, and if it is, the model and quantity of each card type in this rack. This representation captures the decision abstractly and allows the generation of the kernels of both models presented in [91].
CSPLib 32 For the maximum density still life problem, we obtain a comparable model to that of Bosch & Trick [92] with the difference that Conjure chooses an explicit instead of an occurrence representation. However, we obtain nothing like the dual model obtained by Smith [93] in which supercells in the original grid are used as variables: this is an example of a reformulation that could be implemented generally but remains outside the scope of this paper. Using even more advanced techniques, a complete solution to the problem has been found for all n [94].
CSPLib 33 The DNA word design problem is very succinctly specified in Essence using a set of functions. It is an optimisation problem where the objective is to minimise the cardinality of the abstract set. For this problem, Conjure produces a model similar to the one published in [95], together with symmetry breaking constraints between the (function) members of the set.

CSPLib 34
The warehouse location problem is a typical network flow problem. Function variables in Essence can be used to specify this problem at a high level of abstraction, and Conjure is able to generate the two main alternative viewpoints described in [27] as well as channelled versions of these two viewpoints.

CSPLib 36
The fixed length error correcting codes problem uses a partial function in its Essence problem specification and an abstract decision variable whose domain is a set of functions. Thanks to this abstract type, the models produced by Conjure include symmetry breaking constraints similar to those presented in [64].
CSPLib 38 For the steel mill slab design problem, Conjure obtains models where the set of orders assigned to each slab, and the set of colours on each slab, are refined using occurrence or explicit representations. If occurrence is used for both, then the model is similar to the model in [96]. The later model of [97] uses a different viewpoint (as well as exploiting dominance) and is not generated by Conjure.
CSPLib 39 The rehearsal problem uses 3 function variables, one of which is a bijection. When defining its objective function, this problem specification uses the list comprehension feature of Essence. Smith [98] presents a model which is very similar to one of the models generated by Conjure.
CSPLib 40 The distribution problem with Wagner-Whitin costs is a warehouse stock distribution problem where each warehouse may order stock from one other warehouse at each time step. The specification uses a partial function variable to represent the orders. Conjure generates the conventional viewpoint of Tarim and Miguel [99], where value 0 is used to represent no order, and a variation of it where additional Boolean variables indicate whether orders are made. However, Conjure does not generate the echelon model [99].
CSPLib 41 The n-fractions puzzle is not a very complex problem to specify, but it still benefits from a surjective attribute in Essence. Frisch et al. [115] use this problem to explore implied constraints and one of the models generated by Conjure is the same as the one they present.
CSPLib 44 The Steiner triple systems problem is a special case of the balanced incomplete block design problem. Its problem specification uses set variables and the set intersection operator and Conjure is able to generate an explicit representation of these set variables, similar to a viewpoint used in [100].

CSPLib 45
The covering array problem is modelled using a 2-dimensional matrix variable. The statement of the constraints uses a quantified expression over all values of a fixed length sequence. The output models are similar to those published in [101].

CSPLib 49
The set partitioning problem is very directly specified in Essence using 2 set variables and a constraint to enforce the main sum constraint in the problem. Alternatively a partition with 2 parts can also be used. In either case, the output models will use an Explicit set representation based viewpoint or an Occurrence based set representation. Depending on instance size one or the other model is likely to be a better choice. Both of these viewpoints are explored in previous work in the context of mathematical programming [102].
CSPLib 51 Schaus et al. [103] present a viewpoint for the tank allocation problem that uses a single integer variable representing the product type per tank. The Essence problem specification follows this viewpoint closely, benefiting from Essence features to represent the parameters (a set of sets for representing incompatibilities) and when stating the constraints.
There are 4 variants of the graceful graphs problem on CSPLib 53: Wheel Graphs, Double Wheel Graphs, Gears, Helms. Smith and Puget [104] present two viewpoints for this family of problems: one primarily based on the nodes, and another primarily based on the edges. The Essence problem specification gives rise to viewpoints based on the nodes.

CSPLib 55
The equidistant frequency permutation arrays (EFPA) problem is specified using a single set variable. This allows Conjure to generate several alternative models, with symmetry breaking and channelling between representations. The generated models include the Boolean, Non-Boolean, and Channelled models that were presented in [105].
CSPLib 56 The synchronous optical networking (SONET) problem uses a single variable with a nested domain: multiset of set of nodes. In addition to the declaration of this variable, the problem specification has a single statement for the objective and a single statement for the problem constraint (to enforce that the demand is met). Starting from this high level problem specification, Conjure not only produces a model comparable to the one published in [106], but it also creates the same symmetry breaking constraints automatically.

CSPLib 65
The optimal financial portfolio design problem is specified in Essence using a set of set of integers as a single top level decision variable. The main constraint is written very succinctly using the universal subset quantification feature of Essence. Starting from this high-level specification, Conjure generates a model that uses the same matrix-based viewpoint that is published in [107].
CSPLib 83 The transshipment problem [108] is a network flow problem where the nodes are warehouses, transshipment points, or customers. The model in Essence uses a partial function from pairs of nodes to the amount of flow on the corresponding edge to model this structure. The default model generated by Conjure uses a 2-dimensional Boolean matrix to model edge existence and a 2-dimensional integer matrix to model the amount of flow on an edge. A second model which uses a list of triples (2 nodes and the flow amount) is also generated. The latter model is likely to be a good choice for sparse networks.
CSPLib 85 There is one common model for Van der Waerden numbers (using a 2-dimensional Boolean matrix viewpoint) used in both CP [109] and SAT [110]. This model is among the models generated by Conjure and it is chosen by the default heuristic.
CSPLib 86 There are two main approaches to modelling the capacitated vehicle routing problem: vehicle flow formulations and set partitioning formulations [111]. The first has an integer variable per edge representing the flow on that edge, and this is the default model produced by Conjure. The second uses Boolean variables to represent set partitioning, and Conjure also generates this model.

CSPLib 110
The Essence problem specification for the 'peaceable armies of queens' problem uses two set variables to represent the location of white and black queens. These sets are represented by Conjure in several ways, including the viewpoint given in [112], together with the symmetry breaking constraints presented there.
CSPLib 115 The tail assignment problem is defined using a single function variable. This function variable finds a partial mapping from flights to flights, representing a route, for every plane. The basic model in [113] uses 3 sets of decision variables to represent the same information. Conjure generates a comparable viewpoint automatically.

CSPLib 116
The Essence problem specification for Vellino's problem uses a partial function of multisets to represent the contents of each active bin. Bins that are not used are undefined in this function. The problem is stated using function operators (defined, range, quantification etc). A similar viewpoint to the default model generated by Conjure is published in [114].
In this section we have demonstrated that Conjure is able to generate models that are similar to published models produced by experts, for a wide range of problem classes drawn from a public repository. Also, all six of the abstract types in Essence (set, multiset, sequence, function, relation, and partition) are used in the specifications of the 42 problem classes, showing that each of these types is a necessary part of Essence. Moreover, in 30 out of these 42 problem classes Conjure's default heuristic is able to choose a model that is equivalent to a published model for the same problem class.

Related Work in Automated Constraint Modelling by Refinement
This section primarily surveys other languages and systems that have been employed in refinement-based approaches to automated constraint modelling, and compares them with our own work on Essence and Conjure. Beyond this body of work, there exists a variety of other approaches to automated modelling, which we discuss briefly before proceeding.
One such line of work is example driven. O'Casey [23] is a case-based reasoning tool, which uses recordings of previous problem solving episodes. Problems are paired with problem instances to form a case. The experience obtained from cases is mainly the selection of propagators and search heuristics. Conacq [116] is a SAT-based version space algorithm to acquire constraint networks. Another approach is to transform an existing constraint model to improve solver performance. The CGrass [118] system explores the idea of reformulating CP models using a collection of rules in order to improve them. It is limited to integer variables, and arithmetic and logical operators on integer expressions, and does not change representations of decision variables, but it can rearrange constraint expressions and reduce domains of decision variables. Tailor [11] performs common-subexpression elimination (CSE) and its successor, Savile Row [16], extends CSE and adds other powerful transformations.
MiniZinc [12] is a medium-level constraint modelling language. It contains features common to many CP modelling languages such as Boolean and integer domains, and arrays for collections of these variables. MiniZinc can be used to describe problem class models; however, it does not perform any reformulations at the class level. When presented with problem instance data, the class model is instantiated into an instance model which can be targeted to one of several solver backends. MiniZinc uses a solver-dependent instance level language called FlatZinc to interact with solvers.

Refining Abstract Constraint Problems
The NP-Spec language [119] allows the specification of NP-complete problems in a subset of existential second order logic. It provides a small number of high level domains, sets and partitions of integers, which are automatically refined into decision variables with simpler domains. NP-Spec provides only one way to refine each high-level domain and operator. Hence, it does not allow for the generation of alternative models.
The ESRA language [25] has a particular focus on decision variables with relation domains. It is translated to the language OPL [114,13], a constraint modelling language with similar facilities to Essence Prime, by refining relation domains and operators. Like NP-Spec, however, it does not consider multiple alternative refinement pathways. Moreover, the abstract domains offered by ESRA cannot be nested arbitrarily.
The F language [27] supports function variables. Problems modelled in F are refined into OPL using a system called Fiona. F supports function attributes such as total and bijective. Function domains in F cannot be nested arbitrarily: a function variable is simply a mapping between non-nested domains like integers or enumerations. Fiona does, however, support multiple alternative refinements for function domains, among which it selects using a number of heuristics. Fiona always generates a single output model using these heuristics. If the same function variable is refined in multiple ways within a single model, Fiona is able to generate channelling constraints automatically.
The constraint language most closely resembling Essence is Zinc [28]. Both languages support type constructors that can be nested to arbitrary depth, and they have a number of type constructors in common, such as sets, arrays and tuples. Essence supports more abstract decision variables than Zinc, for example via multiset, partition, function, and relation type constructors, which in Zinc must be modelled using a constrained collection of variables of a more primitive type.
Quantification over decision variables, which Essence supports, is vital for concision when dealing with variables that have nested domains. Essence and Zinc provide a similar selection of atomic types, although Zinc supports floats, which Essence does not, and unnamed types [120] are unique to Essence. Zinc is extensible via user-defined functions and predicates, a feature which Essence lacks.
Work on refining Zinc has focused on the production of models for different solving paradigms, such as mixed integer programming, constraint programming, and local search [121,29,30], rather than alternative refinement pathways for a particular type of solver. De Koninck et al. describe plans to use annotations to guide the use of alternative refinements manually [29].
Hernández [122] considers the problem of channelling different representations of high-level variables. She produces similar results to the implementation used in Conjure, which produces channelling constraints by refining X = X where the left and right X have different representations.
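To make the idea of channelling concrete, the following is a minimal sketch of what a channelling constraint between two representations of the same set variable has to enforce. The two representations used here (a Boolean occurrence vector and an explicit list of members) and the function name `channelling_holds` are illustrative assumptions, not Conjure's internal rules or output.

```python
def channelling_holds(occurrence, explicit):
    """Check that two representations denote the same set of integers.

    occurrence: list of bools, occurrence[v] is True iff v is in the set
                (values range over 0..len(occurrence)-1).
    explicit:   list of the members of the set.
    Both representations and this helper are hypothetical illustrations.
    """
    # Every explicitly listed member must be a value of the domain.
    in_range = all(0 <= e < len(occurrence) for e in explicit)
    # For every domain value, the two representations must agree on membership.
    agree = all(occurrence[v] == (v in explicit)
                for v in range(len(occurrence)))
    return in_range and agree
```

In a constraint model the same condition would be posted as constraints over decision variables rather than checked on fixed values; the sketch only shows the relation that refining X = X across two representations needs to capture.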

Encoding Constraint Models to Other Formalisms
A related body of work seeks to encode a given constraint model into another formalism, such as mixed integer programming (MIP), propositional satisfiability (SAT), or SAT modulo theories (SMT). In selecting how the variables and constraints of the constraint model are to be encoded, this approach shares many of the concerns of the refinement of abstract specifications described above. The substantial difference is in the lower level of abstraction of the input.
One popular method of solving constraint problems is to encode them to SAT and employ a SAT solver. The two key considerations are: first, the way in which the CP variables (i.e. variables of the constraint model) are encoded by a set of SAT variables and associated clauses; and second, how the constraints are encoded into SAT clauses (and additional SAT variables if necessary). The simplest such scheme introduces one SAT variable per domain value of each CP variable [123], adding clauses to ensure that every CP variable takes exactly one value. An important alternative is the log encoding [124], in which we represent a variable of domain size n by $\lceil \log_2 n \rceil$ SAT variables. Another alternative is the order encoding: each SAT variable indicates whether the CP variable is greater than a constant [125][126][127]. The order encoding can be useful when inequality reasoning is important. There is an extensive literature on encoding constraints into SAT, including generic encodings of arbitrary constraints (e.g. the direct or support encodings among others [123,124,128]). Special-purpose encodings for particular constraint types can vastly outperform generic encodings, for example cardinality networks for counting constraints [129], and the compact order encoding for sums [130] among many others. Picat is notable for using a log encoding throughout [131]. The availability of many encodings suggests that automatic selection of encodings is important. It is complementary to automatic generation and selection of models. Proteus [132], meSAT [133] and Satune [134] are examples of systems that automatically select SAT encodings.
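The three variable encodings described above can be sketched as follows for a single CP variable with domain {0, ..., n-1}. This is a minimal illustration under stated assumptions: clauses are lists of signed integers (DIMACS style), the function names are hypothetical, and the order encoding uses the convention that SAT variable i means "x >= i", which is one of several equivalent conventions. The brute-force `count_models` helper is only there to check that each encoding admits exactly n assignments.

```python
from itertools import combinations, product


def direct_encoding(n):
    """One SAT variable per domain value: variable v+1 means x = v."""
    lits = list(range(1, n + 1))
    clauses = [lits]                                          # at least one value
    clauses += [[-a, -b] for a, b in combinations(lits, 2)]   # at most one value
    return n, clauses


def log_encoding(n):
    """Binary encoding of x; bit patterns denoting values >= n are forbidden."""
    bits = max(1, (n - 1).bit_length())   # ceil(log2 n), with at least one bit
    clauses = []
    for value in range(n, 2 ** bits):
        # The clause is falsified only by the bit pattern of `value`.
        clauses.append([-(b + 1) if (value >> b) & 1 else (b + 1)
                        for b in range(bits)])
    return bits, clauses


def order_encoding(n):
    """SAT variable i (1 <= i <= n-1) means x >= i."""
    # x >= i+1 implies x >= i.
    clauses = [[-(i + 1), i] for i in range(1, n - 1)]
    return n - 1, clauses


def count_models(num_vars, clauses):
    """Count satisfying assignments by exhaustive enumeration (small n only)."""
    count = 0
    for assignment in product([False, True], repeat=num_vars):
        if all(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            count += 1
    return count
```

For a domain of size 5, the direct encoding uses 5 SAT variables, the log encoding 3, and the order encoding 4. The sketch covers only the encoding of variables; the encoding of constraints over them is a separate choice, as the text notes.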
SMT solvers have made remarkable progress in recent years, making SMT an attractive target for encoding constraint models. FZN2OMT [135] translates the FlatZinc language to SMT, while SR-SMT [136] is a component of Savile Row [16] that outputs SMT. Both systems are able to target multiple theories and multiple SMT solvers. Selection of encodings is as important when encoding to SMT as it is when encoding to SAT.
Encoding of constraint modelling languages to MIP (linearization) has a long history. OPL is an early example [137]; however, the OPL system was not able to linearize the entire OPL language. More recently, Rafeh and Jaberi [30] presented LinZinc, a library to linearize the Zinc language in its entirety, and Belov et al. [138] presented a linearization of MiniZinc. For several constraint types (such as allDifferent), linearization of the constraint requires a 0/1 variable for each domain value of each CP variable in scope, similar to the direct SAT encoding. In contrast to SAT and SMT, the issue of selection among multiple encodings is not discussed in any of these works [137,30,138].
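The 0/1 linearization of allDifferent mentioned above can be sketched as follows. This is a textbook-style formulation, not the specific linearization used by OPL, LinZinc, or the MiniZinc work cited; the function names and the tuple representation of linear constraints are assumptions made for illustration. Each pair (CP variable i, domain value v) gets a 0/1 variable; each CP variable takes exactly one value, and each value is used by at most one CP variable.

```python
from itertools import product


def linearize_alldifferent(num_vars, domain):
    """Return (index, constraints) for allDifferent over num_vars CP variables.

    index maps (i, v) to the column of the 0/1 variable "CP variable i takes
    value v". Each constraint is (columns, op, rhs), meaning
    sum of the listed 0/1 variables `op` rhs.
    """
    index = {(i, v): k
             for k, (i, v) in enumerate(product(range(num_vars), domain))}
    constraints = []
    for i in range(num_vars):
        # Each CP variable takes exactly one value.
        constraints.append(([index[i, v] for v in domain], "==", 1))
    for v in domain:
        # Each value is taken by at most one CP variable.
        constraints.append(([index[i, v] for i in range(num_vars)], "<=", 1))
    return index, constraints


def count_solutions(index, constraints):
    """Count 0/1 assignments satisfying every linear constraint (small cases)."""
    count = 0
    for assignment in product((0, 1), repeat=len(index)):
        satisfied = True
        for columns, op, rhs in constraints:
            total = sum(assignment[c] for c in columns)
            if (op == "==" and total != rhs) or (op == "<=" and total > rhs):
                satisfied = False
                break
        if satisfied:
            count += 1
    return count
```

With k CP variables over a domain of size d this introduces k*d 0/1 variables, which mirrors the direct SAT encoding of each variable.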
A related field in which specification languages are extensively used is that of formalising computing systems. Specification languages such as Z [139] and VDM-SL [140] are used for describing general computing systems. These languages typically allow lambda expressions, set theoretic operators, and first-order logic. In comparison, Essence is a problem specification language in the domain of combinatorial problem solving and it offers specific features such as decision variables, a rich selection of finite domains and operators for posting a variety of constraints on these decision variables. Generic formal specification languages do not typically encode decision problems; instead, they encode properties of a system that are required to be true, and enable formal proofs of those properties.

Conclusions and Future Work
In this paper we have presented the automated constraint modelling system Conjure. It employs a set of refinement rules to transform the specification of a parameterised problem class in the abstract constraint specification language Essence into a concrete constraint model. By varying the selection and application of these rules, Conjure can produce a set of alternative models. We have demonstrated on a large set of problem classes that, in the vast majority of cases, the set produced includes those formulated by human experts in the literature. Furthermore, we have presented a heuristic by which an effective model can be selected.
A particular advantage of this approach is in the treatment of symmetry. Much of the symmetry typically present in a constraint model arrives through the process of modelling [33]. Conjure recognises and removes this symmetry as it enters a model, removing the need for an expensive symmetry detection step following model formulation, as used by other approaches [141,37]. Furthermore, the symmetry breaking constraints added to the model are valid for the entire problem class, rather than just a single instance. An important item of future work is the treatment of symmetry arising from unnamed types.
Another important item of future work is a more informed method of model selection to complement the data-free heuristic presented herein. Following the practice of algorithm selection [142], a set of training instances for a problem class could be used to learn how to select an effective model for an unseen instance from the same problem class.