Levelwise construction of a single cylindrical algebraic cell

Satisfiability Modulo Theories (SMT) solvers check the satisfiability of quantifier-free first-order logic formulas. We consider the theory of non-linear real arithmetic where the formulae are logical combinations of polynomial constraints. Here a commonly used tool is the Cylindrical Algebraic Decomposition (CAD) to decompose real space into cells where the constraints are truth-invariant through the use of projection polynomials. An improved approach is to repackage the CAD theory into a search-based algorithm: one that guesses sample points to satisfy the formula, and generalizes guesses that conflict with constraints to cylindrical cells around samples which are avoided in the continuing search. Such an approach can lead to a satisfying assignment more quickly, or conclude unsatisfiability with fewer cells. A notable example of this approach is Jovanović and de Moura's NLSAT algorithm. Since these cells are produced locally to a sample we might need fewer projection polynomials than the traditional CAD projection. The original NLSAT algorithm reduced the set a little, while Brown's single cell construction reduced it much further still. However, the shape and size of the cell produced depends on the order in which the polynomials are considered. This paper proposes a method to construct such cells levelwise, i.e. built level-by-level according to a variable ordering. We still use a reduced number of projection polynomials, but can now consider a variety of different reductions and use heuristics to select the projection polynomials in order to optimise the shape of the cell under construction. We formulate all the necessary theory as a proof system: while not a common presentation for work in this field, it allows an elegant decoupling of heuristics from the algorithm and its proof of correctness.


Introduction
In this paper we present a new method to construct around a sample point a single cylindrical cell that is truth-invariant for a set of polynomial constraints. We demonstrate how the new method allows for improved decision procedures to determine the satisfiability of formulae involving such constraints. We use a proof system presentation for our method, which we consider an important contribution for such algebraic decision procedures. We take this opportunity to explain the benefit of such a presentation to the symbolic computation community. This introduction continues with a broad overview of context for the contribution, followed by the plan of the paper.  We are concerned with non-linear real arithmetic, formulae which are Boolean combinations of polynomial constraints with rational coefficients. This is a powerful logic that can express a wide variety of problems. This logic admits quantifier elimination [40], i.e. any quantified formula in the logic may be replaced by an equivalent quantifier-free one. In this paper we restrict our attention to the problem of determining the satisfiability of such formulae, that is quantifier elimination when all variables are existentially quantified.
The most commonly used complete methods here are based on the idea of the Cylindrical Algebraic Decomposition (CAD) introduced by Collins [16]. A CAD is a finite decomposition of R^n into cells, traditionally produced relative to a set of polynomials in n variables such that each polynomial has constant sign on each cell. It thus allows us to use a finite set of sample points (one for each cell) to study sign-constraints on those polynomials over the infinite space R^n. The CAD method offered the first tractable approach to real quantifier elimination and found numerous applications in the years that followed. However, its practical use is restricted by a doubly exponential worst case complexity in the number of variables [20], which is often felt in practice: the algorithm makes use of iterated resultants [16] leading to polynomials of doubly-exponential degree.
It was soon realised that a CAD encoded far more information than needed even for quantifier elimination: a CAD for a set of polynomials can be used to study any logical formula built from those polynomials, not just the one of interest. Some progress has since been made in adapting the core CAD algorithm to the logical formula, e.g. [17,22], but these were only partial solutions.

NLSAT, MCSAT and single cylindrical cells
A novel framework for satisfiability modulo theories (SMT) solving was introduced with the NLSAT algorithm of Jovanović and de Moura [26] in 2012. This has since been generalised into the model constructing satisfiability calculus (MCSAT) framework [36] and has been applied to other logics such as non-linear integer arithmetic [24]. In MCSAT the searches at the Boolean and theory levels are carried out concurrently, mutually guided by each other away from unsatisfiable regions. Partial solution candidates for the Boolean structure and for the corresponding theory constraints are constructed incrementally in parallel, with Boolean conflicts generalised using propositional resolution and, for real algebra, theory conflicts by CAD technology.
For the latter, when a theory model (sample point) is determined not to satisfy all those constraints which should hold according to the current Boolean search, we seek to guide the future search by an explanation which is generalised from the sample point to a region containing the point, on which the same constraints fail for the same reasons. This can be achieved by having the polynomials involved in the combination of constraints which cause the failure all have invariant sign upon this region. Such regions are constructed as cylindrical algebraic cells, but they are not necessarily cells from the CAD that would be built for the problem. Instead, they are usually larger since not all constraints are involved in every conflict. The exclusion of the cell is learned by adding a new clause: the negation of the algebraic description of the cell.
This motivates the optimisation of sub-algorithms to produce single cells from a point and a set of polynomial constraints. Savings can be made not just by building cells with only a subset of the constraints, but also by restricting the combinations of those constraints that we do consider in reference to the current model. Until now, the state-of-the-art approach has been that in [11]. We continue this research in the present paper by developing a new method for single cell construction.

First contribution: Proof system presentation
In contrast to [11], our new method allows for different choices of how to construct the cell. We will describe and evaluate some of these choices; however, it is important to distinguish that area of work from the broader method to build the cell introduced in the next subsection. The choices do not affect the correctness of the cell produced: in all cases the cell meets the essential criteria of containing the sample and being invariant for the truth of the conflicting constraints. Nor do the choices have an effect on high level measures of complexity: they achieve similar reductions in algebraic work. However, it can be observed that the choices do greatly affect the cells produced and thus the performance of the algorithm, and so it is worth trying to make an optimal choice. We make these choices heuristically, i.e. using methods not guaranteed to give an optimal answer but hopefully giving a reasonable answer quickly. To expedite and simplify future research on heuristics it is helpful to clearly separate out these heuristic choices from the broader algorithm and its proof of correctness.
To achieve that, we present our work as a proof system. Such a presentation clearly achieves the separation of heuristic decisions from the actions that need to take place for the correctness of the method. Essentially, we must find a chain of proof rules to prove our desired property, and where there is freedom in the chain to be built we can employ a heuristic method. The system is flexible, extensible, allows for detailed optimizations without changing the fundamental algorithm, and allows for correctness proofs to be partitioned nicely. We note that this is not just a presentation for the purpose of the paper, but is also present in the underlying implementation we report on. We acknowledge that a proof system presentation is uncommon in symbolic computation. It is more prevalent in the SAT and SMT communities where there is more intense work on the optimisation of such heuristic choices and proof systems are an established presentation method. However, such a system has not been used before for CAD theory, even when deployed in the SMT context. We view our proof system presentation as a contribution in its own right, which allowed for greater exploration of heuristic choices in our work. We also hope the wider symbolic computation community may find it interesting as a potential new tool to use elsewhere.

Second contribution: Levelwise single cells
The current state-of-the-art for single cylindrical cell construction is that of Brown and Košta [11]. This constructs the single cell gradually, processing one polynomial at a time, initialising the cell as the entirety of R n and then gradually refining it according to the sign of each polynomial considered. For each refinement the method needs to consider only the interaction of the next polynomial with the ones currently defining the cell, rather than all those that went before, which allows for savings compared to the original approach used in [26]. However, this approach introduces a sensitivity to the order in which polynomials are considered. A machine learning approach to select the order was considered in [10].
The alternative method contributed in this paper allows the single cell to be produced incrementally by level (i.e. dimension / variable). This removes the direct sensitivity to the polynomial ordering, instead introducing at each level decisions about which polynomials to use first. This approach uncovers a greater range of decisions than the polynomial ordering, and allows for more reasoned heuristics than black-box machine learning. By allowing for these better heuristic decisions we can produce better cells, in turn improving the performance of algorithms which use them.

Plan of the paper
We continue in Section 2 by introducing the necessary preliminaries and notation used, followed by background material on CAD. Then in Section 3 we present the existing state of the art in single cell construction and an informal motivation for our new levelwise approach to single cell construction.
In Section 4, a new CAD property (projective delineability) and the theory around it is presented: we need this to prove the correctness of optimisations utilised by our new algorithm. Then in Section 5 we establish the proof system, and in Section 6 we present our new algorithm and some heuristics that may be used with it.
In Section 7 we give some qualitative analysis on our new method and the heuristics. Then an experimental evaluation on the use of the new method for explanation generation in MCSAT is given in Section 8. Finally, we conclude in Section 9 with an outlook on further research and open questions.

Preliminaries
Let N denote the set of all natural numbers including 0, N >0 = N \ {0}, Q be the rational numbers, and R be the real numbers. For i, j ∈ N with i < j, we define the sets of integers [i..j] = {i, . . . , j} and [i] = [0..i]. For i, j ∈ N >0 , j ≤ i and r ∈ R i , we denote by r j the j-th component of r and by r [j] the vector (r 1 , . . . , r j ). For a tuple t = (a, b, c, . . .) we denote by t.a, t.b, t.c, . . . the corresponding tuple entries.
Let f : D → E be a function, then the domain D of f is denoted by dom(f ), and the restriction of f to A ⊆ D is denoted by f | A (i.e. dom(f | A ) = A). Let f, g : D → E and let < be a total order on E. We write f < g if f (d) < g(d) for all d ∈ D and f ≤ g if f (d) ≤ g(d) for all d ∈ D.

Variables and polynomials
We assume that the reader is familiar with the common definitions and terminology related to polynomials. We will introduce some notation in this section; for further reading, we refer to [19].
We work with the variables x 1 , . . . , x n with n ∈ N >0 under a fixed ordering x 1 ≺ x 2 ≺ . .. ≺ x n . A polynomial is built from a set of variables and numbers from Q using addition and multiplication. We use Q[y] to denote univariate polynomials in some variable y and Q[x 1 , . . . , x i ] for multivariate polynomials in those variables. We say that a polynomial p is of level j (denoted as level(p) = j) if x j is the largest variable appearing in p: i.e. either j = 0 and p ∈ Q; or j ∈ [1 ..n] and p ∈ Q[x 1 , . . . , x j ] \ Q[x 1 , . . . , x j−1 ].
Assume in the following some i ∈ [n] and polynomials p, q ∈ Q[x 1 , . . . , x i ].
We use realRoots(p) ⊆ R i to denote the set of real roots of p, deg xj (p) to denote the degree of p in x j , ldcf xj (p) the leading coefficient of p in x j , factors(p) to denote the irreducible factors of p, disc xj (p) to denote the discriminant of p with respect to x j , and res xj (p, q) to denote the resultant of p and q with respect to x j .
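As a small illustration of this notation, consider n = 2 and the two polynomials below. They are chosen here purely for illustration and are not taken from the paper's examples; the resultant is stated under the usual sign convention.

```latex
\begin{align*}
p &= x_1^2 + x_2^2 - 1, \qquad q = x_2 - x_1, \qquad
  \operatorname{level}(p) = \operatorname{level}(q) = 2,\\
\deg_{x_2}(p) &= 2, \qquad \operatorname{ldcf}_{x_2}(p) = 1,\\
\operatorname{disc}_{x_2}(p) &= 0^2 - 4 \cdot 1 \cdot (x_1^2 - 1) = 4 - 4x_1^2,\\
\operatorname{res}_{x_2}(p, q) &= p(x_1, x_1) = 2x_1^2 - 1.
\end{align*}
```

The discriminant vanishes at x_1 = ±1, where the two real roots of p in x_2 merge, and the resultant vanishes at x_1 = ±1/√2, where the circle p = 0 meets the line q = 0.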

Real algebraic numbers, constraints and cells
Real algebraic numbers are real roots of univariate polynomials with rational coefficients. Although we will not distinguish between real and real algebraic numbers in the following for simplicity, the algorithms are complete when restricting all choices of constants to real algebraic numbers.
A subset of R i for some i ∈ [n] is called semi-algebraic if it is the solution set of a Boolean combination of polynomial constraints. A cell is a non-empty connected subset of R i for some i ∈ [n]. A cell is called algebraic if it is a semi-algebraic set.
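To simplify the notation throughout this paper, for a set R ⊆ R^i with i ∈ [0..n] and functions f, g : R → R with f < g, we define R × (f, g) = {(r, r') | r ∈ R, f(r) < r' < g(r)} and R × f = {(r, f(r)) | r ∈ R}. If R is a cell and f, g are continuous, then these sets are cells (as a continuous image of a connected set is connected).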
The sign of r ∈ R, denoted sgn(r), is defined to be 1 if r > 0, −1 if r < 0, and 0 otherwise.

CAD definition
A decomposition is called algebraic if its cells are algebraic. A decomposition D of R^n is called cylindrical over a decomposition D' of R^m, m < n, if all projections of cells R ∈ D onto R^m are themselves cells in D', i.e. the cells in R^n stack up in cylinders over the cells in R^m. A cylindrical algebraic decomposition (CAD) is produced relative to a variable ordering: it is an algebraic decomposition D such that there exists a sequence of algebraic decompositions (D_1, …, D_n) with D_n = D, where each D_i is a decomposition of R^i that is cylindrical over D_{i−1} for i ∈ [2..n].
A decomposition inherits the invariance properties for polynomials and constraints defined above if they apply to all its cells. A CAD will usually be computed relative to an input polynomial set to ensure such an invariance property. Collins [16] first introduced the notion of a CAD and an algorithm for computing a sign-invariant CAD in the 1970s. A central notion for this algorithm is delineability.

Definition 2.1 (Delineability [16]).
Let i ∈ N, R ⊆ R^i be a cell, and p ∈ Q[x_1, …, x_{i+1}] \ {0}. The polynomial p is called delineable on R if and only if there exist finitely many continuous functions θ_1, …, θ_k : R → R (for some k ∈ N) such that:
• θ_1 < … < θ_k;
• the set of real roots of the univariate polynomial p(r, x_{i+1}) is {θ_1(r), …, θ_k(r)} for all r ∈ R; and
• there exist constants m_1, …, m_k ∈ N_{>0} such that for all r ∈ R and all j ∈ [1..k], the multiplicity of the root θ_j(r) of p(r, x_{i+1}) is m_j.
The functions θ_1, …, θ_k are called the real root functions of p on R, the sets R × θ_j for j ∈ [1..k] are called p-sections over R, and the sets R × (−∞, θ_1), R × (θ_k, ∞) (if k > 0), R × (θ_j, θ_{j+1}) for j ∈ [1..k − 1], and in case k = 0 also R × (−∞, ∞), are called p-sectors over R.
These notions are extended to finite sets of polynomials P ⊆ Q[x 1 , . . . , x i+1 ] \ {0} such that P is delineable on R if the product of the polynomials in P is delineable on R. Accordingly, we define the real root functions of P , the P -sections and P -sectors; for the empty polynomial set there are no sections and a single sector R i .
A CAD projection operator is a function proj that maps a set of polynomials P to a set of lower-level polynomials such that the sign-invariance of proj(P ) on R implies the delineability of P on R. The operator proj induces a sign-invariant CAD of P , recursively defined as follows: In case all polynomials in P are of level 1, then the CAD is the set of all sections and sectors of P . If the polynomials are of higher level, the CAD contains all sections and sectors of P over each cell of the CAD of proj(P ).

McCallum's projection operator
Although Collins' original projection operator is complete, the projection set is large and thus relatively inefficient. McCallum presented an improved operator by making the projection set smaller [32]. To do so, his proof of correctness relies not on sign-invariance but on the stronger property of order-invariance. Although a stronger property, the induced cells in McCallum's CAD are actually bigger than in Collins' CAD. In the following, we present a simplified version of the McCallum CAD projection (simplified as we do not describe the optimisation using delineating polynomials).
McCallum's theory relies on some notions which we will mention here only on the level of intuition: for more details, we refer to [30,32]. An i-dimensional (analytic) submanifold of R^n is a non-empty subset R ⊆ R^n that "looks locally like R^i". Given an open subset U ⊆ R^i, a function f : U → R is called analytic if it has a multiple power series representation [30] around each point of U. Given an i-dimensional submanifold R of R^n, a function f : R → R is called analytic if for all r ∈ R, R looks locally like R^i with respect to a coordinate system about r and f looks locally like an analytic function. In particular, R^i for i ∈ [1..n] is an analytic submanifold. To simplify notation, we say R^0 is an analytic submanifold. Given an analytic submanifold R ⊆ R^i and analytic functions f, g : R → R with f < g, the sets R × (f, g) and R × f are analytic submanifolds as well [30, Theorems 2.2.3 and 2.2.4]. Note that analytic submanifolds are cells.
Let p ∈ Q[x 1 , . . . , x n ] be a polynomial and r ∈ R n be a point. Then the order of p at r is defined as ord r (p) = min({k ∈ N | some partial derivative of total order k of p does not vanish at r} ∪ {∞}).
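The definition of the order can be made concrete with a small sketch. The representation of a polynomial as a dictionary from exponent tuples to coefficients, and the function names `derivative`, `evaluate` and `order_at`, are assumptions made here for illustration; they are not part of the paper or its implementation.

```python
from fractions import Fraction

def derivative(poly, var):
    """Partial derivative of a polynomial given as {exponent tuple: coefficient}."""
    out = {}
    for exps, coeff in poly.items():
        if exps[var] > 0:
            new = list(exps)
            new[var] -= 1
            key = tuple(new)
            out[key] = out.get(key, 0) + coeff * exps[var]
    return {k: c for k, c in out.items() if c != 0}

def evaluate(poly, point):
    """Exact value of the polynomial at a rational point."""
    total = Fraction(0)
    for exps, coeff in poly.items():
        term = Fraction(coeff)
        for x, e in zip(point, exps):
            term *= Fraction(x) ** e
        total += term
    return total

def order_at(poly, point):
    """Smallest k such that some partial derivative of total order k is nonzero at point."""
    if not poly:
        return float("inf")          # the zero polynomial has order infinity
    layer, k = [poly], 0             # layer = all partial derivatives of total order k
    while any(layer):
        if any(evaluate(d, point) != 0 for d in layer):
            return k
        layer = [derivative(d, v) for d in layer for v in range(len(point))]
        k += 1
    return float("inf")

# Illustrative polynomial p = x1^2 - x2^2 (two lines crossing at the origin)
p = {(2, 0): 1, (0, 2): -1}
```

For this p, `order_at(p, (1, 0))` is 0 (p does not vanish there), `order_at(p, (1, 1))` is 1 (a regular point of the variety), and `order_at(p, (0, 0))` is 2 (the crossing point), so p is sign-invariant but not order-invariant on its variety.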
We call p order-invariant on R ⊆ R^n if ord_r(p) = ord_{r'}(p) for all r, r' ∈ R. Note that if p has no root in R, then ord_r(p) = 0 for all r ∈ R and p is trivially order-invariant on R. Note also that order-invariance implies sign-invariance. Additionally, the notion of delineability is extended to analytic delineability, which is only defined on connected analytic submanifolds and the real root functions are required to be analytic.
McCallum's operator requires a smaller projection set than Collins' operator. It does so by maintaining the stronger property of order-invariance instead of sign-invariance; however, order-invariance can only be concluded if no polynomial is nullified on a point in the underlying cell.
The McCallum projection operator is thus incomplete. We note the recent validation of Lazard projection in [35] as an alternative to McCallum projection which is complete and is no bigger than McCallum except for those cases where McCallum cannot be applied [12]. We choose to formalise here with McCallum projection as it is extended to optimisations such as equational constraints [33] [22] which are very powerful in practice (see Section 5.6) and the existing single cell approach [11] is also based on McCallum. We expect that our work could be reformulated in Lazard projection if so desired.

Computing a CAD
In many applications, a decomposition of a set of polynomials P is not computed explicitly but instead the projection of P is used to generate sample points for every sign-invariant cell that would be formed in such a decomposition. The generation of cells / samples from the projection is called lifting. Polynomials can be evaluated at these sample points to check the satisfiability of constraints and formulae which involve them.
For representing cells explicitly, we need to give witnesses for the real root functions θ_j : R → R from Definition 2.1 defining the sectors and sections over a cell R. Let i ∈ N, p ∈ Q[x_1, …, x_{i+1}], level(p) = i + 1, and j ∈ N_{>0}. The indexed root expression root_{x_{i+1}}[p, j] : R^i → R ∪ {undef} is the j-th real root of p in x_{i+1} at the given sample if it exists, and undef otherwise. That is, for each s ∈ R^i: root_{x_{i+1}}[p, j](s) is undef if j > |realRoots(p(s, x_{i+1}))| or p(s, x_{i+1}) = 0, and otherwise ξ_j where realRoots(p(s, x_{i+1})) = {ξ_1, …, ξ_k} and ξ_1 < … < ξ_k.
Indexed root expressions of this form are also called indexed root expressions of level i + 1. Assuming ξ denotes the above indexed root expression, we use ξ.p to refer to p and ξ.j to refer to j.
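The behaviour of an indexed root expression at different samples can be sketched as follows. This is a minimal illustration restricted to polynomials of degree at most 2 in the main variable and using floating-point arithmetic instead of real algebraic numbers; the helper names `real_roots_quadratic` and `indexed_root` and the example polynomial are assumptions made here, not the paper's implementation.

```python
import math

UNDEF = None  # stands for the value "undef"

def real_roots_quadratic(a, b, c):
    """Sorted distinct real roots of a*x^2 + b*x + c (coefficients not all zero)."""
    if a == 0:
        return [] if b == 0 else [-c / b]
    d = b * b - 4 * a * c
    if d < 0:
        return []
    if d == 0:
        return [-b / (2 * a)]
    r = math.sqrt(d)
    return sorted([(-b - r) / (2 * a), (-b + r) / (2 * a)])

def indexed_root(coeffs_at, j):
    """Return the map s -> j-th real root of p(s, x) (1-based), or UNDEF."""
    def expr(s):
        a, b, c = coeffs_at(s)
        if a == b == c == 0:          # p(s, x) is the zero polynomial
            return UNDEF
        roots = real_roots_quadratic(a, b, c)
        return roots[j - 1] if j <= len(roots) else UNDEF
    return expr

# Illustrative polynomial p = x1^2 + x2^2 - 1, viewed as a polynomial in x2:
# coefficients (a, b, c) of x2^2, x2, 1 after substituting the sample s = (x1,)
circle = lambda s: (1.0, 0.0, s[0] ** 2 - 1.0)
lower = indexed_root(circle, 1)
upper = indexed_root(circle, 2)
```

At the sample (0.0,) the two expressions give -1.0 and 1.0. At (1.0,) the roots merge, so `lower` is defined while `upper` is undef, and at (2.0,) both are undef. This shows how the existence of a root with a given index depends on the sample.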
In the algorithms presented in this paper, we only need to evaluate the indexed root expressions for real algebraic numbers. Note that the existence and index of a real root function depends on the given sample. Thus, the same indexed root expression may refer to different real root functions at different sample points.

Definition 2.4 (Symbolic intervals, single cell and CAD data structures).
A symbolic interval I of level i is either (i) of the form I = (sector, l, u) where l is either an indexed root expression of level i or −∞, and u is either an indexed root expression of level i or ∞; or (ii) of the form I = (section, b) where b is an indexed root expression of level i. The polynomials I.l, I.u respectively I.b are the defining polynomials of I (if they exist).
A single cylindrical cell data structure is a sequence R = (I_1, …, I_n) of symbolic intervals I_i of level i. For an empty single cell data structure, we define setOf(()) = {()}. For a single cell data structure (I_1, …, I_i), i ≥ 1, we define the corresponding subset of R^i as setOf(I_1, …, I_i) = setOf(I_1, …, I_{i−1}) × (I_i.l, I_i.u) if I_i is a sector and setOf(I_1, …, I_i) = setOf(I_1, …, I_{i−1}) × I_i.b if I_i is a section, provided the respective indexed root expressions are not undef, and setOf(I_1, …, I_i) = undef otherwise. Similarly, we define setOf(R, I_i) for arbitrary subsets R ⊆ R^{i−1} and setOf(r, I_i) for points r ∈ R^{i−1}.
A CAD data structure D is a set of single cell data structures. We define setOf(D) = {setOf(R) | R ∈ D}.
In our algorithms, indexed root expressions occurring in a single cell data structure will be always defined; the above definition covers the undef case just for the sake of completeness. Furthermore, note that given a single cell data structure R = (I_1, …, I_i), the restrictions of I_i.l, I_i.u and I_i.b (whichever exist) to setOf(I_1, …, I_{i−1}) are real root functions of their defining polynomials, like the θ_j in Definition 2.1.
The set of indexed root expressions of p at s is defined as irExpr(p, s) = {root_{x_{i+1}}[p, j] | j ∈ N_{>0}, root_{x_{i+1}}[p, j](s) ≠ undef}. Let P ⊆ Q[x_1, …, x_{i+1}] be a set of polynomials of level i + 1. The set of indexed root expressions of P at s is defined as irExpr(P, s) = ⋃_{p∈P} irExpr(p, s).
Let ξ ∈ R. The set of indexed root expressions of P for s and ξ is defined as …
A description of a sign-invariant CAD defined by a projection operator with respect to a set of polynomials can be computed as follows. First the projection operator is applied from level n to 1, called the projection phase. This is followed by the lifting phase where, starting from level 1, all cells from level i − 1 are extended to level i such that all sections and sectors in the cylinder above every cell of level i − 1 are identified as cells. To achieve this, for each cell R of dimension i − 1, the polynomials on level i are partially evaluated up to their last dimension using a sample s from R. This results in univariate polynomials whose roots can be isolated and sorted to give intervals: point intervals at the roots of the polynomials, the open intervals between them, and the two open intervals below and above all roots. Delineability allows the conclusions drawn at the sample to generalise over R. For each interval a symbolic description is determined, and these descriptions together extend R to level i.
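The interval structure produced in one lifting step can be sketched as below. The function name `intervals_from_roots` and the tuple encoding are assumptions made here for illustration, and plain numbers stand in for the isolated real algebraic roots.

```python
def intervals_from_roots(roots):
    """Sections and sectors of the real line induced by the given root values."""
    xs = sorted(set(roots))
    bounds = [float("-inf")] + xs + [float("inf")]
    cells = []
    for lo, hi in zip(bounds, bounds[1:]):
        cells.append(("sector", lo, hi))      # open interval between consecutive bounds
        if hi != float("inf"):
            cells.append(("section", hi))     # point interval at a root
    return cells
```

With k distinct roots this yields 2k + 1 intervals: k sections and k + 1 sectors. For instance, the roots -1 and 1 give five intervals, and an empty root list gives the single sector covering the whole line.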

Single cell computation
For our problem of study, we are not interested in computing a full sign-invariant CAD, but only a single sign-invariant algebraic cell. That is, given a set of polynomials P ⊆ Q[x_1, …, x_n] and a sample s ∈ R^n, we need to compute a cell R ⊆ R^n such that s ∈ R and P is sign-invariant on R.
In this paper, we will focus on the computation of algebraic cells that adhere to Definition 2.4. These cells have a triangular algebraic description with respect to the variable ordering (i.e. a condition on x_1 alone; then one on (x_1, x_2), and so on). Such cells are called (locally) cylindrical [11,2] as this shape is implied by the cylindricity property of a full CAD. Although we do not construct entire decompositions in this paper, we make use of CAD theory which results in cells having this property. Such cells have the advantage of being easy to visualise and compute with. For example, projection with respect to the variable ordering is trivial, as is checking whether a point is inside such a cell, which is particularly important for our use of such cells.
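The membership check mentioned above follows the triangular description one level at a time. The sketch below is an illustration under simplifying assumptions: bounds are given as Python functions of the lower coordinates (standing in for evaluated indexed root expressions), `None` stands for an infinite bound, and the name `in_cell` and the example cell are hypothetical.

```python
import math

def in_cell(cell, point):
    """Check level by level whether `point` lies in a cell given as a list of intervals."""
    for i, interval in enumerate(cell):
        below, x = point[:i], point[i]
        if interval[0] == "section":
            if x != interval[1](below):
                return False
        else:
            _, lo, hi = interval
            if lo is not None and not lo(below) < x:
                return False
            if hi is not None and not x < hi(below):
                return False
    return True

# Hypothetical cell: -1 < x1 < 1 and -sqrt(1 - x1^2) < x2 < sqrt(1 - x1^2)
disc_cell = [
    ("sector", lambda r: -1.0, lambda r: 1.0),
    ("sector", lambda r: -math.sqrt(1 - r[0] ** 2), lambda r: math.sqrt(1 - r[0] ** 2)),
]
```

Because the check proceeds from the lowest level upwards and stops at the first violated condition, the bounds of level i are only evaluated at points that already lie in the projection of the cell onto the first i − 1 coordinates. Here `in_cell(disc_cell, (2.0, 0.0))` returns False at level 1 without evaluating the square roots.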

Previous work on optimizing single cell computations
As described in Section 1.2, Jovanović and de Moura use a single cell construction for model explanation in their algorithm to decide satisfiability for non-linear arithmetic [26]. This was based on Collins' CAD projection operator [16] but did not compute a full CAD projection: it left out coefficients based on the current sample (as we only need consider subsequent coefficients when the leading coefficient vanishes).
Brown enhanced this single cell construction, first in the open case [7] (i.e. building only cells with maximal dimension) and then for the general case [11] in collaboration with Košta. This work was based on McCallum projection theory, and in addition to coefficients also avoided some computation of resultants and discriminants based on the current sample. The work led in turn to the consideration of entire decompositions built one cell at a time [8] and their use for quantifier elimination [9].
The approach in [8] started with a single cell data structure describing the whole of R n which was continually refined by merging in new polynomials one at a time. The process maintains several invariants: (1) the cell contains the sample point s; (2) the cell is cylindrical; and (3) all polynomials merged so far have constant sign (more precisely, constant order) in the cell. The polynomials are merged one-by-one, and the order in which they are merged affects the cell that gets produced. Figure 1 illustrates this process for Example 3.1 below, showing different results for two different orderings of the polynomials. From now on we will refer to this approach as refinement-based single cell construction. The merge operation, unless on the lowest level, projects the current polynomial p and then calls itself iteratively on the projection result. When the call returns to p, the polynomial is used to refine the bounds of the cell in the dimension that corresponds to the level of p. This is done by isolating the roots of the univariate polynomial resulting from partially evaluating p up to its last dimension using the sample s. If there is a root closer to s i than the current bound then it becomes the new bound of the sector. In the special case that a root coincides with s i , the sector collapses into a section described by said root. By this recursive refinement, the transitivity of the ordering on real root functions of polynomials induced by the order-invariance of resultants is exploited, so that at most two resultants are calculated per merge-operation and polynomial. Furthermore, as soon as a section is identified, some superfluous discriminants can be detected, and using the current sample irrelevant coefficients can be identified.
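The bound update in a single dimension described above can be sketched as follows. This only models the comparison of roots with the sample coordinate; the recursive projection, the bookkeeping of defining polynomials and the resultant computations are not modelled. The function name `refine` and the tuple encoding are assumptions made here.

```python
def refine(bounds, roots, s_i):
    """One refinement step in a single dimension.

    bounds: ("sector", lo, hi) with None for an infinite bound, or ("section", b).
    roots:  real roots of the merged polynomial after substituting the sample.
    s_i:    coordinate of the sample in this dimension.
    """
    if bounds[0] == "section":
        return bounds                      # a section cannot be refined further
    _, lo, hi = bounds
    for root in roots:
        if root == s_i:
            return ("section", root)       # the sector collapses into a section
        if root < s_i and (lo is None or root > lo):
            lo = root                      # closer root below the sample
        if root > s_i and (hi is None or root < hi):
            hi = root                      # closer root above the sample
    return ("sector", lo, hi)
```

Starting from the unbounded sector with sample coordinate 0, merging roots -2 and 3 gives the sector (-2, 3); merging roots -1 and 5 afterwards tightens only the lower bound, giving (-1, 3).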
These are the polynomials studied in Figure 1 which demonstrates the refinement-based single cell construction process for two different orderings. We see that one produces a larger cell than the other. It is not obvious how to choose good orders for adding polynomials in this approach (a machine learning heuristic was presented in [10]). Moreover, because the method works by completely refining the cell by the chosen polynomial, the order in which lower-level polynomials resulting from projections are added is not very flexible.
Our approach will build a single cell levelwise, i.e. constructing the cell level-by-level according to a variable ordering instead of polynomial-by-polynomial for all levels. We note there exists another levelwise single cell construction in an unpublished work on the arXiv [29]. In comparison, our formulation using a proof system enables the use of more sophisticated optimizations in the following.

Our levelwise approach to single cell computation
Our new levelwise approach is based on using information about the ordering of the roots of polynomials relative to the sample point s and to one another to make decisions during the cell construction process that are likely to lead to larger cells.
We will exploit said information and the transitivity of the ordering on the roots induced by order-invariance of resultants. Thus, before projection, the roots of the polynomials on the current level are isolated and ordered to see which resultants are required for maintaining sign- or order-invariance of the given polynomials.
Figure 2 shows how our new levelwise approach would operate on the formula from Example 3.1. We start by considering x_1 = s_1, the first coordinate of the sample s. We evaluate the polynomials at this coordinate and isolate the real roots to find ξ_1 = root_{x_2}[p_2, 1], ξ_2 = root_{x_2}[p_3, 1], ξ_3 = root_{x_2}[p_1, 1], and ξ_4 = root_{x_2}[p_2, 2]. We order these, along with the second coordinate of s, as in the top-left image.
Our sample lies in the interval for x 2 denoted I 2 = (sector, ξ 1 , ξ 2 ) in the top-right image. We thus know that, local to s 1 , the cell we want will be bounded from below by p 2 and from above by p 3 . We also know that the other section of p 2 and the section of p 1 are above the cell. In this figure the graphs of the polynomials are greyed out because the algorithm is not aware of them: it knows only their roots when evaluated at the sample. However, this is enough to allow the algorithm to infer behaviour around the sample, as illustrated by the partial thick lines.
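The step of ordering the roots and locating the sample among them can be sketched as follows. The numeric root values are hypothetical, chosen only to reproduce the ordering ξ_1 < s_2 < ξ_2 < ξ_3 < ξ_4 described in the text; the function name `locate` and the labelled-tuple encoding are likewise assumptions made here.

```python
def locate(labelled_roots, s_i):
    """Order the roots at the sample and find the interval containing s_i.

    labelled_roots: list of (value, polynomial name, root index).
    Returns (interval, below, above); below and above list the indexed roots
    lying below and above the sample coordinate, in increasing order.
    """
    ordered = sorted(labelled_roots)
    below = [r[1:] for r in ordered if r[0] < s_i]
    above = [r[1:] for r in ordered if r[0] > s_i]
    for value, name, j in ordered:
        if value == s_i:
            interval = ("section", (name, j))
            break
    else:
        lo = below[-1] if below else None      # closest root below the sample
        hi = above[0] if above else None       # closest root above the sample
        interval = ("sector", lo, hi)
    return interval, below, above

# Hypothetical root values with xi_1 < s_2 < xi_2 < xi_3 < xi_4
roots = [(-1.0, "p2", 1), (1.0, "p3", 1), (2.0, "p1", 1), (3.0, "p2", 2)]
interval, below, above = locate(roots, 0.0)
```

With these values the interval is the sector bounded below by the first root of p2 and above by the first root of p3, and the remaining roots of p1 and p2 are reported as lying above the cell, matching the situation described in the text.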
As we generalise from the sample we must ensure that ξ_1 and ξ_2 remain well-defined and that no root function crosses the symbolic interval described by I_2. To achieve this we compute a projection consisting of p_4 = disc_{x_2}(p_2) and p_5 = res_{x_2}(p_3, p_2) (also disc_{x_2}(p_1) and disc_{x_2}(p_3) but they do not have any real roots and so play no further role in this example). Crucially, because p_1 has only one section and that section lies above the cell of interest, the resultant of p_1 and the lower-boundary polynomial p_2 is not required or computed. The bottom-left figure shows how the zeros of the projection map onto the geometric features.
Figure 1: Refinement-based construction of a single cell for Example 3.1. In the upper row the original cell (the entire plane) is refined first by polynomial p_1, then by p_2, then by p_3. In the lower row a different order of refinement is used: p_3, p_2, p_1, giving a larger cell. Note that in the figures of this paper we label the varieties of polynomials with their name, e.g. p_1 labels p_1 = 0.
So we see that for Example 3.2 the new levelwise approach produces the second of the two possible outcomes from the refinement-based construction (Figure 1). The example illustrates how the levelwise approach avoids "mistakes" that the refinement-based construction can make due to poor choices of orderings for the polynomials.
The levelwise approach also allows greater flexibility in how polynomials resulting from projection are handled, but that aspect of the algorithm requires more variables than in this simple example in order to be observed.

Projective delineability
It is already known that if a polynomial is not nullified at any point of a cell, then a projection consisting solely of the leading coefficient and the discriminant suffices to guarantee its delineability. In this section, we will explore a new notion called projective delineability (Definition 4.1), which the order-invariance of the discriminant alone is enough to guarantee. It is similar to delineability but allows for the existence of an additional special root function below or above all other roots that has singularities (exactly at the points where the leading coefficient vanishes) at which it approaches plus or minus infinity.
This new property is often enough for constructing a sign-invariant cell, as in the example depicted in Figure 3. We will utilise this notion in Section 5 to save leading coefficients.
The proofs of the lemmas in this section may be found in Appendix B.

Definition 4.2 (Analytic projective delineability).
Analytic projective delineability is defined as projective delineability in Definition 4.1, with the modification that it is only defined on connected analytic submanifolds (instead of general cells) and the functions θ 1 , . . . θ k are required to be analytic and θ * analytic on the connected components of its domain (instead of only continuous).
The key property, that order-invariance of the discriminant alone, without looking at the leading coefficient, is enough to guarantee projective delineability, is captured in Lemma 4.1 below.
Lemma 4.1. If $p$ is not nullified on any point in $R$ and $\mathrm{disc}_{x_{i+1}}(p)$ is order-invariant on $R$, then $p$ is analytically projectively delineable on $R$.
The following theorem states the relationship between delineability and projective delineability.
If $p$ is projectively delineable on $R$ and $\mathrm{ldcf}_{x_{i+1}}(p)$ is sign-invariant on $R$, then $p$ is delineable on $R$.
Note that we already defined real root functions of delineable polynomials on arbitrary cells in Definition 2.1; here, we define real root functions for the case of analytic projective delineability.
Figure 3: In this situation, projective delineability of the polynomial is enough for constructing a sign-invariant cell, i.e. the leading coefficient could be omitted from the projection.
, level(p) = i + 1, and θ : R → R be an analytic function. We call θ a real root function of p on R if p(r, θ(r)) = 0 for all r ∈ R.
If p is analytically projectively delineable on R and there exists a real root function θ of p on R such that θ(r) = ξ(r), then θ ξ,r denotes this (unique) real root function.
The following results are analogous to the known properties for delineability, showing how to lift order invariance and ensure delineability of sets of polynomials.
We first give a lemma proving order-invariance of polynomials for the non-trivial case where a polynomial is zero (remembering that a polynomial is trivially order-invariant on sets where it is non-zero). 1] is an analytic submanifold, p is not nullified on any point in R ↓ [i−1] , and p is analytically projectively delineable on R ↓ [i−1] , then p is order-invariant on R.
We next present a lemma for ensuring the order of two real root functions from different polynomials; if they stem from the same polynomial, projective delineability of this polynomial already suffices. Lemma 4.4. Let i ∈ N, R ⊆ R i be a connected analytic submanifold and p 1 , p 2 ∈ Q[x 1 , . . . , x i+1 ] irreducible and coprime such that level(p 1 ) = level(p 2 ) = i + 1, θ 1 , θ 2 : R → R be real root functions of p 1 and p 2 respectively, and ∼∈ {=, <, >}.

Proof system for single cells
In Section 6 we will present our new levelwise procedure to construct a single cell. We lay the basis for this by first formalising our work as a proof system in this section. The system expresses how the properties we want to conclude rely on other "smaller" properties. From this, the procedure in the next section gains maximal freedom to make heuristic choices. Before we go into the formal presentation, we outline the idea for our running example.

Motivating example
We return to the example of Figure 2, where we sought to construct a cell $R \subseteq \mathbb{R}^2$ around $s = (s_1, s_2)$ such that the sign-invariance of $p_1, p_2, p_3$ holds on $R$. We started by eliminating $x_2$: we isolated the real roots of $p_j(s_1, x_2)$ to define $\xi_1, \ldots, \xi_4$, sorted them, and observed that $\xi_1(s_1) < s_2 < \xi_2(s_1)$; thus we determined that in the second level the cell is represented by a symbolic interval $I_2$ with lower bound $\xi_1$ and upper bound $\xi_2$.
This root ordering and the representation should now be generalized such that the underlying cell $R_1$ is restricted by specifying certain properties ensuring that $\xi_1$ and $\xi_2$ remain as cell boundaries: they must remain well-defined over $R_1$; no further root of $p_1, p_2, p_3$ should appear that intersects $I_2$ over $R_1$; and the other roots, $\xi_3$ and $\xi_4$, should remain outside of the symbolic interval $I_2$ over $R_1$. For the latter, we extend to a partial ordering $\preceq$ with $\xi_2 \preceq \xi_3$ and $\xi_2 \preceq \xi_4$.
To conclude that $p_1, p_2, p_3$ are sign-invariant in $R$, the proof system shows we need to maintain the following properties: the sample $s$ is included in $R$; $I_2$ describes the cell's boundaries on the current level; the underlying cell $R_1$ is a connected analytic submanifold; and $p_1, p_2, p_3$ are projectively delineable. For some polynomials we also need to ensure full delineability. Furthermore, we require that the partial ordering $\preceq$ is maintained over $R_1$.
To maintain these properties the proof system concludes that we must also prove order-invariance and sign-invariance of some coefficients, resultants, and discriminants of the input polynomials. After simplification, we find that on level 1 the polynomials $p_4 = \mathrm{disc}_{x_2}(p_2)$ and $p_5 = \mathrm{res}_{x_2}(p_3, p_2)$ must be kept sign-invariant. This leads to a representation $I_1 = (\mathrm{sector}, \xi_2, \xi_3)$ and then, since all polynomials are univariate and their zeros are algebraic numbers, no further projection or proof steps are necessary. The constructed cell is described by $R = (I_1, I_2)$.
The graph of the proof constructed is shown in Figure 4, in which the sign-invariance properties sgn_inv($p_1$), sgn_inv($p_2$), sgn_inv($p_3$) lead to $R = \mathrm{setOf}(R{\downarrow}_{[1]}, I_2)$ and $R{\downarrow}_{[1]} = \mathrm{setOf}(I_1)$. The exact proof rules used in that figure will be defined in the rest of this section.
Note how in the example there were multiple choices for $\preceq$. Choosing $\xi_2 \preceq \xi_3$ and $\xi_3 \preceq \xi_4$ would have been a valid choice, but that leads to a smaller cell (the one depicted in the top right of Figure 1). We will discuss ordering heuristics to make this choice in Section 7.1.

Proof system: Properties and rules
Our system is for constructing cells. We will define properties of the cell that is being constructed and rules of inference to infer that a property holds for a cell given some other properties and conditions. As indicated in the previous example, there may be different rules that can be used to prove a property.
The set Prop denotes the set of all properties of any level. For a set Q ⊆ Prop, we denote by Q| i the set of properties of level i and by Q| [i] the set of properties of level at most i.
Note that the concept of levels of polynomials has been extended to properties in the sense that the satisfaction of a property of level i cannot be influenced by values of the variables x i+1 , . . . , x n .
We denote a rule of inference in the form $\frac{A_1, \ldots, A_k}{Q}$, where $A_1, \ldots, A_k$ are the antecedents and $Q$ is the consequent.
For convenience, we allow properties to be lifted from lower levels to higher levels as follows.
for any $R \subseteq \mathbb{R}^j$, $j \ge i$. Accordingly, we introduce a rule of inference for this lifting. For example, this rule allows us to say that because the property of sign-invariance holds for $x^2 - 2$ in the one-level cell $-1 < x < 1$, sign-invariance holds for $x^2 - 2$ in the two-level cell which is the interior of the unit circle, since this projects down to $-1 < x < 1$.
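The lifting of a property can be illustrated with a small sketch. It assumes, purely for illustration, that subsets of real space are approximated by finite sets of sample points and that a property is a Python function from such a set to a Boolean; the names `sgn_inv` and `lift` are hypothetical helpers, and a finite grid can of course only suggest, not prove, invariance on the real cell.

```python
def sgn(v):
    return (v > 0) - (v < 0)

def sgn_inv(p):
    """Property: p takes a single sign on every point of the given finite set."""
    return lambda R: len({sgn(p(*pt)) for pt in R}) <= 1

def lift(prop, i):
    """Lift a level-i property to sets of longer points by projecting
    each point onto its first i coordinates."""
    return lambda R: prop({pt[:i] for pt in R})

p = lambda x: x * x - 2

grid = [k / 10 for k in range(-9, 10)]
interval = {(x,) for x in grid}                                   # samples of -1 < x < 1
disc = {(x, y) for x in grid for y in grid if x * x + y * y < 1}  # samples of the open unit disc

print(sgn_inv(p)(interval))        # level-1 property on the interval
print(lift(sgn_inv(p), 1)(disc))   # same property lifted to the disc
```

The lifted property only ever looks at the projection of the disc, which is why its truth value agrees with the level-1 property on the interval.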
The set of inference rules will be defined in such a way that the property of sign-invariance of a polynomial is reduced to the property of sign-invariance of polynomials at a lower level: thus the chain of rules is finite. On each level, we will additionally determine a representation describing the symbolic boundaries of the cell on this level as well as an ordering of the indexed roots of some polynomials assuring their sign-invariance which will be used later for heuristic decisions.

Basic cell properties
We will first define the basic properties on polynomials and cells that follow from the original work on McCallum CAD projection. Throughout, i will refer to the level of the given property and s refers to algebraic points. It is important to recall that an i-level property is a function that maps subsets of R i to Boolean values. Consider for example the first element of Property 5.1: "sample(s)" where s ∈ R i is a property, meaning that sample is a function that maps a point (and a level i that is given implicitly by the point) to a function mapping subsets of R i to Booleans. Thus, sample(s) is not a true/false value. Rather, it is the function sample(s) applied to a given subset of R i that yields a true/false value. Note that for the fifth and sixth items in Property 5.1 the level i must be given explicitly, while with the others the level is implicit in other arguments.
4. The property non_null($p$) holds on $R$ if and only if $p$ is not nullified on any point in $R$.
5. The property an_sub($i$) holds on $R$ if and only if $R$ is an analytic submanifold.
6. The property connected($i$) holds on $R$ if and only if $R$ is connected.
7. Let $p \in \mathbb{Q}[x_1, \ldots, x_{i+1}]$ be a polynomial of level $i+1$. The property an_proj_del($p$) holds on $R$ if and only if $p$ is analytically projectively delineable on some connected superset $R' \supseteq R$.
Note that an_proj_del($p$) is defined such that it holds on a subset $R \subseteq \mathbb{R}^i$ if and only if there is a connected superset of $R$ on which $p$ is analytically projectively delineable. This is due to the fact that in general, we cannot assume that $R$ is a connected set, but projective delineability is only defined on connected sets. We will use this trick in the following for other properties as well.
Further note that for some properties such as sgn_inv($p$), if they hold on $R \subseteq \mathbb{R}^i$, then they also hold on all subsets of $R$. For others such as an_sub($i$) or connected($i$), this is not the case: a subset of $R$ is not necessarily an analytic submanifold nor connected! This is one of the reasons why the proof rules below cannot cover all subsets where a given property holds, but they do cover sufficiently many subsets for our purposes.
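The reading of a property as a function that returns a function can be sketched as follows. As an assumption for illustration only, subsets of $\mathbb{R}^i$ are modelled by finite sets of points, so the sketch cannot express an_sub or connected; it only shows the currying and the fact that sgn_inv is inherited by subsets whereas sample is not.

```python
from itertools import combinations

def sgn(v):
    return (v > 0) - (v < 0)

def sample(s):
    """sample(s) is itself a function; only sample(s)(R) is true or false."""
    return lambda R: s in R

def sgn_inv(p):
    return lambda R: len({sgn(p(*pt)) for pt in R}) <= 1

R = frozenset({(0,), (1,), (2,)})
p = lambda x: x + 1
subsets = [frozenset(c) for k in range(len(R) + 1) for c in combinations(R, k)]

print(sample((1,))(R))                       # the sample lies in R
print(sample((1,))(frozenset({(0,)})))       # but not in every subset of R
print(all(sgn_inv(p)(S) for S in subsets))   # sign-invariance passes to all subsets
```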
For the remainder of this sub-section we will present proof rules for these properties (with proofs of correctness available in Appendix C). For the ease of navigation we give references to the rules which apply to prove each property above (and provide similar references after introducing further properties in later sub-sections).

References.
sample(s): Rule 5.13
ord_inv(p): Rule 5.3, Rule 5.4, Rule 5.5
sgn_inv(p): Rule 5.3, Rule 5.4, Rule 5.6, Rule 5.8, Rule 5.11
non_null(p): Rule 5.2
an_sub(i): Rule 5.7
connected(i): Rule 5.12
an_proj_del(p): Rule 5.1
In the following rule definitions we list assumptions before the actual inference rule. These assumptions are formally part of the inference rule, but we keep them separated to improve readability.
We start by defining rules that relate to the material in Section 4.
Note that the condition $\deg_{x_{i+1}}(p) > 1 \wedge \mathrm{disc}_{x_{i+1}}(p)(s) \neq 0$ already implies that $p(s, x_{i+1}) \neq 0$, as does $c_j(s) \neq 0$. In practice it is good to delay choosing which rule to use, because we may observe that one of the non-zero $c_j$'s is already required to be sign-invariant from some other part of the projection process (e.g. leading coefficients are often added to the projection). Moreover, the second case is trivial if the $c_j$ is constant.

Sign invariance for polynomials without roots
For polynomials without roots over the current sample, we must ensure that no further root appears over the underlying cell. Whenever the leading coefficient of such a polynomial is not zero at the underlying sample, projective delineability ensures that no further root appears. Otherwise, if the leading coefficient is zero, its sign-invariance must be maintained to prevent the special root function $\theta^*$ from Definition 4.1 from appearing over the underlying cell. Figure 5 illustrates the two cases.

Cell boundary representations
We are aiming to generate the description of a sign-invariant cell for the input polynomials. In the language of our proof system: at every level we identify a symbolic interval upon which we prove sign-invariance of the polynomials of that level. To do this, we take their real roots over the current sample into account.
We determine whether we are in a section or sector (i.e. whether a polynomial has a root exactly at the current sample point or not) and pick indexed root expressions (as given by Definition 2.5) to be symbolic descriptions of the cell boundaries that generalize a concrete interval around the sample point: either the root at the current sample or the largest root below and the lowest root above. Note that the choice of such boundaries might not be unique in the cases where two root functions cross over the underlying sample. We will discuss more general choices of the symbolic intervals in Section 6.2.
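The choice between section and sector, and the possible non-uniqueness of the boundaries, can be sketched as below. The encoding is an assumption for illustration: each real root over the sample is a pair of an exact value and a hypothetical label `(polynomial name, root index)` standing in for an indexed root expression, and rationals replace the algebraic numbers a real implementation would need.

```python
from fractions import Fraction

def boundary_candidates(roots, s_i):
    """roots: list of (value, label) for all real roots over the underlying sample.
    Returns the kind of interval and every label that could serve as a bound."""
    at_sample = [xi for v, xi in roots if v == s_i]
    if at_sample:
        return ("section", at_sample)
    below = [v for v, _ in roots if v < s_i]
    above = [v for v, _ in roots if v > s_i]
    lower = [xi for v, xi in roots if below and v == max(below)]
    upper = [xi for v, xi in roots if above and v == min(above)]
    return ("sector", lower or ["-oo"], upper or ["+oo"])

# Hypothetical roots: p1 and p2 cross at -1, so the lower bound is not unique.
roots = [
    (Fraction(-1), ("p1", 1)),
    (Fraction(-1), ("p2", 1)),
    (Fraction(2), ("p2", 2)),
    (Fraction(3), ("p3", 1)),
]
print(boundary_candidates(roots, Fraction(0)))
print(boundary_candidates(roots, Fraction(2)))
print(boundary_candidates(roots, Fraction(5)))
```

For the first call both `("p1", 1)` and `("p2", 1)` are returned as lower-bound candidates, which is the situation where two root functions cross over the underlying sample.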
Our next property is designed to state that a designated representation describes the cell boundaries on the current level.
The property repr($I$, $s$) holds on $R$ if and only if $I.l \in \mathrm{irExpr}(I.l.p, s)$ (if $I.l \neq -\infty$), $I.u \in \mathrm{irExpr}(I.u.p, s)$ (if $I.u \neq \infty$), respectively $I.b \in \mathrm{irExpr}(I.b.p, s)$, and one of the following holds: (for the real root functions θ l,s , θ u,s respectively θ u,b according to Definition 4.4).
References. repr(I, s): Rule 5.14
Since the cell's boundaries are described by indexed root expressions, we can prove it is an analytic submanifold.

Equational constraint projection
An optimisation to CAD which tailors it to the logical structure of the problem is the theory of equational constraints. A polynomial equation is an equational constraint if it is logically implied by the truth of the input formula. If the input to CAD has an equational constraint then we may perform fewer projection and lifting operations to achieve truth-invariance. This optimisation dates back to [33], with the recent paper [22] summarising the state of the art. The examples there demonstrate the drastic savings that are achievable in these cases (the analysis in [22] shows the double exponent in the complexity bound decreases for each equational constraint).
In the context of building a single cell we have even greater scope to use equational constraint savings. Here, any time we are generalising in the section case, we can apply the reduced projection and lifting. I.e., if a polynomial is zero at our sample then it must also be zero throughout the cell we are generalising the sample to, and so for the purposes of this cell, it is an equational constraint. In this case we need only make the section defining polynomial delineable. All other constraints can be made sign-invariant simply by including resultants with the section defining polynomial (but not other projection polynomials). . . , x i ], level(p) = i, and I be a symbolic interval of level i. Assume that p is irreducible, and I = (section, b). Let

Root orderings
We define a partial ordering on the indexed root expressions of the given set of polynomials, which we will use to ensure that none of their roots cross the cell's boundary on the current level.
We first give a general definition of indexed root orderings.

Definition 5.3 (Indexed root ordering).
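Let $i \in \mathbb{N}$, and $\Xi$ be a set of indexed root expressions of level $i+1$. An indexed root ordering on $\Xi$ is a relation $\preceq\ \subseteq \Xi \times \Xi$ such that its reflexive and transitive closure $\preceq^t$ is a partial order on $\Xi$. Indexed root orderings of this form are also called indexed root orderings of level $i+1$, and we define $\mathrm{dom}(\preceq)$ as the set of indexed root expressions occurring in $\preceq$.
Let $s \in \mathbb{R}^i$. An indexed root ordering $\preceq$ of level $i+1$ matches $s$ if and only if $\xi \in \mathrm{irExpr}(\xi.p, s)$ for all $\xi \in \mathrm{dom}(\preceq)$ and $\xi(s) \le \xi'(s)$ for all $(\xi, \xi') \in\ \preceq$.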
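The two conditions of this definition can be checked mechanically on small examples. The sketch below assumes that indexed root expressions are represented by plain strings, that the ordering is a set of pairs, and that a dictionary `value_at_s` (a hypothetical helper) maps each expression that is defined at the sample to its root value there; the domain of the ordering is taken to be the expressions occurring in its pairs.

```python
def closure(pairs):
    """Reflexive and transitive closure of a finite relation given as pairs."""
    dom = {x for pr in pairs for x in pr}
    rel = set(pairs) | {(a, a) for a in dom}
    changed = True
    while changed:
        changed = False
        for a, b in list(rel):
            for c, d in list(rel):
                if b == c and (a, d) not in rel:
                    rel.add((a, d))
                    changed = True
    return rel

def is_indexed_root_ordering(pairs):
    """The closure must be antisymmetric to be a partial order."""
    t = closure(pairs)
    return all(a == b or (b, a) not in t for a, b in t)

def matches(pairs, value_at_s):
    """Every expression must be defined at s and the pairs must agree with the values."""
    return all(
        a in value_at_s and b in value_at_s and value_at_s[a] <= value_at_s[b]
        for a, b in pairs
    )

# Hypothetical root values at the sample.
values = {"xi1": -2, "xi2": 1, "xi3": 2, "xi4": 3}
star = {("xi2", "xi3"), ("xi2", "xi4")}
chain = {("xi2", "xi3"), ("xi3", "xi4")}
print(is_indexed_root_ordering(star), matches(star, values))
print(("xi2", "xi4") in closure(chain))
print(is_indexed_root_ordering({("xi2", "xi3"), ("xi3", "xi2")}))
```

Both `star` and `chain` order `xi2` below `xi4`, but only `chain` additionally commits to an order between `xi3` and `xi4`.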
In our algorithm we will pick an indexed root ordering such that, in the sector case, roots lower than the interval's lower bound remain lower, and roots greater than the interval's upper bound remain greater. In the section case, the lower and upper bounds in that condition both refer to a single bound. We do not give an explicit definition here yet, as it will be part of the rules defined below.
The motivation for introducing the general concept of indexed root orderings is to allow for choices between different root orderings. We will discuss them in Section 6.2. Now, we must maintain that an indexed root ordering holds over the underlying cell. First, we will define this property formally. This is not straightforward, as we allow an optimisation whereby the referred root functions may disappear (i.e. not be defined) over parts of the cell, so long as this does not change the ordering.
Property 5.3. Let $i \in \mathbb{N}$, $R \subseteq \mathbb{R}^i$, $s \in \mathbb{R}^i$, $\preceq$ be an indexed root ordering of level $i+1$, and $\preceq^t$ be the reflexive and transitive closure of $\preceq$.
The property ir_ord($\preceq$, $s$) holds on $R$ if and only if $\preceq$ matches $s$ and for all $\xi, \xi' \in \mathrm{dom}(\preceq)$ it holds that and (for the real root functions $\theta_{\xi,s}$, $\theta_{\xi',s}$ according to Definition 4.4).
References. ir_ord($\preceq$, $s$): Rule 5.9
Note that the semantics of an indexed root expression $\xi$ is only defined in combination with the sample $s$, as this uniquely determines the real root function $\theta_{\xi,s}$; thus, we add $s$ to the signature of the property. To clarify this further: the sample $s$ is only used to identify the referred real root functions and is not necessarily contained in the constructed cell by definition of the above property (although in the following proof rules, this will be required).
Figure 6, panels (a) to (c):
(a) In the simple case all three real root functions are defined over the underlying cell. Only two resultants are necessary to maintain the ordering of the root functions, e.g. $\mathrm{res}_{x_2}(\xi_1.p, \xi_2.p)$ and $\mathrm{res}_{x_2}(\xi_2.p, \xi_3.p)$. We conclude the ordering between $\xi_1$ and $\xi_3$ by transitivity.
(b) It is also possible that some real root functions disappear over the underlying cell. If this were the highest or lowest then it would not change the ordering. Subsequent ones could also disappear in addition without changing the ordering.
(c) If a real root function in the middle were to disappear alone (here $\xi_2$) then the ordering is maintained by the same resultants and transitivity: i.e. it must converge to $\pm\infty$ when disappearing and thus cross a neighbouring root function, which was protected by the resultant.
The property is maintained by projective delineability and adding resultants for the defining polynomials of a pair of indexed root expressions to the projection. Note that the property is defined in such a way that no leading coefficients need to be projected. We essentially prove that the ordering of root functions ensured by the order-invariance of resultants is transitive. Figure 6 shows the different cases whereby transitivity is maintained for three root functions.
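As a rough sketch of which resultants an ordering induces, the function below collects one unordered pair of defining polynomials per pair in the ordering. It assumes indexed root expressions are encoded as hypothetical `(polynomial name, root index)` tuples, and it skips pairs whose two expressions stem from the same polynomial, following the earlier remark that projective delineability of that polynomial already suffices in that case.

```python
def resultant_pairs(ordering):
    """ordering: set of pairs of indexed root expressions (poly, index).
    Returns the unordered pairs of distinct defining polynomials."""
    pairs = set()
    for (p, _), (q, _) in ordering:
        if p != q:
            pairs.add(frozenset((p, q)))
    return pairs

xi1, xi2, xi3 = ("p1", 1), ("p2", 1), ("p3", 1)

chain = {(xi1, xi2), (xi2, xi3)}   # xi1 below xi3 follows by transitivity
full = chain | {(xi1, xi3)}        # states the third relation explicitly

print(len(resultant_pairs(chain)))                      # two resultants
print(len(resultant_pairs(full)))                       # three resultants
print(resultant_pairs({(("p1", 1), ("p1", 2))}))        # same polynomial: none
```

The chain is the three-root situation of Figure 6: relying on transitivity saves the resultant of the defining polynomials of the outer two roots.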
Rule 5.9. Let $i \in \mathbb{N}$, $R \subseteq \mathbb{R}^i$, $s \in \mathbb{R}^i$, and $\preceq$ be an indexed root ordering of level $i+1$. Assume that $\xi.p$ is irreducible for all $\xi \in \mathrm{dom}(\preceq)$, and that $\preceq$ matches $s$.
sample($s$)($R$), an_sub($i$)($R$), connected($i$)($R$), $\forall \xi \in \mathrm{dom}(\preceq)$. an_proj_del($\xi.p$)($R$),
In the following, we need to make sure that certain indexed root expressions are well-defined over the underlying subset $R \subseteq \mathbb{R}^i$. To do so, we define a property for an indexed root expression stating that its graph is continuous on some connected superset of $R$.
The property well_def($\xi$, $s$) holds on $R$ if and only if $\xi \in \mathrm{irExpr}(\xi.p, s)$ and $\mathrm{dom}(\theta_{\xi,s}) \supseteq R$ for the real root function $\theta_{\xi,s}$ according to Definition 4.4.
References. well_def($\xi$, $s$): Rule 5.10
For maintaining this property, we make an elaborate case distinction resulting from the definition of projective delineability, and in some cases we need to fall back to making the corresponding polynomial delineable by adding its leading coefficient to the projection. The different cases are illustrated in Figures 7a to 7c. We note that this case distinction does not detect all superfluous leading coefficients, as shown in Figure 7d.
Figure 7, panel captions:
(a) The real root function that $\mathrm{root}_{x_2}[p_1, 2]$ at $s$ refers to is already well-defined by projective delineability of $p_1$, since it is neither the first nor last root of $p_1$ at $s$ and thus cannot be the special $\theta^*$ function from Definition 4.1.
As the leading coefficient of $p_1$ is zero at $s$, no indexed root expression refers to $\theta^*$. Thus all real root functions defined at $s$ are well-defined over the underlying cell by projective delineability of $p_1$.
1] does not refer to $\theta^*$ at $s$. However, this cannot be distinguished from the previous case and so a superfluous leading coefficient is added to the projection.
Rule 5.10. Let $i \in \mathbb{N}$, $R \subseteq \mathbb{R}^i$, $s \in \mathbb{R}^i$, and $\xi$ be an indexed root expression of level $i+1$. Assume that $\xi \in \mathrm{irExpr}(\xi.p, s)$.
$\frac{\text{sample}(s)(R),\ \text{an\_proj\_del}(\xi.p)(R),\ \mathrm{ldcf}_{x_{i+1}}(\xi.p)(s) = 0}{\text{well\_def}(\xi, s)(R)}$
$\frac{\text{sample}(s)(R),\ \text{an\_proj\_del}(\xi.p)(R),\ \text{sgn\_inv}(\mathrm{ldcf}_{x_{i+1}}(\xi.p))(R)}{\text{well\_def}(\xi, s)(R)}$
Now we are ready to define the rule for ensuring the sign-invariance of a polynomial $p$, given a representation $I$ and an indexed root ordering $\preceq$. The latter will be used to ensure that no real root function of $p$ that is defined at the current sample $s$ (i.e. which is "visible" at $s$ by real root isolation) crosses the cell's boundaries. In addition, we make sure that no further root function of $p$ emerges over the underlying cell. Thus, the many case distinctions follow from the fact that we allow the number of roots of a polynomial to vary over the underlying cell, maintaining only that the appearing roots do not cross the cell's boundary. The cases are depicted in Figure 8.
. . . , $x_i$], level($p$) = $i$, $I$ be a symbolic interval of level $i$, $\preceq$ be an indexed root ordering of level $i$, and $\preceq^t$ be the reflexive and transitive closure of $\preceq$.
Figure 8, panel captions:
(a) The indexed root ordering ensures that real root functions remain outside the cell. In the case where the green polynomial is delineable, this is sufficient to prove sign invariance.
(b) In the case where the green polynomial has a "$\theta^*$" root function over the underlying cell that disappears and reappears, the indexed root ordering does not say anything about "$\theta^*$" on other connected components of its domain. To prevent these from crossing the cell's boundary (e.g. the upper right root function), we might need to add the leading coefficients to the projection.
(c) However, if we know that there are well-defined root functions of the green polynomial between the cell's boundaries and "$\theta^*$", the leading coefficient is not needed, as by projective delineability, these intermediate root functions already protect the cell.
(d) Alternatively, an additional resultant with a polynomial with a (well-defined) root function below or above the cell can be added to prevent "$\theta^*$" root functions from crossing the cell's boundary. As this is expensive, this rule is only applied when this additional resultant is added anyway for maintaining the indexed root ordering, for instance if the cell's lower and upper bounds are defined by the same polynomial.
We note that this rule hides some of the complexity for an efficient decision: in the Q l and Q u cases, it is wise to choose the closest root to the lower respectively upper bound, as if there is more than one bound below/above the cell, their root functions are well defined on the underlying cell without adding the leading coefficient. The Q l other and Q u other cases do not add any extra properties if resultants are picked that are added due to the indexed root ordering anyway.

Connectedness
For maintaining properties (in particular the order-invariance of some polynomials) on higher levels, we need to maintain connectedness. Besides some preconditions, i.e. that the cell's boundaries are described by two welldefined real root functions, we need to ensure that the lower and upper bound of a sector do not cross as illustrated in Figure 9.
be an indexed root ordering of level i, and t be the reflexive and transitive closure of . Assume that matches s.
connected($i$)($R$)

Generalization of the current sample
We define a mapping for generalizing the sample to the cell on the current level.
Figure 9: Let $I = (\mathrm{sector}, \xi_1, \xi_2)$, then repr($I$, $s_1$) holds in the whole blue area, thus forming a non-connected subset of $\mathbb{R}^2$. By adding $\mathrm{res}_{x_2}(\xi_1.p, \xi_2.p)$ to the projection, the subset is made connected by restricting the underlying cell as indicated by the dashed vertical lines.

Cell descriptions
Now, we define a property that states that the cell is described by its representation.
Property 5. 5. Let i ∈ N >0 , R ⊆ R i , and I be a symbolic interval of level i.

The property holds(I) holds on R if and only if
This property is the only one that cannot be mapped to a set of other properties, i.e. properties of this kind are the assumptions in our proof system.
Given holds($I$), we can maintain the weaker property repr($I$, $s$). To do so, we need to ensure that the indexed root expressions describing the cell's boundaries always refer to the same root on the underlying cell by making their defining polynomials delineable (such that the referred real root function is defined and the number of roots below the referred root function is constant over the underlying cell).
Note that repr(I, s) could be maintained by a weaker property, namely that the roots defining the cell boundaries are well-defined and some analogous guarantee as given by holds(I). For the purpose of describing single cells, this is too weak; however, for other scenarios, this could avoid some more leading coefficients in the projection.

Ordering of properties
We observe that the rules of inference defined in Rules 5.1 to 5.14 are cycle-free in the sense that all properties in the antecedents are smaller than the consequent property according to some ordering:
Definition 5.4. The ordering is defined such that properties of level $i$ are greater than any property of level $i-1$ w.r.t. this ordering, and such that it satisfies the following partial order of properties of each level $i$ (starting with the greatest element):
5. ord_inv($p$) for all reducible $p$ of level $i$
6. ord_inv($p$) for all irreducible $p$ of level $i$
7. sgn_inv($p$) for all reducible $p$ of level $i$
8. sgn_inv($p$) for all irreducible $p$ of level $i$
9. connected($i$)
10. an_sub($i$)
11. sample($s$) for all $s \in \mathbb{R}^i$ of level $i$
12. repr($I$, $s$) for all $I$ of level $i$ and $s \in \mathbb{R}^{i-1}$
13. holds($I$) for all $I$ of level $i$
The proof rules introduced in this section are visualised in Figure 10, which shows the relationships between rules of different levels.

Levelwise construction of a single cell
So far, we presented an abstract proof system, i.e. a set of rules which allow us to construct a proof for signinvariance of some polynomials in a cell. In the following, we will give an algorithm for constructing a single cell in a levelwise manner by specifying how the proof rules are applied. We first give the algorithmic framework and then define heuristics needed to instantiate the algorithm.

Algorithm to construct single cell
Algorithm 1 constructs a single cell. The input to the algorithm is a set of polynomials for which a cell is to be computed on which these polynomials are sign-invariant. This initialises the set of properties Q on Line 1 which is maintained to contain properties that need to be fulfilled by the yet to be chosen representations on the lower levels. I.e. at the beginning of iteration i, the data structure (I 1 , . . . , I i ) needs to maintain all properties in Q. To do so, Algorithm 1 iteratively calls Algorithm 2 to process the properties and construct the interval on level i, taking note of any failure cases due to nullifications that might occur.
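The outer loop described above can be sketched as follows. This is only a control-flow skeleton under stated assumptions: `construct_interval` is a hypothetical callback standing in for Algorithm 2, properties are encoded as plain tuples, and the stub used in the usage example performs no algebra at all.

```python
FAIL = object()  # sentinel for the failure case (e.g. an unavoidable nullification)

def single_cell(polys, s, construct_interval):
    """Process levels n, n-1, ..., 1; at each level obtain an interval and the
    properties that the still-to-be-chosen lower levels must fulfil."""
    Q = {("sgn_inv", p) for p in polys}
    intervals = {}
    for i in range(len(s), 0, -1):
        result = construct_interval(Q, s, i)
        if result is FAIL:
            return FAIL
        intervals[i], Q = result
    return [intervals[i] for i in range(1, len(s) + 1)]

# Placeholder for Algorithm 2: names the interval and passes on no properties.
def stub_construct_interval(Q, s, i):
    return (f"I{i}", set())

def failing_construct_interval(Q, s, i):
    return FAIL if i == 1 else (f"I{i}", {("sgn_inv", "projected")})

print(single_cell({"p1", "p2"}, (0, 0), stub_construct_interval))
print(single_cell({"p1"}, (0, 0), failing_construct_interval) is FAIL)
```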
Algorithm 3 is a sub-algorithm that replaces a property in $Q$ by a set of properties $Q'$ induced by a proof rule if no such set is already contained in $Q$, and otherwise simply removes the property from $Q$. FAIL is returned here if and only if no proof rule is applicable, which is the case only when a polynomial is nullified (i.e. Rule 5.2) and equational constraint projection does not allow us to ignore that fact. The sets $Q'$ induced by a proof rule are determined in Line 1 as follows. Assume we want to derive the property non_null($p$) of $R \subseteq \mathbb{R}^i$ for some polynomial $p$ of level $i+1$ with respect to the current sample $s \in \mathbb{R}^i$. The algorithm would check if the conditions of the proof rules in Rule 5.2 are satisfied, and then determine the corresponding properties of $R$ which form the set $Q'$. In the first case, we check whether $p$ is irreducible, $\deg_{x_{i+1}}(p) > 1$, and $\mathrm{disc}_{x_{i+1}}(p)(s) \neq 0$ hold; if yes, we add the set $Q' = \{\text{sample}(s), \text{sgn\_inv}(\mathrm{disc}_{x_{i+1}}(p))\}$ to the set of choices. In the second case, we check whether $p$ is irreducible and there exists a coefficient $c_j$ of $p$ such that $c_j(s) \neq 0$ holds; if yes, we add the set $Q' = \{\text{sample}(s), \text{sgn\_inv}(c_j)\}$ to the set of choices.
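The enumeration of choices for non_null(p) just described can be sketched as below. The encoding is assumed for illustration: coefficients and the discriminant are passed in as Python callables of the sample coordinates, irreducibility is supplied as a flag rather than tested, and properties are plain tuples with hypothetical names such as `"c0"`.

```python
def non_null_choices(coeffs, disc, s, irreducible=True):
    """coeffs: callables c_0, ..., c_d (coefficients of p in its main variable,
    lowest degree first); disc: callable for the discriminant, or None."""
    if not irreducible:
        return []
    choices = []
    degree = len(coeffs) - 1
    # First case: degree above 1 and discriminant non-zero at the sample.
    if degree > 1 and disc(*s) != 0:
        choices.append(frozenset({("sample", s), ("sgn_inv", "disc")}))
    # Second case: one choice per coefficient that is non-zero at the sample.
    for j, c in enumerate(coeffs):
        if c(*s) != 0:
            choices.append(frozenset({("sample", s), ("sgn_inv", f"c{j}")}))
    return choices

# Hypothetical p = x1*x2^2 + x1 - 1, with discriminant -4*x1*(x1 - 1) in x2.
coeffs = [lambda x1: x1 - 1, lambda x1: 0, lambda x1: x1]
disc = lambda x1: -4 * x1 * (x1 - 1)
print(len(non_null_choices(coeffs, disc, (0,))))   # only c0 is non-zero here
print(len(non_null_choices(coeffs, disc, (2,))))   # disc, c0 and c2 all qualify

# Hypothetical q = x1*x3 + x2 is nullified over (0, 0): no choice, hence FAIL.
q_coeffs = [lambda x1, x2: x2, lambda x1, x2: x1]
print(non_null_choices(q_coeffs, None, (0, 0)))
```

An empty list corresponds to the situation in which no proof rule is applicable.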
Algorithm 2 is used to extend the cell construction by a level. It applies the proof rules according to the ordering on the properties from Definition 5.4 until the only properties of level $i$ remaining are the sign-invariance of some irreducible polynomials (note that all smaller properties w.r.t. that ordering are provable with only the sample $s$ present, except if a nullification occurs). Then, a representation consisting of a symbolic interval $I$, a set of polynomials $E$ and an indexed root ordering $\preceq$ is chosen. The roles of $I$ and $\preceq$ are already discussed above; the polynomials in $E$ are meant to be excluded from the indexed root ordering, and thus the application of the equational constraint projection rule (Rule 5.8) is enforced. The formal requirements on the representation will be defined in Definition 6.1. Finally, the remaining properties on level $i$ are eliminated by application of the proof rules, until only a property of the form holds($I$) is left on the level (which is the greatest property of the current level w.r.t. that ordering).
. . , $x_n$], and $s \in \mathbb{R}^n$. The method single_cell in Algorithm 1 either returns a cell data structure $R$ such that all polynomials in $P$ are sign-invariant on setOf($R$) and $s \in$ setOf($R$), or returns FAIL.
Proof. The algorithm terminates because Algorithm 3 replaces properties by smaller properties according to the ordering defined in Definition 5.4, there exists a smallest property, and the CAD projection of a set of polynomials is finite. The correctness follows from the correctness proofs of the rules of inference.

Heuristic choices
Our algorithm has two points where a choice must be made: (1) when applying a rule of inference in Algorithm 3 Line 5 and (2) choosing the representation in Algorithm 2 Line 7.
For the first choice, the rules of inference may define multiple possible property sets for a property that each imply that property and so we have the freedom to choose which to use. We aim to pick the "best" such set, noting that this choice is heuristic (made for example with respect to estimated computational effort, or degrees of polynomials, or number of polynomials / real roots). One could compute each of these sets individually and compare them, or, to avoid superfluous heavy computations of resultants or discriminants, prefer sets which seem easier to compute (which is the approach taken by our implementation). For instance, in Rule 5.2, we always pick a coefficient rather than computing a discriminant. Note that even if sets involving resultants or discriminants are avoided where possible, the algorithm should first check whether one such set is already maintained before computing any new set.
For the second choice, we formalize what we mean by a representation. As already discussed, there might be multiple choices for the cell description I on the current level and the indexed root ordering for fulfilling the requirements for Rule 5.11. Additionally, if I is a section, we can make use of the equational constraints projection in Rule 5. 8. A representation determines all these parameters. Definition 6.1 (Representation for Ξ). Let i ∈ N >0 , s ∈ R i , and Ξ be a set of indexed root expressions of level i such that ξ ∈ irExpr(ξ.p, s [i−1] ) for every ξ ∈ Ξ.
A representation for Ξ with respect to $s$ is a tuple $(I, E, \preceq)$ where $I$ is a symbolic interval of level $i$, $E$ is a set of polynomials of level $i$, and $\preceq$ is an indexed root ordering of level $i$ such that
Algorithm 1: single_cell($P$, $s$)
Input: finite $P \subseteq \mathbb{Q}[x_1, \ldots, x_n]$, $s \in \mathbb{R}^n$
Output: single cell data structure $R$ such that $s \in$ setOf($R$) and all polynomials in $P$ are sign-invariant on setOf($R$); or FAIL
1 $Q$ := {sgn_inv($p$) | $p \in P$}
2 for $i = n$, . .
Output: an interval data structure $I$ such that $s \in$ setOf($s_{[i-1]}$, $I$), and a set $Q' \subseteq \mathrm{Prop}|_{[i-1]}$ such that for any $R \subseteq \mathbb{R}^{i-1}$ it holds $(\forall q' \in Q'.\ q'(R)) \Longrightarrow (\forall q \in Q.\ q(\mathrm{setOf}(R, I)))$; or FAIL
1 foreach $q \in Q|_i$ where $q$ is the greatest element with respect to the ordering from Definition 5.4 and $q \neq$ sgn_inv($p$) for an irreducible $p$ do
or FAIL
1 let choices $\subseteq 2^{\mathrm{Prop}}$ such that each $Q' \in$ choices is a set of properties from a proof rule which can be applied according to parameters to derive $q$ (see explanatory text above for details)
2 if choices = ∅ then
This definition is quite general by allowing additional indexed root expressions in Ξ compared to the set Ξ describing the roots of the present polynomials. This for example enables heuristics to under-approximate the constructed cell to reduce heavy computations in future work (see Section 9.1).
In the following, however, we will assume that Ξ contains exactly the roots of the polynomials for which we need to derive sign-invariance, so that I and ⪯ only consider these roots.
Note that, for readability, we omit the requirements of Rule 5.12 here, as the adaptation is trivial: if required, we only need to add the pair (I.l, I.u) to the indexed root ordering.
For the symbolic interval I, we minimize the degrees in the main variable of the defining polynomials.

Definition 6.2. Let i ∈ N_{>0}, P ⊆ Q[x_1, ..., x_i] be a set of irreducible polynomials of level i, s ∈ R^i be a sample such that no p ∈ P is nullified on s_{[i−1]}, and Ξ ⊆ irExpr(P, s_{[i−1]}).
We define the sets Ξ_lo and Ξ_up of the closest lower respectively upper bounds of s_i as follows:

We define the lowest degree interval I_ldeg of s with respect to Ξ as

For the indexed root ordering ⪯, there are two possibilities: aim to make the underlying cell as big as possible; or aim to avoid heavy resultant computations. In theory we could compute the results for all (or all promising) possible indexed root orderings and pick the best one. However, as this is infeasible in practice, we define below several alternative heuristics with different rationales.
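The choice of the closest bounds and of the lowest degree interval can be illustrated with a short sketch. It is one possible reading of the prose above and rests on several assumptions: root values are given as exact rationals (real algebraic numbers would be needed in general), ties between roots with the same value are broken by the lowest degree in the main variable, and the names `Root` and `lowest_degree_interval` are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Root:
    poly: str        # name of the defining polynomial (hypothetical label)
    index: int       # which real root of `poly` this is
    degree: int      # degree of `poly` in the main variable
    value: Fraction  # value of the root over the sample s_[i-1]


def lowest_degree_interval(roots, s_i):
    """Return (kind, lower, upper) for the sample coordinate s_i.

    kind is "section" if s_i is itself a root and "sector" otherwise;
    lower/upper are None when the cell is unbounded on that side.
    """
    below = [r for r in roots if r.value <= s_i]
    above = [r for r in roots if r.value >= s_i]
    lower = upper = None
    if below:
        best = max(r.value for r in below)
        # closest lower bounds; among them, take one of lowest degree
        lower = min((r for r in below if r.value == best), key=lambda r: r.degree)
    if above:
        best = min(r.value for r in above)
        # closest upper bounds; among them, take one of lowest degree
        upper = min((r for r in above if r.value == best), key=lambda r: r.degree)
    if lower is not None and lower.value == s_i:
        return ("section", lower, lower)
    return ("sector", lower, upper)


roots = [
    Root("p1", 1, 2, Fraction(1)),
    Root("p2", 1, 3, Fraction(1)),
    Root("p3", 1, 1, Fraction(4)),
]
print(lowest_degree_interval(roots, Fraction(2)))
```

In the example, p1 and p2 both have a root at the closest lower value, and the sketch prefers p1 because its degree in the main variable is lower.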
To achieve this we employ a somewhat idealistic view on the problem. Recall that an indexed root ordering is guaranteed to hold by making resultants order-invariant, which often imposes a stronger ordering on the roots than required by the picked indexed root ordering. We ignore this fact and base our heuristics on the set of indexed roots without considering common defining polynomials between them (for now). This is further discussed in Section 7.
We start with the following observation: as the given derivation rules always require projective delineability for all polynomials whose sign-invariance is proven using an indexed root ordering, a fixed ordering of all real root functions defined at s_{[i−1]} of such polynomials is guaranteed anyway. Thus, we can restrict the heuristics for the choice of ⪯ to computing the ordering on the set Ξ̃ containing for each polynomial only the closest lower and upper roots to s_i, and extending an ordering ⪯̃ of Ξ̃ to an ordering ⪯ on Ξ. This is formalized in the following definition.
For any indexed root ordering ⪯̃_X on Ξ̃ matching s_{[i−1]}, we define the ordering ⪯_X on Ξ matching s_{[i−1]} as

In the following definitions we give different heuristics to choose an indexed root ordering ⪯̃_X, which in each case can be extended to a corresponding ordering ⪯_X.
First, in the section case, the equational constraint heuristic below simply enforces the application of the equational constraint rule to all polynomials.

Definition 6.4 (Equational constraint representation). Let i, P, s, Ξ, ξ_lo, ξ_up, I_ldeg be as in Definition 6.2. If ξ_lo = ξ_up, we define the Equational constraint representation as the tuple (I_ldeg, P, ∅).
Next, the biggest cell heuristic defines the weakest ordering on the indexed roots according to Definition 6.1 and thus defines the biggest possible underlying cell (under the assumption that this ordering can be realized perfectly, i.e. the resultants have roots only below the crossing of a polynomial's root with a cell boundary) as visualized in Figure 11a. Note that for the section case, the biggest cell heuristic is the equational constraint heuristic plus some discriminants and coefficients; thus, the application of biggest cell only makes sense in the sector case.

Definition 6.5 (Biggest cell representation). Let i, P, s, Ξ, ξ_lo, ξ_up, I_ldeg be as in Definition 6.2. We define the indexed root ordering ⪯_biggest on Ξ according to ⪯̃_biggest and the Biggest cell representation as the tuple (I_ldeg, ∅, ⪯_biggest).
The lowest degree barriers heuristic below minimizes the degrees of the defining polynomials (locally per level), and thus also the degree of the computed resultants (under the above assumption) as visualized in Figure 11b. Furthermore, it enforces the equational constraints rule whenever possible.
This heuristic has two motivations. First, recall that the polynomial degrees grow doubly exponentially during the CAD projection, see e.g. [5,Table 1]. Second, the running time of the resultant computation depends quadratically on the degree in the main variable of the input polynomials, see [21].
In the following definition, we use the lexicographical ordering on tuples.

Let Ξ' ⊆ Ξ. For ξ ∈ Ξ', we define the barrier of ξ w.r.t. Ξ', that is a root in Ξ' between ξ(s_{[i−1]}) and s_i with minimal degree in the main variable:

If ξ_lo ≠ ξ_up, we define the indexed root ordering ⪯_barriers on Ξ according to ⪯̃_barriers, which relates roots ξ to their barriers barrier_Ξ̃(ξ), and the Lowest degree barriers representation as the tuple (I_ldeg, ∅, ⪯_barriers).
If ξ_lo = ξ_up, we exclude the polynomials with roots around s_i which do not qualify as a barrier for some other roots from the indexed root ordering (thus enforcing the application of the equational constraint rule). To do so, for a set of polynomials P' ⊆ P, we define the set Ξ̃|_{P'} = {ξ | ξ ∈ Ξ̃ s.t. ξ.p ∈ P'} and let P_eq ⊆ P be the result of the following fixed point computation (i.e. P_eq = P_j = P_{j+1} for some j):

We define the indexed root ordering ⪯_barriers on Ξ \ Ξ|_{P_eq} according to the definition of ⪯̃_barriers as above but only considering the roots Ξ̃ \ Ξ̃|_{P_eq}, and the Lowest degree barriers representation as the tuple (I_ldeg, P_eq, ⪯_barriers).

The chain heuristic below fixes the total ordering on the roots as visualized in Figure 11c.
Definition 6.7 (Chain representation). Let i, P, s, Ξ, ξ_lo, ξ_up, I_ldeg be as in Definition 6.2. We define the indexed root ordering ⪯_chain on Ξ according to ⪯̃_chain and the Chain representation as the tuple (I_ldeg, ∅, ⪯_chain).
Finally, the full heuristic fixes the same total ordering, as it is the unique transitive closure of the chain heuristic, as visualized in Figure 11d. Note that we include this heuristic only for illustrative purposes: it resembles the cells constructed naively by making a full projection without adapting to the behaviour at the sample. We define the indexed root ordering ⪯_full on Ξ according to ⪯̃_full and the Full representation as the tuple (I_ldeg, ∅, ⪯_full).
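To make the relationship between the biggest cell, chain and full orderings concrete, the following sketch builds the pairs of roots that each heuristic would relate. It is an illustration under assumptions, not the formal definitions: roots are represented by plain labels, `lower` lists the roots below the sample in increasing order (so its last entry is the lower cell bound), `upper` lists the roots above the sample in increasing order (so its first entry is the upper cell bound), and the chain is taken separately on each side of the sample. All function names are hypothetical.

```python
def biggest_cell(lower, upper):
    """Relate every root only to the cell bound on its side of the sample."""
    return ({(r, lower[-1]) for r in lower[:-1]}
            | {(upper[0], r) for r in upper[1:]})


def chain(lower, upper):
    """Relate each root to its direct neighbour on the same side."""
    return set(zip(lower, lower[1:])) | set(zip(upper, upper[1:]))


def transitive_closure(pairs):
    """Smallest transitive relation containing `pairs`."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def full(lower, upper):
    """The full ordering, taken here as the transitive closure of the chain."""
    return transitive_closure(chain(lower, upper))


lower, upper = ["a", "b", "c"], ["d", "e"]
print(sorted(biggest_cell(lower, upper)))
print(sorted(chain(lower, upper)))
print(sorted(full(lower, upper)))
```

Each pair corresponds to one resultant in the idealised view. The chain and biggest cell orderings relate the same number of pairs, while the full ordering contains both of them.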

Ordering heuristics
In this section we will explain the intuition behind the various ordering heuristics. They are designed to optimise desirable characteristics. We acknowledge their heuristic nature, and in particular that the intuition is made under the idealistic view that any indexed root ordering can be perfectly realized: that is, the resultants calculated only have roots which indicate a crossing of two root functions considered in ⪯, when in reality they also have "spurious" roots which do not have relevance for the problem at hand.
From the full to the chain heuristic. As mentioned above, both the full and the chain heuristic fix the same ordering on the real root functions over any cell containing the current sample as depicted in Figure 12a.
Obviously, the chain heuristic is more efficient here as it uses a strict subset of the work done by the full heuristic. Unlike the full heuristic, the chain heuristic takes the underlying sample into account, and thus the resulting projection is only valid for a single cell, i.e. locally delineable. The full heuristic is independent from the sample and makes the set of polynomials fully delineable.

From the chain to the biggest cell heuristic. The chain heuristic still fixes a stronger ordering on the root functions than necessary. As defined above, the biggest cell heuristic is the minimal requirement on the ordering to maintain sign-invariance on the constructed cell. This way, we hope that the size of the underlying cell is maximized, as in Figure 12b. Note that while the chain heuristic only takes the sample in one dimension less into account, the biggest cell heuristic also considers the highest dimension of the sample.
From the biggest cell heuristic to the lowest degree barriers heuristic. The lowest degree barriers heuristic minimizes the degrees of the resultants, as illustrated in Figure 12c. The rationale here is that resultant computations are heavy and their complexity depends on the degrees of the input polynomials, and that polynomials with lower degrees have fewer roots. Thus reducing degrees may reduce the case of resultants having real roots that do not actually correspond to relevant points. Note that despite the naming, the lowest degree barriers heuristic could in theory lead to bigger cells than the biggest cell heuristic in certain cases.
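The idea of a barrier can be sketched for the roots below the sample as follows. This is only an illustrative reading: the formal tie-breaking tuple of the definition is not reproduced here, so the sketch assumes that among candidates of equal degree the one closest to the sample is chosen. The function name `barrier_pairs` and the input format are hypothetical.

```python
def barrier_pairs(lower):
    """Relate each root below the sample to its barrier.

    lower: list of (name, degree) for the roots below the sample, sorted by
    increasing value, so lower[-1] is the lower bound of the cell.
    The barrier of a root is, among the roots strictly closer to the sample,
    one of minimal degree (assumption: ties go to the root nearest the sample).
    """
    pairs = []
    for k, (name, _) in enumerate(lower[:-1]):
        closer = lower[k + 1:]
        # lexicographic comparison on (degree, -position)
        j = min(range(len(closer)), key=lambda idx: (closer[idx][1], -idx))
        pairs.append((name, closer[j][0]))
    return pairs


lower = [("a", 4), ("b", 1), ("c", 3), ("d", 2)]
print(barrier_pairs(lower))
```

For this input the sketch relates a to b, b to d and c to d. Compared with relating every root directly to the bound d, the pair (a, d) with degrees 4 and 2 is replaced by (a, b) with degrees 4 and 1, which is the kind of saving the heuristic aims for.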
Equational constraint projection. The equational constraint projection allows us to leave out discriminants and coefficients of all polynomials except the section-defining polynomial by only adding resultants of all polynomials with the defining polynomial. Thus, we expect this rule to be more efficient than the presented heuristics in most cases. However, we added the possibility of applying the heuristics for cases where the section-defining polynomial has high degree, so that computing resultants with that polynomial is best avoided.
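As a rough illustration of what the equational constraint projection adds in the section case, the following sketch lists the projection items symbolically. The tuple encoding and the function names are hypothetical, and the single "coeff" entry stands for whichever coefficients the rules require for the defining polynomial.

```python
def section_projection_with_ec(defining, others):
    """Items added when the equational constraint rule covers all polynomials.

    Only the section-defining polynomial keeps its discriminant and
    coefficients; every other polynomial contributes one resultant with it.
    """
    items = [("disc", defining), ("coeff", defining)]
    items += [("res", defining, q) for q in others]
    return items


def count_by_kind(items):
    """Count the projection items per kind ("disc", "coeff", "res")."""
    counts = {}
    for item in items:
        counts[item[0]] = counts.get(item[0], 0) + 1
    return counts


items = section_projection_with_ec("p1", ["p2", "p3", "p4"])
print(count_by_kind(items))
```

The count makes the trade-off visible: the number of discriminants stays constant, but every resultant involves the defining polynomial, which is why a high-degree defining polynomial can make the heuristics attractive instead.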
On the gap between the idealised view and reality. For the lowest degree barriers heuristic, the idealised view might lead to the computation of a redundant set of resultants regarding the minimal requirement on the indexed root ordering, as shown in Figure 13, where the relation between the first roots of p_2 and p_3 (depicted in red) is superfluous but its corresponding resultant is added because all roots are considered individually and not their connection via the defining polynomials.

Comparison with the refinement-based approach
The refinement-based approach to single cell construction [11] saves resultants compared to full CAD projection in the same way as our levelwise variant, by exploiting the transitivity of the induced ordering on real root functions. Complexity-wise, the approaches are the same, as both add up to two resultants per polynomial on a level.
However, the influence of the ordering of constraints on the refinement-based variant means the quality of the constructed cell varies. In the worst case, the resulting ordering corresponds to the chain heuristic: in Figure 12a, this is achieved for the example when the polynomials are merged in the ordering p_4, p_3, p_2, p_1. Then, the upper cell boundary is updated in every step to a lower boundary. In the best case, the biggest cell heuristic is achieved: for our example in Figure 12b, this is when the polynomial p_1 defining the sector's boundary is merged first.
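The dependence on the merge order can be reproduced with a strongly simplified model of the refinement, which is an assumption made for illustration and not the algorithm of [11]: only the upper bound is tracked, every polynomial has exactly one root above the sample, and each merge fixes one relation (one resultant) between the merged root and the current bound. The function name `merge_upper` is hypothetical.

```python
def merge_upper(order, value):
    """Simulate refinement of the upper bound only.

    order: polynomial names in merge order.
    value[p]: value of the root of p directly above the sample.
    Returns the final bound and the list of (lower, higher) relations fixed.
    """
    bound = None
    relations = []
    for p in order:
        if bound is None:
            bound = p                      # first polynomial defines the bound
        elif value[p] < value[bound]:
            relations.append((p, bound))   # tighter bound below the old one
            bound = p
        else:
            relations.append((bound, p))   # p stays above the current bound
    return bound, relations


value = {"p1": 1, "p2": 2, "p3": 3, "p4": 4}
print(merge_upper(["p4", "p3", "p2", "p1"], value))
print(merge_upper(["p1", "p2", "p3", "p4"], value))
```

In this model, merging from the farthest root inwards relates each root to its neighbour, as in the chain heuristic, while merging the boundary polynomial first relates every root directly to the bound, as in the biggest cell heuristic.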
For the section case, it should also be noted that the levelwise approach can always apply the equational constraints rule as the cell description is known before the projection. The refinement-based approach only starts applying the equational constraints rule when the cell collapses to a section; until then it adds discriminants of all polynomials, which may not actually be needed. The illustrating examples may be used to explain this observation. Consider Figure 12a but with the sample s moved to the upper root of p_1. When merging polynomials in the ordering p_4, p_3, p_2, p_1, the polynomials p_4, p_3, p_2 are merged as in the sector case until the cell collapses to a section when merging p_1. When merging p_1 first, the section case is identified directly and the reduced projection applied when merging p_4, p_3, p_2, meaning the discriminants and coefficients for those polynomials need not be added.

Potential for non-connected "cell" descriptions
Recall that we aim to compute a single cell on which the input polynomials are sign-invariant, but McCallum's CAD projection theory uses the stronger property of order-invariance. It has been observed before that this can allow for small optimisations at the top layer. For example, when using equational constraint projection we must take discriminants if the projection is in a middle layer (to ensure order-invariance is provided which is a hypothesis of the next lifting) but can avoid these at the top layer as we need never lift over that [22].
The visualisation of our proof rules in Figure 10 alerted us to another such optimisation, which if enacted has some strange consequences. Note that a cell being connected at level i is required for it to be order-invariant, but not sign-invariant. Thus we need not ensure connectivity at the top level, i.e. we could avoid taking resultants of the upper and lower bound polynomials to satisfy Rule 5.12. Without connectedness it would not be accurate to describe what we are constructing as a cell, rather it is a semi-algebraic set (see also Figure 9). However, it is still describing a portion of space on whose points the respective polynomials are all sign-invariant. Thus, in the context of the MCSAT search which we discuss in the next section, the set is still describing a portion of space on whose points the constraints are all unsatisfiable for the same reason and so its negation is still a valid explanation clause to further that search.
We note that the resultants saved by this optimisation may still have to be computed later, if the cell is used in further propagation. However, this will not always be the case and so often this optimisation may save computation. Discovery of this optimisation illustrates the advantages of the proof system presentation used in this paper.

Factorization of polynomials and cell size
To ensure the output of CAD is correct we must compute a square-free basis of the current set of polynomials P (i.e. a set of square-free polynomials without common factors which define the same varieties as P ) before the application of a projection operator.
The approach of the rules presented above is to fully factorize each polynomial, resulting in what is called the finest square-free basis. This is a fairly standard choice made in CAD implementations as the effort of computing a full factorization pays off compared to the heavy resultant, discriminant, and real root isolation computations, which are all simpler for smaller polynomials.
When building a full CAD the choice of square-free basis does not affect the decomposition computed, just the time taken to compute it. However, for the single-cell construction, this also makes a difference in the size of the resulting cell: specifically, the cell can be larger if we factor. This can easily be observed by considering Example 3.1. Here the polynomial p_1 · p_2 · p_3 is already square-free. Using this directly in the one-cell construction algorithm, as a whole and without factorization, would simply result in computing the discriminant of the polynomial (and some coefficients): the discriminant must have as factors all the cross resultants of p_1, p_2, p_3 by definition. So in this case, no improvement over a full projection is achieved, and we would find the smaller cell from Figure 1. However, if we factor to consider the set {p_1, p_2, p_3} instead then we can obtain the larger cell from Figure 1.
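The statement that the discriminant of a product contains all cross resultants can be checked on a small univariate example. The sketch below assumes monic polynomials given by their (distinct, integer) roots, for which the discriminant is the product of squared root differences and the resultant is the product of all differences between roots of the two polynomials; the helper names `disc` and `res` are chosen for this example only.

```python
from itertools import combinations, product
from math import prod


def disc(roots):
    """Discriminant of the monic polynomial with the given roots."""
    return prod((a - b) ** 2 for a, b in combinations(roots, 2))


def res(roots_f, roots_g):
    """Resultant of two monic polynomials given by their roots."""
    return prod(a - b for a, b in product(roots_f, roots_g))


f = [0, 3]       # f = x * (x - 3)
g = [1, 5, -2]   # g = (x - 1) * (x - 5) * (x + 2)

# disc(f * g) = disc(f) * disc(g) * res(f, g)^2
print(disc(f + g) == disc(f) * disc(g) * res(f, g) ** 2)
```

The identity shows why working with the unfactored product forces every cross resultant into the projection, whereas after factorization the algorithm is free to pick only the resultants required by the chosen indexed root ordering.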
The limits of factorization. We note that this described gain in size of the computed cell from factorization does not mean we build optimal cells for all problems where there is a geometric separation. Figure 14a shows an example with two circles and a linear polynomial for which the cell we would ideally build is the inside of the circle defined by p_1. Even if we were originally presented with p_1 · p_2, factorisation would allow a one-cell algorithm to ignore the intersections of p_2 and p_3 to construct the entire inner circle. But consider the similar example from Figure 14b, in which we have perturbed the problem to consider the irreducible polynomial p_1 · p_2 + 1. Graphically, the two problems seem to be similar, and the delineation of the roots is identical. But in the second case the single cell construction cannot treat the two ovals separately: we must consider the irrelevant intersection and thus build a smaller than ideal cell. This shows some of the limits of the single cell construction by way of a well observed truth in computer algebra: problems which appear similar to human beings (i.e. when viewed geometrically) are not always equally hard algebraically.

Experimental evaluation
The presented proof rules are, due to their generality, potentially applicable with only small additions to a variety of problems and algorithms related to non-linear arithmetic, such as quantifier elimination by CAD [16], various CAD optimisations e.g. [33,17], non-uniformly cylindrical decompositions for quantifier elimination [8,9], the generation of explanations when using cylindrical algebraic coverings for a traditional SMT solver [2], and the generation of explanations in MCSAT [26,25,11]. As the latter was the main motivation for our work, the evaluation of this paper will focus on generating theory explanations in MCSAT for non-linear arithmetic.

Generating explanations for MCSAT
Recall the description of MCSAT in Section 1.2. We are interested in the situation where MCSAT resolves theory conflicts, i.e. when there is a set of constraints C in real variables x_1, ..., x_n, x_{n+1} that should be satisfied according to the Boolean model and an assignment s : {x_1, ..., x_n} → R such that s cannot be extended to a value for x_{n+1} satisfying C. The task then is to exclude a cell around s that generalizes this conflict, i.e. a region where the reason for unsatisfiability of C is invariant.
This reason for unsatisfiability is maintained when all input polynomials are sign-invariant on the generalized cell. To achieve this, we could do a full McCallum projection step, obtaining a set of properties one level below from which a cell around s can be constructed.
However, this is already too strong, as we only need the set of input polynomials P = {p | (p ∼ 0) ∈ C} ⊂ Q[x_1, ..., x_n, x_{n+1}] to be delineable over a cell containing the current sample s ∈ R^n for maintaining the desired property. We achieve this by determining the indexed root expressions of the real roots of the set {p ∈ factors(P) | level(p) = n + 1} over s in x_{n+1} and ordering them such that ξ_1(s) ≤ ξ_2(s) ≤ ... ≤ ξ_k(s). Finally, we ensure that all lower level factors {p ∈ factors(P) | level(p) < n + 1} are sign-invariant, check that no polynomial is nullified (otherwise we stop), make each polynomial individually delineable and add the resultants of the pairs of polynomials (ξ_j.p, ξ_{j+1}.p) for j ∈ [1..k − 1]. Thus, the input of the presented one-cell algorithm is given as

This approach is similar to the chain heuristic for indexed root orderings presented in Definition 6.7. Note that this approach could be embedded nicely into our system as a proof rule, taking over some optimizations. Furthermore, there are alternative approaches for elimination in the first levels, e.g. computing a covering of unsatisfying intervals of input constraints. These possibilities are part of our plans for future work.
Further note that in MCSAT, conflicts might also depend on previously computed cells which are expressed by conjunctions of extended constraints where a variable is compared with an indexed root expression; these constraints can also occur in the input. To handle an extended constraint x n+1 ∼ root xi+1 [p, j] with p ∈ Q[x 1 , . . . , x n+1 ], we simply add p to the set P of input polynomials.

Implementation
For the evaluation of the presented algorithm, we employ the SMT-RAT [1,18] solver, which provides an MCSAT engine allowing the combination of multiple explanation backends. Several incomplete and complete methods are combined in the sense that these backends are called sequentially until one returns an explanation.
Currently available backends are the Fourier-Motzkin variable elimination (FM) [25], interval constraint propagation (ICP) [27], virtual substitution (VS) [3], the complete model-based CAD cell construction algorithm from NLSAT [26] using Collins' projection operator as well as the refinement-based single cell construction algorithm [11]. Furthermore, SMT-RAT employs a fully dynamic activity-based variable ordering heuristic for scheduling theory variable assignments and Boolean decisions [39].
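The sequential combination of explanation backends can be pictured with the following sketch. It is not the SMT-RAT interface (SMT-RAT is a separate code base and its API is not shown in this paper): the function `explain`, the representation of a backend as a callable returning an explanation or `None`, and the toy backends are assumptions made for illustration.

```python
def explain(conflict, backends):
    """Call the backends in order and return the first explanation found.

    backends: list of (name, function) pairs; each function returns an
    explanation or None if the backend is not applicable or fails.
    """
    for name, backend in backends:
        explanation = backend(conflict)
        if explanation is not None:
            return name, explanation
    raise RuntimeError("no backend produced an explanation")


# Toy backends: an incomplete one that only handles "linear" conflicts and a
# complete fallback that always answers.
def incomplete(conflict):
    return "cheap explanation" if conflict == "linear" else None


def complete(conflict):
    return "cell-based explanation"


backends = [("incomplete", incomplete), ("complete", complete)]
print(explain("linear", backends))
print(explain("nonlinear", backends))
```

Placing a complete method last guarantees that the loop always returns, which corresponds to the role of the NLSAT-style backend as a fallback.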
All variants of the presented levelwise one-cell construction algorithm are implemented as backends in SMT-RAT. This is a preliminary version not fully exploiting the power of all proof rules; in particular, it is not checked whether a property is already implied by some properties in the projection. Most notably, the presence of resultants (the second and third respectively the third and fourth case) is not checked for in Rule 5.11; thus more leading coefficients than necessary are added to the projection set.

For the evaluation, we compare the following solver variants:

NL The model-based projection using Collins' operator from NLSAT [26] as a complete explanation backend. It is used when one of the following variants, which are all based on McCallum projection, hits a nullification which it cannot handle (see Definition 2.2).
OC-* The refinement-based one-cell construction algorithm [11]. We use the same MCSAT embedding as described above. Furthermore, the refinement-based method is able to return an explanation when a polynomial is nullified in the sector case in some special cases which our levelwise approach cannot handle yet; for better comparability, these special cases are excluded from the following tests. In case of failure, the complete explanation from NLSAT is called. To further specify the algorithm's behaviour and make it reproducible, we implemented heuristics for the order of merging of initial polynomials. These specify the * in the variant name as follows.
ASC The merge-operation is called on the initial polynomials in ascending order by their total degree.
DSC The merge-operation is called on the initial polynomials in descending order by their total degree.
LW-*-* The new levelwise one-cell construction algorithm with different heuristics applied in the section and sector case. In case of failure, the complete explanation from NLSAT is called. The first * in the variant name dictates the employed heuristic for the section case and the second for the sector case. Possible substitutions for the stars are as follows.
EQ equational constraint heuristic is applied (only for section case).
BC biggest cell heuristic is applied (only for sector case).
CH chain heuristic is applied.
LDB lowest degree barriers heuristic is applied.
[solver]+ with [solver] being one of the solver variants above, uses the FM-, ICP- and VS-based backends serially, in this order, before resorting to [solver]. Furthermore, we apply general preprocessing to the input before calling the main solver [18].

Table 1: Details on the instances solved by each solver: the number solved (first column), the number of satisfiable and unsatisfiable solved instances (the second and third columns) and the number of solved instances where the solver made at least one call to the single-cell construction and the number that did not require any such call to solve (the final two columns). Note that the last column is, for the first block of solvers, the number of instances that may be solved with Boolean reasoning and construction of sample points alone, i.e. without any theory calls. For the second block of solvers this also includes the instances that are solved using the additional incomplete theory backends.
All variants are executed on the SMT-LIB benchmark library [4] for quantifier-free non-linear real arithmetic, abbreviated as QF_NRA. This set contains 11552 problem instances.
The machine used for testing has four 2.1 GHz AMD Opteron CPUs with 12 cores each. In the created test series, each instance was run with a timeout of 15 minutes and a memory limit of 6 GB.
For reproducibility, the implementation which generated the following results is available at https://doi.org/10.5281/zenodo.5764569.

Results
Examined solvers. First of all, we observed that OC-ASC solves as many instances as OC-DSC but needs slightly less time; thus, we omit OC-DSC. Furthermore, we observe that LW-EQ-CH and LW-EQ-LDB solve more instances than LW-CH-CH and LW-LDB-LDB. In our basic implementation, saving leading coefficients and discriminants pays off compared to the other heuristics. Thus for further examination, we focus on the LW-EQ-* variants which always use the equational constraints projection in the section case. A brief summary of solved instances of all solvers can be seen in Table 1. Furthermore, from now on, VB-LW (respectively VB-LW+) is the virtual best of the LW-EQ-* (respectively LW-EQ-*+) solvers. VB and VB+ are the corresponding virtual bests with respect to LW-EQ-* and OC-ASC.

General observations. Before we compare the different approaches, we make some general comments on the results based on exemplary solvers in Table 2.

As already observed in Table 1 and confirmed by Table 2, large parts of the benchmark set are relatively easy. Around half of the benchmarks do not involve a single explanation call; and when enabling the additional incomplete backends, 79% of the benchmarks can be solved without a single call to the single cell construction. Considering the total number of explanation calls made, only 3.14% of them use the single cell construction for the VB-LW+ solver, meaning that even for the problems where single cell is needed, it is only needed rarely.
The fail rate of the levelwise backend (the cases where a nullification occurs) is smaller on the VB-LW+ solver (6%) than on the VB-LW solver (17.98%). This means that nullifications are more probable on the simple parts of the problem.
It should also be noted that VB-LW+ needs significantly fewer explanation calls than VB-LW, which means that the incomplete backends have explanations of higher quality.
To summarize, we can only make meaningful statements on our heuristics based on the solver variants without the additional backends, as otherwise, there are too few calls to the single cell construction in solved instances. The possible reasons for this are twofold. On the one hand, our procedure and its implementation may not yet be suitable to solve the harder instances in the benchmark set. On the other hand, the benchmark set might not contain enough interesting or diverse benchmarks for an evaluation.

Overall results. All solvers are depicted in the performance profile in Figure 15.
First of all, the NL and NL+ solvers perform significantly worse than the single cell variants, which justifies the investigation of this new levelwise cell construction approach.
The refinement-based approach OC-ASC as well as LW-EQ-* perform similarly. Among the levelwise variants, the LW-EQ-CH solves the most; however, these differences are not significant and depend on the implementation and heuristics chosen in the MCSAT solver. Considering OC-ASC+ and LW-EQ-*+, these differences vanish even more when combining them with incomplete methods handling simple sub-problems. These results are summarized in Table 1.
Virtual best and orthogonality of heuristics. VB-LW performs significantly better than the LW-EQ-* solvers. This means that although the number of solved instances is similar for all heuristics, each heuristic solves instances that the others do not solve. Thus these heuristics are orthogonal to some degree. We depict the number of instances solved by differing combinations of solvers in Figure 16. Note that the same holds for VB, meaning that OC-ASC is orthogonal to LW-EQ-* as well.
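The notion of a virtual best solver and of instances solved by only one variant can be made precise with a small sketch. The data below is a toy example invented for illustration and does not reproduce any measurement from the evaluation; the function names and the input format (a runtime per solved instance) are likewise assumptions.

```python
def virtual_best(results):
    """Best runtime per instance over all solvers.

    results: {solver: {instance: runtime in seconds}}, solved instances only.
    """
    best = {}
    for runtimes in results.values():
        for inst, t in runtimes.items():
            best[inst] = min(t, best.get(inst, t))
    return best


def uniquely_solved(results):
    """Instances that exactly one solver solves, per solver."""
    out = {}
    for solver, runtimes in results.items():
        others = set().union(*(set(r) for s, r in results.items() if s != solver))
        out[solver] = set(runtimes) - others
    return out


# Toy data (hypothetical solver names, instances and runtimes).
results = {
    "A": {"i1": 1.0, "i2": 5.0},
    "B": {"i2": 2.0, "i3": 3.0},
}
print(virtual_best(results))
print(uniquely_solved(results))
```

When the virtual best solves clearly more instances than each individual solver, the sets returned by `uniquely_solved` are non-empty, which is the sense in which the heuristics are called orthogonal.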
Note that the VB-LW+ solver does not solve significantly more instances than any of the LW-EQ-*+ solvers. That means the differences between the heuristics only become noticeable in the simple parts of the instances. An explanation could be that harder parts of instances require heavy resultant computations which could quickly shift the instance to unsolvable within the timeout.

Now focusing on the "pure" solvers without additional backends, we observe that the refinement-based approach OC-ASC and the virtual best of the levelwise approaches VB-LW behave similarly on simple instances while they are more "orthogonal" on harder instances, as indicated by Figure 17a.
The SMT-LIB benchmark set is split up into families which each have a similar structure. We note that all variants LW-EQ-* and OC-ASC solve roughly the same number of benchmarks in each family individually, as seen in Table 3, while the virtual best solvers do solve more. That means that the orthogonality of the variants is spread across all families.
Number of constructed cells. Figure 18 depicts a performance profile where we consider the number of constructed cells (that is, the number of times the explanation function is called by MCSAT) instead of the running time. We can clearly see that the solvers that solve more instances tend to compute fewer cells during a run. That is, the quality of the explanations is better. Again, considering the refinement based approach and the levelwise approaches, differences are only significant without the other backends. In particular, the virtual best solver clearly solves more instances with fewer cells. Figure 17b makes this even more clear by comparing the number of cells constructed by VB-LW and OC-ASC for every instance.
Benchmark families (partial legend): (6) LassoRanker (7) UltimateAutomizer (8) hong (9) hycomp (10) kissing (11) meti-tarski (12) zankl.

Comparison of heuristics. For further comparison of the three heuristics, we collected the following statistics for each instance: the dimension of constructed cells (i.e. the number of levels considered during their construction), the maximum degree in the main variable of the polynomials occurring in the computation, and the size of the computed projection (i.e. number of resultants, discriminants and coefficients). A summary of these statistics is shown in Table 4. First note that the virtual best has the lowest value or close to the lowest value in all categories, which means that they might indicate the performance of a solver to some degree. However, interpretation needs to be careful, as the differences are relatively small, confirming the previous observations. Further note that we consider fewer instances than in the previous analysis. However, we do have two minor observations to make here.
Regarding the heuristic LW-EQ-LDB: its idea was to minimize the degrees of the polynomials in the projection; however, the average maximum degree is slightly higher than for the other two heuristics. LW-EQ-BC needs slightly fewer projection steps, but is similar to the other solvers in terms of number of cells created. This means that the sizes of the cells are similar, although again, the intention was to produce the biggest possible cells. To emphasize, in the analysis in Figure 18 which is based on all solved instances (including the ones where the NLSAT backend was used as fallback), it needs more cells than LW-EQ-LDB.
To summarize, this simple analysis cannot confirm that the ideas behind the different heuristics take effect on the benchmarks or prove different behaviour on all benchmarks. However, as stated above, on individual benchmarks, they do behave differently, as shown by the performance of the virtual best solver.

Conclusions and future work

Future work
The formulation of the presented proof system allows heuristics to influence the shape of the constructed cells. The experimental evaluation shows potential for further development of those. Furthermore, for some applications such as incremental linearization [15], under-approximations of the constructed cell might be beneficial if they can be computed more efficiently; more concretely, the resultant computations become trivial if the lower and upper bounds are replaced by one or more linear polynomials, resulting in boxes or polyhedra.
The proof system could be applied in the future in contexts other than MCSAT, such as the cylindrical algebraic coverings method [2] or quantifier elimination algorithms such as NuCAD [8,9].
By relying on the theory of McCallum projection, the presented proof system is incomplete. Recently, there has been progress on the Lazard projection operator [28,34,35,12,37,38], which is complete while maintaining the advantages of the McCallum operator. Thus, it is promising to extend our framework to exploit the Lazard projection.
Finally, the presented set of inference rules allows the production of fine-grained proof graphs, which could enable the certification and external automated verification of results.

Conclusion
We introduced the new concept of levelwise single cell construction, motivated by maintaining the savings of the existing refinement-based approach of Brown [11] while allowing for more flexibility and new optimisations.
The theoretical part of this paper consists of the introduction of the novel notion of projective delineability and the presentation of a proof system in order to enable fine-grained projections based on a given sample. To demonstrate the possibilities of the proof system, we gave a simple algorithm which builds upon this proof system as well as several heuristics for the application of the given rules. Finally, we gave a qualitative evaluation as well as some notable observations. We evaluated our algorithm by an implementation applied to explanation generation in MCSAT. We showed that our basic heuristics yield different performances for different instances, indicating that there is room for further algorithmic development, as well as for a more elaborate implementation of the proof rules.
Importantly, our proof system allows for a wide range of improvements through experimentation with heuristics, approximations and on theoretical matters as well as extensions to other algorithms than the single cell construction.

The particular feature to keep in mind is the discontinuity of $\theta^*$ at the point $x = 0$, and how, as we move through the discontinuity from negative $x$ to positive $x$, $\theta^*$ switches from being below $\theta_1$ and $\theta_2$ to being above them. Note also that if $R$ were expanded either right or left, we would violate the requirement that $\operatorname{disc}_y(p)$ is order-invariant.
Proof of Theorem A.1. If $\operatorname{ldcf}_{x_n}(p)$ is sign-invariant in $N_r$, then the theorem holds. The interesting case is when $\operatorname{ldcf}_{x_n}(p)$ is not sign-invariant in $N_r$, no matter how small we make $N_r$. If this is the case, then there must be points in $R$ at which $\operatorname{ldcf}_{x_n}(p)$ vanishes that are arbitrarily close to $r$.
Choose a real number $\alpha$ such that all roots of $p(r_1, \ldots, r_{n-1}, x_n)$ are greater than $\alpha$, and choose $N_r$ to be sufficiently small so that $p$ has no zeros in $N_r \times \mathbb{R}$ that intersect with the hyperplane $x_n = \alpha$. Since $p$, being a polynomial, is continuous and non-zero at $(r_1, \ldots, r_{n-1}, \alpha)$, such an $N_r$ exists. Let $p^* = p(x_1, \ldots, x_{n-1}, x_n - \alpha)$. Note that the discriminant is invariant under this transformation (see for example [23]), so $\operatorname{disc}_{x_n}(p) = \operatorname{disc}_{x_n}(p^*)$. The mapping that shifts $x_n$ by $\alpha$ is a homeomorphism that maps the roots of $p$ to the roots of $p^*$. So we can study the roots of $p$ by understanding the roots of $p^*$. But for $p^*$ we are guaranteed that $p^*(r_1, \ldots, r_{n-1}, 0) \neq 0$. In fact, over $N_r$, no zero of $p^*$ lies in the hyperplane $x_n = 0$. To put it another way, the constant coefficient of $p^*$ with respect to $x_n$ does not vanish on $N_r$.
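As a small sanity check of the shift invariance of the discriminant used above, consider the degree-two case. The polynomial below, with coefficients $a, b, c \in \mathbb{R}[x_1, \ldots, x_{n-1}]$, is an illustrative assumption and not taken from the proof; the general statement is the one cited from [23].

```latex
\begin{align*}
p   &= a\,x_n^2 + b\,x_n + c, \qquad \operatorname{disc}_{x_n}(p) = b^2 - 4ac,\\
p^* &= p(x_1,\ldots,x_{n-1},\,x_n - \alpha)
     = a\,x_n^2 + (b - 2a\alpha)\,x_n + (c - b\alpha + a\alpha^2),\\
\operatorname{disc}_{x_n}(p^*)
    &= (b - 2a\alpha)^2 - 4a\,(c - b\alpha + a\alpha^2)\\
    &= b^2 - 4ab\alpha + 4a^2\alpha^2 - 4ac + 4ab\alpha - 4a^2\alpha^2
     = b^2 - 4ac .
\end{align*}
```

All terms involving $\alpha$ cancel, so the discriminant, and hence its order at any point, is untouched by the shift.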
Let $d$ be the degree of $p^*$ in $x_n$. We now consider the polynomial $\bar{p} = x_n^d \, p^*(x_1, \ldots, x_{n-1}, 1/x_n)$. We note that, over $N_r$, the transformation from $p$ to $\bar{p}$ defines an analytic homeomorphism from $N_r \times (\mathbb{R} \setminus \{0\})$ to itself that maps the zeros of $\bar{p}$ off the $x_n = 0$ hyperplane to the zeros of $p$ (none of which, recall, lie on the $x_n = \alpha$ hyperplane). Specifically, the homeomorphism is given by the mapping $(x, y) \mapsto (x, 1/y - \alpha)$. The transformation leaves the discriminant unchanged as well, i.e. $\operatorname{disc}_{x_n}(\bar{p}) = \operatorname{disc}_{x_n}(p^*)$. This also follows from [23], but is also addressed explicitly as Lemma 8.1 of [6]. Thus, the discriminant of $\bar{p}$ is order-invariant in $R$. Moreover, its leading coefficient, which is the same as the constant coefficient (in $x_n$) of $p^*$, is non-zero throughout $N_r$. This means that $\bar{p}$ is not nullified anywhere in $N_r$ and its leading coefficient (in $x_n$) is sign-invariant on $N_r$. Thus, by the Brown-McCallum projection, $\bar{p}$ is analytically delineable over $N_r$.

It must have a section that passes through the point $(r_1, \ldots, r_{n-1}, 0)$, since its constant coefficient (in $x_n$) vanishes at $r$. So let the sections of $\bar{p}$ be $\bar{\theta}_1 \prec \cdots \prec \bar{\theta}_k$ with multiplicities $m_1, \ldots, m_k$, and let $i$ be the index such that $\bar{\theta}_i$ is the section passing through $(r_1, \ldots, r_{n-1}, 0)$. Note that it must be unique because delineability guarantees non-intersecting sections. Since none of the remaining sections pass through $(r_1, \ldots, r_{n-1}, 0)$, we can refine $N_r$ to an $N'_r$ that is sufficiently small so that none of the remaining sections intersect the hyperplane $x_n = 0$ over $N'_r$.
So, over $N'_r$ the $\bar{p}$-section $\bar{\theta}_j$, where $j \neq i$, maps to the $p$-section $\theta_j$. Moreover, since $\bar{\theta}_i$ is the only $\bar{p}$-section that crosses the $x_n = 0$ hyperplane, the graphs of $\bar{\theta}_1, \ldots, \bar{\theta}_{i-1}$ lie in $N'_r \times \mathbb{R}_{<0}$, and the graphs of $\bar{\theta}_{i+1}, \ldots, \bar{\theta}_k$ lie in $N'_r \times \mathbb{R}_{>0}$. This means these $p$-sections are non-intersecting and ordered as follows: $\theta_{i-1} < \cdots < \theta_1 < \theta_k < \cdots < \theta_{i+1}$. Finally, consider a connected region $M$ in $N'_r$ on which the leading coefficient of $p$ or, equivalently, the constant coefficient of $\bar{p}$ is non-zero. Section $\bar{\theta}_i$ either lies entirely above the $x_n = 0$ hyperplane on this region, in which case it maps via the homeomorphism to a section over $M$ that lies above section $\theta_{i+1}$, or entirely below $x_n = 0$, in which case it maps to a section over $M$ that lies below section $\theta_{i-1}$. So the $\theta^*$ required by the theorem is the image of $\bar{\theta}_i$ via the homeomorphism restricted to the points in $N'_r$ at which the trailing coefficient of $\bar{p}$ is non-zero, which is equivalent to $N'_r \setminus \operatorname{realRoots}(\operatorname{ldcf}_{x_n}(p))$. Note that because $\bar{\theta}_i$ passes through $x_n = 0$ at precisely the points that map back to points in $\operatorname{realRoots}(\operatorname{ldcf}_{x_n}(p))$, $\lim_{t \to 1} \theta^*(\sigma(t)) = \pm\infty$. Thus points 2.a and 2.c are proven.
Next we show that the multiplicities of the sections are constant, i.e. that point 2.b of the theorem holds. We will show that the multiplicities of sections of $\bar{p}$ carry over to the sections of $p^*$, which suffices since the sections of $p^*$ are just shifts of the sections of $p$. Let $M_d$ be the map of a polynomial in $x_n$ of degree at most $d$ to a polynomial of degree at most $d$ in $x_n$ given by $M_d(h(x_n)) = x_n^d \, h(1/x_n)$. Note that $M_d(p^*) = \bar{p}$ and $M_d(\bar{p}) = p^*$.

First, consider a point $r'$ in $N'_r \setminus \operatorname{realRoots}(\operatorname{ldcf}_{x_n}(p))$. In this case,
\[ \bar{p}(r'_1, \ldots, r'_{n-1}, x_n) = \gamma \prod_i (x_n - \beta_i)^{e_i}, \]
where the $\beta_i$ are the distinct zeros (over $\mathbb{C}$), the $e_i$ are the multiplicities of those zeros, and $\gamma$ is some non-zero constant. Hence we have
\[ p^*(r'_1, \ldots, r'_{n-1}, x_n) = M_d(\bar{p}(r'_1, \ldots, r'_{n-1}, x_n)) = \gamma \prod_i (1 - \beta_i x_n)^{e_i}, \]
and we see that the multiplicities of the roots are preserved: each zero $\beta_i$ of multiplicity $e_i$ corresponds to the zero $1/\beta_i$ of the same multiplicity.

Thus, we are left with the case $r' \in \operatorname{realRoots}(\operatorname{ldcf}_{x_n}(p))$. In this case, $p^*(r'_1, \ldots, r'_{n-1}, x_n) = q^*(x_n)$ for some polynomial $q^* \in \mathbb{R}[x_n]$ of degree less than $d$ (because the leading coefficient, and possibly some others, vanish at $r'$). The leading coefficient of $\bar{p}$ does not vanish at $r'$, but the trailing coefficient does. In fact, the coefficients of degree zero up to $m^* - 1$ all vanish at $r'$. So we have $\bar{p}(r'_1, \ldots, r'_{n-1}, x_n) = x_n^{m^*} \cdot \bar{q}(x_n)$, where $\bar{q}(x_n) = M_{d-m^*}(p^*(x_n))$. So the roots of $\bar{q}(x_n)$ are $\bar{\theta}_1(r'), \ldots, \bar{\theta}_k(r')$ with multiplicities $m_1, \ldots, m_k$, and these roots and multiplicities carry over to $q^*$, which can be proven with the same argument used in the previous case.
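The effect of the map $M_d$ on roots and multiplicities can be checked on a small univariate instance. The following is a minimal sketch with exact rational arithmetic; the polynomial $h$ and the helper names `reverse`, `derivative`, `evaluate` and `multiplicity` are illustrative assumptions, not part of the paper. With coefficients stored lowest degree first, $M_d$ is simply a reversal of the coefficient list.

```python
from fractions import Fraction

def reverse(coeffs):
    """M_d on a coefficient list [c_0, ..., c_d]: h(x) -> x^d * h(1/x)."""
    return coeffs[::-1]

def derivative(coeffs):
    """Derivative of c_0 + c_1 x + ... + c_d x^d (same list convention)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Horner evaluation; exact when x is a Fraction."""
    acc = Fraction(0)
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def multiplicity(coeffs, root):
    """Number of successive derivatives (starting with h itself) vanishing at root."""
    m = 0
    while coeffs and evaluate(coeffs, root) == 0:
        m += 1
        coeffs = derivative(coeffs)
    return m

# h(x) = (x - 2)^2 (x + 3) = x^3 - x^2 - 8x + 12, lowest degree first.
h = [12, -8, -1, 1]
h_rev = reverse(h)  # represents 12x^3 - 8x^2 - x + 1

# First case: non-vanishing leading coefficient, roots map to their reciprocals.
print(multiplicity(h, Fraction(2)), multiplicity(h_rev, Fraction(1, 2)))
print(multiplicity(h, Fraction(-3)), multiplicity(h_rev, Fraction(-1, 3)))

# Second case: view h as having degree at most 4 (leading coefficient 0).
# The reversed polynomial then acquires the root 0, i.e. a factor x.
h4_rev = reverse(h + [0])
print(multiplicity(h4_rev, Fraction(0)), multiplicity(h4_rev, Fraction(1, 2)))
```

Reversing twice returns the original list, which mirrors $M_d(M_d(h)) = h$.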
To finish we need to justify the claim of order-invariance for the sections $\theta_1, \ldots, \theta_k$ and for $\theta^*$. Lemma A.1 (below) shows that the orders of $p$ and $\bar{p}$ are the same at associated points. (Note that Theorem 2.1 of [31] essentially shows that order is preserved under general analytic coordinate transformation, which includes Lemma A.1 as a special case.) Given that, we note that polynomial $\bar{p}$ is order-invariant in the sections $\bar{\theta}_1, \ldots, \bar{\theta}_k$ and $\bar{\theta}^*$ (by McCallum's theorem on projection), and when we map these sections back to zeros of $p$, the orders of $p$ in $\theta_1, \ldots, \theta_k$ and $\theta^*$ will be the same as the orders of $\bar{p}$ in $\bar{\theta}_1, \ldots, \bar{\theta}_k$ and $\bar{\theta}^*$, which concludes the proof.
Lemma A.1. Let $p \in \mathbb{R}[x_1, \ldots, x_n]$ be of degree $d$ in $x_n$. Let $r \in \mathbb{R}^n$ with $r_n \neq 0$ be a point at which $p$ is non-zero.
Define the function $f$ as $f(p) = x_n^d \, p(x_1, \ldots, x_{n-1}, 1/x_n)$, and the function $g(r) = (r_1, \ldots, r_{n-1}, 1/r_n)$, noting that $f(p) \in \mathbb{R}[x_1, \ldots, x_n]$ is of degree $d$ in $x_n$, and $g(r)$ is a point in $\mathbb{R}^n$ with non-zero $n$th coordinate. The order of $p$ at $r$ is the same as the order of $f(p)$ at $g(r)$.

Proof. We will show that $\operatorname{ord}_r(p) \leq \operatorname{ord}_{g(r)}(f(p))$. Since $f(f(p)) = p$ and $g(g(r)) = r$, and $f(p)$ and $g(r)$ satisfy the hypotheses of the lemma, we may reverse the roles of $p$ and $f(p)$, and of $r$ and $g(r)$, and get that $\operatorname{ord}_{g(r)}(f(p)) \leq \operatorname{ord}_r(p)$.
So what remains to be proven is (A.1), which we prove by induction on $t = \|\omega\|_1$. When $t = 0$, we have that $f(p) = x_n^d \, p(x_1, \ldots, x_{n-1}, 1/x_n)$, which is of the form required by (A.1): the $c_\omega$ are all zero, $c^* = 1$ and $d^* = d$. So assume (A.1) holds for all values up to and including $t$ and consider a multi-index $\omega$ such that $\|\omega\|_1 = t$. Let $\mu_i$ be the $n$-vector that is 1 in the $i$th component and zero everywhere else. We will show that for any $i$, (A.1) holds for $\partial^{\omega + \mu_i}$.
B. Correctness of the theorems of projective delineability

Lemma 4.1. Its proof relies on the more technical Theorem A.1 from Appendix A.
Proof. If $\operatorname{ldcf}_{x_{i+1}}(p)$ is sign-invariant in $R$, then $p$ is analytically delineable on $R$ by Theorem 3.1 in [6] and Theorem 2 of [32]. Thus, we assume that $\operatorname{ldcf}_{x_{i+1}}(p)$ is not sign-invariant in $R$. By Theorem A.1, for any point $r$ in $R$, there is a neighbourhood $N_r$ around $r$ such that all the properties required for analytic projective delineability hold in $N_r \cap R$. Since $R$ is connected, this suffices to prove the lemma.

To see why, consider two points in $R$, $r$ and $r'$. Since $R$ is connected, there is a path (a continuous function with domain $[0, 1]$) $L : [0, 1] \to R$ such that $L(0) = r$ and $L(1) = r'$. According to Theorem A.1, in some neighbourhood containing $L(0)$ the definition of projective delineability is satisfied, i.e. there is some $k$ and root functions $\theta_1, \ldots, \theta_k, \theta^*$ with all the desired properties. We need to prove that the root functions can be extended to cover the entire path while maintaining those properties. Let $c$ be a real number in the range $(0, 1)$ such that $\theta_1, \ldots, \theta_k, \theta^*$ can be extended over the subpath $L' = \{L(\beta) \mid \beta \in [0, c)\}$. By Theorem A.1 there is a neighbourhood $N_{L(c)}$ in which the properties required for projective delineability hold, i.e. there is some $k'$ and root functions $\theta'_1, \ldots, \theta'_{k'}, \theta'^{*}$ with all the desired properties. However, because $N_{L(c)} \cap L' \neq \emptyset$, we have $k = k'$, and the functions $\theta'_1, \ldots, \theta'_{k'}, \theta'^{*}$ agree with $\theta_1, \ldots, \theta_k, \theta^*$ on the path up to $L(c)$. Moreover, choosing a value $c'$ greater than $c$ such that $L(c') \in N_{L(c)}$, they provide an extension of those functions through $L'' = \{L(\beta) \mid \beta \in [0, c')\}$. Thus, the functions we require can be extended to cover the whole path $L$ from $r$ to $r'$, maintaining all the properties required for projective delineability. In fact, the same argument shows that $p$ will be order-invariant in the sections defined by the root functions.

Proof.
If $\operatorname{ldcf}_{x_{i+1}}(p)(r) = 0$ for all $r \in R$, then there is no $\theta^*$ root function in the definition of projective delineability, meaning the definition collapses to that of ordinary delineability. Otherwise $\operatorname{ldcf}_{x_{i+1}}(p)(r) \neq 0$ for all $r \in R$, which means that there is a $\theta^*$ root function, but it is defined everywhere on $R$. As $R$ is connected, the definition of projective delineability collapses once again to that of ordinary delineability.

Proof. If $p$ is analytically projectively delineable on $R{\downarrow}_{[i-1]}$, then by Theorem A.1, $p$ is order-invariant in each $R{\downarrow}_{[i-1]} \times \theta_1(s), \ldots, R{\downarrow}_{[i-1]} \times \theta_k$ and each $R{\downarrow}_{[i-1]} \times \theta^{*\prime}$ where $\theta^{*\prime}$ is a restriction of $\theta^*$ to a connected component of its domain. As $R$ is a subset of one of these sets by definition, $p$ is order-invariant on $R$.

Lemma 4.4.

Proof. The hypotheses state that $\theta_1$ and $\theta_2$ are real root functions over $R$, which means that neither can be $\theta^*$ in Definition 4.1, the section that is not defined over the whole of $R$, and that both are continuous and analytic. If the two functions are identical over $R$, then the theorem holds. So assume they are not, i.e. assume there is a point $r \in R$ at which the two functions have different values. We will show that the two functions differ everywhere in $R$ which, because they are continuous, proves the theorem.

Suppose, by way of contradiction, that there are points in $R$ at which $\theta_1$ and $\theta_2$ are equal. Choose such a point, $s$, and a path $L : [0, 1] \to R$ such that $L(0) = r$, $L(1) = s$ and for all $x \in [0, 1)$ we have $\theta_1(L(x)) \neq \theta_2(L(x))$. Note that we are guaranteed to be able to find such $s$ and $L$ by the continuity of $\theta_1$ and $\theta_2$. Choose a value $\gamma \in \mathbb{R}$ that, over $s$, is a zero of neither $p_1$ nor $p_2$. Let $N$ be a neighbourhood of $s$ over which neither $p_1$ nor $p_2$ is zero at $\gamma$. Let $U$ be the connected component of $N \cap R$ that contains $s$. There must be a value $\mu \in (0, 1)$ such that $\{L(x) \mid x \in (\mu, 1]\} \subseteq U$. Set $r' = L((\mu + 1)/2)$, noting that $\theta_1(r') \neq \theta_2(r')$.

Let $p_1^* = p_1(x_1, \ldots, x_i, x_{i+1} - \gamma)$ and $p_2^* = p_2(x_1, \ldots, x_i, x_{i+1} - \gamma)$. By [23], the resultant is invariant under this transformation. Let $\bar{p}_1 = x_{i+1}^{d_1} \, p_1^*(x_1, \ldots, x_i, 1/x_{i+1})$ and $\bar{p}_2 = x_{i+1}^{d_2} \, p_2^*(x_1, \ldots, x_i, 1/x_{i+1})$, where $d_1 = \deg_{x_{i+1}}(p_1)$ and $d_2 = \deg_{x_{i+1}}(p_2)$. The resultant is unchanged by this transformation, once again by [23], so $\operatorname{res}_{x_{i+1}}(\bar{p}_1, \bar{p}_2)$ is order-invariant in $U$. Moreover, since neither $p_1$ nor $p_2$ is nullified in $U$, $\bar{p}_1$ and $\bar{p}_2$ are non-nullified in $U$. The proof of Theorem A.1 examines this same transformation, and shows that the root functions of $p_1$ and $p_2$ map to the root functions of $\bar{p}_1$ and $\bar{p}_2$ (with, of course, some special caveats for the $\theta^*$ section, which do not apply to $\theta_1$ and $\theta_2$).
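The behaviour of the resultant under the two transformations used above can be checked numerically at a single sample point, where $p_1$ and $p_2$ become univariate polynomials with rational coefficients. The following is a minimal sketch under that assumption; the example polynomials, the value of `gamma` and the helper names (`det`, `resultant`, `shift`, `reverse`) are illustrative and not taken from the paper. The resultant is computed as the determinant of the Sylvester matrix. With this convention the reversal multiplies the resultant by $(-1)^{d_1 d_2}$, which does not affect where it vanishes.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by exact Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n, sign, result = len(a), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign * result

def resultant(f, g):
    """Sylvester resultant; f, g are coefficient lists, highest degree first,
    with non-zero leading coefficients."""
    m, n = len(f) - 1, len(g) - 1
    rows = [[0] * i + f + [0] * (n - 1 - i) for i in range(n)]
    rows += [[0] * i + g + [0] * (m - 1 - i) for i in range(m)]
    return det(rows)

def shift(f, gamma):
    """Coefficients (highest degree first) of f(x - gamma), via Horner."""
    result = []
    for c in f:
        times_x = result + [Fraction(0)]
        times_gamma = [Fraction(0)] + [gamma * v for v in result]
        result = [u - v for u, v in zip(times_x, times_gamma)]
        result[-1] += c
    return result

def reverse(f):
    """Coefficients of x^d * f(1/x) for f of degree d."""
    return f[::-1]

f = [1, -4, 3]    # (x - 1)(x - 3)
g = [1, -1, -2]   # (x - 2)(x + 1)
gamma = Fraction(5)
fs, gs = shift(f, gamma), shift(g, gamma)  # neither vanishes at 0

print(resultant(f, g), resultant(fs, gs), resultant(reverse(fs), reverse(gs)))
```

Here both degrees are 2, so the sign factor is $+1$ and all three printed values agree; for two linear polynomials the reversed resultant has the opposite sign.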