Enhancing SMT-based Weighted Model Integration by Structure Awareness

The development of efficient exact and approximate algorithms for probabilistic inference is a long-standing goal of artificial intelligence research. Whereas substantial progress has been made in dealing with purely discrete or purely continuous domains, adapting the developed solutions to tackle hybrid domains, characterised by discrete and continuous variables and their relationships, is highly non-trivial. Weighted Model Integration (WMI) recently emerged as a unifying formalism for probabilistic inference in hybrid domains. Despite a considerable amount of recent work, allowing WMI algorithms to scale with the complexity of the hybrid problem is still a challenge. In this paper we highlight some substantial limitations of existing state-of-the-art solutions, and develop an algorithm that combines SMT-based enumeration, an efficient technique in formal verification, with an effective encoding of the problem structure. This allows our algorithm to avoid generating redundant models, resulting in drastic computational savings. Additionally, we show how SMT-based approaches can seamlessly deal with different integration techniques, both exact and approximate, significantly expanding the set of problems that can be tackled by WMI technology. An extensive experimental evaluation on both synthetic and real-world datasets confirms the substantial advantage of the proposed solution over existing alternatives. The application potential of this technology is further showcased on a prototypical task aimed at verifying the fairness of probabilistic programs.


Introduction
There is a growing interest in the artificial intelligence community in extending probabilistic reasoning approaches to deal with hybrid domains, characterized by both continuous and discrete variables and their relationships. Indeed, hybrid domains are extremely common in real-world scenarios, from transport modelling [2] to probabilistic robotics [3] and cyber-physical systems [4].
Weighted Model Integration (WMI) [5,6,7] recently emerged as a unifying formalism for probabilistic inference in hybrid domains. The paradigm extends Weighted Model Counting (WMC) [8], which is the task of computing the weighted sum of the set of satisfying assignments of a propositional formula, to deal with SMT formulas (e.g., [9]) consisting of combinations of Boolean atoms and connectives with symbols from a background theory, like linear arithmetic over rationals (LRA).
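To make the WMC baseline concrete, the following is a minimal brute-force sketch (plain Python; the formula and literal weights are made-up, and real WMC solvers avoid full enumeration via component caching and compilation):

```python
from itertools import product

# Weighted Model Counting by brute force: sum, over all satisfying total
# assignments of a propositional formula, the product of literal weights.
LIT_W = {("a", True): 0.7, ("a", False): 0.3,
         ("b", True): 0.4, ("b", False): 0.6,
         ("c", True): 0.5, ("c", False): 0.5}

def wmc(formula, atoms):
    total = 0.0
    for vals in product([True, False], repeat=len(atoms)):
        m = dict(zip(atoms, vals))
        if formula(m):
            weight = 1.0
            for a in atoms:
                weight *= LIT_W[(a, m[a])]
            total += weight
    return total

# (a or b) and (not b or c) has 4 models; their weighted sum is 0.62.
phi = lambda m: (m["a"] or m["b"]) and ((not m["b"]) or m["c"])
assert abs(wmc(phi, ["a", "b", "c"]) - 0.62) < 1e-9
```

WMI replaces the weighted sum over models with a sum of integrals over the continuous regions identified by each assignment.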
Although WMC can be made extremely efficient by leveraging component caching techniques [10,11], these strategies are difficult to apply for WMI due to the tight coupling induced by arithmetic constraints. Indeed, existing component caching approaches for WMI are restricted to fully factorized densities with few dependencies among continuous variables [12]. Another direction specifically targets acyclic [13,14] or loopy [15] pairwise models. Approximations with guarantees can be computed for problems in disjunctive normal form [16] or, in restricted cases, conjunctive normal form [17].
Exact solutions for more general classes of densities and constraints mainly take advantage of advances in SMT technology [9] or knowledge compilation (KC) [18]. WMI-PA [6,19] relies on SMT-based Predicate Abstraction (PA) [20] both to reduce the number of models to be generated and integrated over and to enumerate them efficiently, and was shown to achieve substantial improvements over previous solutions. Nevertheless, in this paper we show that WMI-PA has the major drawback of ignoring the conditional structure of the weight function, which prevents it from pruning away many redundant models. The use of KC for hybrid probabilistic inference was pioneered by [21] and further refined in a series of later works [22,23,24,25]. By compiling a formula into an algebraic circuit, KC techniques can exploit the structure of the problem to reduce the size of the resulting circuit, and are at the core of many state-of-the-art approaches for WMC [8]. Nevertheless, in this paper we show that even the most recent KC solutions for WMI [23,24] have serious trouble dealing with densely coupled problems, resulting in exponentially large circuits that are impossible to store or prohibitively expensive to evaluate.
To address these problems, in this paper we introduce SA-WMI-PA-SK, a novel algorithm for WMI that aims to combine the best of both worlds by introducing weight-structure awareness into PA-based WMI. The main idea is to build a formula, which we call the conditional skeleton, that mimics the conditional structure of the weight function and drives the SMT-based enumeration algorithm, preventing it from generating redundant models. An extensive experimental evaluation on synthetic and real-world datasets shows substantial computational advantages of the proposed solution over existing alternatives in the most challenging settings.
Parallel to the introduction of weight-structure awareness, we highlight how PA-based algorithms are agnostic to the integration algorithm chosen to compute the volume of each enumerated assignment. To this end, we extend SA-WMI-PA-SK to support both exact numerical integration (the one implicitly embedded in [6,19]) and approximate integration. The advantages of using approximate integration are twofold: (i) it positively affects the scalability of SA-WMI-PA-SK on complex instances where integration is the major bottleneck; (ii) it allows applying SA-WMI-PA-SK to problems with non-polynomial weight functions (e.g., Gaussian densities), increasing the applicability of WMI to real-world scenarios.
Our main contributions can be summarized as follows: • We analyse the existing state of the art in WMI and identify major efficiency issues for both PA- and KC-based approaches.
• We introduce SA-WMI-PA-SK, a novel WMI algorithm that combines PA with weight-structure awareness by generating explicitly a conditional skeleton of the weight function to drive the search of assignments.
• We show how SA-WMI-PA-SK achieves substantial improvements over existing solutions in both synthetic and real-world settings.
• We demonstrate how PA-based approaches can deal with different integration techniques, both exact and approximate, substantially expanding the set of problems that can be tackled by WMI technology.
• We present a prototypical application of SA-WMI-PA-SK to verifying the fairness of probabilistic programs.
The rest of the paper is organized as follows. We start by presenting the related work in §2. In §3 we introduce the background, focusing on the formulation of Weighted Model Integration. In §4 we analyse the WMI approaches based on Knowledge Compilation and on Predicate Abstraction, identifying weaknesses in each of them. In §5 we address the gaps identified in the previous section, presenting the theoretical ideas that make WMI-PA structure aware and their implementation in our novel algorithm, SA-WMI-PA-SK. §6 is devoted to an empirical evaluation of SA-WMI-PA-SK against the other existing approaches, considering both synthetic and real-world settings. In §7 we highlight how PA-based approaches are agnostic to the integrator chosen to compute the volume of the polytopes, and propose an adaptation of the algorithm to deal with approximate integration. Our conclusions and final considerations are presented in §8.

Related work
Traditionally, inference algorithms in hybrid graphical models either disallowed complex algebraic and logical constraints [26,27,28], or solved the problem approximately [29,30]. The use of piecewise polynomial densities when dealing with algebraic constraints was pioneered by Sanner and Abbasnejad [21]. The algorithm, called Symbolic Variable Elimination (SVE), first used a compact representation called eXtended Algebraic Decision Diagrams (XADD) [31] in probabilistic inference tasks. As described in detail in §4, these representations lie at the core of modern compilation-based WMI techniques. Probabilistic inference modulo theories [32] used similar representations together with a modified DPLL procedure. The resulting algorithm, called PRAiSE, showed superior performance with respect to both SVE and the original WMI [5] solver, which did not exploit any SMT enumeration technique. Follow-up works on SMT-based WMI capitalized on the advanced features of SMT solvers, obtaining state-of-the-art results in many benchmarks [6,19].
The substantial efficiency gains obtained by caching partial results in WMC [10,11] motivated their application in the hybrid case. When the dependencies induced by the algebraic constraints are limited, similar ideas can be applied to WMI [12]. As an alternative to SMT- or compilation-based approaches, WMI can be reduced to Belief Propagation when the algebraic dependencies involve at most two variables: an efficient message-passing algorithm on an acyclic pairwise factor graph can then solve the problem exactly [14]. This idea was later extended to approximate problems with circular dependencies [15]. In contrast with the work above, this paper focuses on the general WMI problem, where the above assumptions cannot be leveraged.
Approximate WMI algorithms exploit both SMT-based [17] and knowledge compilation-based [23] approaches. While generally applicable, the latter does not provide any guarantee on the approximation error. The former provides practical statistical guarantees in limited scenarios when constraints are expressed in conjunctive form. Instead, if the constraints can be expressed in disjunctive form, WMI admits a fully polynomial randomized approximation scheme (FPRAS) under mild assumptions [16].

Background
In this section, we introduce the theoretical background needed for a fully detailed comprehension of this paper. Notice that this section contains several SMT-related technical details which are needed only to understand the encodings, and as such can be skipped on a first reading.

SMT, AllSMT and Projected AllSMT
We assume the reader is familiar with the basic syntax, semantics, and results of propositional and first-order logic. We adopt the notation and definitions in [19], including some terminology and concepts from Satisfiability Modulo Theories (SMT), which we summarize below.
Assignment Enumeration. Given a theory T ∈ {LRA, LRA ∪ EUF}, TTA(ϕ) def= {µ 1 , . . . , µ j } denotes the set of T-satisfiable total assignments over both propositional and T-atoms that propositionally satisfy ϕ; TA(ϕ) def= {µ 1 , . . . , µ j } represents one set of T-satisfiable partial assignments over both propositional and T-atoms that propositionally satisfy ϕ, s.t. (i) every total assignment in TTA(ϕ) is a super-assignment of some partial one in TA(ϕ) and (ii) every pair µ i , µ j ∈ TA(ϕ) assigns opposite truth values to at least one atom. We remark that TTA(ϕ) is unique (modulo reordering), whereas multiple TA(ϕ)s are admissible for the same formula ϕ (including TTA(ϕ) itself). The disjunction of the truth assignments in TTA(ϕ), and that of TA(ϕ), are T-equivalent to ϕ. Thus, given ϕ(x, A ∪ B), TTA(∃x.∃B.ϕ) denotes the set of all total truth assignments µ A on A s.t. µ A |= ∃x.∃B.ϕ, and TA(∃x.∃B.ϕ) denotes one set of partial assignments µ A on A s.t. µ A |= ∃x.∃B.ϕ complying with conditions (i) and (ii) above, where ϕ is replaced by ∃x.∃B.ϕ. The disjunction of the assignments in TTA(∃x.∃B.ϕ) and that of TA(∃x.∃B.ϕ) are T-equivalent to ∃x.∃B.ϕ. (As above, the definitions of TTA(∃x.ϕ(x, A))/TA(∃x.ϕ(x, A)) and of TTA(∃B.ϕ(A ∪ B))/TA(∃B.ϕ(A ∪ B)) follow with B = ∅ and x = ∅ respectively.) All these functionalities are provided by the SMT solver MathSAT [37]: in a nutshell, given ϕ and a subset of "relevant" atoms Ψ ⊆ Atoms(ϕ), TA(. . .) generates one-by-one the T-satisfiable partial assignments µ i restricted to the atoms in Ψ.
Weighted Model Integration. The Weighted Model Integral of w(x, A) over ϕ(x, A) is defined as [19]:

WMI(ϕ, w|x, A) def= Σ_{µ A} WMI nb (ϕ [µ A ] , w [µ A ] |x),

where the µ A s are the total truth assignments on A, and WMI nb (ϕ, w|x) is the integral of w(x) over the set {x | ϕ(x) is true} ("nb" means "no-Booleans").
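The distinction between TTA and TA can be made concrete in the purely Boolean case. The following sketch (plain Python; formula and names are our own illustration) checks conditions (i) and (ii) for a hand-picked TA(ϕ) of ϕ = A 1 ∨ A 2:

```python
from itertools import product

# Boolean-only illustration of TTA vs TA for phi = A1 ∨ A2:
# TTA(phi) is the set of all total models; TA(phi) is one admissible set
# of partial assignments satisfying conditions (i) and (ii).
def total_models(formula, atoms):
    return [dict(zip(atoms, vals))
            for vals in product([True, False], repeat=len(atoms))
            if formula(dict(zip(atoms, vals)))]

def extends(total, partial):
    return all(total[a] == v for a, v in partial.items())

def disagree(mu1, mu2):
    return any(a in mu2 and mu2[a] != v for a, v in mu1.items())

atoms = ["A1", "A2"]
phi = lambda m: m["A1"] or m["A2"]
TTA = total_models(phi, atoms)
TA = [{"A1": True}, {"A1": False, "A2": True}]

assert len(TTA) == 3
# (i) every total model extends some partial assignment in TA:
assert all(any(extends(t, p) for p in TA) for t in TTA)
# (ii) every pair of partial assignments disagrees on at least one atom:
assert all(disagree(p, q) for p in TA for q in TA if p is not q)
```

Note that TTA itself is also an admissible TA, but the two-element partial set above covers the same three models more compactly, which is exactly the saving exploited later in the paper.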
We call a support of a weight function w(x, A) any subset of R N × B M outside of which w(x, A) = 0, and we represent it as a LRA-formula χ(x, A). We recall that, consequently, WMI(ϕ ∧ χ, w|x, A) = WMI(ϕ, w|x, A).
We consider the class of feasibly integrable on LRA (FI LRA ) functions w(x), which contain no conditional component and for which there exists some procedure able to compute WMI nb (µ LRA , w|x) for every set µ LRA of LRA literals on x. (E.g., polynomials are FI LRA .) Then we call a weight function w(x, A) feasibly integrable under LRA conditions (FIUC LRA ) iff it can be described in terms of a support LRA-formula χ(x, A) (⊤ if not provided) and a set Ψ of LRA conditions, in such a way that, for every total truth assignment µ Ψ to Ψ, w [µ Ψ ] (x) is total and FI LRA in the domain given by the values of x, A which satisfy (χ ∧ µ Ψ ). Intuitively, each µ Ψ describes a portion of the domain of w inside which w [µ Ψ ] (x) is FI LRA , and we say that µ Ψ identifies w [µ Ψ ] in w. In practice, we assume w.l.o.g. that FIUC LRA functions are described as combinations of constants, variables, standard arithmetic operators +, −, ·, condition-less real-valued functions (e.g., exp(·), . . . ), and conditional expressions of the form (If ψ i Then t 1i Else t 2i ) whose conditions ψ i are LRA formulas and whose terms t 1i , t 2i are FIUC LRA .
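The way a total assignment to the conditions identifies a condition-free branch w [µ Ψ ] can be sketched as follows (plain Python; the tuple-based term representation and names are ours, not the paper's implementation):

```python
# A weight term is either a condition-free callable on x, or a node:
# ("ite", psi, t1, t2) with psi a label for an LRA condition, or
# ("mul", t1, t2) for a product of subterms.
def residual(w, mu):
    """Reduce w under a truth assignment mu to its conditions, yielding a
    condition-free (FI_LRA-style) function of x."""
    if callable(w):
        return w
    op = w[0]
    if op == "ite":
        _, psi, t1, t2 = w
        return residual(t1 if mu[psi] else t2, mu)
    if op == "mul":
        f = residual(w[1], mu)
        g = residual(w[2], mu)
        return lambda x: f(x) * g(x)
    raise ValueError(op)

# w(x) = (If psi1 Then x^2 Else 1) * (If psi2 Then 2x Else 3)
w = ("mul",
     ("ite", "psi1", lambda x: x * x, lambda x: 1.0),
     ("ite", "psi2", lambda x: 2 * x, lambda x: 3.0))
w_mu = residual(w, {"psi1": True, "psi2": False})
assert w_mu(2.0) == 12.0  # x^2 * 3 at x = 2
```

Each assignment to {psi1, psi2} selects one of the four unconditioned branches of w, mirroring how µ Ψ identifies w [µ Ψ ] in w.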

WMI via Projected AllSMT
WMI-PA is an efficient WMI algorithm presented in [6,19] which exploits SMT-based predicate abstraction. Let w(x, A) be a FIUC LRA function as above. WMI-PA is based on the fact that

WMI(ϕ ∧ χ, w|x, A) = Σ_{µ A * ∈ TTA(∃x.ϕ * )} WMI nb (ϕ * [µ A * ] , w * [µ A * ] |x),  (4)
ϕ * def= ϕ ∧ χ ∧ Λ_{k=1}^{K} (B k ↔ ψ k ),  (5)

where B def= {B 1 , . . . , B K } are fresh propositional atoms and w * (x, A ∪ B) is the weight function obtained by substituting in w(x, A) each condition ψ k in Ψ with B k . The pseudocode of WMI-PA is reported in Algorithm 1. First, the problem is transformed (if needed) by labelling all conditions in Ψ occurring in w(x, A) with fresh Boolean atoms B, as above. After this preprocessing stage, the set M A * def= TTA(∃x.ϕ * ) is computed by projected AllSMT [20]. Then, the algorithm iterates over each Boolean assignment µ A * in M A * . ϕ * [µ A * ] is simplified by the Simplify procedure, which propagates truth values and applies logical and arithmetical simplifications. Then, if ϕ * [µ A * ] is already a conjunction of literals, the algorithm directly computes its contribution to the volume; otherwise, the set M LRA def= TA(ϕ * [µ A * ] ) is computed by AllSMT to produce partial assignments, and the algorithm iteratively adds to the volume the contribution WMI nb (µ LRA , w * [µ A * ] |x) for each µ LRA ∈ M LRA . See [19] for more details. Remark 1. Importantly, in the implementation of WMI-PA the potentially large sets M A * and M LRA are not generated explicitly; rather, their elements are generated, integrated, and then dropped one-by-one, so as to avoid space blow-up. For example, steps like "M A * ← TTA(. . . ); for µ A * ∈ M A * . . . " in Algorithm 1 should be read as "for µ A * ∈ TTA(. . .). . . ". (We keep the first representation for readability.) The same applies to all our novel algorithms in this paper.
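The overall shape of the WMI-PA loop can be sketched on a one-variable toy problem (plain Python; brute-force Boolean enumeration stands in for projected AllSMT, and a closed-form polynomial integral stands in for the integration backend):

```python
def integrate_poly(coeffs, lo, hi):
    """Exact integral of sum_k coeffs[k] * x^k over [lo, hi]; a stand-in
    for the symbolic integration backend invoked by WMI-PA."""
    antider = lambda t: sum(c * t ** (k + 1) / (k + 1)
                            for k, c in enumerate(coeffs))
    return antider(hi) - antider(lo)

# Toy problem: chi = (0 <= x <= 2), phi = A -> (x >= 1),
# w(x, A) = x if A else 1.  The loop mirrors the WMI-PA scheme:
# enumerate Boolean assignments, then integrate the residual weight.
vol = 0.0
for A in (True, False):
    lo = 1.0 if A else 0.0                 # phi[A] ∧ chi reduces to [lo, 2]
    coeffs = [0.0, 1.0] if A else [1.0]    # w[A](x) = x, or constant 1
    vol += integrate_poly(coeffs, lo, 2.0)

assert abs(vol - 3.5) < 1e-9   # integral of x on [1,2] plus 1 on [0,2]
```

The real algorithm differs in that the Boolean assignments are enumerated lazily by the SMT solver and each residual may require a further AllSMT enumeration of LRA assignments, but the sum-of-integrals structure is the same.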

Hybrid Probabilistic Inference via WMI
The main application of WMI is marginal inference on weighted SMT theories. Similarly to WMC, inference can be reduced to the computation of two weighted model integrals:

Pr(∆ | χ) = WMI(∆ ∧ χ, w|x, A) / WMI(χ, w|x, A).

The denominator above is akin to computing the partition function on probabilistic models with unnormalized factors. Crucially, the formulas ∆ and χ are arbitrary, possibly encoding complex non-convex regions of a hybrid space. This goes beyond the kind of queries that are supported by more traditional algorithms like Belief Propagation, being particularly beneficial in contexts where it is necessary to compute probabilities of complex properties, like those arising in (probabilistic) formal verification domains. Furthermore, the use of specialized software to deal with constraints yields state-of-the-art results on standard queries when the support of the distribution is highly structured, as shown by the competitive results obtained by reducing inference on discrete models to WMC [38].
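A toy instance of this ratio of two weighted model integrals, in one continuous dimension with weight w(x) = x (our own example, computed in closed form):

```python
def wmi_interval(antideriv, lo, hi):
    """1-D weighted model integral over an interval, given an
    antiderivative of the weight (stand-in for the general backend)."""
    return antideriv(hi) - antideriv(lo) if hi > lo else 0.0

W = lambda t: t * t / 2          # antiderivative of w(x) = x
chi = (0.0, 2.0)                 # support: 0 <= x <= 2
delta = (1.5, float("inf"))      # query: x >= 1.5

num = wmi_interval(W, max(chi[0], delta[0]), min(chi[1], delta[1]))
den = wmi_interval(W, *chi)      # the "partition function"
prob = num / den
assert abs(prob - 0.4375) < 1e-9
```

In general ∆ and χ are arbitrary SMT formulas rather than intervals, so the numerator and denominator each require a full WMI computation.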

Analysis of State-Of-The-Art WMI Techniques
We start by presenting an analysis of the KC and WMI-PA approaches, which represent the current state-of-the-art in WMI.

Knowledge Compilation
We start our analysis by noticing a major problem with existing KC approaches for WMI [23,24]: they can easily blow up in space even with simple weight functions. Consider, e.g., the case in which

w(x, A) def= Π_{i=1}^{N} (If ψ i Then w i1 (x) Else w i2 (x)),  (7)

where the ψ i s are LRA conditions on {x, A} and the w i1 , w i2 are generic functions on x. First, the decision diagrams do not interleave arithmetic and conditional operators; rather, they push all the arithmetic operators below the conditional ones. Thus, with (7) the resulting decision diagrams consist of 2 N branches on the ψ i s, each corresponding to a distinct unconditioned weight function of the form Π_{i=1}^{N} w i j_i (x) with j_i ∈ {1, 2}. See Figure 1 for an example for N = 3 and ψ i def= A i , i ∈ {1, 2, 3}. Second, the decision diagrams are built on the Boolean abstraction of w(x, A), s.t. they do not eliminate a priori the useless branches consisting of LRA-unsatisfiable combinations of ψ i s, which can be up to exponentially many.
With WMI-PA, instead, the representation of (7) does not grow in size, because FIUC LRA functions allow for interleaving arithmetic and conditional operators. Also, the SMT-based enumeration algorithm does not generate LRA-unsatisfiable assignments on the ψ i s. This fact has been empirically confirmed and is shown in Figure 2, where we plot in logarithmic scale the number of nodes of a KC-based encoding of (7) using XADD for problems of increasing size, compared with the size of the encoding used by WMI-PA. This graph clearly shows the exponential blow-up of XADD size, whereas the size of the encoding used by WMI-PA grows linearly. We stress the fact that (7) is not an artificial scenario: rather, e.g., this is the case of the real-world logistics problems in [19].
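The asymmetry can be sketched by explicitly materializing the 2^N unconditioned products a decision diagram would create for a weight of the shape (7), against the N tautological splits used by the flat encoding (illustrative Python; the string labels are ours):

```python
from itertools import product

# For w = prod_{i=1..N} (If psi_i Then w_i1 Else w_i2), as in (7),
# pushing arithmetic below the conditions yields one unconditioned
# product per branch, i.e. 2^N leaves; a flat skeleton-style encoding
# instead adds only N tautological splits.
def dd_branches(n):
    return [tuple(f"w_{i + 1}{'1' if b else '2'}"
                  for i, b in enumerate(bits))
            for bits in product([True, False], repeat=n)]

assert len(dd_branches(3)) == 8                  # 2^3 diagram leaves
assert dd_branches(1) == [("w_11",), ("w_12",)]

skeleton = [f"(psi_{i} | ~psi_{i})" for i in range(1, 4)]
assert len(skeleton) == 3                        # linear in N
```

For N = 20 the diagram already has over a million leaves, while the flat encoding has 20 clauses, matching the linear-vs-exponential gap shown in Figure 2.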

WMI-PA
We continue our analysis by noticing a major problem also for the WMI-PA algorithm: unlike the KC approaches, it fails to leverage the conditional structure of the weight function to prune the set of models to integrate over. We illustrate the issue by means of a simple example (see Figure 3).
Example 1. Let ϕ ∧ χ be as in Figure 3(a) and let w(x, A) be a tree-structured weight function defined as in Figure 3. To compute WMI(ϕ ∧ χ, w|x, A), only six integrals need to be computed. WMI-PA, instead, builds the formula ϕ * (Figure 3(c)) and the weight function w * (x, A ∪ B) shown in Figure 3(d). Then, by applying TTA(∃x.ϕ * ) (line 2) we obtain 24 total assignments M A * on A ∪ B, as shown in Figure 3.
The key issue about WMI-PA is that the enumeration of TTA(∃x.ϕ * ) in (4) (line 2 in Algorithm 1) is not aware of the conditional structure of the weight function w * ; in particular, it is not aware of the fact that often partial assignments to the set of conditions in w * (both in A and B, i.e., both original Boolean and LRA conditions in w) are sufficient to identify the value of w * , so that it is forced to enumerate all LRA-satisfiable total assignments extending them. This has two consequences. First, to make the enumerator split also on the conditions in Ψ, WMI-PA renames them with fresh Boolean atoms B, conjoining Λ_{k=1}^{K} (B k ↔ ψ k ) to ϕ ∧ χ (line 1). Unfortunately, the above equivalences force the enumerator to assign a truth value to all the B i s in B (hence, to all LRA-conditions ψ i s in Ψ \ A) in every assignment, even when the conditional structure of w does not need it. This is what forces the unnecessary partitioning of integrals into subregions.
(E.g., in Example 1, the integral of x 1 ^2 x 2 is unnecessarily split into two integrals for (x 2 ≥ 2) and ¬(x 2 ≥ 2), due to the unnecessary branch on (x 2 ≥ 2).) Second, not knowing w * , the enumerator is forced to always assign also all original Boolean atoms in A, even when the combination µ A ∪ µ B of a partial assignment µ A on A and a total assignment µ B on B suffices to satisfy ϕ * and to make w * FI LRA , that is, TTA(∃x.ϕ * ) is used instead of TA(∃x.ϕ * ) in (4) and in line 2 of Algorithm 1. This is what causes the unnecessary duplication of integrals. (E.g., in Example 1, each of the first 6 integrals is computed twice, for {A 1 , A 2 , . . .} and {A 1 , ¬A 2 , . . .}, due to the unnecessary branching on {A 2 , ¬A 2 }.) Although the latter problem could in principle be fixed by caching all the values of the integrals, this could be expensive in terms of both time and memory.
To cope with these issues, we need to modify WMI-PA to make it aware of the conditional structure of w. (One further issue, dealing with the fact that the conditions in Ψ might not be literals, will be addressed in §5.3.2.)

Making WMI-PA Weight-Structure Aware
In order to address the issues described in §4, we aim to combine the best of KC approaches -i.e., weight-structure awareness-with the best of PA-based approaches -i.e., SMT-based enumeration-by introducing weight-structure awareness into PA-based WMI. In §5.1 we present the general idea for making WMI-PA weight-structure aware, by introducing and exploiting the notion of conditional skeleton for w. In §5.2, for the sake of compactness, we only summarize the first preliminary approach and algorithm (namely "SA-WMI-PA"), based on an implicit enumeration of a skeleton, which we presented in the conference version of this paper [1] and which is described in full detail in Appendix A. In §5.3 we propose our novel and current best approach and algorithm (namely "SA-WMI-PA-SK"), which is both much simpler and more effective than that from [1], in which the skeleton is generated explicitly.

General Idea: Exploiting a Conditional Skeleton of w
We start by introducing the following concept.
Definition 1 (Conditional skeleton of w, sk(w)). Let ϕ and χ be as above; let w(x, A) be FIUC LRA on the set of conditions Ψ. We call a Conditional Skeleton of w, written sk(w), any LRA formula s.t.: (a) its atoms are all and only those occurring in the conditions in Ψ; (b) it is LRA-valid; (c) for every partial truth assignment µ on the conditions in Ψ which propositionally satisfies sk(w), w [µ] is FI LRA .
Example 2. Consider the problem in Example 1. Then the following formula is a conditional skeleton sk(w) for w(x, A):
Assume we have produced one such formula sk(w). Unlike with WMI-PA, we do not rename with B the conditions Ψ in w(x, A); rather, we conjoin sk(w) to ϕ ∧ χ. Unlike with the conjunction Λ_{k=1}^{K} (B k ↔ ψ k ), which needs total assignments to Ψ (i.e., to B) to be satisfied, for sk(w) it suffices to produce partial assignments to Ψ which verify condition (c), i.e., those assignments µ s.t. w [µ] is FI LRA . Thus, we can rewrite (4)-(5) into:

WMI(ϕ ∧ χ, w|x, A) = Σ_{µ ∈ TA(ϕ * * * )} 2^{|A\µ|} · WMI nb (ϕ * * * [µ] , w [µ] |x),  (8)
ϕ * * * def= ϕ ∧ χ ∧ sk(w).  (9)

(Notice that in (8) µ is in TA(ϕ * * * ), not TTA(ϕ * * * ), that is, we enumerate partial assignments µ for ϕ * * * and, in particular, partial assignments for Ψ.) Condition (c) in Definition 1 guarantees that WMI nb (ϕ * * * [µ] , w [µ] |x) in (8) can be directly computed even though µ is partial, without further partitioning due to irrelevant conditions. The 2^{|A\µ|} factor in (8), where |A \ µ| is the number of Boolean atoms in A that are not assigned by µ, reflects the fact that, if A i ∈ A is not assigned in µ, then WMI nb (ϕ * * * [µ] , w [µ] |x) must be counted twice, because µ represents the two assignments µ ∪ {A i } and µ ∪ {¬A i }, which would produce two identical integrals, since their LRA component and the branch of the weight function they identify are identical. When |A\µ| > 0, this allows us to compute only one integral rather than 2^{|A\µ|} ones.
Notice that the latter is only one of the computational advantages of (8)-(9) w.r.t. (4)-(5): even more importantly, (8)-(9) allow for enumerating partial assignments also on the LRA-conditions in Ψ \ A, reducing the number of integrals to compute for each partial assignment µ by a 2^j factor, j being the number of unassigned LRA atoms in µ. (In this case, no multiplication factor should be considered, because an unassigned LRA condition ψ i in µ causes the merging of the two integration domains of µ ∪ {ψ i } and µ ∪ {¬ψ i }.)
Example 3. Consider the problem in Example 1 and the corresponding sk(w) formula in Example 2. Let ϕ * * * def= ϕ ∧ χ ∧ sk(w), as in (9). We show how TA(ϕ * * * ) in (8) can be produced by the SMT solver. Assume nondeterministic choices are picked following the order A 1 , A 2 , (x 1 ≥ 1), (x 2 ≥ 1), (x 2 ≥ 2), assigning positive values first. Then in the first branch a satisfying total truth assignment µ 1 is generated (part of it being assigned deterministically to satisfy the ϕ ∧ χ part), from which the solver extracts the minimal partial truth assignment which evaluates sk(w) to true. In the next branch, another minimal partial assignment µ 2 is produced, s.t. µ 1 , µ 2 assign opposite truth values to (at least) one atom. Proceeding this way, the algorithm enumerates a collection of minimal partial assignments.
Notice that logic-wise sk(w) is non-informative, because it is a LRA-valid formula (condition (b) in Definition 1). Nevertheless, the role of sk(w) is not only to "make the enumerator aware of the presence of the conditions Ψ" -like the Λ_{k=1}^{K} (B k ↔ ψ k ) in WMI-PA-but also to mimic the conditional structure of w, forcing every assignment µ to assign truth values to all and only those conditions in Ψ which are necessary to make w [µ] FI LRA , and hence make WMI nb (ϕ * * * [µ] , w [µ] |x) directly computable, without further partitioning.
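The role of the 2^{|A\µ|} factor in (8) can be checked on a toy case (plain Python, made-up integral values): summing over partial assignments with the multiplier gives the same result as summing over all total extensions.

```python
from itertools import product

# A partial Boolean assignment mu stands for all its total extensions,
# whose integrals coincide because the (made-up) integral ignores the
# unassigned atom A2; the 2^{|A \ mu|} factor accounts for them.
A = ["A1", "A2"]
integral = lambda m: 1.5 if m["A1"] else 2.0   # stand-in for WMI_nb

total_sum = sum(integral(dict(zip(A, v)))
                for v in product([True, False], repeat=len(A)))

partial = [{"A1": True}, {"A1": False}]        # a TA-style enumeration
partial_sum = sum((2 ** (len(A) - len(mu))) * integral({"A2": True, **mu})
                  for mu in partial)

assert total_sum == partial_sum == 7.0
```

Here two integrals replace four; with many unassigned atoms the saving grows exponentially, as noted above.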
In principle, we could use as sk(w) (a LRA formula encoding the conditional structure of) an XADD or (F)XSDD, which do not suffer from the lack of structure awareness of WMI-PA. However, this would cause a blow-up in size, as discussed in §4.1. Hence, avoiding a blow-up in the size of sk(w) is a key issue in our approach, which we address in the following steps.

Implicit Generation of a Conditional Skeleton
Here we briefly summarize the first preliminary approach we proposed in the conference paper [1]. A fully-detailed description can be found in Appendix A.
Encoding. The conditional skeleton sk(w) we used in [1] is not generated explicitly. Rather, we have defined it as an existentially-quantified formula sk(w) def = ∃y. y = w EU F , where y = w EU F is a LRA ∪ EUF-formula on A, x, y s.t. y def = {y, y 1 , . . . , y k } is a set of fresh LRA variables. Thus, we can compute TA(ϕ ∧ χ ∧ ∃y. y = w EU F ) as TA(∃y.(ϕ ∧ χ ∧ y = w EU F )) because the y do not occur in ϕ ∧ χ, with no need to generate sk(w) explicitly.
In a nutshell, y = w EU F is obtained by taking (y = w), s.t. y is fresh, and recursively substituting bottom-up every conditional term (If ψ i Then t i1 Else t i2 ) in it with a fresh variable y i ∈ y, adding the definition (y i = (If ψ i Then t i1 Else t i2 )) to the formula. Terms representing non-linear arithmetic operators (e.g., multiplication) or functions (e.g., sin, exp) are then substituted with uninterpreted functions, in order to avoid non-linear terms in the formula. Intuitively, each partial assignment satisfying ∃y. y = w EU F represents a branch of the weight function, since it allows to uniquely identify a value for y. (See Appendix A for an example and more details.) Notice that the idea of renaming if-then-else terms with fresh quantified variables is what avoids the space blow-up of KC approaches, s.t. y = w EU F grows linearly in size w.r.t. the size of w. E.g., with (7), y = w EU F is (y = y 1 · . . . · y N ) ∧ Λ_{i=1}^{N} (y i = (If ψ i Then w i1 (x) Else w i2 (x))) -where the products and other non-linear functions are abstracted into uninterpreted functions-whose size grows linearly rather than exponentially w.r.t. N.
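The bottom-up labelling of conditional terms can be sketched as follows (plain Python; the tuple-based term representation and variable names are ours, and the EUF abstraction of non-linear operators is omitted):

```python
# Sketch of building y = w_EUF: every (If psi Then t1 Else t2) term is
# replaced bottom-up by a fresh variable y_i with a defining constraint,
# so the encoding grows linearly in the size of w.
def label_ites(term, defs):
    if isinstance(term, tuple) and term[0] == "ite":
        _, psi, t1, t2 = term
        a = label_ites(t1, defs)
        b = label_ites(t2, defs)
        y = f"y{len(defs) + 1}"
        defs.append((y, psi, a, b))    # y = (If psi Then a Else b)
        return y
    if isinstance(term, tuple):        # operator node, e.g. ("*", t1, t2)
        return (term[0],) + tuple(label_ites(t, defs) for t in term[1:])
    return term                        # constant / variable leaf

defs = []
w = ("*", ("ite", "psi1", "x", "1"), ("ite", "psi2", "2*x", "3"))
flat = label_ites(w, defs)
assert flat == ("*", "y1", "y2") and len(defs) == 2
```

One definition per conditional term is produced, regardless of how the conditionals combine, which is why the encoding stays linear where a diagram would branch exponentially.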
The SA-WMI-PA Procedure. We briefly summarize how the enumeration procedure used by SA-WMI-PA works (see Algorithm 5 in Appendix A for details). We first extend ϕ ∧ χ into ϕ * * def= ϕ ∧ χ ∧ y = w EU F . Then, as with WMI-PA, we enumerate the assignments in two main steps:
1. We first generate a set M A * of partial assignments µ A * over the Boolean atoms A, s.t. ϕ * * [µ A * ] is LRA-satisfiable and does not contain Boolean atoms anymore. (About assignment enumeration, recall Remark 1.) Unlike with WMI-PA, these assignments are generated in two steps. We first enumerate partial Boolean assignments by invoking TA(∃x.∃y.ϕ * * ) and, for each assignment µ A , we build the (simplified) residual ϕ * * [µ A ] . Since µ A is partial, however, ϕ * * [µ A ] is not guaranteed to be free of Boolean atoms A. If it is, we simply add µ A to M A * ; otherwise we further invoke TTA(∃x.∃y.ϕ * * [µ A ] ) to assign the remaining Boolean atoms, ensuring that the residual now contains only LRA atoms. Each assignment µ A residual to the remaining atoms is then conjoined with µ A , and µ A ∧ µ A residual is added to M A * .
2. The second step is nearly identical to the "for" loop in WMI-PA: for each µ A * ∈ M A * , the partial LRA assignments of the residual are enumerated and the corresponding integrals computed. Unlike with WMI-PA, since µ A * can be partial and thus represent 2^{|A\µ A * |} total assignments, we multiply each integral by a 2^{|A\µ A * |} factor (as in (8)) and add it to the result.

Encoding a Conditional Skeleton Explicitly
We propose a novel approach, which is both much simpler and more effective than that from [1], in which the skeleton is generated explicitly.

Encoding
Algorithm 2 shows the procedure for building a formula sk(w) explicitly. Intuitively, the output consists of a conjunction of formulas, each stating "if all the conditions ψ 1 , . . . , ψ i of the current sub-branch in w hold, then split on the next condition ψ i+1 in the branch". For example, for the problem in Example 1, Algorithm 2 produces as sk(w) the formula reported in Example 2, whose satisfying partial assignments can be enumerated as in Example 3.
The recursive procedure Convert SK (w i , conds) takes as input the current subtree w i of the weight function w and the set of literals conds representing the conditions under whose scope w i occurs, and returns the CNF representation of ( Λ_{ψ i ∈conds} ψ i ) → sk(w i ) (i.e., it returns a conjunction of clauses of the form ( V_{ψ i ∈conds} ¬ψ i ) ∨ sk(w i )). Hence, Convert SK (w, ∅) returns sk(w). Notice that the set conds contains the conditions that need to be true in order to activate the current branch. We first focus on lines 1-12, in which we assume that the weight conditions ψ i are literals (the behaviour in case of non-literal conditions, lines 14-18, will be explained in §5.3.2). The recursive encoding of sk(w) works as follows: (Lines 1-2): A constant or a polynomial does not contain weight conditions. Thus, we can simply encode it with ⊤. Notice that here conds has no role, because ( Λ_{ψ i ∈conds} ψ i ) → ⊤ reduces to ⊤.
(Lines 3-4): A term representing an arithmetic operator w 1 ⊗ w 2 , with ⊗ ∈ {+, −, ·}, must ensure that the conditions in both subterms w 1 , w 2 are enumerated. Thus, we encode it by conjoining the results of the conversion procedure on w 1 and w 2 . (Line 9): When all conds are true, then ψ must be split upon. This fact is encoded by the valid clause ( V_{ψ i ∈conds} ¬ψ i ) ∨ ψ ∨ ¬ψ. (Line 10): When all conds are true and ψ is true, all the branches of w 1 must be enumerated. This is encoded by recursively calling the conversion procedure on w 1 , adding ψ to the conditions conds that need to be true to activate the branch of w 1 , which returns clauses of the form ( V_{ψ i ∈conds} ¬ψ i ) ∨ ¬ψ ∨ sk(w 1 ). (Line 11): Similarly, when all conds are true and ψ is false, all the branches of w 2 must be enumerated. This is encoded by recursively calling the conversion procedure on w 2 , adding ¬ψ to conds, which returns clauses of the form ( V_{ψ i ∈conds} ¬ψ i ) ∨ ψ ∨ sk(w 2 ).
As with ∃y. y = w EU F and unlike with KC approaches, it is easy to see that sk(w) grows linearly in size w.r.t. w. E.g., with (7), sk(w) is N i=1 (ψ i ∨ ¬ψ i ), whose size grows linearly rather than exponentially w.r.t. N .
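Under the literal-conditions assumption, the recursion of Convert SK can be sketched as follows (plain Python; the clause/literal encodings are ours and the line numbers of Algorithm 2 are not mirrored):

```python
# Sketch of Convert_SK for literal conditions: returns a CNF (list of
# clauses, each a list of literal strings; "~" marks negation) that
# mimics the conditional structure of the weight term w.
def convert_sk(w, conds):
    if isinstance(w, (int, float, str)):
        return []                                 # constant / polynomial
    op = w[0]
    if op in ("+", "-", "*"):                     # arithmetic operator
        return convert_sk(w[1], conds) + convert_sk(w[2], conds)
    if op == "ite":                               # (If psi Then w1 Else w2)
        _, psi, w1, w2 = w
        neg = [c[1:] if c.startswith("~") else "~" + c for c in conds]
        split = neg + [psi, "~" + psi]            # valid clause: split on psi
        return ([split]
                + convert_sk(w1, conds + [psi])
                + convert_sk(w2, conds + ["~" + psi]))
    raise ValueError(op)

# For w of the shape (7) with N = 2, the skeleton is the conjunction of
# the two tautological splits (psi1 | ~psi1) and (psi2 | ~psi2):
w = ("*", ("ite", "psi1", "w11", "w12"), ("ite", "psi2", "w21", "w22"))
assert convert_sk(w, []) == [["psi1", "~psi1"], ["psi2", "~psi2"]]
```

For nested conditionals, each inner split clause is widened with the negations of the outer conditions, exactly as described for lines 10-11.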

Dealing with non-literal conditions
The above description of the skeleton algorithm assumes that the conditions in Ψ are single literals, either Boolean or LRA ones. In general this is not necessarily the case: for instance, it is common to have range conditions of the form (x_i ≥ l) ∧ (x_i ≤ u), or even more complex conditions. In these cases, the skeleton formula is not in CNF. In fact, CNF((⋀_i ψ_i) → ϕ) can be straightforwardly computed out of CNF(ϕ) only if the ψ_i s are literals, because this reduces to augmenting each clause in CNF(ϕ) with ⋁_i ¬ψ_i.
If this is not the case, then the CNF-ization either may cause a blow-up in size, if equivalence-preserving CNF-ization is applied, or it may require some variant of Tseitin CNF-ization [35]. SMT solvers typically rely on the second option for satisfiability checks. As far as enumeration is concerned, however, introducing new atomic labels that define complex sub-formulas can force the enumeration procedure to produce more partial assignments than necessary.
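When the ψ_i are literals, the guarded CNF-ization mentioned above amounts to prepending the negated guard literals to every clause. A small sketch (string literals with "!" for negation are our illustrative encoding):

```python
def guard_cnf(conds, cnf):
    """CNF((l1 & ... & lk) -> phi) when all the l_i are literals: simply add
    the negated guard literals to every clause of CNF(phi)."""
    neg = lambda l: l[1:] if l.startswith("!") else "!" + l
    guard = tuple(neg(c) for c in conds)
    return [guard + clause for clause in cnf]

# CNF((p & q) -> ((a | b) & c)): each clause gains !p and !q.
guarded = guard_cnf(("p", "q"), [("a", "b"), ("c",)])
```

This transformation is linear in the size of CNF(ϕ); it is only for non-literal guards that labelling becomes necessary.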
This is what happens with WMI-PA, and also with SA-WMI-PA, when the conditions in Ψ are non-literal. We illustrate this problem in Example 4; our proposed solution is then illustrated in Example 5.

Example 4. Consider a weight function w given by nested if-then-elses, s.t. the f_i(x)s are condition-less functions, which is graphically represented in Figure 4a. Notice that some conditions of the weight function are not literals. Thus, if we wrote ϕ*** ≝ ϕ ∧ χ ∧ sk(w) without taking this fact into account, then we would obtain a non-CNF formula to feed to the SMT solver.
The solver would apply Tseitin CNF-ization, labelling the subformulas with fresh Boolean atoms {B_1, B_2, B_3, B_4} as above, producing the CNF formula (11), and would then compute TA(ϕ***) as in (8). Consider, e.g., a partial truth assignment μ which would be sufficient to make w_[μ] FI^LRA, because it suffices to identify f_2(x). Unfortunately, μ does not suffice to evaluate (11) to true, because of the last six clauses, which represent the labelling definitions of B_2 and B_4. Thus, when computing TA(∃B.φ), the solver is forced to assign truth values also to other atoms, in particular to B_2 and B_4. Whereas B_4 can be consistently assigned only to false, due to the clause (¬B_4 ∨ (x_1 > 5)) and the fact that (x_1 ≤ 4) is true, B_2 can consistently assume both truth values, so that the solver is forced to branch on both B_2 and ¬B_2, which respectively force (x_2 > 2) and ¬(x_2 > 2), causing the unnecessary generation of two distinct integrals. (Notice that the atom (x_2 > 2) belongs to a condition in the "(x_1 ≤ 4) = ⊥" branch of w, whereas with μ we are actually considering the "(x_1 ≤ 4) = ⊤" branch.) This issue is graphically represented in Figure 4c, where we show the regions enumerated by the SA-WMI-PA procedure on the above problem, using either the implicit version of sk(w) as in the original algorithm or the non-CNF explicit version of sk(w): 11 convex regions are enumerated. Notice that the brown, red and grey areas are each split into three regions, whereas it is possible to split each of them into only two convex regions.
The source of the problem is that, with Tseitin CNF-ization, each "labelling definition" CNF_ts(B_i ↔ ψ_i) is conjoined to the rest of the formula, so that the SMT solver is forced to enumerate the different ways of satisfying it even when the rest of the assignment selects a branch which does not involve the condition ψ_i. Notice that the fact that B_i is implicitly treated as existentially quantified, so that it is not in the relevant-atom set for the projected enumerator (see §3.1), does not help, because ψ_i and the atoms in it are in the relevant-atom set, s.t. the enumerator is forced to split on their truth values anyway. (E.g., in Example 4, although B_2 is implicitly existentially quantified, the enumerator is forced to split on (x_2 > 2) anyway.)

To cope with this problem, we propose a variant of Tseitin CNF-ization for the skeleton in which each "labelling definition" CNF_ts(B ↔ ψ) and all occurrences of B are "local", that is, they occur in the formula only implied by the conditions in conds, so that they are simplified to true unless the assignment selects a branch which satisfies all conditions in conds. Thus, in Algorithm 2, lines 14-18, we substitute each non-literal condition ψ in branch and defs_i from lines 9-12 with a fresh Boolean atom B, and we add (⋁_{ψ_i ∈ conds} ¬ψ_i) ∨ CNF_pg(B ↔ ψ) instead of CNF_ts(B ↔ ψ) to the main formula, where CNF_pg(. . .) is the CNF-ization of [36], so that branch is tautologically equivalent to (⋀_{ψ_i ∈ conds} ψ_i) → ((B ∨ ¬B) ∧ CNF_pg(B ↔ ψ)). (We recall that all ψ_i s in conds are literals by construction, see Algorithm 2.) We use CNF_pg instead of CNF_ts here because the former generates fewer clauses and generally behaves better, although we could use CNF_ts without affecting correctness.

Example 5. Consider the problem of Example 4, where ϕ*** ≝ ϕ ∧ χ ∧ sk(w) is now built with the local labelling above: the fresh atoms B_i label the non-literal conditions as in Example 4, but their defining clauses occur only under the guards of the corresponding conds. (In the formula, the literals deriving from the labelling clauses introduced by CNF_pg(. . .) are coloured as the corresponding regions in Figure 4.)
We recall that the assignment in (12) evaluates to true all the clauses of ϕ***, so that the solver considers it a satisfying assignment for the formula.
In Figure 4d we show the regions enumerated by the SA-WMI-PA-SK procedure (§5.3.3), using as skeleton the "local" CNF version of sk(w). We see that 8 regions are enumerated, instead of the 11 of Figure 4c. In this case, 8 is also the minimum number of convex areas into which the problem can be split, because the 4 distinct coloured areas in Figure 4a are non-convex, and as such their integrals cannot be computed with a single integration each.
We stress the importance of the locality of labelling clauses in the enumeration of truth assignments: the SMT enumerator needs to branch on B (hence, on ψ) only in those branches in which all conds are true, that is, only in those branches where a truth value for the condition ψ is needed to define a branch of w. Notice that, if CNF_pg introduces additional labelling definitions, then the latter are also active only when all conds are true.
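Under the same hypothetical tuple-and-string encoding as in our earlier sketch, the local labelling of lines 14-18 can be sketched as follows: each non-literal condition is replaced by a fresh Boolean atom B, and its defining clauses are guarded by the negated conds so that they are active only in the relevant branch. For brevity we emit the two implication directions of B ↔ ψ directly instead of calling a full CNF_pg conversion.

```python
import itertools

_fresh = itertools.count()  # fresh-label counter (illustrative)

def neg(lit):
    return lit[1:] if lit.startswith("!") else "!" + lit

def label_condition(psi, conds):
    """Replace a non-literal conjunctive condition psi = ("and", l1, ..., lk)
    with a fresh Boolean atom B, returning B and its guarded defining clauses:
    being implied by conds, they are active only in the relevant branch."""
    b = f"B{next(_fresh)}"
    guard = tuple(neg(c) for c in conds)
    lits = psi[1:]
    # B -> (l1 & ... & lk): one guarded binary clause per conjunct.
    defs = [guard + (neg(b), l) for l in lits]
    # (l1 & ... & lk) -> B: a single guarded clause.
    defs.append(guard + tuple(neg(l) for l in lits) + (b,))
    return b, defs

# Label the range condition (x1 <= 4) & (x2 > 2) inside the branch guarded
# by the literal p1 (atom names are hypothetical).
b, defs = label_condition(("and", "x1<=4", "x2>2"), ("p1",))
```

An assignment falsifying p1 satisfies every clause in defs at once, so the enumerator never needs to split on B or on the atoms of ψ in that branch, which is exactly the locality property discussed above.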

The SA-WMI-PA-SK Procedure
We devised a new algorithm, namely SA-WMI-PA-SK, that explicitly encodes the weight function as sk(w) (§5.3.1), handles non-literal conditions effectively (§5.3.2), and introduces some further improvements in the enumeration, which we describe below. The outline of the procedure is shown in Algorithm 3. (To simplify the narration, here we assume that the encoding of sk(w) did not cause the introduction of fresh Boolean variables B, see §5.3.2; if this is not so, then TA(∃x.ϕ***) and TA(ϕ***_[μ^A]) in Algorithm 3 should be replaced with TA(∃x.∃B.ϕ***) and TA(∃B.ϕ***_[μ^A]), respectively.)

• (Lines 1-3) After producing sk(w) by Algorithm 2, we first enumerate by projected AllSMT a set M_A of partial truth assignments μ^A over A, s.t. M_A ≝ TA(∃x.(ϕ ∧ χ ∧ sk(w))).
• (Lines 5-14) For each μ^A ∈ M_A we compute and simplify the residual formula ϕ***_[μ^A]. As with WMI-PA and SA-WMI-PA, if ϕ***_[μ^A] is already a conjunction of literals, then we can simply compute and return its integral, multiplied by a 2^k factor, k being the number of unassigned Boolean atoms in μ^A. If this is not the case, we proceed by enumerating a set M of partial truth assignments μ = μ^{A′} ∧ μ^{LRA} over the remaining atoms of the formula, s.t. M ≝ TA(∃x.ϕ***_[μ^A]). Notice that, since μ^A is partial, ϕ***_[μ^A] could still contain Boolean atoms (see Examples 6 and 8), so that μ could contain some non-empty Boolean component μ^{A′}. We then compute the integral of the FI^LRA function w_[μ^A ∧ μ] over the convex area μ^{LRA}. As above, we need to multiply this integral by a 2^{k′} factor, k′ ≝ k − |μ^{A′}| being the number of unassigned Boolean atoms in μ^A ∧ μ.
As with WMI-PA and SA-WMI-PA, in the implementation (Lines 3-5 and 10-12) the sets M A and M are not generated explicitly; rather, their elements are generated, integrated and then dropped one-by-one, so as to avoid space blow-up (see Remark 1).
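The two-level enumeration just described can be sketched as the following loop. All the callables are hypothetical stand-ins, not the actual implementation: enumerate_ta for the projected AllSMT call, integrate for the backend integrator, is_cube for the conjunction-of-literals test, and simplify for residual simplification.

```python
def sa_wmi_pa_sk(phi_sss, bool_atoms, enumerate_ta, integrate, is_cube, simplify):
    """Structural sketch of Algorithm 3's two-level loop; phi_sss stands for
    phi*** = phi & chi & sk(w). Assignments are lists of (atom, value) pairs."""
    vol = 0.0
    k = len(bool_atoms)
    # Outer loop: partial Boolean assignments mu_A in TA(exists x. phi***).
    for mu_a in enumerate_ta(phi_sss, bool_atoms):
        residual = simplify(phi_sss, mu_a)
        if is_cube(residual):
            # Already a conjunction of literals: one integral, weighted by
            # 2^(number of Boolean atoms left unassigned by mu_a).
            vol += 2 ** (k - len(mu_a)) * integrate(mu_a, residual)
            continue
        # Inner loop: partial assignments over the remaining atoms.
        for mu in enumerate_ta(residual, None):
            unassigned = k - len(mu_a) - sum(a in bool_atoms for a, _ in mu)
            vol += 2 ** unassigned * integrate(mu_a, mu)
    return vol
```

As in the text, each enumerated element is consumed immediately; nothing forces the sets M_A and M to be materialized.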
We remark that producing smaller partial assignments over the Boolean atoms, as done with SA-WMI-PA-SK, has a positive effect regardless of the structure of the weight function. In Example 6 we give an intuition of the benefits introduced by the improvements of the enumeration phase in SA-WMI-PA-SK.
Example 6. Consider the following WMI problem, where w is constant and Ψ = ∅, so that ⋀_i (B_i ↔ ψ_i), ∃y. y = w^EUF and sk(w) all reduce to ⊤, and ϕ* = ϕ** = ϕ*** ≝ ϕ ∧ χ; hence the kind of conditional skeleton used is irrelevant. Both WMI-PA and SA-WMI-PA enumerate the 7 assignments listed in Table 1a, whereas with SA-WMI-PA-SK the number of enumerated assignments shrinks to 5, as shown in Table 1b. With WMI-PA, TTA(∃x.ϕ*) directly generates in one step the 7 total truth assignments μ^A ∧ μ^A_residual in Table 1a.

Experimental Evaluation
In the following experiments, we evaluate the performance gain of SA-WMI-PA-SK over the current state of the art in WMI. Specifically, we explore the effect of the novel structure encoding using the same setting as our recent conference paper [1]. To this extent, we compare the proposed algorithms, SA-WMI-PA and SA-WMI-PA-SK, with the previous SMT-based approach WMI-PA [19] and with the state-of-the-art KC-based approaches XADD [22], XSDD and FXSDD [24]. For WMI-PA, SA-WMI-PA and SA-WMI-PA-SK, we use a slightly-modified version of MathSAT [37] for SMT enumeration. (We have added one option to MathSAT allowing us to disable the simplification of valid clauses, which would otherwise eliminate all the clauses "(... ∨ ψ_i ∨ ¬ψ_i)" from sk(w); the binary file integrating these modifications is attached to the code of SA-WMI-PA-SK. To compute TA(. . .), MathSAT is invoked with the options -dpll.allsat_allow_duplicates=false (all assignments differ for at least one truth value), -dpll.allsat_minimize_model=true (assignments are partial and are minimized), and -preprocessor.simplification=0 and -preprocessor.toplevel_propagation=false (disable several non-validity-preserving preprocessing steps); to compute TTA(. . .), the second option is set to false.) Integrals are computed with LattE [39] in the SMT-based approaches, whereas KC-based solvers use PSI [40] for both integration and inconsistency checking. This difference does not penalize the KC-based approaches, which were shown to achieve worse results when using LattE [24]. All experiments are performed on an Intel Xeon Gold 6238R @ 2.20GHz 28-core machine with 128 GB of RAM, running Ubuntu Linux 20.04.
The code of both SA-WMI-PA and SA-WMI-PA-SK is freely available at https://github.com/unitn-sml/wmi-pa, and the benchmarks are available at https://github.com/weighted-model-integration/wmi-benchmarks in pywmi [41] format. We report our findings using cactus plots. Each tick on the x-axis corresponds to a single instance, and results on the y-axis are sorted independently for each method. Steeper slopes mean lower efficiency.

Synthetic experiments
Dataset description. As in our previous paper, random formulas are obtained using the generator introduced by Morettin et al. [19]. In contrast with other recent works [22,24], these synthetic benchmarks do not contain strong structural regularities, offering a more neutral perspective on how the different techniques are expected to perform on average. We fix the number of Boolean and real variables to 3, while varying the depth of the weights in the range [4, 7]. The timeout is set to 3600 seconds.
Results. Consistently with previous papers, in this setting SMT-based solvers obtain superior performance with respect to KC-based ones (Figure 5, left), which suffer from the complex algebraic dependencies between continuous variables. Our structure-aware algorithms, SA-WMI-PA and SA-WMI-PA-SK, further drastically improve over WMI-PA in terms of both runtime and number of computed integrals (Figure 5). These results are aligned with our expectations: the advantage of structure awareness given by the introduction of either form of skeleton is striking.
Additionally, SA-WMI-PA-SK gains some advantage over its predecessor SA-WMI-PA, although the runtimes remain within the same order of magnitude. Notice that in this setting the number of Boolean atoms is limited to 3, so that the revised enumeration technique of §5.3.3 can have only a limited impact; its benefits (w.r.t. both SA-WMI-PA and WMI-PA) will be evident in the next experiment, where more-complex logical structures are involved.

Inference on Density Estimation Trees
Dataset description. We then consider the problem of computing arbitrary algebraic queries over Density Estimation Trees (DETs) [42]. DETs are the density-estimation equivalent of decision or regression trees, encoding piecewise-constant distributions over mixed continuous/discrete domains. Having only univariate conditions in the internal nodes, DETs support efficient density estimation and marginal inference over single variables. Answering algebraic queries over multiple variables, like Pr(X ≤ Y), requires instead marginalizing over an oblique constraint, a task that reduces to WMI. This type of query commonly arises when applying WMI-based inference to probabilistic formal verification tasks, when properties involve multiple continuous variables. The support χ_DET is obtained by conjoining the lower and upper bounds of each continuous variable, while w_DET encodes the density estimate by means of nested if-then-elses and non-negative constant leaves. Consistently with our previous conference paper [1], we train DETs on a selection of hybrid datasets from the UCI repository [43]. The datasets and resulting models are summarized in Appendix B. Following the approach of [44], discrete numerical features are relaxed into continuous variables, while n-ary categorical features are one-hot encoded with n binary variables. DETs are trained on each dataset using the standard greedy procedure [42], with bounds on the number of instances for each leaf set to [100, 200]. We investigate the effect of algebraic dependencies by generating increasingly complex queries involving a number of variables k = max(1, H · |x|), where H ∈ [0, 1] is the ratio of continuous variables appearing in the inequality. For each value of H, 5 random inequalities involving k variables are generated, for a total of 25 problem instances per dataset. Given the larger number of problems in this setting, the timeout has been decreased to 1200 seconds.
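To make the structure of w_DET concrete, the following toy encoding (ours, for illustration only, not the experiments' implementation) evaluates a DET as nested if-then-elses with univariate conditions and constant leaves:

```python
def det_weight(node, x):
    """Evaluate a DET's piecewise-constant density at point x. A node is
    either ("leaf", c), with c a non-negative constant, or
    ("split", i, t, left, right), reading 'If x[i] <= t Then left Else right'
    (univariate conditions only, as in the internal nodes of a DET)."""
    while node[0] == "split":
        _, i, t, left, right = node
        node = left if x[i] <= t else right
    return node[1]

# A tiny DET over (x0, x1); its support chi_DET would be the conjunction of
# per-variable bounds, its weight w_DET the nested if-then-else below.
det = ("split", 0, 4.0,
       ("split", 1, 2.0, ("leaf", 0.05), ("leaf", 0.20)),
       ("leaf", 0.10))
density = det_weight(det, [3.0, 1.0])
```

Marginals over a single variable follow the tree structure directly; it is oblique queries such as Pr(X ≤ Y), cutting across the axis-aligned leaves, that require the WMI machinery.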
Results. Figure 6 depicts the runtime of the algorithms for H ∈ {0.25, 0.5, 0.75, 1}. Thanks to the absence of arithmetic operations in the internal nodes, DETs are better suited for KC than the weight functions used in §6.1. While in the simplest inference cases (H ≤ 0.5) substantial factorization of the integrals is possible, when the coupling between variables increases this advantage is outweighed by the combinatorial reasoning capabilities of SMT-based approaches.
The efficacy of SA-WMI-PA-SK with respect to both SA-WMI-PA and WMI-PA is particularly evident on this batch of problems. Analyzing the number of integrals computed by the PA-based algorithms (Figure 7), we notice a drastic reduction in the number of integrals generated by SA-WMI-PA-SK. This behaviour is expected: the number of Boolean atoms is typically very high, so that, thanks to the revised enumeration strategy, we are able to drastically reduce the number of partial assignments over the Boolean component with respect to SA-WMI-PA. Crucially, in contrast with the recent SA-WMI-PA, the novel algorithm is able to compete with or even outperform KC-based solvers even when the oblique queries involve a low number of variables.

Extending WMI-PA with different integration approaches
A key feature of SMT-based solvers like WMI-PA, inherited by our structure-aware procedures, is that they are agnostic to the integration procedure used as a backend. Besides developing a novel structure encoding, we implemented a modular approach to PA-based solvers, allowing the use of different integration strategies, including approximate ones. This provides multiple advantages. First, the general integration procedure can be substituted with a specialized one, targeting specific weight families such as piecewise-constant densities or leveraging structural properties like full factorizability. Second, approximate integration procedures can be applied when the number of continuous variables makes exact integration of FIUC^LRA functions infeasible. Notably, the enumeration of convex polytopes is still exact, with the advantage that the approximation error does not grow multiplicatively. Third, it broadens the use of these solvers beyond the FIUC^LRA family, to functions whose integral over convex polytopes can be approximated with a finite sample, like multivariate Gaussian distributions or even the output of a simulation or any other black-box function.
We explored the practical benefits of this approach with a variant of SA-WMI-PA-SK using Monte Carlo (MC) approximate integration [45]. In practice, our MC integration procedure is implemented via VolEsti [46], a library for volume approximation and sampling in convex polytopes. VolEsti is used both for approximating Vol(μ^LRA) and for sampling w. We showcase the resulting solver, dubbed SA-WMI-PA-SK (VolEsti), in two scenarios: (i) scaling up weighted model integration with piecewise polynomial weights, and (ii) solving problems with Gaussian weights.
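The idea of MC integration over one enumerated polytope can be sketched as follows. VolEsti relies on far more sophisticated volume-estimation and sampling algorithms; this plain rejection sampler, with names of our own choosing, is for illustration only.

```python
import random

def mc_integrate(w, bounds, in_polytope, n=100_000, seed=0):
    """Sketch of MC integration of a weight w over a convex polytope mu^LRA:
    draw uniform samples from a bounding box and average w over the samples
    that fall inside the polytope (hit-or-miss estimate)."""
    rng = random.Random(seed)
    box_vol = 1.0
    for lo, hi in bounds:
        box_vol *= hi - lo
    acc = 0.0
    for _ in range(n):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        if in_polytope(x):
            acc += w(x)
    # Estimate of E[w * 1_polytope] over the box, rescaled by the box volume.
    return box_vol * acc / n

# Example: integrate w = 1 over the triangle {x, y >= 0, x + y <= 1} (area 1/2).
est = mc_integrate(lambda p: 1.0, [(0.0, 1.0), (0.0, 1.0)],
                   lambda p: p[0] + p[1] <= 1.0)
```

Since the polytopes themselves are enumerated exactly, the approximation error is confined to the per-polytope integrals, matching the remark above that it does not grow multiplicatively.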
For the first scenario, we consider the problems described in §6.1. Figure 8 displays the computational gain obtained with the above approximate MC integrator with respect to LattE's exact numerical approach (the one we used in §6). In all problems where an exact solution could be computed, the approximated version SA-WMI-PA-SK(VolEsti) consistently returned an approximation with less than 10% relative error at a fraction of the query execution time. We additionally equipped SA-WMI-PA-SK with an exact symbolic integrator. Noticeably, whereas symbolic approaches were shown to be preferable in conjunction with KC-based algorithms (see Figure 3 in [24]) and do not assume convex integration bounds, numerical integration performs substantially better with our approach.
For the second scenario, we consider the task of verifying fairness properties of probabilistic programs, borrowing the setting and examples from FairSquare [47]. In their setting, a (deterministic) decision program dec has to decide whether or not to hire a candidate, according to some features. The input of dec is distributed according to a Gaussian probabilistic program, called the population model pop, which additionally models sensitive conditions that are not accessed by dec, such as whether an individual belongs to a minority group. The goal is to verify whether dec satisfies a group fairness criterion under the distribution pop, that is, whether the probability of a decision being based on the sensitive condition is low enough (criterion (15)). Similarly to our approach, FairSquare encodes the verification problem in SMT(LRA) and reduces probabilistic inference to (unweighted) volume computations. In contrast with our approach, which directly approximates the ratio in (15) by sampling from Gaussians, FairSquare iteratively computes tighter lower and upper bounds via increasingly accurate axis-aligned approximations of the true distribution until convergence. We report our results on the same introductory examples used in the FairSquare paper, motivating future research efforts in the WMI-based verification of probabilistic programs. The detailed description of the programs considered here is reported in Appendix C.
With SA-WMI-PA-SK (VolEsti), we could indeed show that the initial dec does not respect the fairness criterion in (15) with ε = 0.1. This problem was addressed by making dec less sensitive to a feature that was directly influenced by the sensitive condition. This second version of the program is fair with high probability, as reported in Figure 9. Computing the ratios with SA-WMI-PA-SK (VolEsti) took negligible time (< 3s).
An advantage of the WMI-based approach with respect to FairSquare is that it is not limited to Gaussian priors on pop. Moreover, SMT-based approaches are a natural fit for highly combinatorial problems, such as those arising when dec is a complex classifier learned from data. A current limitation of this approach is that VolEsti does not directly offer statistical guarantees on the relative error. In problems like volume approximation [48] and sampling [49], practical statistical bounds on the error were obtained using diagnostics like the effective sample size (ESS) or the potential scale reduction factor (PSRF) [50]. The adoption of similar mechanisms in our MC integrator, a fundamental step towards the application of WMI in formal verification, and the evaluation of WMI-based verifiers on more realistic programs are left for future work.
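A toy version of the group-fairness check may clarify the quantity being verified. Both pop and dec below are hypothetical stand-ins for the FairSquare-style programs, and the forward sampler is illustrative; it is not the WMI-based computation used in the experiments.

```python
import random

def fairness_ratio(pop, dec, n=200_000, seed=0):
    """Estimate Pr(hire | minority) / Pr(hire) by forward-sampling the
    population model; this is one common form of the group-fairness ratio."""
    rng = random.Random(seed)
    hired = hired_min = minority = 0
    for _ in range(n):
        person = pop(rng)
        m = person["minority"]
        h = dec(person)
        minority += m
        hired += h
        hired_min += h and m
    return (hired_min / minority) / (hired / n)

# Toy population model: minority w.p. 0.3; a Gaussian score whose mean depends
# on the group, so thresholding the score is biased against the minority.
def pop(rng):
    m = rng.random() < 0.3
    return {"minority": m, "score": rng.gauss(0.0 if m else 1.0, 1.0)}

ratio = fairness_ratio(pop, lambda p: p["score"] > 0.5)
```

A fairness criterion of the kind discussed above would require the ratio to exceed 1 − ε; the toy dec here fails such a check, since the score distribution leaks the sensitive condition.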

Conclusion
The difficulty of dealing with densely coupled problems has been a major obstacle to a wider adoption of hybrid probabilistic inference technology beyond academia. In this paper, we made a significant step towards removing this obstacle. Starting from the identification of the limitations of existing solutions for WMI, we proposed a novel algorithm combining predicate abstraction with weight-structure awareness, thus unleashing the full potential of SMT-based technology. An extensive experimental evaluation showed the advantage of the proposed algorithm over current state-of-the-art solutions on both synthetic and real-world problems. These include inference tasks over constrained density functions learned from data, which highlight the potential of the approach as a flexible solution for probabilistic inference in hybrid domains.
Whereas the improvements described in this work drastically reduce the number of integrals required to perform WMI, the integration itself can be a bottleneck for scalability. To deal with this issue, we showed how SMT-based approaches can seamlessly incorporate multiple integration strategies, including symbolic, numeric and approximate ones, allowing one to choose the most appropriate strategy for the problem at hand. In particular, using approximate MC integration techniques can drastically improve performance at the cost of limited losses in precision.
We further showcased the potential utility of our approach on a prototypical application aimed at verifying the fairness of probabilistic programs, as a first step towards the application of WMI technology to the probabilistic verification of hybrid systems and machine learning models.

[Algorithm 4 (excerpt), lines 16-18:
16: defs ← defs ∧ ((⋁_{ψ_i ∈ conds} ¬ψ_i) ∨ ¬(y = w_1) ∨ ¬(y = w_2))
17: y ← y_1 ∪ y_2 ∪ {y}
18: return y, defs, y]

We are interested only in the truth values of the conditions Ψ in μ which make w_[μ] FI^LRA, regardless of the actual values of w_[μ]. Therefore, we can safely substitute condition-less arithmetic functions and operators with fresh uninterpreted function symbols, obtaining an LRA ∪ EUF formula y = w^EUF, which is relatively easy to solve by standard SMT solvers [9]. It is easy to see that a partial assignment μ evaluating y = w to true is satisfiable in the theory iff its corresponding assignment μ^EUF is LRA ∪ EUF-satisfiable. Thus, we can modify the enumeration procedure into TA(∃x.∃y.(ϕ ∧ χ ∧ y = w^EUF)).

Finally, we enforce the fact that the two branches of an if-then-else are alternative by adding a mutual-exclusion constraint ¬(y_i = t_i1) ∨ ¬(y_i = t_i2) to (A.1), so that the choice of the function is uniquely associated with the set of decisions on the ψ_i s. The procedure generating y = w^EUF is described in detail in Algorithm 4.
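The construction of y = w^EUF described above can be sketched as follows; the tuple terms, string names and symbols f_k(x), y_i are our own illustrative encoding, not the paper's implementation.

```python
import itertools

_ids = itertools.count()  # fresh-symbol counter (illustrative)

def euf_encode(w):
    """Sketch of the y = w^EUF encoding: condition-less subtrees become
    opaque uninterpreted symbols f_k(x), and each (If psi Then w1 Else w2)
    introduces a fresh variable y_i with two guarded equalities plus a
    mutual-exclusion constraint on the two branches."""
    defs = []
    def enc(node):
        if node[0] != "ite":
            # Constant/polynomial subtree: an uninterpreted function symbol.
            return f"f{next(_ids)}(x)"
        psi, t1, t2 = node[1], enc(node[2]), enc(node[3])
        y = f"y{next(_ids)}"
        defs.append((psi, f"{y} = {t1}"))          # psi  -> (y = t1)
        defs.append((f"!{psi}", f"{y} = {t2}"))    # !psi -> (y = t2)
        # The two branches are alternative: !(y = t1) | !(y = t2).
        defs.append(("mutex", f"!({y} = {t1}) | !({y} = {t2})"))
        return y
    return enc(w), defs

y, defs = euf_encode(("ite", "p", ("const", 1), ("const", 2)))
```

The mutual-exclusion entry mirrors the constraint added to (A.1): following the chain of true equalities from the returned top-level variable uniquely identifies the selected leaf.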
(Notice that, following the chains of true equalities, we have y = y_5 = y_3 = y_1 = f_11(x).) The SMT solver then extracts from it the subset {A_1, (x_1 ≥ 1), (x_2 ≥ 1)} restricted to the conditions in Ψ. The blocking clause ¬A_1 ∨ ¬(x_1 ≥ 1) ∨ ¬(x_2 ≥ 1) is then added to the formula, which prevents the same subset from being enumerated again. This forces the algorithm to backtrack and generate the next assignment. Notice that, in general, the assignments μ^{A*} are partial even if the steps in lines 10-11 are executed; the set of residual Boolean atoms in ϕ**_[μ^A] is a (possibly much smaller) subset of A \ μ^A, because some of them no longer occur in ϕ**_[μ^A] after the simplification, as shown in the following example.
We stress the fact that in our actual implementation, as in that of WMI-PA, the potentially-large sets M^{A*} and M^{LRA} are not generated explicitly. Rather, their elements are generated, integrated and then dropped one-by-one, so as to avoid space blow-up. We highlight two main differences w.r.t. WMI-PA. First, unlike with WMI-PA, the generated assignments μ^A on A are partial, each representing 2^{|A \ μ^A|} total ones. Second, the assignments on the (non-Boolean) conditions Ψ inside the μ^{LRA}s are also partial, whereas with WMI-PA the assignments to the Bs are total. This may drastically reduce the number of integrals to compute, as empirically demonstrated in the next section.

Table B.2 describes the datasets used to train the DETs considered in §6.2. For each dataset, we report the number of Boolean and continuous variables, the size of the training/validation splits, and the size of the resulting DET in terms of number of internal nodes (conditions).