Selfadhesivity in Gaussian conditional independence structures

Selfadhesivity is a property of entropic polymatroids which guarantees that the polymatroid can be glued to an identical copy of itself along arbitrary restrictions such that the two pieces are independent given the common restriction. We show that positive definite matrices satisfy this condition as well and examine consequences for Gaussian conditional independence structures. New axioms of Gaussian CI are obtained by applying selfadhesivity to the previously known axioms of structural semigraphoids and orientable gaussoids.


Introduction
In matroid theory, the term amalgam refers to a matroid in which two smaller matroids are glued together along a common restriction, similar to how four triangles can be glued together along edges to form the boundary of a tetrahedron. This concept is meaningful for conditional independence (CI) structures as well. The bridge from the geometric (matroid-theoretical) concept to probability theory (conditional independence) is built by Matúš (2007) who defines a special kind of amalgam, the adhesive extension, for polymatroids and proves that such extensions always exist for entropic polymatroids with a common restriction.
The purpose of this article is two-fold: First, it is to abstract further than polymatroids and to introduce a derived collection of amalgamation properties known as selfadhesivity for general conditional independence structures. Second, this general treatment of selfadhesivity is driven by its applications to Gaussian instead of discrete CI inference. The main result, Theorem 1, shows that also in the Gaussian setting adhesive extensions (of covariance matrices) exist and are even unique. We use the non-trivial gluability conditions implied by this result to derive new axioms for Gaussian conditional independence structures. These results are heavily based on computations. All source code and further details on computations are provided on the MathRepo website: https://mathrepo.mis.mpg.de/SelfadhesiveGaussianCI/.

arXiv:2205.07667v2 [cs.IT] 27 Jan 2023
Gaussian conditional independence. Let $N$ be a finite ground set indexing jointly distributed random variables $\xi = (\xi_i : i \in N)$. By convention, elements of $N$ are denoted by $i, j, k, \dots$ and subsets by $I, J, K, \dots$. Elements are identified with singleton subsets of $N$ and juxtaposition of subsets abbreviates set union. Thus, an expression such as $iK$ is shorthand for $\{i\} \cup K$ as a subset of $N$. The complement of $K \subseteq N$ is $K^c$.
We are mostly interested in Gaussian (i.e., multivariate normal) distributions. These distributions are specified by a small number of parameters, namely the mean vector $\mu \in \mathbb{R}^N$ and the covariance matrix $\Sigma \in \mathrm{PD}_N$. Throughout this article, "Gaussian" means "regular Gaussian", i.e., the covariance matrix is strictly positive definite. For positive semidefinite covariance matrices, which lie on the boundary of the PD cone, the CI theory is algebraically more complicated and valid inference properties for regular Gaussians can fail to be valid for singular ones; see Studený (2005), Section 2.3.6.
The following result summarizes basic facts from algebraic statistics relating subvectors of $\xi$ and their (positive definite) covariance matrices. It can be found, for instance, in §2.4 of Sullivant (2018). For $\Sigma \in \mathrm{PD}_N$ and $I, J, K \subseteq N$, let $\Sigma_{I,J}$ denote the submatrix with rows indexed by $I$ and columns by $J$. Submatrices of the form $\Sigma_K := \Sigma_{K,K}$ are principal. Dual to a principal submatrix is its Schur complement
$$\Sigma^K := \Sigma_{K^c} - \Sigma_{K^c,K} \Sigma_K^{-1} \Sigma_{K,K^c}.$$
Principal submatrices and Schur complements of positive definite matrices are also positive definite.
Theorem. Let $\xi$ be distributed according to the (regular) Gaussian distribution with mean $\mu \in \mathbb{R}^N$ and covariance $\Sigma \in \mathrm{PD}_N$. Let $K \subseteq N$.
- The marginal vector $\xi_K = (\xi_k : k \in K)$ is a regular Gaussian in $\mathbb{R}^K$ with mean vector $\mu_K$ and covariance $\Sigma_K$.
- Let $y \in \mathbb{R}^K$. The conditional $\xi_{K^c} \mid \xi_K = y$ is a regular Gaussian in $\mathbb{R}^{K^c}$ with mean vector $\mu_{K^c} + \Sigma_{K^c,K} \Sigma_K^{-1} (y - \mu_K)$ and covariance $\Sigma^K$.
- Let a Gaussian over $N = IJ$ be given with covariance $\Sigma \in \mathrm{PD}_{IJ}$. Then the marginal independence $[\xi_I \perp\!\!\!\perp \xi_J]$ holds if and only if $\Sigma_{I,J} = 0$.
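The conditioning formulas of the theorem can be sketched in a few lines of exact rational arithmetic. This is only an illustration: the helper names (`submatrix`, `matmul`, `inverse`, `condition`) and the example matrix are assumptions of this sketch and not part of the article's source code, and indices are 0-based positions in the matrix.

```python
from fractions import Fraction

def submatrix(S, rows, cols):
    """Submatrix of S with the given row and column indices."""
    return [[Fraction(S[r][c]) for c in cols] for r in rows]

def matmul(A, B):
    """Product of two list-of-lists matrices (B must have at least one row)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse(A):
    """Exact Gauss-Jordan inverse of an invertible square matrix."""
    n = len(A)
    M = [list(map(Fraction, row)) + [Fraction(int(r == c)) for c in range(n)]
         for r, row in enumerate(A)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def condition(mu, S, K, y):
    """Mean and covariance of xi_{K^c} given xi_K = y, for a nonempty index list K."""
    Kc = [i for i in range(len(S)) if i not in K]
    gain = matmul(submatrix(S, Kc, K), inverse(submatrix(S, K, K)))  # Sigma_{Kc,K} Sigma_K^{-1}
    shift = matmul(gain, [[Fraction(y[a]) - mu[k]] for a, k in enumerate(K)])
    mean = [mu[i] + shift[r][0] for r, i in enumerate(Kc)]
    corr = matmul(gain, submatrix(S, K, Kc))
    base = submatrix(S, Kc, Kc)
    # Schur complement: Sigma_{Kc} - Sigma_{Kc,K} Sigma_K^{-1} Sigma_{K,Kc}
    cov = [[base[r][c] - corr[r][c] for c in range(len(Kc))] for r in range(len(Kc))]
    return mean, cov

# Hypothetical example: condition the middle variable of a 3x3 covariance on y = 2.
S = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]
mean, cov = condition([0, 0, 0], S, [1], [2])
```

Note that the conditional covariance does not depend on the value `y`, which is why CI statements for Gaussians do not depend on the conditioning event.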
The general CI statement $[\xi_I \perp\!\!\!\perp \xi_J \mid \xi_K]$ is the result of marginalizing $\xi$ to $IJK$, conditioning on $K$ and then checking for independence of $I$ and $J$. The previous theorem implies the following algebraic CI criterion for regular Gaussians:
$$[\xi_I \perp\!\!\!\perp \xi_J \mid \xi_K] \iff \Sigma_{I,J} - \Sigma_{I,K} \Sigma_K^{-1} \Sigma_{K,J} = 0 \tag{$\perp\!\!\!\perp_1$}$$
$$\iff \operatorname{rk} \Sigma_{IK,JK} = |K|. \tag{$\perp\!\!\!\perp_2$}$$
The last equivalence follows from rank additivity of the Schur complement (see Zhang (2005)). Indeed, the matrix in $(\perp\!\!\!\perp_1)$ is the Schur complement of $K$ in $\Sigma_{IK,JK}$ and must have rank zero since the principal submatrix $\Sigma_K$ has full rank $|K|$ already because it is positive definite. In particular, the truth of a conditional independence statement does not depend on the conditioning event and it does not depend on the mean $\mu$.
Hence, for CI purposes in this article, we identify regular Gaussians with their covariance matrices $\Sigma \in \mathrm{PD}_N$. Rank additivity of the Schur complement also shows that the "≥" part of the rank condition in $(\perp\!\!\!\perp_2)$ always holds. Hence, the minimal rank $|K|$ is attained if and only if all minors of $\Sigma_{IK,JK}$ of size $|K| + 1$ vanish. But only a subset of these minors is necessary: by $(\perp\!\!\!\perp_1)$ the rank of $\Sigma_{IK,JK}$ is $|K|$ if and only if $\Sigma_{I,J} = \Sigma_{I,K} \Sigma_K^{-1} \Sigma_{K,J}$ holds. This is one polynomial condition for each $i \in I$ and $j \in J$, namely $\det \Sigma_{iK,jK} = 0$, again by Schur complement expansion of the determinant. These minors correspond to CI statements of the form $[\xi_i \perp\!\!\!\perp \xi_j \mid \xi_K]$. This proves the following "localization rule" for Gaussian conditional independence:
$$[\xi_I \perp\!\!\!\perp \xi_J \mid \xi_K] \iff \bigwedge_{i \in I,\, j \in J} [\xi_i \perp\!\!\!\perp \xi_j \mid \xi_K]. \tag{L}$$
Rules of this form go back to Matúš (1992). A weaker localization rule (L′) (discussed below) holds for all semigraphoids, whereas the one presented above can be proved for compositional graphoids. In both cases, a general CI statement is reduced to a conjunction of elementary CI statements $[\xi_i \perp\!\!\!\perp \xi_j \mid \xi_K]$ about the independence of two singletons.
We adopt the form $[I \perp\!\!\!\perp J \mid K]$ for CI statements $[\xi_I \perp\!\!\!\perp \xi_J \mid \xi_K]$ without the mention of a random vector. These symbols are treated as combinatorial objects and $\mathcal{A}_N$ denotes the set of all elementary CI statements $[i \perp\!\!\!\perp j \mid K]$ over $N$; a CI structure over $N$ is a subset of $\mathcal{A}_N$. The localization rule shows that the set $[\![\Sigma]\!] \subseteq \mathcal{A}_N$ of elementary CI statements which hold for $\Sigma$ encodes the entire set of true CI statements for a Gaussian with covariance matrix $\Sigma$, and with slight abuse of notation we employ statements such as $[I \perp\!\!\!\perp J \mid K] \in [\![\Sigma]\!]$.
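The determinantal criterion $\det \Sigma_{iK,jK} = 0$ makes the CI structure of a rational covariance matrix computable by brute force. The following is a minimal sketch under the assumption that statements are encoded as tuples `(i, j, frozenset(K))` with `i < j` and 0-based indices; the function names and the example matrix are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import combinations

def det(A):
    """Exact determinant by Gaussian elimination over the rationals."""
    A = [list(map(Fraction, row)) for row in A]
    n, d = len(A), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if A[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            A[c], A[piv] = A[piv], A[c]
            d = -d  # a row swap flips the sign
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return d

def ci_holds(S, i, j, K):
    """Test [i _||_ j | K] via the almost-principal minor det S_{iK,jK}."""
    rows, cols = [i] + list(K), [j] + list(K)
    return det([[S[r][c] for c in cols] for r in rows]) == 0

def ci_structure(S):
    """All elementary CI statements (i, j, K) that hold for the matrix S."""
    n, result = len(S), set()
    for i, j in combinations(range(n), 2):
        rest = [k for k in range(n) if k not in (i, j)]
        for size in range(len(rest) + 1):
            for K in combinations(rest, size):
                if ci_holds(S, i, j, K):
                    result.add((i, j, frozenset(K)))
    return result

# Hypothetical example: variables 0 and 2 are marginally uncorrelated.
S = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]
structure = ci_structure(S)
```

Exact arithmetic matters here because the test is a vanishing condition; with floating-point entries one would have to choose a tolerance.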
It is important to note in this context that we treat only pure CI statements, i.e., $[I \perp\!\!\!\perp J \mid K]$ where $I, J, K$ are pairwise disjoint. Any general CI statement with overlaps between the three sets decomposes, analogously to the localization rule, into a conjunction of pure CI statements and functional dependence statements. For a regular Gaussian, functional dependences are always false, so this is no restriction in generality. In particular, the general statement $[N \perp\!\!\!\perp M \mid L]$ with $L = N \cap M$, which frequently appears later, is equivalent to the pure statement $[N \setminus L \perp\!\!\!\perp M \setminus L \mid L]$.
Polymatroids and selfadhesivity. A polymatroid over the finite ground set $N$ is a function $h : 2^N \to \mathbb{R}$ assigning to every subset $K \subseteq N$ a real number, such that $h$ is
- normalized: $h(\emptyset) = 0$,
- isotone: $h(I) \le h(J)$ for $I \subseteq J$,
- submodular: $h(I) + h(J) \ge h(I \cup J) + h(I \cap J)$.
With the linear functional $(I, J|K) \cdot h := h(IK) + h(JK) - h(IJK) - h(K)$, submodularity on pairwise disjoint $I, J, K$ can be restated as $(I, J|K) \cdot h \ge 0$. If $h_\xi$ is the entropy vector of a discrete random vector $\xi$, i.e., $h_\xi(K)$ is the Shannon entropy of the marginal vector $\xi_K$, then it is a polymatroid and the quantity $(I, J|K) \cdot h_\xi$ is known as the conditional mutual information $I(\xi_I ; \xi_J \mid \xi_K)$. Its vanishing is equivalent to the conditional independence $[\xi_I \perp\!\!\!\perp \xi_J \mid \xi_K]$. Hence we may define the CI structure of a polymatroid as
$$[\![h]\!] := \{\, [i \perp\!\!\!\perp j \mid K] : (i, j|K) \cdot h = 0 \,\}.$$
CI structures of this form were introduced as semimatroids in Matúš (1994) and (equivalently, but based on properties of multiinformation instead of entropy vectors) as structural semigraphoids in Studený (1994). Again, per Matúš (1994) a localization rule holds for them which we use to interpret the containment of non-elementary CI statements:
$$[I \perp\!\!\!\perp J \mid K] \iff \bigwedge_{i \in I,\, j \in J} \; \bigwedge_{K \subseteq L \subseteq IJK \setminus ij} [i \perp\!\!\!\perp j \mid L]. \tag{L′}$$
This rule can be proved from the semigraphoid axioms and hence it holds true also for Gaussians. In this case, it is equivalent to the shorter rule (L) using the Intersection and Composition properties of Gaussians.
Matúš (2007) introduced the notion of adhesive extensions and selfadhesive polymatroids to mimic a curious amalgamation property of entropy vectors.
The underlying construction is the Copy lemma of Zhang and Yeung (1998), also known as the conditional product; see Studený (2005), Section 2.3.3. Let $g$ and $h$ be two polymatroids on ground sets $N$ and $M$, respectively, and suppose that their restrictions $g|_L$ and $h|_L$ to $L = N \cap M$ coincide. A polymatroid $f$ on $NM$ is an adhesive extension of $g$ and $h$ if:
- $f|_N = g$ and $f|_M = h$,
- the CI statement $[N \perp\!\!\!\perp M \mid L]$ holds for $f$, i.e., $f(N) + f(M) = f(NM) + f(L)$.
In polymatroid terms, $N$ and $M$ are said to form a modular pair in $f$ if this CI statement holds.
Next, suppose that we have only one polymatroid $h$ on ground set $N$ and fix $L \subseteq N$. An $L$-copy of $N$ is a finite set $M$ with $|M| = |N|$ and $M \cap N = L$. We fix a bijection $\pi : N \to M$ which preserves $L$ pointwise. The polymatroid $h$ is selfadhesive at $L$ if there exists an adhesive extension of $h$ and its induced copy $\pi(h)$ over their common restriction to $L$. The polymatroid is selfadhesive if it is selfadhesive at every $L \subseteq N$. The fundamental result of Matúš (2007) is:
Theorem. Any two of the restrictions of an entropic polymatroid have an entropic adhesive extension. In particular, entropy vectors are selfadhesive.
Furthermore, the set of polymatroids on a 4-element ground set which are selfadhesive forms a rational, polyhedral cone in $\mathbb{R}^{16}$. This follows at once from the definition of selfadhesivity and the Fourier-Motzkin projection theorem. This cone is characterized (in addition to the polymatroid properties) by the validity of the Zhang-Yeung inequalities (see Remark 1). In this sense, selfadhesivity is a reformulation of the Zhang-Yeung inequalities using only the notions of restriction and conditional independence. Generalizations of adhesive extensions to multiple polymatroids lead to the book inequalities of Csirmaz (2014).
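To make the functional $(I, J|K) \cdot h$ concrete, the following sketch computes the entropy vector of a small discrete distribution and evaluates conditional mutual informations from it. The encoding of the distribution as a dictionary, the function names and the XOR example are assumptions of this illustration.

```python
from itertools import product
from math import log2

def entropy_vector(pmf, n):
    """Map each subset K of {0, ..., n-1} to the Shannon entropy (in bits) of xi_K.

    pmf maps outcome tuples of length n to their probabilities.
    """
    h = {}
    for mask in product([0, 1], repeat=n):
        K = frozenset(i for i in range(n) if mask[i])
        marginal = {}
        for outcome, p in pmf.items():
            key = tuple(outcome[i] for i in sorted(K))
            marginal[key] = marginal.get(key, 0.0) + p
        h[K] = -sum(p * log2(p) for p in marginal.values() if p > 0)
    return h

def cmi(h, I, J, K):
    """The functional (I, J|K) . h = h(IK) + h(JK) - h(IJK) - h(K)."""
    I, J, K = frozenset(I), frozenset(J), frozenset(K)
    return h[I | K] + h[J | K] - h[I | J | K] - h[K]

# Hypothetical example: two independent fair bits and their XOR.
pmf = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
h = entropy_vector(pmf, 3)
marginal_info = cmi(h, {0}, {1}, set())     # vanishes: bits 0 and 1 are independent
conditional_info = cmi(h, {0}, {1}, {2})    # positive: they are dependent given the XOR
```

In this example the statement $[0 \perp\!\!\!\perp 1]$ belongs to the CI structure of the entropy vector while $[0 \perp\!\!\!\perp 1 \mid 2]$ does not.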

Adhesive extensions of Gaussians
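The analogous result for Gaussian covariance matrices is our main theorem:
Theorem 1. Let $\Sigma \in \mathrm{PD}_N$ and $\Sigma' \in \mathrm{PD}_M$ be two covariance matrices with $L = N \cap M$ and $\Sigma_L = \Sigma'_L$. Then there exists a unique $\Phi \in \mathrm{PD}_{NM}$ such that
- $\Phi_N = \Sigma$ and $\Phi_M = \Sigma'$,
- the CI statement $[N \perp\!\!\!\perp M \mid L]$ holds for $\Phi$.
Proof. We use the following names for blocks of $\Sigma$ and $\Sigma'$, with rows and columns ordered as $(N \setminus L, L)$ and $(L, M \setminus L)$, respectively:
$$\Sigma = \begin{pmatrix} \Sigma_{N \setminus L} & A^T \\ A & X \end{pmatrix}, \qquad \Sigma' = \begin{pmatrix} X & B \\ B^T & \Sigma'_{M \setminus L} \end{pmatrix}, \qquad \Phi = \begin{pmatrix} \Sigma_{N \setminus L} & A^T & \Lambda \\ A & X & B \\ \Lambda^T & B^T & \Sigma'_{M \setminus L} \end{pmatrix},$$
where $\Lambda$ will be determined shortly. The restrictions of $\Phi$ to $N$ and $M$ are clearly equal to $\Sigma$ and $\Sigma'$, respectively. The CI statement $[N \perp\!\!\!\perp M \mid L]$ is equivalent to the rank requirement $\operatorname{rk} \Phi_{N,M} = |N \cap M| = |L|$, but then rank additivity of the Schur complement shows
$$|L| = \operatorname{rk} \Phi_{N,M} = \operatorname{rk} X + \operatorname{rk}(\Lambda - A^T X^{-1} B) = |L| + \operatorname{rk}(\Lambda - A^T X^{-1} B).$$
This implies $\Lambda = A^T X^{-1} B$ and thus $\Phi$ is uniquely determined by $\Sigma$ and $\Sigma'$ via the two conditions in the theorem. To show positive definiteness, consider the transformation
$$P^T \Phi P = \begin{pmatrix} \Sigma_{N \setminus L} - A^T X^{-1} A & 0 & 0 \\ 0 & X & 0 \\ 0 & 0 & \Sigma'_{M \setminus L} - B^T X^{-1} B \end{pmatrix}, \qquad P = \begin{pmatrix} I & 0 & 0 \\ -X^{-1} A & I & -X^{-1} B \\ 0 & 0 & I \end{pmatrix}.$$
The result is clearly positive definite, since its diagonal blocks are $X$ and the Schur complements of $X$ in $\Sigma$ and $\Sigma'$, and since $P$ is invertible, this shows $\Phi \in \mathrm{PD}_{NM}$.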
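The formula $\Lambda = A^T X^{-1} B$ from the proof translates directly into a construction of the adhesive extension. The sketch below assumes a particular block layout, which is a convention of this illustration only: $\Sigma$ is ordered as $(N \setminus L, L)$ and $\Sigma'$ as $(L, M \setminus L)$. The function names are hypothetical, and positive definiteness is checked through the pivots of symmetric elimination.

```python
from fractions import Fraction

def inverse(A):
    """Exact Gauss-Jordan inverse of an invertible square matrix."""
    n = len(A)
    M = [list(map(Fraction, row)) + [Fraction(int(r == c)) for c in range(n)]
         for r, row in enumerate(A)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def matmul(A, B, ncols):
    """Product of A and B, where B has ncols columns (empty blocks are allowed)."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B))) for c in range(ncols)]
            for r in range(len(A))]

def adhesive_extension(S, T, l):
    """Glue S (ordered N\\L, L) and T (ordered L, M\\L) along their common L-block of size l."""
    a, b = len(S) - l, len(T) - l
    X = [row[a:] for row in S[a:]]
    assert X == [row[:l] for row in T[:l]], "restrictions to L must coincide"
    At = [row[a:] for row in S[:a]]              # A^T = Sigma_{N\L, L}
    B = [row[l:] for row in T[:l]]               # B = Sigma'_{L, M\L}
    Lam = matmul(matmul(At, inverse(X), l), B, b)  # Lambda = A^T X^{-1} B
    top = [list(S[r]) + Lam[r] for r in range(a)]
    mid = [list(S[a + r]) + B[r] for r in range(l)]
    bot = [[Lam[q][r] for q in range(a)] + list(T[l + r]) for r in range(b)]
    return top + mid + bot

def is_positive_definite(A):
    """A symmetric matrix is PD iff all pivots of elimination without row swaps are positive."""
    A = [list(map(Fraction, row)) for row in A]
    for c in range(len(A)):
        if A[c][c] <= 0:
            return False
        for r in range(c + 1, len(A)):
            f = A[r][c] / A[c][c]
            A[r] = [x - f * y for x, y in zip(A[r], A[c])]
    return True

# Hypothetical example: glue a 2x2 covariance to a copy of itself along one variable.
S = [[2, 1], [1, 2]]   # ordered (N \ L, L)
T = [[2, 1], [1, 2]]   # ordered (L, M \ L)
Phi = adhesive_extension(S, T, 1)
```

In the example, the off-diagonal corner of `Phi` is the product of the two covariances with the shared variable divided by its variance, which is exactly the vanishing of the Schur complement of $X$ in $\Phi_{N,M}$.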
Remark 1. Zhang and Yeung (1998) proved the first information inequality for entropy vectors which is not a consequence of the Shannon inequalities (equivalently, the polymatroid properties). It can be expressed as the non-negativity of a functional $(i, j|kl)$. Matúš (2007) characterized the selfadhesive polymatroids over a 4-element ground set as those polymatroids satisfying $(i, j|kl) \ge 0$ for all choices of $i, j, k, l$. As a corollary to Theorem 1 we obtain that the multiinformation vectors and hence the differential entropy vectors of Gaussian distributions satisfy the Zhang-Yeung inequalities. This is one half of the result proved by Lněnička (2003). However, that result also follows from the metatheorem of Chan (2003) since $(i, j|kl)$ is balanced.
In the theory of regular Gaussian conditional independence structures, it is natural to relax the positive definiteness assumption on $\Sigma$ to that of principal regularity, i.e., all principal minors, instead of being positive, are required not to vanish. Principal regularity is the minimal technical condition which allows the formation of all Schur complements and the property is inherited by principal submatrices and Schur complements, hence enabling analogues of marginalization and conditioning over general fields instead of the field $\mathbb{R}$; see Boege (2021) for applications. However, the last step in the above proof requires positive definiteness and does not work for principally regular matrices:
Example 1. Consider a principally regular matrix $\Gamma$ over $N = ijkl$ and fix $L = ij$. By the proof of Theorem 1, the submatrix and rank conditions uniquely determine an adhesive extension of $\Gamma$ with an $L$-copy of itself over the ground set $ijklk'l'$. This unique candidate matrix need not be principally regular: for a suitable choice of $\Gamma$ its $lk'l'$-principal minor is zero. However, the CI structure represented by $\Gamma$ is the dual of the graphical model for the undirected path $ilkj$; cf. Lněnička and Matúš (2007), Section 3. This implies that it is representable by a positive definite matrix with rational entries and even though the particular matrix representation above does not have a selfadhesive extension, another representation exists which is positive definite and hence selfadhesive.

Structural selfadhesivity
The existence of adhesive extensions and in particular selfadhesivity of positive definite matrices induces similar properties on their CI structures, since the conditions in Theorem 1 can be formulated using only the concepts of restriction and conditional independence. On the CI level, we sometimes use the term structural selfadhesivity to emphasize that it is generally a weaker notion than what is proved for covariance matrices above. Selfadhesivity can be used to strengthen known properties of CI structures: if it is known that all positive definite matrices have a certain distinguished property $p$, then the fact that $\Sigma$ and any $L$-copy of it fit into an adhesive, positive definite extension obeying $p$ says more about the structure of $\Sigma$ than $p$ alone. We begin by making precise the notion of a property:
Definition 1. A property $p$ consists of one set $p(n) \subseteq 2^{\mathcal{A}_n}$ per finite cardinality $n$, where $\mathcal{A}_n$ denotes the set of elementary CI statements over $[n] = \{1, \dots, n\}$. This is the set of CI structures over $[n]$ which "have property $p$".
CI structures $\mathcal{L}$ and $\mathcal{M}$ over $N$ and $M$, respectively, are isomorphic if there is a bijection $\pi : N \to M$ such that $\mathcal{M} = \pi(\mathcal{L})$ under the induced map. We are only interested in properties which are invariant under isomorphy. Hence, the choice of ground sets $[n]$ presents no restriction. Moreover, we freely identify isomorphic CI structures in the following. In particular, each $k$-element subset $K \subseteq [n]$ will be tacitly identified with $[k]$ and we use notation such as $p(K)$.
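Example 2. By the localization rule (L′), the well-known semigraphoid axioms of Pearl and Paz (1985) reduce to the single inference rule
$$[i \perp\!\!\!\perp j \mid L] \wedge [i \perp\!\!\!\perp k \mid jL] \implies [i \perp\!\!\!\perp k \mid L] \wedge [i \perp\!\!\!\perp j \mid kL]. \tag{S}$$
Being a semigraphoid is a property defined by $\mathrm{sg}(n) := \{\, \mathcal{L} \subseteq \mathcal{A}_n : \text{(S) holds for } \mathcal{L} \text{ for all distinct } i, j, k \in [n] \text{ and } L \subseteq [n] \setminus ijk \,\}$.
Being realizable by a Gaussian distribution is another property, $\mathrm{g}_+(n) := \{\, [\![\Sigma]\!] : \Sigma \in \mathrm{PD}_n \,\}$. Both are closed under restriction, which can be expressed as follows: for every $\mathcal{L} \in p(N)$ and every $K \subseteq N$ we have $\mathcal{L}|_K := \mathcal{L} \cap \mathcal{A}_K \in p(K)$.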
The property lattice is equipped with a natural order relation of component-wise set inclusion from the boolean lattices $2^{\mathcal{A}_n}$. This order relation $\le$ compares properties by generality: if $p \le q$, then for all $n \ge 1$ we have $p(n) \subseteq q(n)$, so $p$ is sufficient for $q$ and, equivalently, $q$ is necessary for $p$.
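Definition 2. Let $p$ be a property of CI structures. The selfadhesion $p^{\mathrm{sa}}(N)$ of $p$ is the set of CI structures $\mathcal{L}$ over $N$ such that for every $L \subseteq N$ together with an $L$-copy $M$ of $N$ and bijection $\pi : N \to M$ there exists $\overline{\mathcal{L}} \in p(NM)$ satisfying the two conditions:
- $\overline{\mathcal{L}}|_N = \mathcal{L} = \overline{\mathcal{L}}|_M$,
- $[N \perp\!\!\!\perp M \mid L] \in \overline{\mathcal{L}}$.
The equality $\overline{\mathcal{L}}|_N = \mathcal{L} = \overline{\mathcal{L}}|_M$ in the above definition is understood up to isomorphy via $\pi$. The following is a direct consequence of Theorem 1:
Corollary 1. The property $\mathrm{g}_+$ of being regular Gaussian is selfadhesive.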
Lemma 1. The operator $\cdot^{\mathrm{sa}}$ is recessive and monotone on the property lattice.
Proof. Let $p$ be a property and $\mathcal{L} \in p^{\mathrm{sa}}(N)$. In particular, $\mathcal{L}$ is selfadhesive with respect to $p$ at $L = N$. The $L$-copy $M$ of $N$ in the definition must be $M = N$ and it follows that $\mathcal{L} \in p(NM) = p(N)$. This proves recessiveness $p^{\mathrm{sa}} \le p$. For monotonicity, let $p \le q$ and $\mathcal{L} \in p^{\mathrm{sa}}(N)$. Then for every $L$ with $L$-copy $M$ of $N$ there exists a certificate for the membership of $\mathcal{L}$ in $p^{\mathrm{sa}}(N)$. This certificate lives in $p(NM) \subseteq q(NM)$, which proves $\mathcal{L} \in q^{\mathrm{sa}}(N)$.
Thus, from monotonicity and the fact that $\mathrm{g}_+$ is a fixed point of selfadhesion, we find that a property which is necessary for Gaussianity remains necessary after selfadhesion. Since selfadhesion makes properties more specific, this allows one to take known necessary properties of Gaussian CI and to derive new, stronger properties from them.
Iterated applications of selfadhesion give rise to a chain of ever more specific properties
$$\mathrm{g}_+ \le \dots \le p^{k \cdot \mathrm{sa}} \le \dots \le p^{2 \cdot \mathrm{sa}} := (p^{\mathrm{sa}})^{\mathrm{sa}} \le p^{\mathrm{sa}} \le p.$$
For each fixed component $n$ of the property, this results in a descending chain in the finite boolean lattice $2^{\mathcal{A}_n}$ which must stabilize eventually. However, the whole property $p$ has a countably infinite number of components and it is not clear if iterated selfadhesions converge eventually to their limit $p^{\omega \cdot \mathrm{sa}} := \bigwedge_{k=1}^{\infty} p^{k \cdot \mathrm{sa}}$.
Question 1. Does $\cdot^{\mathrm{sa}}$ stabilize after the first application to "well-behaved" properties like $\mathrm{sg}$, i.e., is $\mathrm{sg}^{\mathrm{sa}} = \mathrm{sg}^{\omega \cdot \mathrm{sa}}$? Under which assumptions on a property does it stabilize eventually?
We now turn to the question which closure properties of $p$ are recovered for $p^{\mathrm{sa}}$. For example, if for every $\mathcal{L}, \mathcal{L}' \in p(N)$ we have $\mathcal{L} \cap \mathcal{L}' \in p(N)$, then $p$ is closed under intersection. Semigraphoids enjoy this closure property because they are axiomatized by the Horn clauses (S). The following lemma shows that all iterated selfadhesions inherit closure under intersection.
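Lemma 2. If $p$ is closed under intersection, then so is $p^{\mathrm{sa}}$.
Proof. Let $\mathcal{L}, \mathcal{L}' \in p^{\mathrm{sa}}(N)$ and fix a set $L \subseteq N$ and an $L$-copy $M$ of $N$. There are $\overline{\mathcal{L}}$ and $\overline{\mathcal{L}}'$ in $p(NM)$ witnessing the containments of $\mathcal{L}$ and $\mathcal{L}'$, respectively, in $p^{\mathrm{sa}}(N)$. Their intersection $\overline{\mathcal{L}} \cap \overline{\mathcal{L}}'$ is in $p(NM)$ by assumption and we have
$$(\overline{\mathcal{L}} \cap \overline{\mathcal{L}}')|_N = \mathcal{L} \cap \mathcal{L}' = (\overline{\mathcal{L}} \cap \overline{\mathcal{L}}')|_M \quad \text{and} \quad [N \perp\!\!\!\perp M \mid L] \in \overline{\mathcal{L}} \cap \overline{\mathcal{L}}'.$$
Thus it proves selfadhesivity of $\mathcal{L} \cap \mathcal{L}'$ with respect to $p$ at $L$.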
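Similarly to matroid theory, minors are the natural subconfigurations of CI structures. They are the CI-theoretic abstraction of marginalization and conditioning on random vectors.
Definition 3. Let $\mathcal{L} \subseteq \mathcal{A}_N$ and $x \in N$. The marginal and the conditional of $\mathcal{L}$ on $N \setminus x$ are, respectively,
$$\mathcal{L} \setminus x := \{\, [i \perp\!\!\!\perp j \mid K] \in \mathcal{A}_{N \setminus x} : [i \perp\!\!\!\perp j \mid K] \in \mathcal{L} \,\}, \qquad \mathcal{L} / x := \{\, [i \perp\!\!\!\perp j \mid K] \in \mathcal{A}_{N \setminus x} : [i \perp\!\!\!\perp j \mid xK] \in \mathcal{L} \,\}.$$
A minor of $\mathcal{L}$ is any CI structure which is obtained by a sequence of marginalizations and conditionings.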
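Marginals and conditionals of CI structures are simple set operations. The sketch below assumes the usual conventions, namely that the marginal keeps the statements not involving $x$ and the conditional keeps the statements whose conditioning set contains $x$, with $x$ removed from it. Statements are encoded as tuples `(i, j, frozenset(K))`; this encoding and the function names are assumptions of the illustration.

```python
def marginal(structure, x):
    """Statements of the structure that do not mention x at all."""
    return {(i, j, K) for (i, j, K) in structure if x not in (i, j) and x not in K}

def conditional(structure, x):
    """Statements [i _||_ j | K] such that [i _||_ j | xK] lies in the structure."""
    return {(i, j, K - {x}) for (i, j, K) in structure if x in K}

def minor(structure, marginalize=(), condition=()):
    """Apply a sequence of marginalizations and conditionings."""
    for x in marginalize:
        structure = marginal(structure, x)
    for x in condition:
        structure = conditional(structure, x)
    return structure

# Hypothetical CI structure over the ground set {0, 1, 2}.
L = {(0, 2, frozenset({1})), (0, 1, frozenset())}
after_marginal = marginal(L, 2)
after_conditional = conditional(L, 1)
```

For a Gaussian CI structure these operations correspond to passing to a principal submatrix and to a Schur complement of the covariance matrix, respectively.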
If for every $\mathcal{L} \in p(N)$ and every minor $\mathcal{K}$ of $\mathcal{L}$ on ground set $M \subseteq N$ we have $\mathcal{K} \in p(M)$, then $p$ is minor-closed. Minor-closedness is necessary for the existence of a finite axiomatization of a property $p$. More concretely, Matúš (1997) studied descriptions of properties by finitely many "forbidden minors", which is under natural regularity assumptions equivalent to having a finite axiomatic description by boolean CI inference formulas; cf. Boege (2022), Section 4.4, for details.
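Lemma 3. If $p \le \mathrm{sg}$ is minor-closed, then so is $p^{\mathrm{sa}}$.
Proof. By induction it suffices to prove closedness under marginals and conditionals. Let $\mathcal{L} \in p^{\mathrm{sa}}(N)$ and $x \in N$. Fix $L \subseteq N \setminus x$ and an $L$-copy $M$ of $N$ with bijection $\pi$ and let $\overline{\mathcal{L}}$ be the witness for selfadhesivity of $\mathcal{L}$ at $L$. The minor $\overline{\mathcal{L}} \setminus \{x, \pi(x)\}$ is in $p(NM \setminus \{x, \pi(x)\})$ by assumption of minor-closedness; and note that $M \setminus \pi(x)$ is an $L$-copy of $N \setminus x$. This minor restricts to $\mathcal{L} \setminus x$ on $N \setminus x$ and, up to isomorphy via $\pi$, on $M \setminus \pi(x)$, so it remains to check that it contains $[N \setminus x \perp\!\!\!\perp M \setminus \pi(x) \mid L]$. For the last argument we need the semigraphoid property to hold for $p$. This ensures by Studený (2005), Lemma 2.2, that the localization rule (L′) applies. This rule shows that $[N \perp\!\!\!\perp M \mid L] \in \overline{\mathcal{L}}$ implies $[N \setminus x \perp\!\!\!\perp M \setminus \pi(x) \mid L] \in \overline{\mathcal{L}} \setminus \{x, \pi(x)\}$. The proof for conditionals is entirely analogous.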

Selfadhesivity testing
Whether or not a CI structure $\mathcal{L} \subseteq \mathcal{A}_N$ is in $p^{\mathrm{sa}}(N)$ can be checked algorithmically if an oracle $p(\widetilde{\mathcal{L}})$ for the property $p$ is available. This oracle receives a partially defined CI structure $\widetilde{\mathcal{L}}$ over $N$, i.e., a set of CI statements or negated CI statements specifying constraints on some statements from $\mathcal{A}_N$. Then $p$ decides if $\widetilde{\mathcal{L}}$ can be extended to a member of $p(N)$. Algorithm 1 applies this oracle, for each $L \subseteq N$, to the partially defined structure over $NM$ prescribed by Definition 2 and returns true if and only if every call succeeds.
Each component $p(n)$ of a property $p$ is a set of subsets of $\mathcal{A}_n$. There are two principal ways of representing this set: explicitly, by listing its elements, or implicitly, by listing its properties. A typical application of Algorithm 1 takes in both, an explicit description of $p(n)$ to loop over, as well as an implicit description of $p$ in the form of the oracle to perform selfadhesivity testing for ground sets of sizes between $n$ and $2n$. It outputs only an explicit description of $p^{\mathrm{sa}}$ at a given index $n$. This makes it difficult to iterate selfadhesions. Each set corresponds to a boolean function $\varphi_{p(n)} : \{0, 1\}^{\mathcal{A}_n} \to \{0, 1\}$ indexed by $\mathcal{A}_n$. In terms of this function, the explicit and implicit representations of $p(n)$ roughly correspond to the disjunctive normal form and the conjunctive normal form, respectively. Transforming the explicit description obtained from Algorithm 1 into CNF to compute the next selfadhesion is a hard problem and it would be required to compute $p^{\mathrm{sa}}(m)$ explicitly for all $n \le m \le 2n$.

Algorithm 1 Blackbox selfadhesion membership test
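A blackbox membership test of this kind can be sketched as follows. This is not the article's Algorithm 1 but an illustration under stated assumptions: statements are encoded as pairs `(frozenset({i, j}), frozenset(K))`, copies of elements are modeled as tuples `(x, "copy")`, the oracle is any callable `oracle(true, false, ground)` deciding whether the given positive and negative constraints extend to a structure with the property, and the statement $[N \perp\!\!\!\perp M \mid L]$ is expanded into elementary statements with the semigraphoid localization rule.

```python
from itertools import combinations

def subsets(S):
    """All subsets of S as frozensets, in order of increasing size."""
    S = sorted(S, key=repr)
    for size in range(len(S) + 1):
        for c in combinations(S, size):
            yield frozenset(c)

def elementary(N):
    """All elementary statements (frozenset({i, j}), K) over the ground set N."""
    for i, j in combinations(sorted(N, key=repr), 2):
        for K in subsets(frozenset(N) - {i, j}):
            yield (frozenset({i, j}), K)

def is_selfadhesive(structure, N, oracle):
    """Test selfadhesivity of `structure` over N at every subset L, using the oracle."""
    N = frozenset(N)
    for L in subsets(N):
        pi = {x: (x if x in L else (x, "copy")) for x in N}  # bijection N -> M fixing L
        M = frozenset(pi.values())
        true, false = set(), set()
        for s in elementary(N):
            pair, K = s
            image = (frozenset(pi[x] for x in pair), frozenset(pi[x] for x in K))
            target = true if s in structure else false
            target.update({s, image})  # prescribe the structure on N and its copy on M
        # [N _||_ M | L], localized: [i _||_ j | K] for all L <= K <= NM \ ij
        for i in N - L:
            for j in M - L:
                for K in subsets((N | M) - {i, j}):
                    if L <= K:
                        true.add((frozenset({i, j}), K))
        if not oracle(true, false, N | M):
            return False
    return True

# Hypothetical usage with a trivial oracle that records its queries and accepts everything.
queries = []
def accept_all(true, false, ground):
    queries.append((set(true), set(false), ground))
    return True

result = is_selfadhesive(set(), {1, 2}, accept_all)
```

The loop makes one oracle call per subset $L \subseteq N$, so the cost is dominated by $2^{|N|}$ consistency checks on ground sets of size up to $2|N|$.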
Remark 2. The proof of Lemma 1 shows that a CI structure $\mathcal{L}$ satisfies selfadhesivity with respect to $p$ at $L = N$ if and only if $\mathcal{L}$ has property $p$. In the other extreme case, every structure in $p$ is selfadhesive at $L = \emptyset$ if $p$ is closed under direct sums. Many useful properties are closed under direct sums because this operation mimics the independent joining of two random vectors; see Matúš (2004). If this is known a priori, some selfadhesivity tests can be skipped.
We now proceed to apply Algorithm 1 to two practically tractable necessary conditions for Gaussian realizability. The computational results allow, via Corollary 2, the deduction of new CI inference axioms for Gaussians on five random variables.

Structural semigraphoids
It is easy to see that every Gaussian CI structure $\mathcal{L} = [\![\Sigma]\!]$ can also be obtained from the correlation matrix of the original distribution $\Sigma$. Hence, we may assume that $\Sigma$ is a correlation matrix. In that case, the multiinformation vector of $\Sigma$ is the map $m_\Sigma : 2^N \to \mathbb{R}$ given by $m_\Sigma(K) := -\frac{1}{2} \log \det \Sigma_K$. This function satisfies $m_\Sigma(\emptyset) = m_\Sigma(i) = 0$ for all $i \in N$ and it is supermodular by the Koteljanskii inequality; see Johnson and Barrett (1993). Similarly to entropy vectors, the equality condition in these inequalities characterizes conditional independence:
$$(ij|K) \cdot m_\Sigma = 0 \iff [i \perp\!\!\!\perp j \mid K] \in [\![\Sigma]\!].$$
In the nomenclature of Studený (2005), Chapter 5, $m_\Sigma$ is an $\ell$-standardized supermodular function. The functions having these two properties form a rational, polyhedral cone $S_N$ of codimension $|N| + 1$ in $\mathbb{R}^{2^N}$. Its facets are precisely given by the supermodular inequalities $(ij|K) \le 0$ for all elementary CI statements $[i \perp\!\!\!\perp j \mid K] \in \mathcal{A}_N$. Since the facets of this cone are in bijection with CI statements, it is natural to identify faces (intersections of facets) dually with CI structures (unions of CI statements). The property of CI structures of arising from a face of $S_N$ is that of structural semigraphoids, denoted by $\mathrm{sg}_*$, and it is necessary for $\mathrm{g}_+$ since every Gaussian CI structure $[\![\Sigma]\!]$ is associated with the unique face of $S_N$ which contains $m_\Sigma$ in its relative interior.
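The multiinformation vector and the functional $(ij|K)$ are easy to evaluate numerically. The sketch below uses floating-point arithmetic and a hypothetical correlation matrix with entries $\rho^{|a-b|}$ for $\rho = 1/2$, chosen because it satisfies $[0 \perp\!\!\!\perp 2 \mid 1]$; the function names are assumptions of the illustration.

```python
from math import log

def det(A):
    """Determinant by elimination without pivoting (adequate for positive definite input)."""
    A = [list(map(float, row)) for row in A]
    d = 1.0
    for c in range(len(A)):
        d *= A[c][c]
        for r in range(c + 1, len(A)):
            f = A[r][c] / A[c][c]
            A[r] = [x - f * y for x, y in zip(A[r], A[c])]
    return d

def multiinformation(R, K):
    """m(K) = -1/2 log det R_K for a correlation matrix R and an index set K."""
    K = sorted(K)
    return -0.5 * log(det([[R[r][c] for c in K] for r in K]))

def functional(R, i, j, K):
    """(ij|K) . m = m(iK) + m(jK) - m(ijK) - m(K); non-positive by supermodularity."""
    K = set(K)
    m = lambda S: multiinformation(R, S)
    return m(K | {i}) + m(K | {j}) - m(K | {i, j}) - m(K)

# Hypothetical correlation matrix of a Markov chain 0 - 1 - 2.
R = [[1.0, 0.5, 0.25], [0.5, 1.0, 0.5], [0.25, 0.5, 1.0]]
tight = functional(R, 0, 2, {1})      # vanishes up to rounding: [0 _||_ 2 | 1] holds
strict = functional(R, 0, 1, set())   # strictly negative: 0 and 1 are correlated
```

Because the equality test is numerical here, a tolerance is needed to read off the face of the cone; exact arithmetic on the minors avoids this issue.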
Remark 3. Structural semigraphoids can be equivalently defined via the face lattice of the cone of tight polymatroids, i.e., polymatroids h with h(N ) = h(N \ i) for every i ∈ N .
The tightness condition poses no extra restrictions: for every polymatroid, there exists a tight polymatroid inducing the same pure CI statements (only differing in the functional dependences); cf. Matúš and Csirmaz (2016), Section III. A proof of the equivalence is contained in Boege (2022), Section 6.3.
Deciding whether a partially defined CI structure $\widetilde{\mathcal{L}}$ is consistent with the structural semigraphoid property is a question about the incidence structure of the face lattice of $S_N$. Such questions reduce to the feasibility of a rational linear program as previously demonstrated by Bouckaert et al. (2010):
Algorithm 2 Structural semigraphoid consistency test
function is-structural($\widetilde{\mathcal{L}}$)    ▷ tests if $\widetilde{\mathcal{L}}$ can be extended to a member of $\mathrm{sg}_*(N)$
    for all $s \in \mathcal{A}_N$ do
        ⋮
    return is-feasible($P$)    ▷ call an LP solver
end function
Equipped with this oracle for $\mathrm{sg}_*$, Algorithm 1 can be applied to compute membership in $\mathrm{sg}_*^{\mathrm{sa}}$. We use the gaussoids of Lněnička and Matúš (2007) as input because they are easily computable candidates for Gaussian CI structures; see also Boege et al. (2019).
For n = 4 random variables, the gaussoids which are structural semigraphoids already coincide with the realizable Gaussian structures (as classified in Lněnička and Matúš (2007)) and selfadhesivity offers no improvement. This is no longer the case on five random variables: Computation 1. There are 508 817 gaussoids on n = 5 random variables modulo isomorphy. Of these 336 838 are structural semigraphoids and 335 047 of them are selfadhesive with respect to sg * .
A semigraphoid $\mathcal{L}$ is structural if and only if it is induced by a polymatroid, i.e., $\mathcal{L} = [\![h]\!]$. In this case, two distinct notions of selfadhesivity can be applied to $\mathcal{L}$: the first is Matúš's definition of selfadhesivity for the inducing polymatroid $h$; and the second is Definition 2 for the CI structure $\mathcal{L}$ with respect to the property $\mathrm{sg}_*$. The second condition is implied by the first. The existence of a selfadhesive inducing polymatroid can be efficiently tested for ground set size four based on the polyhedral description of the cone of selfadhesive 4-polymatroids from Matúš (2007), Corollary 6.
Computation 2. Out of the 1 285 isomorphy representatives of $\mathrm{sg}_*(4)$, exactly 1 224 are in $\mathrm{sg}_*^{\mathrm{sa}}(4)$. Each of them is induced by a selfadhesive 4-polymatroid.
Question 3. Is every element of $\mathrm{sg}_*^{\mathrm{sa}}(N)$ induced by a selfadhesive $N$-polymatroid, for every finite set $N$?

Orientable gaussoids
Recall from Boege et al. (2019) that a gaussoid is orientable if it is the support of an oriented gaussoid. Oriented gaussoids are a variant of CI structures in which every statement $[i \perp\!\!\!\perp j \mid K]$ has a sign from $\{0, +, -\}$ attached, indicating conditional independence, positive or negative partial correlation, respectively. Oriented gaussoids are axiomatically defined and therefore SAT solvers are ideally suited to decide the consistency of a partially defined CI structure with these axioms. The property of orientability, denoted $\mathrm{o}$, is obtained from the set of oriented gaussoids by mapping all CI statements oriented as $0$ to elements of a CI structure and all statements oriented $+$ or $-$ to non-elements. To facilitate orientability testing, one allocates two boolean variables $V^0_s$ and $V^+_s$ for every CI statement $s$. The former indicates whether $s$ is $0$ or not while the latter indicates, provided that $V^0_s$ is false, if $s$ is $+$ or $-$. Further details about oriented gaussoids, their axioms and use of SAT solvers for CI inference are available in Boege et al. (2019).
Computation 3. All orientable gaussoids on n = 4 are Gaussian. Of the 508 817 isomorphy classes of gaussoids on n = 5 precisely 175 215 are orientable and 168 010 are selfadhesive with respect to orientability.

Structural orientable gaussoids
The meet property sg * ∧ o of structural semigraphoids and orientable gaussoids is likewise necessary for Gaussianity and an oracle for it can be combined from the oracles of its two constituents. Its selfadhesion yields no improvement over apparently weaker properties: