Normal limiting distributions for systems of linear equations in random sets

We consider the binomial random set model $[n]_p$ where each element in $\{1,\dots,n\}$ is chosen independently with probability $p:=p(n)$. We show that for essentially all regimes of $p$ and very general conditions for a matrix $A$ and a column vector $\mathbf{b}$, the count of specific integer solutions to the system of linear equations $A\mathbf{x} = \mathbf{b}$ with the entries of $\mathbf{x}$ in $[n]_p$ follows a (conveniently rescaled) normal limiting distribution. This applies among others to the number of solutions with every variable having a different value, as well as to a broader class of so-called non-trivial solutions in homogeneous strictly balanced systems. Our proof relies on the delicate linear algebraic study both of the subjacent matrices and the corresponding ranks of certain submatrices, together with the application of the method of moments in probability theory.


Introduction and main results
The study of the existence of solutions to linear equations in subsets of the integers (and more generally in additive or even non-abelian groups) has been a prominent topic not only in number theory, but also in extremal combinatorics, ergodic theory, functional analysis and theoretical computer science, among other research areas. A prototypical example of such investigations is Roth's Theorem [8], which proves (using Fourier analytic means) that dense sets of integers always contain arithmetic progressions of length 3. This result was fully generalized by Szemerédi [17], who obtained a similar statement for arithmetic progressions of all (fixed) lengths.
Following this line of research, Frankl, Graham and Rödl [5, Theorem 2] proved similar results for homogeneous systems of linear equations. In particular, they proved that when $A$ is a density regular matrix with integer entries (that is, the columns of $A$ sum up to the zero vector), then any dense integer set $T$ will contain solutions to the linear system $A\mathbf{x} = \mathbf{0}$. This result generalizes Szemerédi's Theorem, as arithmetic progressions of length $k$ (or $k$-APs for short) can be encoded as solutions to the linear system $x_2 - x_1 = x_3 - x_2 = \dots = x_k - x_{k-1}$. Other classical equations also fit into this framework, such as the Schur equation $x + y = z$ (sets without solutions to it are the sum-free sets) and the Sidon equation $x + y = z + t$ (sets without non-trivial solutions to it are the Sidon sets).
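The encoding of $k$-APs as a linear system can be made concrete with a small sketch. It assumes the equivalent row form $x_i - 2x_{i+1} + x_{i+2} = 0$ of the chain of equalities above; the helper names `kap_matrix` and `apply_matrix` are hypothetical and only serve this illustration.

```python
def kap_matrix(k):
    """(k-2) x k integer matrix whose integer kernel contains every k-term arithmetic progression."""
    if k < 3:
        raise ValueError("need k >= 3")
    rows = []
    for i in range(k - 2):
        row = [0] * k
        # x_{i+1} - x_i = x_{i+2} - x_{i+1}  <=>  x_i - 2*x_{i+1} + x_{i+2} = 0
        row[i], row[i + 1], row[i + 2] = 1, -2, 1
        rows.append(row)
    return rows


def apply_matrix(matrix, x):
    """Matrix-vector product A x over the integers."""
    return [sum(a * v for a, v in zip(row, x)) for row in matrix]


A = kap_matrix(4)
print(apply_matrix(A, [3, 7, 11, 15]))  # [0, 0]
print([sum(row) for row in A])          # [0, 0]
```

The second printed line checks that the columns of this matrix sum to the zero vector, which is the condition quoted above from [5].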
A very recent trend of investigation is to transfer these extremal results to a sparse setting, and hence to prove analogues on subsets which are dense relative to certain ambient sets. Specific examples of such sparse ambient sets are the primes or perfect powers, but also the binomial random set $[n]_p$ obtained by including each element of $[n] = \{1, 2, \dots, n\}$ independently with probability $p$. Returning to the example of $k$-APs, define a set $T$ to be $(\delta, k)$-Szemerédi if every subset $U \subseteq T$ with at least $\delta|T|$ elements contains a $k$-AP. Thus, Szemerédi's Theorem shows that $[n]$ (in fact, every set $T$ that is itself dense in $[n]$) is $(\delta, k)$-Szemerédi for every $\delta$ and $k$. For the specific case of $k = 3$, Kohayakawa, Łuczak and Rödl [7] proved that for all $\delta > 0$ there exist constants $c, C > 0$ such that if $p \le cn^{-1/2}$, then as $n$ tends to infinity, $\mathbb{P}([n]_p \text{ is } (\delta, 3)\text{-Szemerédi})$ tends to 0, while if $p \ge Cn^{-1/2}$, it tends to 1. The existence of this threshold was extended to all values of $k$ in [14,4], and later rediscovered in [13,1] in the context of independent sets in hypergraphs. These ideas, known as the hypergraph container method, were then used by Spiegel [15] and independently by Hancock, Staden and Treglown [6] to extend [5, Theorem 2] to the broadest class of linear systems possible. Similar techniques were used by Rué, Serra and Vena [10] to study random sparse analogues of [5, Theorem 2] in finite fields and more general configurations than linear systems of equations.
The main contribution of our paper follows this trend of research, with the aim of studying the typical number of solutions to linear systems of equations in random sets of integers. Our main theorem, Theorem 18, is technical, and in order to state it formally we need to introduce a wide variety of algebraic definitions and lemmas, which form the core of Section 2. However, we will now introduce the main definitions needed to formulate two important consequences of Theorem 18, namely Theorems 4 and 7.
Proper solutions and their distribution. In order to properly state our results we need to introduce some notation. Let $m > r$ be positive integers, $A \in \mathbb{Z}^{r\times m}$ an integer matrix, and $\mathbf{b} \in \mathbb{Z}^r$ an integer vector. We denote the rank of a matrix $A$ by $\mathrm{rk}(A)$. Define $S(A, \mathbf{b}) = \{\mathbf{x} \in \mathbb{Z}^m : A\mathbf{x} = \mathbf{b}\}$ as the set of integer solutions to the system of linear equations $A\mathbf{x} = \mathbf{b}$. Furthermore, if $Q \subset [m]$, let $A^Q \in \mathbb{Z}^{r\times|Q|}$ denote the $r \times |Q|$ matrix obtained by only keeping the columns of $A$ indexed by $Q$. We notice that the rank of the empty matrix $A^{\emptyset}$ is zero. We will identify tuples $\mathbf{x} = (x_1, \dots, x_m) \in \mathbb{Z}^m$ with the corresponding column vectors but abuse notation slightly by letting $\mathbf{x}^Q$ denote the vector obtained by only keeping the rows of $\mathbf{x}$ indexed by $Q$. This should not be confused with the notation $\mathbf{x}^k$, where $k$ is a positive integer, which we will use to denote the vector
Finally, if $\mathbf{x} = (x_1, \dots, x_m) \in \mathbb{Z}^m$ is an $m$-tuple, we will write $\{\mathbf{x}\}$ as a shorthand for the set $\{x_1, \dots, x_m\}$. In particular, if some entry in $\mathbf{x}$ is repeated, then the cardinality of $\{\mathbf{x}\}$ is strictly less than $m$.
Roughly speaking, our main goal in this paper is to estimate $|S(A, \mathbf{b}) \cap [n]_p^m|$, that is, the number of solutions whose entries lie in $[n]_p$, for different regimes of $p$. In order for this to be a well-posed problem, we need some definitions from [16] and [11].
Sometimes the second requirement for positivity in Definition 1 is called irredundancy. For our applications, we will never consider positivity and irredundancy separately, and hence we have combined those two properties into a single one for expediency's sake.
In general, considering any possible solution in $S(A, \mathbf{b})$ is a bit too lenient. For instance, in the 3-AP situation (whose associated matrix is $A = \begin{pmatrix} 1 & -2 & 1 \end{pmatrix}$), $S(A, \mathbf{0})$ also contains all the tuples $(a, a, a)$ with $a \in \mathbb{Z}$, and those are clearly not of interest. To remedy this issue we introduce the simple notion of proper solutions to $A\mathbf{x} = \mathbf{b}$.

Definition 2.
Let $m > r$ be positive integers, $A \in \mathbb{Z}^{r\times m}$, and $\mathbf{b} \in \mathbb{Z}^r$. Then the set $S_0(A, \mathbf{b})$ of proper solutions is the subset of $S(A, \mathbf{b})$ with all coordinates being pairwise distinct, that is,
$$S_0(A, \mathbf{b}) = \{\mathbf{x} \in S(A, \mathbf{b}) : x_i \ne x_j \text{ for all } i \ne j\}.$$
Before proceeding to the statement of our first main result, we need to define one more parameter, which will measure the densest subsystem of $A$. The motivation behind this can be compared to the graph setting, where one wants to study the number of occurrences of some fixed graph $G$ as a subgraph of a binomial random graph with $n$ vertices and edge probability $p$. Here, one would naively expect the threshold for a random graph to contain $G$ to be when $p$ is around $n^{-v(G)/e(G)}$, which is when the expected number of copies of $G$ changes from tending to 0 to tending to infinity. But this is not the case in general: if $G$ contains a subgraph $H$ with $v(H)/e(H) < v(G)/e(G)$, that is, a subgraph denser than $G$ itself, then $n^{-v(H)/e(H)}$ for the densest such $H$ will define the threshold instead. A similar behavior occurs for the distribution of subgraph counts, as shown by Ruciński in [9].
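The role of the densest subgraph can be illustrated by a brute-force sketch. It computes $\max e(H)/v(H)$ over all subgraphs $H$ with at least one edge; it suffices to look at induced subgraphs, since adding edges on a fixed vertex set only increases the ratio. The function name `max_density` and the example graph (a $K_4$ with one pendant edge) are assumptions chosen for illustration, not taken from the paper.

```python
from fractions import Fraction
from itertools import combinations


def max_density(edges):
    """Maximum of e(H)/v(H) over all subgraphs H with at least one edge (brute force)."""
    vertices = sorted({v for e in edges for v in e})
    best = Fraction(0)
    for size in range(2, len(vertices) + 1):
        for subset in combinations(vertices, size):
            s = set(subset)
            # number of edges of the subgraph induced on s
            induced = sum(1 for u, v in edges if u in s and v in s)
            if induced:
                best = max(best, Fraction(induced, size))
    return best


k4 = list(combinations(range(4), 2))
g = k4 + [(3, 4)]            # K4 with a pendant edge: 5 vertices, 7 edges
print(Fraction(len(g), 5))   # 7/5
print(max_density(g))        # 3/2
```

In this example the whole graph has ratio $e/v = 7/5$, but the copy of $K_4$ inside it has ratio $3/2$, so the relevant exponent is governed by $K_4$ (that is, by $n^{-2/3}$) rather than by $n^{-5/7}$.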

Definition 3. For positive integers m > r and a positive integer matrix
Note that this is indeed well-defined, since for a positive matrix we see that $\mathrm{rk}(A^{\overline{Q}}) \ge \mathrm{rk}(A) - |Q| + 1$ for every $\emptyset \ne Q \subset [m]$, where $\overline{Q} = [m] \setminus Q$ (see, for instance, the proof of Lemma 12).
We are now ready to state our first main theorem. For a random variable $X$ with finite first moment $\mathbb{E}(X)$ and non-zero finite variance $\mathrm{Var}(X)$, denote by $\widetilde{X} = (X - \mathbb{E}(X))/\sqrt{\mathrm{Var}(X)}$ its normalization. We write $X_n \xrightarrow{d} Y$ when a sequence of random variables $\{X_n\}_{n\ge 1}$ converges in distribution to $Y$.

Theorem 4.
Let $m > r$ be positive integers, $A \in \mathbb{Z}^{r\times m}$ a positive and abundant integer matrix, and $\mathbf{b} \in \mathbb{Z}^r$ such that $S(A, \mathbf{b}) \ne \emptyset$. Furthermore, let $n$ be an integer, $0 \le p := p(n) \le 1$ and $X_n$ the random variable equal to $|S_0(A, \mathbf{b}) \cap [n]_p^m|$, which counts the number of proper solutions
Note that in the specific case of $k$-APs, this problem was already investigated by Barhoumi-Andréani, Koch and Liu [2]. In fact, they not only prove results on the limiting distribution of the number of $k$-term arithmetic progressions in $[n]_p$ even when $k$ is unbounded (but sublogarithmic), but also establish a bivariate central limit theorem for the joint distribution of the counts of two distinct progression lengths. In contrast, Theorem 4 in the setting of progressions requires the length to be fixed.
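For small $n$, the two ingredients of the normalization (mean and variance) of the proper 3-AP count can be computed exactly. The following sketch assumes the 3-AP equation $x_1 - 2x_2 + x_3 = 0$ and counts ordered triples; it uses that $\mathbb{E}(I_{\mathbf{x}}) = p^{|\{\mathbf{x}\}|}$ and $\mathrm{Cov}(I_{\mathbf{x}}, I_{\mathbf{y}}) = p^{|\{\mathbf{x}\}\cup\{\mathbf{y}\}|} - p^{|\{\mathbf{x}\}|+|\{\mathbf{y}\}|}$, and cross-checks the result against a sum over all $2^n$ outcomes of the random set. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, permutations


def proper_3ap_solutions(n):
    """Ordered triples of pairwise distinct elements of [n] with x1 - 2*x2 + x3 = 0."""
    return [x for x in permutations(range(1, n + 1), 3) if x[0] - 2 * x[1] + x[2] == 0]


def exact_mean_var(n, p):
    """Mean and variance of the count, as sums over solutions and over pairs of solutions."""
    sols = [frozenset(x) for x in proper_3ap_solutions(n)]
    mean = sum(p ** len(s) for s in sols)
    var = sum(p ** len(s | t) - p ** (len(s) + len(t)) for s in sols for t in sols)
    return mean, var


def brute_force_mean_var(n, p):
    """The same quantities, obtained by summing over every outcome of the random set."""
    sols = [frozenset(x) for x in proper_3ap_solutions(n)]
    first = second = Fraction(0)
    for size in range(n + 1):
        for chosen in combinations(range(1, n + 1), size):
            c = set(chosen)
            weight = p ** size * (1 - p) ** (n - size)
            count = sum(1 for s in sols if s <= c)
            first += weight * count
            second += weight * count * count
    return first, second - first * first


p = Fraction(1, 2)
print(exact_mean_var(5, p))
print(brute_force_mean_var(5, p) == exact_mean_var(5, p))
```

The brute-force check is only feasible for very small $n$; it is meant as a sanity check of the pair-sum formula for the variance, not as a way to study the limiting distribution.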
Non-trivial solutions, and main theorem for their distribution. Proper solutions are always of interest in the number theoretical context, but in a wide variety of situations we also need to take care of non-proper solutions. For example, for $A = \begin{pmatrix} 1 & 1 & -1 & -1 \end{pmatrix}$, we see that $S(A, \mathbf{0})$ is the set of additive quadruples satisfying $a + b = c + d$. Clearly, if $\{a, b\} \cap \{c, d\} \ne \emptyset$, then these two sets must in fact be the same. On the other hand, solutions of the form $2a = c + d$ with $a \notin \{c, d\}$ are not proper but are clearly of interest and should be considered valid. Keeping this in mind, we will now recall the notion of non-trivial solutions for systems of linear equations due to Rué, Spiegel and Zumalacárregui [11], which generalized an earlier notion for single linear equations introduced by Ruzsa [12]. For a solution $\mathbf{x} = (x_1, \dots, x_m) \in S(A, \mathbf{b})$, define $\mathfrak{p}(\mathbf{x})$ to be the partition of $[m]$ such that for any $i, j \in [m]$ it holds that $x_i = x_j$ if and only if $i$ and $j$ are in the same partition class of $\mathfrak{p}(\mathbf{x})$. One can view $\mathfrak{p}(\mathbf{x})$ as an ordered $|\mathfrak{p}(\mathbf{x})|$-tuple $(C_1, \dots, C_{|\mathfrak{p}(\mathbf{x})|})$ such that $\min C_i < \min C_j$ whenever $i < j$. Doing this, we can now define the matrix $A_{\mathfrak{p}(\mathbf{x})}$ in the following way. Suppose the columns of $A$ are denoted by $\mathbf{c}_1, \dots, \mathbf{c}_m \in \mathbb{Z}^r$, then
$$A_{\mathfrak{p}(\mathbf{x})} = \Big( \sum_{i \in C_1} \mathbf{c}_i \;\Big|\; \cdots \;\Big|\; \sum_{i \in C_{|\mathfrak{p}(\mathbf{x})|}} \mathbf{c}_i \Big) \in \mathbb{Z}^{r \times |\mathfrak{p}(\mathbf{x})|}.$$
We are finally ready to introduce the notion of a non-trivial solution.

Definition 5.
Let $m > r$ be positive integers, $A \in \mathbb{Z}^{r\times m}$, and $\mathbf{b} \in \mathbb{Z}^r$. Then the set $S_1(A, \mathbf{b})$ of non-trivial solutions is the subset of $S(A, \mathbf{b})$ with associated partitions coming from $\mathcal{P}(A)$, that is,
$$S_1(A, \mathbf{b}) = \{\mathbf{x} \in S(A, \mathbf{b}) : \mathfrak{p}(\mathbf{x}) \in \mathcal{P}(A)\}.$$
This definition might seem quite arbitrary, but the interested reader is invited to read the discussions in [11] and [16], which show that, in some sense, it is quite natural in that it encompasses the natural notions of non-triviality for specific systems of linear equations studied in the literature. When we want to investigate the distribution of non-trivial solutions, we actually need to look at $c(A_{\mathfrak{p}})$ for any partition type $\mathfrak{p}$ that is to be considered. A special case in which it suffices to only consider $c(A)$ is that of strictly balanced systems.
Both of our main results can be compared to those proved in [11] and [16]. In the former, the authors investigated the threshold behavior of $|S_1(A, \mathbf{0}) \cap [n]_p^m|$ when $A \in \mathbb{Z}^{r\times m}$ is a positive matrix, and in the latter, Spiegel extended this to $|S_1(A, \mathbf{b}) \cap [n]_p^m|$ for an arbitrary $\mathbf{b} \in \mathbb{Z}^r$ when $A$ is positive and also abundant.
In that sense, their results hold in a more general setting, even when compared to our main technical result, Theorem 18. However, we note that the study of the distribution is more delicate than the threshold behavior, as can be seen by another result from [11], in which the authors explore the behavior at the threshold of nontrivial solutions to Ax = 0 for strictly balanced systems, that is, they work exactly in the same setting as in Theorem 7.
To conclude this section, let us mention that the core of the proof is based on a combination of algebraic ideas dealing with the matrix associated with the linear system of equations and the method of moments in probability theory to show convergence towards a normal distribution. The second part is highly inspired by the techniques developed by Ruciński in [9], where he proved the following result on the distribution of the number of occurrences of a fixed graph $G$ as a subgraph in a binomial random graph.
Theorem 8 ([9]). Let $G$ be a graph with $v(G)$ vertices and $e(G)$ edges. Furthermore, let $n$ be an integer, $0 \le p := p(n) \le 1$ and $X_n$ the random variable counting the number of subgraphs of $G(n, p)$ that are isomorphic to $G$. Then $\widetilde{X}_n$
The proof of our main result, Theorem 18, which in particular implies Theorems 4 and 7, follows a similar structure to that of Theorem 8. Specifically, in order to show that the stated conditions on $p$ are sufficient, one considers three different ranges of $p$ and determines the exact structure of the objects that actually affect the moments.
It turns out that when viewed through the lens of hypergraphs, these important objects will actually be essentially the same for both graphs and systems of linear equations. However, we want to stress that actually establishing this fact requires the delicate study of the more difficult linear algebraic structure of the patterns we want to take into account, and hence our work highly differs from [9].
Plan of the paper. In Section 2 we will state some needed prior results from [11] and [16], as well as prove intermediate lemmas that are needed to establish a meta result, namely Theorem 18, from which one can deduce Theorems 4 and 7. Section 3 will then be devoted entirely to stating and proving this main meta result, Theorem 18. To do this, we will use the so-called method of moments to analyze three distinct regimes of the probability $p$. We conclude with a discussion of some remaining open problems in Section 4. In particular, we will investigate the question of whether the sufficient conditions on $p$ in Theorems 4, 7 and 18 are also necessary.

Algebraic properties of systems of linear equations
Let us start by investigating the relation between proper and non-trivial solutions. For any A ∈ Z r×m and b ∈ Z r it is clear that Furthermore, suppose s > 1 and we fix a specific ( x j · δ i∈C j satisfies p(x) = p and Ax = b (here δ i∈C j denotes the indicator function which takes values 0 or 1 if i ∉ C j and i ∈ C j , respectively). So we get the following lemma, which is essentially Lemma 1.10 in [16].

Lemma 9 ([16]).
Let $m > r$ be positive integers, $A \in \mathbb{Z}^{r\times m}$, and $\mathbf{b} \in \mathbb{Z}^r$. Then for any partition $\mathfrak{p} \in \mathcal{P}(A)$ and any subset $T \subset \mathbb{Z}$, it holds that
Lemma 9 will be very helpful because if $A$ is positive, then the approximate size of $S_0(A, \mathbf{b})$ is not difficult to determine for any vector $\mathbf{b}$. Specifically, we have the following result, which is Lemma 1.4 in [16].

Lemma 10 ([16]). Let m > r be positive integers,
In [11] the authors managed to find a threshold probability function for when non-trivial solutions to a homogeneous system of linear equations appear in $[n]_p$, but for our current purposes we need to make one more specification regarding types of solutions. The issue essentially lies in the fact that while $A$ is positive, the same might not hold for $A_{\mathfrak{p}}$ for some $\mathfrak{p} \in \mathcal{P}(A)$. Note that this can only happen in the case $\mathbf{b} \ne \mathbf{0}$, since clearly $S_0(A_{\mathfrak{p}}, \mathbf{0}) \ne \emptyset$ implies that $A_{\mathfrak{p}}$ is positive. As a problematic example, consider the matrix $A = \begin{pmatrix} 1 & 1 & 1 & 1 & -1 \end{pmatrix}$, which is positive since, for instance, $\mathbf{x} = (1, 2, 3, 4, 10) \in \mathbb{N}^5$ is a proper solution to the system $A\mathbf{x} = \mathbf{0}$. But taking $\mathbf{b} = 6$, we see that $\mathbf{y} = (1, 2, 3, 3, 3)$ satisfies $A\mathbf{y} = \mathbf{b}$. We have $\mathfrak{p}(\mathbf{y}) = (\{1\}, \{2\}, \{3, 4, 5\})$ and hence $A_{\mathfrak{p}(\mathbf{y})} = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix}$, which is clearly not positive.
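The numbers in this example can be reproduced with a short sketch. It assumes that the matrix attached to a partition is obtained by summing the columns of $A$ within each partition class, which is consistent with the matrix $(1\ 1\ 1)$ stated above; the helper names `partition_of` and `collapse` are hypothetical.

```python
def partition_of(x):
    """Partition of {1,...,m}: i and j share a class iff x_i == x_j; classes ordered by their minimum."""
    classes = {}
    for i, value in enumerate(x, start=1):
        classes.setdefault(value, []).append(i)
    return sorted(classes.values(), key=min)


def collapse(matrix, partition):
    """Column j of the result is the sum of the columns of the matrix indexed by the j-th class."""
    return [[sum(row[i - 1] for i in cls) for cls in partition] for row in matrix]


A = [[1, 1, 1, 1, -1]]
y = (1, 2, 3, 3, 3)
p = partition_of(y)
print(sum(a * v for a, v in zip(A[0], y)))  # 6
print(p)                                    # [[1], [2], [3, 4, 5]]
print(collapse(A, p))                       # [[1, 1, 1]]
```

The collapsed equation $z_1 + z_2 + z_3 = 0$ has no solution in positive integers, which is why the collapsed matrix fails to be positive even though $A$ itself is.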
The problem arises as follows: if $A_{\mathfrak{p}}$ is not positive but we still have a significant number of solutions of this type (this can for instance be achieved by using the previous example as a gadget in a larger system), the number of solutions in the binomial random set will depend heavily on whether or not a specific bounded number of elements are included, and so obtaining good results on their distribution is unlikely.
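Having stated this motivation, we first introduce the concept of positive partitions as the subset $\mathcal{P}_0(A) \subset \mathcal{P}(A)$ given by
$$\mathcal{P}_0(A) = \{\mathfrak{p} \in \mathcal{P}(A) : A_{\mathfrak{p}} \text{ is positive}\}.$$
For any $\mathcal{P} \subset \mathcal{P}(A)$ we can now define the concept of type-$\mathcal{P}$ solutions.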
Non-trivial solutions are therefore the same as type-P(A) solutions. The following is a straightforward consequence of Lemma 9 and the upper bound in Lemma 10.

Furthermore, if A is also abundant, then
Proof. We will consider each partition p ∈ P separately. Due to Lemma 9 we can consider y ∈ S 0 (A p , b) ∩ [n] |p| . If y contains an element of Z, then there exists some non-empty index set Q ⊂ [|p|] and a vector z ∈ Z |Q| such that A p is positive, which implies that Indeed, the positivity implies that the removal of any single column will not change the rank of the resulting matrix, and since the rank of a matrix is the dimension of its column space, any subsequent removal decreases the rank by at most 1. Hence Since there are only O(1) choices for p, Q and z, this implies the result. For the second part, note first that for any |p| < m, Relation (1) actually implies the statement already, so it only remains to consider the case |p| = m, that is, y ∈ S 0 (A, b) ∩ [n] m . If A is abundant, the removal of any two columns keeps the rank constant, while any subsequent removal will decrease it by at most 1 at a time. Hence for any Q ⊂ [m] with |Q| ≥ 2 we have which is what we wanted to show.

Compounded matrices
While Lemma 12 showed that the upper bound of Lemma 10 is always helpful in the situation of counting solutions with some entries fixed beforehand, the requirement of positivity is sometimes too restrictive to do the same for the lower bound. Suppose x ∈ S 1 (A, b) is some fixed nontrivial solution. We want then to count the number of solutions y ∈ S 1 (A, b) that intersect is some index set and x ∈ {x} |Q| . Lemma 12 with Z = {x} immediately gives helpful upper bounds, but there are two issues when trying to apply the lower bound of Lemma 10 directly.
The first is that when summing over several distinct $Q$, the same solution will be counted multiple times, but this could be alleviated by just counting a single $Q$ that maximizes the cardinality of the corresponding set. The bigger issue is that it is not clear at all that the matrix $A^{\overline{Q}}_{\mathfrak{p}(\mathbf{y})}$ will be positive, and in fact this is not true in general even when $A_{\mathfrak{p}(\mathbf{y})}$ is itself positive. Consider for instance the matrix $A = \begin{pmatrix} 1 & 1 & -1 \end{pmatrix}$ (associated with the Schur equation $x + y = z$), which is both positive and abundant. Then for any $Q$ with $|Q| = 1$, the equation $A^{\overline{Q}} (x, y)^T = 0$ implies either $x = y$ or $x = -y$, so in either case positivity will be violated.
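The rank condition behind the word "abundant" can be checked mechanically. The sketch below assumes the characterization used in the proof of Lemma 12 (removing any two columns keeps the rank constant) and tests only that rank condition, not positivity. The exact rank routine and the helper names are written for this illustration.

```python
from fractions import Fraction
from itertools import combinations


def rank(matrix):
    """Rank over the rationals, by Gaussian elimination with exact fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def drop_columns(matrix, removed):
    """Copy of the matrix without the (0-based) columns in `removed`."""
    return [[v for j, v in enumerate(row) if j not in removed] for row in matrix]


def rank_survives_removal(matrix, t):
    """True if deleting any t columns leaves the rank unchanged."""
    m = len(matrix[0])
    full = rank(matrix)
    return all(rank(drop_columns(matrix, set(q))) == full for q in combinations(range(m), t))


schur = [[1, 1, -1]]
print(rank_survives_removal(schur, 1), rank_survives_removal(schur, 2))  # True True
```

For the Schur matrix every single remaining column is non-zero, so the rank stays 1 after removing any two columns, in line with the statement that this matrix is abundant.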
Instead, we will consider the concept of the compounded matrix already used in [11] to study the distribution at the threshold. For matrices where a 1 , . . . , a m A denote the columns of A and b 1 , . . . , b m B the columns of B. Note that while we can apply this operator iteratively, the operation is in general not associative or even well-defined, that is An exception to this is the case when dom(M ) = ∅, which we will abbreviate by writing A . ×B. In general, whenever no parentheses are used the compounded matrix is implied to be constructed An important example that will appear often concerns the case A = B and M = id Q the identity function for some index The reason to study these matrices is natural:  We first state an easy but important property that any compounded matrix satisfies: λ 1 c 1 + · · · + λ r 1 +r 2 c r 1 +r 2 must satisfy λ 1 = · · · = λ r 1 = 0. But since the remaining r 2 columns were also linearly independent, we must have λ r 1 +1 = · · · = λ r 1 +r 2 = 0 as well, and hence the r 1 + r 2 columns c 1 , . . . , c r 1 +r 2 are linearly independent. Since the rank of a matrix is the dimension of its column space we are done.
Next we will show that for certain compounded matrices this is indeed the correct rank.
Let r 1 = rk(A) and r 2 = rk(A Q ) and suppose we have r 1 + r 2 + 1 row vectors from this matrix. If at least r 1 + 1 rows come from the upper half, then there exists a nontrivial representation of 0 ∈ Z 1×(2m−|Q|) using only these rows, and hence the remaining coefficients can be set to 0. If on the other hand at least r 2 + 1 rows come from the bottom half, we can achieve a nontrivial representation of 0 only using these rows. Since one of those two cases must happen, we have the required upper bound.
The next result shows how positivity of A can (under some conditions) be passed on to a compounded matrix. Note that if A ∈ Z r×m is abundant, then any Q ⊂ [m] of size |Q| = 1 will satisfy the requirements of Lemma 15. The next result shows that even in the case that Q itself does not satisfy them, we can find a superset that does, which will help us to get universal lower bounds that are sufficient for our applications.

Lemma 16.
Let m > r be positive integers, n ∈ N, A ∈ Z r×m positive, and b ∈ Z r such that Proof. If Q satisfies the assumptions of Lemma 15, we see that A  by c 1 , . . . , c m ∈ Z r the columns of A, and let Q 1 ⊂ Q be the index set of the columns that will always have coefficient zero in a linear combination of the zero vector. We claim that these columns are linearly independent and their span does not contain any column c i with i ∈ Q \ Q 1 . Indeed, linear independence holds because any non-trivial linear combination i∈Q 1 λ i c i = 0 could be extended to a linear combination i∈Q λ i c i = 0 such that λ i = 0 for at least one i ∈ Q 1 , a contradiction to the definition of Q 1 . Similarly, if i∈Q 1 λ i c i = c j for some j ∈ Q \ Q 1 , we obviously see that i∈Q 1 λ i c i − c j = 0 is a linear combination with at least one λ i = 0, again a contradiction.
We thus see that defining Q ′ = Q ∪ Q 1 , the matrix A Q ′ has rank rk(A Q ′ ) = rk(A Q ) − |Q 1 | and its number of columns is m − |Q ′ | = m − |Q| − |Q 1 |, and hence By construction, A Q ′ satisfies the assumption of Lemma 15, and hence for n ∈ N, applying Lemma 10 implies which is what we wanted to show. Lemmas 14 and 15 actually apply to a more general iterated construction. Using the assumptions made in Lemma 13, write Q = {q 1 < · · · < q |Q| }, and define for any integer j ≥ 1 the bijection M j : [j(m − |Q|) . Then all the results mentioned in the aforementioned lemmas also apply in a natural way to the matrix for any t ≥ 1. The idea being that instead of just having a pair of proper solutions to Ax = b, we now have a collection of t + 1 of them that all mutually intersect exactly in the variables indexed by Q. Specifically, we get the following result. Proof. The proof is identical to that of Lemmas 14 and 15, noting that since A is abundant, for any Q ⊂ [m] of size 1 it will hold that S(A Q , 0) contains a solution with all entries non-zero.

The distribution of type-P solutions: the main meta theorem
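We now state our main result (which we call a meta theorem), which will imply Theorems 4 and 7.
Proof of Theorem 7. We see that for any $\mathfrak{p} \in \mathcal{P}(A)$, whenever $S_0(A_{\mathfrak{p}}, \mathbf{0}) \ne \emptyset$, the matrix $A_{\mathfrak{p}}$ is positive by definition, that is, $\mathfrak{p} \in \mathcal{P}_0(A)$. Since $A$ is strictly balanced, it holds that for any $\mathfrak{p} \in \mathcal{P}(A)$ with $|\mathfrak{p}| < m$ and hence $np^{c(A_{\mathfrak{p}})} \to \infty$.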
The proof of Theorem 18 involves the analysis of several sub-cases, which we will split into different subsections. In general, it will follow the ideas that were used by Ruciński in [9] where he proved a similar result to Theorem 18 in order to determine the distribution of the number of occurrences of a fixed graph G as a subgraph in a binomial random graph.

The proof of Theorem 18
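Before going into further case analysis, let us first discuss the common starting point and strategy. Let $\mu_k = \mathbb{E}((X_n - \mathbb{E}(X_n))^k)$ denote the $k$-th central moment of $X_n$ associated to the system of linear equations $A\mathbf{x} = \mathbf{b}$. Our final goal will always be to show that, independently of the system studied,
$$\mu_k = \begin{cases} (1 + o(1))\,(k-1)!!\;\mu_2^{k/2} & \text{if } k \text{ is even},\\ o(\mu_2^{k/2}) & \text{if } k \text{ is odd}. \end{cases} \tag{3}$$
These estimates would show that the moments $\mathbb{E}(\widetilde{X}_n^k)$ of the normalized variable converge to the moments of a normal distribution, which is uniquely determined by its moments.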
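The target values in the method of moments are the moments of a standard normal distribution: $0$ for odd $k$ and $(k-1)!!$ for even $k$. The sketch below lists them and, as a toy illustration unrelated to the random variable studied in this paper, shows the ratio $\mu_4/\mu_2^2$ of a Binomial$(N, 1/2)$ variable approaching the normal value $3$. Both function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def normal_moment(k):
    """k-th moment of a standard normal variable: 0 for odd k, (k-1)!! for even k."""
    if k % 2 == 1:
        return 0
    result = 1
    for j in range(k - 1, 0, -2):
        result *= j
    return result


def binomial_moment_ratio(trials, k):
    """mu_k / mu_2^(k/2) for Binomial(trials, 1/2) and even k, computed exactly."""
    half = Fraction(1, 2)
    mean = trials * half

    def central(j):
        return sum(comb(trials, i) * half ** trials * (i - mean) ** j
                   for i in range(trials + 1))

    return central(k) / central(2) ** (k // 2)


print([normal_moment(k) for k in range(1, 9)])  # [0, 1, 0, 3, 0, 15, 0, 105]
print(float(binomial_moment_ratio(100, 4)))     # 2.98
```

For the binomial example the ratio equals $3 - 2/N$ exactly, so it converges to the fourth moment of the standard normal as $N$ grows.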
Given a solution $\mathbf{x} = (x_1, \dots, x_m)$ in $S(A, \mathbf{b})$, denote by $I_{\mathbf{x}}$ the indicator random variable for the event $\{x_1, \dots, x_m\} \subset [n]_p$. Abusing notation somewhat, if $\chi = (\mathbf{x}_1, \dots, \mathbf{x}_k) \in S(A, \mathbf{b})^k$ is a $k$-tuple of solutions, we will write $\{\chi\} = \bigcup_{i=1}^{k} \{\mathbf{x}_i\}$. As a visual shorthand, bold Latin letters will indicate solutions, while Greek letters will denote tuples of solutions. Note first that by definition we have
where $S_k$ is the set containing all $k$-tuples $\chi = (\mathbf{x}_1, \dots,$
Equation (4) behaves slightly differently depending on the behavior of $p$, so suppose $p \to a$ for some constant $a \in [0, 1]$. We split the analysis into three cases, depending on whether this limit belongs to $(0, 1)$, is equal to 1, or is equal to 0.

Case 1: 0 < a < 1.
In this case we see that (4) implies $\mu_k = \sum_{\chi \in S_k} \Theta(1)$, so we need to analyze the cardinality of $S_k$. We are going to prove (3) by induction on $k$. The base cases $k = 1$ and $k = 2$ are clearly always true. Our induction hypothesis then tells us that for any $\ell < k$ it holds that $\mu_\ell = O(\mu_2^{\ell/2})$ and hence by our previous observation
Let us now study the statement for parameter $k$. The analysis will be different depending on whether $k$ is odd or even. Let $S'_k$ denote the subset of $S_k$ containing all $k$-tuples of solutions $(\mathbf{x}_1, \dots, \mathbf{x}_k)$ with $\mathbf{x}_i \in S_{\mathcal{P}}(A, \mathbf{b}) \cap [n]^m$ such that for every $i$ the choice of $j \ne i$ with $\{\mathbf{x}_i\} \cap \{\mathbf{x}_j\} \ne \emptyset$ is unique, that is, the solutions can be grouped into $k/2$ pairwise disjoint pairs. Note that clearly, $S'_k = \emptyset$ for every odd $k$.
To see that this is true, suppose χ := (x 1 , . . . , x k ) ∈ S ′ k . Then there must exist an index is contained in S k−1 . Taking the minimum choice of i for each χ, we have thus defined a map π : S ′ k → S k−1 , and hence

Since by induction |S
We will first determine a lower bound on |S 2 |. It is clear that . Since A is abundant, any Q containing exactly one index will satisfy the conditions of Lemma 15, so fix an arbitrary one. It follows by this and Lemma 10 applied to We now turn to giving an upper bound on max υ∈S k−1 |π −1 (υ)|, so fix an υ ∈ S k−1 . By definition of S k , any solution that could extend υ to a k-tuple in S k must intersect {υ}, and hence by Lemma 12 we see that by the previously obtained lower bound (5). Since S ′ k = ∅ for odd k, this also proves the odd case of (3).
We now turn to the case of even $k$, noting that (3.1.1) tells us that essentially, the tuples that are summed over in $\mu_k$ are $k/2$ pairwise disjoint pairs of those summed over in $\mu_2$. We begin by showing that all but a negligible number of pairs $(\mathbf{x}, \mathbf{y}) \in S_2$ satisfy $\mathbf{x}, \mathbf{y} \in S_0(A, \mathbf{b})$ and $|\{\mathbf{x}\} \cap \{\mathbf{y}\}| = 1$. Let us first see that we can restrict ourselves to pairs of proper solutions. This follows from an argument similar to the one used in the proof of Lemma 12. Namely, if we fix an arbitrary $\mathbf{x} \in S_{\mathcal{P}}(A, \mathbf{b})$, then by (1)
Since there are at most $n^{m - \mathrm{rk}(A)}$ choices for $\mathbf{x}$, this is negligible when compared to $|S_2|$. Applying the second part of Lemma 12 directly, the number of pairs of proper solutions that intersect in at least two elements is negligible as well. Note that for any pair $(\mathbf{x}, \mathbf{y})$ satisfying the structure described above, we have $\mathbb{E}\big((I_{\mathbf{x}} - p^{|\{\mathbf{x}\}|})(I_{\mathbf{y}} - p^{|\{\mathbf{y}\}|})\big) \sim a^{2m-1}(1 - a)$.
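The asymptotic value at the end of this paragraph comes from an exact identity: for two proper solutions that share exactly one element, $\mathbb{E}(I_{\mathbf{x}} I_{\mathbf{y}}) = p^{2m-1}$ and $\mathbb{E}(I_{\mathbf{x}})\mathbb{E}(I_{\mathbf{y}}) = p^{2m}$, so the covariance is $p^{2m-1}(1-p)$. The sketch below verifies this by enumerating all outcomes on the union of the two supports; the function name and the chosen pair of 3-APs are assumptions for illustration.

```python
from fractions import Fraction
from itertools import combinations


def covariance_of_indicators(x, y, p):
    """Exact Cov(I_x, I_y), where I_z = 1 iff every entry of z lies in the random set."""
    sx, sy = set(x), set(y)
    ground = sorted(sx | sy)
    e_xy = e_x = e_y = Fraction(0)
    for size in range(len(ground) + 1):
        for chosen in combinations(ground, size):
            c = set(chosen)
            weight = p ** size * (1 - p) ** (len(ground) - size)
            ix, iy = sx <= c, sy <= c
            e_x += weight * ix
            e_y += weight * iy
            e_xy += weight * (ix and iy)
    return e_xy - e_x * e_y


p = Fraction(1, 3)
x, y = (1, 2, 3), (3, 4, 5)  # two proper 3-AP solutions sharing exactly the element 3
m = 3
print(covariance_of_indicators(x, y, p) == p ** (2 * m - 1) * (1 - p))  # True
```

Elements outside the two supports do not affect either indicator, so restricting the enumeration to the union of the supports loses nothing.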
Finally, note that for any pair χ ∈ S 2 , the number of pairs υ ∈ S 2 that share elements with χ is negligible: Clearly, any such υ will consist of an x ∈ S 0 (A, b) and hence, as p tends to a ∈ (0, 1), which is what we wanted to prove.

Case 2: a = 1.
Recall that p is chosen such that n(1 − p) tends to infinity. This property will be specially relevant in this case. We will instead look at the complements, so if x is a solution to Ax = b, define q = 1 − p andĪ x = 1 − I x . We see that ν x := E(Ī x ) = 1 − p |{x}| ∼ |{x}|q when n tends to infinity. Let χ = (x 1 , . . . , x k ) ∈ S k be a k-tuple of solutions in S P (A, b) ∩ [n] m . Using these definitions we see that (4) can be written as We associate then Furthermore, it is clear that for any t ∈ [k − 1], every (k − t)-sub-collection of χ will require removal of at least s − t elements to destroy all solutions in it, that is, for any I ⊂ [k] with |I| = k − t it holds that Note here, we interpret χ as a k-vector in which every coordinate is itself an m-vector, so χ I denotes the vector obtained by only keeping the solutions indexed by I. Putting this together, we see that (6) becomes where S k,s is the subset of k-tuples χ ∈ S k such that τ (H(χ)) = s. We will now show that for any fixed s, only χ of a certain structure contribute significantly to (7). For this, define by S ′ k,s the set of s-milky ways, which are χ ∈ S k,s such that H(χ) is the union of s disjoint components, with all edges in a component intersecting in a unique vertex. Note that this set is only non-empty for s ≤ ⌊k/2⌋. Furthermore, note that each component corresponds to a matrix as described in (2) We will now show by induction on k that for any k and s, This holds trivially for k = 1, and for k = 2, it follows immediately from Lemma 12, since A is abundant and s = 1 being the only vertex cover number leading to a non-empty set. So suppose k ≥ 3 and note that the bound (9) together with the induction hypothesis in particular implies |S ℓ,s | = O(n ℓ(m−rk(A)−1)+s ) for any ℓ < k. We will split up S ′ k,s even further, into disjoint sets S ′′ k,s , S ′′′ k,s , and S ′′′′ k,s , which are defined as follows: a) The set S ′′ k,s will contain all χ = (x 1 , . . . 
, x k ) ∈ S ′ k,s such that there are solutions x i , x j ∈ χ satisfying {χ [k]\{i,j} } ∩ {χ {i,j} } = ∅, and |{x i } ∩ {x j }| ≥ 2.
b) The set S ′′′ k,s contains all χ = (x 1 , . . . , x k ) ∈ S ′ k,s \ S ′′ k,s , such that there exists a solution are the remaining non-milky ways.
We proceed to analyze the cardinality of each set. a) We start with case S ′′ k,s . For every χ ∈ S ′′ k,s , there are indices i and j such that χ [k]\{i,j} ∈ S k−2,s−1 , and taking the lexicographic smallest pair we have defined a map π : S ′′ k,s → S k−2,s−1 , which implies b) We continue with the analysis of S ′′′ k,s . In this case, taking the smallest possible index i, this again defines a map π : S ′′′ k,s → S k−1,s ∪ S k−1,s−1 and we see by induction hypothesis that Again, for any υ ∈ S k−1,s ∪ S k−1,s−1 , we see that |π −1 (υ)| is at most the number of solutions y ∈ S P (A, b) c) Finally, it remains to look at S ′′′′ k,s . If χ = (x 1 , . . . , x k ) in S ′′′′ k,s , we claim that there must exist an index i such that χ [k]\{i} ∈ S k−1,s−1 . In the sequel, we will only consider the components of H(χ) that do not intersect in a unique vertex, of which there is at least one since χ / ∈ S ′ k,s . First note that it is clear that since χ / ∈ S ′ k,s ∪ S ′′ k,s , there must exist an index i such that χ [k]\{i} ∈ S k−1,s−1 ∪ S k−1,s or in other words, the remaining solutions are still intersecting. Since χ / ∈ S ′′′ k,s , for all of these indices, it must hold that |{x i } ∩ {χ [k]\{i} }| = 1. Finally, for at least one index i of the previously considered, it must actually hold that there is a unique j i such that Indeed, since the components we consider are not sunflowers, there must exist an x ℓ that intersects the rest of its component in at least two points, and so the negation of (12) would imply χ ∈ S ′′′ k,s , since x ℓ would be a valid choice. We see that for any valid i satisfying (12) it holds that χ [k]\{i} ∈ S k−1,s−1 . Having established this, we repeat the arguments already used, taking the minimal valid i and defining the appropriate function π : S ′′′′ k,s → S k−1,s−1 and conclude Together with (8), we see that (10), (11) and (12) imply that only s-milky ways contribute meaningfully when s ≤ ⌊k/2⌋. 
Sadly, this bound is not quite strong enough when s > ⌊k/2⌋, since s-milky ways do not exist here. For this, we will prove by induction on k that for any k and s ≥ ⌊k/2⌋ it holds that Again, the cases k = 1 and k = 2 follow from previous observations. Moreover, the statement is trivially true for any k and s = ⌊k/2⌋, so suppose k ≥ 3 and s > ⌊k/2⌋. If χ = (x 1 , . . . , x k ) ∈ S k,s , then there must exist a least index i such that χ [k]\{i} ∈ S k−1,s ∪ S k−1,s−1 , which allows us again to define a projection map π : S k,s → S k−1,s ∪ S k−1,s−1 . Since s ≥ ⌊k/2⌋ + 1 we have s − 1 ≥ ⌊(k − 1)/2⌋, and hence by induction hypothesis we have which in particular implies (13). Since q → 0, we thus have and so the terms with s > ⌊k/2⌋ in (7) can be discarded. Now, if χ ∈ S ′ k,s is a milky way, we see that since the removal of any single solution does not impact the cover number, while any subsequent removal can reduce it by at most one. Furthermore, by our previous observations we have already seen that for any k and s ≤ ⌊k/2⌋ |S ′ k,s |q s = Θ(n k(m−rk(A)−1) (nq) s ), so since nq → ∞ by assumption only the s = ⌊k/2⌋ term contributes significantly. Hence (7) can be rewritten as This proves (3) for odd k, since For the even case, it is easy to see that one can repeat the argument from Case 1 to show that only those milky ways with all solutions being proper contribute meaningfully and that among those, the number of pairs in S ′ 2,1 that intersect another pair is negligible, so just like in that case we conclude

Case 3: a = 0.
Finally, in this case the crucial property of p that will be used is the fact that np c(Ap) tends to infinity for every considered partition type p ∈ P with non-empty solution set. Let χ = (x 1 , . . . , x k ) ∈ S k . Then for any I ⊊ [k] we see that since every x j with j ∉ I intersects at least one other solution. Since we clearly have Furthermore, since np c(Ap) → ∞ for any p ∈ P, we see that for any nonempty Q ⊂ [|p|], it holds that n |Q|−r Q (Ap) p |Q| = ω(1).
Let us prove (3) via induction on k, the cases k = 1 and k = 2 clearly being true. Note that in particular, the induction hypothesis implies that for any ℓ < k it holds that µ ℓ = O(µ ℓ/2 2 ). Let us decompose µ k as Recall that S ′ k denoted the subset of S k such that every x ∈ χ had a unique partner y ∈ χ and was disjoint from all other components. Our first step will be to show that which would in particular imply (3) in the case of odd k, since here S ′ k = ∅. To see this, note that χ = (x 1 , . . . , x k ) ∈ S k \ S ′ k implies that there must exist an index i ∈ [k] such that χ [k]\{i} ∈ S k−1 . Let i χ denote the least index i ∈ [k] for which this is true, then we can define the map π : S k \ S ′ k → S k−1 by π(χ) = χ [k]\{iχ} . Furthermore, let Q χ ⊂ [|p(x iχ )|] denote the index set of all the components of x χ that are contained in {π(χ)}. Then n |p|−rk(A)−(|Q|−r Q (Ap)) p |p|−|Q| Since ) it thus suffices to show that the remaining expression in (16) is o(µ 2 1/2 ). But by Lemma 16 there exists a Q ′ ⊃ Q such that, using p → 0 we see that which follows from (3.1.3). As stated before, this finishes the proof in the case of odd k, so suppose k is even. If p, q ∈ P and M : P → Q is a bijection between nonempty P ⊂ [|p|] and Q ⊂ [|q|], we will call the triple (p, q, M ) leading if and hence Let us first make an observation that will be helpful later. Suppose |p| ≥ |q|, and let M : P → Q be a bijection between some index sets P ⊂ [|p|] and Q ⊂ [|q|]. Then by Lemma 16 there exists a P ′ ⊃ P such that using np → ∞ and p → 0 we see Here the last line followed from the fact that rk(A p |S 0 (C ∩ [n] 2|p|−|P | )| = Θ(n 2|p|−(|P |−r P (Ap)) ), and v) (p, p, id P ) and (q, q, id Q ) are leading overlaps, since otherwise some of the Ω terms in Equation (17) would turn into 'ω' terms. Note that these things in particular imply that there cannot exist any P ′ P satisfying since p → 0 would then imply that (p, p, id P ) is not a leading triple. 
By the proof of Lemma 16 this means that there does not exist any index i ∈ [|p|] such that x i = 0 for every solution x ∈ S (A p , 0). Before continuing, let us quickly try to understand the preceding statements. Essentially, one would naively hope that all leading triples are of the form (p, p, id P ) for some P ⊂  A p , b) and S 0 (A q , b).
We can now continue with the actual proof by defining S ′′ k ⊂ S ′ k to be the set of k-tuples χ = (x 1 , . . . , x k ) such that whenever 1 ≤ i < j ≤ k and |{x i } ∩ {x j }| = s > 0, it holds that (p(x i ), p(x j ), M i,j ) is a leading triple. Here M i,j is the bijection defining the incidences between the distinct components of x i and x j . We will show that To see this, we will again define a map π : is not a leading triple, let {i χ , j χ } be the set minimizing min{i, j}. Then we can define π(χ) = χ [k]\{iχ,jχ} and see that  (p 1 , q 1 , M 1 ), . . . , (p u , q u , M u ), and for any choice of 0 ≤ i 1 , . . . , i u ≤ k/2 such that i j = k/2, define the matrix A(i 1 , . . . , i u ) by iu times , and write ℓ j for the expression |p j | + |q j | − | dom(M j )|. We also need the following result.

Concluding remarks
In this paper we have established sufficient conditions on p that guarantee a normal limiting distribution for the number of solutions to linear systems of equations of the form Ax = b. Specifically, we showed that $n(1-p) \to \infty$ and $np^{c(A_p)} \to \infty$ (for all partitions under consideration) suffice, and we used them in their full strength in Cases 2 and 3, respectively, of the proof of Theorem 18. Both of these conditions are analogous to those that appear in Ruciński's proof of normality for the number of copies of a given subgraph H in the binomial random graph model G(n, p). Note that when comparing to the graph setting, the elements of [n] in systems take on the function of both vertices and edges, and hence the analogue of our $n(1-p) \to \infty$ requirement is to ask that $n^2(1-p)$, the order of the expected number of non-edges, is unbounded.
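
As an informal illustration of the kind of statement established here, the following Python sketch simulates the count of proper 3-term arithmetic progressions (solutions of x_1 − 2x_2 + x_3 = 0 with distinct values) in the random set [n]_p and standardizes the empirical counts. It is not part of the argument of the paper: the function names, the choice of system, and all parameters are assumptions made for illustration, and the code counts unordered progressions, which may differ from the count used in the paper by a constant factor.

```python
import random
from statistics import mean, pstdev


def count_3aps(subset):
    """Count proper 3-term arithmetic progressions x < y < z with x + z = 2y."""
    elems = set(subset)
    ordered = sorted(elems)
    count = 0
    for i, x in enumerate(ordered):
        for z in ordered[i + 1:]:
            # The midpoint is an integer only if x and z have the same parity.
            if (x + z) % 2 == 0 and (x + z) // 2 in elems:
                count += 1
    return count


def random_subset(n, p, rng):
    """Sample [n]_p: keep each element of {1, ..., n} independently with probability p."""
    return [i for i in range(1, n + 1) if rng.random() < p]


def standardized_counts(n, p, trials, seed=0):
    """Empirically centre and rescale the 3-AP count over repeated samples of [n]_p."""
    rng = random.Random(seed)
    counts = [count_3aps(random_subset(n, p, rng)) for _ in range(trials)]
    mu, sigma = mean(counts), pstdev(counts)
    if sigma == 0:
        # Degenerate sample: every trial gave the same count.
        return [0.0] * trials
    return [(c - mu) / sigma for c in counts]
```

Plotting a histogram of `standardized_counts(n, p, trials)` for moderately large n and a fixed p away from 0 and 1 gives a rough visual comparison with the standard normal density. Such an experiment is only suggestive and says nothing about the boundary regimes discussed below.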
In fact, in [9] Ruciński showed that those conditions were necessary as well as sufficient. In our context we can say something similar regarding the condition that $n(1-p)$ is unbounded. Observe that the expression $n(1-p)$ can be interpreted as the expected number of elements that are not chosen in $[n]_p$; hence, if $n(1-p)$ remains bounded, then $[n]_p$ is typically the whole interval [n] with the exception of a bounded number of elements. It is then easy to show that in this regime $X_n$ is strongly concentrated around its mean value, so that $\tilde{X}_n \xrightarrow{d} 0$, and hence we do not have a normal limiting distribution for the number of solutions.
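
The claim about the number of missing elements can be made explicit with Markov's inequality. The following is only a sketch: the symbol $M_n$ for the number of elements of [n] not chosen in $[n]_p$ and the constant C bounding $n(1-p)$ are notation introduced here for illustration. Since $M_n$ is binomially distributed with parameters n and $1-p$,

\[
\mathbb{E}[M_n] = n(1-p) \le C
\qquad \Longrightarrow \qquad
\mathbb{P}(M_n \ge t) \le \frac{\mathbb{E}[M_n]}{t} \le \frac{C}{t} \quad \text{for every } t > 0,
\]

so with probability at least $1 - C/t$ the random set $[n]_p$ misses fewer than t elements of [n], uniformly in n.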
However, the argument used by Ruciński to study the second condition requires a delicate study of moments of order 4 and 6, which we are unable to adapt. Roughly speaking, in our algebraic setting the structure of solutions is more complicated than in the graph setting, as solutions with repeated components can be valid ones (the graph analogue would be to also count copies of subgraphs of the fixed graph H). Hence, an open question arising from our work is to obtain an "only if" counterpart of Theorem 18. As an intermediate step, one could investigate the two main corollaries of our meta theorem, Theorems 4 and 7, and in fact, the following arguments suggest that the sufficient conditions on p stated in these theorems are also necessary.
Let us start with the setting of Theorem 7, that of non-trivial solutions in strictly balanced homogeneous systems of linear equations. Here, as discussed before, Rué, Spiegel and Zumalacárregui [11] already established a threshold result showing that when $np^{c(A)} = o(1)$, asymptotically almost surely $S_1(A, \mathbf{0}) \cap [n]_p^m$ is empty. Using a concentration argument similar to the one above when discussing the necessity of $n(1-p) \to \infty$, we see that in this case $|S_1(A, \mathbf{0}) \cap [n]_p^m|$ converges in distribution to the constant 0. In addition to this threshold result, the authors also studied the distribution at the threshold, that is, the case $np^{c(A)} \to a > 0$ where a is a constant. Their results show that here the random variable $|S_1(A, \mathbf{0}) \cap [n]_p^m|$ converges in distribution to a Poisson random variable. Putting all of this together, we see that this establishes the only if direction for Theorem 7.
Let us now turn to the setting of Theorem 4, that of proper solutions, where there are no repeated variables by definition. The argument that $np^{c(A)} = o(1)$ implies that $\tilde{X}_n \xrightarrow{d} 0$ is the same as before, so what remains is to understand the case when $np^{c(A)}$ tends to some positive constant. When A is strictly balanced, one can use the same arguments that were used by Rué, Spiegel and Zumalacárregui in the proof of the strictly balanced homogeneous case for the distribution of non-trivial solutions to see that one will also have a Poisson distribution in the setting of Theorem 4. When A is not strictly balanced, an analysis as was performed by Ruciński in the subgraph setting is needed, but since we are now in the situation that all variables are pairwise distinct, the same result should follow, which would establish the necessity of $np^{c(A)} \to \infty$.
To conclude, let us mention that once one has proved a normal limiting distribution, a natural next question is the study of local limit theorems as well as anti-concentration results and tail estimates in a general context. This has been a very active line of research in recent years; see for instance [18,3].