Generating Product Systems

Generalizing Krieger's finite generation theorem, we give conditions for an ergodic system to be generated by a pair of partitions, each required to be measurable with respect to a given sub-algebra, and also required to have a fixed size.


Introduction
Let (Z, FZ, µ) be a probability space, and let T : Z → Z be a measurable, probability-preserving transformation of Z. If µ is ergodic with respect to T, we say that (Z, FZ, µ, T) is an ergodic system.
A partition P of Z is called generating if the sigma-algebra generated by $\bigvee_{i=-\infty}^{\infty} T^i P$ and the sigma-algebra of null and co-null sets is equal to FZ. If P has a parts, then for any $\bigvee_i T^i P$-measurable partition Q, the entropy h(T, Q) is less than or equal to log a. Hence, if h(T) > log a, then there are no generating partitions with a parts. In [4], Krieger proved the following partial converse:
Theorem 1.1 (Krieger). Let (Z, FZ, µ, T) be an ergodic system, and let a be an integer such that log a > h(T). Then there is a generating partition P of Z that has a parts.
The case where h(T ) is equal to log a is different; there is a unique ergodic system for which there is a generating partition with a parts (namely, the Bernoulli system).
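As a small numerical illustration of the bound h(T, Q) ≤ log a, the following sketch computes the Shannon entropy of a probability vector with a entries and checks that it never exceeds log a, with equality for the uniform vector. The static entropy of a partition is used here only as the standard upper bound for h(T, P); the function name `shannon_entropy` and the sample vectors are illustrative assumptions, and logarithms are taken in base 2.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (base 2) of a probability vector; empty parts contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical one-step distributions of a partition with a = 3 parts.
a = 3
uniform = [1 / a] * a
skewed = [0.7, 0.2, 0.1]

for label, probs in (("uniform", uniform), ("skewed", skewed)):
    print(label, shannon_entropy(probs), "<=", math.log2(a))
```

Only the uniform vector attains log a, which matches the special role of the equality case described above.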
Krieger's theorem can be thought of as an infinite version of Shannon's coding theorem for noiseless channels. In Shannon's theorem the block size is finite and the error remains positive but converges to zero as the size of the block grows. Shannon's theorem for more general channels was also given a zero-error version using infinite codes, as is usual in ergodic theory, in a series of papers including [2] and [3]. In information theory there is a surprising extension of Shannon's coding theorem to correlated sources due to Slepian and Wolf [6]. Our main result is a zero-error version of the Slepian-Wolf result for noiseless channels. Combining the techniques of Kieffer with those of this paper would lead to a zero-error version of the Slepian-Wolf theorem in the context of noisy channels, but we leave these extensions for the future.
In order to describe the result, consider the following scenario: Suppose that FX, FY ⊂ FZ are sub-sigma-algebras and that FZ = FX ∨ FY (without loss of generality, we may assume that Z = X × Y is a product space, and that FX, FY are the corresponding sigma-algebras). We seek a pair of partitions PX, PY such that PX is FX-measurable and has a parts, PY is FY-measurable and has b parts, and the partition PX ∨ PY generates. There are three necessary conditions for such partitions to exist: the first is that the conditional entropy of T with respect to FY, which we denote by h(T|FY), is less than or equal to log a; the second is that h(T|FX) ≤ log b; and the third is that h(T) ≤ log a + log b. As in Theorem 1.1, we show that these conditions are almost sufficient:
Theorem 1.2. Let (Z, FZ, µ, T) be an ergodic and probability-preserving system, let FX, FY be invariant sub-sigma-algebras of FZ, and let a and b be integers. Assume that log a > h(T|FY), that log b > h(T|FX), and that log a + log b > h(T). Then there are partitions PX and PY of Z such that
1. PX is measurable with respect to FX and has a parts.
2. PY is measurable with respect to FY and has b parts.
3. PX ∨ PY generates the sigma-algebra FX ∨ FY.
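The three strict inequalities in Theorem 1.2 are easy to test for given numbers. The sketch below checks them for candidate sizes a and b; the function name `satisfies_theorem_1_2` and the entropy values (in bits) are hypothetical and chosen only for illustration, not taken from any particular system.

```python
import math


def satisfies_theorem_1_2(h_total, h_given_y, h_given_x, a, b):
    """True when log a > h(T|FY), log b > h(T|FX) and log a + log b > h(T) (base 2)."""
    return (
        math.log2(a) > h_given_y
        and math.log2(b) > h_given_x
        and math.log2(a) + math.log2(b) > h_total
    )


# Hypothetical entropies: h(T) = 2.5, h(T|FY) = 0.9, h(T|FX) = 1.2.
entropies = (2.5, 0.9, 1.2)
for a, b in [(2, 3), (3, 2), (2, 2)]:
    print((a, b), satisfies_theorem_1_2(*entropies, a, b))
```

With these values the pair (2, 3) is admissible, while (3, 2) fails the condition on b and (2, 2) fails both the condition on b and the joint condition.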
More generally, it is straightforward to generalize Theorem 1.2 to the case of n partitions. Namely, assume that there are n sigma-algebras F1, . . . , Fn and n integers a1, . . . , an. If for every non-empty subset S ⊂ {1, . . . , n} we have $\sum_{i \in S} \log a_i > h(T \mid \bigvee_{j \notin S} F_j)$, then there are partitions P1, . . . , Pn such that Pi has ai parts and is Fi-measurable, and such that P1 ∨ . . . ∨ Pn generates.
The proof of Theorem 1.2 is given in Section 4, and consists of two parts. In the first part (Proposition 4.2) we show that there are partitions PX, PY that satisfy the first two requirements, and such that the partition PX ∨ PY is close to being a generating partition (in the sense that the sigma-algebra generated by it contains a partition which is close to a fixed generating partition of Z). This is done by a procedure of 'painting names along Rohlin towers'. In the second part (Proposition 4.4) we show that one can slightly change these partitions and get partitions that are even closer to being generating. The proof of Proposition 4.4 uses the procedure of 're-painting names along Rohlin towers'. Applying Proposition 4.4 inductively, we get a Cauchy sequence of partitions that converges to a generating partition, which proves Theorem 1.2.
The procedures of painting and repainting along Rohlin towers are explained in Section 3. Finally, Section 2 contains several auxiliary results, the most important of which is a corollary of the Shannon-McMillan theorem. No novelty is claimed for this material; it is assembled for the reader's convenience.

1. A (Rohlin) tower is a pair (A, M), where A is a measurable subset of Z and M is a natural number, such that for any 0 < i < M, the sets A and $T^i A$ are disjoint.
2. If $\mathcal{T} = (A, M)$ is a tower, the set A is called the base of the tower, and is denoted by $b\mathcal{T}$; the number M is called the height of the tower, and is denoted by $|\mathcal{T}|$.
3. If ε is a real number, we say that a tower $\mathcal{T} = (A, M)$ covers more than ε of the space if $\mu\left(\bigcup_{i=0}^{M-1} T^i A\right) > \epsilon$.
For a proof of the following theorem, see [5].
Theorem 2.4. Let (Z, FZ, µ, T) be an ergodic system, let f : Z → C be a bounded function, and let P be a partition of Z. For every ε > 0 there is a natural number N such that if $\mathcal{T}$ is a tower of height greater than N that covers more than ε of the space, then
1. The set of points z in the base of $\mathcal{T}$ that satisfy has measure greater than (1 − ε)µ(A).
2. There is a collection $\mathcal{N}$ of $2^{(h(Z,P)+\epsilon)|\mathcal{T}|}$ elements in $|P|^{|\mathcal{T}|}$ such that (a) For every $n = (n_i) \in \mathcal{N}$, the measure of the set of z's in A whose $(\mathcal{T}, P)$-name is equal to n is between $2^{-(h(Z,P)+\epsilon)|\mathcal{T}|}$ and
For a set Ω, we denote by $2^{\Omega}$ the collection of finite subsets of Ω.
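To make the notions of tower, base, height and name concrete, here is a toy sketch on a finite system: Z = {0, ..., n − 1} with the uniform measure and the rotation T z = z + 1 (mod n). It assumes the usual convention that the $(\mathcal{T}, P)$-name of a base point z is the sequence of P-labels of z, Tz, ..., $T^{M-1}z$; the helper names `is_tower`, `coverage` and `name` are hypothetical.

```python
def is_tower(base, height, n):
    """Check that the base A and T^i A are disjoint for 0 < i < height."""
    base = set(base)
    return all(
        base.isdisjoint({(z + i) % n for z in base}) for i in range(1, height)
    )


def coverage(base, height, n):
    """Fraction of the space covered by the levels A, TA, ..., T^(height-1) A."""
    levels = {(z + i) % n for z in base for i in range(height)}
    return len(levels) / n


def name(z, height, n, partition):
    """Labels of z, Tz, ..., T^(height-1) z under the partition (a function on Z)."""
    return tuple(partition((z + i) % n) for i in range(height))


n = 12
parity = lambda z: z % 2  # hypothetical two-part partition of Z
print(is_tower({0, 5}, 4, n), coverage({0, 5}, 4, n))
print(name(5, 4, n, parity))
```

In this example the pair ({0, 5}, 4) is a tower covering two thirds of the space, whereas ({0, 2}, 4) is not a tower because the base meets its second translate.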
Corollary 2.6. Let (Z, T) be an ergodic system, and let P, Q be two partitions of Z. Denote the sub-algebra $\bigvee_{i=-\infty}^{\infty} T^i P$ by F. For every ε > 0, there is an N such that if $\mathcal{T} = (A, M)$ is a tower of height greater than N that covers more than ε of the space, then there is a function $\Phi : \mathcal{N}_P \to 2^{\mathcal{N}_Q}$ such that
1. For every z ∈ A, the set $\Phi(\nu_{\mathcal{T},P}(z))$ has fewer than $2^{|\mathcal{T}|(h(T,Q|F)+\epsilon)}$ elements.
2. The set of z's in A such that $\Phi(\nu_{\mathcal{T},P}(z))$ contains the $(\mathcal{T}, Q)$-name of z has measure greater than (1 − ε)µ(A).
Proof. By Theorem 2.4, if N is large enough, then there are collections such that for any $n \in \mathcal{N}$, the measure of the set of points whose $(\mathcal{T}, P)$-name is equal to n is less than $2^{-|\mathcal{T}|(h(T,P)-\epsilon)}$, for every $m \in \mathcal{M}$, the measure of the set of points whose $(\mathcal{T}, P \times Q)$-name is equal to m is greater than $2^{-|\mathcal{T}|(h(T,P \times Q)+\epsilon)}$, and the set of points whose $(\mathcal{T}, P)$-names are not in $\mathcal{N}$ or whose $(\mathcal{T}, P \times Q)$-names are not in $\mathcal{M}$ has measure less than εµ(A). Let $\pi_1 : |P \times Q|^{|\mathcal{T}|} \to |P|^{|\mathcal{T}|}$ and $\pi_2 : |P \times Q|^{|\mathcal{T}|} \to |Q|^{|\mathcal{T}|}$ be the coordinate projections. Let $n \in \mathcal{N}_P$. If $n \in \mathcal{N}$ then
We will use the standard notation for the entropy function:
The following is well known. For a (much more) general statement, see [1, Theorem 3.9].
Lemma 2.7. Let P be a partition, let G be an invariant sub-sigma-algebra, and let $\mathcal{T} = (A, M)$ be a G-measurable tower that covers more than 1 − ε of the space. If there is a G-measurable function $\Phi : A \to 2^{\mathcal{N}_P}$ such that the number of elements in Φ(z) is less than $2^{h|\mathcal{T}|}$ and the set of z ∈ A such that $\nu_{\mathcal{T},P}(z) \in \Phi(z)$ has measure greater than (1 − ε)µ(A), then h(T, P|G) < h + H(ε) + 2ε|P|.
Definition 2.8. Let P, Q be partitions of a probability space (Z, µ). Define
Definition 2.9. Let P be a partition, let $\mathcal{A}$ be an algebra of sets, and let
The following lemma is evident:
Proposition 2.11. Let (Z, T) be an ergodic system, and let P, Q be par- Then for every ε > 0, there is a K such that for every tower $\mathcal{T} = (A, M)$ which covers more than ε of the space and for which M > K, there is a function $\Phi : \mathcal{N}_Q \to 2^{\mathcal{N}_P}$ such that
Proof. By assumption, there is a number N and a function $\varphi : |Q|^{2N+1} \to |P|$ such that the set has measure greater than 1 − δ. Let η > 0, to be chosen later. By Theorem 2.4, if $\mathcal{T} = (A, M)$ is a tower that covers more than ε of the space, and M is sufficiently large, then the set of z ∈ A such that has measure greater than (1 − ε)µ(A). If we require in addition that N < ηM/2 we get that for those z's, For $(q_i) \in |Q|^M$, let $\Phi((q_i))$ be the set of elements in $|P|^M$ whose Hamming distance from $\psi((q_i))$ is less than (η + δ)M. The size of $\Phi((q_i))$ is less than which is less than $2^{(H(\delta)+\epsilon+\delta|P|)M}$ if η is taken small enough. Moreover, for every z that satisfies (1), $\nu_{\mathcal{T},P}(z) \in \Phi(\nu_{\mathcal{T},Q}(z))$.
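The counting step in the proof above bounds the number of words within a given Hamming distance of a fixed word. The sketch below compares the exact size of such a Hamming ball with the standard entropy estimate $2^{M(H(r) + r \log(q-1))}$, valid for a radius rM with r ≤ 1 − 1/q. It assumes that H denotes the binary entropy function in base 2; the function names and the sample parameters are hypothetical, and this estimate is slightly sharper than the exponent quoted in the proof.

```python
import math


def binary_entropy(x):
    """H(x) = -x log2 x - (1 - x) log2 (1 - x), with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def hamming_ball_size(length, radius, alphabet_size):
    """Number of words at Hamming distance at most `radius` from a fixed word."""
    return sum(
        math.comb(length, k) * (alphabet_size - 1) ** k for k in range(radius + 1)
    )


def log2_ball_bound(length, radius, alphabet_size):
    """Entropy estimate for log2 of the ball size, for radius/length <= 1 - 1/q."""
    rate = radius / length
    return length * (binary_entropy(rate) + rate * math.log2(alphabet_size - 1))


M, radius, q = 40, 4, 3  # hypothetical tower height, radius and number of parts
print(math.log2(hamming_ball_size(M, radius, q)), "<=", log2_ball_bound(M, radius, q))
```

For a small radius the exponent is a small multiple of M, which is why the set of candidate names stays manageable.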
We end this section with the following lemma, which should be thought of as a relative version of the claim that if f is a random function from a set of size a to a set of size b then the non-empty fibers of f have size
Lemma 2.12. Let (Z, FZ, µ) be a probability space, and let Ω be a finite set. Assume that φ : Z → Ω, $\Phi : Z \to 2^{\Omega}$ are functions such that |Φ(z)| < a and φ(z) ∈ Φ(z) for every z ∈ Z. The probability that a random function
Proof. Let B be the complement of the set in (2). Then Since the restriction of ψ to Φ(z) \ {φ(z)} is independent of ψ(φ(z)), and hence the integrand is smaller than ε. This means that Eµ(B) < ε, which implies the claim.
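The elementary estimate behind a lemma of this kind can be checked by exhaustive enumeration. The sketch below computes the exact probability that a uniformly random function ψ from a k-element candidate set to a set of size b sends a distinguished candidate to a value shared by another candidate, and compares it with the union bound (k − 1)/b. Since the statement of Lemma 2.12 is not fully reproduced above, this is only an illustration under that assumption, and the function name `collision_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def collision_probability(num_candidates, target_size):
    """Exact probability that psi(first candidate) equals psi(another candidate),
    for psi uniform among all functions from the candidates to range(target_size)."""
    bad = 0
    for values in product(range(target_size), repeat=num_candidates):
        if values[0] in values[1:]:
            bad += 1
    return Fraction(bad, target_size ** num_candidates)


k, b = 3, 4  # hypothetical sizes of the candidate set and of the target set
exact = collision_probability(k, b)
print(exact, "<=", Fraction(k - 1, b))
```

When b is much larger than k, the random function separates the distinguished element from the other candidates with probability close to 1.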

Painting and Re-painting
From this point on, fix an ergodic system (Z, FZ, T ). All partitions are of Z and all sigma-algebras are contained in FZ .
2. An ℓ-admissible sequence of length n on a symbols is a sequence in $\{0, \dots, a-1\}^n$ whose first element is 1, and that does not contain a segment of consecutive 0's of length ℓ. The space of ℓ-admissible sequences of length n on a symbols is denoted by A(n, ℓ, a).
The following lemma is evident:
This lemma will ensure that we can use the symbols 0 and 1 to mark the base of the tower and still have enough names for encoding.
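Both halves of this remark can be checked numerically. The sketch below counts |A(n, ℓ, a)| by dynamic programming over the length of the trailing run of zeros, and then shows how a block start can be recognized as a 1 preceded by ℓ zeros. The layout used in the second part (each admissible word followed by ℓ zeros, read cyclically) is an assumption made for illustration, as are all function names.

```python
import math
from itertools import product


def count_admissible(n, ell, a):
    """|A(n, ell, a)|: words of length n over {0..a-1} starting with 1, no run of ell zeros."""
    if n == 0:
        return 0
    runs = [0] * ell  # runs[r] = number of prefixes ending in exactly r zeros
    runs[0] = 1       # the forced first symbol 1
    for _ in range(n - 1):
        new = [0] * ell
        new[0] = (a - 1) * sum(runs)  # append a non-zero symbol
        for r in range(ell - 1):
            new[r + 1] = runs[r]      # append a zero, keeping the run below ell
        runs = new
    return sum(runs)


def is_admissible(word, ell):
    """Direct check of the definition, used to validate the count."""
    if not word or word[0] != 1:
        return False
    run = 0
    for s in word:
        run = run + 1 if s == 0 else 0
        if run >= ell:
            return False
    return True


def base_positions(stream, ell):
    """Cyclic positions holding a 1 that is preceded by ell zeros."""
    n = len(stream)
    return [
        i for i in range(n)
        if stream[i] == 1 and all(stream[(i - j) % n] == 0 for j in range(1, ell + 1))
    ]


brute = sum(is_admissible(w, 2) for w in product(range(2), repeat=6))
print(count_admissible(6, 2, 2), brute)
print(math.log2(count_admissible(200, 10, 3)) / 200, math.log2(3))

ell = 3
words = [(1, 0, 2, 0), (1, 1, 0, 0)]  # two 3-admissible words on 3 symbols
stream = [s for w in words for s in (*w, *([0] * ell))]
print(base_positions(stream, ell))
```

The exponential growth rate of the count approaches log a as ℓ grows, while a run of ℓ zeros followed by a 1 can only occur at a block boundary, since every admissible word starts with 1 and contains no such run.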

A painting data φ induces a partition Q given by
1. For every tower $\mathcal{T}$, and every $(\mathcal{T}, \ell, a)$-painting data φ that generates a partition Q, the base of $\mathcal{T}$ is measurable with respect to $\bigvee_{i=-\infty}^{\infty} T^i Q$.
2. Suppose that h(T) ≥ log a and let ε > 0. There is ε′ > 0 such that if ℓ is big enough, $\mathcal{T}$ is a tower that covers more than 1 − ε′ of the space, and $|\mathcal{T}|$ is big enough, then the probability that a random $(\mathcal{T}, \ell, a)$-painting data ψ that induces a partition Q satisfies h(T, Q) > log a − ε is greater than 1 − ε.
Proof. The first claim follows from the fact that z ∈ A if and only if $P(z) = 1$ and $P(T^{-1}z) = \dots = P(T^{-k}z) = 0$. As for the second claim, let ε′ < ε be such that $2\epsilon'|\tilde{P}| + H(\epsilon') < \epsilon$, and assume that $\mathcal{T}$ covers more than ε′ of the space. By Theorem 2.4, there is a collection $\mathcal{N}$ of fewer than $2^{(h(T)+\epsilon')|\mathcal{T}|}$ $(\mathcal{T}, P)$-names such that the set of z ∈ A for which $\nu_{\mathcal{T},P}(z) \in \mathcal{N}$ has measure greater than (1 − ε′)µ(A). Applying Lemma 2.12 to the functions $\Phi(z) = \mathcal{N}$ and $\nu_{\mathcal{T},P}(z)$, we get that the probability that

A repainting data induces a partition of Z given by
Lemma 3.7. Let P be an ℓ-admissible partition, let $\mathcal{T}$ be a tower that covers more than 1 − η of the space, and let φ be a $(\mathcal{T}, P, \ell, \epsilon)$-repainting data with associated partition Q. Then |P △ Q| < ε + η.
Lemma 3.8. Let P be an ℓ-admissible partition.
1. For every tower $\mathcal{T} = (A, M)$, if φ is a $(\mathcal{T}, P, \ell, \epsilon)$-repainting data with associated partition Q, then A is measurable with respect to $\bigvee_{i=-\infty}^{\infty} T^i Q$.
2. If h(T, P) > log a − η, and if $\mathcal{T}$ is sufficiently invariant and covers more than 1 − η of the space, then the probability that the partition Q associated with a random repainting data ψ satisfies h(T, Q) > log a − η is greater than 1 − η.
Similarly to the proof of Lemma 3.5, there is a function $\Phi : A \to 2^{\mathcal{N}_P}$ such that 1. Φ(z) depends only on the $(\mathcal{T}^{\mathrm{old}}, P)$-name of $T^{\epsilon|\mathcal{T}|} z$.
By taking the product of Φ and $\mathcal{N}$, there is a function $\Psi : A \to 2^{\mathcal{N}_P}$ such that 1. Ψ(z) depends only on the $(\mathcal{T}^{\mathrm{old}}, Q)$-name of $T^{\epsilon|\mathcal{T}|} z$.
By Lemma 2.12, the probability that log a + η, which proves the claim.

Proof of Theorem 1.2
Let (Z, FZ = FX ∨ FY, µ, T) be as in Theorem 1.2. By Krieger's theorem, there are generating partitions $\tilde{P}_X$ and $\tilde{P}_Y$ for FX and FY respectively. Let $\epsilon_0 > 0$ be such that $h(T) < \log a + \log b - \epsilon_0$.
The first part of the proof of Theorem 1.2 is to obtain a pair of partitions that are close to generating, in the following sense: and has measure greater than (1 − η)µ(A). Choose such a tower $\mathcal{S}$ that is measurable with respect to FY, covers 1 − η of the space, and such that $|\mathcal{S}|$ satisfies
2. Choosing the Partition PY: Choose a random function $f : |\tilde{P}_Y|^{|\mathcal{S}|} \to A(|\mathcal{S}| - \ell, \ell, b)$, and let $P_Y$ be the partition obtained by painting f on $\mathcal{S}$. It is clear that $P_Y$ is FY-measurable. Define a function $\Phi_1$ from $|P_Y|^{|\mathcal{S}|}$ to subsets of $|\tilde{P}_Y|^{|\mathcal{S}|}$ by By Lemma 2.12 and inequalities (5) and (6), most paintings satisfy that for all but 2η of the points z in A, and $\nu_{\mathcal{S},\tilde{P}_X \times \tilde{P}_Y}(z) = \Psi(\nu_{\mathcal{S},\tilde{P}_X \times P_Y}(z))$.
By Lemma 2.7, we get that
3. Choosing the Tower $\mathcal{L}$: By Corollary 2.6 and Equation (9), if $\mathcal{L}$ is an invariant-enough tower with base B, then there is a function $\Phi_2$ from such that for all but an η-portion of the points of B, Also, by making $\mathcal{L}$ more invariant and using Corollary 2.6, we can assume that there is a function $\Phi_3$ from $|\tilde{P}_Y|^{|\mathcal{L}|}$ to subsets of $|\tilde{P}_X|^{|\mathcal{L}|}$ of size $2^{|\mathcal{L}|(h(T|F_Y)+\eta)}$ such that for all but an η-portion of the points z in B, $\nu_{\mathcal{L},\tilde{P}_X}(z) \in \Phi_3(\nu_{\mathcal{L},\tilde{P}_Y}(z))$.
By composing $\Phi_2$ and $\Phi_3$ we get a function $\Phi_4$ from $|P_Y|^{|\mathcal{L}|}$ to subsets of $|\tilde{P}_X|^{|\mathcal{L}|}$ of sizes $2^{|\mathcal{L}|(h(T) - \log b + 2\eta(\log|\tilde{P}_Y| + 3) + H(\eta))}$ such that for all but a 2η-portion of the points z in B, Finally, we require that $\mathcal{L}$ is FX-measurable and $|\mathcal{L}|$ satisfies $\log|A(|\mathcal{L}| - \ell, \ell, a)| > |\mathcal{L}|(\log a - \eta)$.
4. Choosing the Partition PX: Choose a random function $g : |\tilde{P}_X|^{|\mathcal{L}|} \to A(|\mathcal{L}| - \ell, \ell, a)$, and let $P_X$ be the partition obtained by painting g on the tower $\mathcal{L}$. It is clear that $P_X$ is FX-measurable. Define a function $\Phi_5$ from $|P_X \times P_Y|^{|\mathcal{L}|}$ to $|\tilde{P}_X \times P_Y|^{|\mathcal{L}|}$ by
$$\Phi_5(n, m) = \begin{cases} (g^{-1}(n) \cap \Phi_4(m),\, m) & \text{if } |g^{-1}(n) \cap \Phi_4(m)| = 1, \\ \text{undefined} & \text{otherwise.} \end{cases}$$
By Lemma 2.12, there is a function g as above such that for all but 3η of the points in B, $\nu_{\mathcal{L},\tilde{P}_X \times P_Y}(z) = \Phi_5(\nu_{\mathcal{L},P_X \times P_Y}(z))$.
5. Conclusion of the proof: For all but 2η of the points z ∈ Z, z is in the image of both $\mathcal{L}$ and $\mathcal{S}$. Assuming this is the case, by looking at the $P_X$-name of z we can find the smallest non-negative number i such that $T^{-i}z \in B$. Denote $w = T^{-i}z$. Similarly, the $P_Y$-name of z determines the minimal non-negative integer j such that $T^{-j}z \in A$. Denote $u = T^{-j}z$. The $P_X \times P_Y$-name of z determines $\nu_{\mathcal{L},P_X \times P_Y}(u)$, and hence, for 1 − 3η of the points, determines $\nu_{\tilde{P}_X \times P_Y}(u) = \Phi_5(\nu_{\mathcal{L},P_X \times P_Y}(u))$. Let now j be the smallest non-negative number such that $T^{-j}z \in A$. The tuple $\nu_{\tilde{P}_X \times P_Y}(u)$ determines $\nu_{\mathcal{S},\tilde{P}_X \times P_Y}(T^{-j}z)$ and so for 1 − η of the points determines $\nu_{\mathcal{S},\tilde{P}_X \times \tilde{P}_Y}(T^{-j}z) = \Psi(\nu_{\mathcal{S},\tilde{P}_X \times P_Y}(T^{-j}z))$. This clearly determines $\tilde{P}_X(z)$ and $\tilde{P}_Y(z)$.
The second part of the proof of Theorem 1.2 is to improve the pair (PX, PY) and make it "more generating".
Proof. We can assume that ε, δ, and f(ε, δ) are less than 1/2. There is We can assume without loss of generality that 4Nη < δ and that . Choose ℓ such that $\lim_{n \to \infty} \frac{1}{n} \log|A(n, \ell, a)| > \log a - \eta$. By changing QX slightly (and enlarging ℓ if needed), we can also assume that QX is ℓ-admissible.
In order to construct the partition PX, we choose a tower $\mathcal{T} = (A, M)$ which is FX-measurable and covers more than 1 − η of the space, and repaint the first $f(\epsilon, \delta)|\mathcal{T}|$ levels of it using a random function $\psi : |\tilde{P}_X|^{|\mathcal{T}|} \to A(f(\epsilon, \delta)|\mathcal{T}|, \ell, a)$. We will show that if $\mathcal{T}$ is taken to be sufficiently invariant, then with high probability (on ψ), the obtained partition, which we denote by PX, is good. By Lemma 3.7, |PX △ QX| < f(ε, δ) + η.
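Applying Proposition 2.11 to the pairs $(\tilde{P}_X, Q_X \vee Q_Y)$ and $(\tilde{P}_Y, Q_X \vee Q_Y)$, if $\mathcal{T}$ is invariant enough, then there is a function $\Phi_1 : \mathcal{N}_{Q_X \vee Q_Y} \to 2^{\mathcal{N}_{\tilde{P}_X \vee \tilde{P}_Y}}$ such that for any $z \in T^{f(\epsilon,\delta)|\mathcal{T}|} A$, the set $\Phi_1(\nu_{\mathcal{T}^{\mathrm{old}}, Q_X \vee Q_Y}(z))$ has at most $2^{(H(\epsilon) + H(\delta) + \epsilon|\tilde{P}_X| + \delta|\tilde{P}_Y| + \eta)|\mathcal{T}|}$ elements, and the set of $z \in T^{f(\epsilon,\delta)|\mathcal{T}|} A$ for which $\nu_{\mathcal{T}^{\mathrm{old}}, \tilde{P}_X \vee \tilde{P}_Y}(z) \in \Phi_1(\nu_{\mathcal{T}^{\mathrm{old}}, Q_X \vee Q_Y}(z))$ has measure larger than (1 − η)µ(A).
By Corollary 2.6 applied to the pair $(Q_Y, \tilde{P}_Y)$, if $\mathcal{T}$ is invariant enough, then there is a function $\Phi_2 : \mathcal{N}_{Q_Y} \to 2^{\mathcal{N}_{\tilde{P}_Y}}$ such that for any z ∈ A, the set $\Phi_2(\nu_{\mathcal{T}^{\mathrm{new}}, Q_Y}(z))$ has at most $2^{(h(T, F_Y) - \log b + \epsilon_0 + \eta)|\mathcal{T}^{\mathrm{new}}|}$ elements, and the set of z ∈ A such that $\nu_{\mathcal{T}^{\mathrm{new}}, \tilde{P}_Y}(z) \in \Phi_2(\nu_{\mathcal{T}^{\mathrm{new}}, Q_Y}(z))$ has measure greater than (1 − η)µ(A).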
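Applying Corollary 2.6 to the pair $(\tilde{P}_X, \tilde{P}_Y)$, if $\mathcal{T}$ is invariant enough, then there is a function $\Phi_3 : \mathcal{N}_{\tilde{P}_Y} \to 2^{\mathcal{N}_{\tilde{P}_X}}$ such that for every z ∈ A, the set $\Phi_3(\nu_{\mathcal{T}^{\mathrm{new}}, \tilde{P}_Y}(z))$ has at most $2^{(h(T|F_Y) + \eta)|\mathcal{T}^{\mathrm{new}}|}$ elements, and the set of z ∈ A such that $\nu_{\mathcal{T}^{\mathrm{new}}, \tilde{P}_X}(z) \in \Phi_3(\nu_{\mathcal{T}^{\mathrm{new}}, \tilde{P}_Y}(z))$ has measure greater than (1 − η)µ(A).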
Finally, if h(T, QX) > log a, then by Lemma 3.8, if $\mathcal{T}$ is sufficiently invariant, the entropy h(T, PX) is greater than log a with high probability on the random function ψ.
$\tilde{P}_X$ and $\tilde{P}_Y$ that are FX- and FY-measurable, have a and b parts, and generate FX and FY respectively. We assume in the following that this does not hold.
Choose $\xi_0$ such that Proposition 4.4 applies to the pair $(2\xi_0, 2\xi_0)$. Inductively choose a decreasing sequence $\xi_n$ such that both the sums $\sum_n f(\xi_n, \xi_n)$ and $\sum_n f(\xi_{n+1}, 2\xi_n)$ converge, where f is defined in 4.3. If h(T, FX) < log a and h(T, FY) ≥ log b, then by Krieger's theorem there is a generating partition $\tilde{P}_X$ for FX. We define a sequence of partitions $P^n_Y$ as follows: by Proposition 4.2, there is a pair of partitions $(P_X, P^0_Y)$ that is $(\xi_0, \xi_0)$-good. It follows that $(\tilde{P}_X, P^0_Y)$ is also $(0, \xi_0)$-good. Assuming $P^n_Y$ was defined, applying Proposition 4.4, there is a partition $P^{n+1}_Y$ such that $(\tilde{P}_X, P^{n+1}_Y)$ is $(0, \xi_{n+1})$-good and $|P^n_Y \triangle P^{n+1}_Y| < f(0, \xi_n) < f(\xi_n, \xi_n)$. Therefore $\sum_n |P^n_Y \triangle P^{n+1}_Y| < \infty$, so there is a limit partition $P^\infty_Y$. It follows that $(\tilde{P}_X, P^\infty_Y)$ satisfies the requirements of the theorem. The same proof holds if h(T, FY) < log b and h(T, FX) ≥ log a.
Assume finally that h(T, FX) ≥ log a and h(T, FY) ≥ log b. We define a sequence of pairs of partitions $(P^n_X, P^n_Y)$ as follows: by Proposition 4.2, there is a pair of partitions $(P^0_X, P^0_Y)$ that is $(\xi_0, \xi_0)$-very good. Assuming $(P^n_X, P^n_Y)$ have been defined and the pair is $(\xi_n, \xi_n)$-very good, applying Proposition 4.4 there is a partition $P^{n+1}_X$ such that $(P^{n+1}_X, P^n_Y)$ is $(\xi_{n+1}/2, 2\xi_n)$-very good and $|P^n_X \triangle P^{n+1}_X| < f(\xi_n, \xi_n)$. Applying Proposition 4.4 again, there is a partition $P^{n+1}_Y$ such that the pair $(P^{n+1}_X, P^{n+1}_Y)$ is $(\xi_{n+1}, \xi_{n+1})$-very good and $|P^n_Y \triangle P^{n+1}_Y| < f(\xi_{n+1}, 2\xi_n)$. Since by definition of the $\xi_n$'s, the sums $\sum_n |P^n_X \triangle P^{n+1}_X|$ and $\sum_n |P^n_Y \triangle P^{n+1}_Y|$ converge, there are limit partitions $P^\infty_X$ and $P^\infty_Y$. The pair $(P^\infty_X, P^\infty_Y)$ satisfies the requirements of the theorem.
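The existence of the limit partitions used above rests on a standard Borel-Cantelli step, which can be spelled out as follows. This is a sketch under the assumption that |P △ Q| denotes the measure of the set of points that lie in differently indexed parts of P and Q, and that P(z) denotes the index of the part containing z; the sequence $P^n$ stands for either $P^n_X$ or $P^n_Y$.

```latex
\[
  \sum_{n} \bigl| P^{n} \triangle P^{n+1} \bigr|
  \;=\; \sum_{n} \mu\bigl\{ z : P^{n}(z) \neq P^{n+1}(z) \bigr\} \;<\; \infty
  \quad\Longrightarrow\quad
  \mu\bigl\{ z : P^{n}(z) \neq P^{n+1}(z) \text{ for infinitely many } n \bigr\} = 0 .
\]
% Hence, for almost every z, the label P^n(z) is eventually constant, and we may set
\[
  P^{\infty}(z) \;:=\; \lim_{n \to \infty} P^{n}(z),
  \qquad
  \bigl| P^{N} \triangle P^{\infty} \bigr|
  \;\le\; \sum_{n \ge N} \bigl| P^{n} \triangle P^{n+1} \bigr|
  \;\xrightarrow[N \to \infty]{}\; 0 .
\]
```

Since each $P^n$ is measurable with respect to the relevant sub-sigma-algebra and has the prescribed number of parts, the almost everywhere limit $P^\infty$ inherits both properties, up to a null set.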