Kolmogorov-Sinai entropy via separation properties of order-generated sigma-algebras

In a recent paper, K. Keller has given a characterization of the Kolmogorov-Sinai entropy of a discrete-time measure-preserving dynamical system on the basis of an increasing sequence of special partitions. These partitions are constructed from order relations obtained via a given real-valued random vector, which can be interpreted as a collection of observables on the system and is assumed to separate its points. In the present paper we relax the separation condition in order to generalize the given characterization of Kolmogorov-Sinai entropy, providing a statement on equivalence of sigma-algebras. On this basis we show that, when a dynamical system lives on an m-dimensional smooth manifold and the underlying measure is Lebesgue absolute continuous, the set of smooth random vectors of dimension n>m admitting the given characterization of Kolmogorov-Sinai entropy is large in a certain sense.

1. Introduction
1.1. Motivation. The Kolmogorov-Sinai entropy of a µ-preserving map T on a probability space (Ω, F, µ) is an important concept in dynamical systems and ergodic theory. It is defined as the supremum of the entropy rates h_µ(T, A) over all finite partitions A ⊂ F of Ω, which usually makes its determination complicated. In some exceptional cases a generating partition is known, which allows one to determine the Kolmogorov-Sinai entropy on the basis of this partition alone (see, e.g., [18]), but generally one has to take into account an infinite collection of finite partitions.
Here the question arises whether such a collection is given in a natural way. An interesting approach leading to some kind of natural partitioning was given by C. Bandt and B. Pompe [5] (see also [2]), who introduced the concept of permutation entropy. This quantity is based only on the order structure of a system and has been applied to the analysis of long time series, for example, of electroencephalograms and cardiograms. The fact that Kolmogorov-Sinai entropy and permutation entropy coincide for piecewise monotone interval maps, as shown by C. Bandt, G. Keller and B. Pompe [4], gives rise to the question whether both entropies are equivalent for a broader class of dynamical systems.
Remark. J. Amigó, M. Kennel, and L. Kocarev [3,1] have shown equivalence of Kolmogorov-Sinai entropy to a modified concept of permutation entropy which is structurally similar to that of Kolmogorov-Sinai entropy.
Here the idea is to measure the complexity of a system via the 'observables' ξ_1, ξ_2, …, ξ_n, collected into the vector Θ = (ξ_1, …, ξ_n). For given d ∈ N, the set Ω is partitioned into sets of points ω ∈ Ω for which all vectors (ξ_i(ω), ξ_i(T(ω)), …, ξ_i(T^d(ω))), i = 1, 2, …, n, are of the same order type. The larger d is, the more information on the system is given by the partition obtained in this way, which is called P^{Θ,T}_d here. The permutation entropy is defined as the upper limit of the Shannon entropy of P^{Θ,T}_d relative to d as d → ∞.
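The construction can be illustrated on a sampled orbit. The following Python sketch is not taken from the paper: it treats a single observable, replaces the measure µ by relative frequencies along one finite orbit, and assumes that ties are broken by decreasing index. The helper names and the logistic map used as a test system are hypothetical choices made only for illustration.

```python
import math
from collections import Counter


def ordinal_pattern(window):
    """Indices of `window` sorted by decreasing value.

    Ties are broken by decreasing index (an assumed convention).
    """
    return tuple(sorted(range(len(window)),
                        key=lambda i: (window[i], i), reverse=True))


def empirical_permutation_entropy(series, d):
    """Shannon entropy (natural log) of the observed order-d patterns, divided by d.

    This is a finite-sample estimate for one fixed d, not the upper limit
    over d that defines the permutation entropy.
    """
    patterns = Counter(ordinal_pattern(series[t:t + d + 1])
                       for t in range(len(series) - d))
    total = sum(patterns.values())
    shannon = -sum((c / total) * math.log(c / total) for c in patterns.values())
    return shannon / d


def logistic_orbit(x0, length):
    """Orbit of x -> 4x(1-x), used here only as a hypothetical test system."""
    orbit = [x0]
    for _ in range(length - 1):
        orbit.append(4.0 * orbit[-1] * (1.0 - orbit[-1]))
    return orbit
```

For example, `empirical_permutation_entropy(logistic_orbit(0.2, 2000), 3)` gives a positive estimate, while any strictly monotone series produces a single pattern and hence the value zero.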

It has been shown that under certain 'separation' conditions on (T, Θ) it holds

$$h_\mu(T) = \lim_{d\to\infty} h_\mu\bigl(T, P^{\Theta,T}_d\bigr) \qquad (1)$$

and that the permutation entropy with respect to ξ is not less than the Kolmogorov-Sinai entropy. Under validity of (1), the problem of equality of both entropies reduces to a combinatorial problem, namely the equality of the permutation entropy and the right-hand side of (1) (see K. Keller, A. Unakafov and V. Unakafova [11]). Therefore, it is of particular interest to find sufficient conditions for (1) that are as general as possible. This is the central aim of the present paper.

1.2. An outline. The main ingredient for showing (1) is the equivalence of two σ-algebras with respect to µ in the case of ergodic T:

$$\Sigma^{\Theta,T} \overset{\bullet}{=} F, \qquad (2)$$

where $\Sigma^{\Theta,T}$ is the σ-algebra generated by $\bigcup_{d=1}^{\infty} P^{\Theta,T}_d$. To make the structural arguments apparent, consider a third σ-algebra $\sigma(\{\Theta\circ T^k\}_{k\ge 0})$, generated by Θ and its 'shifts' Θ∘T, Θ∘T², …. The central statement of this paper is that for ergodic T every set of $\sigma(\{\Theta\circ T^k\}_{k\ge 0})$ coincides, up to a set of µ-measure zero, with a set of $\Sigma^{\Theta,T}$. Together with the inclusion $\Sigma^{\Theta,T} \subset \sigma(\{\Theta\circ T^k\}_{k\ge 0})$, which can be verified by standard arguments, this provides that in the ergodic case

$$\sigma(\{\Theta\circ T^k\}_{k\ge 0}) \overset{\bullet}{=} F \qquad (5)$$

is equivalent to (2), hence sufficient for (1). The second ingredient for showing (1) is ergodic decomposition.
Condition (5) is substantially weaker than the corresponding statement in [10], allowing generalizations of consequences of the main statement therein. In particular, the application of embedding theory (compare [16] and [17]) is more apparent from the viewpoint of our paper, but it also turns out that the full power of this theory is not needed. In this paper we will show that the set of smooth maps Θ satisfying (1) which are not too far from being injective is large in a certain sense.
1.3. Organization of the paper. In Section 2 we give the basic definitions and formulate the main statements of the paper, which are Theorems 2.5 and 2.13. Section 3 is mainly devoted to the proof of Theorem 2.5. Here, the ideas given in [10] are lifted to a sufficiently abstract level, in order to extract the general structures and to find the necessary assumptions under which (1) is satisfied.
The proof of Theorem 2.13 is given in Section 4. In preparation for the proof, we recall some definitions and statements from (differential) topology, for example the multijets transversality theorem, and deduce some statements that are interesting in their own right.

Kolmogorov-Sinai entropy.
Let Ω be a non-empty set. For a family of subsets $\mathcal{A} = \{A_i\}_{i\in I}$ of Ω, denote by $\sigma(\mathcal{A})$ the σ-algebra generated by $\mathcal{A}$.
If Θ : Ω → X is a map into some topological space X, then we denote by σ(Θ) the σ-algebra on Ω of inverse images of the σ-algebra B(X) of Borel subsets of X under Θ.
If $\mathcal{A} = \{A_i\}_{i\in I}$ and $\mathcal{B} = \{B_j\}_{j\in J}$ are two partitions of Ω, then we define the new partition $\mathcal{A}\vee\mathcal{B}$ of Ω by

$$\mathcal{A}\vee\mathcal{B} = \{A_i\cap B_j \mid i\in I,\ j\in J\}.$$

We write $\mathcal{A}\prec\mathcal{B}$ if each element $A\in\mathcal{A}$ is a finite union of some elements of $\mathcal{B}$.
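Let F be a σ-algebra of subsets of Ω and µ be a measure on F. Denote by Π(F) the set of all finite partitions $\mathcal{A} = \{A_1, \ldots, A_n\}$ of Ω such that $A_i \in F$ for each i = 1, …, n. Then the entropy of $\mathcal{A} \in \Pi(F)$ with respect to µ is defined by the formula

$$H_\mu(\mathcal{A}) = -\sum_{i=1}^{n} \mu(A_i)\log\mu(A_i).$$

Further, let T : Ω → Ω be a measurable map. Denote by $T^{-1}\mathcal{A}$ the partition of Ω consisting of all inverse images of elements of $\mathcal{A}$:

$$T^{-1}\mathcal{A} = \{T^{-1}(A_1), \ldots, T^{-1}(A_n)\}.$$

For each k ≥ 1 define the partition

$$\tau_k(\mathcal{A}) = \mathcal{A}\vee T^{-1}\mathcal{A}\vee\cdots\vee T^{-k}\mathcal{A}.$$

Evidently, $\tau_1(\tau_k(\mathcal{A})) = \tau_{k+1}(\mathcal{A})$.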
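The operations on partitions used here can be tried out on a finite probability space. The Python sketch below is only an illustration under stated assumptions: partitions are lists of frozensets, the measure is a dictionary of point weights, the map T is a dictionary, the entropy uses the natural logarithm with the convention 0 log 0 = 0, and τ_k(A) is read as A ∨ T^{-1}A ∨ … ∨ T^{-k}A, the reading consistent with the identity τ_1(τ_k(A)) = τ_{k+1}(A). All function names are hypothetical.

```python
import math
from itertools import product


def entropy(partition, mu):
    """Entropy -sum mu(A_i) log mu(A_i) of a finite partition (0 log 0 = 0)."""
    h = 0.0
    for block in partition:
        p = sum(mu[w] for w in block)
        if p > 0:
            h -= p * math.log(p)
    return h


def join(a, b):
    """Join of two partitions: all non-empty pairwise intersections."""
    return [x & y for x, y in product(a, b) if x & y]


def preimage_partition(partition, T):
    """Partition formed by the non-empty inverse images of the blocks under T."""
    points = set().union(*partition)
    blocks = (frozenset(w for w in points if T[w] in block) for block in partition)
    return [b for b in blocks if b]


def tau(k, partition, T):
    """Assumed reading: tau_k(A) = A v T^{-1}A v ... v T^{-k}A."""
    result = list(partition)
    current = list(partition)
    for _ in range(k):
        current = preimage_partition(current, T)
        result = join(result, current)
    return result
```

On the cyclic shift of eight equally weighted points with the partition into two halves, `tau(1, tau(1, A, T), T)` and `tau(2, A, T)` contain the same blocks, which matches the identity stated above.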
Though the computation of Kolmogorov-Sinai entropy requires considering all finite partitions of Ω belonging to Π(F), the following lemma shows that this entropy can be obtained from certain increasing sequences of finite partitions.
For a permutation $\pi = (i_0, \ldots, i_d)$ of the set {0, …, d} define the subset $O_\pi$ of $R^{d+1}$ by the following rule: the point $(x_0, \ldots, x_d) \in R^{d+1}$ belongs to $O_{i_0,\ldots,i_d}$ whenever

$$x_{i_0} \ge x_{i_1} \ge \cdots \ge x_{i_d}, \quad\text{and}\quad i_{l-1} > i_l \ \text{ if } \ x_{i_{l-1}} = x_{i_l}.$$

Remark. Notice that each vector $x = (x_0, \ldots, x_d) \in R^{d+1}$ can be regarded as a (d + 1)-tuple of pairs of numbers:

$$(x_0, 0),\ (x_1, 1),\ \ldots,\ (x_d, d). \qquad (6)$$

This set can be uniquely lexicographically ordered in a decreasing manner: first we sort the pairs by the values $x_i$, and then by their indices i. Thus we can associate to x a unique permutation π of the indices {0, …, d} which sorts the above set of pairs (6). Then $O_\pi$ consists of all $x \in R^{d+1}$ that are sorted by the same permutation π.
It is easy to see that the family of sets $\{O_\pi\}$, where π runs over all permutations of {0, …, d}, is a partition of $R^{d+1}$.
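The partition property can be checked by brute force on a small grid of vectors, including vectors with repeated coordinates. The Python sketch below assumes the rule suggested by the remark: the pairs (x_i, i) are sorted lexicographically in decreasing order, so that equal values are ordered by decreasing index. The function names are hypothetical.

```python
from itertools import permutations, product


def sorting_permutation(x):
    """Permutation obtained by sorting the pairs (x_i, i) in decreasing order."""
    pairs = sorted(((value, index) for index, value in enumerate(x)), reverse=True)
    return tuple(index for _, index in pairs)


def belongs_to(x, pi):
    """Membership of x in O_pi under the assumed rule.

    Values must be non-increasing along pi, and on ties the index must decrease.
    """
    for prev, nxt in zip(pi, pi[1:]):
        if x[prev] < x[nxt]:
            return False
        if x[prev] == x[nxt] and prev < nxt:
            return False
    return True


def containing_sets(x):
    """All permutations pi of the index set with x in O_pi."""
    return [pi for pi in permutations(range(len(x))) if belongs_to(x, pi)]
```

Under this convention every vector lies in exactly one set O_π, and that π is the one returned by `sorting_permutation`.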

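2.3. Ordinal partition of Ω. Now let Ω be a set, and let T : Ω → Ω and ξ : Ω → R be maps. Then for each d ∈ N we can define the map Ω → $R^{d+1}$ sending ω to $(\xi(\omega), \xi(T(\omega)), \ldots, \xi(T^d(\omega)))$. The inverse images $P^{\xi,T}_\pi$ of the sets $O_\pi$ under this map form the partition $P^{\xi,T}_d$ of Ω.
Remark. Notice that each set $P^{\xi,T}_\pi$, $\pi = (i_0, \ldots, i_d)$, consists of all ω ∈ Ω such that

$$\xi(T^{i_0}(\omega)) \ge \xi(T^{i_1}(\omega)) \ge \cdots \ge \xi(T^{i_d}(\omega)), \quad\text{and}\quad i_{l-1} > i_l \ \text{ if } \ \xi(T^{i_{l-1}}(\omega)) = \xi(T^{i_l}(\omega)).$$

Remark. The partition $P^{\xi,T}_d$ can also be described in the following way. For each pair (i, j) such that 0 ≤ i < j ≤ d define the partition of Ω into two sets according to the order relation between $\xi(T^i(\omega))$ and $\xi(T^j(\omega))$. Then it is easy to see that $P^{\xi,T}_d$ is the join of these partitions over all such pairs (i, j).
The σ-algebra $\Sigma^{\xi,T}$ generated by $\bigcup_{d=1}^{\infty} P^{\xi,T}_d$ is called the ordinal σ-algebra of Ω for (ξ, T).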
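The description by pairwise comparisons can be tested numerically. In the Python sketch below a vector stands for the values (ξ(ω), ξ(T(ω)), …, ξ(T^d(ω))) of one point ω. It is assumed that ties are resolved by decreasing index, so that for i < j the two-set partition separates the vectors with x_i ≤ x_j from those with x_i > x_j. The names are hypothetical.

```python
from itertools import combinations, product


def pattern(values):
    """Ordinal pattern: indices sorted by decreasing value, ties by decreasing index."""
    return tuple(sorted(range(len(values)),
                        key=lambda i: (values[i], i), reverse=True))


def comparison_signature(values):
    """Outcome of the comparison x_i <= x_j for every pair i < j."""
    return tuple(values[i] <= values[j]
                 for i, j in combinations(range(len(values)), 2))


def induce_same_partition(vectors):
    """True if grouping by pattern and grouping by signature give the same classes."""
    by_pattern, by_signature = {}, {}
    for v in vectors:
        by_pattern.setdefault(pattern(v), set()).add(comparison_signature(v))
        by_signature.setdefault(comparison_signature(v), set()).add(pattern(v))
    return (all(len(s) == 1 for s in by_pattern.values())
            and all(len(s) == 1 for s in by_signature.values()))
```

On all vectors of length four with entries in {0, 1, 2} the two groupings coincide, which is the finite analogue of writing the ordinal partition as a join of two-set partitions.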
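More generally, let $\Theta = (\xi_1, \ldots, \xi_n) : \Omega \to R^n$ be a map. Then we define the partition

$$P^{\Theta,T}_d = \bigvee_{i=1}^{n} P^{\xi_i,T}_d$$

and the σ-algebra

$$\Sigma^{\Theta,T} = \sigma\Bigl(\bigcup_{d=1}^{\infty} P^{\Theta,T}_d\Bigr).$$

The following lemma easily follows from (7) and (8), and we leave it to the reader.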
2.4. Main results. The following theorem gives sufficient conditions for the validity of (1).
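Theorem 2.5. Let (Ω, F, µ) be a probability space, T : Ω → Ω be a measurable µ-invariant transformation, and Θ : Ω → $R^n$ be a measurable map such that $\sigma(\{\Theta\circ T^k\}_{k\ge 0}) \overset{\bullet}{=} F$. Suppose also that one of the following conditions holds true: either (a) T is ergodic, or (b) T is not ergodic, however Ω can be embedded into some compact metrizable space so that F = B(Ω). Then

$$h_\mu(T) = \lim_{d\to\infty} h_\mu\bigl(T, P^{\Theta,T}_d\bigr).$$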
We will now recall the notions of residuality and prevalence, which are, respectively, a topological and a measure-theoretic formalization of the expression "almost every".
A subset A of a topological space is residual if A is an intersection of countably many sets with dense interiors. A Baire space is a topological space in which every residual subset is dense. Every complete metric space is Baire.
Suppose now that V is a topological vector space, i.e. that it has a topology in which addition of vectors and multiplication by scalars are continuous operations.
Definition 2.7. Let µ be a nonnegative measure on the σ-algebra B(V), and S ⊂ V be a Borel subset. Then µ is said to be transverse to S if the following two conditions hold: (i) There exists a compact subset U ⊂ V such that 0 < µ(U) < ∞. (ii) µ(S + v) = 0 for every v ∈ V.
The following lemma summarizes some properties of prevalent sets obtained in [9].
In general, the classes of residual and prevalent subsets of a complete metric space V are distinct, and neither of them contains the other.
Let Ω and X be smooth manifolds and r = 0, 1, …, ∞. Then the space $C^r(\Omega, X)$ admits two natural topologies: the weak one, $C^r_W$, and the strong one, $C^r_S$. The following lemma collects some information about these topologies, see e.g. [8, Chapter 2] and [6, Chapter II, §3].
Lemma 2.10. 1) If Ω is compact, then these topologies coincide.
2) $C^r(\Omega, X)$ is a Baire space with respect to each of the topologies $C^r_W$ and $C^r_S$.
3) $C^r(\Omega, X)$ admits a complete metric with respect to the weak topology $C^r_W$.
4) Suppose X = $R^n$, so the space $C^r(\Omega, R^n)$ has a natural structure of a linear space. Then $C^r(\Omega, R^n)$ is a topological vector space with respect to the weak topology $C^r_W$. However, if Ω is non-compact, then $C^r(\Omega, R^n)$ is not a topological vector space with respect to the strong topology $C^r_S$, since the multiplication by scalars is not continuous.

Again let Ω be a smooth manifold of dimension m.

Remark. We can reformulate Definition 2.12 as follows. Let λ be the Lebesgue measure on $R^m$ and (U, ϕ) be a local chart on Ω. Since ϕ is an embedding, we can define the induced measure ϕ*(λ) on B(U), obtained by transporting λ from ϕ(U) to U via ϕ. Then µ is Lebesgue absolute continuous if for any local chart (U, ϕ) the restriction of µ to B(U) is absolutely continuous with respect to ϕ*(λ).
Our second result shows that the set of maps Θ for which (1) holds is "large".
Theorem 2.13. Let Ω be a smooth manifold of dimension m, µ be a measure on B(Ω), T : Ω → Ω be a measurable µ-invariant transformation. Suppose µ is Lebesgue absolute continuous in the sense of Definition 2.12. Let V be the set of all maps Θ ∈ $C^\infty(\Omega, R^n)$ for which (1) holds. If n > m, then V is residual with respect to the strong topology $C^\infty_S$ and prevalent with respect to the weak topology $C^\infty_W$.

The latter justifies that F can also be considered as a function from [−∞, +∞] into [0, 1]. For further considerations it will be convenient to keep in mind the following commutative diagram: For each a ∈ R define the following two elements of [−∞, ∞]:

$$a_* = \inf\bigl(F^{-1}F(a)\bigr), \qquad a^* = \sup\bigl(F^{-1}F(a)\bigr).$$
Lemma 3.1. Let a ∈ R. Then the following statements hold true.
We will now prove that $[a_*, a^*) \subset F^{-1}F(a)$. By definition of infimum and supremum of the set $F^{-1}F(a)$, there exist sequences $\{x_i\}$ and $\{y_i\}$ in $F^{-1}F(a)$ such that $x_i$ decreases to $a_*$ and $y_i$ increases to $a^*$. Since F is nondecreasing and $F(x_i) = F(y_i) = F(a)$ for all i, it follows that F is constant on each segment $[x_i, y_i]$, and so $(a_*, a^*) \subset \bigcup_i [x_i, y_i] \subset F^{-1}F(a)$. Moreover, from right-continuity of F we obtain that $F(a_*) = \lim_{i\to\infty} F(x_i) = F(a)$.
(2) The inclusion $C_{a_*} \subset F^{-1}F(C_{a_*})$ is evident. Suppose that there exists some $t \in F^{-1}F(C_{a_*}) \setminus C_{a_*}$. This means that (i) $t \ge a_*$, and (ii) $F(t) \in F(C_{a_*})$, i.e. F(t) = F(s) for some $s < a_*$. Thus $s < a_* \le t$. Since F is non-decreasing, $F(s) \le F(a_*) \le F(t) = F(s)$, that is $s \in F^{-1}F(a)$, and therefore $a_* \le s$, contradicting the assumption. Thus $F^{-1}F(C_{a_*}) = C_{a_*}$.
(3) Since $a_* < a$, $F(a_*) = F(a)$, and F is non-decreasing, it follows that It follows that Z either equals $[a, a^*]$ or $[a, a^*)$. Suppose Z = $[a, a^*]$, then Lemma is completed.
Proof. It is easy to see that σ(F Then σ(ξ) is generated by the sets P a , so it suffices to prove that for each a ∈ R there exists some Q a ∈ σ(F • ξ) such that µ(Q a △ P a ) = 0. In fact we will put Since F is non-decreasing F (C a * ) is a Borel subset of [0, 1], hence Q a * ∈ σ(F • ξ). So it remains to show that µ(Q a △ P a ) = 0 for each a ∈ R.
The Lemma is proved.
3.2. Ergodic properties. Let T : Ω → Ω be a measurable map. For d ∈ N define the function $I_d$ : Ω → R as follows: $I_d(\omega)$ is the number of points among the first d − 1 points of the T-orbit of ω at which ξ takes values not greater than ξ(ω).
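The counting function can be computed directly for a concrete system. The Python sketch below is an illustration only and is not part of the paper: it uses the rotation of the unit interval by the golden-ratio angle with Lebesgue measure and the observable ξ(x) = x, and it assumes that the "first d − 1 points" are T(ω), …, T^{d−1}(ω). For this ergodic example Birkhoff's Ergodic Theorem suggests that the frequency I_d(ω)/d approaches µ{ξ ≤ ξ(ω)}, which here equals ξ(ω) itself. All names are hypothetical.

```python
import math

ALPHA = (math.sqrt(5.0) - 1.0) / 2.0  # irrational rotation angle (illustrative choice)


def rotation_orbit_point(omega, i):
    """i-th iterate of omega under x -> x + ALPHA (mod 1), computed in closed form."""
    return (omega + i * ALPHA) % 1.0


def count_not_greater(xi, orbit_point, omega, d):
    """Number of i in 1..d-1 with xi(T^i omega) <= xi(omega) (assumed indexing)."""
    reference = xi(omega)
    return sum(1 for i in range(1, d) if xi(orbit_point(omega, i)) <= reference)


def frequency(omega, d):
    """Relative frequency I_d(omega) / d for the rotation and xi = identity."""
    return count_not_greater(lambda x: x, rotation_orbit_point, omega, d) / d
```

With ω = 0.3 and d = 20000 the frequency is close to 0.3, in line with the expected limit for this example.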

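Lemma 3.3. If T is ergodic and µ-preserving, then for µ-almost all ω ∈ Ω

$$\lim_{d\to\infty} \frac{I_d(\omega)}{d} = F(\xi(\omega)).$$

Proof. For each a ∈ R consider the set $K_a = \{\omega \in \Omega \mid \xi(\omega) \le a\}$. Then by definition $\mu(K_a) = F(a)$. Moreover, as T is ergodic, it follows from Birkhoff's Ergodic Theorem that there exists a subset $\Omega_a \subset \Omega$ such that $\mu(\Omega_a) = 1$, and for each $\omega \in \Omega_a$

$$\mu(K_a) = \lim_{d\to\infty} \frac{1}{d}\,\#\{0 \le i < d \mid T^i(\omega) \in K_a\}.$$

Take any countable dense subset S ⊂ R containing all points of discontinuity of F and let $\tilde\Omega = \bigcap_{a\in S} \Omega_a$.
Let $\tilde\omega \in \tilde\Omega$ be such that $a = \xi(\tilde\omega) \in R \setminus S$. Then F is continuous at a.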
We will show that Then by Corollary 3.4 we get the inclusions which imply (13) with Θ = ξ : Ω → R.

3.3. Proof of Theorem 2.5. Let (Ω, F, µ) be a probability space, T : Ω → Ω be a measurable µ-invariant transformation, and Θ : Ω → $R^n$ be a measurable map such that $\sigma(\{\Theta\circ T^k\}_{k\ge 0}) \overset{\bullet}{=} F$. We have to prove that

$$h_\mu(T) = \lim_{d\to\infty} h_\mu\bigl(T, P^{\Theta,T}_d\bigr) \qquad (17)$$

if either (a) T is ergodic, or (b) T is not ergodic, however Ω can be embedded into some compact metrizable space so that F = B(Ω). In the case (a) it follows from Corollary 3.5 and the assumptions above that $\Sigma^{\Theta,T} \overset{\bullet}{=} F$, which by Lemma 2.2 implies (17).
In the case (b) the equality (17) follows from the Ergodic decomposition theorem by the arguments of the proof of [10, Theorem 2.1].

Residuality and prevalence
4.1. The set of non-injectivity. Let Θ : Ω → X be a continuous map between topological spaces Ω and X. Suppose also that µ is a measure on the σ-algebra B(Ω) of Borel sets of Ω. In this section we give sufficient conditions on Θ for the equivalence $\sigma(\Theta) \overset{\bullet}{=} B(\Omega)$ and also prove Theorem 2.13.
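The subset

$$N_\Theta = \{\omega \in \Omega \mid \Theta(\omega) = \Theta(\omega') \ \text{for some}\ \omega' \in \Omega,\ \omega' \ne \omega\}$$

of Ω is called the set of non-injectivity of Θ. It plays a principal role in the further considerations.
In particular, for s = 2 consider the following subset of Ω²:

$$M = (\Theta^2)^{-1}(\Delta X^2) \setminus \Delta\Omega^2. \qquad (19)$$

Let also p : Ω² → Ω be the projection to the first coordinate. Then it is evident that

$$N_\Theta = p(M). \qquad (20)$$

The following lemma describes some properties of the set of non-injectivity.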
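For a map on a finite set the set of non-injectivity can be computed by enumeration. The Python sketch below assumes that N_Θ consists of the points sharing their image with some other point, and that M is the set of pairs of distinct points with equal images, so that N_Θ is the projection of M to the first coordinate. The function names and the example map are hypothetical.

```python
def off_diagonal_coincidences(points, theta):
    """Pairs (w, v) of distinct points with theta(w) == theta(v)."""
    return {(w, v) for w in points for v in points
            if w != v and theta(w) == theta(v)}


def non_injectivity_set(points, theta):
    """Projection of the coincidence pairs to the first coordinate."""
    return {w for (w, _) in off_diagonal_coincidences(points, theta)}


def is_injective_on(points, theta):
    """True if theta takes pairwise distinct values on the given points."""
    images = [theta(w) for w in points]
    return len(images) == len(set(images))
```

For the squaring map on the integers from −3 to 3 the set of non-injectivity is every point except 0, and the restriction of the map to the complement of this set is injective.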
3) Suppose Ω and X are Hausdorff, Ω is also second countable and locally compact (e.g. a manifold). Then N Θ is an F σ subset of Ω, and in particular N Θ ∈ B(Ω).
Proof. Statements 1) and 2) are evident. Let us prove 3). It is easy to see that a topological space X is Hausdorff iff the diagonal ∆X 2 is closed in X 2 . This implies that (Θ 2 ) −1 (∆X 2 ) is closed in Ω 2 , hence M defined by (19) is second countable and locally compact as well.
But each $p(M_i)$ is compact and so closed in Ω. Hence $N_\Theta$ is an $F_\sigma$-set.
Recall that a Polish space is a second countable completely metrizable topological space.
Proof. Since σ(Θ) ⊂ B(Ω), it remains to consider the inverse inclusion. It suffices to show that for any open set G ∈ B(Ω) there exists some $\tilde G \in \sigma(\Theta)$ such that $\mu(G \triangle \tilde G) = 0$. Given G, put $\tilde G = G \setminus N_\Theta$. Then by 2) of Lemma 4.1 the restriction $\Theta|_{\tilde G} : \tilde G \to \Theta(\tilde G)$ is one-to-one. So $\Theta(\tilde G)$ is a one-to-one image of the Polish space $\tilde G$ under the continuous map $\Theta|_{\tilde G} : \tilde G \to X$. This implies, [14, Theorem 15.1], that $\Theta(\tilde G) \in B(X)$, whence $\tilde G = \Theta^{-1}(\Theta(\tilde G)) \in \sigma(\Theta)$.

4.2. Multijets transversality theorem. The proof of Theorem 2.13 is based on the so-called multijets transversality theorem, see [6, Chapter 2, Theorem 4.13]. We will formulate it below preserving the notation from [6].
Let Ω and X be smooth manifolds, dim Ω = m, dim X = n, $J^k(\Omega, X)$ be the manifold of k-jets of smooth maps Θ : Ω → X, $\alpha : J^k(\Omega, X) \to \Omega$ be the natural projection to the source, and $\alpha^s = \alpha \times \cdots \times \alpha : J^k(\Omega, X)^s \to \Omega^s$.

We will apply this theorem to the case k = 0 and s = 2. First we will show that description (20) of $N_\Theta$ is related to the multijets transversality theorem. Recall that $J^0(\Omega, X) = \Omega \times X$. Let $\beta : J^0_2(\Omega, X) \to X^2$ be the projection to the destination. Evidently, β is a submersion. Therefore it is transversal to $\Delta X^2$, and so $B = \beta^{-1}(\Delta X^2)$ is a submanifold in $J^0_2(\Omega, X)$ of codimension codim $\Delta X^2$ = dim $X^2$ − dim $\Delta X^2$ = dim X = n.
Also notice that B is non-compact. Consider the map as in (19) and (20).

Proof. Notice that B has codimension n in $J^0_2(\Omega, X)$. Therefore the submanifold $M = (j^0_2\Theta)^{-1}(B)$ has the same codimension n in $\Omega^{(2)}$. Since m < n, we obtain that

$$\dim M = 2m - n < m = \dim \Omega.$$

Consider the restriction $p|_M : M \to \Omega$. As dim M < dim Ω, each point of M is critical for $p|_M$, whence by Sard's theorem, [6], the image of the set of critical points of $p|_M$, i.e. the set $p(M) = N_\Theta$, has measure zero in the sense of Definition 2.11.

Corollary 4.5. Suppose in Corollary 4.4 X = $R^n$ for some n, so $C^\infty(\Omega, R^n)$ is a linear space. Then the set $V_B$ has a probe. In particular, by Lemmas 2.9 and 2.10 it is prevalent with respect to any of the weak topologies $C^r_W$ on $C^\infty(\Omega, R^n)$.
Proof. First we introduce some notation and prove Lemma 4.6 below. Let M(n, k) be the space of (n × k)-matrices (n rows and k columns) which can be identified with R nk , and D r (n, k) be the subset of M(n, k) consisting of matrices of rank r. Then D r (n, k) is a smooth submanifold of codimension (n − r)(k − r), see e.g. [ is a smooth diffeomorphism. Now we can construct the probe for V B . Let G : Ω → M(n, k) be a map satisfying statement of Lemma 4.6. For each v ∈ R k define the following smooth map . Then P is a linear subspace of C ∞ (Ω, R n ) of dimension ≤ k. We claim that P is a probe for V B .
Indeed, let Θ ∈ $C^\infty(\Omega, R^n)$ be any map. For each v ∈ $R^k$ denote so the translation of P by Θ is the following affine subspace of $C^\infty(\Omega, R^n)$: We should prove that the following set: . Then the Jacobi matrix of Ψ at the point (ω, ω′, v) has the form shown in Figure 4.2, and so its rank is maximal and equals 2(m + n) due to the choice of G. Hence Ψ is a submersion. Therefore it is transversal to B, and so $M = \Psi^{-1}(B)$ is a submanifold in $\Omega^{(2)} \times R^k$. Let π : M → $R^k$ be the restriction to M of the natural projection $\Omega^{(2)} \times R^k \to R^k$. Then it is easy to see that Q coincides with the set of critical values of π. Since π is smooth, we get from Sard's theorem that Q has Lebesgue measure zero, see e.g. [7, Chapter 2, §3]. The proof of Corollary 4.5 is completed.

Lemma 4.6. If k ≥ 2(m + n), then there exists a smooth map G : Ω → M(n, k) such that for any pair of distinct points ω ≠ ω′ ∈ Ω the matrix Φ(G(ω), G(ω′)) has rank 2n.
Proof of Theorem 2.13. Let Ω be a smooth manifold of dimension m, µ be a Lebesgue absolute continuous measure on B(Ω), T : Ω → Ω be a measurable µ-invariant transformation, and n > m. Let V = $V_B$ be defined by (22). Then by Corollaries 4.4 and 4.5 V is residual with respect to the strong topology $C^\infty_S$ and prevalent with respect to the weak topology $C^\infty_W$. We claim that (9) holds for each Θ ∈ V. Indeed, by 3) of Lemma 4.1 $N_\Theta$ is a Borel subset of Ω. Also by Corollary 4.4 it has measure zero in the sense of Definition 2.11. Since µ is Lebesgue absolute continuous, we see that $\mu(N_\Theta) = 0$, whence by Theorem 4.2 $\sigma(\Theta) \overset{\bullet}{=} B(\Omega)$. Furthermore, as Ω is an m-dimensional manifold, it can be embedded into the (2m + 1)-dimensional cube, which is a compact metric space. Therefore by (b) of Theorem 2.5 we have that (9) holds. This completes the proof of Theorem 2.13.