Topological entropy and distributional chaos in hereditary shifts with applications to spacing shifts and beta shifts

Positive topological entropy and distributional chaos are characterized for hereditary shifts. A hereditary shift has positive topological entropy if and only if it is DC2-chaotic (or equivalently, DC3-chaotic) if and only if it is not uniquely ergodic. A hereditary shift is DC1-chaotic if and only if it is not proximal (has more than one minimal set). As every spacing shift and every beta shift is hereditary, the results apply to those classes of shifts. Two open problems on topological entropy and distributional chaos of spacing shifts from an article of Banks et al. are solved thanks to this characterization. Moreover, it is shown that a spacing shift $\Omega_P$ has positive topological entropy if and only if $\mathbb{N}\setminus P$ is not a set of Poincaré recurrence. Using a result of Kříž, an example of a proximal spacing shift with positive entropy is constructed. Connections between spacing shifts and difference sets are revealed, and the methods of this paper are used to obtain new proofs of some results on difference sets.


Introduction
A hereditary shift is a (one-sided) subshift X such that x ∈ X and y ≤ x (coordinate-wise) imply y ∈ X. As far as we know, hereditary shifts were introduced by Kerr and Li in [15, p. 882]. We are not aware of any further research on hereditary shifts. The notion of a hereditary shift generalizes at least two classes of subshifts whose importance has been established in the literature: spacing shifts and beta shifts.
Given β > 1 the (one-sided) beta shift Ω β is a subset of Ω ⌈β⌉ = {0, 1, . . . , ⌊β⌋} N defined as the closure (with respect to the product topology) of the set of sequences arising as a β-expansion of numbers from [0, 1]. Beta shifts were first considered by Rényi [28] and are a family of symbolic spaces with an extremely rich structure and a profound connection to number theory, tilings, and the dynamics of systems with discontinuities.
By a spacing shift Ω P , where the parameter P is a subset of the positive integers N, we mean the set of all infinite binary sequences for which the occurrences of 1's have distances lying in P. In other words, Ω P contains only those sequences ω = (ω i ) such that ω i = ω j = 1 and i ≠ j imply |i − j| ∈ P. Spacing shifts were introduced by Lau and Zame in [18] (see also [22, pp. 241-2]), for whom they served as counterexamples. It seems that spacing shifts were hardly explored afterwards, except in [4,14,17], where again they were used to construct counterexamples. Recently, a more thorough study of spacing shifts was conducted in [5]. It was revealed that spacing shifts exhibit a wide variety of interesting dynamics worth exploring further.
Our work extends and completes the line of investigation of [5] to a broader class of hereditary shifts, which also contains all beta shifts. In particular, we solve two open problems (Questions 4 and 5 of [5]), regarding topological entropy and distributional chaos in the more general context of hereditary shifts.
In order to classify hereditary shifts, notice first that the fixed point 0 ∞ belongs to any hereditary shift, hence the atomic measure µ 0 carried by this fixed point is an invariant measure of the system. Therefore one can divide all hereditary shifts into two major classes: I. those with a unique invariant measure µ 0 (uniquely ergodic ones), and II. those which have another invariant measure. Our main result (contained in Theorems 12, 13, and 23) states that for hereditary shifts the above classification coincides with at least three other natural classifications: zero versus positive topological entropy; lack of any DC2-scrambled, or even DC3-scrambled, pair versus presence of an uncountable set of distributionally scrambled pairs, that is, distributional chaos DC2 (or equivalently, DC3-chaos); and zero Banach density of occurrences of the symbol 1 in all points of X versus existence of a point in X with 1's appearing with positive upper Banach density.
Recall here, that distributional chaos was introduced in the setting of maps of the interval, as an equivalent condition for positive topological entropy (see [29]). Although this equivalence does not hold in general, distributionally chaotic dynamics is a source of interesting research problems (see [8,24,25,27]).
Another classification of hereditary shifts is the following: A. those with a unique minimal set, consisting of the single fixed point 0 ∞ (proximal ones), and B. those which have another minimal set. Notice that any shift in class (I) must be in class (A), as any minimal set carries at least one invariant measure. In other words, the class (IB) is empty. In Theorem 24 we characterize hereditary shifts exhibiting distributional chaos of type 1 (DC1-chaos) as non-proximal shifts, that is, those in class (B). It is known that every beta shift is in class (IIB). It follows that every beta shift is DC1-chaotic. Next, we use our characterization of hereditary shifts with positive entropy as those presenting distributional chaos of type 2 (DC2-chaotic ones), and the example constructed by Kříž [16] (and refined in [23] according to the idea of Ruzsa), to show in Theorem 25 the existence of a topologically weakly mixing spacing shift with unique minimal set 0 ∞ but not a unique invariant measure, hence proving that there exists a DC2-chaotic spacing shift which is not DC1-chaotic (that is, there exists a hereditary shift of class (IIA)). This answers [5, Question 4]. Finally, Theorem 27 proves that the class (IA) is also non-empty and that there are non-spacing and non-beta hereditary shifts.
Further, we prove in Theorem 16 that the entropy of a spacing shift Ω P is positive if and only if N \ P is not a set of recurrence, or, equivalently, P intersects nontrivially any set of recurrence. Here, following Furstenberg (see [11, p. 219]), we say that R ⊂ N is a set of recurrence if for every measure preserving system (X, X, µ, T ) and any set A ∈ X with µ(A) > 0 there is an r ∈ R such that µ(T −r (A) ∩ A) > 0. The latter result links the topological entropy of spacing shifts with the sets of return times appearing in a generalization of the Poincaré recurrence theorem. At first sight this connection is quite unexpected, since it ties the measure-theoretic notion of Poincaré recurrence to the topological entropy of a subshift, which in turn may be expressed in purely combinatorial terms. Unfortunately, the problem of intrinsic characterization of sets of recurrence is notoriously elusive, and our result turns out to be only its restatement. We still believe, however, that our approach opens the possibility of exploring sets of recurrence from a new perspective.
Finally, we would like to point out a connection of spacing shifts with combinatorial number theory. It is possible to apply the results on spacing shifts to explore difference sets, that is, sets of the form $A - A = \{a - b : a, b \in A,\ a > b\}$. Identifying, as above, infinite binary sequences with characteristic functions of subsets of N, one observes that for any P the spacing shift Ω P contains the sequences representing those sets A ⊂ N for which A − A ⊂ P. Therefore it is natural to ask how the properties of a difference set P = A − A are related to the spacing shift Ω P . In this direction our work provides a topological version of the Furstenberg ergodic proof that for any set A with positive upper Banach density the set A − A contains the difference set of some set D with positive asymptotic density (see the proof of Theorem 8 below and [10, Corollary to thm. 3.20]).

Acknowledgements.
Results contained in the present paper were presented by the author at the Visegrad Conference on Dynamical Systems, held in Banská Bystrica between 27 June and 3 July 2011, and at the 26th Summer Conference on Topology and Its Applications hosted on July 26-29, 2011 by The City College of CUNY. Note that [5, Question 5] was also independently solved by Dawoud Ahmadi Dastjerdi and Maliheh Dabbaghian Amiri in [1]. The authors of [1] also proved that for a spacing shift zero entropy implies proximality. This is also a corollary of the more general Theorem 13 presented below. The author is greatly indebted to Professor Mike Boyle, Jian Li, and Piotr Oprocha for several helpful comments concerning the subject of this paper. The anonymous referee of the previous version of this paper provided a superb report with many useful suggestions, which are included in the present form. The research leading to this paper was supported by the grant IP2011 028771.

Basic notions and conventions
A dynamical system is a pair (X, f ), where X is a compact metric space, and f : X → X is a continuous map. We usually denote the metric on X by d. By an invariant set we mean any set K ⊂ X such that f (K) ⊂ K. Any nonempty, closed and invariant set K is identified with the subsystem (K, f | K ) of (X, f ). A dynamical system is minimal if it has no proper subsystems. A point x ∈ X is called minimal if it belongs to some minimal subsystem. A pair (x, y) ∈ X × X is a proximal pair if lim inf n→∞ d( f n (x), f n (y)) = 0.
We say that a dynamical system (X, f ) is proximal if every pair in X × X is a proximal pair.
By a Lebesgue space we mean a triple (X, X, µ), where X is a Polish space, X is the σ-algebra of Borel sets on X, and µ is a probability measure on X. We ignore null sets, and accordingly we will assume that all probability spaces are complete. A measure preserving system is a quadruple (X, X, µ, T ), where (X, X, µ) is a Lebesgue space, and T : X → X is a measurable map preserving µ, that is, T −1 (B) ∈ X and µ(T −1 (B)) = µ(B) for every B ∈ X. If (X, f ) is a dynamical system, then there always exists an invariant measure, that is, a complete Borel probability measure µ, such that (X, X, µ, f ) is a measure preserving system. An invariant measure for (X, f ) is ergodic if the only members B of X with f −1 (B) = B satisfy µ(B) = 0 or µ(B) = 1. A dynamical system (X, f ) is uniquely ergodic if it has exactly one invariant measure.
Given an infinite set of positive integers S we enumerate S as an increasing sequence s 1 < s 2 < . . . and define the sum set FS(S ) of S by $FS(S) = \{\sum_{i \in F} s_i : F \subset \mathbb{N} \text{ is finite and nonempty}\}$. We say that a set A ⊂ N is
1. thick, if it contains arbitrarily long blocks of consecutive integers, that is, for every n > 0 there is k ∈ N such that {k, k + 1, . . . , k + n − 1} ⊂ A,
2. syndetic, if it has bounded gaps, that is, for some n > 0 and every k ∈ N we have {k, k + 1, . . . , k + n − 1} ∩ A ≠ ∅,
3. an IP-set, if it contains the sum set FS(S ) of some infinite set S ⊂ N,
4. a ∆-set, if it contains the difference set A − A of some infinite set A ⊂ N,
5. piecewise syndetic, if it is an intersection of a thick set with a syndetic set,
6. a ∆ * -set (IP * -set), if it has non-empty intersection with every ∆-set (IP-set, respectively).
Note that for some authors IP-sets are exactly the finite sum sets as defined above (see, e.g., Furstenberg's book [10]).
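For finite sets the sum set and the difference set can be computed directly, which is handy for experimenting with IP-sets and ∆-sets. The Python sketch below is only an illustration: the names `finite_sums` and `difference_set` are hypothetical, and keeping only positive differences is an assumption made so that the result is a subset of N.

```python
from itertools import combinations


def finite_sums(s):
    """Return FS(s): all sums over nonempty subsets of the finite set s."""
    elems = sorted(set(s))
    sums = set()
    for size in range(1, len(elems) + 1):
        for combo in combinations(elems, size):
            sums.add(sum(combo))
    return sums


def difference_set(a):
    """Return the positive differences {x - y : x, y in a, x > y} (assumed convention)."""
    return {x - y for x in a for y in a if x > y}


if __name__ == "__main__":
    print(sorted(finite_sums({1, 2, 4})))
    print(sorted(difference_set({1, 2, 4})))
```

For S = {1, 2, 4} every integer from 1 to 7 is a finite sum, while the differences of the same set are 1, 2 and 3.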
By the upper density of a set A ⊂ N we mean the number $\bar{d}(A) = \limsup_{n\to\infty} \#(A \cap \{1, \ldots, n\})/n$. If the limes superior above is actually a limit, then we write d(A) instead of $\bar{d}(A)$, and call it the asymptotic density of A. The upper Banach density of a set A ⊂ N is the number $BD^*(A) = \limsup_{n\to\infty} \sup_{k \in \mathbb{N}} \#(A \cap \{k, k+1, \ldots, k+n-1\})/n$.
Given a dynamical system (X, f ) and sets A, B ⊂ X we define the set of transition times from A to B by $N(A, B) = \{n > 0 : f^n(A) \cap B \neq \emptyset\}$, and for x ∈ X we put $N(x, B) = \{n > 0 : f^n(x) \in B\}$. There are no commonly accepted names for the sets N(A, B) and N(x, B). Some authors (see, e.g., [19]) prefer to call them the set of hitting times of A and B, and the set of times x enters into B, respectively. Note that N(x, B) = N({x}, B). Many recurrence properties of a dynamical system (X, f ) may be characterized in terms of sets of transition (visiting) times. For the purposes of the present paper we will state these equivalent characterizations in the theorems below and omit the standard definitions. The first equivalence above is straightforward, the second one follows, e.g., from [9, Proposition II.3]. For the proof of the next theorem see, e.g., [7], and consult [20, Section 5] for more information.
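The difference between density along initial segments and Banach density along arbitrary windows can be seen numerically. The following Python sketch computes finite-horizon approximations of both quantities; the function names and the `horizon` parameter are hypothetical, and a finite computation can only suggest, not determine, the value of a limes superior.

```python
def initial_segment_density(a, n):
    """Return #(A ∩ {1, ..., n}) / n, the n-th term in the definition of upper density."""
    return sum(1 for x in a if 1 <= x <= n) / n


def window_density(a, n, horizon):
    """Return the maximum of #(A ∩ {k, ..., k+n-1}) / n over windows inside {1, ..., horizon}."""
    members = set(a)
    indicator = [1 if i in members else 0 for i in range(1, horizon + 1)]
    current = sum(indicator[:n])
    best = current
    # Slide the window one step at a time, updating the count incrementally.
    for right in range(n, horizon):
        current += indicator[right] - indicator[right - n]
        best = max(best, current)
    return best / n


if __name__ == "__main__":
    # Blocks of length m starting at 4**m: sparse overall, but with long full windows.
    a = {4 ** m + j for m in range(1, 7) for j in range(m)}
    print(initial_segment_density(a, 4102))
    print(window_density(a, 5, 4102))
```

The example set consists of ever longer blocks placed far apart, so some windows of length 5 are completely filled although only 21 of the first 4102 integers belong to the set.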

Spacing shifts
Let n ≥ 2 and Λ n = {0, 1, . . . , n − 1} be equipped with the discrete topology. We endow the space of all infinite sequences of symbols from Λ n indexed by the positive integers N with the product topology, and denote it by Ω n = Λ N n . The reader should remember (especially when reading Section 5) that we equip Ω n with a compatible metric ρ given by $\rho(x, y) = 2^{-k}$, where $k = \min\{j \in \mathbb{N} : x_j \neq y_j\}$, for x ≠ y, and ρ(x, x) = 0. The shift transformation σ acts on ω ∈ Ω n by shifting it one position to the left. That is, $\sigma(\omega)_i = \omega_{i+1}$ for every i ∈ N.
A word of length k (a k-word for short) is a sequence w = w 1 w 2 . . . w k of elements of Λ n . The length of a word w is denoted by |w|. We will say that a word u = u 1 u 2 . . . u k appears in a word w = w 1 w 2 . . . w n at position t, where 1 ≤ t ≤ n − k + 1, if w t+ j−1 = u j for j = 1, . . . , k. Similarly, a word u appears in ω = (ω i ) ∈ Ω n at position t ∈ N if ω t+ j−1 = u j for j = 1, . . . , k. A cylinder given by a word w is the set [w] of all sequences ω ∈ Ω n such that w appears at position 1 in ω. The collection of all cylinders forms a base for the topology on Ω n .
The concatenation of words w and v is a sequence u = wv given by u i = w i for 1 ≤ i ≤ |w| and u i = v i−|w| for |w| + 1 ≤ i ≤ |w| + |v|. If u is a word, and n ≥ 1, then u n is the concatenation of n copies of u. Then u ∞ has its obvious meaning.
If S ⊂ Ω n , then the language of S is the set L(S ) of all nonempty words which appear at some position in some x ∈ S . The set L k (S ) consists of all elements of L(S ) of length k. Given a nonempty set W of words we can define a set X W ⊂ Ω n as the set of all ω ∈ Ω n such that L(ω) ⊂ W. It is well known (see [21, Proposition 1.3.4]) that if W is a nonempty collection of words such that for every word w ∈ W all words appearing in w are also in W and at least one word among wα, where α ∈ Λ n , is in W, then X W is a one-sided subshift and L(X W ) = W.
Let P be a subset of positive integers. We say that a binary word w = w 1 . . . w l is P-admissible if w i = w j = 1 implies |i − j| ∈ P ∪ {0}. Let W(P) be the collection of all P-admissible words. By the result mentioned above, Ω P = X W(P) ⊂ Ω 2 is a binary subshift, and its language, L(Ω P ) is the set of all P-admissible words. We will write σ P for σ : Ω 2 → Ω 2 restricted to Ω P , and call the dynamical system given by σ P : Ω P → Ω P a spacing shift given by P. If w ∈ L(Ω 2 ), then by [w] P we denote [w] ∩ Ω P .
It is easy to see that the definition of a spacing shift implies that N([1] P , [1] P ) = P. Moreover, σ P is weakly mixing if and only if P is a thick set (see [5,18,22]).
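The definition of P-admissibility translates directly into a membership test for finite words. The Python sketch below is an illustration only; the name `is_p_admissible` is hypothetical, words are represented as sequences of 0s and 1s, and P is passed as a finite set that is assumed to list all relevant distances up to the word length.

```python
def is_p_admissible(word, p):
    """Return True if every two distinct positions carrying a 1 are at a distance lying in p."""
    ones = [i for i, symbol in enumerate(word) if symbol == 1]
    return all(
        (later - earlier) in p
        for index, earlier in enumerate(ones)
        for later in ones[index + 1:]
    )


if __name__ == "__main__":
    p = {2, 3, 5}
    print(is_p_admissible((1, 0, 1, 0, 0, 1), p))  # distances 2, 3 and 5 occur
    print(is_p_admissible((1, 1), p))              # distance 1 is not in p
```

A word with 1s at two positions at distance r is admissible exactly when r ∈ P, which is the finite counterpart of the identity N([1] P , [1] P ) = P.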
As we are concerned here with the entropy of subshifts of Ω n , we recall a definition of topological entropy suitable for our purposes. If X ⊂ Ω n is a subshift, then we set λ k = # L k (X). It is straightforward to see that λ m+n ≤ λ n · λ m , therefore the number $h(X) = \lim_{k\to\infty} \frac{\log \lambda_k}{k}$ is well defined, and actually h(X) = inf log λ k /k. (Here, as elsewhere, we use logarithms with base 2.) It is well known (see [21,34]) that h(X) is equal to the topological entropy of the dynamical system (X, σ| X ).
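Since h(X) is the infimum of log λ k /k, every finite count of admissible words yields an upper bound for the entropy. The Python sketch below counts P-admissible words by brute-force extension; it is an illustration under the assumption that P is given as a finite set covering all distances below k, and the names `count_admissible` and `entropy_upper_bound` are hypothetical. The running time grows exponentially with k.

```python
from math import log2


def count_admissible(k, p):
    """Return the number of P-admissible binary words of length k."""

    def extend(ones, position):
        if position == k:
            return 1
        # Appending the symbol 0 never violates admissibility.
        total = extend(ones, position + 1)
        # Appending the symbol 1 is allowed only if all distances to earlier 1s lie in p.
        if all((position - earlier) in p for earlier in ones):
            total += extend(ones + [position], position + 1)
        return total

    return extend([], 0)


def entropy_upper_bound(k, p):
    """Return log2(lambda_k) / k, which bounds the entropy of the spacing shift from above."""
    return log2(count_admissible(k, p)) / k


if __name__ == "__main__":
    print(count_admissible(5, {2, 3, 4}))
    print(entropy_upper_bound(5, {1, 2, 3, 4}))
```

With all distances allowed the count is 2^k, with no distances allowed only words with at most one 1 survive, and forbidding only distance 1 gives the familiar Fibonacci counts.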

Hereditary subshifts and their topological entropy
The aim of the present section is to provide a characterization of hereditary subshifts with positive topological entropy. It will allow us to describe topological and ergodic properties of the hereditary subshifts with zero entropy. Some of the results we include in this section are known and can be proved using ergodic theory.
Here we present them with new, more elementary and straightforward proofs which use only basic combinatorics and topological dynamics to keep the exposition as self-contained as possible. Nevertheless, we admit that the ergodic theory approach is undeniably elegant.
Recall that a subshift X ⊂ Ω n is hereditary provided that for any ω ∈ X, if for some ω′ ∈ Ω n we have ω′ i ≤ ω i for all i ∈ N, then ω′ ∈ X. The following lemma follows directly from the definition of a hereditary subshift, and records basic properties of hereditary subshifts for further reference. Here, for a binary word w = w 1 . . . w k ∈ L k (Ω 2 ) we define  Proof. By our assumption we can find ε > 0 and a sequence w (k) of words appearing in ω such that l(k) = |w (k) | → ∞ with k → ∞, and w (k) ≥ l(k)ε. By Lemma which concludes the proof.
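The hereditary condition has an obvious analogue for finite collections of words: whenever a word belongs to the collection, so does every word of the same length that is coordinate-wise smaller. The Python sketch below checks this finite analogue; the names `dominated_words` and `is_hereditary_language` are hypothetical, and the check concerns only the finite set of words supplied, not a whole subshift.

```python
from itertools import product


def dominated_words(word):
    """Yield all words v of the same length with v[i] <= word[i] for every i."""
    return product(*(range(symbol + 1) for symbol in word))


def is_hereditary_language(words):
    """Return True if the finite collection is closed under coordinate-wise decrease."""
    language = {tuple(w) for w in words}
    return all(v in language for w in language for v in dominated_words(w))


if __name__ == "__main__":
    print(is_hereditary_language([(0, 0), (0, 1), (1, 0)]))
    print(is_hereditary_language([(0, 0), (1, 1)]))
```

The first collection is closed under lowering symbols, while the second one contains 11 but neither 01 nor 10.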
We will need the following simple combinatorial result whose proof can be found for example in [30, p. 52].
Lemma 5. Let 0 < ε ≤ 1/2 and n ≥ 1. Then $\sum_{j=0}^{\lfloor n\varepsilon \rfloor} \binom{n}{j} \le 2^{nH(\varepsilon)}$, where $H(\varepsilon) = -\varepsilon \log \varepsilon - (1-\varepsilon)\log(1-\varepsilon)$.
Let X ⊂ Ω n be a subshift of the full shift over Λ n . For a symbol α ∈ Λ n we define δ k (X, α) as the maximal number of occurrences of the symbol α in a word w ∈ L k (X), that is, $\delta_k(X, \alpha) = \max\{\#\{1 \le j \le k : w_j = \alpha\} : w \in L_k(X)\}$. Clearly, δ s+t (X, α) ≤ δ s (X, α) + δ t (X, α) holds for any positive integers s and t. Therefore, the sequence δ k (X, α) is subadditive, and δ k (X, α)/k has a limit as k approaches infinity. Hence we can define the maximal density of α in X as $\Delta_\alpha(X) = \lim_{k\to\infty} \delta_k(X, \alpha)/k$. Theorem 6 is the best motivation for the above definition. Note that for a hereditary shift X ⊂ Ω n we have The following theorem follows from the ergodic theorem, but here we present a direct proof inspired by [13].
Theorem 6. If X ⊂ Ω n is a subshift, then for every α ∈ Λ n there exists a point ω ∈ X such that d({ j : ω j = α}) = ∆ α (X).
Proof. Without loss of generality we may assume that n = 2 and α = 1. If ∆ 1 (X) = 0, then the set N \ 1(ω) must be thick for every ω ∈ X. Then 0 ∞ ∈ X since X is closed and shift invariant. Assume now that ∆ 1 (X) > 0. For every n > 0 let w (n) = w (n) 1 . . . w (n) n ∈ L n (X) be a word of length n such that and fix any point x (n) ∈ [w (n) ] X . We claim that for each integer k > 0 there exists a word w (k) ∈ L(X) such that For the proof of the claim, assume on the contrary that (1) does not hold for some k > 0. Then, ∆ 1 (X) − 1/k > 0. Set m = k 2 + 1. As we assumed that our claim fails, for a point y = x (m) defined above we can find a strictly increasing sequence of integers contradicting the definition of m. Therefore, our claim holds. Now, for each integer k > 0 there exists a point x (k) ∈ [w (k) ] X , and since X is compact, we may without loss of generality assume that x (k) converges to some x ∈ X. Hence for every k > 0 there exists N ≥ k such that x| [0,k) = w (N) | [0,k) . For every k > 0 we have where the first inequality follows by our claim, and the second is a consequence of the definition of δ k (X, 1). We conclude the proof by passing to the limit as k → ∞.
It is clear that if there exists ω ∈ X such that 1(ω) has positive upper Banach density, then ∆ 1 (X) is also positive. Let us note an immediate consequence: Corollary 7. If X ⊂ Ω n is a subshift and BD * (1(x)) > 0 for some x ∈ X, then there exists y ∈ X such that d(1(y)) > 0.
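For a spacing shift the maximal number of 1s in an admissible word of length k can be found by exhaustive search for small k. The Python sketch below is an illustration only: the name `max_ones` is hypothetical, P is assumed to be given as a finite set covering all distances below k, and the search is exponential in k.

```python
def max_ones(k, p):
    """Return the maximal number of 1s in a P-admissible binary word of length k."""
    best = 0

    def extend(ones, position):
        nonlocal best
        if position == k:
            best = max(best, len(ones))
            return
        # Continue with the symbol 0.
        extend(ones, position + 1)
        # Continue with the symbol 1 if all distances to earlier 1s lie in p.
        if all((position - earlier) in p for earlier in ones):
            extend(ones + [position], position + 1)

    extend([], 0)
    return best


if __name__ == "__main__":
    evens = {2, 4, 6, 8}
    print([max_ones(k, evens) for k in range(1, 11)])
```

When P consists of the even numbers the 1s must share a parity, so about half of the positions can be occupied; when P is empty at most a single 1 fits, whatever the length.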
We can now use the previous theorem and its corollary to provide a proof of [10, Corollary to thm. 3.20].

Theorem 8. If A ⊂ N is a set of positive upper Banach density, then there is a set B ⊂ N with positive density such that B − B is contained in A − A.
Proof. Let P = A − A. Then the characteristic function of A, denoted by ω A , belongs to the spacing shift Ω P . By Corollary 7 there is a point ω ∈ Ω P with d(1(ω)) > 0. Let B ⊂ N be such that ω is its characteristic function. Then d(B) > 0 and B − B ⊂ P = A − A.
Let us note here yet another application of spacing shifts to combinatorial number theory. It follows directly from Theorem 2.

Lemma 9. If Z ⊂ N is a piecewise syndetic set, then there is a syndetic set S
In the case of a binary subshift, we prove that ∆ 1 (X) > 0 is necessary for h(X) > 0.

Theorem 10. Let X ⊂ Ω n be a subshift. If the maximal density of α in X is zero
Proof. Without loss of generality we may assume that n = 2 and α = 1. Fix 0 < ε < 1/2. As ∆ 1 (X) = 0, there exists an N = N(ε) > 0 such that for each n ≥ N we have δ n (X, 1) ≤ nε. It implies that $\# L_n(X) \le \sum_{j=0}^{\lfloor n\varepsilon \rfloor} \binom{n}{j}$ for every n ≥ N.
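Counting arguments of this kind rest on the standard estimate that, for 0 < ε ≤ 1/2, the number of binary words of length n with at most ⌊nε⌋ ones is bounded by 2 raised to n times the binary entropy of ε. The Python sketch below checks that estimate numerically; the names `binary_entropy` and `binomial_tail` are hypothetical, and a numerical check for a few parameters illustrates the inequality without proving it.

```python
from math import comb, floor, log2


def binary_entropy(eps):
    """Return H(eps) = -eps*log2(eps) - (1-eps)*log2(1-eps) for 0 < eps < 1."""
    return -eps * log2(eps) - (1 - eps) * log2(1 - eps)


def binomial_tail(n, eps):
    """Return the number of binary words of length n with at most floor(n*eps) ones."""
    return sum(comb(n, j) for j in range(floor(n * eps) + 1))


if __name__ == "__main__":
    for eps in (0.1, 0.25, 0.5):
        n = 40
        print(eps, binomial_tail(n, eps), 2 ** (n * binary_entropy(eps)))
```

As ε tends to 0 the exponent H(ε) tends to 0 as well, which is why a vanishing proportion of 1s forces the exponential growth rate of the word count to vanish.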
Clearly, Theorems 6 and 10 imply: Finally, we state our main theorem characterizing hereditary shifts with positive entropy as the ones with positive density of occurrences of 1's. Proof. Necessity of positive density of occurrences of 1's follows from Lemma 3(4) and Theorem 10, sufficiency follows from Lemma 4.
As remarked above we might take a different route and obtain an ergodic proof of Theorem 12. It would hinge upon the Variational Principle for the topological entropy and the well known result (see [10, Lemma 3.17]), which is included in the first part of the following theorem (the equivalence of conditions 1-3). The other implications follow from Theorems 6 and 12.
Theorem 13. For a subshift X ⊂ Ω n the following conditions are equivalent: 1. There exists a point ω ∈ X such that BD * ({n : ω n = α}) > 0 for some α ∈ Λ n \ {0}.
We find it useful to slightly rephrase the previous theorem. Note that, even for hereditary shifts, the condition (⋆⋆) above does not imply unique ergodicity, nor zero entropy, which we will show later in Theorem 25.
Now we return to spacing shifts, and turn our attention to the natural question: is there any property of P that ensures h(Ω P ) > 0? We have no satisfactory answer, but we will show that this question is equivalent to the notoriously elusive problem of characterizing the sets of (Poincaré) recurrence.

There exists a shift invariant measure µ on X such that
First, recall that a refinement of the classical Poincaré recurrence theorem motivates the following definition. Definition 1. We say that R ⊂ N is a set of recurrence if for any measure preserving system (X, X, µ, T ), and any set A ∈ X with µ(A) > 0 we have µ(A ∩ T −n (A)) > 0 for some n ∈ R.

Lemma 15. A necessary and sufficient condition for R ⊂ N to be a set of recurrence is that for every
By the above lemma we obtain the combinatorial characterization of sets of recurrence in terms of topological entropy of spacing shifts.

Theorem 16. A set R ⊂ N is a set of Poincaré recurrence if and only if h(Ω
Recall that in [5] the following problem is formulated (note that we slightly rephrased it below):

Question 5:
Is there P such that N \ P does not contain any IP-set but Ω P is proximal? Is there P such that N \ P does not contain any IP-set but h(Ω P ) > 0? Are these two properties (i.e. proximality and zero entropy) essentially different in the context of spacing subshifts?
To answer this question we will need the following lemma (see also [ Proof. By our assumption there is a positive number β and a sequence of intervals [s n , t n ] with s n , t n ∈ N and t n − s n → ∞ as n → ∞ such that Let k ∈ N be such that β > 1/k, and take any We will show that the sets A j = A + b j for j = 1, . . . , k cannot be pairwise disjoint. Assume on the contrary that they are pairwise disjoint. Let l n = t n − s n + 1. Let n be large enough to assure the following Then C ⊂ [s n , t n + b k ]. Moreover, for each j the set (A + b j ) ∩ [s n + b j , t n + b j ] has at least ⌈(t n − s n + 1)/k⌉ + b k elements. Now the assumption that the sets A j = A + b j for j = 1, . . . , k are pairwise disjoint leads to the conclusion that C has more than t n − s n + 1 + kb k elements, which gives us a contradiction. Therefore A i ∩ A j ≠ ∅ for some 1 ≤ i < j ≤ k, hence there are a i , a j in A and b i , b j in B such that a i − a j = b j − b i , which concludes the proof.
The following theorem generalizes [5, Theorem 3.6] since every IP-set is a ∆-set.
Theorem 18. If the entropy of Ω P is positive, then P intersects the difference set of any infinite subset of integers, that is, P is a ∆ * -set.
Proof. It is an immediate consequence of Theorem 12 and Lemma 17.
, which is clearly a ∆-set. To prove that B is not an IP-set, consider the binary expansions of elements of B, and observe that each must be of the form Therefore there is no infinite set A ⊂ B with FS(A) ⊂ B. Hence the complement of B in N is an IP * -set which is not a ∆ * -set, and we get the following corollary, which answers [5, Question 5].

Corollary 19.
There is a proximal spacing shift Ω P with P being an IP * -set and h(Ω P ) = 0.
It follows from Theorem 14 that for a spacing shift zero entropy implies proximality, and it will follow from Theorem 25 that the converse is not true.

Distributional chaos of hereditary shifts
In this section we consider distributional chaos for hereditary shifts, generalizing and extending results from [5].
Let (X, f ) be a dynamical system. Given x, y ∈ X we define a lower and an upper distribution function on the real line by setting $F_{xy}(t) = \liminf_{n\to\infty} \frac{1}{n}\#\{0 \le i < n : d(f^i(x), f^i(y)) < t\}$ and $F^*_{xy}(t) = \limsup_{n\to\infty} \frac{1}{n}\#\{0 \le i < n : d(f^i(x), f^i(y)) < t\}$. Clearly, F xy and F * xy are nondecreasing, and 0 ≤ F xy (t) ≤ F * xy (t) ≤ 1 for all real t. Moreover, F xy (t) = F * xy (t) = 0 for all t ≤ 0, and F xy (t) = F * xy (t) = 1 for all t > diam X. We adopt the convention that F xy < F * xy means that F xy (t) < F * xy (t) for all t in some interval of positive length.
Following [3] we say that a pair (x, y) of points from X is a DC1-scrambled pair if F * xy (t) = 1 for all t > 0, and F xy (s) = 0 for some s > 0. A pair (x, y) is a DC2-scrambled pair if F * xy (t) = 1 for all t > 0, and F xy (s) < 1 for some s > 0. Finally, by a DC3-scrambled pair we mean a pair (x, y) such that F xy < F * xy . The dynamical system (X, f ) is distributionally chaotic of type i (or DCi-chaotic for short), where i = 1, 2, 3, if there is an uncountable set S ⊂ X such that any pair of distinct points from S is DCi-scrambled.
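The quantities whose lower and upper limits define the distribution functions can be computed on finite prefixes of two sequences. The Python sketch below does this for the shift map; it is an illustration built on assumptions: the metric is taken to be 2^(−k), where k is the first position (counted from 1) at which the two sequences differ, the names `rho` and `distribution_fraction` are hypothetical, and two prefixes that agree on all available symbols are treated as being at distance 0.

```python
def rho(x, y, start=0):
    """Return 2**(-k), where k is the first 1-based position at which the shifted
    sequences x[start:] and y[start:] differ; return 0.0 if no difference is visible."""
    for k, (a, b) in enumerate(zip(x[start:], y[start:]), start=1):
        if a != b:
            return 2.0 ** (-k)
    return 0.0


def distribution_fraction(x, y, t, n):
    """Return the fraction of 0 <= i < n with rho(sigma^i(x), sigma^i(y)) < t."""
    return sum(1 for i in range(n) if rho(x, y, i) < t) / n


if __name__ == "__main__":
    x = [1, 0] * 50   # prefix of the periodic point (10)^infinity
    y = [0] * 100     # prefix of the fixed point 0^infinity
    for t in (0.25, 0.5, 1.0):
        print(t, distribution_fraction(x, y, t, 50))
```

For the periodic point (10)^∞ and the fixed point 0^∞ the distances along the orbit alternate between 1/2 and 1/4, so the fraction is 0 for t = 1/4, one half for t = 1/2 and 1 for t = 1. The lower and upper limits of such fractions as n grows are what the two distribution functions record.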
The proof of the following lemma is a standard exercise, therefore we skip it.
and an uncountable family Γ of subsets of S 0 such that for every S ′ , S ′′ ∈ Γ, Proof. Let α = d(S ) > 0. There exists an increasing sequence of positive integers b 1 < b 2 < . . . such that Without loss of generality we may assume that n · b n ≤ b n+1 for all n ∈ N. For n ∈ N let and therefore ( * ) holds. Note that To finish the proof it is enough to observe that there exists an uncountable family Θ of infinite sets of positive integers such that for any A, B ∈ Θ with A ≠ B the sets A \ B and B \ A are infinite.

Lemma 22.
Let X ⊂ Ω n be a hereditary subshift. If x and y is a pair of points in X such that F xy (s) < 1 for some s > 0, then there exists an uncountable set Γ ⊂ X such that for every u, v ∈ Γ, u ≠ v we have In particular, any pair (u, v) with u ≠ v is DC2-scrambled (DC1-scrambled, if in addition we have F xy (s) = 0).
Proof. Let (x, y) be a pair of points such that F xy (t) < 1 for some t > 0. By Lemma 20(1) we get that d({n : x n ≠ y n }) > 0. Since X is hereditary, without loss of generality we may assume that d(1(x)) > 0. With the customary abuse of notation, we let Γ be the set of characteristic functions of subsets of S = 1(x) provided by Lemma 21. Now, we apply both parts of Lemma 20 to see that each pair of different points of Γ fulfills the desired conditions.

Theorem 23. Let X ⊂ Ω n be a hereditary subshift. Then the following conditions are equivalent:
1. The topological entropy of X is positive.
2. There exist points x, y ∈ X such that F xy (t) < 1 for some t > 0.
3. X is DC3-chaotic.
4. X is DC2-chaotic.
Proof. On account of Lemma 22 conditions (2-4) are equivalent. By Theorem 13 positive entropy of X is equivalent to the existence of a point x ∈ X with d(1(x)) > 0. Now we may consider a pair (x, y) where y = 0 ∞ , and apply Lemma 22 to finish the proof.

Theorem 24. A hereditary shift X ⊂ Ω n is DC1-chaotic if and only if X is not proximal.
Proof. If ω = (ω i ) ≠ 0 ∞ is a minimal point, then x = σ ν (ω) ∈ [α] for some ν ≥ 0 and α ∈ Λ n \ {0}. Moreover, x is also a minimal point of X, and hence it returns to the cylinder [α] syndetically often, that is, there is k > 0 such that x [ j, j+k) ≠ 0 k for each j ∈ N. Let y = 0 ∞ . Then (x, y) is a pair such that F xy (2 −k ) = 0. We conclude from Lemma 22 that there must be an uncountable DC1-scrambled set in X. For the other direction, note that by [24, Corollary 15] DC1-scrambled pairs are absent in any proximal system. Hence, DC1-chaos implies the existence of a minimal set other than 0 ∞ .
The following theorem completes our answer to [5, Questions 4 and 5]. Note that the subshift obtained from this theorem has an invariant measure supported outside minimal sets. The first example of this phenomenon was given by Goodwyn in [12]. Here, following [23], by an r-coloring of N we mean any partition N = C 1 ∪ . . . ∪ C r . The indices 1, 2, . . . , r are called the colors. A set E ⊂ N is said to be r-intersective if for every r-coloring N = C 1 ∪ . . . ∪ C r there exists a color i such that (C i − C i ) ∩ E is non-empty. We say that E ⊂ N is chromatically intersective if E is r-intersective for any r ≥ 1. By [23, Proposition 0.12] a set E is chromatically intersective if and only if whenever (X, f ) is a dynamical system, x ∈ X is a minimal point and U ⊂ X is an open neighborhood of x, then there is n ∈ E such that U ∩ f −n (U) is non-empty.

Theorem 25.
There exists a weakly mixing and proximal spacing shift (Ω P , σ P ) with positive topological entropy. Hence, there is a DC2-chaotic spacing shift which is not DC1-chaotic.
Proof. By the result of Kříž (proved first by [16], here we use [23, Theorem 1.2]) there exists a set A ⊂ N with d(A) > 0 such that (A − A) ∩ C = ∅ for some chromatically intersective set C. Let P = N \ C.
We claim that the spacing shift Ω P is proximal, that is, we claim that 0 ∞ is the unique minimal point of Ω P . Assume on the contrary that there is another minimal point ω ∈ Ω P . Then there is some k ≥ 0 such that [1] P is an open neighborhood of the minimal point σ k (ω). By [23, Proposition 0.12] there must be n ∈ N([1] P , [1] P ) ∩ C, but this contradicts the definition of P = N \ C. So Ω P is proximal.
Moreover, the characteristic function of the set A belongs to Ω P , hence h(Ω P ) > 0, since d(A) > 0. By Theorems 23 and 24 the spacing shift Ω P is DC2-chaotic but it is not DC1-chaotic. To prove that Ω P is weakly mixing we need to show that C can be chosen so that P = N \ C is thick. Since most of the construction of the set C can be repeated without introducing anything new, we ask the reader to re-examine the proof of [23, Theorem 1.2] to see that C is defined as a union of finite sets where c · J = {c j : j ∈ J}, and positive integers n 1 , n 2 , . . . can be chosen to be arbitrarily large. As all sets C 1 , C 2 , . . . are finite, and do not depend on the n i 's, one can force C to have a thick complement.

Beta shifts are hereditary
We prove here that the very important class of beta shifts provides a whole family of examples of hereditary shifts. We follow the description of beta shifts presented in [32]. To define a beta shift fix a real number β > 1 and let the sequence ω (β) ∈ Ω ⌈β⌉ be the expansion of 1 in base β, that is, $1 = \sum_{i=1}^{\infty} \omega^{(\beta)}_i \beta^{-i}$. Then ω (β) ∈ Ω ⌈β⌉ is given by ω (β) 1 = ⌊β⌋ and $\omega^{(\beta)}_i = \lfloor \beta^i (1 - \omega^{(\beta)}_1 \beta^{-1} - \ldots - \omega^{(\beta)}_{i-1} \beta^{-i+1}) \rfloor$ for i ≥ 2. Let ⪯ denote the lexicographic ordering of the set (N ∪ {0}) N . Then it can be proved that for any k ≥ 0 we have (2) $\sigma^k(\omega^{(\beta)}) \preceq \omega^{(\beta)}$, where σ denotes the shift operator on (N ∪ {0}) N . By a result of Parry [26], the converse is also true, that is, if any sequence over a finite alphabet satisfies the above condition then there is a β > 1 such that this sequence is a β-expansion of 1. It follows from (2) that $\Omega_\beta = \{\omega \in \Omega_{\lceil\beta\rceil} : \sigma^k(\omega) \preceq \omega^{(\beta)} \text{ for all } k \ge 0\}$ is a subshift of Ω ⌈β⌉ , called the beta subshift defined by β.
It is easy to see that the above description of beta shifts implies the following: Lemma 26. Every beta shift Ω β ⊂ Ω ⌈β⌉ is hereditary.
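The greedy expansion of 1 and the lexicographic admissibility condition are easy to experiment with when β is rational, since exact arithmetic is then available. The Python sketch below is an illustration under stated assumptions: the names `expansion_of_one` and `is_beta_admissible` are hypothetical, a finite word is regarded as admissible when each of its suffixes is lexicographically no larger than the prefix of the expansion of 1 of the same length, and the list of digits supplied must be at least as long as the word. For irrational β one would have to use floating point arithmetic, and rounding could then produce wrong digits.

```python
from fractions import Fraction
from itertools import product
from math import floor


def expansion_of_one(beta, length):
    """Return the first `length` digits of the greedy expansion of 1 in base beta."""
    digits = []
    remainder = Fraction(1)
    for _ in range(length):
        scaled = beta * remainder
        digit = floor(scaled)
        digits.append(digit)
        remainder = scaled - digit
    return digits


def is_beta_admissible(word, digits):
    """Return True if every suffix of word is lexicographically <= the matching prefix of digits."""
    w = list(word)
    return all(w[i:] <= digits[:len(w) - i] for i in range(len(w)))


if __name__ == "__main__":
    digits = expansion_of_one(Fraction(5, 2), 6)
    print(digits)
    admissible = [w for w in product(range(3), repeat=4) if is_beta_admissible(w, digits)]
    # Lowering any symbol of an admissible word should keep it admissible.
    print(all(
        is_beta_admissible(v, digits)
        for w in admissible
        for v in product(*(range(s + 1) for s in w))
    ))
```

The second printed value checks, for β = 5/2 and words of length 4, the finite counterpart of the hereditary property: lowering symbols can only make each suffix lexicographically smaller.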

Final remarks and an open problem
Finally, we present an example which shows that there exist hereditary shifts other than spacing shifts or beta shifts. Theorem 27. There exists a mixing, hereditary binary subshift without any DC3-scrambled pair, which is not conjugate to any spacing shift, nor to any beta shift.
Proof. To specify X we will describe the language of X. Let W be the collection of all words w from L(Ω 2 ) such that for any word u occurring in w, if $2^{k-1} + 1 \le |u| \le 2^k$, then the symbol 1 occurs at fewer than k + 1 positions in u. It is clear that W fulfills the assumptions of [21, Proposition 1.3.4], and hence X = X W is a binary subshift with W = L(X W ). Then clearly, X is hereditary, and d(ω) = 0 for every ω ∈ X, hence the topological entropy of X is zero, and there is no DC3-scrambled pair in X. Now fix any two cylinders [u] and [v] in X. Since u0 k v0 ∞ ∈ X for all sufficiently large k, we conclude by Theorem 1 that X is mixing. It follows from [5] that all mixing spacing shifts have positive topological entropy. On the other hand, it is well known that the topological entropy of every beta shift is also positive. Hence X is not conjugate to any spacing shift nor to any beta shift.
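Membership in the collection W used in the proof above can be tested mechanically. The Python sketch below is an illustration only: the name `in_language_w` is hypothetical, and subwords of length 1 are treated as unconstrained, because no integer k ≥ 1 satisfies 2^(k−1) + 1 ≤ 1. For a subword of length m ≥ 2 the relevant k is the least integer with m ≤ 2^k.

```python
def in_language_w(word):
    """Return True if every subword u with 2**(k-1) + 1 <= len(u) <= 2**k has at most k ones."""
    n = len(word)
    # prefix[i] is the number of 1s among the first i symbols.
    prefix = [0]
    for symbol in word:
        prefix.append(prefix[-1] + symbol)
    for length in range(2, n + 1):
        k = (length - 1).bit_length()  # least k with length <= 2**k
        for start in range(n - length + 1):
            if prefix[start + length] - prefix[start] > k:
                return False
    return True


if __name__ == "__main__":
    print(in_language_w([1, 0, 1, 0, 0, 0, 1]))
    print(in_language_w([1, 1]))
    print(in_language_w([1, 0, 1, 0, 1, 0, 1]))
```

The last example fails because the whole word has length 7 and four 1s, while at most three are allowed. Since the allowed number of 1s grows only logarithmically with the length, every point of the resulting subshift has 1s of density zero.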
As the topological entropy of a beta shift Ω β is log β, using the main result of [8] or Theorem 23 we obtain that every beta shift is DC2-chaotic. But actually more is true.

Theorem 28. Every beta shift is DC1-chaotic.
Proof. It is well known that beta shifts are never proximal (it follows for example from the main result of [31] or [32, Proposition 5.2]). Then one invokes Theorem 24 to finish the proof.
Note that all beta shifts are known to have a unique measure of maximal entropy (see [31]). This prompts us to state the following conjecture, which to the best of our knowledge remains open.
Conjecture: Every hereditary shift is intrinsically ergodic, that is, it possesses a unique measure of maximal entropy.