Avoiding fractional powers over the natural numbers

We study the lexicographically least infinite $a/b$-power-free word on the alphabet of non-negative integers. Frequently this word is a fixed point of a uniform morphism, or closely related to one. For example, the lexicographically least $7/4$-power-free word is a fixed point of a $50847$-uniform morphism. We identify the structure of the lexicographically least $a/b$-power-free word for three infinite families of rationals $a/b$ as well as many "sporadic" rationals that do not seem to belong to general families. To accomplish this, we develop an automated procedure for proving $a/b$-power-freeness for morphisms of a certain form, both for explicit and symbolic rational numbers $a/b$. Finally, we establish a connection to words on a finite alphabet. Namely, the lexicographically least $27/23$-power-free word is in fact a word on the finite alphabet $\{0, 1, 2\}$, and its sequence of letters is $353$-automatic.


Introduction
A major thread of the combinatorics on words literature is concerned with avoidability of patterns. Beginning with work of Thue [8,9,4], a basic question has been the following. Given a pattern, on what size alphabet does there exist an infinite word containing no factors matching the pattern? For example, it is easy to see that squares (words of the form ww where w is a nonempty word) are unavoidable on a binary alphabet, but Thue [8] exhibited an infinite square-free word on a ternary alphabet.
If it is not known whether a given pattern is avoidable on a given alphabet, it is natural to attempt to construct long finite words that avoid the pattern as follows. Choose an order on the alphabet. Begin with the empty word, and then iteratively lengthen the current word by appending the least letter of the alphabet, or, if that letter introduces an instance of the pattern, the next least letter, etc. If no letter extends the word, then backtrack to the previous letter and increment it instead. If there exists an infinite word avoiding the pattern, then this procedure eventually computes prefixes of the lexicographically least infinite word avoiding that pattern.
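The backtracking procedure just described can be sketched as follows. This is a minimal illustration, not the authors' software: the names `has_square_suffix` and `least_avoiding_word` are hypothetical, the pattern is fixed to squares, and the alphabet is $\{0, \dots, s-1\}$ with its usual order. For a fixed target length the search returns the least word of that length, so only its early letters can be expected to agree with the infinite word.

```python
def has_square_suffix(w):
    """Return True if some nonempty suffix of w has the form xx."""
    n = len(w)
    return any(w[n - p:] == w[n - 2 * p:n - p] for p in range(1, n // 2 + 1))


def least_avoiding_word(length, alphabet_size, has_bad_suffix):
    """Least word of the given length over {0, ..., alphabet_size - 1} such that
    no prefix ends in a forbidden factor; None if no such word exists."""
    w = []
    first_candidate = 0  # smallest letter still to be tried at the current position
    while len(w) < length:
        for c in range(first_candidate, alphabet_size):
            w.append(c)
            if not has_bad_suffix(w):
                break  # c extends the word
            w.pop()
        else:
            # No letter extends the word: backtrack and increment the previous letter.
            if not w:
                return None
            first_candidate = w.pop() + 1
            continue
        first_candidate = 0
    return w
```

On a binary alphabet the search fails already at length 4, in line with the unavoidability of squares mentioned above, while on a ternary alphabet it succeeds and needs one backtracking step to reach length 8.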
Lexicographically least words avoiding patterns have become a subject of study in their own right. An overlap is a word of the form $cxcxc$ where $c$ is a letter. On a binary alphabet, the lexicographically least overlap-free word is $001001\,\varphi^\infty(1)$, where $\varphi(0) = 01$, $\varphi(1) = 10$ and $\varphi^\infty(1)$ is the complement of the Thue-Morse word [1].
Date: April 8, 2018. The second-named author was supported in part by a Marie Curie Actions COFUND fellowship at the University of Liège.
Guay-Paquet and Shallit [6] began the study of lexicographically least words avoiding patterns on the alphabet $\mathbb{Z}_{\ge 0}$. They gave morphisms generating the lexicographically least words on $\mathbb{Z}_{\ge 0}$ avoiding overlaps and avoiding squares. Since the alphabet is infinite, prefixes of such words can be computed without backtracking.
Rowland and Shallit [7] gave a morphism description for the lexicographically least $3/2$-power-free word. A fractional power is a partial repetition, defined as follows. Let $a$ and $b$ be relatively prime positive integers. If $v = v_0 v_1 \cdots v_{l-1}$ is a nonempty word whose length $l$ is divisible by $b$, define
$$v^{a/b} := v^{\lfloor a/b \rfloor}\, v_0 v_1 \cdots v_{l \cdot \{a/b\} - 1},$$
where $\{a/b\} = a/b - \lfloor a/b \rfloor$ is the fractional part of $a/b$. For example, $(0111)^{3/2} = 011101$. We say that $v^{a/b}$ is an $a/b$-power. Note that $|v^{a/b}| = \frac{a}{b}|v|$. If $a/b > 1$, then a word $w$ is an $a/b$-power if and only if $w$ can be written $v^e x$ where $e$ is a non-negative integer, $x$ is a prefix of $v$, and $|w|/|v| = a/b$. We say that a word is $a/b$-power-free if none of its factors are $a/b$-powers. Avoiding $3/2$-powers, for example, means avoiding factors $xyx$ where $|x| = |y| \ge 1$. Avoiding $5/4$-powers means avoiding factors $xyx$ where $3|x| = |y| \ge 1$. More generally, if $1 < a/b < 2$ then an $a/b$-power is a word of the form $xyx$ where $|xyx|/|xy| = a/b$. A bordered word is a word of the form $xyx$ where $x$ is nonempty, so for $1 < a/b < 2$ one can think of an $a/b$-power as a bordered word with a prescribed relationship between $|x|$ and $|y|$.
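The definition of a fractional power can be made concrete with a short sketch. The function names `fractional_power` and `is_fractional_power` are hypothetical and words are represented as Python strings; the second function uses the fact that a word of length $ma$ is an $a/b$-power exactly when it has period $mb$.

```python
def fractional_power(v, a, b):
    """Return v^(a/b): v repeated cyclically and truncated to length |v| * a / b."""
    if not v or len(v) % b != 0:
        raise ValueError("v must be nonempty with length divisible by b")
    target = len(v) * a // b
    return "".join(v[i % len(v)] for i in range(target))


def is_fractional_power(w, a, b):
    """Return True if w = v^(a/b) for some nonempty v whose length is divisible by b.

    Assumes a and b are relatively prime, so |w| must be a multiple m*a of a,
    and then w is an a/b-power exactly when it has period m*b.
    """
    if not w or len(w) % a != 0:
        return False
    period = len(w) // a * b
    return all(w[i] == w[i + period] for i in range(len(w) - period))
```

For instance, `fractional_power("0111", 3, 2)` reproduces the example $(0111)^{3/2} = 011101$ from the text.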
Basic terminology is as follows. If Σ is an alphabet (finite or infinite), Σ * denotes the set of finite words with letters from Σ. We index letters in a finite or infinite word starting with position 0. A morphism on an alphabet Σ is a map ϕ : Σ → Σ * . A morphism on Σ extends naturally to finite and infinite words by concatenation. A morphism ϕ on Σ is k-uniform if |ϕ(n)| = k for all n ∈ Σ. If there is a letter c ∈ Σ such that c is the first letter of ϕ(c), then iterating ϕ gives a word ϕ ∞ (c) which begins with c and which is a fixed point of ϕ.
In this paper, we show that for some rational numbers $a/b$, the lexicographically least $a/b$-power-free word on $\mathbb{Z}_{\ge 0}$ is a fixed point of a uniform morphism. For other rationals, this word is the image, under a coding, of a fixed point of a morphism on the alphabet $\mathbb{Z}_{\ge 0} \cup \Sigma$ for some finite set $\Sigma$. In both cases, the morphisms $\varphi|_{\mathbb{Z}_{\ge 0}}$ are $a/b$-power-free, meaning that if $w$ avoids $a/b$-powers then $\varphi(w)$ avoids $a/b$-powers. By studying lexicographically least words, we discover many $a/b$-power-free morphisms, which are interesting in their own right.
The outline of the paper is as follows. In Section 2 we discuss the lexicographically least word avoiding $a/b$-powers for several explicit rationals $a/b$ and discuss the $k$-regularity of their sequences of letters. In Section 3 we show that, for an infinite family of rationals in the interval $5/3 \le a/b < 2$, the lexicographically least $a/b$-power-free word is a fixed point of an $a/b$-power-free $(2a-b)$-uniform morphism. In Section 4 we discuss automating proofs of $a/b$-power-freeness and prove similar theorems for other morphisms. In Section 5 we establish the structure of the lexicographically least $a/b$-power-free word for two additional infinite families of rationals. Using the machinery we have built, in Section 6 we address some sporadic words that we have not found to belong to infinite families.
The theorems in Section 4 and, in part, Sections 5 and 6, are proved by automated symbolic case analysis. Since the morphisms are symbolic in $a$ and $b$, a substantial amount of symbolic computation is required. The Mathematica package SymbolicWords was written to manipulate words with symbolic run lengths and perform these computations. It can be downloaded from the web site of the second-named author.

Some explicit rationals
The following word is our main object of study.
Notation. Let $a$ and $b$ be relatively prime positive integers such that $a/b > 1$. Define $w_{a/b}$ to be the lexicographically least infinite word on $\mathbb{Z}_{\ge 0}$ avoiding $a/b$-powers.

We require $a/b > 1$, because if $0 < a/b \le 1$ then every word of length $a$ is an $a/b$-power and $w_{a/b}$ does not exist. For $a/b > 1$, the word $w_{a/b}$ exists, since, given a prefix, appending an integer that doesn't occur in the prefix yields a word with no $a/b$-power suffix. It is clear that $w_{a/b}$ is not eventually periodic, since the periodic word $xxx\cdots$ contains the $a/b$-power $x^a = (x^b)^{a/b}$.

In this section we examine $w_{a/b}$ for some explicit rational numbers $a/b$. Guay-Paquet and Shallit [6] showed that the lexicographically least square-free word on $\mathbb{Z}_{\ge 0}$ is
$$w_2 = \varphi^\infty(0) = 01020103010201040102010301020105\cdots,$$
where $\varphi$ is the 2-uniform morphism given by $\varphi(n) = 0\,(n+1)$. More generally, for an integer $a \ge 2$ we have $w_a = \varphi^\infty(0)$, where $\varphi(n) = 0^{a-1}\,(n+1)$. If we write $w_a = w(0)w(1)\cdots$ so that $w(i)$ is the letter at position $i$ in $w_a$, then
$$w(ai + r) = \begin{cases} 0 & \text{if } 0 \le r \le a-2 \\ w(i) + 1 & \text{if } r = a-1 \end{cases}$$
for all $i \ge 0$. The results in this paper can be seen as generalizations of this morphism and recurrence to fractional powers. Many of the morphisms that will appear are of the form $\varphi(n) = u\,(n+d)$, where $u$ is a word of length $k-1$ and $d \in \mathbb{Z}_{\ge 0}$. If we write $\varphi^\infty(0) = w(0)w(1)\cdots$, then the letter sequence $w(i)_{i \ge 0}$ satisfies
$$w(ki + r) = \begin{cases} w(r) & \text{if } 0 \le r \le k-2 \\ w(i) + d & \text{if } r = k-1 \end{cases} \tag{1}$$
for all $i \ge 0$.

Consider
$$w_{3/2} = 001102\;100112\;001103\;100113\;001102\;100114\;001103\;100112\cdots.$$
The first array in Figure 1 shows the first several letters of $w_{3/2}$, partitioned into rows of length 6, with the integers 0 through 6 rendered in gray levels from white to black. The first five columns are periodic, and the last column is a "self-similar" column consisting of the letters of $w_{3/2}$, each increased by 2. The letters of $w_{3/2}$ satisfy
$$w(6i + r) = \begin{cases} w(r) & \text{if } r \in \{0, 2, 4\} \text{ and } i \text{ is even} \\ 1 - w(r) & \text{if } r \in \{0, 2, 4\} \text{ and } i \text{ is odd} \\ w(r) & \text{if } r \in \{1, 3\} \\ w(i) + 2 & \text{if } r = 5 \end{cases}$$
for all $i \ge 0$, which follows from the recurrence given by Shallit and the second-named author [7]. Moreover, $w_{3/2}$ is the image under a coding of a fixed point of a 6-uniform morphism $\varphi$ on the alphabet $\mathbb{Z}_{\ge 0} \cup \{0', 1'\}$. Let $\tau$ be the coding defined by $\tau(0') = 0$, $\tau(1') = 1$, and $\tau(n) = n$ for $n \in \mathbb{Z}_{\ge 0}$; this morphism $\tau$ will be the same throughout the paper. Then $w_{3/2} = \tau(\varphi^\infty(0'))$.
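Because the alphabet is infinite, prefixes of these words can be computed greedily, and the recurrences above can then be checked on a finite prefix. The following sketch does this; the helper names (`ends_with_power`, `least_power_free_prefix`, `satisfies_equation_1`) are hypothetical, words are lists of integers, and a finite check of this kind is of course no substitute for the proofs.

```python
def ends_with_power(w, a, b):
    """True if some suffix of w is an a/b-power (length m*a with period m*b)."""
    n = len(w)
    for m in range(1, n // a + 1):
        period = m * b
        if all(w[i] == w[i + period] for i in range(n - m * a, n - period)):
            return True
    return False


def least_power_free_prefix(a, b, length):
    """Greedy prefix of the lexicographically least a/b-power-free word on Z_{>=0}."""
    w = []
    while len(w) < length:
        w.append(0)
        while ends_with_power(w, a, b):
            w[-1] += 1  # try the next larger letter; no backtracking is needed
    return w


def satisfies_equation_1(w, k, d):
    """Check w(ki + r) = w(r) for r <= k - 2 and w(ki + k - 1) = w(i) + d on a prefix."""
    return all(
        w[k * i + r] == (w[i] + d if r == k - 1 else w[r])
        for i in range(len(w) // k)
        for r in range(k)
    )


w2 = least_power_free_prefix(2, 1, 32)
w32 = least_power_free_prefix(3, 2, 48)
print("".join(map(str, w2)))
print("".join(map(str, w32)))
```

The first printed word can be compared with the displayed prefix of $w_2$, and `satisfies_equation_1(w2, 2, 1)` checks Equation (1) with $k = 2$ and $d = 1$ on that prefix.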
The integer 6 features prominently in the structural description of $w_{3/2}$, and the sequence of letters of $w_{3/2}$ is a 6-regular sequence in the sense of Allouche and Shallit [2]. For an integer $k \ge 2$, a sequence $s(i)_{i \ge 0}$ is said to be $k$-regular if the $\mathbb{Z}$-module generated by the set of subsequences $\{s(k^e i + j)_{i \ge 0} : e \ge 0 \text{ and } 0 \le j \le k^e - 1\}$ is finitely generated. This implies that $s(i)$ can be computed from the base-$k$ digits of $i$ from some finite set of linear recurrences (such as Equation (1)) and initial conditions.
One of the main motivations of the present paper is to put the '6' for w 3/2 into context by studying w a/b for a number of other rationals a b . We will see that w a/b is often k-regular for some value of k. For each integer a ≥ 2, the word w a is a-regular. To demonstrate the variety of values that occur for b ≥ 2, let us survey w a/b for some rationals with small numerators and denominators. We start with some words with fairly simple structure and progress toward more complex words.
Partitioning $w_{5/3}$ into rows of length 7 produces the second array in Figure 1. There are 6 constant columns and one self-similar column in which the sequence reappears with every term increased by 1. Therefore the sequence of letters seems to satisfy Equation (1) with $k = 7$ and $d = 1$, and in fact we have the following.

Theorem 1. Let $\varphi$ be the 7-uniform morphism defined by $\varphi(n) = 000010\,(n+1)$ for all $n \in \mathbb{Z}_{\ge 0}$. Then $w_{5/3} = \varphi^\infty(0)$.
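The row structure described above can be observed directly on a computed prefix. The sketch below, with hypothetical helper names, builds the first $7^3 = 343$ letters of $w_{5/3}$ greedily and partitions them into rows of length 7; it is only a finite sanity check of the pattern that Theorem 1 asserts.

```python
def ends_with_power(w, a, b):
    """True if some suffix of w is an a/b-power (length m*a with period m*b)."""
    n = len(w)
    return any(
        all(w[i] == w[i + m * b] for i in range(n - m * a, n - m * b))
        for m in range(1, n // a + 1)
    )


def least_power_free_prefix(a, b, length):
    """Greedy prefix of the lexicographically least a/b-power-free word on Z_{>=0}."""
    w = []
    while len(w) < length:
        w.append(0)
        while ends_with_power(w, a, b):
            w[-1] += 1
    return w


def rows_of(w, k):
    """Partition w into complete rows of length k."""
    return [w[i:i + k] for i in range(0, len(w) - len(w) % k, k)]


w = least_power_free_prefix(5, 3, 343)
rows = rows_of(w, 7)
constant_columns = all(row[:6] == [0, 0, 0, 0, 1, 0] for row in rows)
self_similar_column = all(rows[i][6] == w[i] + 1 for i in range(len(rows)))
print(constant_columns, self_similar_column)
```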
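The last array in Figure 1 shows the word $w_{9/5}$ partitioned into rows of length $k = 13$. Again we see 12 constant columns and one self-similar column. Indeed this word is generated by the following morphism.

Theorem 2. Let $\varphi$ be the 13-uniform morphism defined by $\varphi(n) = 000000001000\,(n+1)$ for all $n \in \mathbb{Z}_{\ge 0}$. Then $w_{9/5} = \varphi^\infty(0)$.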
We prove Theorems 1 and 2 in Section 3. For $a/b = 8/5$ the value of $k$ is somewhat larger.

Theorem 3. There is a 733-uniform morphism $\varphi$ on $\mathbb{Z}_{\ge 0}$ such that $w_{8/5} = \varphi^\infty(0)$.
Among rationals with denominator 4 we come across an even longer morphism.
Let us look at a couple more rationals with denominator 5. For $6/5$ the correct value is $k = 1001$. However, the array obtained by partitioning $w_{6/5}$ into rows of length 1001, shown in Figure 2, does not have constant columns but columns that become constant after 30 rows. Subsequent rows suggest a certain 1001-uniform morphism $\varphi$, but we must build in the earlier rows by defining $\varphi(0') = v\,\varphi(0)$ for some word $v$ whose first letter is $0'$. Note that $\varphi$ is no longer uniform. We refer to the prefix $v$ as the transient. Let $\tau$ be as before; $\tau(0') = 0$ and $\tau(n) = n$ for $n \in \mathbb{Z}_{\ge 0}$. Then we have the following.

Theorem 5. There exist words $u, v$ of lengths $|u| = 1001 - 1$ and $|v| = 29949$ such that $w_{6/5} = \tau(\varphi^\infty(0'))$, where
$$\varphi(n) = \begin{cases} v\,\varphi(0) & \text{if } n = 0' \\ u\,(n+3) & \text{if } n \in \mathbb{Z}_{\ge 0}. \end{cases}$$

Although we state Theorem 5 as an existence result, the words $u$ and $v$ can be obtained explicitly by computing the appropriate prefix of $w_{6/5}$.

The sequence of letters in $w_{6/5}$ does not satisfy Equation (1) but does satisfy a modified equation accounting for the transient. Write $w_{6/5} = w(0)w(1)\cdots$. Then for all $i \ge 0$ we have
$$w(1001 i + 29949 + r) = \begin{cases} w(29949 + r) & \text{if } 0 \le r \le 999 \\ w(i) + 3 & \text{if } r = 1000. \end{cases}$$
That is, the letters of $w_{6/5}$ reappear as a subsequence, with every term increased by 3, beginning at $w(30949) = w(0) + 3$.

For $7/5$ there seems to be a similar transient, with $k = 80874$.

Conjecture 6. There exist words $u, v$ of lengths $|u| = 80874 - 1$ and $|v| = 93105$ such that $w_{7/5} = \tau(\varphi^\infty(0'))$, where $\varphi(0') = v\,\varphi(0)$ and $\varphi(n) = u\,(n+d)$ for all $n \in \mathbb{Z}_{\ge 0}$ and a fixed integer $d$.

For the word $w_{4/3}$, partitioning into rows of length $k = 56$ gives $k - 1$ eventually periodic columns. Unlike the previous examples, the self-similar column for $w_{4/3}$ does not contain the sequence simply transformed by adding a constant $d$; each 0 is increased by 1 and other integers are increased by 2.

Theorem 7. There exist words $u, v$ of lengths $|u| = 56 - 1$ and $|v| = 18$ such that $w_{4/3} = \tau(\varphi^\infty(0'))$, where $\varphi(0') = v\,\varphi(0)$, $\varphi(0) = u\,1$, and $\varphi(n) = u\,(n+2)$ for all $n \ge 1$.

We prove Theorems 3-5 and Theorem 7 in Section 6. In principle, Conjecture 6 can be proved in the same manner, although the computation would take longer than we choose to wait.
Finally, we mention a word whose structure we do not know. We have computed the prefix of w 5/4 of length 400000. When partitioned into rows of length 12 (or 6 or 24), all but one column appears to be eventually periodic, but we have not been able to identify self-similar structure in this last column.
With the possible exception of w 5/4 , the structure of each of these words is organized around some integer k. This integer is the length of ϕ(n) for each n ∈ Z ≥0 . We next show that the sequence of letters in such a word forms a k-regular sequence. For the fixed point ϕ ∞ (0) of a k-uniform morphism ϕ(n) = u (n + d), the k-regularity follows directly from Equation (1).
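Theorem 8. Let $k \ge 2$ and $d \ge 0$ be integers, and let $u$ be a word on $\mathbb{Z}_{\ge 0}$ of length $k - 1$. Let $v$ be a nonempty finite word on $\mathbb{Z}_{\ge 0} \cup \{0'\}$ whose first letter is $0'$ and whose remaining letters are integers. Let
$$\varphi(n) = \begin{cases} v\,\varphi(0) & \text{if } n = 0' \\ u\,(n+d) & \text{if } n \in \mathbb{Z}_{\ge 0}. \end{cases}$$
Then the sequence of letters in $\tau(\varphi^\infty(0'))$ is a $k$-regular sequence.

Proof. Let $w(i)$ be the letter at position $i$ in $\tau(\varphi^\infty(0'))$. Let $u(i)$ be the letter at position $i$ in $u$. We have
$$w(ki + |v| + r) = \begin{cases} u(r) & \text{if } 0 \le r \le k-2 \\ w(i) + d & \text{if } r = k-1 \end{cases} \tag{2}$$
for all $i \ge 0$. The set of sequences $\{w(k^e i + j)_{i \ge 0} : e \ge 0 \text{ and } 0 \le j \le k^e - 1\}$ is called the $k$-kernel of $w(i)_{i \ge 0}$. We show that the $\mathbb{Z}$-module generated by the $k$-kernel of $w(i)_{i \ge 0}$ is finitely generated, and hence $w(i)_{i \ge 0}$ is $k$-regular.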
First we consider sequences in the $k$-kernel of the form $w(k^e i + j)_{i \ge 0}$ where $j = -k^e q + \frac{k^e - 1}{k - 1}(|v| + k - 1)$ for some integer $q$. Since $0 \le j < k^e$, we have $Q - 1 < q \le Q$ (and therefore $q = \lfloor Q \rfloor$), where $Q = \frac{k^e - 1}{k^e (k - 1)}(|v| + k - 1)$. Since $Q$ approaches the finite limit $\frac{|v| + k - 1}{k - 1}$ as $e \to \infty$, for sufficiently large $e$ all the $q$ are the same. Moreover, $e$ applications of Equation (2) show that $w(k^e i + j) = w(i - q) + ed$ for all $i \ge q$. This isn't sufficient, because we would like to show that the terms $w(k^e i + j)$ are related for all $i \ge 0$. However, $N$ applications of Equation (2) show that
$$w(k^e i + j) = w\!\left(k^{e-N}(i - q) + \frac{k^{e-N} - 1}{k - 1}(|v| + k - 1)\right) + Nd,$$
provided the argument of $w$ on the right side is non-negative. To ensure this is the case, we solve for $N$ and find that
$$N = \left\lfloor e + \log_k\!\left(1 - \frac{q(k-1)}{|v| + k - 1}\right) \right\rfloor$$
suffices. (Note that the argument of $\log_k$ is less than 1, so $N < e$.) We conclude that $w(k^e i + j) - Nd$ is independent of $e$. Therefore, up to addition by multiples of $d$, there are only finitely many distinct sequences of the form $w(k^e i + j)_{i \ge 0}$, and the $\mathbb{Z}$-module they generate is finitely generated.
Every sequence in the $k$-kernel that is not of the form discussed in the previous paragraph is eventually constant, since iteratively applying Equation (2) shows that it is some multiple of $d$ plus a subsequence of one of $k - 1$ eventually constant sequences. Aside from the multiple of $d$, there are only finitely many distinct sequences of this form. So these sequences also generate a finitely generated $\mathbb{Z}$-module.
The value of $k$ for which a given sequence is $k$-regular is not unique; for each $\alpha \ge 1$, a sequence is $k$-regular if and only if it is $k^\alpha$-regular [2, Theorem 2.9]. Define an equivalence relation $\sim$ on $\mathbb{Z}_{\ge 2}$ in which $k \sim l$ if there exist positive integers $s$ and $t$ such that $k^s = l^t$. If $k \sim l$, then $k$ and $l$ are said to be multiplicatively dependent. Corollary 10 uses the following bound on the growth rate of letters in $w_{a/b}$ and a result of Bell [3] to show that if $w_{a/b}$ is $k$-regular then $k$ is unique up to multiplicative dependence.
Theorem 9. Let $a, b$ be relatively prime positive integers such that $a/b > 1$. Let $w(i)$ be the letter at position $i$ in $w_{a/b}$. Then $w(i) = O(\sqrt{i})$.

Proof. If $w_{a/b}$ is a word on a finite alphabet, then the conclusion clearly holds, so assume that every $n \ge 0$ occurs in $w_{a/b}$. Let $i_n$ be the position of the first occurrence of $n$ in $w_{a/b}$. We say that a word $v$ is a pre-$a/b$-power if $|v| = ma$ for some integer $m \ge 1$ and if there exists an integer $c$ such that the word obtained by changing the last letter of $v$ to $c$ is an $a/b$-power.
Since m 0 , . . . , m n−1 are distinct positive integers, it follows that i n − i n−1 ≥ M ≥ n. It follows that i n grows at least like n 2 , so w(i) = O( √ i).
Corollary 10. Let a, b be relatively prime positive integers such that a b > 1. The values of k for which w a/b is k-regular are equivalent modulo ∼.
Proof. Write $w_{a/b} = w(0)w(1)\cdots$. Bell's generalization [3] of Cobham's theorem implies that if $w(i)_{i \ge 0}$ is both $k$-regular and $l$-regular, where $k \ge 2$ and $l \ge 2$ are multiplicatively independent, then $\sum_{i \ge 0} w(i) x^i$ is the power series of a rational function whose poles are roots of unity. Therefore it suffices to show that $\sum_{i \ge 0} w(i) x^i$ is not such a power series. The coefficients of a rational power series whose poles are roots of unity are given by an eventual quasi-polynomial, that is, $d_e(i)\,i^e + \cdots + d_1(i)\,i + d_0(i)$ where each $d_j(i)_{i \ge 0}$ is eventually periodic. The sequence $w(i)_{i \ge 0}$ is not given by an eventual quasi-polynomial of degree $e = 0$ since it is not eventually periodic. Theorem 9 rules out each degree $e \ge 1$.
Corollary 10 suggests a partial function $\rho$ on the rationals $a/b > 1$, where $\rho(a/b)$ is the class (modulo $\sim$) of integers $k$ such that $w_{a/b}$ is $k$-regular, if this class exists. If this class does not exist, we leave $\rho(a/b)$ undefined. When $\rho(a/b)$ is defined, we will abuse notation and write $\rho(a/b) = k$, choosing one integer $k$ from the class. For example, $\rho(a) = a$ for each integer $a \ge 2$, and $\rho(3/2) = 6$. Theorems 1-5 imply the values $\rho(5/3) = 7$, $\rho(9/5) = 13$, $\rho(8/5) = 733$, $\rho(7/4) = 50847$, and $\rho(6/5) = 1001$. Conjecture 6 would imply $\rho(7/5) = 80874$. The proof of Theorem 8 applies to the morphism in Theorem 7, since $w(i)_{i \ge 0}$ is the only sequence in the 56-kernel that contains a 0 and is not eventually constant, so for the remaining sequences we can take $d = 2$ to be constant; therefore $\rho(4/3) = 56$. We do not know if $\rho(5/4)$ is defined.

Open question. For each rational number $a/b > 1$, does there exist $k$ such that $w_{a/b}$ is $k$-regular? In other words, is $\rho(a/b)$ defined?

We end this section by noting that there are two other natural notions of avoidance for fractional powers. For each notion we may define the lexicographically least word avoiding all words in the corresponding pattern.

Notation. Let $a$ and $b$ be relatively prime positive integers such that $a/b > 1$.
• $w_{\ge a/b}$ is the lexicographically least infinite word on $\mathbb{Z}_{\ge 0}$ avoiding $p/q$-powers for all $p/q \ge a/b$.
• $w_{> a/b}$ is the lexicographically least infinite word on $\mathbb{Z}_{\ge 0}$ avoiding $p/q$-powers for all $p/q > a/b$.
For an integer a ≥ 2, the structure of w ≥a is as follows.
Proposition 11. Let a ≥ 2 be an integer. Then w ≥a = w a .
Proof. Since the language of $a$-powers is a subset of the language of $(\ge a)$-powers, we have $w_{\ge a} \ge w_a$ lexicographically. Conversely, every $p/q$-power with $p/q \ge a$ contains an $a$-power, so $w_a$ does not contain any $p/q$-power with $p/q \ge a$; since $w_{\ge a}$ is the lexicographically least such word, it follows that $w_a \ge w_{\ge a}$ lexicographically. Therefore $w_{\ge a} = w_a$.
If b ≥ 2, it is possible to avoid a b -powers while containing a p q -power with p q > a b . For example, 001102 avoids 3 2 -powers but contains squares. For this reason, w ≥a/b and w a/b are not equal in general.
Proposition 12. Let $a, b$ be relatively prime positive integers such that $a/b > 1$ and $b \ge 2$. Then $w_{\ge a/b} \ne w_{a/b}$.
Proof. To show that $w_{\ge a/b}$ and $w_{a/b}$ are unequal, we compare their prefixes. We start with $w_{a/b}$. The word $0^{a-1}$ is $a/b$-power-free, because the shortest $a/b$-powers have length $a$. The word $0^a = (0^b)^{a/b}$ is an $a/b$-power, so the length-$a$ prefix of $w_{a/b}$ is $0^{a-1}1$. The word $w_{\ge a/b}$ begins with $0^{\lceil a/b \rceil - 1}$, since every $p/q$-power with $p/q \ge a/b$ has length at least $p \ge \lceil aq/b \rceil \ge \lceil a/b \rceil$. However, $0^{\lceil a/b \rceil}$ is a $\lceil a/b \rceil$-power, so $w_{\ge a/b}$ begins with $0^{\lceil a/b \rceil - 1}1$. Since $\lceil a/b \rceil < a$, the two words are not equal.
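The two notions of avoidance compared in Propositions 11 and 12 can be explored with one greedy routine parameterized by the forbidden-suffix test. This is a sketch with hypothetical names; the test for $w_{\ge a/b}$ relies on the observation that a suffix is a $p/q$-power with $p/q \ge a/b$ exactly when it has some period $P$ and length at least $\lceil aP/b \rceil$.

```python
def ends_with_exact_power(w, a, b):
    """True if some suffix of w is an a/b-power (length m*a with period m*b)."""
    n = len(w)
    return any(
        all(w[i] == w[i + m * b] for i in range(n - m * a, n - m * b))
        for m in range(1, n // a + 1)
    )


def ends_with_power_at_least(w, a, b):
    """True if some suffix of w is a p/q-power for some p/q >= a/b."""
    n = len(w)
    for period in range(1, n):
        length = (a * period + b - 1) // b  # ceil(a * period / b)
        if length > n:
            break
        if all(w[i] == w[i + period] for i in range(n - length, n - period)):
            return True
    return False


def greedy_prefix(has_bad_suffix, a, b, length):
    """Greedy prefix of the least word on Z_{>=0} none of whose prefixes has a bad suffix."""
    w = []
    while len(w) < length:
        w.append(0)
        while has_bad_suffix(w, a, b):
            w[-1] += 1
    return w


print(greedy_prefix(ends_with_exact_power, 3, 2, 12))
print(greedy_prefix(ends_with_power_at_least, 3, 2, 12))
```

For $a/b = 3/2$ the two prefixes already differ in the second letter, as the proof of Proposition 12 predicts, while for an integer exponent such as $a = 2$ the two routines agree on every prefix tried.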
The words $w_{\ge 3/2}$ and $w_{3/2}$ are generated by the same underlying 6-uniform morphism [7]. In particular, $w_{\ge 3/2}(5i+4) = w_{3/2}(i) + 3$ for all $i \ge 0$. The words $w_{\ge 4/3}$ and $w_{4/3}$ also appear to have essentially the same self-similar column. It would be interesting to know whether similar statements hold for other rationals. On the other hand, $w_{> a/b}$ and $w_{a/b}$ need not be related, even for $b = 1$. Guay-Paquet and Shallit [6] studied the overlap-free word $w_{>2} = 001001100100200100110010021001002001001100\cdots$ and exhibited a (non-uniform) morphism $\varphi$ such that $w_{>2} = \varphi^\infty(0)$. The words $w_{>2}$ and $w_2$ appear to be unrelated. It seems likely that the structure of $w_{> a/b}$ is typically more difficult to determine than $w_{a/b}$.

3. The intervals $a/b \ge 2$ and $5/3 \le a/b < 2$

It turns out that for $a/b \ge 2$ the lexicographically least $a/b$-power-free word is a word we have already seen. For example, one computes
$$w_{5/2} = 00001\;00001\;00001\;00001\;00002\;00001\;00001\;00001\;00001\;00002\cdots$$
and observes that $w_{5/2}$ agrees with $w_5$ on a long prefix. In fact these two words are the same.
Theorem 14. Let $a, b$ be relatively prime positive integers such that $a/b \ge 2$. Then $w_{a/b} = w_a$. In particular, $\rho(a/b) = a$.

Proof. We show that $w_{a/b}$ is $a$-power-free, which implies that $w_a \le w_{a/b}$ lexicographically, and that $w_a$ is $a/b$-power-free, which implies that $w_{a/b} \le w_a$ lexicographically.
Since the $a$-power $v^a$ is an $a/b$-power $(v^b)^{a/b}$ and $w_{a/b}$ is $a/b$-power-free, it follows immediately that $w_{a/b}$ is $a$-power-free.
Suppose toward a contradiction that $w_a$ contains an $a/b$-power. Let $v^{a/b}$ (where $|v|$ is divisible by $b$) be an $a/b$-power in $w_a$ of minimal length. Since $a/b \ge 2$, $v^2$ occurs in $w_a$. We have $|v^{a/b}| \ge a$, since $a$ and $b$ are relatively prime. On the other hand, the longest zero factor of $w_a$ is $0^{a-1}$, so $v$ contains at least one nonzero letter. The nonzero letters of $w_a = w(0)w(1)\cdots$ are $w(i)$ for $i \equiv a - 1 \pmod{a}$. Since nonzero letters occur spaced by multiples of $a$ and $v^2$ occurs in $w_a$, it follows that $|v|$ is divisible by $a$. Let $x$ be the word obtained by deleting the 0 letters of $v$. Then the word obtained by deleting the 0 letters of $v^{a/b}$ is $x^{a/b}$, which is also the word obtained by sampling every $a$ letters of $v^{a/b}$ starting from the first nonzero letter. By Equation (1) with $d = 1$, the word obtained by subtracting 1 from each letter in $x^{a/b}$ occurs in $w_a$, and since $|x| = |v|/a$ this contradicts the minimality of $v$.
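Theorem 14 can be spot-checked numerically for small cases. The sketch below, with hypothetical helper names, computes greedy prefixes of $w_{5/2}$ and $w_5$ and compares them; agreement on a prefix is only evidence, and the proof above is what establishes equality.

```python
def ends_with_power(w, a, b):
    """True if some suffix of w has length m*a and period m*b for some m >= 1."""
    n = len(w)
    return any(
        all(w[i] == w[i + m * b] for i in range(n - m * a, n - m * b))
        for m in range(1, n // a + 1)
    )


def least_power_free_prefix(a, b, length):
    """Greedy prefix of the lexicographically least a/b-power-free word on Z_{>=0}."""
    w = []
    while len(w) < length:
        w.append(0)
        while ends_with_power(w, a, b):
            w[-1] += 1
    return w


# Compare w_{5/2} with w_5 on the first 5^3 letters.
prefix_5_2 = least_power_free_prefix(5, 2, 125)
prefix_5 = least_power_free_prefix(5, 1, 125)
print(prefix_5_2 == prefix_5)
print("".join(map(str, prefix_5_2[:25])))
```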
In light of Theorem 14, the remainder of the paper is concerned with w a/b for 1 < a b < 2. Let Q (1,2) := Q ∩ (1, 2) denote the set of rational numbers in this interval. The following result relates the prefixes of w a/b to each other.
Proposition 15. The order on $\mathbb{Q}_{(1,2)} \cup \mathbb{Z}_{\ge 2}$ induced by the lexicographic order on

Proof. Let $a$ and $b$ be relatively prime positive integers such that $a/b \in \mathbb{Q}_{(1,2)} \cup \mathbb{Z}_{\ge 2}$. As in the proof of Proposition 12, the length-$a$ prefix of $w_{a/b}$ is $0^{a-1}1$. Therefore the rationals in $\mathbb{Q}_{(1,2)} \cup \mathbb{Z}_{\ge 2}$ with a given numerator form an interval in this order. If $a = 2$ then $b = 1$ and this prefix is followed by $02$. If $a \ge 3$ and $b = 1$ then this prefix is followed by $0^{a-1}1$. Finally, if $b > 1$ then this prefix is followed by $0^{a-b-1}1$, since $0^{a-1}\,1\,0^{a-b-1}$ is $a/b$-power-free but $0^{a-1}\,1\,0^{a-b}$ contains the $a/b$-power $0^{b-1}\,1\,0^{a-b} = (0^{b-1}1)^{a/b}$. This is sufficient information to distinguish all $w_{a/b}$ and hence order them. In particular, a

Recall our claims in Theorems 1 and 2 that $w_{5/3} = \varphi^\infty(0)$ for the 7-uniform morphism $\varphi(n) = 000010\,(n+1)$ and $w_{9/5} = \varphi^\infty(0)$ for the 13-uniform morphism $\varphi(n) = 000000001000\,(n+1)$. These morphisms differ only in their run lengths. It is not immediately obvious why 7 is the correct value of $k$ for $w_{5/3}$ and why 13 is the correct value for $w_{9/5}$. However, these values can be understood in the context of an infinite family of morphisms that generate words $w_{a/b}$.
Theorem 16. Let $a, b$ be relatively prime positive integers such that $5/3 \le a/b < 2$ and $\gcd(b, 2) = 1$. Let $\varphi$ be the $(2a - b)$-uniform morphism defined by $\varphi(n) = 0^{a-1}\,1\,0^{a-b-1}\,(n+1)$ for all $n \in \mathbb{Z}_{\ge 0}$. Then $\varphi$ is $a/b$-power-free, and $w_{a/b} = \varphi^\infty(0)$.

We devote the remainder of this section to a proof of Theorem 16. We will use the following concept. A morphism $\varphi$ is $a/b$-power-free if it preserves $a/b$-power-freeness; that is, if $w$ is $a/b$-power-free then $\varphi(w)$ is $a/b$-power-free. For example, the morphism $\varphi(n) = 0\,(n+1)$ is square-free [6]. Since the word $0$ is $a/b$-power-free, if $\varphi$ is an $a/b$-power-free morphism then $\varphi^\infty(0)$ is also $a/b$-power-free. If, moreover, $\varphi^\infty(0)$ is the lexicographically least $a/b$-power-free word, then $w_{a/b} = \varphi^\infty(0)$.

In the interval $1 < a/b < 2$, an $a/b$-power is a word of the form $(xy)^{a/b} = xyx$, where $|xy| = mb$ and $|xyx| = ma$ for some $m \ge 1$. It follows that $|x| = m(a - b)$ and $|y| = m(2b - a)$. We use this in the following proof and repeatedly throughout the paper. Note that $a/b < 2$ implies $y$ is not the empty word.
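For reference, the two lengths follow from the defining relations by subtraction; the short derivation below spells this out and evaluates it for the two exponents used as examples in the introduction.

```latex
\begin{align*}
|x| &= |xyx| - |xy| = ma - mb = m(a - b), \\
|y| &= |xy| - |x| = mb - m(a - b) = m(2b - a).
\end{align*}
% Examples:
%   a/b = 3/2 gives |x| = m and |y| = m,  so |x| = |y|;
%   a/b = 5/4 gives |x| = m and |y| = 3m, so 3|x| = |y|.
```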
Proof of Theorem 16. To show that ϕ is a b -power-free, we show that if w is a finite word such that ϕ(w) contains an a b -power, then w contains an a b -power. Suppose that v a/b is an a b -power factor of ϕ(w), where |v| = mb for some m ≥ 1. First consider m = 1. The word ϕ(w) is a finite concatenation of words of the form 0 a−1 1 0 a−b−1 (n + 1) for n ≥ 0. We would like to survey all length-a factors of ϕ(w) and verify that they are not a b -powers. To do this, it suffices to slide a window of length a through the circular word 0 a−1 1 0 a−b−1 (n + 1), since the length of this word is 2a − b > a. There are |ϕ(n)| = 2a − b length-a factors of 0 a−1 1 0 a−b−1 (n + 1), but we can partition them into the following five forms parameterized by i.
Therefore each length-$a$ factor of $\varphi(w)$ appears in the previous table for some $n \ge 0$ and some $i$ in the specified range. Each $a/b$-power of length $a$ is of the form $xyx$ where $|x| = a - b$ and $|y| = 2b - a$, so to check that no length-$a$ factor of $\varphi(w)$ is an $a/b$-power we refine the previous table by writing each factor as $xyz$ with $|x| = a - b$, $|y| = 2b - a$, and $|z| = a - b$. Since $n + 1 \ne 0$, we see that $x \ne z$ for each factor. (When $n = 0$, we have $x \ne z$ in the third row since $i \ne 2b - a + i$.) It follows that $\varphi(w)$ contains no $a/b$-power factor of length $a$.
Therefore $m \ge 2$. There are two cases. First consider the possibility that $x = 0^{m(a-b)}$. The longest zero factor of $\varphi(w)$ is $0^{a-1}$, so $m(a - b) < a$, which implies $m < \frac{a}{a-b} \le \frac{a}{a - (3/5)a} = \frac{5}{2}$. Therefore $m = 2$. The factor $y$ has at least one nonzero letter, because $x\,0^{2(2b-a)}\,x = 0^{2a}$ is not a factor of $\varphi(w)$. On the other hand, if $y$ has at least two nonzero letters, then, since the shortest maximal zero factor of $\varphi(w)$ is $0^{a-b-1}$, we have $2(2b - a) = |y| \ge 1 + (a - b - 1) + 1$, which implies $5b \ge 3a + 1$, but this contradicts $5/3 \le a/b$. Therefore $y$ has exactly one nonzero letter. The possible positions of $xyx$ in $\varphi(w)$ then give conditions on an index $i$ which produce a contradiction when solving for $i$.

Now consider the case that $x$ contains a nonzero letter. The word $0^{a-b-1}\,1\,0^{a-b-1}$ is the only nonzero word of its length that can occur in $\varphi(w)$ at two positions that are distinct modulo $k := |\varphi(n)| = 2a - b$, and extending this word in either direction determines its position modulo $k$; therefore two occurrences in $\varphi(w)$ of any nonzero word of length $\ge 2(a - b)$ have positions that are congruent modulo $k$. Since $|x| = m(a - b) \ge 2(a - b)$, the positions of the two occurrences of $x$ in $\varphi(w)$ are congruent modulo $k$. These two positions differ by $|xy| = mb$, so $k \mid mb$. This implies $k \mid m$, since $\gcd(b, k) = 1$. Therefore $k$ divides $|xyx|$. Now we shift $xyx$ appropriately. Let $j$ be the position of $xyx$ in $\varphi(w)$. Write $j = ki + r$ for some $0 \le r \le k - 1$. Let $x'y'z'$ be the word of length $|xyx|$ beginning at position $ki$, where $|x'| = |z'| = |x|$ and $|y'| = |y|$. We claim that $x' = z'$. Since $x'y'z'$ begins at position $ki$ and $|xy|$ is divisible by $k$, the words $x'$ and $z'$ agree on their first $r$ letters. The remaining $|x| - r$ letters of $x'$ are the first $|x| - r$ letters of $x$, and the remaining $|x| - r$ letters of $z'$ are the first $|x| - r$ letters of the second $x$. Therefore $x' = z'$, and we have found an $a/b$-power $x'y'x'$ beginning at position $ki$.
Since the positions of $y'$ and both occurrences of $x'$ are all divisible by $k$, we have $x' = \varphi(u)$ and $y' = \varphi(v)$ for some $u$ and $v$, which implies that the $a/b$-power $uvu$ is a factor of $w$.
It remains to show that $\varphi^\infty(0)$ is lexicographically least. We show that decrementing any nonzero letter in $\varphi^\infty(0)$ to any smaller number introduces an $a/b$-power factor ending at that position. Since $\varphi(n) = 0^{a-1}\,1\,0^{a-b-1}\,(n+1)$, the nonzero letters occur at positions congruent to $a - 1$ or $k - 1$ modulo $k$. The letter at each position congruent to $a - 1$ modulo $k$ is 1, and decrementing this 1 to 0 introduces the $a/b$-power $0^a = (0^b)^{a/b}$ ending at that position. The letter at a position congruent to $k - 1$ modulo $k$ is $n + 1$ for some $n \ge 0$. Consider the effect of decrementing $n + 1$ to $c$ for some $0 \le c \le n$. If $c = 0$, then this introduces the $a/b$-power $0^{b-1}\,1\,0^{a-b} = (0^{b-1}1)^{a/b}$. Let $c \ge 1$, and assume that decrementing any letter to $c - 1$ introduces an $a/b$-power ending at this $c - 1$. Let $\varphi(w)$ be a prefix of $\varphi^\infty(0)$ with last letter $n + 1$. Then $w$ is a prefix of $\varphi^\infty(0)$ with last letter $n$. Decrementing $n + 1$ to $c$ produces the word $\varphi(w')$, where $w'$ is the word obtained by decrementing the last letter of $w$ to $c - 1$. By the inductive assumption, $w'$ contains an $a/b$-power suffix; therefore $\varphi(w')$ does as well.
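Both halves of the argument, power-freeness and minimality, can be tested on a finite prefix of $\varphi^\infty(0)$. The sketch below uses hypothetical function names and checks the two cases stated explicitly in Theorems 1 and 2, namely $a/b = 5/3$ and $a/b = 9/5$; it verifies that no prefix ends in an $a/b$-power and that every smaller choice of letter at any position would create one.

```python
def ends_with_power(w, a, b):
    """True if some suffix of w is an a/b-power (length m*a with period m*b)."""
    n = len(w)
    return any(
        all(w[i] == w[i + m * b] for i in range(n - m * a, n - m * b))
        for m in range(1, n // a + 1)
    )


def phi(n, a, b):
    """Image of the letter n under the (2a - b)-uniform morphism 0^(a-1) 1 0^(a-b-1) (n+1)."""
    return [0] * (a - 1) + [1] + [0] * (a - b - 1) + [n + 1]


def fixed_point_prefix(a, b, length):
    """Prefix of the fixed point of phi that begins with 0."""
    w = [0]
    while len(w) < length:
        w = [letter for n in w for letter in phi(n, a, b)]
    return w[:length]


def is_power_free_and_least(a, b, length):
    """Check on a prefix that no a/b-power ends at any position and that
    replacing any letter by a smaller one creates an a/b-power suffix."""
    w = fixed_point_prefix(a, b, length)
    for end in range(1, length + 1):
        prefix = w[:end]
        if ends_with_power(prefix, a, b):
            return False
        for c in range(prefix[-1]):
            if not ends_with_power(prefix[:-1] + [c], a, b):
                return False
    return True


print(is_power_free_and_least(5, 3, 343))
print(is_power_free_and_least(9, 5, 338))
```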
4. $a/b$-power-free morphisms

Theorem 16 establishes the structure of $w_{a/b}$ for an infinite family of rationals $a/b$. It turns out there are additional families of words whose structure is given by a symbolic morphism, and there are also many words $w_{a/b}$ whose structure is given by a morphism that has not been found to belong to a general family. (Additionally, as in Theorem 5, sometimes the word $w_{a/b}$ is not a fixed point of a uniform morphism but is nonetheless related to a uniform morphism.) Ideally, we would prove a single theorem that captures all these cases. However, the structures are diverse enough that it is not clear how to unify them. The next best thing, then, is to identify a general proof scheme so that each individual proof may be carried out automatically. In this section we describe how to automatically verify that a morphism is $a/b$-power-free. We then apply this method to 30 symbolic morphisms.
The basic idea is that we use the special form of the morphisms to reduce the statement that $\varphi$ is $a/b$-power-free to a finite case analysis and then develop software to carry out the case analysis. In the case of an explicit rational number (as in Theorems 3 and 4) this is more or less straightforward using the results below. However, for parameterized morphisms that are symbolic in $a$ and $b$ (as in Theorem 16), this can require a significant amount of symbolic computation.

4.1. Bounding the factor length. As in the proof of Theorem 16, to show that $\varphi$ is $a/b$-power-free, we must verify that if $\varphi(w)$ contains an $a/b$-power then $w$ contains an $a/b$-power. In this subsection we reduce this task to the task of verifying the statement for factors of length $am$ for only finitely many values of $m$. We use the following concept, which is related to the synchronization delay introduced by Cassaigne [5].
Definition. Let $k \ge 2$ and $\ell \ge 1$. Let $\varphi$ be a $k$-uniform morphism on $\Sigma$. We say that $\varphi$ locates words of length $\ell$ if for each word $x$ of length $\ell$ there exists an integer $j$ such that, for all $w \in \Sigma^*$, every occurrence of the factor $x$ in $\varphi(w)$ begins at a position congruent to $j$ modulo $k$.
If $\varphi$ locates words of length $\ell$, then $\varphi$ also locates words of length $\ell + 1$, since if $|x| = \ell + 1$ then the position of the length-$\ell$ prefix of $x$ is determined modulo $k$.
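The definition can be explored by brute force for the morphism $\varphi(n) = 0^{a-1}\,1\,0^{a-b-1}\,(n+1)$ from Section 3. The sketch below is only a heuristic check: the function name `locates` is hypothetical, and the infinite alphabet is truncated to the letters $0, \dots, $ `max_letter`, which is an assumption of the sketch rather than part of the definition.

```python
from itertools import product


def phi(n, a, b):
    """Image of n under the (2a - b)-uniform morphism 0^(a-1) 1 0^(a-b-1) (n+1)."""
    return [0] * (a - 1) + [1] + [0] * (a - b - 1) + [n + 1]


def locates(a, b, ell, max_letter=3):
    """Check, for input letters 0..max_letter only, that every length-ell factor
    of phi(w) occurs at a single residue modulo k = 2a - b."""
    k = 2 * a - b
    # Number of consecutive images needed to cover a window starting at any residue.
    blocks = (k - 1 + ell + k - 1) // k
    residues = {}
    for letters in product(range(max_letter + 1), repeat=blocks):
        image = [x for n in letters for x in phi(n, a, b)]
        for r in range(k):
            factor = tuple(image[r:r + ell])
            residues.setdefault(factor, set()).add(r)
    return all(len(found) == 1 for found in residues.values())


print(locates(5, 3, 5))  # windows of length a
print(locates(5, 3, 3))  # the word 010 occurs at two different residues
```

With $a/b = 5/3$, windows of length 3 are not located because $010$ occurs at two residues modulo 7, whereas windows of length $a = 5$ pass the truncated check.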
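Let $k \ge 2$ and $d \ge 0$ be integers, let $u$ be a word on $\mathbb{Z}_{\ge 0}$ of length $k - 1$, and let $\varphi$ be the $k$-uniform morphism defined by $\varphi(n) = u\,(n+d)$. If, for all integers $n \ge 0$ and all integers $a \ge 2$, the word $u\,(n+d)$ is not an $a$-power, then $\varphi$ locates words of length $k$.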
for some n ≥ 0 and 0 ≤ i ≤ k − 1. Suppose v occurs elsewhere in ϕ(w). Then without loss of generality we have If i = 0 then n = n and the two occurrences of v begin at positions that are congruent modulo k.
Therefore n = n, and, since u (n + d) is not an a-power, we have i = 0.
An $a/b$-power $(xy)^{a/b} = xyx$ contains two occurrences of $x$, so if a morphism $\varphi$ locates words of some length, then the length of $xy$ for sufficiently long $a/b$-powers in $\varphi(w)$ is constrained to be divisible by $k$. In Lemma 18 and Proposition 19 we use this to bound the length of factors of $\varphi(w)$ that we must verify are not $a/b$-powers in order to conclude that $\varphi$ is $a/b$-power-free.

Lemma 18. Let $a, b$ be relatively prime positive integers such that $1 < a/b < 2$. Let $k \ge 2$ such that $\gcd(b, k) = 1$, and let $\ell \ge 1$. Let $\varphi$ be a $k$-uniform morphism on a (finite or infinite) alphabet $\Sigma$ such that
• $\varphi$ locates words of length $\ell$, and
• for all $n, n' \in \Sigma$, the words $\varphi(n)$ and $\varphi(n')$ differ in at most one position.
Then $w$ contains an $a/b$-power whenever $\varphi(w)$ contains an $a/b$-power $(xy)^{a/b} = xyx$ with $|x| \ge \ell$.
The proof generalizes the case m ≥ 2 in the proof of a b -power-freeness in Theorem 16.
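Let $j$ be the position of $xyx$ in $\varphi(w)$, and write $|xyx| = ma$, so that $|xy| = mb$ and $|x| = m(a - b)$. Since $|x| \geq \ell$ and $\varphi$ locates words of length $\ell$, the two occurrences of $x$ begin at positions that are congruent modulo $k$, so $k$ divides $|xy| = mb$. Since $\gcd(b, k) = 1$, it follows that $k$ divides $m$ and hence also $|x|$. Write $j = k i_1 + r$ for some $0 \leq r \leq k - 1$. Then $y$ begins at position $j + |x| = k i_2 + r$ and the second $x$ begins at position $j + |xy| = k i_3 + r$ for some $i_2, i_3$. Since $\varphi(n)$ and $\varphi(n')$ differ in at most one position, adjacent factors of length $k$ in $\varphi(w)$ differ in at most one position, so we can slide a window of length $|xyx|$ either to the left or to the right from $xyx$ and obtain an $a/b$-power factor $x'y'x'$ beginning at position $k i_1$ or $k i_1 + k$ with $|x'| = |x|$ and $|y'| = |y|$. Since the positions of $y'$ and both occurrences of $x'$ are all divisible by $k$, we have $x' = \varphi(u)$ and $y' = \varphi(v)$ for some $u$ and $v$, which implies that $uvu = (uv)^{a/b}$ is a factor of $w$.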
Proposition 19. Assume the hypotheses of Lemma 18. Let $I_{\min} \in \mathbb{Q}$ such that

Since $m$ is an integer, we have $m \leq \left\lceil \frac{c \cdot I_{\min} - d}{I_{\min} - 1} \right\rceil - 1$. This contradicts an assumption, so $\varphi(w)$ is $a/b$-power-free and hence $\varphi$ is $a/b$-power-free.

Given particular values of $\ell$ and $a/b$, we may have several choices for $c$ and $d$, and there are many choices for $I_{\min}$. But we will be applying Proposition 19 to families of morphisms for which $I_{\min}$, $c$, and $d$ are fixed.
Example. Consider the morphism in Theorem 16, for which $k = 2a - b$. This morphism locates words of length $\ell = a$, as one can verify with the assistance of the table of length-$a$ factors of $\varphi(w)$ in the proof of Theorem 16. To show that $\varphi$ is $a/b$-power-free, by Proposition 19 it suffices to verify for every $a/b$-power-free $w$ that $\varphi(w)$ contains no $a/b$-power factors of length $ma$ for $m \leq m_{\max} = \left\lceil \frac{c \cdot I_{\min} - d}{I_{\min} - 1} \right\rceil - 1$.

Under some mild conditions, it turns out that the relevant factors of $\varphi(w)$ are sufficiently short that $w$ is necessarily $a/b$-power-free. Namely, any factor of $\varphi(w)$ of length $ma$ is necessarily a factor of $\varphi(v)$ for some word $v$ of length $\left\lceil \frac{ma - 1}{k} \right\rceil + 1$, and the following lemma shows that $\left\lceil \frac{m_{\max} a - 1}{k} \right\rceil + 1 \leq a - 1$. Since the shortest $a/b$-powers have length $a$, this guarantees that $v$ is $a/b$-power-free. Therefore, in verifying the hypotheses of Proposition 19, we will replace "for every $a/b$-power-free word $w$" with "for every word $w$"; we need not expend the computational effort to determine whether each $w$ is $a/b$-power-free, since it is too short to contain an $a/b$-power.

Lemma 20. Let $s, t, m_{\max} \in \mathbb{Z}_{\geq 0}$ and $I_{\min}, I_{\max} \in \mathbb{Q}$ such that $s \neq 0$ and $1 < I_{\min} < I_{\max} \leq 2$. Let $a_{\min} := \min\{a \geq 1 :$

Let $a, b$ be relatively prime positive integers such that $I_{\min} < a/b < I_{\max}$ and $\gcd(b, s) =$

Proof. The condition $s \neq 0$ implies $k \neq 0$, so we can divide by $k$. The condition $1 < I_{\min} < I_{\max} \leq 2$ implies $a_{\min} \geq 3$. We have

This now implies $\frac{m_{\max} a}{k} < a - 2$, which implies $\left\lceil \frac{m_{\max} a}{k} \right\rceil \leq a - 2$, and therefore $\left\lceil \frac{m_{\max} a - 1}{k} \right\rceil + 1 \leq a - 1$.

There will be no difficulty in satisfying the conditions of Lemma 20 for the morphisms we encounter.
Example (continued). The $(2a - b)$-uniform morphism $\varphi$ in Theorem 16 locates words of length $a$, so $s = 2$, $t = 1$ and $c = 1$, $d = 0$. Earlier we computed $m_{\max} = 2$ for this morphism on the interval $5/3 < a/b < 2$. The smallest numerator of a rational number in the interval $5/3 < a/b < 2$ with an odd denominator is $a_{\min} = 9$, so we have $\left(s - \frac{t}{I_{\min}}\right)(a_{\min} - 2) = \frac{49}{5} \geq 2 = m_{\max}$ and Lemma 20 applies. To conclude that $\varphi$ is $a/b$-power-free, it remains to verify that no factor of $\varphi(w)$ of length $a$ or $2a$ is an $a/b$-power. (Note that in the proof of Theorem 16 we only checked factors of length $a$ by a table, suggesting the better bound $m_{\max} = 1$. But there we used a slightly different argument, treating separately the case where $x$ consists only of zeros, which allowed the argument for large $m$ to apply to factors of length $2a$.)

We would like to use Proposition 19 to prove, with as much automation as possible, that a given morphism $\varphi$, symbolic in $a$ and $b$, is $a/b$-power-free. For now we assume that the interval restricting $a/b$ is also given. The main steps required are the following.
(1) Identify an integer $\ell$ such that $\varphi$ locates words of length $\ell$.
(2) Verify that for every word $w \in \Sigma^*$ the word $\varphi(w)$ contains no $a/b$-power of length $ma$ with $1 \leq m \leq m_{\max}$, where $m_{\max}$ is determined by $\ell$.
Step (1) can be accomplished simply by using Lemma 17, but to carry out the computations for morphisms of moderate size we need to obtain a smaller value for $\ell$; we return to this in Section 4.4.
Step (2) works as in the proof of Theorem 16 by listing, for each $m$, all symbolic factors of length $ma$ and verifying that none are $a/b$-powers. We discuss the details in Sections 4.2 and 4.3. At this point the reader may be interested in looking at the theorems in Section 4.5 that we have proved with this approach.
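The numbers in the continued example can be reproduced with exact rational arithmetic. The Python sketch below is illustrative only. It assumes that the bound is $m_{\max} = \lceil (c \cdot I_{\min} - d)/(I_{\min} - 1) \rceil - 1$, a reading that agrees with the value $m_{\max} = 2$ reported above, and that $a_{\min}$ is the smallest numerator of a reduced fraction $a/b$ in the interval with $\gcd(b, s) = 1$, as described in the example. The function names and the search limit are hypothetical.

```python
from fractions import Fraction
from math import ceil, gcd

def m_max(c, d, i_min):
    """Largest integer m with m < (c*I_min - d)/(I_min - 1) (assumed formula)."""
    bound = (c * i_min - d) / (i_min - 1)
    return ceil(bound) - 1

def a_min(s, i_min, i_max, search_limit=10_000):
    """Smallest numerator a of a reduced fraction a/b with
    I_min < a/b < I_max and gcd(b, s) = 1 (assumed definition)."""
    for a in range(1, search_limit):
        for b in range(1, a):
            if (gcd(a, b) == 1 and gcd(b, s) == 1
                    and i_min < Fraction(a, b) < i_max):
                return a
    raise ValueError("no admissible rational found below search_limit")

# Values from the example: k = 2a - b, locating length a, interval 5/3 < a/b < 2.
i_min, i_max = Fraction(5, 3), Fraction(2)
s, t, c, d = 2, 1, 1, 0
mm = m_max(c, d, i_min)
am = a_min(s, i_min, i_max)
lhs = (s - t / i_min) * (am - 2)
print(mm, am, lhs, lhs >= mm)
```

Using `Fraction` rather than floating point avoids rounding problems when the bound is an exact integer, in which case the strict inequality matters.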

4.2. Listing factors of a given length. In the proof of Theorem 16 we generated a table of all possible length-$a$ factors of words of the form $\varphi(w)$. The idea behind the automatic generation of tables like this is that we slide a window of some length through the infinite word $\varphi(n)\varphi(n)\varphi(n)\cdots$ and stop when we reach a factor that is identical to the first. In Theorem 16, the window was short enough relative to $|\varphi(n)|$ that each factor contained at most one letter $n + 1$. In general, this may not be the case; since we are considering arbitrary words we must slide a window through the word $\varphi(n_0)\varphi(n_1)\varphi(n_2)\cdots$.
However, it suffices to slide a window through the periodic word $\varphi(n)\varphi(n)\cdots$, stopping as before, and simply rename each occurrence of $n$ in a factor to be a unique symbol $n_i$ before performing a test on that factor (such as determining whether it is an $a/b$-power). When $a/b$ is an explicit rational number, $\varphi(n)$ is a word of some explicit integer length. In this case, sliding a window through $\varphi(n)\varphi(n)\cdots$ is trivial; we simply increment the starting and ending positions of the window by 1 at each step.
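For an explicit rational, the window procedure just described can be sketched in a few lines of Python. This is an illustration under assumptions made here: $\varphi(n)$ is given as a list whose symbolic letter is represented by the marker string `"n"`, and the function name is hypothetical. Each copy of $\varphi(n)$ receives its own symbol, playing the role of $n_i$.

```python
def window_factors(phi_n, length, marker="n"):
    """List the factors of the given length of phi(n)phi(n)..., one for each
    starting position modulo k = len(phi_n).

    Occurrences of `marker` (standing for the symbolic letter of phi(n)) are
    renamed to (marker, i), where i is the index of the copy of phi(n) in
    which the occurrence lies, so that distinct copies carry distinct symbols.
    """
    k = len(phi_n)
    factors = []
    for start in range(k):  # after k steps the window returns to the first factor
        factor = []
        for pos in range(start, start + length):
            letter = phi_n[pos % k]
            if letter == marker:
                letter = (marker, pos // k)
            factor.append(letter)
        factors.append(tuple(factor))
    return factors

# Hypothetical 4-uniform morphism phi(n) = 0 0 1 (n + d), window length 5.
for f in window_factors([0, 0, 1, "n"], 5):
    print(f)
```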
For symbolic $a/b$, we build a table as in Theorem 16, where each symbolic factor is parameterized by $i$ in some interval. We treat $\varphi(n)\varphi(n)\cdots$ as a queue. Suppose the window length is $ma$. To compute the prefix of $\varphi(n)\varphi(n)\cdots$ of length $ma$, we begin with $f = \epsilon$, the empty word, and record the number $ma - |f|$ of remaining letters to add. Initially there are $ma$ remaining letters. At each step, we have a block $c^l$ of $l$ identical letters $c$ at the beginning of the queue, and we need to determine whether to add the entire block or just a part of it to our factor. If $ma - |f| \geq l$, then we take the entire block; otherwise we take the partial block $c^{ma - |f|}$. If we are creating a table of factors that are further factored into subfactors of lengths $m \cdot (a - b)$, $m \cdot (2b - a)$, and $m \cdot (a - b)$, then we do this procedure once for each subfactor.
To slide the window to the right, we add a parameter $i$ to the run lengths of the current factor. We must determine the maximum value $i_{\max}$ such that sliding the window through $i_{\max}$ positions maintains factors whose run-length encodings differ from the current factor only in their exponents. The value of $i_{\max}$ is the minimum of the first block length in $f$ and the first block length in the queue (or, if we are factoring $f$ into three subfactors, the minimum of the first block length in each of the three subfactors as well as the first block length in the queue). Drop $i$ letters from the front of each subfactor, and add those $i$ letters onto the end of the preceding subfactor (or throw them away from the first subfactor). This gives a factor parameterized by $i$ for $1 \leq i \leq i_{\max}$ (or $0 \leq i \leq i_{\max}$ for the first factor), which we add to the list of factors. (If an interval contains only one point and the next interval contains more than one point, we can merge them to get an interval $0 \leq i \leq i_{\max}$ as in the tables of Theorem 16.) Then replace $i$ with $i_{\max}$ in the factor, drop $i_{\max}$ letters from the front of the queue, and repeat.
The run lengths for the symbolic morphisms we encounter are linear combinations of $a$, $b$, $1$. Therefore to compute $i_{\max}$ we must be able to compute the minimum of two such expressions over an interval. For example, if $5/3 < a/b < 2$ and $a, b \in \mathbb{Z}_{\geq 1}$ then $\min(2a - 2b, a - 1) = 2a - 2b$.
If the minimum is not equal to either of its arguments on the entire interval, then we split the interval. For example, when we encounter $\min(a - b - 1, -4a + 7b)$ for the interval $3/2 < a/b < 5/3$, we solve the homogeneous equation $a - b = -4a + 7b$ to find $a/b = 8/5$ and split the interval into three subintervals $3/2 < a/b < 8/5$, $a/b = 8/5$, and $8/5 < a/b < 5/3$. We continue breaking up subintervals until we can compute the symbolic factors of length $ma$ for $a/b$ in each subinterval. Later, when we compute factors of length $(m + 1)a$, we start with the set of subintervals obtained for length $ma$.
Even though the length has changed, empirically it seems that most of the same subintervals reappear, so this saves the work of recomputing them.
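The interval-splitting step can be illustrated with a short Python sketch. It handles only the homogeneous parts $pa + qb$ of two run lengths, as in the example with $a - b$ and $-4a + 7b$. The function names and the representation of a form as a pair `(p, q)` are assumptions made for this illustration.

```python
from fractions import Fraction

def crossing_ratio(f, g):
    """Given homogeneous forms f = (p1, q1) and g = (p2, q2), standing for
    p*a + q*b, return the ratio a/b at which f = g, or None if no single
    ratio exists (equal coefficients of a)."""
    (p1, q1), (p2, q2) = f, g
    if p1 == p2:
        return None
    # p1*a + q1*b = p2*a + q2*b  <=>  (p1 - p2)*a = (q2 - q1)*b
    return Fraction(q2 - q1, p1 - p2)

def split_interval(lo, hi, f, g):
    """Split the open interval lo < a/b < hi at the crossing point of f and g,
    if that point lies strictly inside the interval."""
    r = crossing_ratio(f, g)
    if r is None or not (lo < r < hi):
        return [(lo, hi)]
    return [(lo, r), (r, r), (r, hi)]  # the middle piece is the single point r

# The example from the text: a - b versus -4a + 7b on 3/2 < a/b < 5/3.
pieces = split_interval(Fraction(3, 2), Fraction(5, 3), (1, -1), (-4, 7))
print(pieces)
```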

4.3. Testing inequality of symbolic words. Once we have generated all possible factors of $\varphi(w)$ of length $ma$, we must verify that, for all values of parameters that appear ($n$, $a$, $b$, and any interval parameters $i$, $j$), each factor is not an $a/b$-power. Since we have factored each word as $xyz$ with $|x| = |z| = m \cdot (a - b)$, it suffices to verify that $x \neq z$.
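For explicit words, as opposed to the symbolic words treated in this section, the corresponding test is a direct comparison: a word of length $ma$ is an $a/b$-power exactly when its prefix and suffix of length $m(a - b)$ agree. A minimal Python sketch, with hypothetical function names and assuming $1 < a/b < 2$ with $\gcd(a, b) = 1$:

```python
def is_ab_power(word, a, b):
    """Return True if `word` is an a/b-power xyx with |xyx| / |xy| = a/b.

    Assumes 1 < a/b < 2 and gcd(a, b) = 1, so |xyx| = m*a, |xy| = m*b,
    and |x| = m*(a - b) for some integer m >= 1.
    """
    n = len(word)
    if n == 0 or n % a != 0:
        return False
    m = n // a
    x_len = m * (a - b)
    return word[:x_len] == word[n - x_len:]

def contains_ab_power(word, a, b):
    """Return True if some factor of `word` is an a/b-power."""
    n = len(word)
    for m in range(1, n // a + 1):
        window = m * a
        for i in range(n - window + 1):
            if is_ab_power(word[i:i + window], a, b):
                return True
    return False

print(is_ab_power([0, 1, 0], 3, 2), contains_ab_power([0, 0, 1, 1], 3, 2))
```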
We have not developed a decision procedure to decide if there exist parameter values for which two symbolic words are equal. For our purposes it is sufficient to show that pairs of symbolic words we encounter are unequal for all parameter values. We have implemented a number of criteria under which this is true. For example, if two words have identical prefixes (or suffixes), we can remove the common factor and recursively test inequality of the remaining factors. If the first letters or last letters in two words are unequal, then the words are unequal.
Another criterion is the following. Delete all explicit 0 letters in both words. If all remaining letters are unequal to 0 and the two new words are unequal, then the original words are unequal. It may happen that deleting 0s does not result in words that are unequal. For example, deleting 0s in the words $0^{352a-621b-i-1}\,1\,0^{-51a+91b-1}\,(n+1)\,0^{i}$ and $0^{-51a+91b-j-1}\,(n+1)\,0^{352a-621b-1}\,1\,0^{j}$ produces $1\,(n+1)$ and $(n+1)\,1$, which are not unequal if $n = 0$. We may still conclude the original words are unequal if the system of equalities of the corresponding deleted block lengths has no solution. In this example, $-51a + 91b - 1 \neq 352a - 621b - 1$ on the interval $30/17 < a/b < 53/30$.

We use our collection of inequality criteria to verify that no symbolic factor represents an $a/b$-power, for the list of factors obtained on each subinterval by the process in Section 4.2. Since the subintervals are open intervals, we must also verify their endpoints. That is, when $a/b$ is an endpoint of a subinterval and satisfies the conditions of the theorem we are trying to prove (that is, it lies in the interval and is not eliminated by a gcd condition), then we check that the factors for this value of $a/b$ (which have explicit integer run lengths) are not $a/b$-powers.

4.4. Determining a locating length. The final step to automate is the identification of an integer $\ell$ such that $\varphi$ locates words of length $\ell$. Lemma 18 and Proposition 19 show that the length of words that a morphism $\varphi$ locates affects the length of factors we must check. Lemma 17 gives one possible length, but often we can find a smaller length. For example, as mentioned in Section 4.1, the $(2a - b)$-uniform morphism in Theorem 16 not only locates words of length $2a - b$ but also locates words of length $a$.
Given a morphism $\varphi$ and an interval $I_{\min} < a/b < I_{\max}$, we determine $\ell$ as follows. Generate all linear combinations $ca - db$ with $10 \geq c \geq d \geq 0$. The upper bound 10 is sufficient for all morphisms we encounter below; for a more general upper bound, one could use $s$, where $k = sa - tb$, and then we are guaranteed to find a suitable $\ell$ under the conditions of Lemma 17. Eliminate any linear combinations that do not satisfy the hypothesis of Lemma 20, where $m_{\max} = \left\lceil \frac{c \cdot I_{\min} - d}{I_{\min} - 1} \right\rceil - 1$. Then sort the remaining linear combinations by the upper bound $m_{\max}$ that each would imply if $\varphi$ locates words of that length. Starting with the lowest potential bounds, test whether $\varphi$ locates words of each length until a length is found.
To determine whether $\varphi$ locates words of a given length, use the procedure described in Section 4.2 to compute all symbolic factors of $\varphi(n)\varphi(n)\cdots$ of the candidate length. As before, this may involve breaking up the interval $I_{\min} < a/b < I_{\max}$ into subintervals. Then check whether each pair of symbolic factors is unequal using the tests described in Section 4.3.

4.5. Symbolic $a/b$-power-free morphisms. We now give 30 symbolic $a/b$-power-free morphisms defined on the alphabet $\mathbb{Z}_{\geq 0}$. With the exception of Theorem 27, these morphisms were discovered empirically from prefixes of words $w_{a/b}$. In Section 4.6 we discuss the details. We list morphisms in order of increasing number of nonzero letters in $\varphi(n)$. The 30 intervals are shown in Figure 3.
All 30 theorems have the same form. Each concerns a $k$-uniform morphism $\varphi$ parameterized by $a$ and $b$ with $\gcd(a, b) = 1$. The ratio $a/b$ is restricted to some interval, the lower endpoint of which is $I_{\min}$, and we have $k = sa - tb$ for some integers $s, t$. There is a divisibility condition on $b$ coming from Lemma 18, namely $\gcd(b, k) = 1$. Since $a$ and $b$ are relatively prime, $\gcd(b, k) = 1$ is equivalent to $\gcd(b, s) = 1$; we write the latter since it is more explicit.
For each morphism, $a/b$-power-freeness is proved completely automatically. The computations are available from the web site of the second-named author.$^{2}$ Theorems 21 and 22 are each proved in approximately 1 second, including the time required to find $\ell$. However, longer morphisms require more time; Theorem 50 took more than 7 hours to prove on a modern laptop (although this could be reduced by computing in parallel). Theorems 21 and 31 include the lower endpoint of their intervals; a separate step establishes each theorem at this endpoint.
The first theorem concerns the morphism in Theorem 16. We have already proven that $\varphi$ is $a/b$-power-free, but we include it here for completeness. Previously we showed that $\varphi$ locates words of length $a$; our automatic procedure reduces this length.
with 12 nonzero letters, locates words of length $5a - 3b$ and is $a/b$-power-free.

Theorem 30. Let $a, b$ be relatively prime positive integers such that $6/5 < a/b < 5/4$ and $\gcd(b, 7) = 1$. Then the $(7a - 4b)$-uniform morphism

with 13 nonzero letters, locates words of length $5a - 5b$ and is $a/b$-power-free.
with 14 nonzero letters, locates words of length $6a - b$ and is $a/b$-power-free.

Theorem 33. Let $a, b$ be relatively prime positive integers such that $7/5 < a/b < 10/7$ and $\gcd(b, 6) = 1$. Then the $(6a - b)$-uniform morphism

with 14 nonzero letters, locates words of length $2a$ and is $a/b$-power-free.

Theorems 32 and 33 contain a new obstacle, which is that their morphisms do not satisfy the conditions of Lemma 17 on the given intervals. Namely, for the morphism $\varphi$ in Theorem 32, if $a/b = 17/12$ then the word $\varphi(0)$ is a square, and therefore $\varphi$ does not locate words of any length $\ell$. This rational does not satisfy $\gcd(b, 6) = 1$, so it is excluded by the hypotheses, but the algorithm for finding $\ell$ does not take this into account. In practice, when the algorithm is unable to find a suitable $\ell$, we perform a separate search for obstructions, which reveals the square for $a/b = 17/12$. (This separate search could also be performed preemptively, but it is not exhaustive since we only check rationals with small denominators.) Adding the assumption $a/b \neq 17/12$ as input then effectively causes $17/12$ to become an interior endpoint, splitting the interval into the two subintervals $11/8 < a/b < 17/12$ and $17/12 < a/b < 3/2$. Similarly, the morphism in Theorem 33 produces a square $\varphi(0)$ for $a/b = 65/46$; again this rational does not satisfy the gcd condition but must be taken into account to find $\ell$. For some morphisms (namely, those in Theorems 38, 43, 46, 48, and 50) there exist rationals for which $\varphi(0)$ is a perfect power but that are not excluded by the gcd condition. These rationals must be added to the hypotheses, and this is the second reason why there might be exceptional rationals in an interval.
Theorem 40. Let $a, b$ be relatively prime positive integers such that $15/13 < a/b < 22/19$ and $\gcd(b, 10) = 1$. Then the $(10a - 5b)$-uniform morphism

with 30 nonzero letters, locates words of length $a$ and is $a/b$-power-free.

Theorem 41. Let $a, b$ be relatively prime positive integers such that $9/7 < a/b < 4/3$ and $\gcd(b, 24) = 1$. Then the $(24a - 15b)$-uniform morphism

with 37 nonzero letters, locates words of length $2a$ and is $a/b$-power-free.

Theorem 42. Let $a, b$ be relatively prime positive integers such that $10/9 < a/b < 19/17$ and $\gcd(b, 12) = 1$. Then the $(12a - 7b)$-uniform morphism

with 38 nonzero letters, locates words of length $a$ and is $a/b$-power-free.

Theorem 43. Let $a, b$ be relatively prime positive integers such that $11/10 < a/b < 21/19$ and $a/b \neq 32/29$ and $\gcd(b, 13) = 1$. Then the $(13a - 8b)$-uniform morphism

with 42 nonzero letters, locates words of length $a$ and is $a/b$-power-free.

Theorem 44. Let $a, b$ be relatively prime positive integers such that $12/11 < a/b < 23/21$ and $\gcd(b, 14) = 1$. Then the $(14a - 9b)$-uniform morphism

with 46 nonzero letters, locates words of length $10a - 10b$ and is $a/b$-power-free.

Theorem 45. Let $a, b$ be relatively prime positive integers such that $23/21 < a/b < 34/31$ and $\gcd(b, 14) = 1$. Then the $(14a - 9b)$-uniform morphism

with 46 nonzero letters, locates words of length $a$ and is $a/b$-power-free.

Theorem 46. Let $a, b$ be relatively prime positive integers such that $9/8 < a/b < 26/23$ and $a/b \neq 35/31$ and $\gcd(b, 13) = 1$. Then the $(13a - 5b)$-uniform morphism

with 54 nonzero letters, locates words of length $a$ and is $a/b$-power-free.
Theorem 47. Let $a, b$ be relatively prime positive integers such that $11/9 < a/b < 16/13$ and $\gcd(b, 38) = 1$. Then the $(38a - 15b)$-uniform morphism

with 102 nonzero letters, locates words of length $2a$ and is $a/b$-power-free.

Theorem 49. Let $a, b$ be relatively prime positive integers such that $7/6 < a/b < 13/11$ and $a/b \neq 20/17$ and $\gcd(b, 66) = 1$. Then the $(66a - 28b)$-uniform morphism

with 191 nonzero letters, locates words of length $2a$ and is $a/b$-power-free.
Theorem 50. Let $a, b$ be relatively prime positive integers such that 10/9 < a/b < 29

The morphisms in Theorems 29, 34, 36, and 37 appear to belong to a general family of parameterized morphisms with $4r$ nonzero letters for each $r \geq 3$.
Conjecture 51. Let $r \geq 3$ be an integer. Let $a, b$ be relatively prime positive integers such that $\frac{2r+1}{2r} < a/b < \frac{2r}{2r-1}$ and $a/b \neq \frac{4r+1}{4r-1}$ and $\gcd(b, 2r + 1) = 1$. Let

with $4r$ nonzero letters, is $a/b$-power-free.

Because of the additional symbolic parameter $r$, proving Conjecture 51 is beyond the scope of our code. If this conjecture is true, then it exhibits $a/b$-power-free morphisms with $a/b$ arbitrarily close to 1. For $r = 2$ the morphism in Conjecture 51 is not defined, due to the factor $B^{r-3}$. However, moving from the free monoid to the free group on $\mathbb{Z}_{\geq 0}$ allows us to interpret the factor $Y B^{r-3}$ for $r = 2$ as $0^{-2a+3b-1}\,1 \cdot (0^{2a-2b-1}\,1)^{-1} = 0^{-4a+5b}$. By doing this, we obtain the morphism in Theorem 27.

4.6. Finding morphisms experimentally. The statements of the theorems in Section 4.5, with the exception of Theorem 27, were discovered by computing prefixes of $w_{a/b}$ for 910 rational numbers in the interval $1 < a/b < 2$. In all, we computed over 256 million letters. For each $a/b$, we attempted to find an integer $k$ such that partitioning (the prefix of) $w_{a/b}$ into rows of length $k$ as in Figures 1 and 2 produces an array with $k - 1$ eventually periodic columns and one self-similar column in which $w_{a/b}$ reappears in some modified form.
In some cases, $k$ can be found easily by determining the largest integer $c$ that occurs in the prefix, computing the positions where it occurs, and computing the gcd of the successive differences of these positions. If $w_{a/b} = \varphi^{\infty}(0)$ for some $\varphi(n) = u\,(n + d)$ where $c$ does not occur in $u$, then this gcd is a multiple of $|\varphi(n)| = k$. When this method does not identify a candidate $k$, one can look for periodic blocks in the difference sequence of the positions of 1 (or some larger integer), and add the integers in the repetition period to get the length of the corresponding repeating factor of $w_{a/b}$; this procedure can detect repetitions of $\varphi(0)$ in $w_{a/b}$.
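The gcd heuristic for guessing $k$ is simple to sketch in Python. The function name and the toy morphism below are assumptions made for illustration; the toy word is the fixed point of the 2-uniform morphism $n \mapsto 0\,(n+1)$, not one of the words $w_{a/b}$.

```python
from functools import reduce
from math import gcd

def candidate_row_width(prefix, letter=None):
    """Return the gcd of the successive differences of the positions of
    `letter` in `prefix` (by default the largest letter in the prefix).
    Returns 0 if the letter occurs fewer than two times."""
    if letter is None:
        letter = max(prefix)
    positions = [i for i, c in enumerate(prefix) if c == letter]
    diffs = [q - p for p, q in zip(positions, positions[1:])]
    return reduce(gcd, diffs, 0)

# Toy example: iterate the hypothetical morphism n -> 0 (n + 1) from 0.
word = [0]
for _ in range(6):
    word = [c for n in word for c in (0, n + 1)]
width = candidate_row_width(word, letter=2)
print(width)
```

Here the letter 2 is used instead of the maximum because the largest letter of this short prefix occurs only once. The returned value is a multiple of the true width 2, in line with the remark above that the gcd is a multiple of $k$.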
We identified conjectural structure in $w_{a/b}$ for 520 of the 910 rational numbers. (Note that these 910 numbers were not chosen uniformly; some were chosen to bound the interval endpoints for symbolic morphisms that had already been conjectured.) Of these 520 words, 510 have the property that $w_{a/b} + d$ (the word obtained by incrementing each letter by $d$) appears in the self-similar column for some integer $d \geq 0$.
The remaining 10 words do not have a constant difference $d$. For the 4 rationals $59/48$, $65/57$, $73/60$, $113/99$ the word $w_{a/b}$ reappears in the self-similar column with its letters incremented by a periodic but not constant sequence. For the 6 rationals $4/3$, $13/12$, $15/14$, $28/25$, $37/34$, $64/59$ the increment appears to depend on the letter, as in Theorem 7.

To find families of words $w_{a/b}$ with related structure among the 510 with constant $d$, for each word we record
• the difference $d$,
• the number of columns $k$,
• the index of the self-similar column, and
• the number of transient rows.
For words $w_{a/b}$ such that all columns except the self-similar column are eventually constant, we build a word $u$ of length $k - 1$ from the eventual values of these columns. Letting $\varphi(n) = u\,(n + d)$, the structure of $w_{a/b}$ is potentially related to $\varphi^{\infty}(0)$.
If $u_1$ and $u_2$, arising from two different words, have the same subsequence $c_1, \dots, c_{r-1}$ of nonzero letters, we can look for a symbolic morphism $\varphi(n) = u\,(n + d)$ that generalizes the two morphisms $\varphi_1(n) = u_1\,(n + d)$ and $\varphi_2(n) = u_2\,(n + d)$ by writing $u = 0^{i_1 a + j_1 b - 1}\, c_1 \cdots 0^{i_{r-1} a + j_{r-1} b - 1}\, c_{r-1}\, 0^{i_r a + j_r b - 1}$ and solving each linear system $ia + jb = l$ using the two rationals $a/b$ to determine $i, j$. If there is not a unique solution for some block, discard this pair of rationals; otherwise this gives a unique symbolic morphism.
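Determining the coefficients $i, j$ of one block from two rationals is a 2-by-2 linear solve. The Python sketch below uses Cramer's rule with exact arithmetic. The function name and the sample block lengths are hypothetical (they were chosen so that the answer is $2a - 2b$) and are not data from the computations described here.

```python
from fractions import Fraction

def fit_run_length(sample1, sample2):
    """Solve i*a + j*b = l for (i, j) from two samples (a, b, l), where l is
    the observed length of one block for the rational a/b.
    Returns None if the solution is not unique."""
    (a1, b1, l1), (a2, b2, l2) = sample1, sample2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    i = Fraction(l1 * b2 - l2 * b1, det)
    j = Fraction(a1 * l2 - a2 * l1, det)
    return i, j

# Hypothetical observations: block length 8 for 9/5 and 10 for 11/6.
coeffs = fit_run_length((9, 5, 8), (11, 6, 10))
print(coeffs)
```

Returning `Fraction` values makes non-integer coefficients visible, which matters for the later remark that coefficients with large denominators suggest a morphism that is not meaningful.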
For each pair of words with the same $d$ and same subsequence of nonzero letters, we construct a symbolic morphism, if possible. If multiple pairs of rationals give the same symbolic morphism, this suggests a general family. On the other hand, a symbolic morphism is likely not meaningful if it only appears for one pair of rationals and contains run lengths where the coefficients of $a, b$ are rational numbers with large denominators. In practice, the coefficients of $a, b$ in all morphisms in Section 4.5 are integers, although it is conceivable that families with non-integer coefficients exist (in which case $a/b$ would be restricted by a gcd condition on $a$ or $b$).

For each symbolic morphism, we then attempt to determine an interval $I_{\min} < a/b < I_{\max}$ on which the morphism is $a/b$-power-free. Each block $0^{ia + jb - 1}\,c$ in $\varphi(n)$ restricts the values that $a/b$ can take, since the run length must be non-negative. We solve the homogeneous equation $ia + jb = 0$ to get a lower bound or upper bound on $a/b$ for each block. We use the maximum lower bound and minimum upper bound as initial guesses for the interval endpoints. If this guessed interval is too wide, then running the algorithm identifies obstructions to $a/b$-power-freeness, and we shorten the interval by removing the subintervals on which $a/b$-power-freeness failed to be verified.
Many symbolic morphisms do not turn out to be $a/b$-power-free on a general interval. A common problem is that solving a homogeneous equation to split an interval gives a value that is not in the interval. A particularly disappointing case occurs among morphisms with 14 nonzero letters. We identified 30 rational numbers in the interval $4/3 < a/b < 3/2$ for which the correct value of $k$ for $w_{a/b}$ seems to be $6a - b$ and which have 14 eventually nonzero columns when partitioned into rows of length $k$. One might expect the structure of all these words to be explained by the same symbolic morphism, but in fact no three of these words are captured by the same symbolic morphism. We only found suitable intervals for two of the resulting morphisms (Theorems 32 and 33). Both morphisms use $24/17$ as one of their sources, which leaves 27 of the 30 rationals without a symbolic morphism.

Families of words $w_{a/b}$
In Section 4 we identified a number of symbolic $a/b$-power-free morphisms that were derived from words $w_{a/b}$. In this section we discuss exact relationships between some of these morphisms and $w_{a/b}$. We begin with the morphism in Theorem 23.
Theorem 52. Let $a, b$ be relatively prime positive integers such that $3/2 < a/b < 5/3$ and $\gcd(b, 5) = 1$. Let $\varphi$ be the $(5a - 4b)$-uniform morphism defined by

Then $w_{a/b} = \varphi^{\infty}(0)$. In particular, $\rho(a/b) = 5a - 4b$.

Proof. The morphism $\varphi$ is $a/b$-power-free by Theorem 23, so it suffices to show that $\varphi^{\infty}(0)$ is the lexicographically least $a/b$-power-free word. Write $\varphi(n) = u\,(n + 1)$. Decrementing one of the three 1 letters in $u$ to 0 introduces the $a/b$-power $(0^{b})^{a/b}$, $(0^{b-1}\,1)^{a/b}$, or $(0^{-a+2b-1}\,1\,0^{a-b})^{a/b}$. The word $u\,0$ ends with $(0^{b-1}\,1)^{a/b}$, and the induction argument showing that decrementing $n + 1$ to $c \geq 1$ introduces an $a/b$-power works exactly as in the proof of Theorem 16. Namely, decrementing $n + 1$ to $c$ corresponds, under $\varphi$, to decrementing an earlier letter $n$ to $c - 1$.
The morphisms in Theorems 21 and 23 are the only morphisms in Section 4.5 that were derived from words $w_{a/b}$ with no transient. We can see $w_{a/b} \neq \varphi^{\infty}(0)$ for the other morphisms, since the length-$a$ prefix of $w_{a/b}$ is $0^{a-1}\,1$ (as in Proposition 12) but the length-$a$ prefix of $\varphi^{\infty}(0)$ is not $0^{a-1}\,1$. To account for a transient, we extend $\varphi$ to the alphabet $\mathbb{Z}_{\geq 0} \cup \{0'\}$ and consider morphisms of the form

Still, we cannot expect every morphism in Section 4.5 to be related to $w_{a/b}$ for each $a/b$ in its corresponding interval, since there exist rationals to which multiple theorems apply with different values of $k$; by Corollary 10, the word $\tau(\varphi^{\infty}(0'))$ is $k$-regular for a unique value of $k$ up to multiplicative dependence. For example, $a/b = 24/17$ satisfies the conditions of Theorems 25, 26, 32, and 33. The corresponding values of $k$ are $4a - 2b = 62$ for Theorems 25 and 26, and $6a - b = 127$ for Theorems 32 and 33; these two row widths are shown in Figure 4. While there are regions of the word $w_{24/17}$ that have constant columns when partitioned into rows of width 62, these do not persist, and therefore it seems the morphisms in Theorems 25 and 26 do not determine the long-term structure.
To establish the structure of a word $w_{a/b}$ with a transient, we must generalize our approach to proving $a/b$-power-freeness and lexicographic-leastness. For rationals satisfying the conditions of Theorem 41, the word $w_{a/b}$ has a short transient. Recall that $\tau(0') = 0$ and $\tau(n) = n$ for $n \in \mathbb{Z}_{\geq 0}$.

Then $w_{a/b} = \tau(\varphi^{\infty}(0'))$. In particular, $\rho(a/b) = 24a - 15b$.

Proof. First we show that $\tau(\varphi^{\infty}(0'))$ is $a/b$-power-free. A factor beginning at position $i \geq |v|$ in $\tau(\varphi^{\infty}(0'))$ is a factor of $\varphi(w)$ for some finite factor $w$ of $\tau(\varphi^{\infty}(0'))$. Since $\varphi|_{\mathbb{Z}_{\geq 0}}$ is $a/b$-power-free by Theorem 41, it follows that if $\tau(\varphi^{\infty}(0'))$ contains an $a/b$-power then it contains an $a/b$-power beginning at some position $i \leq |v| - 1$.
It remains to show that $\tau(\varphi^{\infty}(0'))$ contains no $a/b$-power $(xy)^{a/b} = xyx$ with $|x| \leq 9(a - b)$ beginning at a position $i \leq |v| - 1$. For each $m$ in the range $1 \leq m \leq 9$, this is accomplished by sliding a window of length $ma$ through $\tau(\varphi^{\infty}(0'))$ from position 0 to position $|v| - 1$ and verifying inequality of symbolic factors as in Section 4.3.

Now we show that decrementing any nonzero letter of $\tau(\varphi^{\infty}(0'))$ introduces an $a/b$-power. Decrementing one of the three 1 letters in the prefix $\tau(v)$ to 0 introduces the $a/b$-power $(0^{b})^{a/b}$ or $(0^{b-1}\,1)^{a/b}$. Every other nonzero letter is a factor of $\varphi(n)$ for some integer $n$. Write $\varphi(n) = u\,(n + 2)$ for $n \in \mathbb{Z}_{\geq 0}$. The word $u\,2$ contains 37 nonzero letters. One checks that decrementing each 1 in $u\,2$, except the first two, to 0 and each 2 in $u\,2$ to 0 or 1 introduces an $a/b$-power of length $a$ or $2a$. For the first two 1s, there are two cases each, depending on whether $u$ is immediately preceded by $\tau(v)$ or $\varphi(n)$. Decrementing the first 1 to 0 introduces the $a/b$-power

if preceded by $\varphi(n)$. Decrementing the second 1 to 0 introduces the $a/b$-power

if preceded by $\varphi(n)$. As in the proof of Theorem 52, decrementing $n + 2$ to $c \geq 2$ corresponds to decrementing an earlier letter $n$ to $c - 2$ and therefore, inductively, introduces an $a/b$-power.

In addition to the presence of transients, another complication is that some words $w_{a/b}$ reappear with finitely many modified letters in the self-similar column. As shown in Figure 5, $w_{19/16}$ has $k - 1$ eventually constant columns when partitioned into $k = 53$ columns. After 4 transient rows, the self-similar column consists of $w_{19/16} + 1$ with the letter at position 18 changed from 2 to 0. The previous letter in $w_{19/16}$ is 1, despite it being in an eventually-0 column. However, changing this 1 to 0 introduces an $a/b$-power

ending at that position, where we work in the free group on $\mathbb{Z}_{\geq 0}$ in order to remove some letters from the end of $\varphi(0)$. This happens because the prefix $v$ and $\varphi(0)$ have a nonempty common suffix.
In contrast, in Theorem 53 the prefix v ends with 1 whereas ϕ(n) ends with n + 2, so we aren't at risk of completing the a-power before we reach the self-similar column on the (a + 1)th row.
To capture this modification, we introduce a new letter $1'$, define $\tau(1') = 1$, and define $\varphi(1')$ to be identical to $\varphi(1)$ except in two positions. Only minor changes to the proof of Theorem 8 are necessary to establish that the sequence of letters in $\tau(\varphi^{\infty}(0'))$ is $k$-regular. The following conjecture claims the structure of $w_{19/16}$ is related to the morphism in Theorem 29. Note however that the interval is shorter than in Theorem 29.
Exceptional rationals can arise when the morphism fails to produce an $a/b$-power-free word. For example, $37/34 \in Q_6$, since $\varphi(0')$ contains a $37/34$-power of length $8 \cdot 37$ ending at position 410. The correct value of $k$ for $w_{37/34}$ seems to be 777.
Exceptional rationals can also arise when a morphism fails to produce a lexicographically least word. The following conjecture gives the structure of $w_{a/b}$ for most rationals satisfying the conditions of Theorem 43. However

and define $\varphi(1')$ to be the word obtained by taking $\varphi(1)$ and changing the 0 at position $32a - 29b - 1$ to 1 and the last letter to 0, then $w_{a/b} = \tau(\varphi^{\infty}(0'))$. In particular, $\rho(a/b) = 13a - 8b$.

For rationals satisfying the conditions of Theorem 46, the word $w_{a/b}$ appears to be given by the following.

and define $\varphi(1')$ to be the word obtained by taking $\varphi(1)$ and changing the 0 at position $36a - 31b - 1$ to 1 and the last letter to 0, then $w_{a/b} = \tau(\varphi^{\infty}(0'))$. In particular, $\rho(a/b) = 13a - 5b$.

Of the 29 symbolic morphisms given in Section 4.5 that were derived from words $w_{a/b}$, at this point we have given a theorem or conjecture relating 9 of them to an infinite family of words $w_{a/b}$. The 10 morphisms in Theorems 24, 25, 26, 28, 30, 32, 33, 38, 45, and 49 were derived from at most four each and therefore are perhaps not likely to be related to infinitely many words $w_{a/b}$. It seems likely that the final 10 morphisms do govern the large-scale structure of infinitely many words $w_{a/b}$. However, none of these families support a symbolic prefix $v$ of the form we have seen previously, since the number of nonzero letters in the prefix is not constant. Specifying the general structure of these words will therefore require a new idea. We state the following conjectures in terms of $\rho$ only.
For the two families of words that give rise to the morphisms in Theorems 35 and 48, the index of the self-similar column is a linear combination of a, b, which might indicate some structure that would be useful in a proof. Note that the intervals are shorter than the intervals for a b -power-freeness. Conjecture 58. Let a, b be relatively prime positive integers such that 28 25 < a b < 9 8 and gcd(b, 10) = 1. Then ρ( a b ) = 10a − 8b. Conjecture 59. Let a, b be relatively prime positive integers such that 26 23 < a b < 17 15 and gcd(b, 53) = 1. Then ρ( a b ) = 53a − 30b. The remaining morphisms are those in Theorems 22, 31, 39, 40, 42, 44, 47, and 50. For words related to these, the index of the self-similar column is not a linear combination of a, b. The morphism in Theorem 22 seems to be related to w a/b only for odd b, so we add this condition to Conjecture 60. Additionally, the morphism in Theorem 31 seems to be related to w a/b only for b divisible by 3, so we add this condition to Conjecture 61. Then ρ( a b ) = 67a − 30b. Our method for identifying families of related rationals is restrictive in several ways. For example, we did not look for relationships among words with different numbers of nonzero letters.
Additionally, we do not know how to prove $a/b$-power-freeness for families such as the following two families (with 10 nonzero letters each), where there is an extra congruence condition on the denominator. For the rationals in Conjecture 69, the word $w_{a/b}$ contains 6 columns that are not eventually constant but are eventually periodic with repetition period $01$, reminiscent of $w_{3/2}$.

Sporadic words $w_{a/b}$
Of the 520 rational numbers $a/b$ for which we identified conjectural structure in $w_{a/b}$, there are 277 that fall under Theorem 16 or one of the theorems or conjectures in Section 5. In this section we establish the structure of $w_{a/b}$ for some of the remaining rationals. In particular, we prove Theorems 3-5. For proving $a/b$-power-freeness, much of the algorithm is the same as for symbolic $a/b$. There are some differences, however. The most obvious difference is that we need not work with run-length encodings of words; we can manipulate $\varphi(n)$ as a word on the alphabet $\mathbb{Z}_{\ge 0} \cup \{n + d\}$. Computationally this is much faster, especially considering the scale involved: whereas the largest symbolic morphism we identified was the morphism in Theorem 50 with 279 nonzero letters, the 50847-uniform morphism for $w_{7/4}$ has 11099 nonzero letters.
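For an explicit morphism of the form $\varphi(n) = u\,(n + d)$, prefixes of $\varphi^\infty(0)$ can be generated directly as words of integers. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the toy choice $u = 0$, $d = 1$ is used only for illustration (it is not one of the morphisms of this section; it generates the ruler sequence $0, 1, 0, 2, 0, 1, 0, 3, \dots$).

```python
def apply_morphism(word, u, d):
    """Apply the (len(u)+1)-uniform morphism n -> u (n + d) to a list of integers."""
    out = []
    for n in word:
        out.extend(u)      # the fixed block u
        out.append(n + d)  # the single letter that depends on n
    return out


def fixed_point_prefix(u, d, length):
    """Return the first `length` letters of the fixed point starting with 0.

    Assumption (ours, for the sketch): u is nonempty and begins with 0,
    so that phi(0) begins with 0 and iteration from 0 converges.
    """
    if not u or u[0] != 0:
        raise ValueError("u must be nonempty and begin with 0")
    word = [0]
    while len(word) < length:
        word = apply_morphism(word, u, d)
    return word[:length]


# Toy example (hypothetical parameters): u = 0, d = 1.
print(fixed_point_prefix([0], 1, 8))
```

Working with plain integer lists in this way avoids run-length encodings entirely, at the cost of memory proportional to the prefix length.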
Another difference is how we find an integer $\ell$ such that $\varphi$ locates words of length $\ell$. Rather than use the method of Section 4.4, we compute the minimum possible $\ell$ as follows. Begin with $\ell = 0$, and maintain a set of sets of the positions of length-$\ell$ factors of $\varphi(n)$ that are not unequal (that is, each pair of these factors is equal for some value of $n$). Initially, this set is $\{\{0, 1, \dots, |\varphi(n)| - 1\}\}$, since all length-$0$ factors of $\varphi(n)$ are equal. We treat $\varphi(n)$ as a cyclic word so that we visit all factors of $\varphi(n)\varphi(n)\cdots$. Then increase $\ell$ by $1$, and update the sets of positions for the new length by extracting a single letter from $\varphi(n)$ for each position; in this way we avoid holding many large words in memory. During this update, a set of positions breaks into multiple sets if it contains a pair of positions corresponding to unequal words of length $\ell$ that were not unequal for $\ell - 1$. Delete any sets containing a single position, since the corresponding factor of length $\ell$ or greater is uniquely located modulo $|\varphi(n)|$. When the set of position sets becomes empty, $\varphi$ locates words of length $\ell$.
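The refinement procedure just described can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes a fully explicit cyclic word whose letters are compared by ordinary equality, so it ignores the symbolic letter $n + d$ and the "equal for some value of $n$" relation. The function name is hypothetical, and the guard that returns `None` for a periodic cyclic word is our addition.

```python
def min_locating_length(word):
    """Least ell such that all length-ell factors of the cyclic word are distinct.

    Returns None if no such ell exists (the cyclic word has a nontrivial period).
    """
    k = len(word)
    # One group holding every position; groups of size 1 are already located.
    groups = [list(range(k))] if k > 1 else []
    ell = 0
    while groups:
        if ell == k:
            # Factors of length k that still coincide will coincide forever.
            return None
        refined = []
        for group in groups:
            # Split the group by the single new letter at offset ell.
            buckets = {}
            for p in group:
                buckets.setdefault(word[(p + ell) % k], []).append(p)
            # Keep only groups that are still ambiguous.
            refined.extend(b for b in buckets.values() if len(b) > 1)
        groups = refined
        ell += 1
    return ell


print(min_locating_length([0, 1, 0, 2]))
```

Only one letter per position is read at each step, so no factor of length $\ell$ is ever materialized, which mirrors the memory-saving remark above.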
We use a variant of Proposition 19 in which $m_{\max} := \lceil \ell/(a - b) \rceil - 1$, since $\ell/(a - b)$ is an explicit rational number. We verify the conclusion of Lemma 20 directly by checking that $\frac{m_{\max} a - 1}{k} + 1 \le a - 1$. The structure of words $w_{a/b}$ with no transient can now be established automatically. We condense results for 24 words (including Theorems 3 and 4) into the following theorem. In particular, this establishes the value of $\rho(a/b)$ for these 24 rationals.
Theorem 70. For each $a/b$ in Table 1, there is a $k$-uniform morphism $\varphi(n) = u\,(n + d)$ such that $w_{a/b} = \varphi^\infty(0)$. Moreover, $\varphi$ locates words of length $\ell$.
Perhaps $w_{71/50}$ can be generalized to an infinite family using this structure. Words $w_{a/b}$ with a transient can be handled as in Section 5. We now prove Theorem 5 on the structure of $w_{6/5}$.
Theorem 5. There exist words $u, v$ of lengths $|u| = 1001 - 1$ and $|v| = 29949$ such that $w_{6/5} = \tau(\varphi^\infty(0'))$, where
Proof. Let $k = 1001$. Compute the prefix of $w_{6/5}$ of length $29949 + (k - 1)$, and write it as a word of length $29949$ followed by a word $u$ of length $k - 1$. Let $v = 0'\,0\,4\,1\,5\,0\,1\,2\,1 \cdots 1\,1\,2\,1$ be the word obtained from the length-$29949$ part by changing its first letter to $0'$.
It remains to show that $\tau(\varphi^\infty(0'))$ contains no $6/5$-power $(xy)^{6/5} = xyx$ with $|x| \le m_{\max}(a - b)$ beginning at a position $i \le |v| - 1$. For each $m$ in the range $1 \le m \le m_{\max}$, this could be accomplished by sliding a window of length $ma$ through $\tau(\varphi^\infty(0'))$ from position $0$ to position $|v| - 1$ and verifying inequality of factors. But since we already computed the length-$50000$ prefix of $w_{6/5}$ in order to guess its structure, and $(|v| - 1) + m_{\max} a - 1 = 48161 \le 50000$, we simply check that this prefix agrees with $\tau(\varphi^\infty(0'))$ and conclude that there are no $6/5$-powers in that range.
Now we show that decrementing any nonzero letter of $\tau(\varphi^\infty(0'))$ introduces a $6/5$-power. Decrementing any nonzero letter of $\tau(v)$ introduces a $6/5$-power, since $\tau(v)$ is, by construction, a prefix of $w_{6/5}$. Every other nonzero letter is a factor of $\varphi(n)$ for some integer $n$. Decrementing any but four of the nonzero letters in $u\,3$ to $0$ introduces a $6/5$-power of length $ma$ for some $m \le 25$. Decrementing any $2$ or $3$ in $u\,3$ to $1$ introduces a $6/5$-power of length $ma$ for some $m \le 20$. Decrementing any of the first five of six $3$s in $u\,3$ to $2$ introduces a $6/5$-power of length $ma$ for some $m \le 4$. For the remaining four $0$s and one $2$, we consider two cases, depending on whether $u$ is immediately preceded by $\tau(v)$ or $\varphi(n)$. In both cases, decrementing the last letter of $u\,3$ to $2$ introduces a $6/5$-power of length $233a$, and decrementing one of the four remaining nonzero letters to $0$ introduces a $6/5$-power of length $ma$ for some $m \le 82$. As before, decrementing $n + 3$ to $c \ge 3$ corresponds to decrementing an earlier letter $n$ to $c - 3$ and therefore, inductively, introduces a $6/5$-power.
We have automated the steps of the preceding proof as well. One checks that $\tau(v)$ is a prefix of $w_{4/3}$.
Showing that $\tau(\varphi^\infty(0'))$ is $4/3$-power-free with our code requires an extra step, since the morphism effectively uses two values of $d$. Define $\varphi_1(n) = u\,(n + 1)$ for all $n \ge 0$. We use the morphism $\varphi_1$ and specify as an assumption that $n = 0$ or $n \ge 2$. The code then verifies that $\varphi|_{\mathbb{Z}_{\ge 0}}$ locates words of length $14$ and is $4/3$-power-free. It follows that if $\tau(\varphi^\infty(0'))$ contains a $4/3$-power, then it contains a $4/3$-power beginning at some position $i \le |v| - 1$.
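The two finite checks used in these proofs, namely that a prefix contains no $a/b$-power and that decrementing any letter introduces one, can be illustrated on small prefixes. The sketch below is not the authors' code. It assumes $\gcd(a, b) = 1$ and $a > b$, and takes an $a/b$-power to be a word of length $ma$ with period $mb$ (for $1 < a/b < 2$ this is the form $xyx$ with $|x| = m(a - b)$). All function names are hypothetical. The greedy construction never backtracks, since over the non-negative integers a letter not yet used always extends a power-free word.

```python
def is_power(w, a, b):
    """True if w is an a/b-power: len(w) = m*a and w has period m*b."""
    if len(w) == 0 or len(w) % a != 0:
        return False
    m = len(w) // a
    p = m * b
    return all(w[i] == w[i + p] for i in range(len(w) - p))


def has_power_ending_at(word, end, a, b):
    """True if some a/b-power factor of word ends just before index `end`."""
    m = 1
    while m * a <= end:
        if is_power(word[end - m * a:end], a, b):
            return True
        m += 1
    return False


def least_prefix(a, b, length):
    """Greedy prefix of the lexicographically least a/b-power-free word."""
    word = []
    while len(word) < length:
        c = 0
        while True:
            word.append(c)
            if not has_power_ending_at(word, len(word), a, b):
                break
            word.pop()  # letter c creates a power; try the next letter
            c += 1
    return word


def is_lex_least_prefix(word, a, b):
    """Check power-freeness and that every decrement creates a power."""
    for i, letter in enumerate(word):
        if has_power_ending_at(word, i + 1, a, b):
            return False
        for c in range(letter):
            if not has_power_ending_at(word[:i] + [c], i + 1, a, b):
                return False
    return True


print(least_prefix(3, 2, 12))
```

This brute-force version re-examines every window ending at each position, so it is only practical for short prefixes; the arguments above replace most of this work by bounds on $m$ and by the locating property of $\varphi$.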