A square root map on Sturmian words

We introduce a square root map on Sturmian words and study its properties. Given a Sturmian word of slope $\alpha$, there exist exactly six minimal squares in its language (a minimal square does not have a square as a proper prefix). A Sturmian word $s$ of slope $\alpha$ can be written as a product of these six minimal squares: $s = X_1^2 X_2^2 X_3^2 \cdots$. The square root of $s$ is defined to be the word $\sqrt{s} = X_1 X_2 X_3 \cdots$. The main result of this paper is that $\sqrt{s}$ is also a Sturmian word of slope $\alpha$. Further, we characterize the Sturmian fixed points of the square root map, and we describe how to find the intercept of $\sqrt{s}$ and an occurrence of any prefix of $\sqrt{s}$ in $s$. Related to the square root map, we characterize the solutions of the word equation $X_1^2 X_2^2 \cdots X_n^2 = (X_1 X_2 \cdots X_n)^2$ in the language of Sturmian words of slope $\alpha$ where the words $X_i^2$ are minimal squares of slope $\alpha$. We also study the square root map in a more general setting. We explicitly construct an infinite set of non-Sturmian fixed points of the square root map. We show that the subshifts $\Omega$ generated by these words have a curious property: for all $w \in \Omega$ either $\sqrt{w} \in \Omega$ or $\sqrt{w}$ is periodic. In particular, the square root map can map an aperiodic word to a periodic word.


Introduction
Kalle Saari studies in [16,17] optimal squareful words, which are aperiodic words containing the least number of minimal squares (that is, squares with no proper square prefixes) such that every position starts a square. Saari proves that an optimal squareful word always contains exactly six minimal squares, and he characterizes these squares; fewer than six minimal squares force a word to be ultimately periodic. Moreover, he shows that Sturmian words are a proper subclass of optimal squareful words.
We propose a square root map for Sturmian words. Let $s$ be a Sturmian word of slope $\alpha$, and write it as a product of the six minimal squares in its language $L(\alpha)$: $s = X_1^2 X_2^2 X_3^2 \cdots$. The square root of $s$ is defined to be the word $\sqrt{s} = X_1 X_2 X_3 \cdots$. The main result of this paper is that the word $\sqrt{s}$ is also a Sturmian word of slope $\alpha$. More precisely, we prove that the square root of the Sturmian word $s_{x,\alpha}$ of intercept $x$ and slope $\alpha$ is $s_{\psi(x),\alpha}$ where $\psi(x) = \frac{1}{2}(x + 1 - \alpha)$. In addition to proving that the square root map preserves the language of a Sturmian word $s$, we show how to locate any prefix of $\sqrt{s}$ in $s$. We also characterize the Sturmian words of slope $\alpha$ which are fixed points of the square root map; they are the two Sturmian words $01c_\alpha$ and $10c_\alpha$ where $c_\alpha$ is the infinite standard Sturmian word of slope $\alpha$. The majority of the proofs of results on Sturmian words rely heavily on the interpretation of Sturmian words as rotation words. Continued fractions and results from Diophantine approximation theory play a key role in several proofs.
Solutions of the word equation $X_1^2 \cdots X_n^2 = (X_1 \cdots X_n)^2$, where the words $X_i^2$ are among the six minimal squares in $L(\alpha)$ for some fixed irrational $\alpha$, are closely linked to the square root map. The study of these solutions arises naturally from the study of fixed points of the square root map. The Sturmian fixed points of the square root map are fixed because they have arbitrarily long prefixes $X_1^2 \cdots X_n^2$ which satisfy the word equation. We characterize these specific solutions, i.e., those primitive words $w$ such that $w^2 \in L(\alpha)$ and $w^2$ can be written as a product of minimal squares $X_1^2 \cdots X_n^2$ satisfying the word equation. On the circle $[0, 1)$, the interval $[w]$ of such a word $w$ can be seen to satisfy the square root condition $\psi([w^2]) \subseteq [w]$, so we instead study and characterize the primitive words satisfying this square root condition. The result is that the specific solutions to the word equation (or, equivalently, the primitive words satisfying the square root condition) are the reversals of standard and semistandard words of slope $\alpha$ (see Subsection 2.3 for a definition) and the reversed standard words with the first two letters exchanged. In particular, all of these specific solutions are nonperiodic. It was known that the word equation $(X_1 \cdots X_n)^2 = X_1^2 \cdots X_n^2$ has nonperiodic solutions [7], but to our knowledge no large families of nonperiodic solutions have been identified until our result. Word equations of the type $X_1^k \cdots X_n^k = (X_1 \cdots X_n)^k$ have been considered by Štěpán Holub [6,7,8]. The final central topic of this paper concerns the square root map in a more general setting. The square root map can be defined not only for Sturmian words but for any optimal squareful word. We construct an infinite family of non-Sturmian, linearly recurrent optimal squareful words $\Gamma$ with properties similar to Sturmian words. The words $\Gamma$ are fixed points of the square root map.
They are constructed by finding non-Sturmian solutions of the word equation X 2 1 · · · X 2 n = (X 1 · · · X n ) 2 and by building infinite words having arbitrarily long squares of such solutions as prefixes. The subshifts Ω generated by the words Γ exhibit behavior similar to Sturmian subshifts. The square root map preserves the language of several but not every word in Ω. Curiously, if the language of a word in Ω is not preserved under the square root map, then the image must be periodic. This result is very surprising since it is contrary to the plausible hypothesis that the square root of an aperiodic word is aperiodic.
The paper is organized as follows. In Section 3 we prove that the square root map preserves the language of a Sturmian word. As a corollary we obtain a description of those Sturmian words which are fixed points of the square root map. In Section 3 we also observe that the intervals of the minimal squares in $L(\alpha)$ satisfy the square root condition. In Section 4 we characterize all words $w^2 \in L(\alpha)$ satisfying the square root condition. The result is that $w^2$ with $w$ primitive satisfies the square root condition if and only if $w$ is a reversed standard or semistandard word or a reversed standard word with the first two letters exchanged. Section 5 contains a proof of the characterization of the specific solutions of the word equation $X_1^2 \cdots X_n^2 = (X_1 \cdots X_n)^2$ mentioned earlier. We show that a primitive word $w$ satisfies the square root condition if and only if $w^2$ can be written as a product of minimal squares satisfying the word equation. In Section 6 we show how to locate prefixes of $\sqrt{s}$ in $s$. As an important step in proving this, we provide necessary and sufficient conditions for a Sturmian word to be a product of squares of reversed standard and semistandard words. We give a formula describing the square root of the Fibonacci word in Section 7. Section 8 is devoted to constructing the non-Sturmian fixed points $\Gamma$ mentioned above and to demonstrating that the languages of the words in their subshifts are preserved or they are mapped to periodic words. We conclude the paper by giving some remarks on possible generalizations in Section 9 and by discussing a few open problems in Section 10.
A short version of this paper was published as an extended abstract in the proceedings of WORDS 2015 [13].

Notation and Preliminary Results
In this section we review notation and basic concepts and results of word combinatorics, optimal squareful words, continued fractions, and Sturmian words. Most of the definitions and results provided here about words can be found in Lothaire's book [11].
An alphabet A is a finite non-empty set of letters, or symbols. A (finite) word over A is a finite sequence of letters of A obtained by concatenation. The concatenation of two words u = a 0 · · · a n−1 and v = b 0 · · · b m−1 is the word u · v = uv = a 0 · · · a n−1 b 0 · · · b m−1 . In this paper we consider only binary words, that is, words over an alphabet of size two. Most of the time we take A to be the set {0, 1}. The set of nonempty words over A is denoted by A + . We denote the empty word by ε and set A * = A + ∪ {ε}. A nonempty subset of A * is called a language. Let w = a 0 a 1 · · · a n−1 be a word of n letters. We denote the length n of w by |w|; by convention |ε| = 0. The set of proper powers of a word w is denoted by w + .
An infinite word w over the alphabet A is a function from the nonnegative integers to A. We write concisely w = a 0 a 1 a 2 · · · with a i ∈ A. The set of infinite words over A is denoted by A ω . An infinite word w is said to be ultimately periodic if we can write it in the form w = uv ω = uvvv · · · for some words u, v ∈ A * . If u = ε, then w is said to be periodic, or purely periodic. An infinite word which is not ultimately periodic is aperiodic. The shift operator T acts on infinite words as follows: T(a 0 a 1 a 2 . . .) = a 1 a 2 · · · .
A finite word u is a factor of the finite or infinite word w if we can write w = vuz for some v ∈ A * and z ∈ A * ∪ A ω . If v = ε, then the factor u is called a prefix of w. If z = ε, then we say that u is a suffix of w. The set of factors of w, the language of w, is denoted by L(w). If w = a 0 a 1 · · · a n−1 , then we let w[i, j] = a i · · · a j whenever the choices of positions i and j make sense. This notion is extended to infinite words in a natural way. An occurrence of u in w is a position i such that w[i, i + |u| − 1] = u. If such a position exists, then we say that u occurs in w.
A positive integer $p$ is a period of $w = a_0 \cdots a_{n-1}$ if $a_i = a_{i+p}$ for $0 \le i \le n - p - 1$. If the finite word $w$ has period $p$ and $|w|/p \ge \alpha$ for some real $\alpha$ such that $\alpha \ge 1$, then $w$ is called an $\alpha$-repetition. An $\alpha$-repetition is minimal if it does not have an $\alpha$-repetition as a proper prefix. If $w = u^2$, then $w$ is a square with square root $u$. A square is minimal if it does not have a square as a proper prefix. A word $w$ is primitive if $w = z^n$ implies $n = 1$. Equivalently, a word $w$ is primitive if and only if $w$ occurs in $w^2$ exactly twice. The primitive root of $w$ is the unique primitive word $u$ such that $w = u^n$ for some $n \ge 1$. Let $w = v^\omega$ be a periodic infinite word. The minimal period of $w$ is defined to be the primitive root of $v$.
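The definitions of primitivity and of minimal squares translate directly into short checks. The following Python sketch is only an illustration; the function names are ours and do not come from the paper.

```python
def is_primitive(w: str) -> bool:
    """A nonempty word w is primitive iff w occurs in ww exactly twice,
    that is, only as a prefix and as a suffix of ww."""
    return len(w) > 0 and (w + w).find(w, 1) == len(w)


def is_square(w: str) -> bool:
    """True if w = uu for some nonempty word u."""
    half, rem = divmod(len(w), 2)
    return len(w) > 0 and rem == 0 and w[:half] == w[half:]


def is_minimal_square(w: str) -> bool:
    """True if w is a square having no square as a proper prefix."""
    if not is_square(w):
        return False
    return not any(is_square(w[:k]) for k in range(2, len(w), 2))
```

For instance, `is_minimal_square("0000")` is false because the proper prefix `00` is already a square.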
Let $w = a_0 a_1 \cdots a_{n-1}$ be a word. The reversal $\widetilde{w}$ of $w$ is the word $a_{n-1} \cdots a_1 a_0$. If $\widetilde{w} = w$, then we call $w$ a palindrome. Let $C$ be the cyclic shift operator defined by the formula $C(a_0 a_1 \cdots a_{n-1}) = a_1 \cdots a_{n-1} a_0$. The words $w, C(w), C^2(w), \ldots, C^{|w|-1}(w)$ are the conjugates of $w$. If $u$ is a conjugate of $w$, then we say that $u$ is conjugate to $w$.
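As a small illustration of reversal, the cyclic shift $C$, and conjugacy, here is a Python sketch. The helper names are hypothetical; the conjugacy test uses the standard fact that $u$ is conjugate to $w$ exactly when $|u| = |w|$ and $u$ is a factor of $ww$.

```python
def reversal(w: str) -> str:
    """The word read from right to left."""
    return w[::-1]


def cyclic_shift(w: str) -> str:
    """The operator C: move the first letter to the end."""
    return w[1:] + w[:1]


def conjugates(w: str) -> list:
    """The words w, C(w), ..., C^{|w|-1}(w), in this order."""
    out, cur = [], w
    for _ in range(len(w)):
        out.append(cur)
        cur = cyclic_shift(cur)
    return out


def are_conjugate(u: str, w: str) -> bool:
    """u is conjugate to w iff they have equal length and u occurs in ww."""
    return len(u) == len(w) and u in w + w


def is_palindrome(w: str) -> bool:
    return w == reversal(w)
```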
An infinite word $w$ is recurrent if each of its factors occurs in it infinitely often. Let $(i_n)_{n \ge 1}$ be the sequence of consecutive occurrences of a factor $u$ in a recurrent word $w$. The return time of $u$ is the quantity $\sup\{i_{n+1} - i_n : n \ge 1\}$, which can be infinite. The factors $w[i_j, i_{j+1} - 1]$, $j \ge 1$, are the returns to $u$ in $w$. If the return time of each factor of $w$ is finite, then the word $w$ is uniformly recurrent. Equivalently, $w$ is uniformly recurrent if for each factor $u$ of $w$ there exists an integer $R$ such that every factor of $w$ of length $R$ contains an occurrence of $u$. If there exists a global constant $K$ such that the return time of any factor $u$ of $w$ is at most $K|u|$, then we say that $w$ is linearly recurrent. Clearly a linearly recurrent word is uniformly recurrent. The index of a factor $u$ of an infinite word $w$ is defined to be $\sup\{n : u^n \in L(w)\}$.
If w is uniformly recurrent and aperiodic, then the index of every factor of w is finite.
A subshift $\Omega$ is a subset of $A^\omega$ such that $\Omega = \{w \in A^\omega : L(w) \subseteq L\}$ for some language $L \subseteq A^*$. If we set above $L = L(w)$ where $w$ is an infinite word, then we say that the subshift $\Omega$ is generated by $w$. Subshifts are clearly shift-invariant. If every word in a subshift is aperiodic, then we call the subshift aperiodic. A subshift is minimal if it does not contain nonempty subshifts as proper subsets. A nonempty subshift is minimal if and only if it is generated by a uniformly recurrent word.

Optimal Squareful Words
In [17] Kalle Saari considers α-repetitive words. An infinite word is α-repetitive if every position in the word starts an α-repetition and the number of distinct minimal α-repetitions occurring in the word is finite. If α = 2, then α-repetitive words are called squareful words. This means that every position of a squareful word begins with a minimal square. Saari proves that if the number of distinct minimal squares occurring in a squareful word is at most 5, then the word must be ultimately periodic. On the other hand, if a squareful word contains at least 6 distinct minimal squares, then aperiodicity is possible. Saari calls the aperiodic squareful words containing exactly 6 minimal squares optimal squareful words. Further, he shows that optimal squareful words are always binary and that the six minimal squares must take a very specific form:
Proposition 2.1. Let $w$ be an optimal squareful word. If $10^i1$ occurs in $w$ for some $i > 1$, then the roots of the six minimal squares in $w$ are $S_1 = 0$, $S_2 = 010^{a-1}$, $S_3 = 010^a$, $S_4 = 10^a$, $S_5 = 10^{a+1}(10^a)^b$, $S_6 = 10^{a+1}(10^a)^{b+1}$ (1) for some $a \ge 1$ and $b \ge 0$.
The optimal squareful words containing the minimal square roots of (1) are called optimal squareful words with parameters a and b. For the rest of this paper we reserve this meaning for the symbols a and b. Furthermore, we agree that the symbols S i always refer to the minimal square roots (1).
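To make the parameters $a$ and $b$ concrete, the following Python sketch lists the six minimal square roots, assuming they have the form $S_1 = 0$, $S_2 = 010^{a-1}$, $S_3 = 010^a$, $S_4 = 10^a$, $S_5 = 10^{a+1}(10^a)^b$, $S_6 = 10^{a+1}(10^a)^{b+1}$. The function names are ours; the minimality check is a brute-force test of the definition, not a proof.

```python
def minimal_square_roots(a: int, b: int) -> list:
    """Return [S_1, ..., S_6] for parameters a >= 1 and b >= 0 (assumed form)."""
    if a < 1 or b < 0:
        raise ValueError("parameters must satisfy a >= 1 and b >= 0")
    block = "1" + "0" * a                      # the word 10^a
    return [
        "0",                                   # S_1
        "01" + "0" * (a - 1),                  # S_2
        "01" + "0" * a,                        # S_3
        block,                                 # S_4
        "1" + "0" * (a + 1) + block * b,       # S_5
        "1" + "0" * (a + 1) + block * (b + 1), # S_6
    ]


def is_minimal_square(w: str) -> bool:
    """True if w = uu and no proper prefix of w is a square."""
    def is_square(v: str) -> bool:
        h = len(v) // 2
        return len(v) > 0 and len(v) % 2 == 0 and v[:h] == v[h:]
    return is_square(w) and not any(is_square(w[:k]) for k in range(2, len(w), 2))
```

With $a = 1$ and $b = 0$ (the parameters matching the Fibonacci word) the roots are `0, 01, 010, 10, 100, 10010`.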

Continued Fractions and Rational Approximations
In this section we review results on continued fractions and best rational approximations of irrational numbers needed in this paper. Good references on these subjects are the books of Khinchin [9] and Cassels [2]. Every irrational real number $\alpha$ has a unique infinite continued fraction expansion $\alpha = [a_0; a_1, a_2, a_3, \ldots] = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots}}}$ (2) with $a_0 \in \mathbb{Z}$ and $a_k \in \mathbb{N}$ for all $k \ge 1$. The numbers $a_i$ are called the partial quotients of $\alpha$. We focus here only on irrational numbers, but we note that with small tweaks much of what follows also holds for rational numbers, which have finite continued fraction expansions.
The convergents $c_k = \frac{p_k}{q_k}$ of $\alpha$ are defined by the recurrences $p_0 = a_0$, $p_1 = a_1 a_0 + 1$, $p_k = a_k p_{k-1} + p_{k-2}$ for $k \ge 2$, and $q_0 = 1$, $q_1 = a_1$, $q_k = a_k q_{k-1} + q_{k-2}$ for $k \ge 2$. The sequence $(c_k)_{k \ge 0}$ converges to $\alpha$. Moreover, the even convergents are less than $\alpha$ and form an increasing sequence and, on the other hand, the odd convergents are greater than $\alpha$ and form a decreasing sequence. If $k \ge 2$ and $a_k > 1$, then between the convergents $c_{k-2}$ and $c_k$ there are semiconvergents (called intermediate fractions in Khinchin's book [9]) which are of the form $\frac{p_{k,\ell}}{q_{k,\ell}} = \frac{\ell p_{k-1} + p_{k-2}}{\ell q_{k-1} + q_{k-2}}$ with $1 \le \ell < a_k$. When the semiconvergents (if any) between $c_{k-2}$ and $c_k$ are ordered by the size of their denominators, the sequence obtained is increasing if $k$ is even and decreasing if $k$ is odd. Note that we make a clear distinction between convergents and semiconvergents, i.e., convergents are not a specific subtype of semiconvergents.
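The recurrences for convergents and semiconvergents can be sketched in a few lines of Python. This is an illustration under the standard recurrences $p_k = a_k p_{k-1} + p_{k-2}$ and $q_k = a_k q_{k-1} + q_{k-2}$ (with $p_{-1} = 1$, $q_{-1} = 0$); the function names are ours, and the input is a finite truncation of the list of partial quotients.

```python
def convergents(partial_quotients):
    """Return [(p_0, q_0), (p_1, q_1), ...] for [a_0; a_1, a_2, ...]."""
    a = list(partial_quotients)
    p_prev, q_prev = 1, 0          # p_{-1}, q_{-1}
    p, q = a[0], 1                 # p_0, q_0
    out = [(p, q)]
    for ak in a[1:]:
        p, p_prev = ak * p + p_prev, p
        q, q_prev = ak * q + q_prev, q
        out.append((p, q))
    return out


def semiconvergents(partial_quotients, k):
    """Return the pairs (l*p_{k-1} + p_{k-2}, l*q_{k-1} + q_{k-2})
    for 1 <= l < a_k; requires k >= 2 and a_k to be present in the list."""
    a = list(partial_quotients)
    c = convergents(a[:k])         # gives c_0, ..., c_{k-1}
    (p1, q1), (p2, q2) = c[k - 1], c[k - 2]
    return [(l * p1 + p2, l * q1 + q2) for l in range(1, a[k])]
```

For the expansion $[0; 3, 3, 2, \ldots]$ the call `semiconvergents([0, 3, 3, 2], 2)` gives the fractions $1/4$ and $2/7$, which lie between $c_0 = 0/1$ and $c_2 = 3/10$.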
A rational number $\frac{a}{b}$ is a best approximation of the real number $\alpha$ if for every fraction $\frac{c}{d}$ such that $\frac{c}{d} \ne \frac{a}{b}$ and $d \le b$ it holds that $|b\alpha - a| < |d\alpha - c|$. In other words, any other multiple of $\alpha$ with a coefficient at most $b$ is further away from the nearest integer than $b\alpha$ is. The next important proposition shows that the best approximations of an irrational number are connected to its convergents (for a proof see Theorems 16 and 17 of [9]).

Proposition 2.3. The best rational approximations of an irrational number are exactly its convergents.
We identify the unit interval $[0, 1)$ with the unit circle $\mathbb{T}$. Let $\alpha \in (0, 1)$ be irrational. The map $R \colon [0, 1) \to [0, 1)$, $x \mapsto \{x + \alpha\}$, where $\{x\}$ stands for the fractional part of the number $x$, defines a rotation on $\mathbb{T}$. The circle partitions into the intervals $(0, \frac{1}{2})$ and $(\frac{1}{2}, 1)$. Points in the same interval of the partition are said to be on the same side of 0 and points in different intervals are said to be on the opposite sides of 0. (We are not interested in the location of the point $\frac{1}{2}$.) The points $\{q_k\alpha\}$ and $\{q_{k-1}\alpha\}$ are always on the opposite sides of 0. The points $\{q_{k,\ell}\alpha\}$ with $0 < \ell \le a_k$ always lie between the points $\{q_{k-2}\alpha\}$ and $\{q_k\alpha\}$; see (4).
We measure the shortest distance to 0 on $\mathbb{T}$ by setting $\|x\| = \min\{\{x\}, 1 - \{x\}\}$. We have the following facts for $k \ge 2$ and for all $l$ such that $0 < l \le a_k$: $\|q_{k,l}\alpha\| = (-1)^k (q_{k,l}\alpha - p_{k,l})$ (3) and $\|q_{k,l}\alpha\| = \|q_{k,l-1}\alpha\| - \|q_{k-1}\alpha\|$ (4). We can now interpret Proposition 2.3 as $\min_{0 < n < q_k} \|n\alpha\| = \|q_{k-1}\alpha\|$ for $k \ge 1$ (5). Note that rotating preserves distances, a fact we will often use without explicit mention. In particular, the distance between the points $\{n\alpha\}$ and $\{m\alpha\}$ is $\||n - m|\alpha\|$. Thus by (5) the minimum distance between the distinct points $\{n\alpha\}$ and $\{m\alpha\}$ with $0 \le n, m < q_k$ is at least $\|q_{k-1}\alpha\|$. Formula (5) tells which point is closest to 0 among the points $\{n\alpha\}$ for $1 \le n \le q_k - 1$. We are also interested in knowing the point closest to 0 on the side opposite to $\{q_{k-1}\alpha\}$. The next result is very important and concerns this; see [12, Proposition 2.2].

Sturmian Words
Sturmian words are a well-known class of infinite, aperiodic binary words with minimal factor complexity. They are defined as the infinite words having $n + 1$ factors of length $n$ for every $n \ge 0$. For our purposes it is more convenient to view Sturmian words as the infinite words obtained as codings of orbits of points in an irrational circle rotation with two intervals; see [14,11]. Let us make this more precise. The frequency $\alpha$ of the letter 1 (called the slope) in a Sturmian word exists, and it is irrational. Divide the circle $\mathbb{T}$ into two intervals $I_0$ and $I_1$ defined by the points $0$ and $1 - \alpha$, and define the coding function $\nu$ by setting $\nu(x) = 0$ if $x \in I_0$ and $\nu(x) = 1$ if $x \in I_1$. The coding of the orbit of a point $x$ is the infinite word $s_{x,\alpha}$ obtained by setting its $n$th letter, $n \ge 0$, to equal $\nu(R^n(x))$ where $R$ is the rotation by angle $\alpha$. This word is Sturmian with slope $\alpha$, and conversely every Sturmian word with slope $\alpha$ is obtained this way. To make the definition proper, we need to define how $\nu$ behaves in the endpoints $0$ and $1 - \alpha$. We have two options: either $I_0 = [0, 1 - \alpha)$ and $I_1 = [1 - \alpha, 1)$, or $I_0 = (0, 1 - \alpha]$ and $I_1 = (1 - \alpha, 1]$. The difference is seen in the codings of the orbits of the special points $\{-n\alpha\}$, and both options are needed to be able to obtain every Sturmian word of slope $\alpha$ as a coding of a rotation. However, in this paper we are not concerned about this choice. We make the convention that $I(x, y)$ with $x \ne y$ and $x, y \ne 0$ is either of the half-open intervals of $\mathbb{T}$ separated by the points $x$ and $y$ (taken modulo 1 if necessary) not containing the point 0 as an interior point. The interval $I(x, 0) = I(0, x)$ is either of the half-open intervals separated by the points 0 and $x$ having smallest length (the case $x = \frac{1}{2}$ is not important in this paper). Since the sequence $(\{n\alpha\})_{n \ge 0}$ is dense in $[0, 1)$, as is well known, every Sturmian word of slope $\alpha$ has the same language (that is, the set of factors); this language is denoted by $L(\alpha)$. Further, all Sturmian words are uniformly recurrent.
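The coding of a rotation orbit is easy to experiment with numerically. The Python sketch below is an illustration only: it uses floating-point arithmetic, so it is reliable only when no orbit point falls within rounding error of the endpoints $0$ and $1 - \alpha$, and it assumes the two endpoint conventions $I_0 = [0, 1-\alpha)$, $I_1 = [1-\alpha, 1)$ and $I_0 = (0, 1-\alpha]$, $I_1 = (1-\alpha, 1]$. The names `rotation_word`, `closed_left`, and `ALPHA` are ours.

```python
import math


def rotation_word(x: float, alpha: float, n: int, closed_left: bool = True) -> str:
    """First n letters of the coding of the orbit of x under rotation by alpha.

    closed_left=True  uses I_0 = [0, 1-alpha), I_1 = [1-alpha, 1);
    closed_left=False uses I_0 = (0, 1-alpha], I_1 = (1-alpha, 1].
    """
    letters = []
    for k in range(n):
        point = (x + k * alpha) % 1.0
        if closed_left:
            in_i0 = point < 1 - alpha
        else:
            in_i0 = 0 < point <= 1 - alpha
        letters.append("0" if in_i0 else "1")
    return "".join(letters)


# Example slope with continued fraction expansion [0; 2, 1, 1, ...].
ALPHA = (3 - math.sqrt(5)) / 2
```

With this slope, `rotation_word(ALPHA, ALPHA, 8)` produces the first eight letters of the Fibonacci word.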
For every factor $w = a_0 a_1 \cdots a_{n-1}$ of length $n$ there exists a unique subinterval $[w]$ of $\mathbb{T}$ such that $s_{x,\alpha}$ begins with $w$ if and only if $x \in [w]$. Clearly $[w] = I_{a_0} \cap R^{-1}(I_{a_1}) \cap \cdots \cap R^{-(n-1)}(I_{a_{n-1}})$. We denote the length of the interval $[w]$ by $|[w]|$. The points $0, \{-\alpha\}, \{-2\alpha\}, \ldots, \{-n\alpha\}$ partition the circle into $n + 1$ intervals, which have one-to-one correspondence with the words of $L(\alpha)$ of length $n$. Among these intervals the interval containing the point $\{-(n + 1)\alpha\}$ corresponds to the right special factor of length $n$. A factor $w$ is right special if both $w0, w1 \in L(\alpha)$. Similarly a factor is left special if both $0w, 1w \in L(\alpha)$. In a Sturmian word there exists a unique right special and a unique left special factor of length $n$ for all $n \ge 0$. The language $L(\alpha)$ is mirror-invariant, that is, for every $w \in L(\alpha)$ also $\widetilde{w} \in L(\alpha)$. It follows that the right special factor of length $n$ is the reversal of the left special factor of length $n$. Sturmian words are also balanced; that is, the number of occurrences of the letter 1 in any two factors of the same length differ at most by 1.
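The complexity, right special factors, and balance property can be observed on a long prefix of a rotation word. The Python sketch below is a numerical illustration, not a proof: it assumes the slope $\alpha = (3 - \sqrt{5})/2$, the intercept $0.2$, and that a prefix of length 2000 already contains every factor of the small lengths tested. All names are ours.

```python
import math

ALPHA = (3 - math.sqrt(5)) / 2   # assumed example slope [0; 2, 1, 1, ...]


def rotation_word(x: float, alpha: float, n: int) -> str:
    """Floating-point coding with I_0 = [0, 1-alpha), I_1 = [1-alpha, 1)."""
    return "".join("0" if (x + k * alpha) % 1.0 < 1 - alpha else "1"
                   for k in range(n))


def factors(word: str, n: int) -> set:
    """Set of factors of length n occurring in the finite word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def right_special(word: str, n: int) -> list:
    """Factors f of length n such that both f0 and f1 occur in the word."""
    longer = factors(word, n + 1)
    return sorted(f for f in factors(word, n)
                  if f + "0" in longer and f + "1" in longer)


def is_balanced(word: str, n: int) -> bool:
    """Numbers of letters 1 in factors of length n differ by at most 1."""
    counts = {f.count("1") for f in factors(word, n)}
    return max(counts) - min(counts) <= 1


PREFIX = rotation_word(0.2, ALPHA, 2000)
```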
Given the continued fraction expansion of an irrational $\alpha \in (0, 1)$ as in (2), we define the corresponding standard sequence $(s_k)_{k \ge 0}$ of words by $s_{-1} = 1$, $s_0 = 0$, $s_1 = s_0^{a_1 - 1} s_{-1}$, and $s_k = s_{k-1}^{a_k} s_{k-2}$ for $k \ge 2$. As $s_k$ is a prefix of $s_{k+1}$ for $k \ge 1$, the sequence $(s_k)$ converges to a unique infinite word $c_\alpha$ called the infinite standard Sturmian word of slope $\alpha$, and it equals $s_{\alpha,\alpha}$. Inspired by the notion of semiconvergents, we define semistandard words for $k \ge 2$ by $s_{k,\ell} = s_{k-1}^{\ell} s_{k-2}$ with $1 \le \ell < a_k$. Clearly $|s_k| = q_k$ and $|s_{k,\ell}| = q_{k,\ell}$. Instead of writing "standard or semistandard", we often simply write "(semi)standard". The set of standard words of slope $\alpha$ is denoted by $\mathrm{Stand}(\alpha)$, and the set of standard and semistandard words of slope $\alpha$ is denoted by $\mathrm{Stand}^+(\alpha)$. (Semi)standard words are left special as prefixes of the word $c_\alpha$. Every (semi)standard word is primitive [11, Proposition 2.2.3]. An important property of standard words is that the words $s_k$ and $s_{k-1}$ almost commute; namely $s_k s_{k-1} = wxy$ and $s_{k-1} s_k = wyx$ for some word $w$ and distinct letters $x$ and $y$. For more on standard words see [11,1].
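The standard and semistandard words can be generated directly from the partial quotients. The Python sketch below assumes the recurrence $s_{-1} = 1$, $s_0 = 0$, $s_1 = s_0^{a_1 - 1} s_{-1}$, $s_k = s_{k-1}^{a_k} s_{k-2}$ and $s_{k,\ell} = s_{k-1}^{\ell} s_{k-2}$; the function names are ours, and the input list is a finite truncation $[0, a_1, a_2, \ldots]$ with $a_1 \ge 2$.

```python
def standard_words(partial_quotients, k_max):
    """Return [s_0, s_1, ..., s_{k_max}] for the slope [0; a_1, a_2, ...]."""
    a = list(partial_quotients)            # a[0] = 0 is not used
    prev, cur = "1", "0"                   # s_{-1}, s_0
    words = [cur]
    for k in range(1, k_max + 1):
        exponent = a[k] - 1 if k == 1 else a[k]
        prev, cur = cur, cur * exponent + prev
        words.append(cur)
    return words


def semistandard_words(partial_quotients, k):
    """Return [s_{k,1}, ..., s_{k,a_k - 1}]; requires k >= 2."""
    a = list(partial_quotients)
    words = standard_words(a, k)
    return [words[k - 1] * l + words[k - 2] for l in range(1, a[k])]


def almost_commute(u: str, v: str) -> bool:
    """True if uv = wxy and vu = wyx for a word w and distinct letters x, y."""
    x, y = u + v, v + u
    return x[:-2] == y[:-2] and x[-2:] == y[-2:][::-1] and x[-1] != x[-2]
```

For the slope $[0; 2, 1, 1, \ldots]$ the standard words are the finite Fibonacci words `0, 01, 010, 01001, ...`.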
The only difference between the words $c_\alpha$ and $c_{\overline{\alpha}}$ where $\alpha = [0; 1, a_2, a_3, \ldots]$ and $\overline{\alpha} = [0; a_2 + 1, a_3, \ldots]$ is that the roles of the letters 0 and 1 are reversed. We may thus assume without loss of generality that $a_1 \ge 2$. For the rest of this paper we make the convention that $\alpha$ stands for an irrational number in $(0, 1)$ having the continued fraction expansion as in (2) with $a_1 \ge 2$, i.e., we assume that $0 < \alpha < \frac{1}{2}$. The numbers $q_k$ and $q_{k,\ell}$ refer to the denominators of the convergents and semiconvergents of $\alpha$, and the words $s_k$ and $s_{k,\ell}$ refer to the standard or semistandard words of slope $\alpha$.

Powers in Sturmian Words
In this section we review some known results on powers in Sturmian words, and prove helpful results for the next section.
If a square $w^2$ with $w$ nonempty and primitive occurs in a Sturmian word of slope $\alpha$, then the length of the word $w$ is heavily restricted: it must be a denominator of a convergent or a semiconvergent of $\alpha$. The proof can be found in [3, Theorem 1] or [12, Proposition 4.1].
Next we need to know when conjugates of (semi)standard words occur as squares in a Sturmian word.
Proposition 2.6. The following holds: (i) A factor w ∈ L(α) is conjugate to s k for some k ≥ 0 if and only if |w| = |s k | and w 2 ∈ L(α).
(ii) Let w be a conjugate of s k,ℓ with k ≥ 2 and 0 < ℓ < a k . Then w 2 ∈ L(α) if and only if the intervals [w] and [s k,ℓ ] have the same length.
(iii) Let n = q 0 , n = q 1 , or n = q k,ℓ with k ≥ 2 and 0 < ℓ ≤ a k , and let s be the (semi)standard word of length n. A factor w ∈ L(α) of length n is conjugate to s if and only if w and s have equally many occurrences of the letter 0.
Proof. Claim (i) is a direct consequence of [3,Theorem 3] or alternatively [12,Theorem 4.5]. Claim (ii) can be inferred from Theorems 4.3 and 4.5 of [12]. Finally, claim (iii) is evident from the proof of [12,Theorem 4.3], but a short proof can be given: the idea is that every factor of length n except one exceptional factor v is conjugate to s since s 2 occurs in L(α) by (i) and (ii). As not every factor of length n may have the same number of letters 0 (a right special factor always extends to two factors having different number of letters 0), it must be that v has a different number of letters 0 than any conjugate of s.
We also need to know the index of certain factors of Sturmian words. The following proposition follows directly from Theorems 3 and 4 of [3] or from [12,Theorem 4.5].
Proposition 2.7. The index of the standard word $s_k$ in $L(\alpha)$ is $a_{k+1} + 2$ for $k \ge 2$ and $a_2 + 1$ for $k = 1$. The index of the semistandard word $s_{k,\ell}$ in $L(\alpha)$ with $k \ge 2$ and $0 < \ell < a_k$ is 2.
Note that a square root map can be defined for any optimal squareful word. However, now we only focus on Sturmian words; we study later the square root map for other optimal squareful words in Section 8.
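We aim to prove the surprising fact that given a Sturmian word $s$ the word $\sqrt{s}$ is also a Sturmian word having the same slope as $s$. Moreover, knowing the intercept of $s$, we can compute the intercept of $\sqrt{s}$. In the proof we need a special function $\psi \colon \mathbb{T} \to \mathbb{T}$ defined as follows. For $x \in (0, 1)$ we set $\psi(x) = \frac{1}{2}(x + 1 - \alpha)$, and we set $\psi(0) = \frac{1}{2}(1 - \alpha)$ if $0 \in I_0$ and $\psi(0) = 1 - \frac{\alpha}{2}$ if $0 \notin I_0$. The mapping $\psi$ moves a point $x$ on the circle $\mathbb{T}$ towards the point $1 - \alpha$ by halving the distance between the points $x$ and $1 - \alpha$. The distance to $1 - \alpha$ is measured in the interval $I_0$ or $I_1$ depending on which of these intervals the point $x$ belongs to. We can now state the result.
Theorem 3.2. Let $s_{x,\alpha}$ be a Sturmian word of slope $\alpha$ and intercept $x$. Then $\sqrt{s_{x,\alpha}} = s_{\psi(x),\alpha}$. In particular, $\sqrt{s_{x,\alpha}}$ is a Sturmian word of slope $\alpha$.
For a combinatorial version of the above theorem see Theorem 6.5 in Section 6. The main idea of the proof is to demonstrate that the square root map is actually the symbolic counterpart of the function $\psi$. We begin with a definition.
Definition 3.3. A square $w^2 \in L(\alpha)$ satisfies the square root condition if $\psi([w^2]) \subseteq [w]$.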
Note that if the interval [w] in the above definition has 1 − α as an endpoint, then w automatically satisfies the square root condition. This is because ψ moves points towards the point 1 − α but does not map them over this point. Actually, if w satisfies the square root condition, then necessarily the interval [w] has 1 − α as an endpoint (see Corollary 4.3).
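The relation between the square root map and $\psi$ can be checked numerically on a finite prefix. The Python sketch below is an illustration under several assumptions: the slope is $\alpha = (3 - \sqrt{5})/2 = [0; 2, 1, 1, \ldots]$, the six minimal square roots for this slope are taken to be `0, 01, 010, 10, 100, 10010`, the intercept $0.2$ is an arbitrary choice, and floating-point coding is trusted because no orbit point is expected to fall within rounding error of an endpoint. The function names are ours.

```python
import math

ALPHA = (3 - math.sqrt(5)) / 2                      # assumed example slope
ROOTS = ["0", "01", "010", "10", "100", "10010"]    # assumed minimal square roots


def rotation_word(x: float, alpha: float, n: int) -> str:
    """Floating-point coding with I_0 = [0, 1-alpha), I_1 = [1-alpha, 1)."""
    return "".join("0" if (x + k * alpha) % 1.0 < 1 - alpha else "1"
                   for k in range(n))


def psi(x: float, alpha: float) -> float:
    """The map psi for x in (0, 1): halve the distance to 1 - alpha."""
    return 0.5 * (x + 1 - alpha)


def square_root_prefix(w: str) -> str:
    """Factor a finite word greedily into minimal squares X_1^2 X_2^2 ...
    and return X_1 X_2 ...; stop at the first position where no complete
    minimal square fits (the last square may be cut off by the prefix)."""
    out, i = [], 0
    while True:
        root = next((r for r in ROOTS if w.startswith(r + r, i)), None)
        if root is None:
            break
        out.append(root)
        i += 2 * len(root)
    return "".join(out)


x = 0.2
s = rotation_word(x, ALPHA, 200)
root_word = square_root_prefix(s)
```

In this experiment `root_word` agrees with the first `len(root_word)` letters of the rotation word of intercept `psi(x, ALPHA)`; both begin with `010010`.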

Proof of Theorem 3.2.
Write $s_{x,\alpha} = X_1^2 X_2^2 X_3^2 \cdots$ as a product of minimal squares. Since $x \in [X_1^2]$ and the minimal square $X_1^2$ satisfies the square root condition by Lemma 3.4, we have that $\psi(x) \in [X_1]$. Hence both $\sqrt{s_{x,\alpha}}$ and $s_{\psi(x),\alpha}$ begin with $X_1$. Thus by shifting $s_{x,\alpha}$ the amount $2|X_1|$ and by applying the preceding reasoning, we conclude that $s_{\psi(x),\alpha}$ shifted by the amount $|X_1|$ begins with $X_2$. Therefore the words $\sqrt{s_{x,\alpha}}$ and $s_{\psi(x),\alpha}$ agree on their first $|X_1| + |X_2|$ letters. By repeating this procedure, we conclude that $\sqrt{s_{x,\alpha}} = s_{\psi(x),\alpha}$.
Theorem 3.2 allows us to effortlessly characterize the Sturmian words which are fixed points of the square root map: the only Sturmian words of slope $\alpha$ which are fixed points of the square root map are the two words $01c_\alpha$ and $10c_\alpha$.
Proof. The only fixed point of the map $\psi$ is the point $1 - \alpha$. Having this point as an intercept, we obtain two Sturmian words: either $01c_\alpha$ or $10c_\alpha$, depending on which of the intervals $I_0$ and $I_1$ the point $1 - \alpha$ belongs to.
The set {01c α , 10c α } is not only the set of fixed points but also the unique attractor of the square root map in the set of Sturmian words of slope α. When iterating the square root map on a fixed Sturmian word s x,α , the obtained word has longer and longer prefixes in common with either of the words 01c α and 10c α because ψ n (x) tends to 1 − α as n increases.
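The fixed points can be checked exactly on finite prefixes, since no floating-point arithmetic is needed. The Python sketch below assumes the slope $[0; 2, 1, 1, \ldots]$, for which the standard words satisfy $s_0 = 0$, $s_1 = 01$, $s_k = s_{k-1} s_{k-2}$ and the minimal square roots are taken to be `0, 01, 010, 10, 100, 10010`. The function names are ours.

```python
ROOTS = ["0", "01", "010", "10", "100", "10010"]   # assumed roots for a = 1, b = 0


def standard_word(k: int) -> str:
    """s_k for the slope [0; 2, 1, 1, ...] (k >= 1); a prefix of c_alpha."""
    prev, cur = "0", "01"                  # s_0, s_1
    for _ in range(k - 1):
        prev, cur = cur, cur + prev
    return cur


def square_root_prefix(w: str) -> str:
    """Greedy factorization into minimal squares; returns the product of the
    roots of all complete minimal squares found from the left."""
    out, i = [], 0
    while True:
        root = next((r for r in ROOTS if w.startswith(r + r, i)), None)
        if root is None:
            break
        out.append(root)
        i += 2 * len(root)
    return "".join(out)


c_prefix = standard_word(15)               # prefix of c_alpha of length 1597
fixed_01 = "01" + c_prefix
fixed_10 = "10" + c_prefix
```

Here `square_root_prefix(fixed_01)` is again a prefix of `fixed_01`, and similarly for `fixed_10`, whereas the square root of `c_prefix` begins with `010100` and so differs from $c_\alpha$ itself.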

One Characterization of Words Satisfying the Square Root Condition
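In the previous section we saw that the minimal squares, which satisfy the square root condition, were crucial in proving that the square root of a Sturmian word is again Sturmian with the same slope. The minimal squares of slope $\alpha$ are not the only squares in $L(\alpha)$ satisfying the square root condition; in this section we will characterize combinatorially such squares. To be able to state the characterization, we need to define the set of reversed standard words of slope $\alpha$: $\mathrm{RStand}(\alpha) = \{\widetilde{w} : w \in \mathrm{Stand}(\alpha)\}$. Similarly we set $\mathrm{RStand}^+(\alpha) = \{\widetilde{w} : w \in \mathrm{Stand}^+(\alpha)\}$. We also need the operation $L$ which exchanges the first two letters of a word (we do not apply this operation to too short words).
The main result of this section is the following.
Theorem 4.1. Let $w^2 \in L(\alpha)$ with $w$ primitive. Then $w^2$ satisfies the square root condition if and only if $w \in \mathrm{RStand}^+(\alpha) \cup L(\mathrm{RStand}(\alpha))$.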
As we remarked in Section 3, a square w 2 ∈ L(α) trivially satisfies the square root condition if its interval [w] has 1 − α as an endpoint. Our aim is to prove that the converse is also true. We begin with a technical lemma. Lemma 4.2. Let n = q 1 or n = q k,l for some k ≥ 2 with 0 < l ≤ a k , and let i be an integer such that to the above and rearranging, we have that Suppose now first that n = q k,l for some k ≥ 2 and 0 < l ≤ a k . Since i − 1 < n, Proposition 2.4 and (7) imply that i − 1 = mq k−1 for some 1 ≤ m ≤ min{l, a k − l + 1}. As {−nα} ∈ I 1 , the point {−q k−1 α} must lie on the opposite side of 0 in the interval I 0 . Therefore {−(i − 1)α} ∈ I 0 . Then by (7), the point {−iα} must lie in I 1 . This is a contradiction. Suppose then that n = q 1 . It is easy to see that (7) cannot hold for any i greater than 1. This concludes the proof. Proof. Let n = |w|. Proposition 2.5 implies that n = q 0 , n = q 1 , or n = q k,l for some k ≥ 2 with 0 < l ≤ a k . Say n = q 0 = 1. As the only factor of length 1 occurring as a square is 0, the claim holds as [0] = I 0 = I(0, 1 − α). Suppose then that n = q 1 or n = q k,l .
Thus also in this case necessarily j = 1. The case where [w] ⊆ I 1 is proven symmetrically using the latter symmetric assertion of Lemma 4.2.
Next we study in more detail the properties of squares w 2 ∈ L(α) whose interval has 1 − α as an endpoint.

Proposition 4.4.
Consider the intervals of factors in L(α) of length n = q 1 or n = q k,l with k ≥ 2 and 0 < l ≤ a k . Let u and v be the two distinct words of length n having intervals with endpoint 1 − α. Then the following holds.
(i) There exists a word w such that u = xyw and v = yxw = L(u) for distinct letters x and y.
(ii) Either u or v is right special.
(iii) If µ is the right special word among the words u and v, then µ 2 ∈ L(α).
(iv) If λ is the word among the words u and v which is not right special, then λ 2 ∈ L(α) if and only if n = q 1 or l = a k .
Proof. Suppose first that n = q 1 . Then it is straightforward to see that the factors u and v of length n having intervals with endpoint 1 − α are 010 a 1 −2 = S 2 and 10 a 1 −1 = S 4 . Clearly S 4 is right special and L(S 4 ) = S 2 . Moreover S 2 2 , S 2 4 ∈ L(α). Assume that n = q k,l for some k ≥ 2 with 0 < l ≤ a k . By Proposition 2.4 the point {−nα} is the point closest to 0 on the side opposite to the point . This means that the word u is right special, proving (ii). Further, the endpoint of [u] which is not 1 − α must be after a rotation the next closest point to 0 on the side opposite to the point {−q k−1 α}. Thus by Proposition 2.
Since the points x = {(−(q k,l−1 + 1)α} and y = {−(q k−1 + 1)α} are on the opposite sides of the point 1 − α and the points {x + α} and {y + α} are on the opposite sides of the point 0, it follows that u begins with cd and v begins with dc for distinct letters c and d. Assume on the contrary that u = cdzeu ′ and v = dcz f v ′ for distinct letters e and f . In particular, |z| ≤ n − 3. This means that the point Suppose that x ′ is closer to 1 − α than x. Since x ′ is on the same side of the point 1 − α as x, it follows that Since q k,l−1 − |z| − 2 < q k,l−1 , by Proposition 2.4 it must be that q k,l−1 − |z| − 2 ≤ 0. However, as q k,l−1 α = − q k,l−1 α , it follows by Proposition 2.4 that |z| + 2 − q k,l−1 = mq k−1 for some m ≥ 1. Thus |z| + 2 ≥ q k,l−1 + q k−1 = q k,l = n. This is, however, a contradiction as |z| ≤ n − 3.
Suppose then that $y'$ is closer to $1 - \alpha$ than $y$. Similarly to the above, we again obtain a contradiction with the fact that $|z| \le n - 3$. Thus we conclude that $u = cdw$ and $v = dcw$ for some word $w$, proving (i). As $n = q_{k,l}$, it must be that the right special word of length $n$ equals $\widetilde{s}_{k,l}$. Since $u$ and $v$ are conjugate by Proposition 2.6 (iii), Proposition 2.6 implies that if $l = a_k$, then $u^2, v^2 \in L(\alpha)$. Suppose that $l \ne a_k$. By Proposition 2.6, the word $s_{k,l}$ occurs as a square in $L(\alpha)$. Since $L(\alpha)$ is mirror-invariant, also $\widetilde{s}_{k,l}^{\,2} \in L(\alpha)$.
Proof of Theorem 4.1. If $|w| = 1$, then clearly $w = 0 = \widetilde{s}_0$, so the claim holds. We may thus focus on the case that $|w| > 1$. Suppose that $w^2 \in L(\alpha)$ satisfies the square root condition. By Corollary 4.3 the interval $[w]$ has $1 - \alpha$ as an endpoint. Moreover, Proposition 2.5 implies that $|w| = q_1$ or $|w| = q_{k,l}$ for some $k \ge 2$ with $0 < l \le a_k$. Thus from Proposition 4.4 it follows that $w = \widetilde{s}$ or $w = L(\widetilde{s})$ where $s$ is the (semi)standard word of length $|w|$. By Proposition 4.4 we have that $\widetilde{s}^{\,2} \in L(\alpha)$. Moreover, by Proposition 4.4 we have that $L(\widetilde{s})^2 \in L(\alpha)$ if and only if $|w| = q_k$ for some $k \ge 1$. Thus $w \in \mathrm{RStand}^+(\alpha) \cup L(\mathrm{RStand}(\alpha))$.
Suppose then that w ∈ RStand + (α) ∪ L(RStand(α)). Note first that L(w) has the same number of letters 0 as w, so w is conjugate to L(w) by Proposition 2.6. Thus it follows from Proposition 2.6 that w 2 ∈ L(α). Let u and v be the factors of length |w| having endpoint 1 − α. By Proposition 4.4 the word u must be right special and v = L(u). Since the right special factor of length |w| is unique, either w = u or L(w) = u. Thus the interval [w] has 1 − α as an endpoint. Then clearly w 2 satisfies the square root condition.

Characterization by a Word Equation
It turns out that the squares of slope $\alpha$ satisfying the square root condition also have a different characterization in terms of specific solutions of the word equation $X_1^2 X_2^2 \cdots X_n^2 = (X_1 X_2 \cdots X_n)^2$ (8) in the language $L(\alpha)$. We are interested only in the solutions of (8) where all words $X_i$ are minimal square roots (1), i.e., primitive roots of minimal squares. Thus we give the following definition.

Definition 5.1. A nonempty word $w$ is a solution to (8) if $w$ can be written as a product of minimal square roots $w = X_1 X_2 \cdots X_n$ which satisfy the word equation (8). The solution is trivial if $X_1 = X_2 = \cdots = X_n$. The word $w$ is a solution to (8) in $L(\alpha)$ if $w$ is a solution to (8) and $w^2 \in L(\alpha)$.
All minimal square roots of slope $\alpha$ are trivial solutions to (8). One example of a nontrivial solution is $w = S_2 S_1 S_4$ in the language of the Fibonacci word (i.e., in the language of slope $[0; 2, 1, 1, \ldots]$) since $w^2 = (01010)^2 = (01)^2 \cdot 0^2 \cdot (10)^2 = S_2^2 S_1^2 S_4^2$. Note that in the language of any Sturmian word there are only finitely many trivial solutions as the index of every factor is finite.
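Since the factorization of a word into minimal squares is unique, testing whether a given word is a solution to (8) is mechanical. The following Python sketch illustrates this for the Fibonacci case. It assumes the parameters $a = 1$, $b = 0$ and the six minimal square roots $S_1, \ldots, S_6$ in the form used in the case analysis of Lemma 5.5; the function names are illustrative and not part of the paper.

```python
def minimal_square_roots(a, b):
    """Roots S_1, ..., S_6 of the six minimal squares (assumes a >= 1, b >= 0)."""
    block = "1" + "0" * a                       # the word 10^a
    return [
        "0",                                    # S_1
        "01" + "0" * (a - 1),                   # S_2 = 010^(a-1)
        "01" + "0" * a,                         # S_3 = 010^a
        block,                                  # S_4 = 10^a
        "1" + "0" * (a + 1) + block * b,        # S_5 = 10^(a+1)(10^a)^b
        "1" + "0" * (a + 1) + block * (b + 1),  # S_6 = 10^(a+1)(10^a)^(b+1)
    ]


def square_root(word, roots):
    """Factor word as X_1^2 ... X_n^2 and return X_1 ... X_n (None if impossible)."""
    out, pos = [], 0
    while pos < len(word):
        for r in roots:
            if word.startswith(r + r, pos):     # at most one minimal square fits
                out.append(r)
                pos += 2 * len(r)
                break
        else:
            return None
    return "".join(out)


def is_solution(w, roots):
    """w solves (8) exactly when the square root of w^2 exists and equals w."""
    return square_root(w + w, roots) == w


ROOTS = minimal_square_roots(1, 0)              # Fibonacci case
print(is_solution("01010", ROOTS))              # True: w = S_2 S_1 S_4
print(is_solution("001010", ROOTS))             # False: the root of w^2 is 010010
```

The second call shows that a square which factors into minimal squares need not give a solution: the factorization exists, but its square root differs from $w$.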
Note that the factorization of a word as a product of minimal squares is unique. Indeed, if $X_1^2 \cdots X_n^2 = Y_1^2 \cdots Y_m^2$ where the squares $X_i^2$ and $Y_i^2$ are minimal, then either $X_1^2$ is a prefix of $Y_1^2$ or vice versa. Therefore by minimality $X_1^2 = Y_1^2$. The uniqueness of the factorization follows.
Our aim is to complete the characterization of Theorem 4.1 as follows.
Theorem 5.2. Let $w \in L(\alpha)$. The following are equivalent:

For later use in Section 8 we define the language $L(a, b)$.
Observe that by Proposition 2.2 every factor in L(a, b) is a factor of some optimal squareful word with parameters a and b. Moreover, if α = [0; a + 1, b + 1, . . .], then L(α) ⊆ L(a, b).
Definition 5.4. The language Π(a, b) consists of all nonempty words in L(a, b) which can be written as products of the minimal squares (1).
Let w ∈ Π(a, b), that is, w = X 2 1 · · · X 2 n for minimal square roots X i . Then we can define the square root of w by setting √ w = X 1 · · · X n . We need two technical lemmas. Their proofs are straightforward case-by-case analysis. The statement of Lemma 5.5 has a technical condition for later use in Section 8, which is perhaps better understood if the reader first reads the proof of Lemma 5.6 up to the point where Lemma 5.5 is invoked.

Lemma 5.5. Let $u$ and $v$ be words such that
• $u$ is a nonempty suffix of $S_6$,
• $v$ begins with $xy$ for distinct letters $x$ and $y$,
Suppose there exists a minimal square $X^2$ such that $|X^2| > |u|$ and $X^2$ is a prefix of $uv$ or $uL(v)$. Then there exist minimal squares $Y_1^2, \ldots, Y_n^2$ such that $X^2$ and $Y_1^2 \cdots Y_n^2$ are prefixes of $uv$ and $uL(v)$ of the same length and $X = Y_1 \cdots Y_n$.
Proof. Let Z 2 be a minimal square such that |Z 2 | > |u| and Z 2 is a prefix of uv or uL(v). It is not obvious at this point that Z exists but its existence becomes evident as this proof progresses. By symmetry we may assume that Z 2 is a prefix of uv. To prove the claim we consider different cases depending on the word Z.
Case A. $Z = S_1 = 0$. Since $u$ is a nonempty suffix of $S_6$ and $|Z^2| > |u|$, it must be that $u = 0$. As $v$ begins with 0, we have that $v$ begins with 01 by assumption. Since $v \in L(a, b)$ and $|v| \geq |S_6|$, the word $v$ begins with either $010^a10^a$ or $010^{a+1}10^a$. In the latter case $L(v)$ would begin with $10^{a+2}1$, contradicting the assumption $L(v) \in L(a, b)$. Hence $v$ begins with $010^a10^a$. It follows that $uv$ has $0010^a10^a$ as a prefix, that is, $uv$ begins with $S_1^2 S_4^2$. On the other hand, the word $uL(v)$ has the word $S_3^2 = 010^{a+1}10^a$ as a prefix. Since $S_3 = S_1 S_4$, the conclusion of the claim holds.

Case B. $Z = S_2 = 010^{a-1}$. If $u = 0$, then $v$ has $10^a10^a$ as a prefix and, consequently, $L(v)$ has $10^{a-1}10^a$ as a prefix, contradicting the fact that $L(v) \in L(a, b)$. Therefore by the assumptions that $u$ is a nonempty suffix of $S_6$ and $|Z^2| > |u|$, it follows that $u = 010^a$. Thus $v$ has $10^a$ as a prefix. Using the fact that $L(v) \in L(a, b)$, we see that $v$ begins with $10^{a+1}$ and $L(v)$ begins with $010^a$. Hence $uv$ has $S_2^2 S_1^2$ as a prefix, and $uL(v)$ has $S_3^2$ as a prefix. Since $S_2 S_1 = S_3$, we conclude, as in the previous case, that the conclusion holds.

Case C. $Z = S_3 = 010^a$. Using again the fact that $u$ is a suffix of $S_6$ and $|Z^2| > |u|$, we see that either $u = 0$ or $u = 010^a$. In the first case $v$ begins with $10^{a+1}10^a$ and $L(v)$ begins with $010^a10^a$. Hence the word $uL(v)$ has $S_1^2 S_4^2$ as a prefix. As $S_1 S_4 = S_3$, the conclusion follows. Let us then consider the other case. Now $L(v)$ begins with $10^{a+1}$, so the word $uL(v)$ has $S_2^2 S_1^2$ as a prefix. Again, the conclusion follows since $S_2 S_1 = S_3$.
Case D. $Z = S_4 = 10^a$. Now the only option is that $u = 10^a$. Using the fact that $v \in L(a, b)$, we see that $v$ cannot begin with $10^a1$, so $v$ must have $10^{a+1}$ as a prefix. Further, since $|v| \geq |S_6|$, it must be that $S_6$ is a prefix of $v$. If $S_6 1$ were a prefix of $v$, then the word $L(v)$ would have the word $(10^a)^{b+2}1$ as a factor, contradicting the fact that $L(v) \in L(a, b)$. Thus $S_6 0$ is a prefix of $v$. Since $v \in L(a, b)$ and $|v| \geq |S_5 S_6|$, we have that $S_6 0(10^a)^{b+1} = S_5^2 10^a$ is a prefix of $v$. Consequently, the word $L(v)$ begins with $0(10^a)^{b+1}10^{a+1}(10^a)^{b+1}$, so $uL(v)$ has $S_6^2$ as a prefix. Assume first that $b$ is odd. It is straightforward to see that in this case $0(10^a)^b 10^{a+1}(10^a)^{b+1} = (S_2^2)^{(b+1)/2} S_1^2 (S_4^2)^{(b+1)/2}$. Thus for the prefix $10^a S_5 10^a$ of $uv$ we have that , the conclusion follows as before. Assume then that $b$ is even. It is now easy to show that 4 , the conclusion again follows.

Case E. $Z = S_5 = 10^{a+1}(10^a)^b$. Now either $u = 10^a$ or $u = 10^{a+1}(10^a)^{b+1}$. In the first case $v$ must begin with $0(10^a)^b 10^{a+1}(10^a)^b$. However, this implies that $L(v)$ begins with $10^{a+1}(10^a)^{b-1}10^{a+1}(10^a)^b$, contradicting the fact that $L(v) \in L(a, b)$. Consider then the latter case where $v$ begins with $0(10^a)^b$. As $L(v) \in L(a, b)$ and $|v| \geq |S_6|$, it must be that $L(v)$ begins with $10^{a+1}(10^a)^{b+1}$. Hence the word $uL(v)$ has $S_6^2$ as a prefix. Since the word $v$ begins with $0(10^a)^{b+2}$, the word $uv$ has $S_5^2 S_4^2$ as a prefix. The conclusion follows as $S_5 S_4 = S_6$.

Case F. $Z = S_6 = 10^{a+1}(10^a)^{b+1}$. Now there are two possibilities: either $u = 10^a$ or $u = 10^{a+1}(10^a)^{b+1}$. In the first case $v$ begins with $0(10^a)^{b+1}10^{a+1}(10^a)^{b+1}$, so $L(v)$ begins with $10^{a+1}(10^a)^b 10^{a+1}(10^a)^{b+1}$. The word $uL(v)$ has $S_4^2 0(10^a)^b 10^{a+1}(10^a)^{b+1}$ as a prefix. Proceeding as in Case D depending on the parity of $b$, we see that the conclusion holds. Consider then the latter case $u = 10^{a+1}(10^a)^{b+1}$. The word $v$ must begin with $u$, so $L(v)$ has $0(10^a)^{b+2}$ as a prefix. Clearly the word $uL(v)$ has $S_5^2 S_4^2$ as a prefix. As $S_6 = S_5 S_4$, the conclusion follows.

A more intuitive way of stating Lemma 5.5 is that, under the assumptions of the lemma, swapping two adjacent and distinct letters which do not occur as a prefix of a minimal square affects a product of minimal squares only locally and does not change its square root.

Lemma 5.6. Let $w$ be a primitive solution to (8) having the word $S_6 = 10^{a+1}(10^a)^{b+1}$ as a suffix such that $w^2, L(w) \in L(a, b)$. Then $wL(w) \in \Pi(a, b)$ and $\sqrt{wL(w)} = w$.
Proof. If $w = S_6$, then it is easy to see that $wL(w) = S_5^2 S_4^2$ and $\sqrt{wL(w)} = S_5 S_4 = w$, so the claim holds. We may thus suppose that $S_6$ is a proper suffix of $w$.
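The base case $w = S_6$ can be checked by brute force for small parameters. The sketch below is only an illustration: it assumes that $L$ exchanges the first two letters of a word (as suggested by the relation $u = cdw$, $v = dcw = L(u)$ in Proposition 4.4) and that the minimal square roots have the form used in the proof of Lemma 5.5. All function names are hypothetical.

```python
def minimal_square_roots(a, b):
    """Roots S_1, ..., S_6 of the six minimal squares (assumes a >= 1, b >= 0)."""
    block = "1" + "0" * a
    return ["0", "01" + "0" * (a - 1), "01" + "0" * a, block,
            "1" + "0" * (a + 1) + block * b,
            "1" + "0" * (a + 1) + block * (b + 1)]


def swap_first_two(w):
    """The map L, assumed here to exchange the first two letters of w."""
    return w[1] + w[0] + w[2:]


def square_root(word, roots):
    """Return X_1 ... X_n if word = X_1^2 ... X_n^2 for minimal squares, else None."""
    out, pos = [], 0
    while pos < len(word):
        for r in roots:
            if word.startswith(r + r, pos):
                out.append(r)
                pos += 2 * len(r)
                break
        else:
            return None
    return "".join(out)


def base_case_holds(a, b):
    """Check S_6 L(S_6) = S_5^2 S_4^2 and that its square root is S_6."""
    roots = minimal_square_roots(a, b)
    s4, s5, s6 = roots[3], roots[4], roots[5]
    word = s6 + swap_first_two(s6)
    return word == s5 + s5 + s4 + s4 and square_root(word, roots) == s6


print(all(base_case_holds(a, b) for a in range(1, 6) for b in range(5)))  # True
```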
Since $w$ is a solution to (8), we have that $w^2 = X_1^2 \cdots X_n^2$ and $w = X_1 \cdots X_n$ for some minimal square roots $X_i$. It must be that $n > 1$, for if $n = 1$ then $w = X_1$, and it is not possible for $S_6$ to be a proper suffix of $w$. Assume for a contradiction that $X_1 = S_1$. Since $X_1 X_2$ is a prefix of $w^2$, it follows that $X_2$ begins with the letter 0. If $X_2 \neq S_1$, then $X_1 X_2$ begins with 001 but $X_1^2 X_2^2$ begins with 000, which is impossible. Hence $X_2 = S_1$, and by repeating the argument it follows that $X_k = S_1$ for all $k$ such that $1 \leq k \leq n$. Thus $w$ cannot have $S_6$ as a suffix, so we conclude that $X_1 \neq S_1$. Hence $w$ always begins with 01 or 10.
We show that $|X_1^2| < |w|$. Assume on the contrary that $|X_1^2| \geq |w|$. Since $w$ has the word $S_6$ as a suffix, it follows that $S_6$ is a factor of $X_1^2$. It follows that $X_1$ is one of the words $S_5$, $S_6$ or $S_3$ (if $b = 0$). If $X_1 = S_5$, then $S_6$ occurs in $X_1^2 = 10^{a+1}(10^a)^b 10^{a+1}(10^a)^b$ only as a prefix. Thus $w = S_6$, contradicting the fact that $S_6$ is a proper suffix of $w$. If $X_1 = S_6$, then $S_6$ occurs in $X_1^2 = 10^{a+1}(10^a)^{b+1}10^{a+1}(10^a)^{b+1}$ as a prefix and as a suffix. Since $w \neq S_6$, it must be that $w = X_1^2$, contradicting the primitivity of $w$. Let finally $b = 0$ and $X_1 = S_3$. Then $S_6$ occurs in $X_1^2 = 010^{a+1}10^a$ as a suffix. Hence $w = X_1^2$, contradicting again the primitivity of $w$.

Now there exists a maximal $r$ such that $1 \leq r < n$ and $X_1^2 \cdots X_r^2$ is a prefix of $w$. Actually $X_1^2 \cdots X_r^2$ is a proper prefix of $w$, as otherwise $w^2 = (X_1^2 \cdots X_r^2)^2 = (X_1 \cdots X_r X_1 \cdots X_r)^2$, so $w = (X_1 \cdots X_r)^2$, contradicting the primitivity of $w$. Thus when factorizing $wL(w)$ and $w^2$ as products of minimal squares, the first $r$ squares are equal. Let $u$ be the nonempty word such that $w = X_1^2 \cdots X_r^2 u$. By the definition of the number $r$, we have that $u$ is a proper prefix of $X_{r+1}^2$. Suppose for a contradiction that $|u| > |S_6|$. It follows that $u$ has $S_6$ as a proper suffix. This leaves only the possibilities that $X_{r+1}$ is either of the words $S_5$ or $S_6$. However, if $X_{r+1} = S_5$, then $S_6$ cannot be a proper suffix of $u$, and if $X_{r+1} = S_6$, then $r$ is not maximal. We conclude that $|u| \leq |S_6|$.
Next we show that $w$ must satisfy $|w| \geq |S_5 S_6|$. Suppose first that $w$ begins with the letter 0. Then as $S_6$ is a proper suffix of $w$ and $w^2 \in L(a, b)$, it must be that $w$ begins with $0(10^a)^{b+1}$. Suppose that this prefix overlaps with the suffix $S_6$. Then clearly $w = 0(10^a)^b 10^{a+1}(10^a)^{b+1} = (0(10^a)^{b+1})^2$, contradicting the primitivity of $w$. If the prefix $0(10^a)^{b+1}$ does not overlap with the suffix $S_6$, then $|w| \geq |S_5 S_6|$. Assume then that $w$ begins with the letter 1. Similar to above, the word $w$ must begin with $10^{a+1}(10^a)^{b+1}$. In this case necessarily $|w| \geq |S_5 S_6|$.
Finally, we can apply Lemma 5.5 to the words u and w with X = X r+1 . We obtain minimal squares Y 2 1 , . . . , Y 2 m such that Y 2 1 · · · Y 2 m is a prefix of uL(w) and and Y 1 · · · Y m = X r+1 · · · X r+t for some t ≥ 1. Thus The claim is proved.
We may thus suppose that $|w| \geq |S_6|$, so $w$ has $S_6$ as a suffix. We proceed by induction. Now either $w = s_{k,\ell}$ for some $k \geq 3$ with $0 < \ell \leq a_k$ or $L(w) = s_k$ for some $k \geq 3$. We assume that the claim holds for every word satisfying the hypotheses which is shorter than $w$. Consider first the case $w = s_{k,\ell}$ for some $k \geq 3$ with $0 < \ell \leq a_k$. By the fact that $s_{k-1} s_{k-2} = L(s_{k-2}) s_{k-1}$ we obtain that Now if $k = 3$ and $\ell = 1$, then the conclusion holds as $s_{3,1} = S_6$ is a minimal square root. Hence we may assume that either $k > 3$ or $k = 3$ and $\ell > 1$. Since $s_{k-1}$ is a solution to (8), we have that $s_{k-1}^2 = X_1^2 \cdots X_n^2$ and $s_{k-1} = X_1 \cdots X_n$ for some minimal square roots $X_i$. In other words, Since $|s_{k,\ell-1}| \geq |S_6|$, with an application of Lemma 5.6 we obtain that $s_{k,\ell-1} L(s_{k,\ell-1}) \in \Pi(a, b)$ and $\sqrt{s_{k,\ell-1} L(s_{k,\ell-1})} = s_{k,\ell-1}$.
Thus w 2 ∈ Π(a, b) and so w is a solution to (8). Consider next the case w = L( s k ) for some k ≥ 3. Similar to above, If k > 3, then the claim follows using the induction hypothesis and Lemma 5.6 as above. In the case k = 3 we have that Namely, it is not difficult to see that if b is even, then If b is odd, then Thus w is a solution to (8) also in the case k = 3.
Note that a word w in the set L(RStand + (α)) \ L(RStand(α)) is a solution to (8) but not in the language L(α). Rather, w is a solution to (8) in L(β) where β is a suitable irrational such that L(w) is a reversed standard word of slope β.
From Proposition 5.7 we conclude the following interesting fact:

Corollary 5.8. There exist arbitrarily long primitive solutions of (8) in $L(\alpha)$.
We can now prove Theorem 5.2.

A More Detailed Combinatorial Description of the Square Root Map
Recall from Section 3 that the square root √ s of a Sturmian word s has the same factors as s. The proofs were dynamical; we used the special mapping ψ on the circle. In this section we describe combinatorially why the language is preserved; we give a location for any prefix of √ s in s. As a side product, we are able to describe when a Sturmian word is uniquely factorizable as a product of squares of reversed (semi)standard words.
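The dynamical statement, $\sqrt{s_{x,\alpha}} = s_{\psi(x),\alpha}$ with $\psi(x) = \frac{1}{2}(x + 1 - \alpha)$, can be checked numerically on finite prefixes. The following Python sketch does so for the Fibonacci slope. It rests on several assumptions that are not spelled out in this section: the rotation word is coded with the intervals $[0, 1-\alpha)$ for the letter 0 and $[1-\alpha, 1)$ for the letter 1, the intercepts are chosen away from the orbit of the interval endpoints so that floating-point rounding is harmless, and the six minimal square roots are those for $a = 1$, $b = 0$.

```python
import math

ALPHA = (3.0 - math.sqrt(5.0)) / 2.0        # slope [0; 2, 1, 1, ...] (Fibonacci)
ROOTS = ["0", "01", "010", "10", "100", "10010"]   # S_1..S_6 for a = 1, b = 0


def rotation_word(x, alpha, n):
    """First n letters of s_{x,alpha}: letter 0 iff {x + k*alpha} < 1 - alpha."""
    return "".join("0" if (x + k * alpha) % 1.0 < 1.0 - alpha else "1"
                   for k in range(n))


def psi(x, alpha):
    """The map psi(x) = (x + 1 - alpha) / 2 on intercepts."""
    return (x + 1.0 - alpha) / 2.0


def square_root_prefix(word, roots):
    """Square root of the longest prefix of word that is a product of minimal squares."""
    out, pos = [], 0
    while True:
        for r in roots:
            if word.startswith(r + r, pos):
                out.append(r)
                pos += 2 * len(r)
                break
        else:                                # no complete minimal square left
            return "".join(out)


def check(x, n=400):
    """Compare the computed square root with the rotation word of intercept psi(x)."""
    root = square_root_prefix(rotation_word(x, ALPHA, n), ROOTS)
    return root == rotation_word(psi(x, ALPHA), ALPHA, len(root))


print([check(x) for x in (0.1, 0.37, 0.9)])
```

A finite check of this kind is of course no substitute for the proof; it only confirms that the coding convention and the formula for $\psi$ are used consistently.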
Obviously the square root $X_1 = 010$ of $(010)^2$ occurs as a prefix of $f$. Equally clearly the word $010 \cdot 100 = \sqrt{(010)^2 (100)^2}$ occurs, not as a prefix, but after the prefix $X_1$ of $f$. Thus the position of the first occurrence of $010 \cdot 100$ shifted $|X_1| = 3$ positions from the position of the first occurrence of $X_1$. However, when comparing the position of the first occurrence of $\sqrt{(010)^2 (100)^2 (10)^2}$ with the first occurrence of $010 \cdot 100$, we see that there is no further shift. By further inspection, the word $\sqrt{(010)^2 (100)^2 (10)^2 (01)^2 0^2 (10010)^2}$ occurs for the first time at position $|X_1|$ of $f$. This is no longer true for the first seven minimal squares; the first occurrence of $010 \cdot 100 \cdot 10 \cdot 01 \cdot 0 \cdot 10010 \cdot 01$ is at position $|X_1 X_2| = 16$ of $f$, where $X_2 = 100 \cdot 10 \cdot 01 \cdot 0 \cdot 10010$. The amount of shift from the previous position $|X_1| = 3$ is $|X_2| = 13$; observe that both of these numbers are Fibonacci numbers. Thus the amount of shift was exactly the length of the square roots added after observing the previous shift. As an observant reader might have noticed, both of the words $X_1$ and $X_2$ are reversed standard words, or equivalently, primitive solutions to (8). Repeating similar inspections on other Sturmian words suggests that there is a certain pattern to these shifts and that knowing the pattern would make it possible to locate prefixes of $\sqrt{s}$ in the Sturmian word $s$. Thus it makes sense to "accelerate" the square root map by considering squares of solutions to (8) instead of just minimal squares. Next we make these somewhat vague observations more precise.
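The shifts described above are easy to reproduce experimentally. The Python sketch below factors a prefix of the Fibonacci word into minimal squares and reports where each prefix of the square root first occurs. It assumes that $f$ is the Fibonacci word $0100101001\cdots$ and that the minimal square roots are those for $a = 1$, $b = 0$; the helper names are illustrative.

```python
ROOTS = ["0", "01", "010", "10", "100", "10010"]   # S_1..S_6 for a = 1, b = 0


def fibonacci_word(n):
    """Prefix of length n of the Fibonacci word 0100101001..."""
    prev, cur = "0", "01"
    while len(cur) < n:
        prev, cur = cur, cur + prev
    return cur[:n]


def first_roots(word, roots, count):
    """Roots of the first `count` minimal squares in the factorization of word."""
    out, pos = [], 0
    while len(out) < count:
        for r in roots:
            if word.startswith(r + r, pos):
                out.append(r)
                pos += 2 * len(r)
                break
        else:
            raise ValueError("no complete minimal square at position %d" % pos)
    return out


f = fibonacci_word(200)
roots7 = first_roots(f, ROOTS, 7)
# First occurrence in f of the square root of the first k minimal squares.
positions = [f.find("".join(roots7[:k])) for k in range(1, 8)]
print(roots7)
print(positions)
```

The printed positions stay at 3 for the second through sixth prefix and jump to 16 at the seventh, in line with the shift by $|X_2| = 13$ discussed in the text.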
Every Sturmian word has a solution of (8) as a square prefix. Next we aim to characterize Sturmian words having infinitely many solutions of (8) as square prefixes. The next two lemmas are key results towards such a characterization.

Analogous representations exist for the sets
To put it more simply: for each x ≠ 1 − α there exists a unique reversed (semi)standard word w such that x ∈ [w 2 ]. To illustrate the proof, we begin by giving a proof sketch.

Proof of Lemma 6.1. Consider the lengths of the reversed (semi)standard words beginning with the same letter as s k,ℓ . Out of these lengths we can form the unique increasing sequence (b n ) such that b 1 = q k,ℓ−1 . If we set s 1 = s k,ℓ and J 1 = I(−(b 1 + 1)α, 1 − α), then based on the observations in the proof of Proposition 4.4 we see that is the interval of s 2 , the unique reversed (semi)standard word of length b 3 beginning with the same letter as s 1 . By repeating this when n > 1, we see that the interval J n is split by the point {−(b n+1 + 1)α} and that [s 2 n ] = I(−(b n + 1)α, −(b n+1 + 1)α). Then there is a unique reversed (semi)standard word s n+1 such that [s n+1 ] = I(−(b n+1 + 1)α, 1 − α) = J n \ [s 2 n ]; we set J n+1 = [s n+1 ]. By the definition of the sequence (b n ), the words s n+1 and s 1 begin with the same letter. This yields a well-defined sequence (J n ) of nested subintervals of J 1 . It is clear that |J n | → 0 as n → ∞. It follows that The sets [s 2 n ] are by definition disjoint. The claim follows since the indexing in the claim is just another way to express the reversed (semi)standard words having lengths from the sequence (b n ).

The above proof works as it is for the cases s 0 and s 1 ; only minor adjustments in notation are needed.

Lemma 6.2. Let u ∈ RStand + (α) and v ∈ RStand + (α) ∪ L(RStand + (α)). Then u 2 is never a proper prefix of v 2 .
Proof. If v ∈ RStand + (α) and |u| ≠ |v|, then by Lemma 6.1, the intervals [u 2 ] and [v 2 ] are disjoint. Hence u 2 can never be a proper prefix of v 2 . Assume then that v ∈ L(RStand + (α)). If |v| ≤ | s 1 |, then v 2 is a minimal square, so it is not possible for u 2 to be a proper prefix of v 2 . Suppose that |v| = | s k,ℓ | for some k ≥ 2 with 0 < ℓ ≤ a k . As in the proof of Proposition 4.4, we have that [v] = I(−(q k−1 + 1)α, 1 − α). If u begins with the same letter as v and |u| < |v|, then |u| ≤ | s k−1 |. It follows, as in the proof of Lemma 6.1, that the distance between 1 − α and either of the endpoints of the interval [u 2 ] must be at least q k−1 α . Hence the intervals [v] and [u 2 ] are disjoint, so u 2 is not a proper prefix of v 2 .
Let s be a fixed Sturmian word of slope α. Since the index of a factor of a Sturmian word is finite, Lemma 6.2 and Theorem 5.2 imply that if s has infinitely many solutions of (8) as square prefixes then no word in RStand + (α) is a square prefix of s. We have now the proper tools to prove the following:

Proposition 6.3. Let s x,α be a Sturmian word of slope α and intercept x. Then s x,α begins with a square of a word in RStand + (α) if and only if
Thus by applying Lemma 6.1 to I 0 \ {1 − α} or I 1 \ {1 − α}, we see that the word s x,α begins with a square of a word in RStand + (α).
It follows that if $s$ has infinitely many solutions of (8) as square prefixes, then $s \in \{01c_\alpha, 10c_\alpha\}$. Next we take one extra step and characterize when $s$ can be written as a product of squares of words in $\mathrm{RStand}^+(\alpha)$.

Theorem 6.4. A Sturmian word $s$ of slope $\alpha$ can be written as a product of squares of words in $\mathrm{RStand}^+(\alpha)$ if and only if $s$ is not of the form $X_1^2 X_2^2 \cdots X_n^2 c$ where $X_i \in \mathrm{RStand}^+(\alpha)$ and $c \in \{01c_\alpha, 10c_\alpha\}$. If $s$ is a product of squares in $\mathrm{RStand}^+(\alpha)$, then this product is unique.
Proof. This is a direct consequence of Proposition 6.3 and Lemma 6.2.
Suppose that $s \notin \{01c_\alpha, 10c_\alpha\}$. Then the word $s$ has only finitely many solutions of (8) as square prefixes. We call the longest solution maximal. Observe that the maximal solution is not necessarily primitive since any power of a solution to (8) is also a solution. Sturmian words of slope $\alpha$ can be classified into two types.
Type A. Sturmian words $s$ of slope $\alpha$ which can be written as products of maximal solutions to (8). In other words, it can be written that $s = X_1^2 X_2^2 \cdots$ where $X_i$ is the maximal solution occurring as a square prefix of the word $T^{h_i}(s)$ where $h_i = |X_1^2 X_2^2 \cdots X_{i-1}^2|$.

Type B. Sturmian words $s$ of slope $\alpha$ which are of the form $s = X_1^2 X_2^2 \cdots X_n^2 c$ where $c \in \{01c_\alpha, 10c_\alpha\}$ and the words $X_i$ are maximal solutions as above.

Proposition 6.3 and Lemma 6.2 imply that the words $X_i$ in the above definitions are uniquely determined and that the primitive root of a maximal solution is in $\mathrm{RStand}^+(\alpha)$. Consequently, a maximal solution is always right special. When finding the factorization of a Sturmian word as a product of squares of maximal solutions, it is sufficient to detect at each position the shortest square of a word in $\mathrm{RStand}^+(\alpha)$ and take its largest even power occurring in that position.
Keeping the Sturmian word s of slope α fixed, we define two sequences (µ k ) and (λ k ). We set µ 0 = λ 0 = ε. Following the notation above, we define depending on the type of s as follows.
(A) If $s$ is of type A, then we set for all $k \geq 1$ that $\mu_k = X_1^2 X_2^2 \cdots X_k^2$ and $\lambda_k = X_1 X_2 \cdots X_k$.

(B) If $s$ is of type B, then we set for $1 \leq k \leq n$ that $\mu_k = X_1^2 X_2^2 \cdots X_k^2$ and $\lambda_k = X_1 X_2 \cdots X_k$, and we let $\mu_{n+1} = X_1^2 X_2^2 \cdots X_n^2 c$ and $\lambda_{n+1} = X_1 X_2 \cdots X_n c$.
Compare these definitions with the example in the beginning of this section; the words X 1 and X 2 are maximal solutions in the Fibonacci word (which is of type A).
We are finally in a position to formulate precisely the observations made in the beginning of this section and state the main result of this section. Moreover, the first occurrence of the prefix $\lambda_{k+1}$ with $0 \leq k \leq n - 1$ is at position $|\lambda_k|$ of $s$, and the first occurrence of any prefix of $\sqrt{s}$ having length greater than $|\lambda_n|$ is at position $|\lambda_n|$ of $s$. In particular $\sqrt{s}$ is a Sturmian word with slope $\alpha$.
The theorem only states where the prefixes λ k of √ s occur for the first time. For the first occurrence of other prefixes of √ s we do not have a guaranteed location. To illustrate the theorem, consider next τ, the eighth shift of the Fibonacci word. If we write under the word τ each of the corresponding words λ k at the position of their first occurrence we get the picture in Figure 2. Theorem 6.5 shows that the nice pattern where the words λ k overlap continues indefinitely and, moreover, that if we replace τ with any other Sturmian word (of type A) we obtain a similar picture. Most of the results of this paper were motivated by the discovery of this pattern.
Before proving the theorem we need one more result. Proof. This proof might be tricky to follow. We advise the reader to keep the picture of Figure 3 in mind while reading the proof. This picture depicts only the Case A below but is surely helpful. The assertion is evident when k = 0. Suppose that k > 0 and assume that λ k is right special and that λ k is a suffix of the word µ k . It is equivalent to say that {−( . We write simply λ = λ k , µ = µ k , and X = X k+1 . This proof utilizes only the facts that µX 2 ∈ L(α) and that λ is right special and a suffix of the word µ, not the structure of the words λ and µ implied by their definitions. Thus without loss of generality, we may assume that X is primitive. Consequently, X ∈ RStand + (α). It follows that .
, so by the above we are forced to conclude that $[\mu_{k+1}] = \emptyset$. This is a contradiction since $X$ is chosen in such a way that $[\mu_{k+1}] = [\mu X^2] \neq \emptyset$. We conclude that , so the word $\lambda X$ is right special. We have two cases depending on the length of the interval This proves that $\lambda X = \lambda_{k+1}$ is a suffix of $\mu_{k+1}$. [µ] proving that also in this case $\lambda X = \lambda_{k+1}$ is a suffix of $\mu_{k+1}$.
Note that even though λ k is right special and always a suffix of µ k , it is not necessary for µ k to be right special.
Proof of Theorem 6.5. Since Sturmian words of type B differ from Sturmian words of type A essentially only by the fact that the sequence of maximal solutions is finite, it is in this proof enough to consider the case that s is of type A. Proposition 6.6 says that λ k is always a suffix of µ k for all k ≥ 0. Since |µ k | = 2|λ k |, it follows that the word T |λ k | (s) has the word λ k as a prefix. Therefore √ s = lim k→∞ T |λ k | (s). It remains to prove that the first occurrence of λ k+1 in s is at position |λ k | of s for all k ≥ 0. It is clear that the first occurrence of λ 1 = X 1 is at position |λ 0 | = 0. Assume that k > 0, and suppose for a contradiction that λ k+1 occurs before the position |λ k |. Since λ k is a prefix of λ k+1 , by induction we see that λ k+1 cannot occur before the position |λ k−1 |. This means that an occurrence of X k X k+1 begins in s at position ν such that |µ k−1 | ≤ ν < |µ k−1 X k |; see Figure 4. Observe that s has at position |µ k−1 | an occurrence of X 2 k . Write now X k = w t with w ∈ RStand + (α). Since w is primitive, we must have that ν = |µ k−1 | + r|w| with 0 ≤ r < t. Thus X k+1 occurs in s at position ν + |X k | = |µ k−1 | + (r + t)|w|. Since r < t, it follows that either w is a prefix of X k+1 or X k+1 is a prefix of w.
Suppose first that w is a prefix of X k+1 . If w = X k+1 , then the prefix µ k−1 X 2 k of s is followed by w 2 . Now w 2t+2 is a solution to (8) implying that X k is not a maximal solution to (8). Since this is contradictory, we infer that |w| < |X k+1 |. Since X k+1 occurs at position |µ k−1 | + (r + t)|w| < |µ k | and X k+1 has w as a prefix, it must be that X k+1 begins with wa where a is the first letter of w. Since w is right special and w 2 ∈ L(α), it follows that X 2 k+1 begins with w 2 . Like above, this implies that X k is not maximal. This is a contradiction.
Suppose then that X k+1 is a proper prefix of w. First of all, X k+1 must be primitive, as otherwise X k+1 and consequently w would have as a prefix a square of some word in RStand + (α) contradicting Lemma 6.2. The assumption that X k+1 is a prefix of w implies that X k+1 and w begin with the same letter. Like above, since w is right special and w 2 ∈ L(α), it must be that w occurs after the prefix µ k of s. Since also X 2 k+1 occurs after the prefix µ k , by Lemma 6.2 we conclude that the word w must be a proper prefix of X 2 k+1 . Observe now that the assumption that X k+1 is a proper prefix of w excludes the possibilities that w = s 0 = 0 or w = s 1 = 10 a . Therefore w = s h,ℓ for some h ≥ 2 with 0 < ℓ ≤ a h . Because |w| < 2|X k+1 |, we must have that |X k+1 | > | s h−2 |. On the other hand, since |X k+1 | < |w| and X k+1 and w begin with the same letter, the only option is that X k+1 = s h,ℓ ′ with 0 < ℓ ′ < ℓ. Now so as w is a prefix of X 2 k+1 , it must be that s h−1 = L( s h−1 ). This is a contradiction. This final contradiction ends the proof.
As a conclusion of this section, we study the lengths of the maximal solutions of (8). Namely, let s = X 2 1 X 2 2 · · · be a Sturmian word of type A factorized as a product of maximal solutions X i . Computer experiments suggest that typically the sequence (|X i |) is strictly increasing. However, there are examples where |X i | > |X i+1 | for some i ≥ 1. It is natural to ask if the lengths can decrease significantly or if oscillation is possible. It turns out that neither is possible. In Corollary 6.9 we prove that lim inf i→∞ |X i | = ∞.
First we need a result on certain periods of (semi)standard words.
Lemma 6.7. Let u, v ∈ Stand + (α) and |u| > |v|. If u is a prefix of some word in v + , then u = s k,ℓ and v = s k−1 for some k ≥ 2 with 0 < ℓ ≤ a k .
Proof. Suppose that u is a prefix of some word in v + . If u = s 1 = 0 a 1, then necessarily v = s 0 = 0. Then obviously u is not a prefix of any word in v + . Therefore u = s k,ℓ for some k ≥ 2 with 0 < ℓ ≤ a k . Suppose that k = 2. Then u = (0 a 1) ℓ 0. It is straightforward to show that v must equal to s 1 = 0 a 1; u cannot be a prefix of a word in v + if v = s 0 = 0 or v = s 2,ℓ ′ for some ℓ ′ such that 0 < ℓ ′ < ℓ. Thus we may assume that k > 2. Suppose first that |v| > |s k−1 |. Then by the assumption |u| > |v|, it must be that v = s k,ℓ ′ for some ℓ ′ such that ℓ ′ < ℓ. Since u is a prefix of some word in v + , it follows that the word w = s ℓ−ℓ ′ k−1 s k−2 is a prefix of some word in s k−2 v + . Since the word w begins with s k−1 s k−2 , we obtain that s k−2 v begins with s k−1 s k−2 , so s k−1 s k−2 = s k−2 s k−1 . This is a contradiction.
Assume then that |v| < |s k−1 |. Now the prefix s k−1 of u is a prefix of some word in v + , so by induction v = s k−2 . Now u = (s a k−1 k−2 s k−3 ) ℓ s k−2 , so as u is a prefix of some word in v + , it follows that z = s k−3 s k−2 is a prefix of some word in v + . This means that z ends with a prefix of s k−2 of length |s k−3 |. As the prefix of s k−2 of length |s k−3 | is s k−3 , the word z ends with s k−3 . Consequently s k−3 s k−2 = s k−2 s k−3 ; a contradiction.
The only remaining option is that v = s k−1 . This is certainly possible.
The next proposition describes precisely under which conditions it is possible that $|X_i| > |X_{i+1}|$. Moreover, it rules out the possibility that the lengths decrease significantly or oscillate.

Proposition 6.8. Let $s = X_1^2 X_2^2 X_3^2 \cdots$ be a Sturmian word of type A with slope $\alpha$ factorized as a product of maximal solutions $X_i$. If $|X_1| > |X_2|$, then $X_1 = s_{k,\ell}$ for some $k \geq 2$ with $0 < \ell \leq a_k - 1$, the primitive root of $X_2$ is $s_{k-1}$, and $|X_3| > |X_1|$.
Proof. Assume that $|X_1| > |X_2|$. Let us first make the additional assumption that $X_1$ is primitive. In particular, $X_1 \in \mathrm{RStand}^+(\alpha)$. Let $u$ be the primitive root of $X_2$. Then $u \in \mathrm{RStand}^+(\alpha)$ and, moreover, by the assumption $|X_1| > |X_2|$ it holds that $|u| < |X_1|$. By Proposition 6.6 the word $\lambda_2 = X_1 X_2$ is a suffix of the word $\mu_2 = X_1^2 X_2^2$. Therefore $X_1$ is a proper suffix of $X_1 X_2$, so $X_1 X_2 = Z X_1$ for some nonempty word $Z$. A standard argument shows that $X_1$ is a suffix of some word in $X_2^+$ (see e.g., [10, Proposition 1.3.4]). Consequently, $X_1$ is a prefix of a word in $u^+$. As $|u| < |X_1|$, Lemma 6.7 implies that $X_1 = s_{k,\ell}$ and $u = s_{k-1}$ for some $k \geq 2$ with $0 < \ell \leq a_k$.
Suppose now that ℓ = a k . Then the word X 2 1 X 2 2 contains s k−1 s k−2 s a k +2 k−1 as a factor. Thus s a k +2 k−1 s k−2 s k−1 ∈ L(α). As s k−1 is a prefix of s k−2 s k−1 , it follows that s a k +3 k−1 ∈ L(α) contradicting Proposition 2.7. Therefore ℓ ≤ a k − 1.
Let us then relax the assumption that $X_1$ is primitive. Let $v$ be the primitive root of $X_1$, so that $X_1 = v^j$ for some $j \geq 1$. Consider now the Sturmian word $T^{(2j-2)|v|}(s) = v^2 X_2^2 \cdots$. By the above arguments $v = s_{k,\ell}$ for some $k \geq 2$ with $0 < \ell \leq a_k - 1$ and the primitive root of $X_2$ is $s_{k-1}$. Further, as $\ell \neq a_k$, it follows from Proposition 2.7 that $v^3 \notin L(\alpha)$. Thus $j = 1$, that is, $X_1 = s_{k,\ell}$.

It remains to show that $|X_3| > |X_1|$. Assume for a contradiction that $|X_3| \leq |X_1|$. It is not possible that $|X_3| < |X_2|$ as the preceding arguments show that then $X_2$ must be a reversed semistandard word; however, $X_2$ is a power of the reversed standard word $s_{k-1}$. Hence by the maximality of $X_2$ we have that $|X_3| > |X_2|$. Let $X_3 = w^t$ with $w \in \mathrm{RStand}^+(\alpha)$ and $t \geq 1$. As Assume for a contradiction that $|w| < |s_{k-1}|$. If $w$ is semistandard, then Proposition 2.7 implies that $t = 1$, so $t|w| > |s_{k-1}|$ cannot hold. Thus $w$ is standard. If $w = s_0 = 0$, then clearly $t|w| > |s_{k-1}| \geq |s_1|$ cannot hold as the index of the factor 0 in $L(\alpha)$ is $a + 1$. Thus $w \neq s_0$. Suppose first that $w = s_{k-2}$. Now implying that $a_{k-1} = 1$. However, if $a_{k-1} = 1$, then $a_{k-1} + 2$ is odd, so actually $2t < a_{k-1} + 2$. Then $a_{k-1} + 2 > 2t > 2a_{k-1}$, so $a_{k-1} < 1$; a contradiction. Suppose then that $w = s_{k-3}$. Now so $t > a_{k-2} + 1$. Like previously, as $X_3^2 \in L(\alpha)$, Proposition 2.7 implies that $2t \leq a_{k-2} + 2$. Like above, we obtain that $a_{k-2} < 0$; a contradiction. Similar to above As $2a_{k-3} + 1 \geq a_{k-3} + 2$, we conclude that $|s_{k-4}^{a_{k-3}+2}| < |s_{k-1}|$. Therefore by Proposition 2.7 it is not possible that $|w| \leq |s_{k-4}|$. In conclusion, it is not possible that $t|w| > |s_{k-1}|$. This is a contradiction.

Now $|w| > |s_{k-1}|$ (by the maximality of $X_2$ it must be that $w \neq s_{k-1}$). Because $|w| \leq |s_{k,\ell}|$, we have that $w = s_{k,\ell'}$ for some $\ell'$ such that $0 < \ell' \leq \ell$. Since $\ell \neq a_k$, the word $w$ is semistandard, so by Proposition 2.7 we have that $t = 1$. By Proposition 6.6 the word $\lambda_3 = X_1 X_2 X_3$ is a suffix of the word $\mu_3 = X_1^2 X_2^2 X_3^2$. It follows that s k−2 s ℓ+r where $r$ is such that $X_2 = s_{k-1}^r$. Therefore the words $s_{k-2}$ and $s_{k-1}$ commute; a contradiction. This final contradiction proves that $|X_3| > |X_1|$.

Corollary 6.9. Let $s = X_1^2 X_2^2 \cdots$ be a Sturmian word of type A with slope $\alpha$ factorized as a product of maximal solutions $X_i$. Then $\liminf_{i \to \infty} |X_i| = \infty$.

The Square Root of the Fibonacci Word
In this section we prove a formula for the square root of the Fibonacci word. To do this we factorize the Fibonacci word as a product of maximal solutions to (8).
Lemma 7.1. For the standard words of slope Φ it holds that t k s k s k+1 s k+2 = s 2 k+2 t k+1 for all k ≥ 0.

Lemma 7.2. For the standard words of slope Φ it holds that s
Proof. If k = 0, then s 4 = 01001010 = s 2 2 t 1 . Let then k ≥ 1. Now where the last equality follows by induction. By applying Lemma 7.1 we obtain that which proves the claim.
As an immediate corollary to Lemma 7.2 we obtain a formula for the square root of the Fibonacci word.
The preceding arguments are very specific to the Fibonacci word. The reader might wonder if formulas for the square roots of other standard Sturmian words exist. Surely, for some specific words such formulas can be derived, but we believe no general factorization for the square roots of standard Sturmian words can be given. Let us give some arguments supporting our belief.
Let $s = X_1^2 X_2^2 \cdots$ be a standard Sturmian word of slope $\alpha$ factorized as a product of maximal solutions to (8). The word $s$ begins with the word $0^a 1$. Therefore if $a > 1$, then $X_1 = 0^{\lfloor a/2 \rfloor}$. Thus if $a > 1$, then $X_2$ begins with 0 if and only if $a$ is odd. Because of the asymmetry of the letters 0 and 1 in the minimal squares of slope $\alpha$ (1), the parity of the parameter $a$ greatly influences the remaining words $X_i$. Moreover, it is not just the partial quotient $a_1$ which influences the factorization. Suppose for instance that $a_1 = 2$ and $a_2 = 1$. Table 1 shows how the values of the partial quotients $a_3$ and $a_4$ affect the words $X_i$. Each cell of the table tells to which squares of reversed standard words the words $X_1$, $X_2$ and $X_3$ correspond. For example if $a_3 = 2$ and $a_4 = 1$, then the standard Sturmian word of slope $[0; 2, 1, 2, 1, \ldots]$ begins with $s_2^2 s_4^2 s_7^2$. Table 2 tells the first letter of the corresponding word $X_4^2$. As can be observed from Table 2, the first letter of $X_4^2$ varies when $a_3$ and $a_4$ vary. Because of the asymmetry, it is thus expected that slight variation in partial quotients drastically changes the factorization as a product of maximal solutions to (8). Since similar behavior is expected from the rest of the partial quotients, it seems to us that no nice formula (like e.g., the formula of Theorem 7.

             a_3 = 1    a_3 = 2    a_3 = 3
  a_4 = 1    2, 5, 8    2, 4, 7    2, 5, 8
  a_4 = 2    2, 3, 6    2, 4, 7    2, 3, 6
  a_4 = 3    2, 3, 5    2, 4, 7    2, 3, 5

Table 1: How $X_1$, $X_2$, and $X_3$ are affected when $a_3$ and $a_4$ vary in the case that $a_1 = 2$ and $a_2 = 1$.

Table 2: How the first letter of $X_4$ varies when $a_3$ and $a_4$ vary in the case that $a_1 = 2$ and $a_2 = 1$.

As a corollary of this theorem we obtain that the word T 2a (c α ) = ∏ ∞ k=1 s k is a Sturmian word of slope $\alpha$ with intercept ψ({(2a + 1)α}) = aα. We have thus shown that In particular, we obtain the well-known result that the Fibonacci infinite word is a product of the reversed Fibonacci words.

A Curious Family of Subshifts
In this section we construct a family of linearly recurrent and optimal squareful words which are not Sturmian but are fixed points of the (more general) square root map. Moreover, we show that any subshift $\Omega$ generated by such a word has a curious property: for every $w \in \Omega$ either $\sqrt{w} \in \Omega$ or $\sqrt{w}$ is periodic.
It is evident from Proposition 2.2 that Sturmian words are a proper subclass of optimal squareful words. As Sturmian words have the exceptional property that their language is preserved under the square root map, it is natural to ask if other optimal squareful words can have this property. We show that, indeed, such words exist by an explicit construction. The idea behind the construction is to mimic the structure of the Sturmian words $01c_\alpha$ and $10c_\alpha$. The simple reason why these words are fixed points of the square root map (thus preserving the language) is that they have arbitrarily long squares of solutions to (8) as prefixes. Thus to obtain a fixed point of the square root map, it is sufficient to find a sequence $(u_k)$ of solutions to (8) with the property that $u_k^2$ is a proper prefix of $u_{k+1}^2$ for all $k \geq 1$. Let us show how such a sequence can be obtained.
Let $S$ be a fixed primitive solution to (8) in the language of some Sturmian word with slope $[0; a + 1, b + 1, \ldots]$ such that $|S| > |S_6|$. In particular, $S$ has the word $S_6 = 10^{a+1}(10^a)^{b+1}$ as a proper suffix. Recall from the proof of Lemma 5.6 that $|S| \geq |S_5 S_6|$. We denote the word $L(S)$ simply by $L$. Using the word $S$ as a seed solution, we produce a sequence $(\gamma_k)$ of primitive solutions to (8) defined by the recurrence
$\gamma_1 = S, \quad \gamma_{k+1} = L(\gamma_k)\gamma_k^2 \text{ for } k \geq 1.$ (10)
We need to prove that the sequence $(\gamma_k)$ really is a sequence of primitive solutions to (8). Before showing this, let us define in (11) the infinite words $\Gamma_1$ and $\Gamma_2$ as the two limits of the sequence $(\gamma_k)$ taken along indices of fixed parity. The limits exist as $\gamma_k^2$ is always a prefix of $\gamma_{k+2}$. Hence both $\Gamma_1$ and $\Gamma_2$ have arbitrarily long squares of words in the sequence $(\gamma_k)$ as prefixes. Observe also that $\mathcal{L}(\Gamma_1) = \mathcal{L}(\Gamma_2)$.
As there is not much difference between $\Gamma_1$ and $\Gamma_2$ in terms of structure, we set $\Gamma$ to be either of these words.
Taking for granted that the sequence $(\gamma_k)$ is a sequence of solutions to (8), we see that $\sqrt{\Gamma} = \Gamma$. Note that we also need to ensure that the word $\Gamma$ is optimal squareful for the square root map to make sense.
Next we aim to prove the following.
Proposition 8.1. The word $\gamma_k$ is a primitive solution to (8) in $\mathcal{L}(a, b)$ for all $k \geq 1$.
Before we can prove Proposition 8.1, we need to know that the words $\gamma_k$ are primitive and that they are factors of some optimal squareful word with parameters $a$ and $b$.
Lemma 8.2. The word $\gamma_k$ is primitive for all $k \geq 1$.
Proof. We proceed by induction. By definition $\gamma_1$ is primitive. Let $k \geq 1$, and suppose for a contradiction that $\gamma_{k+1}$ is not primitive; that is, $\gamma_{k+1} = L(\gamma_k)\gamma_k^2 = z^n$ for some primitive word $z$ and $n > 1$. If $n = 2$, then obviously $|\gamma_k|$ must be even, and the suffix of $\gamma_k$ of length $|\gamma_k|/2$ must be a prefix of $\gamma_k$. This contradicts the primitivity of $\gamma_k$. The case $n = 3$ would clearly imply that $\gamma_k = L(\gamma_k)$, which is not possible. Hence $n > 3$, and further $|z| < |\gamma_k|$. As $\gamma_k^2$ is a suffix of some word in $z^+$, it follows that $z = uv$ where $vu$ is a suffix of $\gamma_k$. On the other hand, $z$ is a suffix of $\gamma_k$, so $uv = vu$. Since $z$ is primitive, the only option is that $u$ is empty. Therefore $\gamma_k \in z^+$; a contradiction with the primitivity of $\gamma_k$.
Lemma 8.3. We have that $\gamma_k, L(\gamma_k) \in \mathcal{L}(a, b)$ for all $k \geq 1$.
Proof. For a suitable slope $\alpha = [0; a + 1, b + 1, \ldots]$, either of the words $S$ and $L$ is a reversed standard word of slope $\alpha$. Thus by Theorem 5.2 both $S^2$ and $L^2$ are in $\mathcal{L}(\alpha)$, so $S^2, L^2 \in \mathcal{L}(a, b)$.
Proof of Proposition 8.1. We proceed by induction. By Lemma 8.2 the word $\gamma_k$ is primitive for all $k \geq 1$. Lemma 8.3 tells that both of the words $\gamma_k$ and $L(\gamma_k)$ are in $\mathcal{L}(a, b)$ for all $k \geq 1$. By definition both $\gamma_1$ and $L(\gamma_1)$ are solutions to (8). We may thus assume that $k \geq 1$ and both $\gamma_k$ and $L(\gamma_k)$ are solutions to (8). It follows from Lemma 5.6 that $\gamma_k L(\gamma_k) \in \Pi(a, b)$ and $\sqrt{\gamma_k L(\gamma_k)} = \gamma_k$.
Therefore also $L(\gamma_{k+1})$ is a solution to (8). The conclusion follows.
As we remarked earlier, we have now proved that $\Gamma$ is a fixed point of the square root map. Next we show that the word $\Gamma$ is aperiodic, linearly recurrent, and not Sturmian.
Proof. The recurrence (10) and the definition (11) of $\Gamma$ show that for all $k \geq 1$ the word $\Gamma$ is a product of the words $\gamma_{k+1} = L(\gamma_k)\gamma_k^2$ and $L(\gamma_{k+1}) = \gamma_k^3$, and hence a product of the words $\gamma_k$ and $L(\gamma_k)$ such that between two occurrences of $L(\gamma_k)$ in this product there is always $\gamma_k^2$ or $\gamma_k^5$. From this it follows that the return time of a factor of $\Gamma$ of length $|\gamma_k|$ is at most the return time of the factor $L(\gamma_k)$, which is at most $6|\gamma_k|$. Let then $w$ be a factor of $\Gamma$ such that $|\gamma_k| < |w| \leq |\gamma_{k+1}|$. Since $w$ is a factor of some factor of $\Gamma$ of length $|\gamma_{k+1}|$, it follows that the return time of $w$ is at most $6|\gamma_{k+1}|$. Now $6|\gamma_{k+1}| = 18|\gamma_k| < 18|w|$, proving that $\Gamma$ is linearly recurrent.
The preceding shows that $\gamma_k$ is followed in $\mathcal{L}(\Gamma)$ by both $\gamma_k$ and $L(\gamma_k)$. As the first letters of $\gamma_k$ and $L(\gamma_k)$ are distinct, the factor $\gamma_k$ is right special. Thus $\mathcal{L}(\Gamma)$ contains arbitrarily long right special factors, so $\Gamma$ must be aperiodic.
Since linearly recurrent words have linear factor complexity [5, Theorem 24], it follows from Lemma 8.5 that $\Gamma$ has linear factor complexity.
We observed in the previous proof that the word $\Gamma$ is a product of the words $S$ and $L$ such that between two occurrences of $L$ in this product there is always $S^2$ or $S^5$. Since $S$ and $L$ are primitive, any word $w \in \mathcal{L}(\Gamma)$ which is a product of the words $S$ and $L$ such that $|w| \geq 6|S|$ must synchronize to the factorization of $\Gamma$ as a product of the words $S$ and $L$. That is, for any factorization $\Gamma = uw\Gamma'$ we must have that $|u|$ is a multiple of $|S|$.
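The block structure described above can be explored symbolically. The sketch below treats $S$ and $L$ as abstract letters (so it says nothing about the underlying binary words), builds $\gamma_k$ from the recurrence $\gamma_{k+1} = L(\gamma_k)\gamma_k^2$ with $\gamma_1 = S$, and collects the lengths of the runs of $S$ between consecutive letters $L$ in $\gamma_8^2$, a prefix of one of the limit words. The helper names are hypothetical.

```python
def gamma_pairs(n):
    """Return [(gamma_k, L(gamma_k)) for k = 1..n] as words over the letters S and L."""
    g, lg = "S", "L"
    pairs = [(g, lg)]
    for _ in range(n - 1):
        # gamma_{k+1} = L(gamma_k) gamma_k^2 and L(gamma_{k+1}) = gamma_k^3
        g, lg = lg + g + g, g + g + g
        pairs.append((g, lg))
    return pairs


def s_runs_between_l(word):
    """Lengths of the runs of S strictly between consecutive occurrences of L."""
    pieces = word.split("L")
    return {len(piece) for piece in pieces[1:-1]}


gamma_8 = gamma_pairs(8)[-1][0]
print(sorted(s_runs_between_l(gamma_8 + gamma_8)))   # [2, 5]
```

Only runs of length 2 and 5 appear, in line with the observation that between two occurrences of $L$ there is always $S^2$ or $S^5$.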
Theorem 8.6. The word $\Gamma$ is a non-Sturmian, linearly recurrent optimal squareful word which is a fixed point of the square root map.
Proof. The fact that $\Gamma$ is optimal squareful and linearly recurrent follows from Lemmas 8.3 and 8.5. The argument outlined at the beginning of this section shows that $\Gamma$ is a fixed point of the square root map as by Proposition 8.1 the words $\gamma_k$ which occur as square prefixes in $\Gamma$ are solutions to (8). Finally, $\Gamma$ contains the factor $\gamma_2^2$, so $\Gamma$ is not Sturmian by Lemma 8.4.
Denote by $\Omega$ the subshift consisting of the infinite words having language $\mathcal{L}(\Gamma)$. As $\Gamma$ is linearly recurrent, it is uniformly recurrent, so the subshift $\Omega$ is minimal. The rest of this section is devoted to proving the result mentioned in the beginning of this section. This result is very surprising since it is contrary to the plausible hypothesis that an aperiodic word must map to an aperiodic word under the square root map.
It is not difficult to prove Theorem 8.7 for words in $\Omega$ which are products of the words $S$ and $L$. We prove this special case next in Lemma 8.8. However, difficulties arise since a word in $\Omega$ can start in an arbitrary position of an infinite product of $S$ and $L$. There are certain well-behaved positions in $S$ and $L$ which are easier to handle. Theorem 8.7 is proved for these special positions in Lemma 8.10. The rest of the effort is in demonstrating that all the other cases can be reduced to these well-behaved cases. We begin by proving the easier cases, and we conclude with the reductions.

Lemma 8.8. If a word $w \in \Omega$ can be written as a product of the words $S$ and $L$, then $\sqrt{w} \in \Omega$.
Proof. Any word $u$ which is a product of the words $S$ and $L$ can be naturally written as a binary word over the alphabet $\{S, L\}$. If this binary word has even length, then it is a word over the alphabet $A = \{SS, SL, LS, LL\}$. Using the fact that $\sqrt{SS} = S$, $\sqrt{SL} = S$, $\sqrt{LS} = L$, and $\sqrt{LL} = L$ (see Lemma 5.6), we can define a square root for a word over $A$.
The word $\gamma_k^2$ is a prefix of $\Gamma$ for all $k \geq 1$. Thus $\gamma_k$ has occurrences at positions $0$ and $|\gamma_k|$ of $\Gamma$. Clearly $|\gamma_k| = 3^{k-1}|S|$, so the word $\gamma_k$ occurs in $\Gamma$ in an even and in an odd position.
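The square root on words over $A$ is easy to experiment with. The sketch below works purely symbolically over the letters $S$ and $L$: since each of the pairs $SS$, $SL$, $LS$, $LL$ is mapped to its first letter, the square root of an even-length word keeps the letters in even positions. It then checks, for small $k$, that $\gamma_k$ (built from $\gamma_1 = S$ and $\gamma_{k+1} = L(\gamma_k)\gamma_k^2$) has $3^{k-1}$ letters and that the square root of $\gamma_k^2$ is $\gamma_k$. The function names are hypothetical.

```python
def pair_sqrt(word):
    """Square root of an even-length word over {S, L}: SS, SL -> S and LS, LL -> L."""
    if len(word) % 2 != 0:
        raise ValueError("the word must have even length")
    return word[0::2]   # the image of each pair is its first letter


def gamma(k):
    """gamma_k over {S, L}, with gamma_1 = S and gamma_{k+1} = L(gamma_k) gamma_k^2."""
    g, lg = "S", "L"
    for _ in range(k - 1):
        g, lg = lg + g + g, g + g + g
    return g


for k in range(1, 8):
    g = gamma(k)
    assert len(g) == 3 ** (k - 1)     # |gamma_k| = 3^(k-1) |S|, counted in blocks
    assert pair_sqrt(g + g) == g      # gamma_k^2 is mapped back to gamma_k

print(pair_sqrt("LSSLSS"))            # LSS
```

The fact that $|\gamma_k|$ is an odd multiple of $|S|$ is what makes the second copy of $\gamma_k$ in $\gamma_k^2$ start at an odd block position.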
Let $v$ be a prefix of $w$ of length $|v| = 2n|S|$ for some $n \geq 1$, so $v$ is a word over $A$.
Since $v$ is a prefix of $w$, the word $v$ is a factor of some $\gamma_k$. Since $\gamma_k$ occurs in $\Gamma$ in an even and in an odd position, the word $v$ occurs in an even position in $\Gamma$. Hence $\Gamma$ can be factored as $\Gamma = zvt$ where $z$ and $t$ are finite or infinite words over $A$. Since $\Gamma$ is a fixed point of the square root map, we have
Definition 8.9. Let $w$ be a word and $\ell$ be an integer such that $0 < \ell < |w|$. If the factor of $w^3$ of length $|w^2|$ starting at position $\ell$ can be written as a product of minimal squares $X_1^2, \ldots, X_n^2$, then we say that the position $\ell$ of $w$ is repetitive. If in addition $|X_1^2 \cdots X_m^2| \neq |w| - \ell, |w^2| - \ell$ for all $m$ such that $1 \leq m \leq n$, then we say that the position $\ell$ is nicely repetitive.
For example if $a = 1$, $b = 0$, and $S = 1001001010010$, then the position $1$ of $S$ is repetitive as the factor $00100101001010010010100101$ of $S^3$ of length $|S^2| = 26$ starting at position $1$ is in $\Pi(a, b)$. This position is not nicely repetitive as $|0^2 \cdot (10010)^2| = 12 = |S| - 1$. The position $2$ of $S$, however, can be checked to be nicely repetitive. The position $4$ of $S$ is not repetitive as the factor $00101001010010010100101001$ of length $26$ starting at position $4$ is not in $\Pi(a, b)$.
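The three positions in this example can be checked by factorizing greedily from left to right. The sketch below assumes that the six minimal square roots of (1) have the form $S_1 = 0$, $S_2 = 010^{a-1}$, $S_3 = 010^a$, $S_4 = 10^a$, $S_5 = 10^{a+1}(10^a)^b$, $S_6 = 10^{a+1}(10^a)^{b+1}$; this is consistent with the values $S_5 = 100$ and $S_6 = 10010$ quoted for $a = 1$, $b = 0$, but (1) itself is not restated in this section. The function names are hypothetical.

```python
def minimal_square_roots(a, b):
    """S_1, ..., S_6 for parameters a >= 1, b >= 0 (assumed form of (1))."""
    return [
        "0",
        "01" + "0" * (a - 1),
        "01" + "0" * a,
        "1" + "0" * a,
        "1" + "0" * (a + 1) + ("1" + "0" * a) * b,
        "1" + "0" * (a + 1) + ("1" + "0" * a) * (b + 1),
    ]


def factorize(word, roots):
    """Greedy factorization into minimal squares; returns the roots or None."""
    result, pos = [], 0
    while pos < len(word):
        for x in roots:
            if word.startswith(x + x, pos):
                result.append(x)
                pos += 2 * len(x)
                break
        else:
            return None   # no minimal square starts at this position
    return result


def classify(w, pos, roots):
    """Classify position pos of w in the sense of Definition 8.9."""
    xs = factorize((w * 3)[pos:pos + 2 * len(w)], roots)
    if xs is None:
        return "not repetitive"
    total, partial = 0, set()
    for x in xs:
        total += 2 * len(x)
        partial.add(total)          # |X_1^2 ... X_m^2| for m = 1, ..., n
    forbidden = {len(w) - pos, 2 * len(w) - pos}
    return "repetitive, not nicely" if partial & forbidden else "nicely repetitive"


roots = minimal_square_roots(1, 0)
S = "1001001010010"
for pos in (1, 2, 4):
    print(pos, classify(S, pos, roots))
```

The greedy strategy is enough because no minimal square is a proper prefix of another, so at most one minimal square can start at any position. With the assumed roots, position 1 is reported as repetitive but not nicely repetitive, position 2 as nicely repetitive, and position 4 as not repetitive.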
In the upcoming proof of Theorem 8.7 we will show that if $w \in \Omega$ is a product of the words $S$ and $L$ and $\ell$ is a nicely repetitive position of $S$, then the word $\sqrt{T^\ell(w)}$ is always periodic. On the other hand, we show that if $\ell$ is not a nicely repetitive position then $\sqrt{T^\ell(w)}$ is always in $\Omega$.
Next we identify some good positions in the suffix $S_6$ of $S$. As we observed in the proof of Lemma 5.6, the suffix $S_6$ of $S$ restricts locally how a factorization of a word as a product of minimal squares continues after an occurrence of $S_6$. Consider a product $X_1^2 \cdots X_n^2$ of minimal squares which has an occurrence of $S_6$ at position $\ell$. Then for some $m \in \{1, \ldots, n\}$ the minimal square
A consequence of the definitions is that if $\ell$ is a position of $S$ such that $\ell \notin B_S$, then there exists $\ell' \in B_S \cup \{|S|\}$ such that $S[\ell, \ell' - 1] \in \Pi(a, b)$. This fact is used later several times.
Lemma 8.10. Suppose that $w \in \Omega$ can be written as a product of the words $S$ and $L$. Assume that the position $\ell \in B_S$ is nicely repetitive. Let the prefix of $T^\ell(w)$ of length $|S^2|$ be factorized as a product of minimal squares $X_1^2 \cdots X_n^2$. Then the word $\sqrt{T^\ell(w)}$ is periodic with minimal period $X_1 \cdots X_n$. Moreover, $X_1 \cdots X_n$ is conjugate to $S$.
Proof Sketch. As $\ell$ is repetitive, the factor $u$ of length $|S^2|$ of $S^3$ starting at position $\ell$ is in $\Pi(a, b)$. If we substitute the middle $S$ in $S^3$ with $L$, then an application of Lemma 5.5 shows that the factor of length $|S^2|$ of $SLS$ starting at position $\ell$ is still in $\Pi(a, b)$ and that the square root of this factor coincides with the square root of $u$ (here we need that $\ell \in B_S$). Further analysis shows that if we substitute the words $S$ in $S^3$ in any way, then the square root of the factor of length $|S^2|$ beginning at position $\ell$ is unaffected. Since $\ell$ is repetitive, the prefix of $T^{\ell+|S^2|}(w)$ of length $|S^2|$ is again in $\Pi(a, b)$ and has the same square root, and so on. Thus $\sqrt{T^\ell(w)}$ is periodic. Since the square of the period and $S^2$ both occur in a suitable Sturmian word and have equal lengths, they must be conjugate by Proposition 2.6.
Proof. We have that $|S| \geq |S_5 S_6|$, so $\ell > 1$. Let $u$ be the suffix of $S$ of length $|S| - \ell$. Since $\ell$ is repetitive, the factor $v$ of $S^3$ of length $|S^2|$ starting at position $\ell$ can be factorized as a product of minimal squares $Y_1^2 \cdots Y_m^2$. We have that $|Y_1^2| > |u|$ because $\ell \in B_S$.
Next we consider how the situation changes if any of the words $S$ in $S^3$ is substituted with $L$. Substituting the first $S$ with $L$ does not affect the product as $\ell > 1$. Suppose then that the second word $S$ is substituted with $L$. By applying Lemma 5.5 to the words $u$ and $S$ with $X = Y_1$, we see that the factor of length $|S^2|$ of $SLS$ starting at position $\ell$ can still be factorized as a product of minimal squares and that the square root of this factor coincides with the square root of $v$. Consider next what happens when the third word $S$ is substituted with $L$. Let Since $\ell$ is nicely repetitive, we have that $\ell' < |S|$. By the maximality of $r$ and the definition of the set $B_S$, we thus have that $\ell' \in B_S$. Applying Lemma 5.5 to the suffix of $S$ of length $|S| - \ell'$ and $S$ with $X = Y_{r+1}$ we obtain, like above, that the product of minimal squares is affected but the square root is not. Substituting the second and third words $S$ with $L$ gives the same result: first proceed as above and substitute the second word $S$ and then make the second substitution like above but apply Lemma 5.5 for the word $L$ instead of $S$.
We have concluded that however we substitute the words $S$ in $S^3$, the square root of the factor of length $|S^2|$ beginning at position $\ell$ never changes. The word $w$ is obtained from the word $S^\omega$ by substituting some of the words $S$ with $L$. By the preceding, the prefix of $T^\ell(w)$ of length $|S^2|$ can be factorized as a product of minimal squares $X_1^2 \cdots X_n^2$. Since $\ell$ is repetitive, the prefix of $T^{\ell+|S^2|}(w)$ of length $|S^2|$ can also be factorized as a product of some minimal squares (perhaps different) but the square root still equals $X_1 \cdots X_n$. By repeating this observation we see that $\sqrt{T^\ell(w)} = (X_1 \cdots X_n)^\omega$. Let $\beta = [0; b_1, b_2, \ldots]$ be a number such that $a_i = b_i$ for $1 \leq i \leq k$ and $b_{k+1} \geq 5$. Then by the definition of standard words $S^5 \in \mathcal{L}(\beta)$. By the preceding, the prefix of $T^\ell(S^5)$ of length $|S^4|$ can be written as a product of minimal squares, and the square root of these minimal squares equals $(X_1 \cdots X_n)^2$. Since the square root of a Sturmian word of slope $\beta$ is a Sturmian word of slope $\beta$, we have that $(X_1 \cdots X_n)^2 \in \mathcal{L}(\beta)$. As $|X_1 \cdots X_n| = |S|$, it follows by Proposition 2.6 that $X_1 \cdots X_n$ is conjugate to $S$. Since $S$ is primitive, so is $X_1 \cdots X_n$, and hence the period $X_1 \cdots X_n$ is minimal.
Lemma 8.11. Every seed solution $S$ has at least one nicely repetitive position $\ell$ such that $\ell \in B_S$.
Proof. Suppose that $S = \widetilde{s}_{k,i}$ for some $k \geq 3$ and $0 < i \leq a_k$. It is sufficient to show that $r = |\widetilde{s}_{k,i-1}|$ is a nicely repetitive position of $S$. If $r \notin B_S$, then there exists $r' \in B_S$ such that $S[r, r' - 1] \in \Pi(a, b)$. Since the position $r$ is nicely repetitive, so must $r'$ be. If $S = L(\widetilde{s}_{k,i})$, then as $r > 1$, an application of Lemma 5.5 shows that the conclusion holds also in this case.
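Observe that the word $\widetilde{s}_{k,i-1}$ is both a prefix and a suffix of $S$. Using the fact that $\widetilde{s}_{k-2}\widetilde{s}_{k-3} = L(\widetilde{s}_{k-3}\widetilde{s}_{k-2})$ we obtain that $\widetilde{s}_{k-1} L(\widetilde{s}_{k-1}) \in \Pi(a, b)$. Since $\widetilde{s}_{k,i-1}$ is a solution to (8), we have that $\widetilde{s}_{k,i-1}^2 \in \Pi(a, b)$. Overall, the factor $\widetilde{s}_{k-1} L(\widetilde{s}_{k-1}) \widetilde{s}_{k,i-1}^2$ of $S^3$ of length $|S^2|$ starting at position $r$ is in $\Pi(a, b)$. Thus the position $r$ of $S$ is repetitive.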
Suppose for a contradiction that the suffix of $S$ of length $|S| - r$ is in $\Pi(a, b)$, that is, $S = \widetilde{s}_{k,i-1} X_1^2 \cdots X_n^2$ for some minimal square roots $X_j$. It follows that $\widetilde{s}_{k-1} = X_1^2 \cdots X_n^2$. Since $\widetilde{s}_{k-1}$ is a solution to (8), it follows that $\widetilde{s}_{k-1} = (X_1 \cdots X_n)^2$. This contradicts the primitivity of $\widetilde{s}_{k-1}$. Similarly if the suffix of $S^2$ of length $|S^2| - r$ is in $\Pi(a, b)$, then $\widetilde{s}_{k,i-1} \in \Pi(a, b)$ contradicting the primitivity of $\widetilde{s}_{k,i-1}$. We conclude that the position $r$ is nicely repetitive.
Lemma 8.10 and Lemma 8.11 now imply the following:
Corollary 8.12. There exist uncountably many linearly recurrent optimal squareful words having (purely) periodic square root.
Proof. We only need to show that there are uncountably many such words. Consider the words in Ω which can be written as a product of the words S and L. Viewed over the binary alphabet {S, L}, these words form an infinite subshift Ω. Let us show that Ω is minimal. Then the conclusion follows by well-known arguments from topology: a minimal subshift is always finite or uncountable and an aperiodic subshift cannot be finite (use the fact that a perfect set is always uncountable).
Let w ∈ Ω (we use the notation of the proof of Lemma 8.8). Let u ∈ L(w) be a factor such that |u| ≥ 6. As |u| ≥ 6|S|, every occurrence of u in Γ must synchronize to the factorization of Γ as a product of S and L. It follows that every return to u in Γ is a product of S and L. Since the return time of u is finite in Γ, the return time of the word u in w is also finite. Hence Ω is minimal.
We also prove the following weaker result, which we need later. Proof. We prove first by induction that the prefix of the word S 6 s 2 k,ℓ of length 2| s k,ℓ | − |S 6 | is a product of minimal squares for k ≥ 2 and ℓ such that 0 < ℓ ≤ a k . Let us first establish the base cases.
In addition, for 0 < ℓ ≤ a 3 , we have that The case ℓ = 1 is clear. So let us assume that ℓ > 1. We have that Now s 1 s 0 s b 1 = L( s 2 ). The word L( s 2 ) is a solution to (8), so the conclusion follows as ℓ − 1 is even.
Suppose next that ℓ − 1 is odd. We need to show that s 2 s 1 s ℓ−2 2 s 0 s b 1 ∈ Π(a, b). Using the facts s 1 s 2 = L( s 2 ) s 1 and s 1 s 0 s b 1 = L( s 2 ) we obtain that s 2 s 1 s ℓ−2 2 s 0 s b 1 = s 2 L( s 2 ) ℓ−1 . By Lemma 5.6 the word s 2 L( s 2 ) is a product of minimal squares. Since ℓ − 1 is odd and L( s 2 ) is a solution to (8), the conclusion follows.
We have thus proved that the prefix of the word S 6 s 2 k,ℓ of length 2| s k,ℓ | − |S 6 | is a product of minimal squares for k ≥ 2 and ℓ such that 0 < ℓ ≤ a k . Now if S = s k,ℓ for some k ≥ 2 and ℓ such that 0 < ℓ ≤ a k , then the claim is clear by the above. Suppose that S = L( s k,ℓ ). Now if S 6 s k,ℓ / ∈ Π(a, b), then two applications of Lemma 5.5 (first with u = S 6 , v = S 2 and then with u = S 6 L, v = S) show that the claim holds. Assume that S 6 s k,ℓ ∈ Π(a, b). Since the prefix of S 6 s k,ℓ of length 2| s k,ℓ | − |S 6 | is in Π(a, b), this means that the prefix of s k,ℓ of length | s k,ℓ | − |S 6 | is in Π(a, b). It is sufficient to show that the prefixes of s k,ℓ and L( s k,ℓ ) of length 2| s 2 | are in Π(a, b). Since s 1 s 2 = L( s 2 ) s 1 , the word s 4,1 = s 2 s 3 has s 2 L( s 2 ) as a prefix. If a 3 > 1, then the word s 3 = s 1 s a 3 2 has L( s 2 ) s 2 as a prefix. Finally if a 3 = 1, then the word s 5,1 = s 3 s 4 = s 1 s 2 s 4 has L( s 2 ) 2 as a prefix. Lemma 5.6 shows that s 2 L( s 2 ), L( s 2 ) s 2 , and L( s 2 ) 2 are all in Π(a, b). The conclusion follows.
Since none of the minimal squares can be a proper prefix of another minimal square, it is easy to factorize words as products of minimal squares from left to right. Next we consider what happens if we start to backtrack from a given position to the left.
Lemma 8.14 (Backtracking Lemma). Let $X, Y_1, \ldots, Y_n$ be minimal square roots. Let $w$ be a word having both of the words $X^2$ and $Y_1^2 \cdots Y_n^2$ as suffixes. If $|X| > |Y_n|$, then $|X| > |Y_1 \cdots Y_n|$ and the word $Y_1 \cdots Y_n$ is a suffix of $X$.
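The statement of the Backtracking Lemma lends itself to a finite sanity check. The sketch below is an illustration only: it assumes the form $S_1 = 0$, $S_2 = 010^{a-1}$, $S_3 = 010^a$, $S_4 = 10^a$, $S_5 = 10^{a+1}(10^a)^b$, $S_6 = 10^{a+1}(10^a)^{b+1}$ for the minimal square roots of (1), and it inspects only those products $Y_1^2 \cdots Y_n^2$ that occur as proper suffixes of $X^2$ itself. The function names are hypothetical.

```python
def minimal_square_roots(a, b):
    """S_1, ..., S_6 for parameters a >= 1, b >= 0 (assumed form of (1))."""
    return [
        "0",
        "01" + "0" * (a - 1),
        "01" + "0" * a,
        "1" + "0" * a,
        "1" + "0" * (a + 1) + ("1" + "0" * a) * b,
        "1" + "0" * (a + 1) + ("1" + "0" * a) * (b + 1),
    ]


def factorize(word, roots):
    """Greedy factorization into minimal squares; returns the roots or None."""
    result, pos = [], 0
    while pos < len(word):
        for x in roots:
            if word.startswith(x + x, pos):
                result.append(x)
                pos += 2 * len(x)
                break
        else:
            return None
    return result


def backtracking_holds(a, b):
    """Check the lemma's conclusion for all products that are proper suffixes of X^2."""
    roots = minimal_square_roots(a, b)
    for x in roots:
        square = x + x
        for start in range(1, len(square)):
            ys = factorize(square[start:], roots)
            if ys is None or len(x) <= len(ys[-1]):
                continue                      # not a product, or hypothesis |X| > |Y_n| fails
            product = "".join(ys)             # Y_1 ... Y_n
            if not (len(product) < len(x) and x.endswith(product)):
                return False
    return True


print(all(backtracking_holds(a, b) for a in range(1, 4) for b in range(0, 4)))
```

For instance, with $a = b = 1$ and $X = S_6 = 1001010$ the suffix $0101001010$ of $X^2$ factorizes as $(01)^2 \cdot 0^2 \cdot (10)^2$, and the corresponding word $01 \cdot 0 \cdot 10 = 01010$ is indeed a suffix of $X$.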
Proof. Suppose that $|X| > |Y_n|$. We may assume that $n$ is as large as possible. We prove the lemma by considering different options for the word $X$.
Clearly we cannot have that $X = S_1$. Let $X = S_4$. Now $X^2$ can have a proper minimal square suffix only if $a > 1$. If $a$ is even, then we must have that $X^2 = 10^a 1 (S_1^2)^{a/2}$ and $Y_{n-a/2+1} = \ldots = Y_n = S_1$.
Again there is no choice for $Y_{n-(a-1)/2}$, and the conclusion holds. Similar considerations show that the conclusion holds if $X \in \{S_2, S_3\}$. Let then $X = S_5$. It is obvious that now $Y_n \in \{S_1, S_3, S_4\}$. If $Y_n = S_1$ or $b = 0$, then like above $Y_1 = \ldots = Y_n = S_1$ and $Y_1 \cdots Y_n$ is a suffix of $X$. We may thus suppose that $b > 0$. Say $Y_n = S_3$. Then we must have $b = 1$ and $X^2 = 10^{a+1} 10^{a-1} Y_n^2$. Like above, the remaining minimal square roots $Y_i$ with $i < n$ must equal $S_1$ and there must be $\lfloor (a - 1)/2 \rfloor$ of them. Since there is no further choice, the conclusion holds as clearly $Y_1 \cdots Y_n$ is a suffix of $X$. Suppose then that $b > 1$. The next case is $Y_n = S_4$. Assume first that $b$ is even. Then it is straightforward to see that necessarily $Y_{n-b/2+1} = \ldots = Y_n = S_4$ and $X^2 = 10^{a+1}(10^a)^b 10^{a+1} (S_4^2)^{b/2}$.
To aid comprehension we have separated different parts of the proof as distinct claims with their own proofs. Any new definitions and assumptions given in one of the subproofs are valid only up to the end of the subproof.
In both of these cases we deduce with the help of Lemma 8.10 that √ w ∈ Ω ∆ .
There are other interesting related questions. Consider the limit set We know very little about the limit set except in the Sturmian case when it contains the two fixed points 01c α and 10c α . For the word Γ of Section 8 we proved that the limit set contains at least two fixed points. We ask:

Question. When is the limit set nonempty? If it is nonempty, does it always contain fixed points? Can it contain points which are not fixed points?
It is a genuine possibility that the limit set is empty. Consider for instance the word $\zeta = \tau(\sigma^\omega(6))$, the morphic image of the fixed point of the morphism $\sigma\colon 6 \mapsto 656556,\ 5 \mapsto 5$ under $\tau\colon 6 \mapsto S_6^2,\ 5 \mapsto S_5^2$ where $S_5 = 100$ and $S_6 = 10010$ are minimal square roots of slope $\alpha = [0; 2, 1, \ldots]$. It is straightforward to verify that $\zeta$ is optimal squareful and uniformly recurrent and that the returns to the factor $101$ in $\mathcal{L}(\zeta)$ are $10100$, $101(001)^2 00$ and $101(001)^4 00$. By considering all possible occurrences of the factor $w = \tau(56565) \in \mathcal{L}(\zeta)$ in any product of minimal squares of slope $\alpha$, it can be shown that the square root of the product always contains a return to the factor $101$ which is not in $\mathcal{L}(\zeta)$. Since the factor $w$ occurs in every point in the subshift $\Omega_\zeta$ generated by $\zeta$, we conclude that $\sqrt{\Omega_\zeta} \cap \Omega_\zeta = \emptyset$.
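The claim about the returns to $101$ can be checked on a finite prefix of $\zeta$. The sketch below iterates $\sigma$ a few times starting from the letter $6$, applies $\tau$ with $S_5 = 100$ and $S_6 = 10010$, and collects the words read between consecutive occurrences of $101$. The function names and the choice of four iterations are illustrative assumptions; a finite prefix can of course only exhibit returns, not prove that no others exist.

```python
def sigma_power(n):
    """Apply sigma: 6 -> 656556, 5 -> 5 to the letter 6, n times."""
    word = "6"
    for _ in range(n):
        word = "".join("656556" if c == "6" else "5" for c in word)
    return word


def tau(word, s5="100", s6="10010"):
    """Apply tau: 6 -> S_6^2, 5 -> S_5^2."""
    return "".join(s6 * 2 if c == "6" else s5 * 2 for c in word)


def return_words(text, factor):
    """Words read from one occurrence of factor up to the start of the next one."""
    starts = [i for i in range(len(text) - len(factor) + 1)
              if text.startswith(factor, i)]
    return {text[i:j] for i, j in zip(starts, starts[1:])}


zeta_prefix = tau(sigma_power(4))
print(sorted(return_words(zeta_prefix, "101"), key=len))
# ['10100', '10100100100', '10100100100100100']
```

The three words printed are $10100$, $101(001)^2 00$ and $101(001)^4 00$, matching the list above.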
In Section 8 we constructed infinite families of primitive solutions to (8) using the recurrence $\gamma_{k+1} = L(\gamma_k)\gamma_k^2$. This construction works because the seed solution $S$ and the word $L = L(S)$ satisfy $\sqrt{SS} = S$, $\sqrt{SL} = S$, $\sqrt{LS} = L$, and $\sqrt{LL} = L$, that is, $\sqrt{(LSS)^2} = \sqrt{LS \cdot SL \cdot SS} = LSS$. Similarly $\sqrt{(SLLLL)^2} = SLLLL$, so substituting for example $S = 01010010$ we obtain the primitive solution $S_2 S_1 S_4 S_3 S_5 S_4 S_3 S_5 S_6 S_5 S_4 S_3 S_5 S_4 S_3 = 0101001010010010100100101001001010010010$ to (8) in $\mathcal{L}(1, 0)$. More solutions can be obtained with analogous constructions. Restricting to the languages of optimal squareful words, we ask:
Question. What are the primitive solutions $w$ of (8) in $\mathcal{L}(a, b)$ such that $w$ or $w^2$ is not Sturmian and $w$ is not obtainable by the above construction?
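The explicit solution above can be verified directly. The sketch below assumes that for $a = 1$, $b = 0$ the minimal square roots are $S_1 = 0$, $S_2 = 01$, $S_3 = 010$, $S_4 = 10$, $S_5 = 100$, $S_6 = 10010$, and that $L$ exchanges the first two letters of a word; it factorizes $w^2$ for $w = SLLLL$ greedily into minimal squares and compares the square root with $w$. The function names are hypothetical.

```python
ROOTS = ["0", "01", "010", "10", "100", "10010"]   # assumed S_1..S_6 for a = 1, b = 0


def swap_first_two(w):
    """The map L: exchange the first two letters of w."""
    return w[1] + w[0] + w[2:]


def sqrt_roots(word, roots=ROOTS):
    """Greedy factorization of word into minimal squares; returns the roots or None."""
    result, pos = [], 0
    while pos < len(word):
        for x in roots:
            if word.startswith(x + x, pos):
                result.append(x)
                pos += 2 * len(x)
                break
        else:
            return None
    return result


S = "01010010"
L = swap_first_two(S)              # 10010010
w = S + L * 4                      # the word SLLLL of length 40
xs = sqrt_roots(w + w)
print("".join(xs) == w)                       # True: the square root of w^2 is w
print([ROOTS.index(x) + 1 for x in xs])       # indices i of the roots S_i, in order
```

The printed indices are 2, 1, 4, 3, 5, 4, 3, 5, 6, 5, 4, 3, 5, 4, 3, in agreement with the factorization displayed above. The same check applied to $w = LSS$ also returns $w$.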