On the proportion of prefix codes in the set of three-element codes

Let $L$ be a finite sequence of natural numbers. In Woryna (2017, 2018), we derived some interesting properties for the ratio $\rho_{n,L}=|PR_n(L)|/|UD_n(L)|$, where $UD_n(L)$ denotes the set of all codes over an $n$-letter alphabet with length distribution $L$, and $PR_n(L)\subseteq UD_n(L)$ is the corresponding subset of prefix codes. In the present paper, we study the case when the length distributions are three-element sequences. We show that in this case the ratio $\rho_{n,L}$ is always greater than $\alpha_n$, where $\alpha_n=(n-2)/n$ for $n>2$ and $\alpha_2=1/6$. Moreover, the number $\alpha_n$ is the best possible lower bound for this ratio, as for the length distributions of the form $L=(1,1,c)$ and $L=(1,2,c)$ the ratios asymptotically approach $\alpha_n$. Namely, if $L=(1,1,c)$, then $\rho_{n,L}$ tends to $(n-2)/n$ as $c\to\infty$, and, if $L=(1,2,c)$, then $\rho_{2,L}$ tends to $1/6$ as $c\to\infty$.


Motivation and the results
A code over a finite alphabet $X$ is a finite sequence $C=(v_1,\ldots,v_m)$ of words over $X$ (so-called code-words) such that every $w\in X^*$ has at most one factorization into the code-words, i.e. if $w=v_{i_1}\cdots v_{i_l}=v_{j_1}\cdots v_{j_{l'}}$ for some $l,l'\ge 1$ and $1\le i_t,j_{t'}\le m$ ($1\le t\le l$, $1\le t'\le l'$), then $l'=l$ and $i_t=j_t$ for every $1\le t\le l$. Note that our definition differs slightly from the more usual one, where codes are considered as sets of words rather than sequences (see also [1,2]). A code $C=(v_1,\ldots,v_m)$ is called a prefix code if it satisfies the following condition: for all $1\le i,j\le m$, the code-word $v_i$ is a prefix (initial segment) of the code-word $v_j$ if and only if $i=j$.
For every natural number $n\ge 2$ and every finite sequence $L=(a_1,\ldots,a_m)$ of natural numbers, we consider the set $UD_n(L)$ of all codes over an $n$-letter alphabet $X$ with length distribution $L$, i.e. a code $C=(v_1,\ldots,v_m)$ over $X$ belongs to $UD_n(L)$ if and only if $|v_i|=a_i$ for every $1\le i\le m$. We denote by $PR_n(L)$ the corresponding subset of prefix codes in the set $UD_n(L)$.
The prefix codes form the most useful and important class of codes. Therefore it is natural to ask about the contribution of these types of codes in various classes of codes. For given $L$ and $n\ge 2$ both the sets $UD_n(L)$ and $PR_n(L)$ are finite, and such a contribution may be defined as the ratio $\rho_{n,L}=|PR_n(L)|/|UD_n(L)|$.
• $\sum_{i=1}^{m} n^{-a_i}\le 1$ (the so-called Kraft inequality).
If $L$ is constant, then the code-words of every code $C=(v_1,\ldots,v_m)$ in $UD_n(L)$ have the same length. In particular, for all $1\le i,j\le m$, the code-word $v_i$ is a prefix of the code-word $v_j$ if and only if $v_i=v_j$, which implies that $C$ is also a prefix code. Thus, if $L$ is constant, the equality $UD_n(L)=PR_n(L)$ holds. It follows that for every $n\ge 2$ and $m\ge 1$, there is a sequence $L$ of length $|L|=m$ such that the ratio $\rho_{n,L}$ is equal to 1. On the other hand, as we showed in [16], the following theorem holds: where $a$ and $b$ are two arbitrary different values of $L$ and $r_a$ (resp. $r_b$) is the number of those elements in $L$ which are equal to $a$ (resp. to $b$).
Obviously, for every $n\ge 2$ and $m\ge 1$ there are infinitely many sequences $L$ of length $m$ such that the sets $UD_n(L)$ and $PR_n(L)$ are non-empty. Hence, given $n$ and $m$, it is a non-trivial question whether $\rho_{n,L}$ can be arbitrarily close to zero (depending on $L$). In the paper [17], we answered this question in the negative by the following result:
Theorem 2. If $n\ge 2$ and $m\ge 1$, then for every sequence $L$ of length $m$ such that $UD_n(L)\neq\emptyset$, the following inequality holds where $(m)_{n-1}$ is the remainder from the division of $m$ by $n-1$ and $q_{n,m}:=1$, $n\ge m$,
By the above theorem, we see that for every $n\ge 2$ and $m\ge 1$ the infimum of $\rho_{n,L}$ taken over all sequences $L$ of length $m$ such that $UD_n(L)\neq\emptyset$ is greater than zero (note that this infimum depends only on $n$ and $m$). In the paper [17], we derived various interesting properties for this infimum. For example, we showed there that for every $n\ge 2$, it tends to 0 when $m\to\infty$, and for every $m\ge 1$, it tends to 1 when $n\to\infty$.
In [17], we also derived for all $a,b\ge 1$ the equality which we apply in the proof of the following The formula (2) follows from a nice characterization of two-element codes. Namely, a sequence $C=(w,v)$ is a code if and only if $vw\neq wv$, or, equivalently, the words $w,v$ are not powers of the same word (see also [1] or [3]). The situation is much more complicated in the case of codes of length three, as a full characterization of such codes is not known (for some partial results see [6,11,12]). Nevertheless, the properties of three-element codes have been studied over the years ([4,6,7,8,9,10]) and some problems still remain open for these codes (see [5,10,14]).
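The characterization of two-element codes by non-commutation can be checked by exhaustive enumeration for small parameters. The Python sketch below is an illustration only: the function name `count_two_element_codes` is hypothetical, and the closed form $n^{a+b}-n^{\gcd(a,b)}$ used in the comparison is the standard count of non-commuting pairs, stated here as an assumption rather than quoted from the text.

```python
from itertools import product
from math import gcd


def count_two_element_codes(n, a, b):
    """Count pairs (w, v) with |w| = a, |v| = b over n letters and wv != vw."""
    letters = range(n)
    return sum(
        1
        for w in product(letters, repeat=a)
        for v in product(letters, repeat=b)
        if w + v != v + w  # tuples concatenate like words
    )


if __name__ == "__main__":
    for n, a, b in [(2, 1, 2), (2, 2, 4), (3, 2, 3)]:
        # Compare the enumeration with the assumed closed form.
        print(n, a, b, count_two_element_codes(n, a, b), n**(a + b) - n**gcd(a, b))
```

The enumeration grows like $n^{a+b}$, so it is only practical for short words.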
For every $n\ge 2$ let us define the number $\alpha_n$ as follows: $\alpha_n=(n-2)/n$ for $n>2$ and $\alpha_2=1/6$. As the main results of the present paper, we prove the following theorems.
Theorem 4. If $n\ge 2$ and $L$ is an arbitrary sequence of length three such that $UD_n(L)\neq\emptyset$, then $\rho_{n,L}>\alpha_n$.
As a direct consequence of Theorems 4-6, we obtain the following
Corollary 1. For every $n\ge 2$ the infimum (1) taken over all sequences $L$ of length $|L|=3$ for which $UD_n(L)\neq\emptyset$ is equal to $\alpha_n$.
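For small parameters, the ratio $\rho_{n,L}$ can be explored by exhaustive enumeration. The following Python sketch is not part of the original argument: the names `is_code`, `is_prefix_code` and `rho` are hypothetical, the unique-decodability test is a plain Sardinas-Patterson check adapted to sequences (a repeated word is rejected), and the digit alphabet restricts it to $n\le 10$.

```python
from fractions import Fraction
from itertools import product


def is_code(words):
    """Sardinas-Patterson test for a sequence of non-empty words."""
    if len(set(words)) < len(words):
        return False  # a repeated word has two factorizations as a sequence
    code = set(words)
    # Dangling suffixes left when one code-word is a proper prefix of another.
    current = {v[len(u):] for u in code for v in code
               if u != v and v.startswith(u)}
    seen = set()
    while current:
        if current & code:
            return False  # a dangling suffix is itself a code-word
        seen |= current
        nxt = set()
        for s in current:
            for u in code:
                if u != s and u.startswith(s):
                    nxt.add(u[len(s):])
                if u != s and s.startswith(u):
                    nxt.add(s[len(u):])
        current = nxt - seen  # only finitely many suffixes, so this terminates
    return True


def is_prefix_code(words):
    """v_i is a prefix of v_j only when i == j."""
    return all(i == j or not v.startswith(u)
               for i, u in enumerate(words) for j, v in enumerate(words))


def rho(n, L):
    """Brute-force |PR_n(L)| / |UD_n(L)| over the alphabet of the first n digits."""
    alphabet = "0123456789"[:n]
    pools = [["".join(t) for t in product(alphabet, repeat=a)] for a in L]
    ud = pr = 0
    for words in product(*pools):
        if is_code(words):
            ud += 1
            pr += is_prefix_code(words)
    if ud == 0:
        raise ValueError("UD_n(L) is empty")
    return Fraction(pr, ud)


if __name__ == "__main__":
    print(rho(2, (1, 2, 2)))  # 8 codes, 4 of them prefix codes
    print(rho(3, (1, 1, 2)))
```

The cost is $n^{a_1+\cdots+a_m}$ Sardinas-Patterson runs, so this is meant for experiments with very short length distributions only.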

The proof of Theorems 4-5
In this section, we give the proof of Theorems 4-5. To this aim, we first need the following two propositions.
Proposition 1. If $n\ge 2$ and $L=(a,b,c)$ is an arbitrary three-element sequence such that $PR_n(L)\neq\emptyset$, then
Proof (of Proposition 1). Since $|PR_n(L')|=|PR_n(L)|$ for every sequence $L'$ obtained from $L$ by permuting the elements, we can assume that $a\le b\le c$. An arbitrary prefix code $(w,v,u)\in PR_n(L)$ can be constructed as follows. The word $w$ can be freely chosen among the $n^a$ words of length $a$. The word $v$ can be freely chosen among the words of length $b$ which do not have $w$ as a prefix. The number of these words is equal to $n^b-n^{b-a}$. Finally, the word $u$ can be freely chosen among the words of length $c$ which do not have any of the words $w,v$ as a prefix. Since the number of words of length $c$ which have one of the words $w,v$ as a prefix is equal to $n^{c-a}+n^{c-b}$, we can choose $u$ among the $n^c-n^{c-a}-n^{c-b}$ available words. In consequence, we obtain $|PR_n(L)|=n^a\,(n^b-n^{b-a})\,(n^c-n^{c-a}-n^{c-b})$, and the desired formula follows by multiplying the above brackets.
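The counting argument in the proof of Proposition 1 can be compared with a direct enumeration. The Python sketch below uses hypothetical function names; `prefix_count_formula` evaluates the product of the three factors from the proof (after sorting the lengths), and `prefix_count_brute` counts prefix codes straight from the definition.

```python
from itertools import product


def prefix_count_formula(n, a, b, c):
    """n^a (n^b - n^(b-a)) (n^c - n^(c-a) - n^(c-b)) with a <= b <= c."""
    a, b, c = sorted((a, b, c))
    # All exponents are non-negative because the lengths are sorted.
    return n**a * (n**b - n**(b - a)) * (n**c - n**(c - a) - n**(c - b))


def prefix_count_brute(n, a, b, c):
    """Count triples (w, v, u) in which no word is a prefix of another one."""
    letters = range(n)
    count = 0
    for w in product(letters, repeat=a):
        for v in product(letters, repeat=b):
            for u in product(letters, repeat=c):
                words = (w, v, u)
                if all(i == j or words[j][:len(words[i])] != words[i]
                       for i in range(3) for j in range(3)):
                    count += 1
    return count


if __name__ == "__main__":
    for n, L in [(2, (1, 2, 2)), (2, (2, 3, 1)), (3, (1, 1, 2))]:
        print(n, L, prefix_count_formula(n, *L), prefix_count_brute(n, *L))
```

When the product evaluates to zero, the set $PR_n(L)$ is empty and the enumeration returns zero as well.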
Proof (of Proposition 2). The claim follows from the observation that a sequence $(x,y,w)\in X\times X\times X^c$ is a code if and only if $x\neq y$ and $w\notin\{x,y\}^c$. Obviously, if at least one of these conditions does not hold, then this sequence is not a code. Conversely, let us assume that $x\neq y$ and $w\notin\{x,y\}^c$. To show that $(x,y,w)\in UD_n((1,1,c))$, we use the Sardinas-Patterson algorithm ([15]). Namely, we define $D_0:=\{x,y,w\}$ and for each $i\ge 1$ we define $D_i$ as the set of all non-empty words $u\in X^*$ for which the following condition holds: ((1, 1, c)).
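The characterization used in this proof gives direct counts for $L=(1,1,c)$ and $n>2$: there are $n(n-1)$ choices of distinct letters $x,y$, then $n^c-2^c$ words $w$ outside $\{x,y\}^c$ for a code, and $n^c-2n^{c-1}$ words $w$ starting with neither $x$ nor $y$ for a prefix code. The Python sketch below (the function name `rho_11c` is hypothetical, and the counts are read off from the characterization rather than quoted from a stated formula) evaluates the resulting ratio exactly and lets one observe how it approaches $(n-2)/n$ from above.

```python
from fractions import Fraction


def rho_11c(n, c):
    """Ratio of prefix codes among codes for L = (1, 1, c), assuming n > 2."""
    ud = n * (n - 1) * (n**c - 2**c)            # x != y and w not in {x, y}^c
    pr = n * (n - 1) * (n**c - 2 * n**(c - 1))  # w starts with neither x nor y
    return Fraction(pr, ud)


if __name__ == "__main__":
    n = 3
    alpha = Fraction(n - 2, n)
    for c in (1, 2, 5, 10, 20):
        r = rho_11c(n, c)
        # The difference r - alpha stays positive and shrinks as c grows.
        print(c, r, float(r - alpha))
```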
We are ready now to prove our main results.
Theorem 4. If $n\ge 2$ and $L$ is an arbitrary sequence of length three such that $UD_n(L)\neq\emptyset$, then $\rho_{n,L}>\alpha_n$.
Proof. Let $L=(a,b,c)$. Since $\rho_{n,L}=\rho_{n,L'}$ for every sequence $L'$ obtained from $L$ by permuting the elements, we can assume that ((a, b)), and hence, by the formulae (2), we obtain By Proposition 1 and by the inequality (3), we obtain: Let us denote by $Q(a,b)$ the quotient on the right side of the above inequality. Then we have Since $R(a)<0$, it follows by the equality If $b\ge a+1=2$, then for every $(w,v,u)\in UD_n(L)$ we have $u\neq w^c\in X^c$, and hence $|UD_n(L)|\le (n^c-1)\cdot|UD_n((a,b))|<n^c(n^{a+b}-n)$.

Consequently, we have in this case
If $b=a=1$, then $n>2$ and by Proposition 2, we have Thus in every case, we have $\rho_{n,L}>\alpha_n$, which finishes the proof of Theorem 4.

The proof of Theorem 6
In this section, we derive Theorem 6 describing the case of binary codes. Surprisingly, this is the most involved step of our study. As we will see, the main burden of the proof rests on the two propositions below (Proposition 3 and Proposition 4), which give the cardinalities of two specially constructed subsets. In the proofs of both these propositions, we will use the following auxiliary lemma.
Lemma 1. Let $C=(v_1,\ldots,v_m)$ be a sequence of non-empty words with the following property: there exist $1\le\mu,\kappa\le m$, $\mu\neq\kappa$, such that $v_\mu=v_\kappa v$ for some $v\in X^*$. If $C$ is not a code, then the sequence
Proof (of Lemma 1). Obviously, if any word $w$ has a factorization into words from the sequence $C$, then it also has a factorization into words from the sequence $C'$. In fact, if $w=w_1\cdots w_l$, where all $w_i$ ($1\le i\le l$) are from $C$ and if $1\le i_0\le l$ is the smallest number such that $w_{i_0}=v_\mu=v_\kappa v$, then there exist $l'>l$ and words $W_1,W_2,\ldots,W_{l'}$ in $C'$ such that $W_i=w_i$ for every
Suppose, to the contrary, that $C$ is not a code and $C'$ is a code. Since each word in $C$ is non-empty, we have $v\neq v_\mu$. This implies that the word $v$ does not belong to $C$ (otherwise $v=v_i=v'_i$ for some $i\neq\mu$, and then $C'$ would not be a code, contrary to our assumption).
Let $w$ be the shortest word with two different factorizations $w=w_1\cdots w_l=u_1\cdots u_r$ into words from the sequence $C$. In particular, $w_1\neq u_1$. There exists the smallest number $i_0\ge 1$ such that $w_{i_0}=v_\mu$ or $u_{i_0}=v_\mu$ (otherwise the words $w_i$ and $u_j$ all belong to $C'$, which implies that $C'$ is not a code). Without loss of generality, we can assume that $w_{i_0}=v_\mu$. Then there exist $l'>l$ and the words Suppose that $u_i\neq v_\mu$ for every $1\le i\le r$. Then every word $u_i$ belongs to $C'$. Since $W_1\cdots W_{l'}=u_1\cdots u_r$ and $C'$ is a code, we obtain $l'=r$ and $W_i=u_i$ for every $1\le i\le r$. Now, if $i_0>1$, then $w_1=W_1=u_1$, contrary to our assumption. If $i_0=1$, then $u_1=W_1=v_\kappa$, $u_2=W_2=v$ and hence the word $v=u_2$ belongs to $C$, contrary to our observation.
Thus, there exists the smallest number $j_0\ge 1$ such that $u_{j_0}=v_\mu$. Then we have $u_1\cdots u_r=U_1\cdots U_{r'}$ for some $r'>r$ and words $U_i$ ($1\le i\le r'$) from the sequence $C'$ such that $U_i=u_i$ for every $1\le i<j_0$, $U_{j_0}=v_\kappa$ and $U_{j_0+1}=v$. Since $w_1\cdots w_l=u_1\cdots u_r$, we obtain $W_1\cdots W_{l'}=U_1\cdots U_{r'}$. Since $C'$ is a code, we obtain $l'=r'$ and $W_i=U_i$ for every $1\le i\le r'$. By the minimality of $i_0$, we have $j_0\ge i_0$. If $i_0>1$, then $j_0\ge i_0>1$, and hence $u_1=U_1=W_1=w_1$, contrary to our assumption. Thus, it must be $i_0=1$, which implies $W_1=v_\kappa$, $W_2=v$. Now, if $j_0\ge 3$, then $u_2=U_2=W_2=v$, and hence the word $v=u_2$ would belong to $C$, which is impossible. If $j_0=2$, then $U_2=v_\kappa$ and, since $U_2=W_2=v$, we would obtain that the word $v=v_\kappa$ belongs to $C$. Thus it must be $j_0=i_0=1$. But then, by the definition of the numbers $i_0,j_0$, we have $w_1=u_1=v_\mu$. Thus the assumption that $C'$ is a code always leads to a contradiction. Consequently, $C'$ is not a code, which finishes the proof of the lemma.
Now, we are ready to prove our main result.
Proof. Let $X=\{0,1\}$ be the binary alphabet. For every $c\ge 1$, we consider the subset $NUD(c)\subseteq X\times X^2\times X^c$ of those sequences which are not codes: For any letters $x,y,z\in X$, let us define the subset $K_{x,yz}(c)\subseteq NUD(c)$ as follows: Obviously, we have Since a sequence of words is a code if and only if its reversal is a code (see [1,2] for example), we also have The set $NUD(c)$ is the union of the subsets $K_{x,yz}(c)$ ($x,y,z\in X$), and the following implication holds: Let $\mathcal{K}_{1,00}(c)$ be the set of all words $w\in X^c$ such that $(1,00,w)\in K_{1,00}(c)$, that is, a word $w\in X^c$ belongs to $\mathcal{K}_{1,00}(c)$ if and only if the sequence $(1,00,w)$ is not a code. We also denote by $J_{1,00}(c)$ the set of all words $w\in X^c$ of the form $1^{i_1}(00)^{j_1}\cdots 1^{i_k}(00)^{j_k}$, $k\ge 1$, for some integers $i_l,j_l\ge 0$ ($1\le l\le k$).
Thus $x_c=x_{c-1}+x_{c-2}$ for every $c\ge 3$. Since $x_1=1$ and $x_2=2$, we obtain $x_c=F_{c+1}$ for every $c\ge 1$. Thus the equality $|J_{1,00}(c)|=F_{c+1}$ holds for every $c\ge 1$. Now, if $c$ is even, then $0^c\in J_{1,00}(c)$, and consequently If $c$ is odd, then $0^c\notin J_{1,00}(c)$, and consequently This finishes the proof of Proposition 3.
In the next step, we derive the number of elements of the set $K_{1,01}(c)$. We proceed in a similar way. Namely, we denote by $\mathcal{K}_{1,01}(c)$ the set of all words $w\in X^c$ such that $(1,01,w)\in K_{1,01}(c)$, that is, a word $w\in X^c$ belongs to $\mathcal{K}_{1,01}(c)$ if and only if the sequence $(1,01,w)$ is not a code. We also denote by $J_{1,01}(c)$ the set of all words $w\in X^c$ which do not contain two consecutive 0's. Suppose that there is $c\ge 1$ such that $\mathcal{K}_{1,01}(i)=J_{1,01}(i)$ for every $1\le i\le c$. To show the equality $\mathcal{K}_{1,01}(c+1)=J_{1,01}(c+1)$, we use (as before) the double inclusion argument. So let $w\in J_{1,01}(c+1)$ be arbitrary. Then there is $k\ge 1$ such that $w=1^{i_1}01^{i_2}0\cdots 01^{i_k}$, where $i_1,i_k\ge 0$ and $i_t\ge 1$ for every $1<t<k$. In the case $i_k\ge 1$ we can write which means that $w$ has two different factorizations into the words $1$, $01$, $w$. If $i_k=0$, then we have $w1=1^{i_1}(01)1^{i_2-1}\cdots(01)1^{i_{k-1}-1}01$, which gives that the word $w1$ has two different factorizations into the words $1$, $01$, $w$. Thus in every case, we have $w\in\mathcal{K}_{1,01}(c+1)$. Consequently, the inclusion $J_{1,01}(c+1)\subseteq\mathcal{K}_{1,01}(c+1)$ holds.
Conversely, let $w\in\mathcal{K}_{1,01}(c+1)$ be arbitrary. Then one of the words $01$, $1$ must be a prefix of $w$. Thus there exists $v\in X^*$ such that $w=01v$ or $w=1v$. By Lemma 1, we obtain that $(1,01,v)$ is not a code, which means that $v\in\mathcal{K}_{1,01}(c-1)$ in the first case and $v\in\mathcal{K}_{1,01}(c)$ in the second case. By the inductive assumption, the word $v$ does not contain two consecutive 0's. But then the word $w$ also does not contain two consecutive 0's, which means that $w\in J_{1,01}(c+1)$, and consequently $\mathcal{K}_{1,01}(c+1)\subseteq J_{1,01}(c+1)$. The inductive argument finishes the proof of the first part.
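The two auxiliary sets $J_{1,00}(c)$ and $J_{1,01}(c)$ are easy to enumerate, which gives a quick check of the Fibonacci counts. In the Python sketch below the function names are hypothetical and the convention $F_1=F_2=1$ is inferred from $x_1=1=F_2$ and $x_2=2=F_3$. The count $F_{c+2}$ for words without two consecutive 0's is the standard one and is stated here as an assumption, since the corresponding formula is not reproduced in the text.

```python
import re
from itertools import product


def fib(k):
    """Fibonacci numbers with F_1 = F_2 = 1."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a


def J_1_00(c):
    """Binary words of length c that are concatenations of the blocks 1 and 00."""
    return [w for w in map("".join, product("01", repeat=c))
            if re.fullmatch(r"(?:1|00)*", w)]


def J_1_01(c):
    """Binary words of length c without two consecutive 0's."""
    return [w for w in map("".join, product("01", repeat=c)) if "00" not in w]


if __name__ == "__main__":
    for c in range(1, 9):
        print(c, len(J_1_00(c)), fib(c + 1), len(J_1_01(c)), fib(c + 2))
```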
In consequence, we have for every c ≥ 2: Since lim