Lower Bounds on Words Separation: Are There Short Identities in Transformation Semigroups?

The words separation problem, originally formulated by Goralcik and Koubek (1986), is stated as follows. Let $Sep(n)$ be the minimum number such that for any two words of length $\le n$ there is a deterministic finite automaton with $Sep(n)$ states, accepting exactly one of them. The problem is to find the asymptotics of the function $Sep$. This problem is inverse to finding the asymptotics of the length of the shortest identity in full transformation semigroups $T_k$. The known lower bound on $Sep$ stems from the unary identity in $T_k$. We find the first series of identities in $T_k$ which are shorter than the corresponding unary identity for infinitely many values of $k$, and thus slightly improve the lower bound on $Sep(n)$. Then we present some short positive identities in symmetric groups, improving the lower bound on separating words by permutational automata by a multiplicative constant. Finally, we present the results of computer search for short identities for small $k$.


Introduction
Telling two inputs apart is one of the simplest computational problems one can imagine. As usual, the inputs are thought of as two finite words $u, v$ over a finite alphabet $\Sigma$. Both $u$ and $v$ are known in advance; then one of them is fed to the algorithm, which should decide whether this is $u$ or $v$. For a powerful computational model, such as the RAM model, the problem can be solved with constant space (in the length of the words): we need just one register to scan the input word until we reach a position in which $u$ and $v$ differ, and then look at the symbol at this position to decide whether we see $u$ or $v$ (a word can be supposed to end with a unique sentinel symbol). However, if the computational model is weak, like the finite automaton, the situation changes drastically, and distinguishing two words can no longer be done with constant space. The problem of determining the minimal size of a finite automaton separating two given words is NP-hard, as follows from some known algebraic results (see the discussion below). Moreover, even if we look at the maximal possible size of such an automaton for words of a given length, very little is known about the asymptotics of this value. To make this more precise, we need some definitions.
We use the array notation $w = w[1..n]$ to represent finite words over a finite alphabet $\Sigma$ when appropriate, and also the standard notions of factors, prefixes, and suffixes. We write $|w|$ for the length of $w$ and $|w|_x$ for the number of occurrences of the letter $x$ in $w$. We treat a deterministic finite automaton (dfa) as a quadruple $A = (\Sigma, Q, \delta, s)$, consisting of a finite alphabet, a finite set of states, a transition function, and an initial state. We write $q.w$ for the state of $A$ obtained by reading the word $w \in \Sigma^*$ starting in the state $q \in Q$. The dfa $A$ separates words $u, v \in \Sigma^*$ if $s.u \ne s.v$. (Equivalently, there exists a set $T \subset Q$ of accepting states such that exactly one of the words $u, v$ is accepted.) Let $Sep(u, v)$ be the minimum number of states in a dfa separating $u$ and $v$.
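The definitions above can be made concrete with a small sketch. The dictionary encoding of $\delta$ and the helper `counter_dfa` are illustrative assumptions, not part of the paper; the helper builds the simplest kind of separating automaton, one that counts occurrences of a single letter modulo the number of states.

```python
def run(delta, state, word):
    """Return q.w: the state reached from `state` after reading `word`."""
    for letter in word:
        state = delta[(state, letter)]
    return state


def separates(delta, start, u, v):
    """A dfa separates u and v iff s.u != s.v."""
    return run(delta, start, u) != run(delta, start, v)


def counter_dfa(i, counted="x", alphabet="xy"):
    """Hypothetical helper: i-state dfa in which `counted` is a cyclic
    permutation of the states and every other letter is the identity map."""
    return {(q, a): (q + 1) % i if a == counted else q
            for q in range(i) for a in alphabet}


u, v = "xxyx", "xyxxx"  # |u|_x = 3, |v|_x = 4
print(separates(counter_dfa(2), 0, u, v))  # the two counts differ modulo 2
```

Such counting automata separate any two words whose letter counts differ modulo some small $i$, which is why short identities must have letter counts that agree modulo every $i \le k$.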
Let $T_k$ denote the semigroup of all selfmaps of the set $\{1, \ldots, k\}$ under the composition of maps; it is called the full transformation semigroup on $k$ elements. An identity in a semigroup $T$ is a pair of words $(u, v)$ such that the images of $u$ and $v$ under any map $\Sigma \to T$ are equal as elements of $T$. By the length of the identity $(u, v)$ we mean the maximum of $|u|$, $|v|$. We write $u \equiv_k v$ to indicate that $(u, v)$ is an identity in $T_k$. The transition semigroup of a dfa $A$ is the subsemigroup of $T_{|Q|}$ consisting of all maps $w : q \mapsto q.w$, where $w \in \Sigma^*$. The following simple fact connects identities and separation:
Fact 1. $Sep(u, v) > k$ if and only if $u \equiv_k v$.
Indeed, if $u \equiv_k v$, then this identity holds for the transition semigroup of any $k$-state dfa $A$, implying $q.u = q.v$ in it for any state $q$. If otherwise $\rho(u) \ne \rho(v)$ in $T_k$ for some map $\rho : \Sigma \to T_k$, then the transformations $\rho(a)$, $a \in \Sigma$, can be used to define the transitions of a $k$-state dfa separating $u$ and $v$.
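For very small $k$, the connection between identities in $T_k$ and separation can be tested by brute force: enumerate every assignment of transformations to the letters and compare the two induced maps. The function names below are illustrative assumptions; the search space has $k^{k|\Sigma|}$ assignments, so this sketch is only practical for $k \le 4$ or so.

```python
from itertools import product


def act(word, maps, k):
    """Transformation of {0, ..., k-1} induced by `word`;
    letters are applied from left to right."""
    t = tuple(range(k))
    for letter in word:
        f = maps[letter]
        t = tuple(f[q] for q in t)
    return t


def is_identity_in_Tk(u, v, k):
    """Check u = v under every map from the letters to T_k."""
    letters = sorted(set(u + v))
    all_maps = list(product(range(k), repeat=k))  # all k^k selfmaps
    for images in product(all_maps, repeat=len(letters)):
        maps = dict(zip(letters, images))
        if act(u, maps, k) != act(v, maps, k):
            return False  # these transitions define a separating k-state dfa
    return True


print(is_identity_in_Tk("xx", "x" * 8, 3))  # x^2 versus x^(2 + lcm(1,2,3))
```

A `False` answer comes with a witness: the failing assignment is exactly the transition table of a $k$-state dfa separating the two words.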
It is known that the problem of checking whether $u \equiv_k v$ is coNP-complete for any $k > 2$ [1,8]. So by Fact 1, it is NP-complete to check whether $Sep(u, v) \le k$.
Let $Sep(n) = \max_{u \ne v \in \Sigma^{\le n}} Sep(u, v)$. The problem of describing the asymptotics of $Sep(n)$ was first posed by Goralcik and Koubek [5]. Due to Fact 1, this problem is equivalent to finding the asymptotics of the minimum length of an identity in $T_k$. For the existing results on the identities in $T_k$ see, e.g., [11] and the references therein. Up to now the shortest known identity in $T_k$ has been the unary identity
$x^{k-1} \equiv_k x^{k-1+lcm(k)}$, (1)
where $lcm(k)$ denotes the least common multiple of the integers $1, \ldots, k$. Hence, $Sep(n) > k$ for $n \ge lcm(k)+k-1$. Since $\log(lcm(k)) = k+o(k)$ by the Prime Number Theorem, this inequality can be rewritten as $Sep(n) \ge \log n + o(\log n)$. The logarithmic lower bound was presented already in [5], while the best known upper bound for $Sep(n)$, obtained by Robson [12], is $O(n^{2/5} \log^{3/5} n)$. Such a huge gap suggests that either of these bounds can be very loose. In this paper we present a new series of identities in $T_k$. These identities are shorter than (1) whenever $k$ is a prime or a power of an odd prime. (More precisely, if $k = p^i$ for a prime $p$, then our identities are approximately $p/2$ times shorter than (1).) As far as we know, this is the first example of identities in $T_k$ that are shorter than (1).
the electronic journal of combinatorics 23 (2016), #P00
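To get a feeling for the logarithmic lower bound, one can tabulate the length $lcm(k)+k-1$ of the unary identity and compare its logarithm with $k$. This is a small numerical sketch; the function names are assumptions, and `math.lcm` with several arguments requires Python 3.9 or later.

```python
from math import lcm, log


def lcm_upto(k):
    """lcm(k) in the paper's notation: the lcm of 1, ..., k."""
    return lcm(*range(1, k + 1))


def unary_identity_length(k):
    """Length of x^(k-1) = x^(k-1+lcm(k)), the longer side."""
    return lcm_upto(k) + k - 1


for k in (5, 10, 20, 40, 80):
    n = unary_identity_length(k)
    # Sep(n) > k for this n; the ratio log(n)/k slowly approaches 1
    print(k, n, round(log(n) / k, 3))
```

The slow convergence of the ratio reflects the $o(k)$ error term in $\log(lcm(k)) = k + o(k)$.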
There are several variations of the words separation problem; see, e.g., [4]. One variation requires a separating dfa to be permutational, which means that every letter acts on the set of states as a permutation (i.e., $|Q.a| = |Q|$ for any $a \in \Sigma$). We denote the analog of the function $Sep$ for permutational automata by $Sepp$. Similar to Fact 1, $Sepp(u, v) > k$ if and only if the pair $(u, v)$ is an identity of the symmetric group $S_k$. Such group identities in the semigroup signature are called positive and are denoted below by $u \cong_k v$. The best known upper bound for $Sepp(n)$ also belongs to Robson [13] and is $O(n^{1/2})$. To get reasonable lower bounds on $Sepp(n)$, one should find positive identities in $S_k$ which are shorter than the unary identity $x^{lcm(k)} = 1$. In general, the problem of finding short identities in finite symmetric groups has drawn some attention in the literature. The existence of an identity of length $O(e^{\sqrt{n \log n}})$ was proved in [3] based on Landau's bound on the maximum order of a permutation [10]. Very recently, the existence of identities of length $O(e^{\log^4 n \log\log n})$ was established by Kozma and Thom [9] based on a new result on the diameter of the Cayley graph of $S_k$ [6]. However, the method of finding short identities in $S_k$ uses chains of iterated commutators and thus cannot be translated to produce short positive identities. So the problem of the existence of short positive identities remains open. Here we present some series of such identities, showing that $Sepp(n) \ge \frac{3}{2} \log n + o(\log n)$. Besides this, we present the results of computer-assisted studies for small $k$, providing, in particular, some exact values for the functions $Sep$ and $Sepp$.
The rest of the paper consists of two sections. In Section 2 we present our results on $Sep$ and the identities in $T_k$, while in Section 3 we consider $Sepp$ and positive identities in $S_k$, together with the connection between $Sep$ and $Sepp$.

Identities in $T_k$
An identity $(u, v)$ of a semigroup $T$ is reducible if there is an identity $(u', v')$ of $T$ and a nonempty word $w$ such that either $u = wu'$, $v = wv'$, or $u = u'w$, $v = v'w$; otherwise, the identity is said to be irreducible. Since we are interested in short identities, we will consider only irreducible ones. As was already observed, the shortest irreducible unary identity of any semigroup $T_k$ is identity (1). The following easy fact is well known; a proof can be found in [4].

Fact 2. For any pair of non-unary words $(u, v)$ such that $u \equiv_k v$ there exists a pair $(u', v')$ of binary words such that $|u'| = |u|$, $|v'| = |v|$, and $u' \equiv_k v'$.
Hence, in the quest for short non-unary identities in $T_k$ we restrict ourselves to identities and dfa's over the binary alphabet $\{x, y\}$. The following necessary conditions for an identity in $T_k$ are known from [4,5,12].
Fact 3. If $u \equiv_k v$, then (i) $u$ and $v$ have a common prefix of length at least $k-2$; (ii) $u$ and $v$ have a common suffix of length at least $k-1$; (iii) $u$ and $v$ have the same set of factors of length $k-1$.
We illustrate this fact with Fig. 1, showing the dfa's separating $u$ and $v$ in the case of violation of the conditions (i)-(iii).
[Fig. 1: dfa's separating $u$ and $v$ when conditions (i)-(iii) are violated. Panel (a): common prefix.]
Recall that, given a word $w \in \Sigma^*$ and a dfa $A$, $w$ can be viewed as a transformation of the set of states of $A$. The digraph of this transformation has one or more cycles (see an example in Fig. 2). Each such cycle is referred to as a $w$-cycle.
An identity $(u, v)$ is uniform if $|u| = |v|$. First consider non-uniform identities.
Proposition 4. A unique shortest binary non-uniform irreducible identity is
$x^{k-2} y x^{k-1} \equiv_k x^{k-2+lcm(k)} y x^{k-1}$. (2)
Proof. First we use Fact 1 to check that (2) is an identity. Consider any binary dfa $A = (\{x, y\}, Q, \delta, s)$ with $|Q| = k$ and prove that $A$ does not separate the parts of (2). To separate them, $A$ should separate $x^{k-2}$ from $x^{k-2+lcm(k)}$. If the state $s.x^{k-2} \in Q$ belongs to an $x$-cycle, no separation is possible, because the length of this cycle divides $lcm(k)$. Hence $s.x^{k-2}$ does not belong to an $x$-cycle. Then $Q = \{s, s.x, \ldots, s.x^{k-1}\}$ and the only $x$-cycle is the loop on the state $s.x^{k-1}$. Therefore, $x^{k-1}$ acts on $Q$ as a constant, implying that $A$ is unable to separate the parts of (2).
Now assume that $u \equiv_k v$ and $|u| < |v| \le lcm(k) + 2k - 2$ (this number is the length of identity (2)). By Fact 1, $Sep(u, v) > k$. Let $|u|_x = l$, $|v|_x = l + m$, and w.l.o.g. $m > 0$. If $m$ is not divisible by $lcm(k)$, then some $i \le k$ does not divide $m$. In this case $u$ and $v$ are separated by the $i$-state dfa in which $y$ is the identity map and $x$ is a cyclic permutation. Therefore the restriction on the length of $v$ implies $m = lcm(k)$. By the same argument, $|u|_y = |v|_y$. So $|v| - |u| = lcm(k)$, just as in (2). In addition, $u$ and $v$ satisfy the conditions (i)-(iii) of Fact 3. Let $|u| < 2k - 2$. Then $u$ is completely covered by its prefix from (i) and its suffix from (ii). Then all $y$'s in $v$ occur in this prefix and/or suffix. Hence $v$ contains $x^{k-1}$; by (iii), so does $u$. Let $u = zx^{k-1}w$ for some words $z, w$. Since $u$ is short, $z$ (resp., $w$) is a part of the common prefix (resp., suffix) of $u$ and $v$. So $v = zx^{k-1+lcm(k)}w$. But this means that the identity $u \equiv_k v$ is reducible to (1). This contradiction proves the assumption $|u| < 2k - 2$ false.
Finally, let $|u| = 2k - 2$, $z = u[1..k-2]$, $a = u[k-1]$, $w = u[k..2k-2]$. Then $u = zaw$ and $v = zv'w$ for some word $v'$ of length $lcm(k) + 1$. The equality $|u|_y = |v|_y$ implies that $v'$ contains exactly one $y$ if $a = y$ and $v' = x^{lcm(k)+1}$ otherwise. Either way, $v$ is long enough to contain the factor $x^{k-1}$, so $u$ contains it as well. If this factor is not a suffix of $u$, then $u \equiv_k v$ is reducible to (1) as in the previous paragraph. Hence $w = x^{k-1}$. If $v$ has the prefix $za$, then this prefix contains all $y$'s in $v$; so $u = zax^{k-1}$, $v = zax^{lcm(k)+k-1}$, and again our identity is reducible to (1). Therefore $u$ begins with $zy$ and $v$ begins with $zx$ (the opposite case is impossible since $|u|_y = |v|_y$). Note that $zy$ is a factor of $v$ by Fact 3(iii). Since $v$ has a unique $y$ outside its prefix $z$ (it is in $v'$), this $y$ is preceded by $z$. So $v$ has two occurrences of $z$, and they together contain the same number of $y$'s as the prefix $z$ of $u$. This is possible only if $z = x^{k-2}$. Thus, each of $u = x^{k-2}yx^{k-1}$ and $v$ contains a single occurrence of $y$; say, $v[l] = y$. We have $l > k - 1$, because $v$ begins with $zx = x^{k-1}$. If $l$ and $k - 1$ are distinct modulo $i$ for some $i \le k$, then a dfa separating $u$ and $v$ is easy to construct: an $x$-cycle of length $i$ contains the initial vertex, and the $y$-edges from $s.x^{k-1}$ and $s.x^{l}$ lead to the same vertex of this cycle, so that the remaining $x$'s will be read to different vertices. Therefore, $l = k - 1 + lcm(k)$, implying that the identity $u \equiv_k v$ coincides with (2).
Next we switch to uniform identities. An identity $(u, v)$ is balanced if $|u|_a = |v|_a$ for any letter $a$.
Proposition 5. A unique shortest binary uniform unbalanced identity is
$x^{k-1+lcm(k)} y^{k-1} \equiv_k x^{k-1} y^{k-1+lcm(k)}$. (3)
Proof. Since (3) is obtained by multiplying two copies of (1), it is obviously an identity. Now consider any uniform unbalanced identity $u \equiv_k v$ of length at most $lcm(k) + 2k - 2$, which is the length of (3). Similar to the proof of Proposition 4, we obtain that $|u|_x > |v|_x$ implies $|u|_x = |v|_x + lcm(k)$ and $|v|_y = |u|_y + lcm(k)$. Let $u = zu'w$, $v = zv'w$, where $z$ (resp., $w$) is the longest common prefix (resp., suffix) of $u$ and $v$. By Fact 3 we have $|z| \ge k - 2$, $|w| \ge k - 1$, and thus $|u'| \le lcm(k) + 1$. If $|u'| = lcm(k) + 1$, we can assume $u' = x^{lcm(k)+1}$, $v' = y^i x y^j$, where $i, j > 0$ (if $u'$ contains fewer $x$'s, then $v' = y^{lcm(k)+1}$, so we get a symmetric case). Then $x^{k-1}$ is a factor of $v$ by Fact 3, implying $w = x^{k-1}$. Now all factors of $u$ of length $k - 1$ end with $x$, which is not the case for $v$; again by Fact 3, $u$ and $v$ cannot form an identity. Hence, $|u'| \le lcm(k)$. So we have $u' = x^{lcm(k)}$, $v' = y^{lcm(k)}$. Since $x^{k-1}$ is a factor of $v$ and $y^{k-1}$ is a factor of $u$, we immediately get the identity (3) up to renaming the letters.
Proposition 6. Every $T_k$ satisfies the binary uniform balanced identity
$x^{k-2+lcm(k)} y x^{k-1} \equiv_k x^{k-2} y x^{k-1+lcm(k)}$. (4)
Proof. The same argument as in Proposition 4 works: for any dfa with $k$ states either $s.x^{k-2} = s.x^{k-2+lcm(k)}$ or $x^{k-1}$ is a constant map.
The summary of the proved statements is as follows: the shortest non-unary unbalanced identities in the semigroup $T_k$ have exactly the same length $lcm(k) + 2k - 2$ as some binary balanced identity, and are slightly longer than the unary identity of this semigroup. The question is whether there exist shorter balanced binary identities.
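The three identities of length $lcm(k)+2k-2$ summarized above can be checked by brute force for tiny $k$. The sketch below assumes the following forms, which are read off from the proofs of Propositions 4-6 and should be treated as a reconstruction: $x^{k-2}yx^{k-1}$ versus $x^{k-2+lcm(k)}yx^{k-1}$ (non-uniform), $x^{k-1+lcm(k)}y^{k-1}$ versus $x^{k-1}y^{k-1+lcm(k)}$ (uniform unbalanced), and $x^{k-2+lcm(k)}yx^{k-1}$ versus $x^{k-2}yx^{k-1+lcm(k)}$ (uniform balanced). All function names are illustrative.

```python
from itertools import product
from math import lcm


def act(word, maps, k):
    """Transformation of {0, ..., k-1} induced by `word` (left to right)."""
    t = tuple(range(k))
    for letter in word:
        f = maps[letter]
        t = tuple(f[q] for q in t)
    return t


def is_identity_in_Tk(u, v, k):
    """Exhaustive check over all pairs of transformations for x and y."""
    all_maps = list(product(range(k), repeat=k))
    return all(act(u, {"x": fx, "y": fy}, k) == act(v, {"x": fx, "y": fy}, k)
               for fx in all_maps for fy in all_maps)


def candidate_identities(k):
    """Assumed forms of the non-uniform, unbalanced and balanced identities."""
    L = lcm(*range(1, k + 1))
    non_uniform = ("x" * (k - 2) + "y" + "x" * (k - 1),
                   "x" * (k - 2 + L) + "y" + "x" * (k - 1))
    unbalanced = ("x" * (k - 1 + L) + "y" * (k - 1),
                  "x" * (k - 1) + "y" * (k - 1 + L))
    balanced = ("x" * (k - 2 + L) + "y" + "x" * (k - 1),
                "x" * (k - 2) + "y" + "x" * (k - 1 + L))
    return non_uniform, unbalanced, balanced


for u, v in candidate_identities(3):
    print(max(len(u), len(v)), is_identity_in_Tk(u, v, 3))
```

For $k = 3$ each pair has length $lcm(3) + 2 \cdot 3 - 2 = 10$, and the check runs over $27^2$ assignments.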
Remark 7. An exhaustive computer search reveals that identities (4) are the shortest binary identities in the semigroups $T_k$ for $k \le 4$. For $k = 5$, such a search is beyond the capabilities of any computer. However, below we show that $T_5$ does have a shorter identity, as do infinitely many other semigroups $T_k$.
Theorem 8. The semigroup $T_k$ satisfies the following identity of length $2lcm(k-1) + 6(k-1)$:
$(xy)^{k-2} (yx)^k (xy)^{k-1+lcm(k-1)} \equiv_k (xy)^{k-2+lcm(k-1)} (yx)^k (xy)^{k-1}$. (5)
Corollary 9. If $k \ge 5$ is either a prime or an odd prime power, the semigroup $T_k$ satisfies an identity which is shorter than the unary identity (1).
Proof of Theorem 8. Let us take a dfa $A$ and consider the transformation $xy$ in it. If the state $s.(xy)^{k-2}$ does not belong to any $(xy)$-cycle, then we see, similar to Proposition 4, that $(xy)^{k-1}$ is a constant map. So in this case $A$ does not separate the sides of (5). Assume that $s.(xy)^{k-2}$ belongs to an $(xy)$-cycle of length $m$. If $m < k$, then all $(xy)$-cycles in $A$ have length $< k$. Since $q.(xy)^{k-1}$ belongs to some $(xy)$-cycle for any state $q$ and the lengths of all $(xy)$-cycles divide $lcm(k-1)$, both sides of (5) move $s$ to the same state. Finally, let $m = k$. Then $xy$ is a permutation (namely, a cycle of length $k$), and $(xy)^k = 1$. Hence $x$, $y$ and $yx$ are permutations, and clearly $(yx)^k = 1$. Deleting $(yx)^k$ from both sides of (5), we get a graphical equality, so once again we see that $A$ is not separating.
This conjecture is partially verified by the computations described in the next section.
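The three cases of this proof can be cross-checked numerically for tiny $k$. The sketch assumes that the identity of Theorem 8 has the form $(xy)^{k-2}(yx)^k(xy)^{k-1+lcm(k-1)}$ versus $(xy)^{k-2+lcm(k-1)}(yx)^k(xy)^{k-1}$; this form is inferred from the stated length and from the steps of the proof, so it is an assumption rather than a quotation. Function names are illustrative.

```python
from itertools import product
from math import lcm


def theorem8_words(k):
    """Both sides of the assumed identity, as words over {x, y}."""
    L = lcm(*range(1, k))  # lcm(k - 1); math.lcm() with no arguments is 1
    left = "xy" * (k - 2) + "yx" * k + "xy" * (k - 1 + L)
    right = "xy" * (k - 2 + L) + "yx" * k + "xy" * (k - 1)
    return left, right


def act(word, maps, k):
    """Transformation of {0, ..., k-1} induced by `word` (left to right)."""
    t = tuple(range(k))
    for letter in word:
        f = maps[letter]
        t = tuple(f[q] for q in t)
    return t


def holds_in_Tk(k):
    """Exhaustive check over all k^k * k^k assignments for x and y."""
    left, right = theorem8_words(k)
    all_maps = list(product(range(k), repeat=k))
    return all(act(left, {"x": fx, "y": fy}, k) == act(right, {"x": fx, "y": fy}, k)
               for fx in all_maps for fy in all_maps)


print(len(theorem8_words(5)[0]))  # 2*lcm(4) + 6*4 = 48
print(holds_in_Tk(3))
```

The case $k = 4$ (65536 assignments) still finishes in seconds in plain Python; $k = 5$ has about $10^7$ assignments and would need a faster implementation.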

Positive Identities in $S_k$
The symmetric group $S_k$ satisfies the positive identity $x^{lcm(k)} = 1$ and its binary counterpart $x^{lcm(k)} = y^{lcm(k)}$. By the same argument as the one used in Propositions 4 and 5, these are the shortest unbalanced identities in $S_k$, so all shorter positive identities are balanced. It is known that the shortest positive identity in $S_3$ is $x^2y^2 = y^2x^2$ (folklore). The shortest such identity in $S_4$ has length 11: $x^6y^2xy^2 = y^2xy^2x^6$ [4]. We ran a computer search for the positive identities in $S_5$. Using an optimized search based on hash functions, we checked all balanced pairs $(u, v)$ of length at most 33, arriving at the following result.
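The two small identities quoted in this section, for $S_3$ and $S_4$, are easy to confirm by exhaustive evaluation over all pairs of permutations. This is a plain brute-force sketch with illustrative function names; it is not the optimized hash-based search mentioned in the text.

```python
from itertools import permutations, product


def act(word, maps, k):
    """Permutation of {0, ..., k-1} induced by `word` (left to right)."""
    t = tuple(range(k))
    for letter in word:
        f = maps[letter]
        t = tuple(f[q] for q in t)
    return t


def is_positive_identity(u, v, k):
    """Check u = v under every assignment of permutations of S_k."""
    letters = sorted(set(u + v))
    perms = list(permutations(range(k)))
    for images in product(perms, repeat=len(letters)):
        maps = dict(zip(letters, images))
        if act(u, maps, k) != act(v, maps, k):
            return False
    return True


s3 = ("xxyy", "yyxx")                         # x^2 y^2 = y^2 x^2
s4 = ("x" * 6 + "yyxyy", "yyxyy" + "x" * 6)   # x^6 y^2 x y^2 = y^2 x y^2 x^6

print(is_positive_identity(*s3, 3), is_positive_identity(*s3, 4))
print(len(s4[0]), is_positive_identity(*s4, 4))
```

The first identity already fails in $S_4$, because squares of two different 3-cycles need not commute.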
Further, we checked the identities (6) in $S_6$.
Proposition 12. A unique, up to symmetry, shortest positive identity of $S_6$ is (6b).
Naturally enough, (6b) is not an identity in $S_7$: these words are separated by a dfa in which $xy$ and $yx$ are different cycles of length 7. Hence, the function $Sepp(n)$ never takes the value 6:
Proof. Since identity (4) is longer than (1), Remark 7 implies the values of $Sep$ up to $n = 14$ and the fact that $Sep(15) > 4$.
$(xy)^6 (yx)^{10} (xy)^4 = (yx)^4 (xy)^{10} (yx)^6$ (palindrome)
Note that if $zuw \equiv_k zvw$, where $z$ (resp., $w$) is the longest common prefix (resp., suffix) of both sides, then $u \cong_k v$. So, the search for the identities in $T_5$ can be performed by iterating over the identities of $S_5$, using an exhaustive search for the candidates for $z$ and $w$. Such a search, based on the identities listed in (6) and Table 1, gave us exactly one identity of $T_5$, namely, the identity (5) for $k = 5$, which has length 48. The result of this search supports Conjecture 10.
The analysis of the identities listed in (6) and Table 1 results in finding some general classes of identities in $S_k$. The simplest class, described in the following proposition, allows us to move up the lower bound on the function $Sepp$ by a multiplicative constant.
Proposition 15. Let $a, b$ be such that the order of any element of $S_k$ divides either $a$ or $b$. Then $S_k$ satisfies the identity
$(xy)^a (yx)^b \cong_k (yx)^b (xy)^a$. (7)
Proof. For any $x, y \in S_k$ the elements $xy$ and $yx$ have the same order. Then by the choice of $a, b$ either $(xy)^a = 1$ or $(yx)^b = 1$, implying the result.
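The hypothesis of Proposition 15 depends only on the set of element orders of $S_k$, which is the set of least common multiples of the partitions of $k$. The sketch below checks the hypothesis for a given pair $(a, b)$ and, for small $k$, confirms by brute force that $(xy)^a(yx)^b$ and $(yx)^b(xy)^a$ agree on all of $S_k$. The identity form is the one used in the proof above; the function names and the example parameters $a = 12$, $b = 5$ (the type (7) entry for $k = 5, 6$ in Table 2) are chosen for illustration.

```python
from itertools import permutations
from math import lcm


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            yield (part,) + rest


def element_orders(k):
    """Orders of elements of S_k: lcm of the cycle lengths."""
    return {lcm(*p) for p in partitions(k)}


def satisfies_hypothesis(a, b, k):
    return all(a % q == 0 or b % q == 0 for q in element_orders(k))


def compose(f, g):
    """Apply f first, then g."""
    return tuple(g[i] for i in f)


def power(f, n):
    """f^n by repeated squaring."""
    result = tuple(range(len(f)))
    while n:
        if n & 1:
            result = compose(result, f)
        f = compose(f, f)
        n >>= 1
    return result


def type7_holds(a, b, k):
    """Brute-force check of (xy)^a (yx)^b = (yx)^b (xy)^a in S_k."""
    perms = list(permutations(range(k)))
    for x in perms:
        for y in perms:
            pa = power(compose(x, y), a)
            qb = power(compose(y, x), b)
            if compose(pa, qb) != compose(qb, pa):
                return False
    return True


print(sorted(element_orders(5)), satisfies_hypothesis(12, 5, 5), type7_holds(12, 5, 5))
```

Working with powers of the permutations $xy$ and $yx$ instead of spelling out the words keeps the check fast even when $a$ and $b$ are large.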
Theorem 16. The symmetric group $S_k$ satisfies a positive identity (7) of length $e^{\frac{2}{3}k + O(k/\log k)}$.
Corollary 17. $Sepp(n) \ge \frac{3}{2} \log n + O\left(\frac{\log n}{\log\log n}\right)$.
Proof of Theorem 16. Take a number $\alpha$, $0 < \alpha < 1$. Let $m = \alpha k$ and let $P(m)$ be the product of all primes and prime powers from the range $\{m+1, \ldots, k\}$. Choose $a = lcm(m)$, $b = lcm(k - m) \cdot P(m)$, and apply Proposition 15. Indeed, the order of a permutation is the least common multiple of the lengths of its cycles; if a permutation has no cycle of length greater than $m$, then its order divides $a$; if such a cycle exists, then all other cycles are shorter than $k - m$, so the order divides $b$ when the length of the long cycle is a prime power (otherwise, for $m \ge k/2$, every prime power dividing this length is at most $k/2 \le m$, so all cycle lengths divide $lcm(m)$ and the order again divides $a$). Thus we get an identity of type (7) with the chosen $a$ and $b$. Since the length of this identity is $2(a + b)$, we want to find the value of $\alpha$ that minimizes $a + b$. Clearly, $\alpha \ge 1/2$, implying $m \ge k/2$. We use the standard asymptotic formulas (see, e.g., [2]) $lcm(t) = e^{t + O(t/\log t)}$ and $\pi(t) = \frac{t}{\log t} + O\left(\frac{t}{\log^2 t}\right)$, where $\pi(t)$ is the number of primes smaller than $t$. To estimate $P(m)$, we note that the product of $i$ factors equals their geometric mean taken to the $i$th power. Since all factors are between $m$ and $k$, their mean is $k/\beta$ for some $\beta$ between 1 and 2. To compute the number of factors, we can use the asymptotics for $\pi$ (the number of prime powers smaller than $t$ is $O(\pi(\sqrt{t}))$ and thus does not affect the asymptotics). So we have $P(m) = e^{k - m + O(k/\log k)}$, and hence $a = e^{\alpha k + O(k/\log k)}$ and $b = e^{2(1-\alpha)k + O(k/\log k)}$. Thus the minimum of $a + b$ is reached at $\alpha = 2/3$, so that $m = 2k/3$, and this minimum is $e^{\frac{2}{3}k + O(k/\log k)}$, as required.
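The choice of parameters in this proof can be reproduced for concrete $k$ and $m$. The following sketch computes $a = lcm(m)$ and $b = lcm(k-m) \cdot P(m)$ and then verifies, by enumerating the partitions of $k$, that every element order of $S_k$ divides $a$ or $b$. The function names and the rounding $m = \lfloor (2k+2)/3 \rfloor$ used in the loop are assumptions made for illustration; the text only fixes $m \approx 2k/3$.

```python
from math import lcm, prod


def lcm_upto(t):
    """lcm of 1, ..., t (equal to 1 for t = 0)."""
    return lcm(*range(1, t + 1))


def is_prime_power(n):
    """True for primes and proper prime powers."""
    if n < 2:
        return False
    p = next(d for d in range(2, n + 1) if n % d == 0)  # smallest prime factor
    while n % p == 0:
        n //= p
    return n == 1


def theorem16_parameters(k, m):
    """a = lcm(m), b = lcm(k - m) * P(m), with P(m) as in the proof."""
    P = prod(n for n in range(m + 1, k + 1) if is_prime_power(n))
    return lcm_upto(m), lcm_upto(k - m) * P


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            yield (part,) + rest


def every_order_divides(a, b, k):
    return all(a % lcm(*p) == 0 or b % lcm(*p) == 0 for p in partitions(k))


for k in (5, 11, 17, 23):
    m = (2 * k + 2) // 3  # assumed rounding of 2k/3
    a, b = theorem16_parameters(k, m)
    print(k, m, a, b, 2 * (a + b), lcm_upto(k), every_order_divides(a, b, k))
```

For $k = 23$ and $m = 16$ this gives $a = lcm(16) = 720720$; the slightly smaller $b = lcm(6) \cdot 17 \cdot 19 \cdot 23$ also passes the check, since a cycle longer than 16 leaves at most 6 points for the remaining cycles.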
A more involved class of identities is defined in the following proposition. The corresponding conditions can be easily extended to get identities with any even number of blocks of the form $(xy)^a$ and $(yx)^b$, but it is not clear if it is possible to build short identities of this type for any $k$.
Proposition 18. Let $a, b, c, d$ be such that every order $q$ of an element of $S_k$ satisfies at least one of the following conditions or their counterparts obtained by swapping $b$ with $c$, and $a$ with $d$: (i) $q$ divides both $a$ and $c$, (ii) $q$ divides both $a + c$ and $b$, (iii) $q$ divides $a$ and $b \equiv d \pmod{q}$. Then $S_k$ satisfies the identity
$(xy)^a (yx)^b (xy)^c (yx)^d \cong_k (yx)^d (xy)^c (yx)^b (xy)^a$. (8)
Proof. We again use the fact that for any $x, y \in S_k$ the elements $xy$ and $yx$ have the same order. It is easy to see that each of the conditions (i)-(iii) forces some terms to vanish from both sides of (8) in a way that the remaining words are graphically equal.
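Conditions (i)-(iii) and their swapped counterparts are mechanical to test. The sketch below checks them for every element order of $S_k$ and, for small $k$, confirms the resulting four-block identity $(xy)^a(yx)^b(xy)^c(yx)^d = (yx)^d(xy)^c(yx)^b(xy)^a$ by brute force. The parameters $(a, b, c, d) = (1, 6, 5, 4)$ are the type (8) entry for $k = 5, 6$ in Table 2; all function names are illustrative assumptions.

```python
from functools import reduce
from itertools import permutations
from math import lcm


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            yield (part,) + rest


def order_ok(q, a, b, c, d):
    """Conditions (i)-(iii) for order q, or their swapped counterparts."""
    def direct(a, b, c, d):
        return ((a % q == 0 and c % q == 0)
                or ((a + c) % q == 0 and b % q == 0)
                or (a % q == 0 and (b - d) % q == 0))
    # swapping b with c and a with d turns (a, b, c, d) into (d, c, b, a)
    return direct(a, b, c, d) or direct(d, c, b, a)


def satisfies_conditions(a, b, c, d, k):
    return all(order_ok(lcm(*p), a, b, c, d) for p in partitions(k))


def compose(f, g):
    """Apply f first, then g."""
    return tuple(g[i] for i in f)


def power(f, n):
    """f^n by repeated composition (n is small here)."""
    return reduce(compose, [f] * n, tuple(range(len(f))))


def type8_holds(a, b, c, d, k):
    """Brute-force check of the four-block identity in S_k."""
    perms = list(permutations(range(k)))
    for x in perms:
        for y in perms:
            p, q = compose(x, y), compose(y, x)
            blocks = [power(p, a), power(q, b), power(p, c), power(q, d)]
            if reduce(compose, blocks) != reduce(compose, reversed(blocks)):
                return False
    return True


print(satisfies_conditions(1, 6, 5, 4, 6), type8_holds(1, 6, 5, 4, 5))
```

The resulting words have length $2(a+b+c+d) = 32$, matching the Len entry of the table for type (8).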
We use Propositions 15 and 18 to run further computer experiments; in Table 2 we present the parameters of the shortest identities of types (7) and (8), obtained by exhaustive search, and compare their lengths to the length $lcm(k)$ of the unary identity. Note that the parameters $a$ and $b$ of the shortest identity of type (7) in most cases are equal to those chosen by the rule described in the proof of Theorem 16. For example, for $k = 23$ we have $a = lcm(16)$, $b = lcm(6) \cdot 17 \cdot 19 \cdot 23$. So it looks probable that no other way of choosing the pair $(a, b)$ can improve the result of Theorem 16. The identities of type (8) for small $k$ are shorter than the identities of type (7), but it is unclear whether this is true for all $k$.
Table 2: Parameters of the shortest positive identities of types (7), (8).
        Identities of type (8)      Identities of type (7)
k       a    b    c    d    Len     a    b    Len     lcm(k)
5, 6    1    6    5    4    32      12   5
• the logarithmic lower bound for $Sep(n)$ is improved by an additive sublogarithmic term for infinitely many values of $n$;
• the logarithmic lower bound for $Sepp(n)$ is improved by a factor of 3/2.
The obvious next step should be an attempt to improve the lower bound for $Sep$ by some factor and to prove a superlogarithmic lower bound for $Sepp$. Our general impression is that both such improvements are possible. On the other hand, we are not so optimistic about the existence of a superlogarithmic lower bound for $Sep$.