On simulating Turing machines with matrix semigroups with integrality tests

We present a construction to simulate Turing machines with 3 × 3 matrices over the rationals. The correctness of the simulation is guaranteed by testing that the matrices have integral elements during the simulation. This construction implies an undecidability result for a special identity problem for semigroups of 3 × 3 matrices.


Introduction
In this article we prove that a Turing machine can be simulated by a matrix semigroup over the rational numbers with integrality tests. That is, the generators of the semigroup are rational matrices, but the product matrices remain integral during a correct simulation. Indeed, multiplying by a matrix that turns an element of the simulation matrix non-integral corresponds to using an incorrect transition on the Turing machine side. This allows us to faithfully simulate any Turing machine.
As a consequence, we prove that the identity problem, i.e., whether the identity matrix is in the generated semigroup, is undecidable in this setting. Although this result looks like a quite traditional undecidability result for semigroups generated by rational matrices, our original motivation for this study is far from traditional. Our goal is to prove an undecidability result for matrices such that in the products (simulating the computational system reduced to it) the elements of the matrices are significantly smaller than in the traditional reductions. We are also interested in pure modelling, i.e., in the question of how to simulate a computational system operating with sequences of symbols using matrices.
Indeed, most of the known undecidability reductions for problems in integer matrix semigroups rely on the undecidability of the Post Correspondence Problem, PCP for short, or some of its variants. There are also some proofs that use Hilbert's Tenth Problem in the reduction; see for example [1]. In the PCP, for two given word morphisms 𝑔 and ℎ defined on the same alphabet, it is asked whether or not there exists a non-empty word 𝑤 such that 𝑔 and ℎ agree on it, that is,

ℎ(𝑤) = 𝑔(𝑤).
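To make the problem statement concrete, the following is a minimal sketch of a bounded brute-force search for a PCP solution. The instance, the helper names `apply_morphism` and `pcp_solution`, and the length bound are illustrative assumptions and are not taken from the paper; since the PCP is undecidable, such a search can only confirm solutions up to the chosen bound.

```python
from itertools import product


def apply_morphism(images, word):
    """Extend a letter-to-word map to a morphism on words."""
    return "".join(images[letter] for letter in word)


def pcp_solution(g, h, max_len):
    """Return a shortest non-empty w with g(w) == h(w), or None.

    Only words of length at most max_len are tried, so None does
    not mean that the instance has no solution.
    """
    alphabet = sorted(g)
    for length in range(1, max_len + 1):
        for letters in product(alphabet, repeat=length):
            w = "".join(letters)
            if apply_morphism(g, w) == apply_morphism(h, w):
                return w
    return None


# Hypothetical instance over the domain alphabet {1, 2, 3}.
g = {"1": "a", "2": "ab", "3": "bba"}
h = {"1": "baa", "2": "aa", "3": "bb"}
```

For this instance, `pcp_solution(g, h, 4)` finds the word `3231`, on which both morphisms produce `bbaabbbaa`.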
The traditional reduction from the PCP to integer matrices is based on an injective (-ary) representation  of words in ℕ and the coding  of pairs of words into 3 × 3 matrices so that the catenation operation of the word semigroups  * and  * are preserved in matrix multiplication.The formal definitions of  and  are given in Section 4, but let us give an example how these codings work: for example we may define  such that for all words  1 ,  1 ,  2 ,  2 ∈  * , ( 1 ,  1 )( 2 ,  2 ) = , for a large enough  ∈ ℕ.Here || denotes the length of the word , that is, the number of symbols in .Now if we simply set   = ((  ), ℎ(  )) for all letters   in the alphabet , we derive from the undecidability of the PCP that it is undecidable for the matrix semigroup generated by the matrices   whether or not there exists a matrix  in the semigroup such that  31 =  32 .
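The way such a coding turns catenation into matrix multiplication can be illustrated with a short sketch. It assumes one commonly used lower-triangular form of the pair encoding and a base-3 representation of binary words with a ↦ 1 and b ↦ 2; the function names `sigma`, `gamma` and `matmul` and the exact matrix layout are assumptions for illustration, and the paper's own definitions are the ones given in Section 4.

```python
def sigma(word, n=3):
    """n-ary value of a word over {a, b}, with a -> 1 and b -> 2.

    No letter is mapped to 0, so the representation is injective.
    """
    digit = {"a": 1, "b": 2}
    value = 0
    for letter in word:
        value = value * n + digit[letter]
    return value


def gamma(u, v, n=3):
    """Encode the pair (u, v) as a 3 x 3 integer matrix (assumed layout)."""
    return [
        [n ** len(u), 0, 0],
        [0, n ** len(v), 0],
        [sigma(u, n), sigma(v, n), 1],
    ]


def matmul(A, B):
    """Plain 3 x 3 matrix product."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


# Multiplying encodings corresponds to catenating both components.
left = matmul(gamma("ab", "b"), gamma("a", "ba"))
right = gamma("aba", "bba")
```

With this layout, the entries (3, 1) and (3, 2) of a product hold the numeric values of the two catenated words, so comparing them amounts to comparing the two words.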
In our reduction, we shall use the above mentioned , but the mapping  is modified.The main difference is that our reduction is one step below in the reduction chain of simulation.The key in all undecidability proofs of the PCP is that the pair of morphisms in the PCP can simulate a (universal) computational system such as Turing machines [15], semi-Thue systems [9], tag systems [23], normal systems [25], just to mention some of the most well-known systems.In all of these undecidability reductions, except in the Post's original proof from the normal systems [25], the simulation of the computation of the chosen system is done so that there is a nonempty word  for constructed morphisms  and ℎ such that () = ℎ() if and only if there exists a computation from a particular configuration  of the system to the configuration  and the word  is a catenation of all configurations (including the used transitions/rules of the used systems) along this computational path. 2 In other words, the word  is very long implying that the words () and ℎ() are very long and, therefore, the elements of the matrices ((), ℎ()) become huge if we consider simulation of the reduced computational system with matrices.For example, the element (1,3) in mapping  is   where  is approximately the sum of lengths of all configurations in the computation of the system.Moreover, some of the elements never decrease when a universal system is simulated through the PCP with the products of matrices.
Our main motivation for this study is to simulate a computational system with matrices directly, without remembering the whole history of the computation in the elements of the matrices, which keeps the elements smaller. In our construction for the simulation, the elements of the matrices are integers encoding only the current configuration of the system. Therefore, the elements are much smaller than in a simulation using the PCP as a bridge from a computational system to matrices.
We apply the simulation construction and consider the existence of a particular computation of the simulated Turing machine. As a result, we prove undecidability of a variant of the identity problem for matrix semigroups. The identity problem is a long-standing open problem. Unlike for most other matrix semigroup problems, the three-dimensional case remains open. It was shown in [5] that the problem is undecidable for integral matrices of dimension four, and in [19], a better bound on the number of matrices in the generator set was given. For two-dimensional matrices, it is known that the identity problem is decidable for integer matrices (the problem is even NP-complete [3]) and undecidable over rational quaternions [2]. Recently, it was shown that there is no embedding of pairs of binary words into SL(3, ℤ) [19]. This result suggests that the identity problem is decidable for three-dimensional matrices, as the vast majority of undecidability results rely on embeddings of pairs of binary words into matrices. Recently, there has also been a surge of interest in the identity problem for different classes of matrices [11,12].
In order to prove undecidability of the identity problem with integrality tests, we use an encoding of pairs of words into matrices that allows us to simulate a Turing machine and, in particular, allows us to use the undecidability of the halting problem for the empty input in a special form. The integrality tests are then used to ensure that a faithful simulation is performed. The integrality test can be performed by checking after each matrix multiplication that the resulting matrix is integral.
Finally, note that the simulation of a computational system, such as a Turing machine, with integral or rational counters is by no means new. There are famous models such as the Minsky machine [22] and the Fractran model defined by Conway [10], just to mention two. Our model for the simulation, the integral/rational matrices, is significantly different as the "counters" act on matrices.
On the other hand, there is a vast literature on dynamics of loops of form where  are the variables,  is a guard condition that the assignment of  has to satisfy and  is an update function that assigns new values to  .Often, the variables are represented by -dimensional vectors over ℤ, ℚ, ℝ, ... The guard condition often defines a polytope, i.e., is a system of linear equalities and inequalities, [26,8,17,16,18], but can be defined by, e.g., Presburger formulas [13].
The update function is typically more varied as small tweaks to the function can lead to different results, but is often restricted to linear updates, i.e., multiplying by a matrix [6,7,18,20].
Our setting can be seen as a non-deterministic loop of form while (() = true) do ( ∶=  1 () or  ∶=  2 () or … or  ∶=   ()), where  is a -dimensional rational matrix, each   is a multiplication by a matrix and () returns true if and only if every component of  is integral.That is, the loop is of the form while ( ?∈ ℤ × = true) do ( ∶=  1 or  ∶=  2 or … or  ∶=   ).
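The loop above can be sketched as follows, with the non-deterministic choice resolved by an explicit list of chosen update matrices. This is only an illustration under stated assumptions: the helper names `is_integral`, `matmul` and `guarded_product`, and the two example matrices `D` and `H`, are hypothetical and do not come from the construction in this paper.

```python
from fractions import Fraction


def is_integral(M):
    """Guard: true if and only if every entry of M is an integer."""
    return all(Fraction(x).denominator == 1 for row in M for x in row)


def matmul(A, B):
    """Product of two square matrices of the same dimension."""
    n = len(A)
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def guarded_product(start, choices):
    """Run the loop along one branch of the non-deterministic choice.

    Returns the final matrix, or None as soon as the integrality
    guard fails for the start matrix or for any partial product.
    """
    M = start
    for U in choices:
        if not is_integral(M):
            return None
        M = matmul(M, U)
    return M if is_integral(M) else None


# Hypothetical updates: D doubles the first column, H halves it.
I2 = [[1, 0], [0, 1]]
D = [[2, 0], [0, 1]]
H = [[Fraction(1, 2), 0], [0, 1]]
```

Here `guarded_product(I2, [D, H])` returns the identity matrix, whereas `guarded_product(I2, [H, D])` returns `None`: the full product is the same, but the partial product after `H` is not integral.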
Recall that the termination of non-deterministic loops with linear guards and linear updates is undecidable [26].

Preliminaries
Let ℕ, ℤ, ℚ be the sets of the natural numbers, the integers and the rational numbers.We denote by ℙ the set of all primes.A semigroup is a set equipped with an associative binary operation.Let  be a semigroup and  be a subset of .We say that a semigroup  is generated by a subset  of  if each element of  can be expressed as a composition of elements of .In this case, we call  a generating set of  and denote  = ⟨⟩.Given an alphabet Σ = { 1 ,  2 , … ,   }, a finite word  is an element of semigroup Σ * .The empty word is denoted by .The length of a finite word  is denoted by || and || = 0.
We shall consider semigroups where the generators are  ×  matrices (over rationals) and the composition operation is the matrix multiplication.Denote by   the -dimensional identity matrix.If the dimension is clear from context, we denote the identity matrix simply by  .
Let  ⊆  × for some  ∈ ℤ + and  ∈ {ℤ, ℚ, ℝ, ℂ}.Let us define the integral set ⟨⟩ ℤ = ⟨⟩ ∩ ℤ  .That is, ⟨⟩ ℤ consists of all elements  ∈ ⟨⟩ such that  ∈ ℤ × even if generators used are not in ℤ × .It is also possible to define an element to be integral with respect to  for some  ⊆ .That is,  ∈ ⟨⟩ is integral with respect to  if  =  ′  , for some  ′ ∈ ⟨⟩ and  ∈  such that  ′  ∈ ℤ × .In Section 6, we discuss a modification of our construction that takes the integrality with respect to a set into account.
Let us next prove a simple property regarding a product of primes.This lemma will be useful in upcoming sections when we show that an incorrect product of matrices cannot result in a correct product.Lemma 1.Let  1 ,  2 , … ,   be odd pairwise different primes, where  ≥ 2.
Proof.Assume towards a contradiction that the equality holds.Now By the assumption, the above is equal to ∏  =1   .By rearranging terms, we have the equation .
The right-hand side is positive and divisible by   .On the other hand, the left-hand side is divisible by   only if   −   is.The term   −   cannot be both divisible by   and positive, hence we reach a contradiction.□

Halting problem
A Turing machine  (with a final state), TM for short, is a 7-tuple where  is a finite set of states,  0 is the initial state, ℎ ∈  is the final state, Σ is the input alphabet, Γ is the tape alphabet with Σ ⊆ Γ, and  is a partial function  × Γ →  × Γ × {, } called the transition function where  and  are special direction symbols and ⋆ ∈ Γ is the blank symbol.The TM operates on a one-way infinite tape.Note that the TM's are deterministic, however, we allow  to be a partial function, i.e., it may be undefined for some values (, ) ∈  × Γ.Therefore, if (, ) is defined, it is unique.Each transition of a TM  is of the form (, ) = (, , ).Here  refers to "direction".The values  and  refer to "left move" and "right move", respectively.As the tape is one-way infinite, we can assume that the leftmost cell on the tape is ⊳ and (, ⊳) = ( ′ , ⊳, ) for all  ∈ , and furthermore, that no other production rule writes ⊳ on the tape.
A configuration of the TM , at some point in its computation, is the current state of the machine and the content of its tape.Let the content of the tape be ⊳ ⋆ ⋆ ⋯ where ,  ∈ Γ * , assume that  is in state  reading the symbol  ∈ Γ and assume further that ⊳ is the shortest word containing all nonblank letters of the tape.Then the configuration represented by the word ⊳(, ) ∈ Γ * ( × Γ)Γ * where  is either  or ends with a nonblank letter.
A step in a computation or a move  ⊢   ′ yielding from one configuration  of  to the next one  ′ is defined in the usual way.We define here only the right-move, the left-move definition is analogous.Let the configuration be ⊳(, ) and assume that (, ) = (, , ).Then
Let ⊢ *  or ⊢ * , for short, be the reflexive and transitive closure of the relation ⊢  .Thus  ⊢ *  ′ if and only if there exists a finite sequence  =  1 ⊢  2 ⊢ ⋯ ⊢   =  ′ of configurations for some  ≥ 1 including the possibility that  =  ′ .Such a sequence is called a computation of .It is an accepting computation if the state in  ′ is the unique final state ℎ.
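The notions of a move and a computation can be made concrete with a small sketch of a deterministic machine with a partial transition function. The representation of a configuration as a triple (state, tape, head position), the direction symbols "L" and "R", the symbols ">" and "*" standing for the left marker and the blank, and the example machine are all assumptions made for illustration.

```python
BLANK = "*"  # stands for the blank symbol


def step(delta, state, tape, head):
    """Perform one move; return the next configuration or None.

    delta is a partial function given as a dict from (state, symbol)
    to (new_state, written_symbol, direction).
    """
    key = (state, tape[head])
    if key not in delta:
        return None  # delta is undefined here, so no move exists
    new_state, written, direction = delta[key]
    tape = tape[:head] + [written] + tape[head + 1:]
    head += 1 if direction == "R" else -1
    if head < 0:
        raise ValueError("the head moved off the one-way infinite tape")
    if head == len(tape):
        tape.append(BLANK)  # expose one more blank cell on the right
    return new_state, tape, head


def reaches_final(delta, state, tape, head, final, max_steps):
    """Bounded check for an accepting computation."""
    for _ in range(max_steps):
        if state == final:
            return True
        nxt = step(delta, state, tape, head)
        if nxt is None:
            return False
        state, tape, head = nxt
    return state == final


# Hypothetical machine: skip the left marker, write "a", then halt.
delta = {
    ("q0", ">"): ("q0", ">", "R"),
    ("q0", BLANK): ("q1", "a", "R"),
    ("q1", BLANK): ("h", BLANK, "L"),
}
```

Because the halting problem is undecidable, `reaches_final` necessarily takes a step bound; a negative answer within the bound says nothing about longer computations.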
A seminal result in computability theory states that the halting problem of Turing machines on the empty input is undecidable; see, e.g., [21].
It is well-known that there is a myriad of ways to alter the definition of Turing machines or their structure and retain the undecidability of the halting problem.We shall modify any TM  to an equivalent TM  ′ as follows: First, we may use the second marker (⊲) to fully surround the non-empty portions of the infinite tape and additional states that move this marker if extra space is required by the machine.More precisely, when the space needs to be created on the right side of the tape, the right-marker needs to be moved one cell to the right.That is, if the machine is in state , the current symbol read is the right-marker ⊲ and there is a right-move for (, ⋆) = (, , ), then we add a new state  ⊲ and transitions (, ⊲) = ( ⊲ , ⋆, ) and ( ⊲ , ⋆) = (, ⊲, ). ( Similarly, we need to remove extra ⋆ symbols between the markers.First of all, the extra ⋆ symbols are detected by adding a check for all transitions (, ) = (, ⋆, ) (except for those that were added for adding extra space).If it is then this extra ⋆ is shifted by the right-border marker ⊲, the machine is in a new state  reading ⊲ and we add transitions (, ⊲) = ( ′ , ⋆, ) and ( ′ , ⋆) = ( ′′ , ⊲, ), (2) for new states  ′ and  ′′ and then the machine moves back to where it printed the extra ⋆ and reads the border marker next to it.
Secondly, we may assume that the first step of the TM is to write ⊳ and ⊲ on the tape.We can further assume that the tape is cleared before meeting the final state ℎ, that is the problem of halting is to decide whether or not ( 0 , ⋆) ⊢ * (ℎ, ⋆).
It is obvious that the markers may be missing at some particular point of computation, but it is clear that ( 0 , ⋆) ⊢ *  ⊳(, ) if and only if ⊳( 0 , ⋆)⊲ ⊢ *  ′ ⊳(, )⊲.Let us note that the above changes (1) and ( 2) are done in our matrix simulation with one single matrix in each case and the markers are never missing.
Theorem 3. Let  be a Turing machine with delimiters, ⊳ and ⊲, surrounding non-blank tape content and where the initial configuration  = ( 0 , ⋆).It is undecidable whether the machine reaches configuration  again.

Matrix reachability from Turing machines
In this section, we simulate a Turing machine  using a matrix semigroup.That is, we will construct a set   = { 1 ,  2 , … ,   } ⊆ ℚ 3×3 that simulates  when the integrality test is performed after each multiplication.
The main idea in an encoding of the computation of a Turing machine is to cut the configuration (, ) into two words (, ) and , embed the pair of words into a matrix, and then to use specific matrices to move one symbol from one word to another.
It is worth highlighting that commonly an -ary representation of words is done using a simple encoding of letters.Assume that the alphabet is binary, i.e., let  = {, } and  ∈  * .Let  ∶  → ℕ be defined as () = 1 and ( ∑  =1 (  ) ⋅ 3 − .See, for example, [24,14,4].We use a different encoding that allows us to construct matrices with smaller elements.
Let  be the mapping Let (, ), where ⊳ is the first symbol and ⊲ is the last symbol, be the current configuration of the deterministic TM .We represent this by Note that # is a new marker symbol to ensure the element (3, 2) of our matrices is nonzero.Also note that the element (3,1) is never zero as it has (, ) for some  ∈  and  ∈ Γ.
We are ready to define the set of matrices   .We begin with transitions added by applying (1) or ( 2) when modifying the TM, we study these cases separately.Consider a transition (, ) = (, , ) of , we add matrix for every  ∈ Γ to   .Note that  is deterministic, so the state  and symbols  are uniquely determined by (, ).Similarly, a transition (, ) = (, , ) is represented by a matrix , for every  ∈ Γ which is also added to   .Then for the transitions added when applying (1) (originally (, ⋆) = (, , )) we add Similarly, the space removal in (2) is performed by a one special left-move matrix , and left-move matrices for  ∈ Γ Note also, that there exists at most one matrix in the set   (moving either to the left or to the right) for all combinations of (, ) ∈  × Γ and  ∈ Γ as the TM  is deterministic.Now, say the configuration of the Turing machine is (, ) and that there exists a (unique) transition (, ) = (, , ).The move of the TM  is represented by a product of the two matrices
For the sake of readability, let us define a mapping  ∶ ℚ 3×3 → ℚ 4 by setting When restricted to matrices ((, ),   ) and  (,), as defined above, the mapping is an isomorphism.To further simplify the notation, we will denote (((, ),   )) by ((, ),   ).Let us present a few observations next.Firstly, the mapping  is into ℕ 3×3 .Secondly, all matrices  (,), are rational, regardless of whether they correspond to the head moving left or right.If a configuration matrix is multiplied by the "correct" matrix, then resulting matrix is also integral, and even in ℕ 3×3 .On the other hand, multiplying by an "incorrect" matrix does not guarantee that the resulting matrix is not integral.Let us consider this in details: For the right transitions, let ((, ), #()  ) be a configuration and ( ( ′ , ′ ), ) correspond to a transition ( ′ ,  ′ ) = (, , ).
The resulting vector is and it is integral as long as  =  by the last element.In other words, the pair ( ′ ,  ′ ) does not have to match the pair (, ) of the configuration to ensure that the product is integral.In this case, ((, )) − (( ′ ,  ′ )) ≠ 0 will be in the coefficient of  in the second component of the  mapping.It remains to show that this remainder cannot be removed by further applications of "incorrect" matrices.This is proven in the upcoming Lemma 5.
Let us consider the above case of multiplying by a wrong matrix with transition to the left a bit further.It turns out that the matrix with  ≠  was used, then the next matrix in the product has to correspond to a move of the head to the right.Indeed, assume that we are in the case (7) with the second component integral and according to (8)  So we have showed that in both directions it is possible to get an integral matrix with an incorrect transition matrix, but if an incorrect matrix with transition to the left is applied, then the next matrix in the product has to be to the right to keep the matrix integral.
Next we shall show that if the configuration matrix is multiplied by a wrong matrix, it creates an integer that cannot be removed and thus leads to a non-integral rational number if the head is moved beyond this point. In other words, a multiplication by an incorrect matrix leaves behind a coefficient that makes the tape content on the other side inaccessible.
Before formally proving the above, let us define a useful notation.We say that matrix  ∈ ℕ 3×3 is valid if  = ((, ), #  ) for some configuration ((, ), ) of the TM .This means that the words  and  do not contain symbols from the set  × Γ.Note that this does not imply that ((, ), ) can be reached by the TM.Analogously, we say that ((, ), #  ) is valid if it corresponds to a valid configuration of .
Let us consider the step where the matrix becomes invalid as in our previous considerations.For that assume that () is valid and () is invalid for some matrix  ∈   , but () ∈ ℕ 4 .Since by our assumption () ∈ ℕ 4 , the component that invalidates the vector is in the second component by our considerations in ( 7) and ( 6).Namely, if  simulates a move to the right, then we have the case in (6) with  = , where the element in the second component is
V. Halava and R. Niskanen The following lemma shows that once a matrix product becomes invalid it cannot be transformed into a valid one with matrices in   .Lemma 5. Let () be valid, i.e.,  = ((, ), #  ) for some configuration.Let  ∈   such that () ∈ ℕ 4 is no longer valid.Then there is no sequence Proof.Let us assume that there exists a sequence   1 ,   2 , … ,    ∈   such that (  1   2 ⋯    ) is valid and moreover that  is the smallest index such that the product is valid.Assume first that the final matrix,    , corresponds to the head moving to the right and ) where each   is a sum of images of letters under  with both positive and negative coefficients.Note that by our assumption on the minimality of the sequence, at least one   is not an image of a letter in Γ under  for  = 1, … ,  or  0 is not an image of a letter in  × Γ.After multiplying (  1   2 ⋯   −1 ) with the final matrix   , we have ) Since the product matrix is valid, we observe that, for  = 2, … , ||, each   = (   ) for some    ∈ Γ.The multiplication can only affect  0 directly and  1 indirectly via carries.There are two cases to consider.In the first case  0 is () for some  ∈  × Γ which implies that  1 is not an image of a letter.In this case, for the product to be valid, () − (( ′′ ,  ′′ )) + ( ′′ ) = ( ) +  for some  ∈ Γ and  ∈ ℤ is such that  1 +  = () for some  ∈ Γ, must hold.This is not true due to the definition of .Indeed, the largest image under  is at most

3
. Hence  = 0 and there are no carries.Thus  1 prevents the result from being a valid matrix.In the second case,  0 does not correspond to an image of a letter from  × Γ under .Our goal is to show that there does not exist some  ∈ Γ such that  0 − (( ′′ ,  ′′ )) + ( ′′ ) = ().
We prove this as a separate lemma, Lemma 6, after this proof.
The case where matrix    corresponds to the head moving to the left is proven in analogous way.Let    correspond to the head moving to the left, i.e.,

𝜓(𝑀 𝑖
and where each   is a sum images of letters under  with both positive and negative coefficients.Again, we multiply the latter matrix by the former and obtain ) .
There are two subcases to consider.Either  0 = (( ′′ ,  ′′ )) or  0 is not an image of a letter under .The first subcase implies that   is not an image of a letter under  for some  = 1, 2, … , ||.That is, the analogous considerations as above show that these two coefficients do not become images of some letter under .
where (   ,    ) ∈  + and (   ,    ) ∈  − .We can divide both sides of the equation by primes not corresponding to indexes of  + and  − as every product contains those primes: Next, we take the common factors on both sides.Namely, those primes corresponding to indexes in  − on the left-hand side and to indexes in  + on the right-hand side.
Our first undecidability result follows from the halting problem.
Proof.Let  be as in the previous theorem and let  = (⊳( 0 , ⋆), #⊲), where ⊳( 0 , ⋆)⊲ is the initial configuration of Turing machine  with undecidable halting problem.It is clear, with help of Lemma 5, that  simulates  and that the two properties of the claim hold if and only if  halts.□

The identity problem for rational matrix semigroups with integrality tests
In this section, we apply Theorem 8 to show that the identity problem is undecidable in this setting.Let us first define the identity problem for a generating set  of a -dimensional matrix semigroup with entries from  ∈ {ℤ, ℚ, ℝ, ℂ}, i.e.,  ⊆  × .
Problem 9 (Identity problem).Given a finite set of matrices  ⊆  × .Does the identity matrix   belong to the semigroup ⟨⟩?
Recall that it is known that the identity problem is decidable for  = ℤ and  = 2 [3], undecidable for  = ℍ (rational quaternions) and  = 2 [2], and  = ℤ and  = 4 [5].We are studying the identity problem for three-dimensional matrices, which is a well-known open problem.
Let us introduce a variant of the identity problem, where there is an additional integrality test.Let  ⊆ ℚ × be a finite set of rational matrices.Consider  ∈ ⟨⟩ but  ∉ ℤ × .Then  ∉ ⟨⟩ ℤ .Naturally, we can also ask the other standard matrix semigroup questions for our scenario.But apart from the identity problem, hardly any new results can be derived as most of the problems are undecidable already for integral matrices (i.e., with no integrality tests required).

Problem 10 (Identity problem with integrality test). Given a finite set of matrices
Theorem 11.The identity problem with integrality test is undecidable for  ⊆ ℚ 3×3 .
Proof.Let  be the set   constructed in the previous section together with additional matrices  1 ,  2 used to embed the initial configuration and  3 to remove the final configuration.More precisely, the matrices are , where ⊳( 0 , ⋆)⊲ is the initial configuration and the second configuration is ⊳(, ⊲).
It is straightforward to see that a non-empty product resulting in the identity matrix has to start with  1 or with a matrix of form (5). Indeed, these are the only matrices in ℤ 3×3 .Let us first consider the case where  1 is the first matrix.We can further observe that multiplying  1 with any other matrix beside  1 or  2 or of form (5) result in a matrix that violates integrality.Indeed, for example, when  = Normally,  −1 (⊲) would be removed by the correct choice of a matrix with  = ⊲, but as the bottom right corner is not 1, this does not happen.
In any product resulting in the identity matrix, there must be an equal number of matrices  1 and  2 as multiplying by  2 is the only way to produce a matrix with 1 in the bottom right corner.
Let  ∈ { 1 ,  2 } * . is valid if and only if  =  1  2 .If  is not valid, then analogously to the proof of Lemma 5, it can be proven that a valid matrix cannot be obtained using matrices from   ∪ { 1 ,  2 ,  3 }.
Assume then that the first matrix is of form (5), i.e., is (  0 0 0 1 0 −((,⊲))+((,⊲) 0 1 ) for some ,  ∈  and  ∈ Γ.It is straightforward to see that matrices  2 ,  3 and those corresponding to moving the head to the right cannot be applied as the element (2, 2) would become rational.If  1 , a matrix corresponding to moving the head to the left or of the form ( 5) is applied, then the resulting matrix is not valid and by Lemma 5 cannot be made valid.
Finally, observe that, similarly to how  1 had to be the first matrix,  3 has to be the last matrix and, more specifically, can only multiply (⊳( 0 , ⋆), ⊲).The resulting matrix is the identity matrix.The matrix (⊳( 0 , ⋆), ⊲) is in the semigroup if and only if the TM halts.Thus the identity problem is undecidable.□
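As an illustration of what the integrality test means operationally, the following sketch searches for the identity matrix among non-empty products of bounded length, discarding every product that has a non-integral prefix. The reading that all partial products, including the first generator, must be integral is an assumption based on the proof above; the function names and the toy generators `D` and `H` are hypothetical. The search is bounded, so it does not contradict the undecidability result.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two square matrices stored as tuples of tuples."""
    n = len(A)
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )


def is_integral(M):
    """Integrality test: every entry has denominator 1."""
    return all(Fraction(x).denominator == 1 for row in M for x in row)


def identity_reachable(generators, max_len):
    """Look for the identity among products of length 1..max_len.

    Only products whose every prefix passes the integrality test
    are kept in the frontier.
    """
    n = len(generators[0])
    identity = tuple(
        tuple(Fraction(int(i == j)) for j in range(n)) for i in range(n)
    )
    frontier = {G for G in generators if is_integral(G)}
    for _ in range(max_len):
        if identity in frontier:
            return True
        frontier = {
            P
            for M in frontier
            for G in generators
            for P in [matmul(M, G)]
            if is_integral(P)
        }
    return False


# Toy generators: D is integral, H is rational and undoes D.
D = ((2, 0, 0), (0, 1, 0), (0, 0, 1))
H = ((Fraction(1, 2), 0, 0), (0, 1, 0), (0, 0, 1))
```

With these generators the identity is obtained as the product `D · H`, while no product may start with `H`, since `H` itself fails the test.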

Future work
In the previous sections, we constructed a generator set  that allow us to simulate a Turing machine when the partial products are tested to be integers.It would be interesting to see if it is possible to simulate a TM with a matrix semigroup where the integrality test is not performed after every multiplication.That is, there is a set  ⊆  such that the integrality is tested only after a matrix from  appears in the product.In other words, the model has fewer integrality checks or even a fixed number of integrality checks.This can be achieved by constructing a universal TM with special properties that ensure that a computation consists of some special transitions.These special transitions would then be transformed into matrices with integrality checks.This would require a careful analysis similar to Lemma 5 of "incorrect" simulations to make sure that a valid configuration cannot be obtained.

Fig. 1. An illustration of a transition of a TM and the corresponding encoding changes.

Example 4 .
Let ⊳(, )⊲ be a configuration of a TM and let us simulate transition (, ) = (, , ).The subsequent configuration is ⊳(, )⊲.The transition of the TM and the changes in the coefficients in the encoding are depicted in Fig.1.