On some properties of Toeplitz matrices

In this paper, we investigate some properties of Toeplitz matrices with respect to different matrix products. We also give some results regarding circulant matrices, skew-circulant matrices and approximation by Toeplitz matrices over the field of complex numbers.


PUBLIC INTEREST STATEMENT
Toeplitz matrices arise in a variety of problems in applied mathematics and engineering such as queuing theory, signal processing, time series analysis, integral equations, etc. Despite the fact that Toeplitz matrices have been around for a long time, there are still a number of open problems regarding Toeplitz matrices and Toeplitz operators. One of them is counting the number of involutory and nilpotent Toeplitz matrices over a finite field. In this paper, we provide some solutions to the problem only in very specific cases. We also give several optimal approximation results for Toeplitz, circulant and skew-circulant matrices over the field of complex numbers. Furthermore, we give a characterization for pairs of Toeplitz matrices over a commutative ring whose product with respect to the usual multiplication or the Kronecker product is Toeplitz.
We also consider Kronecker products of Toeplitz matrices, and we give some combinatorial results over the finite prime field ℤ_p.
In Section 3, we count the number of involutory and degree-two nilpotent Toeplitz matrices in M_n(ℤ_p) for particular values of n or p.
In Section 4, Toeplitz matrices over the field of complex numbers are studied. We show that for any matrix, there exists a closest Toeplitz matrix (with respect to the Frobenius norm) that approximates it. We describe the equivalence classes of Toeplitz matrices and give several results regarding circulant and skew-circulant matrices.

2. Product of Toeplitz matrices over a commutative ring
We consider three different binary operations: the usual matrix multiplication, the Kronecker product and the Schur product, denoted by AB, A ⊗ B and A ∗ B, respectively. The set of Toeplitz matrices is closed under the Schur product, but it is closed neither under the usual matrix multiplication nor under the Kronecker product. We characterize those pairs of Toeplitz matrices whose product with respect to the usual multiplication or the Kronecker product is again Toeplitz. The following theorem gives a characterization in terms of the usual matrix multiplication:

Theorem 2.1 If A = {a_{i−j}} and B = {b_{i−j}} are two n × n Toeplitz matrices over a commutative ring, then AB is a Toeplitz matrix if and only if the following system of (n − 1)² equations in 4(n − 1) variables holds:

a_i b_{−j} − a_{i−n} b_{n−j} = 0,  1 ≤ i, j ≤ n − 1.

Proof If we set C = AB, by the product formula we have c_{ij} = ∑_{k=1}^{n} a_{i−k} b_{k−j}. By definition, C is a Toeplitz matrix if and only if c_{i+1,j+1} = c_{ij}. Since c_{i+1,j+1} − c_{ij} = a_i b_{−j} − a_{i−n} b_{n−j}, the matrix C is Toeplitz if and only if a_i b_{−j} − a_{i−n} b_{n−j} = 0 for 1 ≤ i, j ≤ n − 1.
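The condition of Theorem 2.1 is easy to test numerically. The sketch below (the helper names `toeplitz` and `is_toeplitz` are ours) builds a pair of lower-triangular integer Toeplitz matrices, for which the condition holds trivially, and confirms that their product is Toeplitz, while a generic pair fails both the condition and the conclusion.

```python
# Sanity check of Theorem 2.1: AB is Toeplitz iff
# a_i * b_{-j} - a_{i-n} * b_{n-j} = 0 for all 1 <= i, j <= n - 1.
import numpy as np

def toeplitz(t, n):
    """n x n Toeplitz matrix from a dict t mapping offsets i - j to entries."""
    return np.array([[t[i - j] for j in range(n)] for i in range(n)])

def is_toeplitz(M):
    n = M.shape[0]
    return all(M[i + 1, j + 1] == M[i, j]
               for i in range(n - 1) for j in range(n - 1))

n = 4
# Lower-triangular Toeplitz matrices satisfy the condition trivially:
# b_{-j} = 0 and a_{i-n} = 0 for 1 <= i, j <= n - 1.
a = {k: (k + 1 if k >= 0 else 0) for k in range(-(n - 1), n)}
b = {k: (2 * k + 1 if k >= 0 else 0) for k in range(-(n - 1), n)}
A, B = toeplitz(a, n), toeplitz(b, n)
condition = all(a[i] * b[-j] - a[i - n] * b[n - j] == 0
                for i in range(1, n) for j in range(1, n))
print(condition, is_toeplitz(A @ B))   # True True

# A generic pair violates the condition, and indeed AB fails to be Toeplitz:
a2 = {k: k + 2 for k in range(-(n - 1), n)}
b2 = {k: 3 * k + 1 for k in range(-(n - 1), n)}
cond2 = all(a2[i] * b2[-j] - a2[i - n] * b2[n - j] == 0
            for i in range(1, n) for j in range(1, n))
print(cond2, is_toeplitz(toeplitz(a2, n) @ toeplitz(b2, n)))  # False False
```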
It is clear that the inverse of an arbitrary nonsingular Toeplitz matrix is not necessarily Toeplitz. However, the following proposition shows how stability under inversion, with respect to being Toeplitz, is related to stability under multiplication. For an arbitrary n × n matrix A over a commutative ring, the coefficients of the characteristic polynomial p_A(x) can be computed in many different ways, from the eigenvalues of the matrix or directly from its entries (Brooks, 2006). In particular, we use the formula c_{n−m} = (−1)^m tr(Λ^m A), where Λ^m A is the m-th exterior power of A; equivalently, (−1)^m c_{n−m} is the sum of all principal minors of A of size m. Furthermore, if the characteristic of the commutative ring is zero, then it is known (Winitzki, 2010, Section 3.9) that c_{n−m} equals (−1)^m/m! times the determinant of an m × m matrix C built from the traces of the powers A, A², …, A^m.

Proposition 2.2 Let A be a nonsingular n × n Toeplitz matrix over a commutative ring whose powers A², …, A^{n−1} are all Toeplitz. Then A^{−1} is a Toeplitz matrix.

Proof By the Cayley–Hamilton theorem, A^{−1} is a linear combination of I, A, …, A^{n−1}, where each coefficient c_{n−m} of the characteristic polynomial can be computed explicitly as the (signed) sum of all principal minors of A of size m, and a linear combination of Toeplitz matrices is Toeplitz. Furthermore, if the matrix is considered over a commutative ring of characteristic zero, then c_{n−m} can be computed from traces as above. The trace of a Toeplitz matrix is simply n times its diagonal entry, and the diagonal entry of the m-th power of A, denoted here by (A^m)_0, can be calculated recursively.

Remark 2.3
If an upper (lower) triangular Toeplitz matrix is invertible, then its inverse is Toeplitz, because the product of two upper (lower) triangular Toeplitz matrices is again an upper (lower) triangular Toeplitz matrix. Since an upper (lower) unitriangular matrix is always invertible and its inverse is an upper (lower) unitriangular matrix, the inverse of any upper (lower) unitriangular Toeplitz matrix is also an upper (lower) unitriangular Toeplitz matrix.
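A small numerical illustration of the remark (the helper `is_toeplitz` is ours): inverting a lower unitriangular Toeplitz matrix yields a matrix that is again lower unitriangular Toeplitz.

```python
# The inverse of a lower unitriangular Toeplitz matrix is again
# lower unitriangular Toeplitz.
import numpy as np

def is_toeplitz(M, tol=1e-9):
    n = M.shape[0]
    return all(abs(M[i + 1, j + 1] - M[i, j]) < tol
               for i in range(n - 1) for j in range(n - 1))

n = 5
t = [1.0, 2.0, -1.0, 3.0, 0.5]   # t[0] on the diagonal, t[k] on the k-th subdiagonal
T = np.array([[t[i - j] if i >= j else 0.0 for j in range(n)] for i in range(n)])
Tinv = np.linalg.inv(T)
print(is_toeplitz(Tinv))                  # True
print(np.allclose(np.diag(Tinv), 1.0))    # unit diagonal is preserved
```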
Theorem 2.4 Let A = {a_{i−j}} be an n × n Toeplitz matrix and B = {b_{i−j}} be an m × m Toeplitz matrix over a commutative ring. The mn × mn matrix A ⊗ B is Toeplitz if and only if the following system of 2(m − 1)(n − 1) equations in 2(n + m) − 3 variables holds:

a_t b_{m−v} = a_{t+1} b_{−v},  −(n − 1) ≤ t ≤ n − 2,  1 ≤ v ≤ m − 1.

Proof With regard to the Kronecker multiplication of two matrices, A ⊗ B consists of the n² blocks a_{r−s}B, each of which is Toeplitz, and A ⊗ B is Toeplitz if and only if each of its 2nm − 1 diagonals is constant, i.e. if and only if on each diagonal, which is formed by concatenation of diagonals of adjacent blocks, the entries agree with each other.
Clearly, for the top-right corner block of A ⊗ B, there is no concatenation, and therefore the top m − 1 diagonals always satisfy the property; the same holds for the bottom m − 1 diagonals of A ⊗ B. Furthermore, the main diagonal of A ⊗ B is just the concatenation of n copies of the main diagonal of B, multiplied by a_0, i.e. it is constantly a_0 b_0. We also notice that the diagonals of A ⊗ B formed by concatenating the main diagonals of internal blocks impose no extra conditions: each of them is automatically constant, since A and B are both Toeplitz.
Regarding the remaining diagonals, however, we certainly have concatenations of different diagonals of adjacent blocks, which must agree. A simple computation shows that the equations above come exactly from those concatenations. Thus, out of the 2mn − 1 diagonals of A ⊗ B, there are always (2n − 1) + 2(m − 1) that satisfy the property automatically: 2n − 1 of them come from the concatenated main diagonals of the blocks, and 2(m − 1) from the diagonals lying entirely within the single blocks in the top-right and bottom-left corners.
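One family that satisfies the system of Theorem 2.4 is easy to exhibit: if all a_t are equal and B is a circulant (so that b_{m−v} = b_{−v}), every equation a_t b_{m−v} = a_{t+1} b_{−v} holds, and A ⊗ B must be Toeplitz. A numerical sketch (helper names ours):

```python
# Kronecker products of Toeplitz matrices: a constant Toeplitz matrix A
# tensored with a circulant B gives a Toeplitz matrix; a generic Toeplitz A
# does not.
import numpy as np

def is_toeplitz(M):
    n = M.shape[0]
    return all(M[i + 1, j + 1] == M[i, j]
               for i in range(n - 1) for j in range(n - 1))

n, m = 3, 4
A = np.ones((n, n), dtype=int)              # a_t = 1 for all t
first_row = np.array([5, 7, 1, 3])
B = np.array([[first_row[(j - i) % m] for j in range(m)]
              for i in range(m)])           # circulant, so b_{m-v} = b_{-v}
print(is_toeplitz(np.kron(A, B)))           # True

A2 = np.array([[(i - j) + 2 for j in range(n)] for i in range(n)])  # generic Toeplitz
print(is_toeplitz(np.kron(A2, B)))          # False
```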
3. Involutory and nilpotent Toeplitz matrices over the finite field ℤ_p

Let ℐ(n, p) and 𝒩(n, p), respectively, denote the number of involutory Toeplitz matrices and of nilpotent Toeplitz matrices of degree two in M_n(ℤ_p). In the following theorem, we calculate ℐ(n, p) for specific values of p or n. Since (A + I)/2 is again a Toeplitz matrix whenever A is, and A is involutory exactly when (A + I)/2 is idempotent, for p ≠ 2 the quantity ℐ(n, p) also gives the number of idempotent Toeplitz matrices in M_n(ℤ_p). Moreover, for p = 2, a Toeplitz matrix A is a nontrivial involutory matrix if and only if the Toeplitz matrix A − I is nilpotent of degree two. Therefore, ℐ(n, 2) − 1 is also the number of degree-two nilpotent Toeplitz matrices when n is odd.
Theorem 3.1 The following hold:

(i) if n is odd, then ℐ(n, 2) = 2^{(n+1)/2} − 1;
(ii) if n = p, then ℐ(p, p) = 2;
(iii) for odd p, ℐ(2, p) = p + 1, and for odd p greater than 3, ℐ(3, p) = 2p.

Proof (i) Write n = 2k + 1. The equation T² = I in ℤ_2 forces the 2k + 1 central entries: t_0 = 1 and t_l = 0 for 1 ≤ |l| ≤ k, while the 2k outer entries t_l, l = −2k, …, −k − 1, k + 1, …, 2k, remain undetermined. For a matrix T of this form, the equation T² = I in ℤ_2 reduces to the following k² equations in ℤ_2: t_{i−n} t_{n−j} = 0 for 1 ≤ i, j ≤ k. If t_{i−n} = 0 for all i, then t_{n−j} can be either zero or one for each j, so there are 2^k solutions in this case. Similarly, if t_{n−j} = 0 for all j, then t_{i−n} can be either zero or one for each i, which gives 2^k solutions as well. However, the case t_{i−n} = t_{n−j} = 0 for all 1 ≤ i, j ≤ k has been counted twice. Hence, there are 2 · 2^k − 1 solutions in total, and since k = (n − 1)/2, we conclude that ℐ(n, 2) = 2^{(n+1)/2} − 1.
(ii) Since trace(T) = p t_0 ≡ 0 (mod p) when n = p, the sum of the eigenvalues of T must be zero. As T is involutory, its eigenvalues are ±1; if a of them equal 1 and p − a equal −1, then 2a − p ≡ 0 (mod p), which forces a = 0 or a = p. This is only possible if all eigenvalues of T are 1 or all are −1, i.e. T = ±I.
(iii) For n = 2, the matrix equation T² = I in ℤ_p, where p is odd, gives the equations t_0² + t_1 t_{−1} = 1, 2 t_0 t_1 = 0 and 2 t_0 t_{−1} = 0. If t_0 ≠ 0, then t_1 = t_{−1} = 0 and t_0² = 1, giving two solutions; if t_0 = 0, then t_1 t_{−1} = 1, giving p − 1 solutions. Hence ℐ(2, p) = p + 1. For n = 3, the matrix equation T² = I in ℤ_p, where p is odd and greater than 3, gives the following equations in ℤ_p: t_0² + t_1 t_{−1} + t_2 t_{−2} = 1, t_0² + 2 t_1 t_{−1} = 1, 2 t_0 t_1 + t_{−1} t_2 = 0, 2 t_0 t_{−1} + t_1 t_{−2} = 0, t_1² + 2 t_0 t_2 = 0 and t_{−1}² + 2 t_0 t_{−2} = 0. We note that t_0 cannot be zero: if t_0 = 0, then t_1 = t_{−1} = 0 by the equations t_1² + 2 t_0 t_2 = 0 and t_{−1}² + 2 t_0 t_{−2} = 0, and the equation t_0² + 2 t_1 t_{−1} = 1 then gives a contradiction. Furthermore, it follows from the above equations and the invertibility of t_0 that if t_1 = 0, then t_{−1} = t_2 = t_{−2} = 0 and t_0² = 1, and if t_{−1} = 0, then t_1 = t_2 = t_{−2} = 0 and t_0² = 1. Hence, we have two solutions in this case, namely t_0 = 1 and t_0 = p − 1, with all other entries zero. In the remaining case, t_1 and t_{−1} are both invertible, which means that by knowing the values of t_0, t_1 and t_{−1}, we can uniquely determine the values of t_2 and t_{−2} from the equations. Thus, it suffices to count the solutions of the equation t_0² + 2 t_1 t_{−1} = 1 subject to the remaining equations. Since this system has 2(p − 1) solutions in ℤ_p, where p is odd and greater than 3, there are 2(p − 1) + 2 = 2p solutions in total.
For n = 4, the matrix equation T² = I in ℤ_p, where p is odd, gives a system of equations in ℤ_p. These equations indicate that if t_0 = 0, then either t_2 t_{−2} = 1 or 2 t_1 t_{−1} = 1, with the remaining variables zero. Since each of the equations t_2 t_{−2} = 1 and 2 t_1 t_{−1} = 1 has p − 1 solutions in ℤ_p, there are 2(p − 1) solutions when t_0 = 0. If t_0 ≠ 0, then there are two cases to consider: (1) if either t_1 or t_{−1} is zero, then it follows from the equations that all variables except t_0 must be zero; therefore, this case leads to the equation t_0² = 1, which has two solutions.
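For small parameters, the counts in Theorem 3.1 can be confirmed by exhaustive search; a sketch (the function name is ours):

```python
# Brute-force verification of Theorem 3.1: count Toeplitz matrices T in
# M_n(Z_p) with T^2 = I by enumerating all 2n - 1 independent entries.
import itertools
import numpy as np

def count_involutory_toeplitz(n, p):
    count = 0
    for t in itertools.product(range(p), repeat=2 * n - 1):
        # t[k] is the entry on the diagonal with offset i - j = k - (n - 1)
        T = np.array([[t[i - j + n - 1] for j in range(n)] for i in range(n)])
        if np.array_equal(T @ T % p, np.eye(n, dtype=int)):
            count += 1
    return count

print([count_involutory_toeplitz(n, 2) for n in (3, 5)])  # part (i): [3, 7]
print(count_involutory_toeplitz(2, 5))                    # part (iii): p + 1 = 6
print([count_involutory_toeplitz(3, p) for p in (5, 7)])  # part (iii): [10, 14]
```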
Theorem 3.3 The following hold:

(i) 𝒩(2, 2) = 3, and 𝒩(2, p) = 2(p − 1) for p ≠ 2;
(ii) 𝒩(3, p) = 2p − 2 for p ≠ 3, and 𝒩(3, 3) = 8.

Proof (i) The matrix equation T² = 0 results in the equations t_0² + t_1 t_{−1} = 0, 2 t_0 t_1 = 0 and 2 t_0 t_{−1} = 0. If p = 2, there is just one equation, t_0² + t_{−1} t_1 = 0, which has three nonzero solutions. Assume p ≠ 2. If t_1 and t_{−1} are both nonzero, then t_0 = 0, and the resulting equation t_1 t_{−1} = 0 is a contradiction. Therefore, either t_1 and t_{−1} are both zero, which forces T = 0 and is not counted, or exactly one of them is nonzero, which gives 2(p − 1) solutions. Hence, there are 2(p − 1) solutions in total.
(ii) The matrix equation T² = 0 results in the equations t_0² + t_1 t_{−1} + t_2 t_{−2} = 0, t_0² + 2 t_1 t_{−1} = 0, 2 t_0 t_1 + t_{−1} t_2 = 0, 2 t_0 t_{−1} + t_1 t_{−2} = 0, t_1² + 2 t_0 t_2 = 0 and t_{−1}² + 2 t_0 t_{−2} = 0. It follows from these equations that the cases t_1 = 0, t_{−1} ≠ 0 and t_{−1} = 0, t_1 ≠ 0 result in contradictions. If t_1 = t_{−1} = 0, then t_0 = 0, and the equation t_2 t_{−2} = 0 has 2p − 2 nonzero solutions. If t_1 ≠ 0, t_{−1} ≠ 0 and p ≠ 3, then the equation −9 t_1 t_{−1} = 0, which follows from the above equations, is a contradiction. Therefore, the case t_1 ≠ 0, t_{−1} ≠ 0 is not possible when p ≠ 3, and there are 2p − 2 solutions in total. If t_1 ≠ 0, t_{−1} ≠ 0 and p = 3, then −9 t_1 t_{−1} = 0 is no longer a contradiction; there are four solutions in this case and 2 · 3 − 2 = 4 solutions in the previous case, which amounts to eight solutions when p = 3.
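The nilpotent counts can be checked by brute force as well; the sketch below (function name ours) enumerates nonzero Toeplitz matrices with T² = 0.

```python
# Brute-force verification of Theorem 3.3: count nonzero Toeplitz matrices
# T in M_n(Z_p) with T^2 = 0.
import itertools
import numpy as np

def count_nilpotent_toeplitz(n, p):
    count = 0
    for t in itertools.product(range(p), repeat=2 * n - 1):
        if not any(t):
            continue                    # the zero matrix is excluded
        T = np.array([[t[i - j + n - 1] for j in range(n)] for i in range(n)])
        if not np.any(T @ T % p):       # T^2 = 0 (mod p)
            count += 1
    return count

print(count_nilpotent_toeplitz(2, 2), count_nilpotent_toeplitz(2, 5))  # 3 8
print(count_nilpotent_toeplitz(3, 3), count_nilpotent_toeplitz(3, 5))  # 8 8
```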
Remark 3.4 For matrices in M_n(ℤ_p), the numbers of invertible, involutory (Hodges, 1958) and nilpotent (Fine & Herstein, 1958) matrices are known; in particular, g(n, p) = ∏_{k=0}^{n−1}(p^n − p^k) and l(n, p) = p^{n²−n}. Therefore, the probabilities of being invertible, involutory and nilpotent are g(n, p)/p^{n²}, I(n, p)/p^{n²} and l(n, p)/p^{n²} = 1/p^n, respectively. Since the number of invertible Toeplitz matrices is (p − 1)p^{2n−2} (Kaltofen & Lobo, 1996, Theorem 4), the probability that a Toeplitz matrix is invertible is (p − 1)p^{2n−2}/p^{2n−1} = 1 − 1/p, which is independent of n. By Theorem 3.1, the probability that a Toeplitz matrix in M_n(ℤ_2), where n is odd, is involutory is (2^{(n+1)/2} − 1)/2^{2n−1}.

Proposition 3.5 The number of n × n Toeplitz matrices T over ℤ_p all of whose entries satisfy t^m = 1 is gcd(m, p − 1)^{2n−1}.

Proof The entries of T must satisfy the equation t_{ij}^m = 1 (mod p). Therefore, we are counting the elements of ℤ_p whose order divides m and also divides p − 1, i.e. whose order divides gcd(m, p − 1). Hence, the number of such elements is ∑_{d ∣ gcd(m, p−1)} φ(d) = gcd(m, p − 1), where φ is Euler's totient function. Since a Toeplitz matrix has 2n − 1 independent entries, there are gcd(m, p − 1)^{2n−1} different matrices T with the aforementioned property.
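Both counting claims are easy to spot-check (a sketch; the variable names are ours): the number of invertible 2 × 2 Toeplitz matrices over ℤ_3 should be (p − 1)p^{2n−2} = 18 out of p^{2n−1} = 27, and the number of elements x of ℤ_7 with x⁴ = 1 should be gcd(4, 6) = 2.

```python
# Spot checks for Remark 3.4 and Proposition 3.5 over small fields.
import itertools
import math

p, n = 3, 2
invertible = sum(
    1 for t_m1, t0, t1 in itertools.product(range(p), repeat=3)
    if (t0 * t0 - t1 * t_m1) % p != 0   # det of [[t0, t_m1], [t1, t0]]
)
print(invertible, (p - 1) * p ** (2 * n - 2))   # 18 18

m, q = 4, 7
roots = [x for x in range(1, q) if pow(x, m, q) == 1]
print(len(roots), math.gcd(m, q - 1))           # 2 2
```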

4. Toeplitz matrices over the complex field ℂ
In this section, we consider Toeplitz matrices over the complex field ℂ. We define a metric that measures how far a matrix is from being Toeplitz: with any matrix A we associate the Toeplitz matrix T_A whose k-th diagonal is constantly equal to the average of the entries of the k-th diagonal of A, and we use the distance ‖A − T_A‖.

Theorem 4.1 If T A is the associated Toeplitz matrix of A such that ‖T
We now consider involutory Toeplitz matrices over the complex field ℂ. The following simple proposition shows that if there exists one such matrix, then there exist uncountably many.

Proposition 4.2 If T = [t_{i−j}] is an involutory Toeplitz matrix over ℂ, and v is a nonzero complex scalar, then the Toeplitz matrix [v^{i−j} t_{i−j}] is also involutory.
Proof The two matrices are related by a similarity transformation by a diagonal matrix. See for example, Proposition 3.1 in Kucerovsky and Sarraf (2014).
In view of the above result, it would appear inevitable that we should consider some equivalence relation, or some restriction on the class of involutory Toeplitz matrices considered. We give two such results:

Theorem 4.3 Up to similarity, every involutory Toeplitz matrix in M_n(ℂ) is equivalent to an involutory circulant, and the number of similarity classes is n + 1.

Proof Let T be an involutory Toeplitz matrix. Since the polynomial p(x) := x² − 1 annihilates T, it follows that the minimal polynomial of T divides p. But because p has distinct roots over ℂ, it follows that the minimal polynomial of T has distinct roots over ℂ, and thus that T is diagonalizable. Thus, T is equivalent under similarity to a diagonal matrix Λ. Let F denote the usual Vandermonde matrix representation of the finite Fourier transform (over ℂ). Then, F^{−1}ΛF is equivalent to T, is a Hermitian matrix and is a circulant (Driessel, 2011). Thus, the number of equivalence classes is given by the number of possible involutory diagonal matrices in M_n(ℂ) up to permutation of the diagonal entries, which is n + 1, since two diagonal matrices are similar if and only if their diagonals are permutations of one another. Note: the above proof works over more general fields; we just need sufficiently many roots of unity and the existence of 1/n. We can also restrict the class of Toeplitz matrices considered:

Theorem 4.4 In M_n(ℝ), the involutory symmetric Toeplitz matrices are all either symmetric real circulants or symmetric real skew-circulants. If n is even and greater than 2, there are a total of 3 · 2^{n/2} − 2 such matrices. If n is odd and greater than 1, there is a total of 2^{(n+3)/2} − 2 such matrices.
Proof We observe that if a symmetric real matrix satisfies T² = Id, it is in fact an orthogonal matrix. Orthogonal and symmetric real Toeplitz matrices are either circulants or skew-circulants (Böttcher, 2008). Counting these, following Böttcher (2008): if n is odd and greater than 1, there is a total of 2^{(n+3)/2} − 2 such matrices, and if n is even and greater than 2, there are a total of 3 · 2^{n/2} − 2 such matrices.
Theorem 4.5 Consider the Frobenius norm ‖·‖ on the complex n-by-n matrices, M_n(ℂ). Given an arbitrary A ∈ M_n(ℂ), there exists a unique complex Toeplitz matrix minimizing ‖A − T‖ over all Toeplitz matrices T. This unique minimizing matrix is the Toeplitz matrix given in each diagonal by the average of the elements of the corresponding diagonal of A. The same statement holds in the real case.
Proof Under this norm, M_n(ℂ) is a complex Hilbert space, with the complex Toeplitz matrices forming a subspace 𝒯. Let e_1, …, e_{2n−1} denote the Toeplitz matrices that are zero on all diagonals except one, the k-th, on which every element equals 1. At the level of Hilbert spaces, this is a finite orthogonal set spanning the subspace 𝒯. Thus, we can explicitly construct the projection from M_n(ℂ) onto 𝒯 in the form P(A) = ∑_{k=1}^{2n−1} (⟨A, e_k⟩ / ⟨e_k, e_k⟩) e_k. It is well known that orthogonal projection onto a closed subspace of a Hilbert space minimizes the norm distance; in this case, the subspace is closed as a consequence of its finite dimensionality. It is clear that P(A) is given in each diagonal by the average of the elements of the corresponding diagonal of A. In the real case, exactly the same proof works, but with real Hilbert spaces instead of complex Hilbert spaces.
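Theorem 4.5 translates directly into a diagonal-averaging procedure. The sketch below (the function name is ours) builds the nearest Toeplitz matrix and checks the minimizing property against random Toeplitz competitors.

```python
# Nearest Toeplitz matrix in the Frobenius norm: average each diagonal.
import numpy as np

def nearest_toeplitz(A):
    """Project A onto the Toeplitz subspace by averaging each diagonal."""
    n = A.shape[0]
    T = np.zeros_like(A)
    for k in range(-(n - 1), n):
        avg = A.diagonal(k).mean()
        for i in range(max(0, -k), min(n, n - k)):
            T[i, i + k] = avg
    return T

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
T_A = nearest_toeplitz(A)
best = np.linalg.norm(A - T_A)

# No random Toeplitz matrix S does better than the diagonal average T_A.
ok = True
for _ in range(200):
    t = rng.standard_normal(2 * n - 1)
    S = np.array([[t[i - j + n - 1] for j in range(n)] for i in range(n)])
    ok = ok and best <= np.linalg.norm(A - S) + 1e-12
print(ok)  # True
```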
It seems interesting to consider generalizations of the above Theorem to the C*-norm on M n (ℂ).
However, projection onto linear subspaces of C*-algebras has poor properties in general: the operation of projection onto a subspace can even increase the norm distance between elements (Silverman, 1969). The situation improves when we consider projection onto subalgebras. Tomiyama (1957) showed that each projection of norm one of a C*-algebra onto a C*-subalgebra is a conditional expectation, and Takesaki (1972) showed that in a finite von Neumann algebra, there exist faithful conditional expectations onto von Neumann subalgebras. We will make use of this result, and will explicitly construct faithful conditional expectations onto the circulant algebra and the skew-circulant algebra. We will show that these expectations are essentially unique.
For information on circulants and skew-circulants, see Driessel (2011). The circulants of a given order form an abelian subalgebra of M_n(ℂ), as do the skew-circulants. The circulants can be simultaneously diagonalized by (a scalar multiple of) the DFT matrix F := [ω^{(i−1)(j−1)}], where ω is a primitive (complex) n-th root of unity. The skew-circulants can be simultaneously diagonalized by FΩ^{1/2}, where Ω is a suitable diagonal unitary matrix; see Proposition 31 in Driessel (2011). We will write W = H*ΛH, where W is a skew-circulant and Λ is a diagonal matrix, or C = F*ΛF for a circulant C.
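The diagonalization statement can be checked numerically; the sketch below builds the DFT matrix F = [ω^{(i−1)(j−1)}] with ω = e^{2πi/n} and verifies that F C F^{−1} is diagonal for a random circulant C, with the eigenvalues given by the DFT of the first row.

```python
# The DFT matrix simultaneously diagonalizes all circulants.
import numpy as np

n = 6
omega = np.exp(2j * np.pi / n)
idx = np.arange(n)
F = omega ** np.outer(idx, idx)             # [omega^{(i-1)(j-1)}], 0-indexed
Finv = omega ** (-np.outer(idx, idx)) / n   # F^{-1} = (1/n) [omega^{-(i-1)(j-1)}]

rng = np.random.default_rng(1)
c = rng.standard_normal(n)
C = np.array([[c[(j - i) % n] for j in range(n)] for i in range(n)])  # circulant

D = F @ C @ Finv
off = D - np.diag(np.diag(D))
print(np.allclose(off, 0))                     # True: D is diagonal
print(np.allclose(np.diag(D), np.fft.fft(c)))  # eigenvalues = DFT of first row
```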

Theorem 4.6 A faithful, trace-preserving, idempotent conditional expectation of operator norm 1 of M_n(ℂ) onto the circulants is given by m ↦ F* D(F m F*) F, where D takes all elements that are not on the main diagonal to zero and preserves the elements of the main diagonal, and F is the (unitary) discrete Fourier transform matrix.
Proof Since the map m ↦ FmF* is a unitary transformation diagonalizing the circulants, it suffices to show that the map D is a faithful, idempotent, trace-preserving conditional expectation of norm 1 of M_n(ℂ) onto the subalgebra of diagonal matrices. To show that D is a conditional expectation, we verify that the defining property D(dmd*) = d D(m) d* holds, where d is a diagonal matrix and m is in M_n(ℂ). It is apparent that D : M_n(ℂ) → M_n(ℂ) preserves the trace. Since the diagonal elements of a positive semidefinite matrix are real and nonnegative, it follows that D maps a positive semidefinite matrix to a positive semidefinite (diagonal) matrix. If D were not faithful, there would be some nonzero positive semidefinite matrix that maps to zero under D. But a positive semidefinite matrix with zero principal diagonal is in fact the zero matrix (Horn & Johnson, 1990, p. 398). Conditional expectations are completely positive maps, and a completely positive map of unital algebras has norm one if and only if it takes the unit to the unit. Evidently, the map D is unital, and thus has norm 1. It is clear that D is idempotent.
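A numerical sketch of Theorem 4.6 (all names ours): with the unitary DFT matrix F, the map E(m) = F* D(F m F*) F outputs a circulant, is idempotent, preserves the trace, and fixes circulants.

```python
# Conditional expectation onto the circulant subalgebra:
# E(m) = F* D(F m F*) F, with D the cut-down to the main diagonal.
import numpy as np

n = 5
idx = np.arange(n)
F = np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)  # unitary DFT

def E(m):
    x = F @ m @ F.conj().T
    return F.conj().T @ np.diag(np.diag(x)) @ F

def is_circulant(M):
    return all(np.isclose(M[i, j], M[0, (j - i) % n])
               for i in range(n) for j in range(n))

rng = np.random.default_rng(2)
m = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Em = E(m)

print(is_circulant(Em))                       # output is a circulant
print(np.allclose(E(Em), Em))                 # idempotent
print(np.isclose(np.trace(Em), np.trace(m)))  # trace-preserving

c = rng.standard_normal(n)
C = np.array([[c[(j - i) % n] for j in range(n)] for i in range(n)])
print(np.allclose(E(C), C))                   # fixes circulants
```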
Corollary 4.7 There is only one idempotent linear mapping of norm one of M_n(ℂ) onto the circulant subalgebra in M_n(ℂ). It is given by the map in the above Theorem.
Proof By Theorem 1 in Tomiyama (1957), any such idempotent norm-one linear mapping L is a conditional expectation of a finite-dimensional von Neumann algebra onto a finite-dimensional von Neumann subalgebra. By finite-dimensionality, the mapping is necessarily normal. By Proposition 1.2 in Tomiyama (1972), the mapping L is necessarily faithful. By Theorem 6.2.2 (and Theorem 6.2.3) in Arveson (1967), the mapping L is unique. It therefore follows that any two maps satisfying the hypotheses of the corollary are in fact the same map, and clearly, the map provided by Theorem 4.6 satisfies the hypotheses.
From the uniqueness, we have the following further corollary:

Corollary 4.8 The map of Theorem 4.6 minimizes the error in (operator) norm among all idempotent projections of M_n(ℂ) onto the circulant subalgebra in M_n(ℂ).
Replacing circulants by skew-circulants in the proofs of the last three results, we obtain analogous results for skew-circulants. We summarize as follows:

Theorem 4.9 A faithful, trace-preserving, idempotent conditional expectation of operator norm 1 of M_n(ℂ) onto the skew-circulants exists, and it is the unique idempotent norm-one linear mapping onto the skew-circulant subalgebra.

Since every Toeplitz matrix can be uniquely decomposed as the sum of a circulant and a skew-circulant, we have the following remark:

Remark 4.10 For each matrix m, the sum of the maps from Theorems 4.6 and 4.9, applied to m, gives a Toeplitz matrix.
We next consider how the usual product and the Schur product interact with the circulant property. The finite Fourier transform over the complex numbers can be described most briefly as the unitary transformation that is (up to a scalar multiple) represented by the N-by-N matrix [a_{ij}] with a_{ij} = ω^{(i−1)(j−1)}. The complex number ω is a principal N-th root of unity. This definition can be made in a finite field. If ω is a principal N-th root of unity in a finite field, it follows that ω^N = 1 and ∑_{j=0}^{N−1} ω^{ji} = 0 for 1 ≤ i < N, and these are the main properties used in establishing the well-known properties of the (complex) finite Fourier transform with respect to convolution. In order to be able to invert the generalized finite Fourier transform, it is necessary that N not be divisible by the characteristic of the field. See Chapter 11 of Elliott and Rao (1982) for more information. If these conditions are met, then the generalized finite Fourier transform takes the convolution operation defined by multiplication by a circulant matrix to multiplication by a diagonal matrix; see Sections 11.2 and 11.3 of Elliott and Rao (1982). Thus, we are able to diagonalize circulant matrices, even over a finite field. The diagonal matrix obtained will be referred to as the eigenvalue matrix of the circulant, and denoted Λ(A), where A is the given circulant. Let us say that the circular convolution of two sequences is the first row of the matrix product of the circulants having the given sequences as their first rows. Thus, if Circ(a_1, …, a_n)Circ(b_1, …, b_n) = Circ(c_1, …, c_n), then the sequence (c_1, …, c_n) is the circular convolution of (a_1, …, a_n) and (b_1, …, b_n).
Proposition 4.11 Let A and B be two circulants over a finite field whose characteristic is relatively prime to the matrix dimension N of the circulant matrices. Then the eigenvalues of AB and of A ∗ B can be determined in terms of the eigenvalues of A and B: the eigenvalues of AB are given by Λ(A) ∗ Λ(B), and A ∗ B = (1/N) C, where C is the circulant matrix whose eigenvalue sequence is the circular convolution of the eigenvalue sequences of A and B.
Proof Since the generalized Fourier transform is a unitary transformation, W, it is clear that WABW −1 = WAW −1 WBW −1 . Thus, the eigenvalues of AB are the elementwise product of the eigenvalues of A and B. The main issue is the behaviour of the Schur product.
To reduce notation, let us consider first the case where both A and B have the property that all the eigenvalues except one are zero, and that eigenvalue is 1. Then, A = W^{−1}Λ(A)W is an outer product of a column of W^{−1} and a row of W. Since W = [ω^{(i−1)(j−1)}] and W^{−1} = (1/N)[ω^{−(i−1)(j−1)}], we thus have that A = (1/N)[ω^{(j−i)(k−1)}], where k is the index of the nonzero eigenvalue of A. Similarly, B = (1/N)[ω^{(j−i)(ℓ−1)}], where ℓ is the index of the nonzero eigenvalue of B. Taking the Schur product, we have A ∗ B = (1/N²)[ω^{(j−i)(ℓ+k−2)}]. We can thus write A ∗ B = (1/N) C, where C is the circulant matrix with all eigenvalues zero except for a 1 in the (ℓ + k − 1)-th place. Since ω^N = 1, the values of the exponent in ω^{(j−i)(ℓ+k−2)} may be taken modulo N. Thus, we see that the eigenvalue sequence of A ∗ B is 1/N times the circular convolution of the eigenvalue sequence of A and the eigenvalue sequence of B. The case of general circulants A and B follows by taking linear combinations.
The above result can be rephrased as follows: if we diagonalize a circulant using the generalized discrete Fourier transform, and then put the eigenvalue sequence into the first row of a circulant matrix, then up to a scalar multiple, Schur products are transformed into matrix products. This transformation is invertible.
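Proposition 4.11 can be checked concretely over a small finite field. In the sketch below (all helper names ours) we take N = 4 and ℤ_5, where ω = 2 is a primitive 4th root of unity (its powers are 2, 4, 3, 1), with 2^{−1} = 3 and 4^{−1} = 4 in ℤ_5.

```python
# Proposition 4.11 over Z_5 with N = 4: eigenvalues of AB multiply
# elementwise; eigenvalues of A * B (Schur) are N^{-1} times the circular
# convolution of the eigenvalue sequences.
import numpy as np

p, N, w = 5, 4, 2        # field Z_5, 4x4 circulants, w = 2 a primitive 4th root of 1
winv, Ninv = 3, 4        # 2^{-1} = 3 and 4^{-1} = 4 in Z_5

# Generalized DFT matrix and its inverse W^{-1} = N^{-1} [w^{-(i-1)(j-1)}]
W = np.array([[pow(w, i * j, p) for j in range(N)] for i in range(N)])
Winv = Ninv * np.array([[pow(winv, i * j, p) for j in range(N)] for i in range(N)]) % p

def circ(row):
    """Circulant matrix over Z_p with the given first row."""
    return np.array([[row[(j - i) % N] for j in range(N)] for i in range(N)])

def eig(A):
    """Eigenvalue sequence Lambda(A) of a circulant A, read off W A W^{-1}."""
    return np.diag(W @ A @ Winv % p).copy()

A, B = circ([1, 2, 0, 3]), circ([4, 1, 1, 0])
lamA, lamB = eig(A), eig(B)

# Usual product: eigenvalues multiply elementwise.
print(np.array_equal(eig(A @ B % p), lamA * lamB % p))       # True

# Schur product: eigenvalues are N^{-1} times the circular convolution.
conv = np.array([sum(int(lamA[l]) * int(lamB[(k - l) % N]) for l in range(N))
                 for k in range(N)]) % p
print(np.array_equal(eig(A * B % p), Ninv * conv % p))       # True
```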

Correction
This article was originally published with errors, which have now been corrected in the online version. Please see the Correction (http://dx.doi.org/10.1080/25742558.2018.1520444).