Symmetric Nonnegative Matrix Trifactorization

The Symmetric Nonnegative Matrix Trifactorization (SN-Trifactorization) is a factorization of an $n \times n$ nonnegative symmetric matrix $A$ of the form $BCB^T$, where $C$ is a $k \times k$ symmetric matrix and both $B$ and $C$ are required to be nonnegative. This work introduces the SNT-rank of $A$ as the minimal $k$ for which such a factorization exists. After listing basic properties and exploring the SNT-rank of low-rank matrices, we study the class of nonnegative symmetric matrices whose SNT-rank equals their rank. The paper concludes with a completion problem that asks for matrices with the smallest possible SNT-rank among all nonnegative symmetric matrices with given diagonal blocks.


Introduction and Notation
Factorizations of matrices in which the factors are required to be entrywise nonnegative have received considerable attention in recent years, since they provide a powerful tool for analysing nonnegative data. In parallel with applications, the theoretical study of these factorizations is a vibrant topic of research that supports the development of applications. In this work, we consider a factorization of nonnegative symmetric matrices that takes into account the symmetry, nonnegativity and low rank of a matrix.
Throughout the paper we rely on predominantly standard notation, listed below. By $\mathbb{R}_+$ we denote the set of nonnegative real numbers, by $\mathbb{R}^{n\times m}$ the set of $n \times m$ real matrices, and by $\mathbb{R}^{n\times m}_+$ the set of $n \times m$ entrywise nonnegative matrices. Our investigation is focused on symmetric nonnegative matrices. For simplicity, we will generally assume that our matrices are irreducible. To this end we define $\mathcal{S}^n_+ = \{A \in \mathbb{R}^{n\times n}_+ : A = A^T\}$ and its irreducible part $\widehat{\mathcal{S}}^n_+ = \{A \in \mathcal{S}^n_+ : A \text{ is irreducible}\}$.
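Irreducibility can be tested directly: a nonnegative $n \times n$ matrix $A$ is irreducible exactly when $(I + A)^{n-1}$ is entrywise positive. The helper below is an illustrative sketch (the function name and test matrices are ours, not from the paper):

```python
import numpy as np

def is_irreducible(A):
    """Irreducibility test for a nonnegative square matrix A, via the
    Perron-Frobenius criterion: A is irreducible iff (I + A)^(n-1) is
    entrywise positive.  (Illustrative helper, not from the paper.)"""
    n = A.shape[0]
    M = np.linalg.matrix_power(np.eye(n) + (A > 0), n - 1)
    return bool(np.all(M > 0))

cycle = np.array([[0., 1., 0.],   # adjacency matrix of a 3-cycle:
                  [0., 0., 1.],   # irreducible
                  [1., 0., 0.]])
blocks = np.array([[1., 1., 0.],  # block-diagonal matrix:
                   [1., 1., 0.],  # reducible
                   [0., 0., 1.]])
```

For a symmetric matrix this criterion amounts to connectivity of the underlying graph.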

Nonnegative Matrix Factorization
Given a nonnegative $n\times m$ matrix $A$ and a positive integer $k$, the Nonnegative Matrix Factorization (NM-Factorization) consists of finding matrices $U \in \mathbb{R}^{n\times k}_+$ and $V \in \mathbb{R}^{m\times k}_+$ so that $UV^T$ approximates $A$. The most frequently used measure of approximation is the Frobenius norm, hence the goal is to find $U$ and $V$ that minimise $\|A - UV^T\|_F$. The exact version of NM-Factorization looks for the minimal $k_0$ for which there exist matrices $U \in \mathbb{R}^{n\times k_0}_+$ and $V \in \mathbb{R}^{m\times k_0}_+$ with $A = UV^T$. We will denote such $k_0$ by $\operatorname{rk}_+(A)$, while $\operatorname{rk}(A)$ will denote the rank of $A$. Clearly, $\operatorname{rk}(A) \le \operatorname{rk}_+(A) \le \min\{m, n\}$.
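As a concrete illustration of approximate NM-Factorization, the sketch below implements the classical Lee-Seung multiplicative updates for the Frobenius objective in plain NumPy (the function `nmf` and the example matrix are our illustrative choices, not a tuned implementation):

```python
import numpy as np

def nmf(A, k, iters=3000, seed=0):
    """Approximate A by U @ V.T with U, V entrywise nonnegative, using
    Lee-Seung multiplicative updates for the Frobenius objective.
    (Illustrative sketch only.)"""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    U = rng.random((n, k)) + 0.1   # positive random initialisation
    V = rng.random((m, k)) + 0.1
    eps = 1e-12                    # guards against division by zero
    for _ in range(iters):
        U *= (A @ V) / (U @ (V.T @ V) + eps)
        V *= (A.T @ U) / (V @ (U.T @ U) + eps)
    return U, V

# A nonnegative matrix of (nonnegative) rank 2 is recovered with small
# residual at k = 2.
A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [3., 7., 10.]])
U, V = nmf(A, 2)
err = np.linalg.norm(A - U @ V.T)
```

The updates multiply nonnegative iterates by nonnegative ratios, so the factors stay entrywise nonnegative throughout.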
The approximate NM-Factorization, introduced by Paatero and Tapper [24], has seen tremendous growth after the seminal paper of Lee and Seung [21]. We refer the reader to the recent publications [13,15] for background on the problem, and offer a sample of works that consider the exact version of the problem [8,22,29].

Symmetric NM-Factorization and Completely Positive Factorization
When dealing with nonnegative symmetric matrices, it makes sense to look for factorizations that exhibit not only nonnegativity but also symmetry. The most influential factorization that fits this requirement is defined for completely positive matrices.
The Symmetric NM-Factorization (SN-Factorization) is a variation of NM-Factorization where $U = V$. Hence, we are looking for approximations of a given symmetric nonnegative matrix by a matrix of the form $UU^T$. If a matrix can be written exactly as $UU^T$ for some nonnegative matrix $U$, then it is said to be completely positive. We call such a factorization a completely positive factorization and use the abbreviation CP-Factorization. While completely positive matrices are necessarily positive semidefinite, not every nonnegative positive semidefinite matrix is completely positive [5, Example 2.4].
For a completely positive matrix $A$, we define $\operatorname{cp}(A)$ to be the minimal $k$ such that there exists $U \in \mathbb{R}^{n\times k}_+$ with $A = UU^T$. If a matrix $A$ is not completely positive, we define $\operatorname{cp}(A)$ to be infinite.
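A quick numerical sanity check of the definitions (the matrix $U$ below is an arbitrary illustrative choice): any $A = UU^T$ with $U \ge 0$ is symmetric, entrywise nonnegative and positive semidefinite, and $\operatorname{cp}(A)$ is at most the number of columns of $U$:

```python
import numpy as np

# An arbitrary nonnegative U with k = 3 columns; A = U U^T is then
# completely positive by definition, and cp(A) <= 3.
U = np.array([[1., 0., 2.],
              [0., 3., 1.],
              [1., 1., 0.]])
A = U @ U.T

assert np.allclose(A, A.T)                       # symmetric
assert np.all(A >= 0)                            # entrywise nonnegative
assert np.all(np.linalg.eigvalsh(A) >= -1e-10)   # positive semidefinite
```

The converse fails in general: nonnegativity and positive semidefiniteness together do not imply complete positivity, as noted above.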
For the background on the completely positive factorization we refer the reader to the following works [4,5,27].

Symmetric Nonnegative Matrix Trifactorization
In this paper we study a factorization that exhibits nonnegativity, symmetry and low rank of a matrix.
Symmetric Nonnegative Matrix Trifactorization is an approximate factorization of a given symmetric matrix $A$ of the form $BCB^T$, where $B$ and $C$ are nonnegative and $C$ is symmetric. As above, the Frobenius norm is typically used to measure the approximation. In this work we consider the exact version of the problem, which we refer to by the acronym SN-Trifactorization.
In contrast to the extensive literature on NM-Factorization and CP-Factorization, SN-Trifactorization has so far received considerably less attention. We refer the reader to the works [1,14,15] on Symmetric Nonnegative Matrix Trifactorization. This factorization is also known as Semi (or weighted) Symmetric Nonnegative Factorization [10,18,31]. Applications of SN-Trifactorization established to date include Hidden Markov Model identification and community detection [13].
The minimal possible $k$ in such a factorization is called the SNT-rank of $A$ and is denoted by $\operatorname{st}_+(A)$.

Overview
The paper defines (exact) SN-Trifactorization and SNT-rank, and dedicates Section 2 to basic properties of this newly defined parameter. These include a comparison with related parameters and an investigation of which properties of the classical rank transfer to SNT-rank. Matrices of rank 2 and of rank 3 are examined. The class of nonnegative symmetric matrices whose rank equals their SNT-rank is studied in Section 3. Section 4 is dedicated to a completion problem. The work concludes with a handful of questions for further research.

Basic Observations
The results in this section establish basic properties of SNT-rank. In the introduction we met three different ranks of a matrix that are defined through factorizations involving nonnegative factors: $\operatorname{rk}_+(A)$, $\operatorname{st}_+(A)$ and $\operatorname{cp}(A)$. First, we take a look at how they compare.
Proposition 2.1. Let $A$ be a nonnegative symmetric matrix. Then:
1. $\operatorname{rk}_+(A) \le \operatorname{st}_+(A) \le \operatorname{cp}(A)$;
2. $\operatorname{st}_+(A) \le n$.
Proof. Both items are quickly deduced from the arguments below. Any SN-Trifactorization $A = BCB^T$ can be read as the NM-Factorization $A = (BC)B^T$, and any CP-Factorization $A = UU^T$ is the SN-Trifactorization $A = UIU^T$. On the other hand, every $A \in \mathcal{S}^n_+$ can be written as $A = IAI^T$, so $\operatorname{st}_+(A) \le n$. Note that all the inequalities listed in Proposition 2.1 can be strict.
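The reductions between the three factorizations can be checked numerically. In the sketch below (with ad hoc matrices $B$ and $C$ of our choosing), an SN-Trifactorization $BCB^T$ is read as the NM-Factorization $(BC)B^T$, and the trivial factorization $IAI^T$ is verified:

```python
import numpy as np

# An ad hoc SN-Trifactorization A = B C B^T with B, C >= 0, C symmetric.
B = np.array([[1., 0.],
              [2., 1.],
              [0., 3.]])
C = np.array([[2., 1.],
              [1., 1.]])
A = B @ C @ B.T

# Reading BCB^T as (BC) B^T gives an NM-Factorization with the same k,
# hence rk_+(A) <= st_+(A); both factors BC and B are nonnegative.
U, V = B @ C, B
assert np.allclose(A, U @ V.T)

# A CP-Factorization U0 U0^T is the SN-Trifactorization U0 I U0^T,
# hence st_+(A) <= cp(A); and A = I A I^T always gives st_+(A) <= n.
I = np.eye(A.shape[0])
assert np.allclose(A, I @ A @ I.T)
```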

Factorization
Two simple lemmas below are referred to later in selected proofs. The first lists trivial ambiguities in the SN-Trifactorization.

The matrix
Both factorizations exhibit the same size of SN-Trifactorization as the original factorization.
Some properties of the classical rank extend to SNT-rank. Proposition 2.2. Let $A, A' \in \mathcal{S}^n_+$, $A'' \in \mathcal{S}^m_+$ and let $A_0$ be a principal submatrix of $A$. Then: Proof. The inequalities can be deduced from the corresponding SN-Trifactorizations as follows:

Let $A_0$ be a principal submatrix of $A$. By Lemma 2.1, we can without loss of generality assume $A = \begin{pmatrix} A_0 & * \\ * & * \end{pmatrix}$.

Suppose that
4. Similarly, $A' = B'C'B'^T$ and $A'' = B''C''B''^T$ give us an SN-Trifactorization of $A' \oplus A''$ of size $\operatorname{st}_+(A') + \operatorname{st}_+(A'')$. As we have equality in the last item, we still need to prove the converse inequality. Let $\operatorname{st}_+(A' \oplus A'') = s$ with a corresponding SN-Trifactorization $A' \oplus A'' = BCB^T$. Using Lemma 2.1, we can assume that $B$ and $C$ are of the form $B = \begin{pmatrix} B_{11} & 0 \\ B_{21} & B_{22} \end{pmatrix}$, $C = \begin{pmatrix} C_{11} & C_{12} \\ C_{12}^T & C_{22} \end{pmatrix}$, and $B_{11}$ has no zero columns. Since the factorization corresponds to $\operatorname{st}_+(A' \oplus A'')$, $B_{22}$ also has no zero columns. From $B_{11}C_{11}B_{21}^T + B_{11}C_{12}B_{22}^T = 0$ we get $B_{11}C_{12}B_{22}^T = 0$, and thus $C_{12} = 0$ by Lemma 2.2. Now $C_{11}$ has no zero rows or columns, so $B_{11}C_{11}$ has no zero columns. Let $A \in \mathcal{S}^n_+$ be of the form (1). Then:
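The easy direction of the direct-sum formula can be illustrated numerically: trifactorizations of the blocks combine into a trifactorization of $A' \oplus A''$ of the combined size (the helper `dsum` and the block matrices are illustrative choices of ours):

```python
import numpy as np

def dsum(X, Y):
    """Direct sum of two matrices (illustrative helper)."""
    Z = np.zeros((X.shape[0] + Y.shape[0], X.shape[1] + Y.shape[1]))
    Z[:X.shape[0], :X.shape[1]] = X
    Z[X.shape[0]:, X.shape[1]:] = Y
    return Z

# SN-Trifactorizations of the two blocks (ad hoc examples).
B1, C1 = np.array([[1.], [2.]]), np.array([[1.]])
B2, C2 = np.array([[1., 0.], [1., 1.]]), np.array([[1., 0.], [0., 2.]])
A1, A2 = B1 @ C1 @ B1.T, B2 @ C2 @ B2.T

# Combining them factors the direct sum with k = k1 + k2 nonnegative
# factors, which shows the easy inequality st_+(A1 ⊕ A2) <= st_+(A1) +
# st_+(A2); the reverse inequality is the substance of the proof above.
B, C = dsum(B1, B2), dsum(C1, C2)
assert np.allclose(dsum(A1, A2), B @ C @ B.T)
```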

If
for $X \in \mathbb{R}^{n_1\times m_1}$, proving $\operatorname{st}_+(A) \le 2k$. This proves the first two items in the proposition. The inequality $\operatorname{st}_+(A) \le 2\operatorname{rk}_+(X)$ in item 3 is now also established. Let $A$ be as in (1) and $\operatorname{st}_+(A) = s$, with a corresponding SN-Trifactorization $A = BCB^T$. Using Lemma 2.1, we can assume that $B_{11}$ has no zero columns. As we illustrate in the example below, it can happen that the bound $\operatorname{st}_+(A) \le 2\operatorname{rk}_+(X)$ is attained. Separable NMF is a variation of NMF where the columns of the first factor in $A = UV^T$ are chosen from among the columns of the matrix $A$ [11]. It turns out that with the separability condition added, SNT-rank and NMF-rank agree.
Let $A \in \mathcal{S}^n_+$ be a nonnegative symmetric matrix and $P \in \mathbb{R}^{n\times n}$ a permutation matrix. If the condition that $Q$ is nonnegative is removed, then a factorization of the form (2) exists for every symmetric matrix $A$ with $k = \operatorname{rk}(A)$. In particular, any rank 1 matrix $A \in \mathcal{S}^n_+$ can be written as $A = vv^T$ for some $v \in \mathbb{R}^n_+$, and thus $\operatorname{st}_+(A) = 1$. The following corollary proves that a similar conclusion holds for matrices of rank 2. A similar result holds for the NMF-rank; see [8, Theorem 4.1].
Corollary 2.1. Let $A \in \mathcal{S}^n_+$ be a nonnegative symmetric matrix of rank 2. Then $\operatorname{st}_+(A) = 2$.
Proof. Every rank 2 matrix is separable by the proof of Theorem 2.6 in [15]. Using Lemma 2.3 we get $\operatorname{st}_+(A) = 2$.
Corollary 2.1 cannot be generalised to matrices with $\operatorname{rk}(A) = 3$, or even to matrices with $\operatorname{rk}_+(A) = 3$. This is shown in our next example, which also illustrates that $\operatorname{st}_+(A) > \operatorname{rk}_+(A)$ can happen, showing that $\operatorname{st}_+(A)$ is indeed a new parameter.
and $\operatorname{rk}(C) = 3$. With the aim of arriving at a contradiction, we first consider the pattern restrictions on $B$ and $C$ coming from the two zero entries in $A$. Let us denote the rows of $B$ by $b_i^T$. If $b_2$ and $b_3$ each have two positive entries, then $C$ needs to have a $2\times2$ principal submatrix equal to zero, contradicting $\operatorname{rk}(C) = 3$. Now that we know that $b_2$ and $b_3$ each have only one positive entry, we further note that those entries have to appear in different positions, for otherwise we would have $a_{23} = 0$.
Replacing $B$ with $BP$ and $C$ with $P^TCP$, where $P$ is a permutation matrix, we may assume that $b_2^T = \alpha\begin{pmatrix}1 & 0 & 0\end{pmatrix}$, $b_3^T = \beta\begin{pmatrix}0 & 1 & 0\end{pmatrix}$, and $c_{11} = c_{22} = 0$. Let $D$ be the diagonal matrix $D = \operatorname{diag}(\alpha, \beta, 1)$. Replacing $B$ by $BD^{-1}$ and $C$ by $DCD$, we may further assume that $\alpha = \beta = 1$. From $a_{23} = 1$, we now get $c_{12} = 1$.
Since $B$ shares a column space with $B_1$ defined in (3), we have $B = B_1X$ for some $X \in \mathbb{R}^{3\times3}$. From the information that we already have on $B$, we deduce the form of $X$, which in turn determines $C$. Again replacing $B$ with $BD^{-1}$ and $C$ with $DCD$, this time for the matrix $D = \operatorname{diag}(1, 1, x_{23})$, we arrive at the desired contradiction. Note that in the example above we were not able to exclude $\operatorname{st}_+(A) = 3$ based on the pattern of $A$ alone. This example also allows us to show that the property $\operatorname{rk}(A^n) = \operatorname{rk}(A)$ of the rank of a symmetric matrix $A = A^T$ does not extend to $\operatorname{st}_+$. For matrices with $\operatorname{rk}(A) = 3$, $\operatorname{st}_+(A)$ cannot be bounded by a constant independent of the size of the matrix $A$. This fact can be deduced from the equivalent statement for $\operatorname{rk}_+(A)$, first observed in [3], where it was shown that for the Euclidean distance matrix $M_n \in \mathcal{S}^n_+$, defined by $(M_n)_{ij} = (i-j)^2$, we have $\operatorname{rk}(M_n) = 3$ but $\operatorname{rk}_+(M_n)$ cannot be bounded independently of $n$. The paper [16] gives some lower bounds for the NMF-rank of $M_n$ in Corollary 6, and the upper bound $\operatorname{rk}_+(M_n) \le \lceil n/2 \rceil + 2$ in Theorem 9. This upper bound is proved by constructing a corresponding NM-Factorization, which we modify into an SN-Trifactorization below. The NMF-rank of Euclidean distance matrices has also been considered in [20,23,30].
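The rank-3 claim for $M_n$ is easy to verify numerically, since $(i-j)^2 = i^2 - 2ij + j^2$ is a sum of three rank-1 terms (the helper below is ours):

```python
import numpy as np

def M(n):
    """Euclidean distance matrix (M_n)_{ij} = (i - j)^2."""
    i = np.arange(1, n + 1, dtype=float)
    return (i[:, None] - i[None, :]) ** 2

# (i - j)^2 = i^2 - 2ij + j^2 is a sum of three rank-1 matrices, so
# rk(M_n) = 3 for all n >= 3, even though rk_+(M_n) grows with n.
ranks = [int(np.linalg.matrix_rank(M(n))) for n in range(3, 10)]
```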
Let $K_n \in \mathbb{R}^{n\times n}$ be the matrix with ones on the anti-diagonal and zeros elsewhere.

Matrices whose SNT-rank equals rank
As we have seen, $\operatorname{st}_+(A)$ can be significantly bigger than $\operatorname{rk}(A)$. In this section we take a closer look at the family $\mathcal{E}_n$ of nonnegative symmetric matrices that satisfy $\operatorname{rk}(A) = \operatorname{st}_+(A)$. An invertible matrix $T$ is called a Perron similarity if one of its columns and the corresponding row of its inverse are both nonnegative or both nonpositive. Perron similarities play a role in the theory of nonnegative matrices, as any matrix that brings an irreducible nonnegative matrix to its Jordan canonical form under similarity is a Perron similarity. Below we meet Perron similarities in connection with the congruence transformation that connects a diagonal matrix with the nonnegative matrix $C$ in an SN-Trifactorization $BCB^T$.
Lemma 3.1. Let $A = U(\lambda_1 \oplus D_1)U^T = BCB^T$, where $D_1$ is a diagonal matrix with nonzero entries on the diagonal, $U = \begin{pmatrix} u & U_1 \end{pmatrix} \in \mathbb{R}^{n\times r}$, $U^TU = I_r$, $B \in \mathbb{R}^{n\times r}_+$ and $C \in \mathbb{R}^{r\times r}_+$.
Then there exists an invertible matrix $T$ with the first column of $T$ and the first row of $T^{-1}$ both nonnegative, so that $B = UT^{-1}$ and $C = T(\lambda_1 \oplus D_1)T^T$. If $A$ is irreducible, then the first column of $T$ and the first row of $T^{-1}$ are both positive.
Proof. Let $A = BCB^T$, where $B \in \mathbb{R}^{n\times r}_+$ and $C \in \mathbb{R}^{r\times r}_+$. Since both $B$ and $U$ have rank $r$, and the span of the columns of $B$ is equal to the span of the columns of $U$, there exists an invertible matrix $T$ satisfying $B = UT^{-1}$. Now, $u^TB = u^TUT^{-1} = e_1^TT^{-1}$, and since $u$ and $B$ are nonnegative, we see that the first row of $T^{-1}$ is necessarily nonnegative. An analogous argument applies to the first column of $T$. In the irreducible case, we know that $u$ is positive, and $B$ and $C$ have no columns equal to zero. The conclusion follows.
If $u$ is the Perron eigenvector of a symmetric nonnegative matrix $A$ (normalised so that $u^Tu = 1$), then it is straightforward to see that $\operatorname{rk}(A + \alpha uu^T) = \operatorname{rk}(A)$, and if we require $\alpha \ge 0$, then clearly $A + \alpha uu^T$ remains nonnegative. The following theorem shows that if $\alpha$ is chosen sufficiently large, then $\operatorname{st}_+(A + \alpha uu^T)$ drops down to $\operatorname{rk}(A)$. This type of perturbation was considered in connection with the completely positive rank in [6].
Theorem 3.1. Let $A \in \mathcal{S}^n_+$ be an irreducible symmetric nonnegative matrix with Perron eigenvector $u$, $u^Tu = 1$. Then $\operatorname{st}_+(A + \alpha uu^T) \le \operatorname{st}_+(A)$ for all $\alpha \ge 0$, and there exists $\alpha_0$ such that $A + \alpha uu^T \in \mathcal{E}_n$ for all $\alpha \ge \alpha_0$.
Proof. Let $A = BCB^T$, and $Au = \lambda_1u$ with $u^Tu = 1$. Since $u = \frac{1}{\lambda_1}BCB^Tu$, we can write $u = Bz$ for the nonnegative vector $z = \frac{1}{\lambda_1}CB^Tu$, and a direct calculation gives $A + \alpha uu^T = B(C + \alpha zz^T)B^T$. This proves $\operatorname{st}_+(A + \alpha uu^T) \le \operatorname{st}_+(A)$.
To prove that there exists $\alpha$ with $\operatorname{st}_+(A + \alpha uu^T) = \operatorname{rk}(A) =: r$, we start with a spectral decomposition of $A + \alpha uu^T$, where $D_1$ is an $(r-1)\times(r-1)$ diagonal matrix containing the nonzero, non-Perron eigenvalues of $A$, and $U_1 \in \mathbb{R}^{n\times(r-1)}$ is a matrix whose columns are the corresponding normalised eigenvectors of $A$. Let $\beta > 0$, let $q_1 \in \mathbb{R}^r_+$ be a positive vector satisfying $q_1^Tq_1 = 1$, and let $Q = \begin{pmatrix} q_1 & Q_1 \end{pmatrix} \in \mathbb{R}^{r\times r}$ be an orthogonal matrix. Then $A + \alpha uu^T = B(\beta)C(\alpha, \beta)B(\beta)^T$ for all $\alpha, \beta > 0$. It remains to show that we can choose $\alpha > 0$ and $\beta > 0$ so that $B(\beta) > 0$ and $C(\alpha, \beta) > 0$. Note that $B(\beta) = \beta uq_1^T + U_1Q_1^T$, and since $uq_1^T > 0$ we can choose $\beta > 0$ so that $B(\beta) > 0$. On the other hand, since $q_1q_1^T > 0$, we can choose $\alpha$ so that $C(\alpha, \beta) > 0$ for any fixed $\beta > 0$. From Theorem 3.1 it follows that in order to understand $\mathcal{E}_n$, it is enough to study $\partial\mathcal{E}_n$, defined as the set of matrices $A \in \mathcal{E}_n$ with the property that $A - \alpha uu^T \notin \mathcal{E}_n$ for any $\alpha > 0$, where $u$ is the Perron eigenvector of $A$. In particular, all irreducible matrices in $\mathcal{E}_n$ that contain at least one zero entry are necessarily in $\partial\mathcal{E}_n$. Hence, given $A \in \mathcal{S}^n_+$, we would like to determine the minimal $\alpha$ with $\operatorname{rk}(A + \alpha uu^T) = \operatorname{st}_+(A + \alpha uu^T)$, or equivalently, we are looking for $\alpha$ with $A + \alpha uu^T \in \partial\mathcal{E}_n$.
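The rank-preserving part of this perturbation is easy to check numerically. In the sketch below (with an ad hoc irreducible rank-2 matrix of our choosing), $A + \alpha uu^T$ stays nonnegative and keeps the rank of $A$ for $\alpha \ge 0$:

```python
import numpy as np

# An ad hoc irreducible symmetric nonnegative matrix of rank 2
# (eigenvalues 3, 1, 0; Perron eigenvector proportional to (1, 2, 1)).
A = np.array([[1., 1., 0.],
              [1., 2., 1.],
              [0., 1., 1.]])
vals, vecs = np.linalg.eigh(A)   # eigenvalues in ascending order
u = vecs[:, -1]                  # eigenvector of the Perron eigenvalue
u = u if u[0] > 0 else -u        # fix the sign so that u > 0

for a in [0.0, 1.0, 10.0, 100.0]:
    P = A + a * np.outer(u, u)
    assert np.all(P >= -1e-12)                                   # nonnegative
    assert np.linalg.matrix_rank(P) == np.linalg.matrix_rank(A)  # rank kept
```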
From the proof of Theorem 3.1 we can produce upper bounds for $\alpha$ using different orthogonal matrices $Q$. In fact, in the proof, the orthogonal matrix $Q$ can be replaced by any $r \times r$ invertible matrix $S$ with the first column of $S$ and the first row of $S^{-1}$ both positive. Indeed, with $B(\beta, S)$ and $C(\alpha, \beta, S)$ defined accordingly, we have $A + \alpha vv^T = B(\beta, S)C(\alpha, \beta, S)B(\beta, S)^T$. As in the proof above, we can find $\beta$ that makes $B(\beta, S)$ nonnegative, and given $\beta$ and $S$ we can find $\alpha$ so that $C(\alpha, \beta, S)$ is nonnegative. Theorem 3.1 implies that optimisation over all such invertible matrices $S$ will produce the optimal $\alpha$. We explore this idea in Example 3.1.
Proof. By Lemma 3.1 we know that if the system of inequalities $BT > 0$ and $T^{-1}C(T^T)^{-1} > 0$ has a solution for some invertible matrix $T$, then $A \notin \partial\mathcal{E}_n$. In particular, if $B(I - \epsilon Y) > 0$ and $(I - \epsilon Y)^{-1}C(I - \epsilon Y^T)^{-1} > 0$ for some $Y \in \mathbb{R}^{r\times r}$ and $\epsilon > 0$, then $A \notin \partial\mathcal{E}_n$. Assume $A \in \partial\mathcal{E}_n$ and define $Z(B) := \{(i, j) : B_{ij} = 0\}$ and $Z(C) := \{(i, j) : C_{ij} = 0\}$. (Note that the assumption $A \in \partial\mathcal{E}_n$ implies that $Z(B)$ and $Z(C)$ are not empty.) From the formal expansion $(I - \epsilon Y)^{-1} = \sum_{i=0}^{\infty}(\epsilon Y)^i$, and looking at the linear terms in $\epsilon$, we deduce that the corresponding system of linear inequalities is not solvable for any $r \times r$ matrix $Y$. This system of linear inequalities is equivalent to one in $\operatorname{vec}(Y)$, where $P$ is the permutation matrix satisfying $P\operatorname{vec}(Y) = \operatorname{vec}(Y^T)$. By the transposition theorem of Gordan [25], this system is unsolvable if and only if the dual system is solvable. Let $x$ and $z$ be solutions to the dual system, and let $x_0 \in \mathbb{R}^{rn}$ be the vector obtained from $x$ by inserting $(x_0)_i = 0$ for the indices $i$ that correspond to $\operatorname{supp}(B)$, i.e. $x_0 = \operatorname{vec}(X)$ with $X \circ B = 0$. Similarly, let $z_0 \in \mathbb{R}^{r^2}$ be the vector obtained from $z$ by inserting $(z_0)_i = 0$ for the indices $i$ that correspond to $\operatorname{supp}(C)$. In other words, $z_0 = \operatorname{vec}(Z)$ with $Z \circ C = 0$.
The system above can be rewritten in terms of $X$ and $Z$; introducing $W := Z + Z^T$, we have shown that the assumption $A \in \partial\mathcal{E}_n$ implies a nonzero solution $(X, W)$ to the system (8). The conclusion of the theorem follows.
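The alternative in Gordan's transposition theorem, on which the proof relies, can be seen on a tiny hand-checkable instance (the matrices below are ours, chosen for illustration):

```python
import numpy as np

# Gordan's transposition theorem: for A in R^{m x n}, exactly one of
#   (I)  A x > 0 entrywise           (II)  A^T y = 0, y >= 0, y != 0
# is solvable.

# Here (I) is infeasible: x1 - x2 > 0 and x2 - x1 > 0 contradict each
# other, and y = (1, 1) is a certificate for (II).
A = np.array([[1., -1.],
              [-1., 1.]])
y = np.array([1., 1.])
assert np.allclose(A.T @ y, 0) and np.all(y >= 0) and y.any()

# Here (I) is solvable with x = (1,), so no certificate for (II) exists.
A2 = np.array([[1.], [2.]])
x = np.array([1.])
assert np.all(A2 @ x > 0)
```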
To illustrate how Theorem 3.2 is applied, we return to Example 3.1.
From Example 3.1 we already know that $A + 12vv^T \notin \partial\mathcal{E}_n$. This is supported by the fact that the only solution to the system (8) for $B(2, S_1)$ and $C(12, 2, S_1)$ is $X = 0$ and $W = 0$. On the other hand, applying Theorem 3.2 to $B = B_3$ and $C = C_3$, we obtain nonzero solutions to the system (8). While this does not prove $A + \alpha_3vv^T \in \partial\mathcal{E}_n$, it does show that the factorization $A = B_3C_3B_3^T$ cannot locally be moved to have positive factors.

A Completion Problem
For given nonnegative symmetric matrices $A_1$ and $A_2$, we consider the question of minimising the SNT-rank of the block matrix with diagonal blocks $A_1$ and $A_2$ over all nonnegative matrices $X$ of appropriate order in the off-diagonal blocks. We will consider two variants of this problem, one allowing any nonnegative $X$, the other requiring $X$ to be positive. Problems of this type occur in situations where only partial information on the data is known, and we want the unknown data to produce a matrix of low SNT-rank. Here, our main motivation for considering this problem is to advance our understanding of matrices with low SNT-rank.
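A minimal numerical illustration of the completion problem (with $1 \times 1$ blocks of our choosing): the off-diagonal block $X$ decides whether the completed matrix has SNT-rank 1 or 2:

```python
import numpy as np

def completed(A1, A2, X):
    """Assemble the symmetric block matrix with diagonal blocks A1, A2
    and off-diagonal block X (illustrative helper)."""
    return np.block([[A1, X], [X.T, A2]])

A1 = np.array([[1.]])
A2 = np.array([[1.]])
ones = completed(A1, A2, np.array([[1.]]))   # [[1,1],[1,1]]: rank 1,
                                             # hence SNT-rank 1 (= vv^T)
split = completed(A1, A2, np.array([[0.]]))  # [[1,0],[0,1]]: rank 2
```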
For $i = 1, 2$, let $A_i$ be an $n_i \times n_i$ nonnegative symmetric matrix, and let $A$ be as above. We define $\operatorname{st}_+(A_1, A_2)$ to be the minimal $\operatorname{st}_+(A)$ over all nonnegative choices of $X$, and $\operatorname{st}_+^>(A_1, A_2)$ to be the minimal $\operatorname{st}_+(A)$ over all positive choices of $X$. The two ranks can coincide for some given $A_1$ and $A_2$. The example below illustrates that $\operatorname{st}_+(A_1, A_2) < \operatorname{st}_+^>(A_1, A_2)$ can also occur. Example 4.1. Let $A_1 = 0$ and let $A_2$ be a rank 1 symmetric nonnegative matrix. Clearly, $\operatorname{st}_+(A_1, A_2) = 1$ and $\operatorname{st}_+^>(A_1, A_2) = 2$. Below we list some straightforward inequalities; the last one holds since we can always choose $X = 0$. The corresponding question on low rank completion without nonnegativity constraints is resolved, and can be deduced from the main result in [7]. The solution depends on the inertia of the matrices given on the block diagonal.
Definition 4.1. Let $A \in \mathbb{R}^{n\times n}$ be a symmetric matrix. The inertia of $A$ is the triple $\operatorname{In}(A) = (\pi_+, \pi_-, \pi_0)$, where $\pi_+, \pi_-, \pi_0$ are, respectively, the numbers of positive, negative and zero eigenvalues of $A$.
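Computing the inertia numerically is straightforward for symmetric matrices (the helper and tolerance are our illustrative choices):

```python
import numpy as np

def inertia(A, tol=1e-10):
    """Inertia (pi_+, pi_-, pi_0) of a symmetric matrix A, computed
    from its eigenvalues (tolerance is an illustrative choice)."""
    w = np.linalg.eigvalsh(A)
    return (int(np.sum(w > tol)),
            int(np.sum(w < -tol)),
            int(np.sum(np.abs(w) <= tol)))
```

For example, `inertia(np.diag([2., -1., 0.]))` returns `(1, 1, 1)`.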
Then $\operatorname{rk}(A) = \operatorname{rk}(A_1) + \operatorname{rk}(A_0)$. Corollary 4.1. Let $A_1 \in \mathcal{S}^n_+$, where $a$ can be any positive number. Indeed, observe that $\operatorname{st}_+^>(A_1, A_2) = 4$, since $\operatorname{rk}_+(A) = 4$ for any choice of positive matrix $X$. Namely, suppose that $A = UV^T$ is an NM-Factorization of $A$ with $U, V \in \mathbb{R}^{4\times3}_+$. Since each row of $A$ has a zero entry, each row of $U$ has to have one as well. Two rows of $U$ cannot have the same pattern of zeros, so at least one row of $U$ has two zeros. Without loss of generality we may assume that one of the rows of $U$, say the $k$-th, equals $\begin{pmatrix}1 & 0 & 0\end{pmatrix}$. It follows that the first row of $V$ equals the $k$-th row of $A$, so it contains three nonzero entries. It further follows that the first entry of each row of $A$, except the $k$-th, equals zero. So the rank of the matrix $A$ with the $k$-th row omitted equals 2, a contradiction.
The following lemma gives some insight into the case when A is completed with a matrix X of rank 1.

The upper bound is shown by constructing an SN-Trifactorization of $A$ from SN-Trifactorizations of $\hat{A}_1$ and $A_2$, as follows. We have $\operatorname{st}_+(A_2) \le \operatorname{st}_+(A)$ by Proposition 2.2. Finally, let $A = BCB^T$ be an SN-Trifactorization of $A$ that achieves $\operatorname{st}_+(A)$, where the partition of $B$ respects the partition of $A$ in (11). Then the induced factorization is an SN-Trifactorization of $\hat{A}_1$, proving $\operatorname{st}_+(\hat{A}_1) \le \operatorname{st}_+(A)$.
3. If $\operatorname{rk}(A_2) = 1$, then $A_2 = \alpha uu^T$, hence $\operatorname{st}_+(A) \le \operatorname{st}_+(\hat{A}_1)$. The reverse inequality follows from 2. Proof. Choose $b \in \mathbb{R}^n_+$ so that $A_1b$ is positive and $b^TA_1b$ equals the Perron eigenvalue of $A_2$. One more application of Proposition 4.2, this time joining $P\hat{A}_1P^T$ with $A_2$, where $P$ is the permutation matrix switching the first and the third rows, gives us $A$. By [12] this proves that $A$ belongs to the family of nonnegative matrices generated by a Soules matrix. If a matrix generated by a Soules matrix happens to be positive semidefinite, then it is completely positive and its cp-rank equals its rank [26]. If a matrix generated by a Soules matrix is not positive semidefinite, then it is clearly not completely positive. The matrix $A$ in this example satisfies $\operatorname{rk}(A) = 3 < \operatorname{st}_+(A) = 4$, hence a symmetric matrix generated by a Soules matrix can have SNT-rank bigger than its rank. Let $v \in \mathbb{R}^k_+$, $v^Tv = 1$, and $A_2 := vv^T$. Inserting $\hat{A}_1$ and $A_2$ into Lemma 4.2 we can construct a matrix:

Conclusion and open questions
In this work we introduced the problem of SN-Trifactorization and the SNT-rank, and developed some foundational results. Since SN-Trifactorization is connected to both NM-Factorization and CP-Factorization, research directions for further work can readily be found in the extensive literature on those factorizations. Here, we suggest a handful of questions motivated by the results in this work.
1. In [16], the restricted nonnegative rank of a matrix $A$, denoted by $\operatorname{rk}^*_+(A)$, is defined to be the minimum value of $k$ such that there exist $U \in \mathbb{R}^{m\times k}_+$ and $V \in \mathbb{R}^{n\times k}_+$ satisfying $A = UV^T$ and $\operatorname{rk}(A) = \operatorname{rk}(U)$. Further, it is shown that $\operatorname{rk}_+(A)$ can be smaller than $\operatorname{rk}^*_+(A)$. Similarly, one can define $\operatorname{st}^*_+(A)$ to be the minimal $k$ such that $A = BCB^T$ and $\operatorname{rk}(A) = \operatorname{rk}(B)$ for $B \in \mathbb{R}^{m\times k}_+$, $C = C^T \in \mathbb{R}^{k\times k}_+$. It would be interesting to explore to what extent the geometric interpretation of $\operatorname{rk}^*_+$ can be adapted to $\operatorname{st}^*_+$, and to find examples of matrices $A$ with $\operatorname{st}_+(A) < \operatorname{st}^*_+(A)$.
2. Shitov [28] found the bound $\operatorname{rk}_+(A) \le \lceil \frac{6\min\{m,n\}}{7} \rceil$ for $A \in \mathbb{R}^{m\times n}_+$ with $\operatorname{rk}(A) = 3$. On the other hand, Hannah and Laffey [17] and Barioli and Berman [2] bounded $\operatorname{cp}(A)$ in terms of $\operatorname{rk}(A)$ for a completely positive matrix $A$. In particular, they showed $\operatorname{cp}(A) \le \frac{\operatorname{rk}(A)(\operatorname{rk}(A)+1)}{2} - 1$. Proposition 2.1 implies that $\operatorname{st}_+(A)$ has the same upper bound if $A$ is completely positive. From our discussion above it is clear that bounding $\operatorname{st}_+(A)$ solely in terms of $\operatorname{rk}(A)$ for general symmetric nonnegative matrices is not possible. However, it would be interesting to explore whether bounds similar to the one derived in [28] can be found for $\operatorname{st}_+(A)$.