Polynomial identities for ternary intermolecular recombination

The operation of binary intermolecular recombination, originating in the theory of DNA computing, permits a natural generalization to n-ary operations which perform simultaneous recombination of n molecules. In the case n = 3, we use computer algebra to determine the polynomial identities of degree ≤ 9 satisfied by this trilinear nonassociative operation. Our approach requires computing a basis for the nullspace of a large integer matrix, and for this we compare two methods: (i) the row canonical form, and (ii) the Hermite normal form with lattice basis reduction. In the conclusion, we formulate some conjectures for the general case of n-ary intermolecular recombination.

In Definition 1, each $a_i$ can be regarded as the abstract representation of a molecule divided into n consecutive submolecules $a_{i1}, a_{i2}, \dots, a_{in}$. For example, if X = {A, C, G, T} represents the four bases found in DNA, then each $a_i$ represents a DNA sequence partitioned into n subsequences. With this interpretation, the sum defining $\{a_1, a_2, \dots, a_n\}$ expresses the results of recombining the n molecules in all possible ways which preserve the positions of the n submolecules in each molecule. Formula (1) is very similar to the definition of the permanent of an n × n matrix; see Wanless [7].
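The description above can be turned into a small computation. The following is a minimal sketch, assuming that the sum in formula (1) runs over all permutations σ of the n molecules and takes the k-th submolecule from molecule σ(k); the function name `recombine` and the string labels are hypothetical and serve only as illustration.

```python
from collections import Counter
from itertools import permutations

def recombine(*molecules):
    """n-ary recombination of n molecules, each a tuple of n submolecules.

    Returns a Counter mapping each recombined molecule to its multiplicity
    (assumed reading of formula (1): one term per permutation).
    """
    n = len(molecules)
    assert all(len(m) == n for m in molecules)
    result = Counter()
    for sigma in permutations(range(n)):
        # Position k stays in place but is taken from molecule sigma[k].
        result[tuple(molecules[sigma[k]][k] for k in range(n))] += 1
    return result

a = ("a1", "a2", "a3")
b = ("b1", "b2", "b3")
c = ("c1", "c2", "c3")

binary = recombine(("a1", "a2"), ("b1", "b2"))
ternary = recombine(a, b, c)
print(sorted(binary))
print(len(ternary))
```

Under this reading the result does not depend on the order of the arguments, since reordering the molecules only reorders the permutations in the sum.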
The next result is clear.
In the case n = 2, the operation of binary intermolecular recombination was introduced by Landweber and Kari [4].
Bremner [1] used computer algebra to establish the following proposition.
Proposition 3. Every polynomial identity of degree ≤ 5 satisfied by binary intermolecular recombination is a consequence of commutativity (binary complete symmetry) and the binary recombination identity (3).
Sverchkov [5] has recently proved the following theorem.
Theorem 4. Every identity, with no restriction on the degree, satisfied by binary intermolecular recombination is a consequence of commutativity and the binary recombination identity.
In related work, Bremner, Piao and Richards [3] have shown that every polynomial identity satisfied by the zygotic algebras of simple Mendelian inheritance is a consequence of commutativity and the binary recombination identity.
In the present paper we consider the case n = 3; we use computer algebra to determine the polynomial identities of degree ≤ 9 satisfied by ternary intermolecular recombination. A monomial which involves d applications of an n-ary operation has degree d(n−1) + 1; thus polynomial identities for ternary intermolecular recombination exist only in odd degrees.
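The degree count can be made explicit. Each application of an n-ary operation replaces n arguments by a single element and so adds n − 1 variables to a monomial that starts with one variable; the symbol deg below is introduced here only for illustration.

```latex
\[
\deg = d(n-1) + 1,
\qquad
n = 3:\quad \deg = 2d + 1 \in \{3, 5, 7, 9, \dots\}
\quad (d = 1, 2, 3, 4, \dots).
\]
```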

Binary intermolecular recombination
In this section we recall the results of Bremner [1] on polynomial identities for binary intermolecular recombination, and show how these results may also be obtained (and slightly improved) using the Hermite normal form with lattice basis reduction.
In degree 4, there are 15 commutative nonassociative multilinear monomials: 12 with association type {{{a, b}, c}, d} and 3 with association type {{a, b}, {c, d}}. We order these monomials first by association type, and then lexicographically by the permutation of the variables.
We expand each monomial by three applications of binary intermolecular recombination. Each expansion is a linear combination (allowing zero coefficients) of the 12 pairs $(x_1, y_2)$ with x ≠ y in {a, b, c, d}, which we order lexicographically. We store the expansions in the 12 × 15 matrix E in which entry (i, j) contains the coefficient of pair i in the expansion of monomial j; see Table 1. The coefficient vectors of the polynomial identities satisfied by binary intermolecular recombination in degree 4 are the nontrivial linear combinations of the columns of E that give the zero vector; that is, the nonzero vectors in the nullspace of E.
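The construction of the expansion matrix can be sketched as follows. This is an illustration under stated assumptions, not the computation used for Table 1: we assume the binary operation is {a, b} = (a_1, b_2) + (b_1, a_2) extended bilinearly, and the ordering of monomials chosen here may differ from that of Table 1. All function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, permutations

VARS = "abcd"

def rec(u, v):
    """Bilinear binary recombination on formal sums of pairs (x, y) = (x_1, y_2)."""
    out = Counter()
    for (p, q), cu in u.items():
        for (r, s), cv in v.items():
            out[(p, s)] += cu * cv
            out[(r, q)] += cu * cv
    return out

def gen(x):
    return Counter({(x, x): 1})

def monomials():
    """15 multilinear monomials: type {{{x,y},z},w} first, then {{x,y},{z,w}}."""
    type1 = []
    for x, y in combinations(VARS, 2):
        z, w = [v for v in VARS if v not in (x, y)]
        type1 += [("t1", x, y, z, w), ("t1", x, y, w, z)]
    type2 = [("t2", "a", y, *[v for v in "bcd" if v != y]) for y in "bcd"]
    return type1 + type2

def expand(m):
    kind, x, y, z, w = m
    if kind == "t1":
        return rec(rec(rec(gen(x), gen(y)), gen(z)), gen(w))
    return rec(rec(gen(x), gen(y)), rec(gen(z), gen(w)))

def rank(rows):
    """Rank over Q by Gaussian elimination with exact fractions."""
    M = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            if f:
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

mons = monomials()
pairs = list(permutations(VARS, 2))          # 12 ordered pairs (x_1, y_2)
expansions = [expand(m) for m in mons]
E = [[e[pair] for e in expansions] for pair in pairs]
print(len(E), len(E[0]), rank(E))
```

The nullity of this matrix is the number of columns minus the rank, which is the dimension of the space of identities in degree 4.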
There are two ways to compute a basis for the nullspace of a matrix E with entries in the ring Z of integers: (a) compute the row canonical form (RCF) of E over Q and extract the canonical basis of the nullspace, clearing denominators to obtain integral vectors; (b) compute the Hermite normal form (HNF) of the transpose of E over Z, together with the transform matrix.
For each basis vector, we apply all permutations of a, b, c, d to the corresponding identity I and store the results in a 24 × 15 matrix M(I); the rank of M(I) is the dimension of the $S_4$-module of identities which are consequences of I. Rows 2 and 3 are the shortest vectors (squared norm 8) for which the corresponding identities generate the entire nullspace, and row 3 corresponds to the binary recombination identity.
Method (a) provides a basis of integral vectors for the rational nullspace of the matrix E. The disadvantage of this method is that this basis of integral vectors may not be an integer basis for the nullspace lattice of E; see Example 17 in Bremner and Peresi [2].
We want to find a lattice basis of L(E); that is, a set of vectors in L(E) which are linearly independent over Q and such that every vector in L(E) is an integer linear combination of these basis vectors. To find a lattice basis, we need to recall the definition of the Hermite normal form (HNF) of an integer matrix.
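The difference between an integral basis of the rational nullspace and a lattice basis can be seen on a very small example. The matrix below is a hypothetical illustration chosen for this sketch; it is not Example 17 of [2].

```python
from fractions import Fraction

E = [2, 1, 1]                      # a 1 x 3 integer matrix (hypothetical example)

# Canonical basis read off from the row canonical form [1, 1/2, 1/2]:
# set one free variable to 1, the other to 0, and solve for the leading variable.
canonical = [[Fraction(-1, 2), 1, 0], [Fraction(-1, 2), 0, 1]]
integral = [[int(2 * x) for x in v] for v in canonical]   # clear denominators

v = [0, 1, -1]                     # an integer vector in the nullspace of E
assert sum(e * x for e, x in zip(E, v)) == 0

# Solve v = s * integral[0] + t * integral[1] using the last two coordinates.
s = Fraction(v[1], integral[0][1])
t = Fraction(v[2], integral[1][2])
assert [s * p + t * q for p, q in zip(*integral)] == v
print(integral, s, t)
```

The coefficients s and t are not integers, so the two integral vectors span the rational nullspace but do not generate every integer vector in it.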
The following two results are Lemmas 19 and 20 in Bremner and Peresi [2].
Lemma 8. Let E be an m × n matrix over Z, let H be the HNF of $E^t$, and let U be an n × n matrix over Z with det(U) = ±1 and $U E^t = H$. If r is the rank of H, then the last n − r rows of U form a lattice basis for L(E).
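Lemma 8 can be illustrated with a short integer row reduction that records the transform matrix. This is a sketch only: it brings the transpose to integer row echelon form using unimodular row operations (swaps, sign changes, adding integer multiples of one row to another) and does not perform the final normalization of entries above the pivots required by the full HNF, which does not affect the last n − r rows. The function names are hypothetical.

```python
def integer_echelon(A):
    """Return (H, U, r) with U*A = H in integer row echelon form, r = rank."""
    n, m = len(A), len(A[0])
    H = [list(row) for row in A]
    U = [[int(i == j) for j in range(n)] for i in range(n)]
    r = 0
    for c in range(m):
        while True:
            nz = [i for i in range(r, n) if H[i][c] != 0]
            if len(nz) <= 1:
                break
            # Euclidean step: reduce every other row by the smallest entry.
            p = min(nz, key=lambda i: abs(H[i][c]))
            for i in nz:
                if i != p:
                    q = H[i][c] // H[p][c]
                    H[i] = [x - q * y for x, y in zip(H[i], H[p])]
                    U[i] = [x - q * y for x, y in zip(U[i], U[p])]
        if nz:
            p = nz[0]
            H[r], H[p] = H[p], H[r]
            U[r], U[p] = U[p], U[r]
            if H[r][c] < 0:
                H[r] = [-x for x in H[r]]
                U[r] = [-x for x in U[r]]
            r += 1
        if r == n:
            break
    return H, U, r

def nullspace_lattice_basis(E):
    """Last n - r rows of U, where U * E^t is in echelon form (as in Lemma 8)."""
    Et = [list(col) for col in zip(*E)]
    H, U, r = integer_echelon(Et)
    return U[r:]

basis = nullspace_lattice_basis([[2, 1, 1]])
print(basis)
```

Because only unimodular row operations are used, U has determinant ±1, which is the hypothesis of the lemma.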
For the matrix E of Table 1, method (b) implemented in Maple gives the Hermite normal form H in Table 4 and the transform matrix U in Table 5. Since the rank of E is 6, the last 9 rows of U form a lattice basis of the integer nullspace of E. The squared norms of the 15 rows of U are 11, 5, 5, 31, 15, 25, 6, 124, 44, 196, 14, 142, 22, 6, 38; the last 9 of these values belong to the lattice basis. At this point the lengths of the basis vectors are greater than those obtained with the RCF; however, we know that we have an integer basis of the nullspace lattice.
To improve these results we use the LLL algorithm for lattice basis reduction, following Bremner and Peresi [2]. For the matrix E of Table 1, Maple produces the transform matrix U of Table 6. The bottom 9 rows of U are a reduced basis for the integral nullspace lattice of E; every one of these vectors has squared norm 6, the same as the shortest vector from Table 3. Even for this small matrix, the LLL algorithm has produced a basis of the nullspace with significantly shorter vectors. The identity (4) corresponding to row 9 of U generates the entire nullspace. Identity (4) has one more term than the binary recombination identity (3), but its coefficient vector is shorter.
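The effect of lattice basis reduction can be seen on a toy example. The following is a textbook LLL sketch with exact rational arithmetic and parameter 3/4, written for clarity rather than speed; it is not the Maple implementation, and the input basis is a hypothetical skewed basis of the nullspace lattice of the 1 × 3 matrix (2, 1, 1).

```python
from fractions import Fraction

def dot(u, v):
    return sum(Fraction(x) * y for x, y in zip(u, v))

def gram_schmidt(B):
    """Gram-Schmidt vectors Bs and coefficients mu (no normalization)."""
    Bs, mu = [], [[Fraction(0)] * len(B) for _ in B]
    for i, b in enumerate(B):
        v = [Fraction(x) for x in b]
        for j in range(i):
            mu[i][j] = dot(b, Bs[j]) / dot(Bs[j], Bs[j])
            v = [x - mu[i][j] * y for x, y in zip(v, Bs[j])]
        Bs.append(v)
    return Bs, mu

def lll(B, delta=Fraction(3, 4)):
    """LLL reduction of a list of linearly independent integer vectors."""
    B = [list(b) for b in B]
    k = 1
    while k < len(B):
        # Size reduction of B[k] against B[k-1], ..., B[0].
        for j in range(k - 1, -1, -1):
            _, mu = gram_schmidt(B)
            q = round(mu[k][j])
            if q:
                B[k] = [x - q * y for x, y in zip(B[k], B[j])]
        Bs, mu = gram_schmidt(B)
        # Lovasz condition: accept B[k] or swap it with B[k-1].
        if dot(Bs[k], Bs[k]) >= (delta - mu[k][k - 1] ** 2) * dot(Bs[k - 1], Bs[k - 1]):
            k += 1
        else:
            B[k], B[k - 1] = B[k - 1], B[k]
            k = max(k - 1, 1)
    return B

skewed = [[1, -2, 0], [7, -15, 1]]        # squared norms 5 and 275
reduced = lll(skewed)
norms = sorted(sum(x * x for x in v) for v in reduced)
print(reduced, norms)
```

Each step is either an integer row operation or a swap, so the reduced vectors generate the same lattice as the input.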
Identities (3) and (4) both have only one term in the second association type, so both imply that every monomial in the second type can be expressed as a linear combination of monomials in the first type. For example, identity (4) can be solved for its single term of the second type. The resulting identity can be used to reduce the number of association types that we need to consider when studying identities of higher degree.

Ternary intermolecular recombination
We now present the main results of this paper: a complete and minimal set of polynomial identities of degree ≤ 9 for ternary intermolecular recombination.
3.1. Degree 3. Every polynomial identity in degree 3 satisfied by ternary intermolecular recombination is a consequence of the complete symmetry identity. Indeed, equation (2) implies that there is only one monomial in degree 3, namely {a, b, c}, and hence the only possible polynomial identity in degree 3 is {a, b, c} = 0 for all a, b, c; this clearly does not hold.
3.2. Degree 5. Equation (2) implies that there is only one association type in degree 5, namely {{a, b, c}, d, e}, and this gives 10 multilinear monomials, one for each choice of the inner triple. Any polynomial identity in degree 5 satisfied by ternary intermolecular recombination is a linear combination of these 10 monomials.
Lemma 9. Every polynomial identity of degree 5 satisfied by ternary intermolecular recombination is a consequence of the complete symmetry identity in degree 3.
Proof. Each term in these 10 expansions has the form $(x_1, y_2, z_3)$ where (x, y, z) is a permutation of a 3-element subset of {a, b, c, d, e}. There are 60 such triples, which we order lexicographically. We construct the 60 × 10 matrix E in which entry (i, j) contains the coefficient of triple i in the expansion of monomial j. The polynomial identities in degree 5 for ternary intermolecular recombination correspond to the nonzero vectors in the nullspace of E. It suffices to show that this matrix has full rank, and for this it suffices to find 10 rows for which the corresponding 10 × 10 submatrix has full rank. Rows 1, 2, 3, 5, 6, 9, 17, 18, 21 and 33 produce the submatrix displayed in Table 7; the row canonical form of this submatrix is the identity matrix.
Table 7. Submatrix for the proof of Lemma 9
3.3. Degree 7. We now consider 7 distinct molecules a, b, c, d, e, f, g, each divided into 3 submolecules. After performing three applications of ternary intermolecular recombination on these molecules with some order of operations and some permutation of molecules, we obtain a linear combination of triples of the form $(x_1, y_2, z_3)$ where (x, y, z) is a permutation of a 3-element subset of {a, b, c, d, e, f, g}. The number of triples which can occur as terms in these linear combinations is therefore $3!\binom{7}{3} = 210$.
We order these triples lexicographically by the permutation (x, y, z). Equation (2) implies that we need only two association types in degree 7, namely {{{a, b, c}, d, e}, f, g} and {{a, b, c}, {d, e, f}, g}. Within each association type, we order the monomials lexicographically by the permutation of the variables. Any polynomial identity in degree 7 satisfied by ternary intermolecular recombination is a linear combination of these 280 nonassociative monomials.
The expansion matrix E in degree 7 has 210 rows and 280 columns; entry (i, j) is the coefficient of triple i in the expansion of nonassociative monomial j. The expansions of the two association types are displayed in Tables 8 and 9. Each expansion has $6^3 = 216$ terms; after collecting terms, the first type produces a linear combination of 30 triples with multiplicities 4 and 12, and the second type produces a linear combination of 54 triples with multiplicity 4. We use Maple to find that the matrix E has rank 35, and so its nullspace has dimension 245. We compute the row canonical form of E, and find that every entry of the RCF is an integer. We obtain the canonical basis of the nullspace of E; this is a list of 245 vectors of dimension 280 with integer components. We sort these vectors by increasing Euclidean norm; the list of squared norms is displayed in Table 10.
Table 9. Expansion of monomial {{a, b, c}, {d, e, f}, g}
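The sizes, multiplicities and ranks quoted for degrees 5 and 7 can be checked with a short script. This is a sketch under the assumption that the ternary operation sums over all six permutations of the three molecules, keeping each submolecule in its position; the ordering of monomials is chosen for convenience and need not match the tables, and the rank is computed modulo the prime 101 rather than with Maple. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations, permutations, product

def rec3(u, v, w):
    """Trilinear ternary recombination on formal sums of triples (x_1, y_2, z_3)."""
    out = Counter()
    for (m1, c1), (m2, c2), (m3, c3) in product(u.items(), v.items(), w.items()):
        mols = (m1, m2, m3)
        for s in permutations(range(3)):
            out[(mols[s[0]][0], mols[s[1]][1], mols[s[2]][2])] += c1 * c2 * c3
    return out

def gen(x):
    return Counter({(x, x, x): 1})

def rest(variables, used):
    return [x for x in variables if x not in used]

def expansions(variables):
    """Expansions of all multilinear monomials in degree 5 or 7."""
    cols = []
    if len(variables) == 5:
        for inner in combinations(variables, 3):             # {{a,b,c},d,e}
            d, e = rest(variables, inner)
            cols.append(rec3(rec3(*map(gen, inner)), gen(d), gen(e)))
    elif len(variables) == 7:
        for inner in combinations(variables, 3):             # {{{a,b,c},d,e},f,g}
            for pair in combinations(rest(variables, inner), 2):
                f, g = rest(variables, inner + pair)
                mid = rec3(rec3(*map(gen, inner)), *map(gen, pair))
                cols.append(rec3(mid, gen(f), gen(g)))
        for g in variables:                                   # {{a,b,c},{d,e,f},g}
            six = rest(variables, (g,))
            for pair in combinations(six[1:], 2):
                first = (six[0],) + pair
                second = rest(six, first)
                cols.append(rec3(rec3(*map(gen, first)), rec3(*map(gen, second)), gen(g)))
    return cols

def rank_mod_p(rows, p=101):
    """Rank of an integer matrix modulo the prime p (Gaussian elimination)."""
    M = [[x % p for x in r] for r in rows]
    rank = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(rank, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        inv = pow(M[rank][c], -1, p)
        M[rank] = [(x * inv) % p for x in M[rank]]
        for i in range(rank + 1, len(M)):
            f = M[i][c]
            if f:
                M[i] = [(x - f * y) % p for x, y in zip(M[i], M[rank])]
        rank += 1
    return rank

def expansion_rank(variables):
    triples = list(permutations(variables, 3))
    cols = expansions(variables)
    # One row per monomial here; the rank equals that of the transpose E.
    return len(triples), len(cols), rank_mod_p([[c[t] for t in triples] for c in cols])

cols7 = expansions("abcdefg")
print(expansion_rank("abcde"), expansion_rank("abcdefg"))
```

The nullity in degree 7 is then the number of monomials minus the rank.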
We now perform further computations, using modular arithmetic to save memory, to determine a set of generators for the nullspace as a module over the symmetric group $S_7$. (We use p = 101, the smallest prime greater than 100; any prime greater than the degree of the identities would produce the same dimensions.) For each basis vector in the nullspace, we apply all 5040 permutations of a, b, c, d, e, f, g to the corresponding identity, and store the results in a matrix of size 5040 × 280. We compute the row canonical form of this matrix; its nonzero rows form a basis of the $S_7$-module generated by the identity, and its rank is the dimension of this module. We process the nullspace basis vectors in order, saving previous results (the nonzero rows which form a basis of the $S_7$-module generated by the previous basis vectors). At each stage, an increase in the rank implies that the current identity is a new generator of the nullspace as a module over $S_7$. The results of these computations show that the canonical basis vectors with positions 1, 10 and 60 (after the basis vectors have been sorted by increasing Euclidean norm) represent polynomial identities which generate the entire nullspace as an $S_7$-module.
Every identity in the nullspace is a linear combination of permutations of these three identities, which have squared norms 4, 6 and 32; the second identity has been normalized so that it begins with a positive coefficient. We can get significantly better results using the Hermite normal form and lattice basis reduction.
Table 10. Squared norms of the canonical basis vectors (sorted by increasing norm)
6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 8 8 8 8 8 8 8 8 8
10 10 10 10 10 10 10 10 10 10 10 10 12 12 12 12 12 14 16 32 32 34 34 34 34
52 52 52 52 52 52 80 80 82 82 86 86 88 94 94 96
Theorem 10. Every polynomial identity of degree 7 satisfied by ternary intermolecular recombination is a consequence of complete symmetry in degree 3 and three identities P, Q and R in degree 7. Identities P and Q are independent: neither implies the other. Identity R implies both P and Q, but identities P and Q together do not imply R.

Proof. We apply the Maple command
HermiteForm( Transpose(E), output='U', method='integer[reduced]' ):
to the transpose (size 280 × 210) of the expansion matrix E, and obtain a transform matrix U (size 280 × 280). The bottom 245 rows of U are a basis for the integral nullspace lattice of E. We sort these vectors by increasing Euclidean norm; the squared norms are displayed in Table 11. It is clear by comparing Tables 10 and 11 that the basis vectors obtained in this way are generally much shorter than those obtained from the RCF. Identities P, Q and R have coefficients ±1 and the squared norms of the corresponding coefficient vectors are 4, 6 and 12. (These identities correspond to the reduced basis vectors in positions 1, 40 and 129 after the reduced basis vectors have been sorted by increasing Euclidean norm.) These three identities can be verified by expanding each term using three applications of ternary intermolecular recombination, and then checking that the results collapse to zero.
To prove the stated dependence and independence relations we proceed as follows. Given an identity I(a, b, c, d, e, f, g) we apply all 5040 permutations of a, b, c, d, e, f, g and store the coefficients of the resulting identities in a 5040 × 280 matrix. The rank of this matrix is the dimension of the $S_7$-submodule of identities generated by I. To check whether another identity J(a, b, c, d, e, f, g) is implied by I we simply determine whether J is in the row space of this matrix. We find that identity P produces dimension 105, and identity Q produces dimension 127; stacking the two corresponding matrices together, we see that these two identities together produce dimension 155. However, identity R produces dimension 245, and this is also the rank produced by the three identities together. Since 245 is also the dimension of the nullspace of the expansion matrix, it follows that every identity in degree 7 is a consequence of the identity R, in the sense that every identity in degree 7 is a linear combination of permutations of R(a, b, c, d, e, f, g).
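The permutation procedure used in this proof can be illustrated on the smaller binary case in degree 4, where the matrices have 24 rows and 15 columns. The sketch below assumes the binary operation {a, b} = (a_1, b_2) + (b_1, a_2); the six-term identity it uses is a hypothetical test case constructed for this illustration (the script checks that it expands to zero) and is not claimed to be identity (3) or (4). Monomials are stored as nested frozensets so that commutativity is built in.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

VARS = "abcd"

def rec(u, v):
    out = Counter()
    for (p, q), cu in u.items():
        for (r, s), cv in v.items():
            out[(p, s)] += cu * cv
            out[(r, q)] += cu * cv
    return out

def expand(m):
    """Expand a monomial: a variable name or a frozenset {left, right}."""
    if isinstance(m, str):
        return Counter({(m, m): 1})
    left, right = m
    return rec(expand(left), expand(right))

def mono(x, y):
    return frozenset((x, y))

def m1(x, y, z, w):                      # association type {{{x, y}, z}, w}
    return mono(mono(mono(x, y), z), w)

def act(m, pi):
    """Apply a permutation of the variables to a monomial."""
    return pi[m] if isinstance(m, str) else frozenset(act(x, pi) for x in m)

def rank(rows):
    M = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            if f:
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

perms = [dict(zip(VARS, p)) for p in permutations(VARS)]
t2 = mono(mono("a", "b"), mono("c", "d"))
basis = list(dict.fromkeys(act(t, pi) for t in (m1(*"abcd"), t2) for pi in perms))

# Hypothetical test identity with coefficients +1 and -1 (squared norm 6).
identity = {m1(*"abcd"): 1, m1(*"bcad"): -1, m1(*"bcda"): 1,
            m1(*"bdca"): -1, m1(*"bdac"): 1, m1(*"abdc"): -1}

total = Counter()
for m, c in identity.items():
    for pair, k in expand(m).items():
        total[pair] += c * k
assert not any(total.values())           # the identity really holds

rows = []
for pi in perms:
    image = {act(m, pi): c for m, c in identity.items()}
    rows.append([image.get(b, 0) for b in basis])
dim = rank(rows)                          # dimension of the S_4-module generated
print(len(basis), dim)
```

To test whether a second identity is implied by the first, one can append its coefficient vector as an extra row and check that the rank does not increase.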
The identity R of Theorem 10 will be called the ternary recombination identity. The last term in R is the only term in identities P, Q and R which has the second association type, so we have the following result.
Corollary 11. Every polynomial identity of degree ≤ 7 satisfied by ternary intermolecular recombination is a consequence of complete symmetry and the identity obtained by solving R for its last term, which expresses any monomial in the second association type as a linear combination of monomials in the first association type.
3.4. Degree 9. The number of triples $(x_1, y_2, z_3)$, where (x, y, z) is a permutation of a 3-element subset of {a, b, c, d, e, f, g, h, i}, is $3!\binom{9}{3} = 504$. This is the number of rows in the expansion matrix in degree 9. Equation (2) implies that we need only four association types in degree 9: {{{{a, b, c}, d, e}, f, g}, h, i}, {{{a, b, c}, {d, e, f}, g}, h, i}, {{{a, b, c}, d, e}, {f, g, h}, i} and {{a, b, c}, {d, e, f}, {g, h, i}}, with 7560, 2520, 5040 and 280 multilinear monomials respectively. The total is 15400; this is the number of columns in the expansion matrix. We initialize the expansion matrix and then use modular arithmetic (again with p = 101) to compute its rank; the result is 84, so the dimension of the nullspace is 15316.
To decide whether any nullspace vector represents a new identity in degree 9, we need to determine the dimension of the subspace of the nullspace which consists of all consequences of the ternary recombination identity R = R(a, b, c, d, e, f, g) from Theorem 10. To obtain consequences in degree 9, we can either (i) replace one of the variables x ∈ {a, b, c, d, e, f, g} by the triple {x, h, i} where h and i are two new variables, or (ii) embed the entire identity R in a triple {R, h, i}. This gives 8 identities in degree 9 which generate the $S_9$-module of all consequences of the identity R in degree 7.
Remark 12. Shortly before the final version of this paper was sent to the editors, I received Sverchkov's preprint [6], which contains a complete proof of all three Conjectures for all n, with an explicit identity R as required by Conjecture 2.