Information-bit error rate and false positives in an MDS code

In this paper, a refinement of the weight distribution of an MDS code is computed. Concretely, the number of codewords with a fixed number of nonzero symbols in both the information and the redundancy parts is obtained. This refinement improves the theoretical approximation of the information-bit and information-symbol error rates, in terms of the channel bit-error rate, for a block transmission through a discrete memoryless channel. Since a bounded distance reproducing decoder is assumed, the computation of the here-called false positive (a decoding failure with no information-symbol error) is also provided. As a consequence, a new performance analysis of an MDS code is proposed.


I. INTRODUCTION
A FUNDAMENTAL challenge when determining the performance of a block error-correcting code is to measure its bit-error rate (BER), which quantifies the reliability of the system. In practice, the BER estimation for a single code is simple: send data and divide the number of erroneous bits by the total number of bits sent. However, this becomes too costly and time-consuming if a comparison between several codes is required: mathematical software packages for encoding and decoding are very limited and restricted to specific codes, and simulations consume a huge amount of time when dealing with low bit-error rates. For this reason, a theoretical approach to the measurement of the BER has been proposed by several authors in the literature, see for instance [1]-[6], [16]. All these papers follow the same coding scheme: let C be a code of length n and dimension k over the field with q elements, q ≥ 2. An n-tuple is transmitted through a q-ary symmetric discrete memoryless channel. In this step, there are two possibilities: the transmission is right, or it fails in some symbols. In a second step, the code corrects the n-tuple, detects an erroneous transmission but does not correct it, or asserts that the transmitted n-tuple is a codeword. Finally, there is a comparison between the encoded and decoded n-tuples, see Figure 1. Five cases may occur: 1) A right transmission (RT), i.e., every symbol is correctly received. 2) A right correction (RC), i.e., some of the symbols are incorrectly received, but the decoding algorithm corrects them. 3) An error detection (ED), i.e., the number of errors exceeds the error-correction capability of the code, the block is not corrected, and the bad transmission is detected. 4) A wrong correction (WC), i.e., some errors occur (beyond the error-correction capability of the code) and there is a correction, but nevertheless the encoded block differs from the decoded block. 5) A false negative (FN), i.e., some symbols are incorrectly received, but the whole block is a codeword, so, from the receiver's point of view, the block is correctly received.
Cases FN and WC are called undetected errors in [6], where it is proven that, for maximum-distance-separable (MDS) codes, the probability of an undetected error decreases monotonically as the channel symbol-error rate decreases; that is, MDS codes are proper in the terminology of [2]. Hence, the performance of an MDS code is characterized by the probability that an erroneous transmission remains undetected. In [1], the probability of an ED is added as a performance criterion, and an exhaustive calculation of the word-, symbol-, and bit-error rates of ED, WC, and FN is carried out.
In this paper, we propose a refinement of the calculation of the probabilities of an FN, a WC, and an ED. Consequently, we get a better approximation of the BER for a q-ary MDS code. As in the above references, we consider a bounded distance reproducing decoder, i.e., one that reproduces the received word whenever there are uncorrectable errors. The underlying idea consists in discarding the symbol errors produced in the redundancy part of the decoded n-tuple; that is, following the nomenclature of [13], [14], and unlike the aforementioned papers, we estimate the information-bit error rate (iBER), sometimes also called the post-decoding bit-error rate. More formally, let us assume, without loss of generality, that the encoding is systematic and the first k symbols of the n-tuples form an information set. Hence, following the above scheme, after the comparison step, if there are s errors, the symbol-error proportion is s/n. Nevertheless, some of these errors belong to the redundancy part and will not propagate to the final post-decoded k-tuple. In other words, a new variable should be considered: the comparison between the original block and the final k-tuple obtained after decoding, see Figure 2.
Attending to this new variable, we may split the ED into two disjoint cases: 3.a) A pure error detection (PED), i.e., some errors affect the information set. 3.b) A false positive (FP), i.e., all errors belong to the redundancy part, so, from the point of view of the receiver, there are uncorrectable errors but, indeed, the post-decoded block is right.
Hence, a study of the BER should consider the probability of obtaining a false positive after the post-decoding process, and the criterion for measuring the performance of the code should be the probability of undetected errors and PEDs.
All along this paper, we shall assume that the code under consideration is a q-ary [n, k] MDS code used with a bounded distance reproducing decoder, where q = 2^b for some natural number b > 0 and 0 < k < n, with minimum distance d = n − k + 1 and error-correcting capability of up to t = ⌊(n − k)/2⌋ errors. Furthermore, we shall assume, without loss of generality, that the generator matrix is systematic in its first k coordinates. This is possible since it is well known that, in an MDS code, any subset of k coordinates forms an information set, see e.g. [11, Theorem 5.3.4]. The reorganization of these components to the first k positions does not affect our calculations and makes the text much more readable. The channel BER shall be denoted by p, and 1 − p denotes the probability of a bit being correctly received. The code is 2^b-ary, so the probability of a symbol being correctly received is q_s = (1 − p)^b. Therefore, the expected probability of transition between any two given symbols, the channel symbol-error rate, is the average of all possible cases, i.e., p_s = (1 − (1 − p)^b)/(q − 1). Finally, if we want to know the probability of a bit being erroneous inside an erroneous symbol, we simply take the conditional probability p_{b|s} = p/(1 − (1 − p)^b). We will denote by ∧ and ∨ the minimum and the maximum of a set of integers, respectively.
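As a quick numerical sketch, the channel quantities above can be computed as follows. The helper name `channel_params` is ours, and the formula for p_s follows the convention stated above (the total symbol-error probability averaged over the q − 1 wrong symbols); it is an illustration, not the paper's code:

```python
def channel_params(p: float, b: int):
    """Channel quantities for a 2^b-ary symmetric channel built from
    b independent uses of a binary symmetric channel with bit-error rate p."""
    q = 2 ** b
    q_s = (1 - p) ** b            # probability a symbol is correctly received
    p_s = (1 - q_s) / (q - 1)     # transition probability to one specific wrong symbol
    p_bs = p / (1 - q_s)          # bit-error probability inside an erroneous symbol
    return q_s, p_s, p_bs

q_s, p_s, p_bs = channel_params(0.01, 3)   # e.g. p = 0.01 over GF(8)
```

Note that q_s + (q − 1)p_s = 1, so the q symbol probabilities sum to one.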
The paper is structured as follows. In Section II, we count the codewords of a q-ary MDS code. The number of codewords of a fixed weight has been calculated before (see [6] and [9]); nevertheless, we need a finer approach. In [7] and [8], a formula for the number of codewords of fixed weight with respect to a given partition is provided. By means of arguments of Linear Algebra, we shall give an equivalent formula counting the number of codewords with a fixed weight both in the information set and in the redundancy part. This allows us to calculate the iBER of an FN. In Section III, we shall deal with decoding failures, that is, the cases where the received word is corrected to a codeword different from the original one. We shall follow the style of [1] and count the words inside the sphere of a codeword of a given information and redundancy weight. We will make use of these calculations in Section IV in order to give an expression of the probability that an FP occurs and to obtain the desired approximation of the BER of an MDS code. In order to make the paper self-contained, we add in the Appendix some of the combinatorial formulae needed all along the text.

II. COUNTING CODEWORDS

Let C be a q-ary [n, k] MDS code generated by a systematic matrix G = (I_k | R). The aim of this section is to compute A_{i,j}, the number of codewords in C with i non-zero information symbols and j non-zero redundancy symbols, where i ∈ {0, . . . , k} and j ∈ {0, . . . , n − k}. This is called the input-redundancy weight enumerator (IRWE) in the literature (see e.g. [7]). In fact, the weight enumerator of any partition T of {1, . . . , n} is computed in [7, Theorem 1] and [8, Theorem 3.1], hence A_{i,j} can be obtained as a particular case of those theorems. We propose a new way to compute A_{i,j} involving linear algebra techniques. We shall need the following lemmata.
Let a ∈ F_q^k be the vector whose i_h-th coordinate is a_h for all 1 ≤ h ≤ l and which is zero otherwise. Since aG = (aI_k | aR), there are l non-zero coordinates in the first k coordinates of aG. Now, by (1), in the last n − k coordinates of aG there are no more than n − k − l non-zero coordinates. Then, aG has weight at most n − k. Since aG ∈ C and the minimum distance of C is d = n − k + 1, we get a contradiction. Consequently, R_{ργ} is regular. Given a system of linear equations, we say that a solution is totally non-zero if all the coordinates of the vector are different from zero. We say that a matrix has totally full rank if every submatrix has full rank. By Lemma 1, R has totally full rank.
Let us denote by f^q_{i,j} the number of totally non-zero solutions of a homogeneous linear system over the field F_q with i variables and j equations whose coefficient matrix has totally full rank.

Lemma 2. For any integers i, j ≥ 1, f^q_{i,j} is given by the following recurrence:

f^q_{i,j} = 0 if i ≤ j, and f^q_{i,j} = (q − 1)^{i−j} − \sum_{h=1}^{j} \binom{j}{h} f^q_{i−h,j} if i > j.   (2)

Proof: If i ≤ j, there are at least as many equations as variables. Since the coefficient matrix has full rank, the zero vector is the only solution, so f^q_{i,j} = 0. If i > j, since the coefficient matrix has full rank, the system is underdetermined. Specifically, whenever we fix i − j coordinates, we find a unique solution. Then, there are (q − 1)^{i−j} solutions whose first i − j coordinates are non-zero. In order to calculate f^q_{i,j}, it is enough to subtract those solutions for which some of the remaining j coordinates are zero. For any 0 < h ≤ j, the solutions with exactly h of those coordinates equal to zero may be obtained by choosing the h coordinates among the j, and then counting the totally non-zero solutions of a linear system with i − h variables and j equations, that is, f^q_{i−h,j}. This yields (2). Observe that if i ≤ j, then F^q_{i,j} = 0, since the sum runs over the empty set. Suppose that i > j.
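The counting argument in the proof translates directly into a memoized recurrence. The sketch below is our own helper (not the paper's code) and adds the convenient convention f^q_{i,0} = (q − 1)^i for systems with no equations:

```python
from functools import lru_cache
from math import comb

def make_f(q: int):
    """Return f(i, j) = number of totally non-zero solutions of a homogeneous
    system with i variables and j equations over F_q whose coefficient matrix
    has totally full rank, computed by the recurrence in the proof above."""
    @lru_cache(maxsize=None)
    def f(i: int, j: int) -> int:
        if j == 0:
            return (q - 1) ** i       # no equations: every totally non-zero vector
        if i <= j:
            return 0                  # full rank forces the zero solution only
        return (q - 1) ** (i - j) - sum(comb(j, h) * f(i - h, j)
                                        for h in range(1, j + 1))
    return f

f = make_f(5)   # e.g. over F_5
```

For a single equation there is the closed form ((q − 1)^i + (−1)^i (q − 1))/q, which the recurrence reproduces.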

Proposition 3. For any integers i, j ≥ 1, f^q_{i,j} = F^q_{i,j}, where F^q_{i,j} is the closed-form expression given in (3).
We substitute f^q_{i−h,j} by F^q_{i−h,j} in (2), and using (3), we get an expression which is a polynomial in (q − 1). Let us now calculate the coefficients of (q − 1)^{i−j−H} for H = 0, . . . , j ∧ (i − j − 1). We group those coefficients in which l + h = H. If H = 0, then l = h = 0, and therefore the coefficient of (q − 1)^{i−j} is obtained, where the last equality comes from Lemma 15. As a result, only the case j < H ≤ i − j − 1 remains to be analyzed. In such a case, the coefficient of (q − 1)^{i−j−H} is given by an expression where, again, the last equality follows from Lemma 15.

Lemma 4. For any 0 ≤ i ≤ k and 0 ≤ j ≤ n − k with i + j > 0, the number A_{i,j} of codewords of C with weight i in the first k coordinates (the information set) and j in the remaining n − k coordinates (the redundancy part) is

A_{i,j} = \binom{k}{i} \binom{n−k}{j} \sum_{h=0}^{j} (−1)^h \binom{j}{h} f^q_{i, n−k−j+h},

with the convention f^q_{i,0} = (q − 1)^i.
Proof: A codeword with weight i in the information set and j in the redundancy part must be obtained as a linear combination, with non-zero coefficients, of i rows of G such that exactly n − k − j coordinates of the redundancy part become zero; i.e., its coefficient vector is a totally non-zero solution of a homogeneous linear system with n − k − j equations and i variables. By Lemma 1, the number of such solutions is given by f^q_{i,n−k−j}. However, some of the solutions counted in f^q_{i,n−k−j} may also set to zero some of the remaining j coordinates. Hence, the lemma is obtained as a consequence of the Inclusion-Exclusion Principle.
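The inclusion-exclusion step can be checked numerically. The sketch below is our own formulation of the argument just given (choose the i information positions and the j non-zero redundancy positions, then count coefficient vectors vanishing exactly on the remaining redundancy positions); summing over i + j = r recovers the classical MDS weight distribution:

```python
from functools import lru_cache
from math import comb

def A(q, n, k, i, j):
    """Codewords of a q-ary [n,k] MDS code with i non-zero information symbols
    and j non-zero redundancy symbols, via the inclusion-exclusion argument."""
    if i == 0:
        return 1 if j == 0 else 0     # only the zero codeword has no information weight

    @lru_cache(maxsize=None)
    def f(ii, jj):                    # totally non-zero solution counts (Lemma 2)
        if jj == 0:
            return (q - 1) ** ii
        if ii <= jj:
            return 0
        return (q - 1) ** (ii - jj) - sum(comb(jj, h) * f(ii - h, jj)
                                          for h in range(1, jj + 1))

    r = n - k
    return (comb(k, i) * comb(r, j)
            * sum((-1) ** h * comb(j, h) * f(i, r - j + h) for h in range(j + 1)))
```

For the [7, 3] Reed-Solomon-type MDS code over GF(8), summing A(8, 7, 3, i, r − i) over i gives the weight distribution 1, 0, 0, 0, 0, 147, 147, 217, whose total is 8^3 = 512.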
This agrees with the fact that the minimum weight of a non-zero codeword must be n − k + 1 = d.
Let us now give a neater description of A_{i,j}. By combining Proposition 3 and Lemma 4, we obtain an expression for A_{i,j}. We proceed analogously to the proof of Proposition 3 and examine the coefficient of (q − 1)^{i−n+k+j−H}. We distinguish two cases. On the one hand, if H ≤ j, then 0 ≤ H ≤ j ∧ (i − n + k + j − 1), and the coefficient is given by an expression where the last equality is a consequence of Lemma 11. On the other hand, if j ≤ H, the coefficient of (q − 1)^{i−n+k+j−H} is given by an expression where † is given by Lemma 11 and Lemma 12, and ‡ is a consequence of Lemma 12. That is, by (5) and (6), we have proven the main result of this section, Theorem 6. Recall from [7, Theorem 1] and [8, Theorem 3.1] that the partition weight enumerator A_T is defined for any partition T of {1, . . . , n} into parts of sizes (n_1, . . . , n_p).

Proposition 7. Let T be the (k, n − k) partition of {1, . . . , n} associated to the k information symbols and the n − k redundancy symbols. Then A_T(i, j) = A_{i,j} for all i ≤ k and j ≤ n − k.

The proof of Proposition 7 is in Appendix B. As a consequence of Theorem 6, we may recover the well-known formula for the weight distribution of an MDS code as it appears in [6]. Let {A_r | 0 ≤ r ≤ n} denote the weight distribution of C. Then A_r = \sum_{i+j=r} A_{i,j}, since a codeword of weight r distributes its weight between the first k and the last n − k coordinates.

Proposition 8. The weight distribution of an MDS code is given by the following formula: A_0 = 1, A_r = 0 for 0 < r < d, and

A_r = \binom{n}{r} \sum_{j=0}^{r−d} (−1)^j \binom{r}{j} (q^{r−d+1−j} − 1), for d ≤ r ≤ n.
Proof: The case r = 0 is evident. If 0 < r ≤ n − k < d, then, as we have pointed out in Remark 5, A_{i,r−i} = 0. Let us now suppose that r ≥ d = n − k + 1, and apply Theorem 6. If r > k, then \binom{k}{i} = 0 for all k + 1 ≤ i ≤ r, so the same expression is valid for any k and r. By Lemma 11, we may simplify it further. We factor out \binom{n}{r} and expand the rest of the formula, computing the coefficient of q^{r−n+k−j}, where we split off the case j = r − n + k. By Lemma 14, with δ = n − k, β = r, α = j and κ = H, the resulting sums simplify. The independent term in q of this expression can be rewritten by means of Lemma 14 with β = r − 1, κ = H, α = d − 1 and δ = r − d, and then by Lemma 13 with α = r, γ = j and κ = r − d. This finishes the proof.

As a corollary of the above proof, we find a new description of the weight distribution of an MDS code.
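For reference, the closed formula of Proposition 8 can be evaluated directly. This sketch (our own helper) uses the classical form with terms q^{r−d+1−j} − 1; a standard sanity check is that the distribution sums to q^k:

```python
from math import comb

def mds_weight_distribution(q: int, n: int, k: int):
    """Weight distribution A_0, ..., A_n of a q-ary [n,k] MDS code."""
    d = n - k + 1
    A = [0] * (n + 1)
    A[0] = 1                                    # the zero codeword
    for r in range(d, n + 1):                   # A_r = 0 for 0 < r < d
        A[r] = comb(n, r) * sum((-1) ** j * comb(r, j) * (q ** (r - d + 1 - j) - 1)
                                for j in range(r - d + 1))
    return A
```

For instance, for a [3, 2] MDS code the formula gives 1 + 3(q − 1) + (q − 1)(q − 2) = q^2 codewords in total.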

Corollary 9.
For any r ≥ 0, the number of codewords of weight r of an MDS code is given by the formula obtained in the proof above.

Now, we are in a position to describe the information-bit and information-symbol error rates of a false negative. The iBER of a false negative is compared for different codes in Figure 3. Observe that it increases monotonically as the channel BER increases. The iBER of an FN is significantly smaller than the channel BER, at least for dimensions less than or equal to 117.
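The displayed rates can be assembled from the values A_{i,j} of this section. The sketch below is our own assembly, under the assumptions that a specific error pattern of symbol weight w occurs with probability p_s^w q_s^{n−w} and that the iBER is approximated by p_{b|s} times the iSER; the exact expressions are the ones displayed above:

```python
def fn_rates(p, b, n, k, A):
    """Word-, information-symbol- and information-bit-error rates of a false
    negative. A[i][j] = number of codewords with i non-zero information and
    j non-zero redundancy symbols. A sketch, not the paper's exact display."""
    q = 2 ** b
    q_s = (1 - p) ** b                  # symbol correctly received
    p_s = (1 - q_s) / (q - 1)           # one specific wrong symbol
    p_bs = p / (1 - q_s)                # bit error inside a wrong symbol
    word = sym = 0.0
    for i in range(k + 1):
        for j in range(n - k + 1):
            if i + j == 0:
                continue                # the transmitted codeword itself is not an FN
            pr = A[i][j] * p_s ** (i + j) * q_s ** (n - i - j)
            word += pr
            sym += (i / k) * pr         # i of the k information symbols are wrong
    return word, sym, p_bs * sym
```

As a toy input, a [3, 2] MDS code over GF(4) has A_{1,1} = 6, A_{2,0} = 3 and A_{2,1} = 6, and for small p the resulting iBER stays well below the channel BER.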

III. DECODING FAILURES
In this section, we shall make use of the computation of the values A_{i,j} in order to obtain the information errors of a decoding failure. For simplicity, we may assume that the zero codeword is transmitted. Suppose that we have received an erroneous transmission with r_1 non-zero coordinates in the information set and r_2 in the redundancy part. All along the paper, we shall say that the weight of this error is r_1 + r_2. If the received word is corrected by the code, then it is at distance at most t from a codeword. Obviously, if 0 ≤ r_1 + r_2 ≤ t, the word is properly corrected, so we may assume that t < r_1 + r_2. In this case, the correction is always wrong and we have a decoding failure. Our aim now is to count these words, highlighting the number of wrong information symbols and the errors belonging to the redundancy, i.e., the words of weight r_1 + r_2 that decode to a codeword c of weight c_1 + c_2. The reasoning is as follows: for any codeword c of such a weight, we calculate N^{(c_1,c_2)}_{(r_1,r_2)}, the number of words of weight r_1 + r_2 which belong to its sphere of radius t. This can be carried out by making up to t changes in c. Our reasoning is analogous to the one in [1]. Firstly, there is a minimum number of symbols that must be changed, either to zero or to something non-zero depending on the signs of r_1 − c_1 and r_2 − c_2, in order to obtain the correct weight. If t is large enough, we can use the remaining correction capability to change an additional number of symbols to zero, and the same number of symbols to a non-zero element of F_q, in order to keep the weight unchanged. Finally, the remaining possible symbol modifications can be used to change some non-zero symbols into other non-zero symbols, without affecting the weight of the word. Let α = t − |c_1 − r_1| − |c_2 − r_2|, where |n| denotes the absolute value of n.
We may distinguish four cases: a) r_1 ≤ c_1 and r_2 ≤ c_2: Among the c_1 non-zero information coordinates, c_1 − r_1 of them must be changed to zero, and we also allow i_1 more. In the same way, among the c_2 non-zero coordinates of the redundancy, we must change c_2 − r_2 of them to zero, and we allow i_2 more. Now, we should give a non-zero value to i_1 coordinates among the k − c_1 remaining information symbols, and to i_2 coordinates among the n − k − c_2 remaining redundancy symbols. Since the total number of changes cannot exceed t, the admissible quantities for i_1 and i_2 are bounded accordingly. Finally, we may change some of the remaining non-zero r_1 − i_1 and r_2 − i_2 coordinates to other non-zero symbols; if we change j_1 and j_2 coordinates, respectively, we obtain the corresponding number of choices. Multiplying these factors and summing over the admissible values yields the total number of words. If we denote I = i_1 + i_2 and J = j_1 + j_2, by Lemma 12, the expression simplifies. b) r_1 ≤ c_1 and r_2 > c_2: We proceed as in the above case with the information symbols, so we can make the same changes in the information set. In the redundancy part, we must give a non-zero symbol to r_2 − c_2 + i_2 coordinates and, therefore, change i_2 of the c_2 coordinates to zero. Finally, changing the values of j_1 and j_2 of the remaining r_1 − i_1 and c_2 − i_2 non-zero coordinates, we obtain the total number of words. Again, we denote I = i_1 + i_2 and J = j_1 + j_2 and, by Lemma 12, we simplify the expression. If we substitute I′ = I + r_2 − c_2 and take into account that binomial coefficients with negative entries are zero, we get (9). c) r_1 > c_1 and r_2 ≤ c_2: In this case, with the change of variable i′ = I − i, we can proceed as in the previous case and we get the corresponding formula. d) r_1 > c_1 and r_2 > c_2: We proceed as in the two previous cases and we obtain the analogous formula.

Theorem 10. Let t < r_1 + r_2.
For any codeword c of weight c_1 + c_2, the number N^{(c_1,c_2)}_{(r_1,r_2)} of words of weight r_1 + r_2 which belong to its sphere of radius t is given by the formulae above, where α and β are as in Theorem 10. Hence, the information-bit and information-symbol error rates of a wrong correction follow.
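Since case-by-case formulae of this kind are easy to get wrong, a brute-force check on a toy code is useful. The following sketch (entirely ours) counts N^{(c_1,c_2)}_{(r_1,r_2)} exhaustively for the q = 4, [3, 1] repetition-style MDS code {(a, a, a)} with d = 3 and t = 1; only Hamming distance is needed, so symbols can be plain labels 0..3:

```python
from itertools import product

q, n, k, t = 4, 3, 1, 1
codewords = [(a, a, a) for a in range(q)]     # a [3,1] MDS code, d = 3

def split_weight(w):
    """(information weight, redundancy weight); the info set is the first k coords."""
    return (sum(s != 0 for s in w[:k]), sum(s != 0 for s in w[k:]))

def dist(u, v):
    """Hamming distance between two words."""
    return sum(a != b for a, b in zip(u, v))

c = (1, 1, 1)                                 # a codeword of weight (c1, c2) = (1, 2)
N = {}
for w in product(range(q), repeat=n):
    if dist(w, c) <= t:
        N[split_weight(w)] = N.get(split_weight(w), 0) + 1
```

The sphere of radius t = 1 around c contains 1 + n(q − 1) = 10 words, split by (information, redundancy) weight as N[(1,2)] = 7, N[(1,1)] = 2 and N[(0,2)] = 1.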
Observe from Figure 4 that the iBER of a WC also increases monotonically, so MDS codes can be said to be "proper" with respect to the iBER as well, see [2], [6].

IV. FALSE POSITIVES
As we pointed out in the Introduction, there exists the possibility of the occurrence of an FP. To the best of our knowledge, this has not been treated before in the literature. In this section, we calculate the probability that a PED and an FP occur, completing our estimation of the iBER of an MDS code. As noticed above, without loss of generality, we may suppose that the zero word is transmitted and analyze the behaviour of the received word. Our purpose now is to count the words whose weight decomposes as 0 + r, i.e., with no non-zero information symbols, which are not corrected by the decoder. Obviously, if 0 ≤ r ≤ t, the word will be properly corrected, so we assume that t + 1 ≤ r ≤ n − k = d − 1. Two disjoint cases can occur: the error is detected but not corrected, producing an FP, or the error is (wrongly) corrected to a codeword. Since the total number of such words is \binom{n−k}{r}(q − 1)^r, it is enough to count the words corresponding to one of the two cases. We can make use of the calculations in Section III and give an expression for the number of words belonging to the second case, i.e., the words of weight r with t + 1 ≤ r ≤ n − k that are corrected to a codeword. Hence, the number FP(r) of false positives of weight r is obtained by subtraction, and the probability of producing a false positive is given by the following formula:

\sum_{r=t+1}^{n−k} FP(r) p_s^r q_s^{n−r}.

It can be observed in Figure 5a that the probability of an FP has a maximum for each code. When the channel BER is high enough, this probability increases as the error-correction capability of the code increases, see Figure 5b.
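The subtraction argument can be verified exhaustively on a toy code. This sketch (ours, not the paper's) counts the false positives of the q = 4, [3, 1] MDS code {(a, a, a)} with t = 1, assuming the zero codeword was sent and a bounded distance reproducing decoder:

```python
from itertools import product

q, n, k, t = 4, 3, 1, 1
codewords = [(a, a, a) for a in range(q)]     # a [3,1] MDS code, d = 3

def dist(u, v):
    """Hamming distance between two words."""
    return sum(a != b for a, b in zip(u, v))

def FP(r):
    """Words of weight 0 + r (all errors in the redundancy part, i.e. the last
    n - k coordinates) that the decoder detects but cannot correct."""
    count = 0
    for w in product(range(q), repeat=n):
        if w[0] != 0 or sum(s != 0 for s in w[1:]) != r:
            continue                           # keep only words of weight (0, r)
        if all(dist(w, c) > t for c in codewords):
            count += 1                         # detected, not corrected
    return count
```

Here there are 3 · 3 = 9 words of weight (0, 2); three of them, of the form (0, a, a), are wrongly corrected to (a, a, a), so FP(2) = 9 − 3 = 6.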
We may now give an estimation of the iBER of a PED. Indeed, when the received word has weight greater than t, the error-correcting capability of the code, three disjoint cases can occur: an undetected error, an FP, or a PED. Hence, for a given weight i_1 + i_2 with i_1 + i_2 > t, the number of words producing a PED is obtained by removing, from all the words of that weight, those producing an undetected error or an FP; it is zero otherwise. The reader may observe from Figure 6 that, for high channel BERs, the behaviour of the iBER of a PED becomes almost linear. Actually, the curves approach the line y = x as the dimension of the code decreases.
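Putting everything together, every possible received word can be classified into the six cases RT, RC, FN, WC, FP and PED. The sketch below (ours) does this exhaustively for the q = 4, [3, 1] MDS code {(a, a, a)} with t = 1, assuming the zero codeword is transmitted and a bounded distance reproducing decoder:

```python
from itertools import product

q, n, k, t = 4, 3, 1, 1
codewords = [(a, a, a) for a in range(q)]     # a [3,1] MDS code, d = 3
sent = (0, 0, 0)

def dist(u, v):
    """Hamming distance between two words."""
    return sum(a != b for a, b in zip(u, v))

tally = {"RT": 0, "RC": 0, "FN": 0, "WC": 0, "FP": 0, "PED": 0}
for w in product(range(q), repeat=n):
    if w == sent:
        tally["RT"] += 1                      # right transmission
    elif w in codewords:
        tally["FN"] += 1                      # received a different codeword
    else:
        near = [c for c in codewords if dist(w, c) <= t]
        if near:                              # decodable (spheres are disjoint)
            tally["RC" if near[0] == sent else "WC"] += 1
        else:
            # detected: the reproducing decoder outputs w itself, so the
            # post-decoded information part is the first k symbols of w
            tally["FP" if w[:k] == sent[:k] else "PED"] += 1
```

The 4^3 = 64 possible received words split as RT 1, RC 9, FN 3, WC 27, FP 6 and PED 18, illustrating how the classical ED case of the Introduction is refined into PED and FP.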

APPENDIX A
SOME COMBINATORIAL IDENTITIES
For the convenience of the reader and in order to make the paper self-contained, we add the combinatorial identities that have been referenced all along the paper.