Analysis of Minimal LDPC Decoder System on a Chip Implementation

This paper presents a practical method for replacing several different Quasi-Cyclic Low-Density Parity-Check (QC-LDPC) codes with a single code, with the intention of saving as much as possible of the memory required to implement the LDPC encoder and decoder in a memory-constrained System on a Chip (SoC). The presented method requires only a very small modification of the existing encoder and decoder, making it suitable for use in a Software Defined Radio (SDR) platform. In addition to an analysis of the effects of the necessary variable-node value fixation during the Belief Propagation (BP) decoding algorithm, practical standard-defined code parameters are scrutinized in order to evaluate the feasibility of the proposed simplification of the LDPC setup. Finally, the error performance of the modified system structure is evaluated and compared with the original system structure by means of simulation.


Introduction
The shortening of Reed-Solomon (RS) codes is a known and widely used technique of code parameter adaptation for practical implementations. It has been well described in the literature [1]. On the other hand, a similar method adapted for use with the more modern Low-Density Parity-Check (LDPC) codes is a relatively recent topic. A theoretical approach regarding various shortening algorithms for selecting the optimal subset of variable nodes to be fixed has been thoroughly evaluated in [2] and [3]. This paper elaborates on this analysis by providing insight not just into the code structure, but also into the effect of variable node value fixation on practical Belief Propagation (BP) decoding algorithms. Moreover, this text focuses on an even more practical matter: the evaluation of a potential LDPC decoder with respect to the limitations of a resource-limited System on a Chip (SoC) implementation. Such devices usually implement the IEEE 802.15.4 standard [4], with different Forward Error Correction (FEC) schemes: the current version of the standard specifies the use of an RS code along with an optional internal Convolutional Code. Considering the experience with other standards, such as IEEE 802.16 [5] and IEEE 802.11 [6], it is reasonable to evaluate the possibility of future inclusion of the LDPC code family among the FEC techniques used in low-resource standards and systems. Our theoretical method is in some aspects similar to the LDPC code puncturing method presented in [7] and further analyzed in [8], while being different in one key aspect: the fixed bits are always part of the information portion of the codeword, so their value is always known. This makes it possible to initialize the prior Log Likelihood Ratios (LLRs) of the fixed bits to infinite confidence instead of zero.
This paper focuses on a preliminary evaluation of the potential of LDPC codes in the context of a very resource-constrained platform. The next section reintroduces the topic of LDPC codes with focus on the practical Quasi-Cyclic LDPC (QC-LDPC) codes. The third section reviews the code shortening technique and proposes a detailed adaptation algorithm to be used with LDPC codes in both the transmitter and receiver. The third section also provides some insights into the operation of the Min-Sum decoding algorithm and the effects of node value fixation (necessary for code shortening) on the values being exchanged between the nodes during the message passing decoding algorithm. The fourth section discusses practical issues, such as the potential for simplification of the standard-defined [5] LDPC code set. The fifth section evaluates the decoding error performance by means of simulations and compares the achieved performance with the original system. The final section contains a brief summary and concludes the paper.

LDPC Codes Review
This section provides a brief recapitulation of the known LDPC terminology required in the following sections. The LDPC code is a Linear Block Code (LBC) defined by its sparse parity check matrix H [9], such as the one depicted in Fig. 1. According to Fig. 1, for the purpose of formulating the decoder equations, the columns of this matrix will be indexed by col and the rows by row. As with any LBC, basic code parameters can be summarized in a triplet (n, k, $d_{min}$) or a pair (n, k), where n denotes the codeword length, k the number of information symbols and $d_{min}$ the minimum code distance. Symbols N(row) and M(col) define the so-called neighborhoods: the sets of nodes incident with a given check or variable node, where $c_{row}$ denotes the row-th check node and $v_{col}$ the col-th variable node.
Fig. 1. Example of a small parity check matrix of a (6, 3) code along with neighborhoods for variable node col = 4 and check node row = 2.
The graphical representation of the parity check matrix is called a Tanner graph. This is a bipartite graph consisting of two types of nodes: variable nodes, each corresponding to a codeword bit, and check nodes, each associated with a parity equation. Tanner graph edges always connect a check node with the variable nodes that participate in its parity equation. The Tanner graph for the parity check matrix from Fig. 1 is depicted in Fig. 2. The main purpose of the Tanner graph is to visualize the LDPC code structure in order to support the definition and visualization of various LDPC decoding schemes. More details regarding the LDPC code structure and decoding are provided by [9].
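The neighborhoods N(row) and M(col) and the Tanner graph edges can be read directly off the parity check matrix. The following is a minimal sketch; the matrix `H` below is a hypothetical (6, 3) example chosen for illustration and is not the matrix from Fig. 1, and the function names are assumptions.

```python
# Hypothetical (6, 3) parity check matrix; NOT the matrix from Fig. 1.
# Row 2 is chosen so that it connects variable nodes 2, 3 and 6.
H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [1, 0, 1, 0, 1, 0],
]

def check_neighborhood(H, row):
    """N(row): 1-based indices of variable nodes in parity equation `row`."""
    return {col + 1 for col, bit in enumerate(H[row - 1]) if bit}

def variable_neighborhood(H, col):
    """M(col): 1-based indices of check nodes that involve variable node `col`."""
    return {row + 1 for row, line in enumerate(H) if line[col - 1]}

def tanner_edges(H):
    """Edges (check node, variable node) of the Tanner graph."""
    return [(r + 1, c + 1)
            for r, line in enumerate(H)
            for c, bit in enumerate(line) if bit]

print(check_neighborhood(H, 2))     # variable nodes attached to check node 2
print(variable_neighborhood(H, 4))  # check nodes attached to variable node 4
```

Every nonzero entry of H corresponds to exactly one edge, so the sparsity of H directly bounds the number of messages exchanged per decoder iteration.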

LDPC Code Shortening
The main idea of LDPC code shortening is similar to the known methods of RS code shortening [1]: the selected bits of the data words are assumed fixed, usually but not necessarily set to zero. Their value is defined in advance before the encoding in the transmitter and is also known in advance to the receiver. These bits cannot be used to transmit useful information. Their purpose is merely to enable the use of a code with given parameters (n, k) in a situation where a code with different parameters (n', k') would be more appropriate. Fixed bits are inserted into the data stream, and after encoding they are discarded so that they are not transmitted at all. This transformation is widely used with RS codes to overcome their limitations, where the codeword length n is bound to the size of the underlying Galois Field [1]. This paper elaborates on the novel idea of using a similar technique with LDPC codes [2], [3]. In this context, the proposed idea can be used for the practical purpose of system complexity reduction, and also for a potential improvement of the decoder error performance. These two potential improvements are analyzed in the following sections.
The focus of this section is a thorough description of the technical steps necessary to implement the proposed modification. The channel model can be a Binary Symmetric Channel (BSC) if a hard decision is made before decoding (used with bit-flipping decoders) or an Additive White Gaussian Noise (AWGN) channel for a more advanced Soft-Input Soft-Output (SISO) log-likelihood BP decoding. The model shown in Fig. 3 is a simple standard one and uses the usual symbol notation [9]. The modified LDPC code shortening process in the transmitter and the corresponding inverse process in the receiver, described in the next subsection, are referred to as Algorithm A1.

Algorithm A1:
1. TX: Set part of the data bits to zero in the transmitter before encoding. These bits are to be called fixed bits, with f denoting their number.
2. TX: Encode the whole data word with a systematic LDPC code.
3. TX: Discard the redundant filler bits of the systematic part of the codeword (the all-zero part).
5. RX: Insert appropriate filler values into the systematic part of the received noisy codeword in the receiver.
6. RX: Decode the noisy codeword using the original decoder with unchanged parameters.
7. RX: Discard the redundant filler values.
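The transmitter and receiver bookkeeping of Algorithm A1 can be sketched as follows. This is only an illustration under stated assumptions: the tiny systematic (6, 3) code defined by `PARITY_ROWS` is hypothetical and merely stands in for a real LDPC encoder, the filler bits are assumed to occupy the end of the data word, and the decoder itself is left out.

```python
# Hypothetical systematic (6, 3) code: parity bit i is the XOR of the data
# bits selected by row i of PARITY_ROWS. It only stands in for a real encoder.
PARITY_ROWS = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
K, N = 3, 6
F = 1  # number of fixed (filler) bits, assumed to sit at the end of the data word

def encode(data):
    """Systematic encoding: codeword = data bits followed by parity bits."""
    parity = [sum(d & p for d, p in zip(data, row)) % 2 for row in PARITY_ROWS]
    return data + parity

def tx_shorten(user_bits):
    data = user_bits + [0] * F              # TX: insert the fixed zero bits
    codeword = encode(data)                 # TX: encode with the unmodified encoder
    return codeword[:K - F] + codeword[K:]  # TX: drop the filler bits

def rx_prepare(received_llrs, filler_llr=float("inf")):
    """RX: re-insert filler LLRs (bit 0 known with full confidence)."""
    return received_llrs[:K - F] + [filler_llr] * F + received_llrs[K - F:]

def rx_strip(decoded_data):
    """RX: discard the filler values after decoding."""
    return decoded_data[:K - F]

print(tx_shorten([1, 0]))  # [1, 0, 1, 0, 1]
```

Only N - F symbols are transmitted, while the encoder and the decoder keep operating on full-length (N, K) words, which is why the existing implementations need almost no modification.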
For convenience, the process is also depicted in Fig. 4, with focus on visualizing the insertion and discarding of the filler bits before and after LDPC encoding, along with the equivalent processes in the receiver. The left side of the picture shows standard system operation without code shortening, while the right side shows the process flow in a system implementing the described Algorithm A1.
Code shortening has some important implications. With respect to the useful data being transmitted, the new code has a smaller code rate: if the code rate of the original code is $R_c = k/n$, then by using f filler bits, the new code rate $R'_c$ will be equal to

$$R'_c = \frac{k - f}{n - f}. \qquad (1)$$

When comparing two codes similar in structure to the extent that all other concerns can be regarded as insignificant, it is important to realize that the code with the smaller code rate should perform better in terms of error ratio, more specifically in the higher Signal-to-Noise Ratio (SNR) range of the waterfall curve, where the coding gain offsets the drop of $E_b$ (energy per bit) associated with the code rate drop (assuming the data rate remains constant).
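A shortened code carries k - f data bits in n - f transmitted bits, so its rate follows directly from the original parameters. The short sketch below evaluates this with exact fractions; the function name is an assumption, and the numbers are the (864, 720) basis code and f = 288 used as an example later in the paper.

```python
from fractions import Fraction

def shortened_rate(n, k, f):
    """Code rate after shortening an (n, k) code by f fixed data bits."""
    if not 0 <= f < k:
        raise ValueError("f must satisfy 0 <= f < k")
    return Fraction(k - f, n - f)

print(Fraction(720, 864))             # 5/6, rate of the original code
print(shortened_rate(864, 720, 288))  # 3/4, rate after shortening
```

Because the same f is subtracted from both numerator and denominator, the rate always decreases for f > 0.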

LDPC Decoding
At first glance it would seem that the fixed filler bits hide a potential not only for improving the error correction capability of the code, but of the decoding as well. The error correction properties of the code itself have been analyzed in detail in publications [2] and [3]. However, the decoding algorithm is a slightly different topic from the code itself. Therefore it is reasonable to evaluate this possibility in detail. The intuitive approach unfolds as follows: since the fixed filler bits are not transmitted at all, no channel noise affects them, and so the perfect values of the received symbols (+1 or -1 when using Binary Phase-Shift Keying, BPSK) can be inserted in the receiver. When computing the channel LLR values, the variance of the channel noise for these symbols, $\sigma_n^2$, is zero, which results in infinite values of their LLR metric:

$$z = \frac{2r}{\sigma_n^2} \rightarrow \pm\infty \quad \text{for } \sigma_n^2 \rightarrow 0, \qquad (2)$$

where r is the received channel sample as shown in Fig. 3. The infinite confidence seems promising, since based on the BP principle one might expect that this confidence will improve the decoding itself, by propagating some of this confidence to other variable nodes containing usable data. While this intuition seems interesting, a detailed analysis provided in the following subsection reveals that this improvement does not occur.
On the other hand, as our simulations in later sections will also confirm, there is indeed an improvement in the waterfall curve when using fixed nodes. However, this improvement originates from the lower code rate $R'_c$, more specifically from the fact that there is a relatively higher number of parity bits per actually transmitted data bit. (In absolute values, fewer data bits are transmitted while the number of parity bits remains the same.)
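The channel LLR initialization for transmitted and fixed bits can be sketched as below. This is a minimal illustration assuming the common BPSK mapping of bit 0 to +1 and bit 1 to -1 and the standard AWGN channel LLR 2r/σ²; the function name is hypothetical.

```python
import math

def channel_llr(r, noise_var):
    """LLR of bit 0 versus bit 1 for BPSK (0 -> +1, 1 -> -1) over AWGN."""
    if noise_var == 0.0:
        # Noiseless (fixed) symbol: the bit value is known with infinite confidence.
        return math.copysign(math.inf, r)
    return 2.0 * r / noise_var

print(channel_llr(0.8, 0.5))  # finite LLR of a transmitted, noisy symbol
print(channel_llr(1.0, 0.0))  # inf: fixed filler bit with value 0
```

In a fixed-point decoder the infinite value would typically be replaced by the largest representable magnitude; that substitution is a design choice outside the scope of this sketch.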

Analysis of Min-Sum Decoding
While the intuition regarding the infinite confidence in the fixed filler nodes would suggest the propagation of this confidence to other nodes, thus improving the overall decoding, this section contradicts the intuition by means of a worked example. First we review the operation of the prominent and most widely used BP-based (Sum-Product and Min-Sum) SISO decoding algorithms [9], [10]. The decoding is iterative, where each iteration consists of two steps: the horizontal step, conforming to one parity equation, in which variable nodes send their extrinsic information $L_{row,col}$ to other variable nodes using the interconnections defined by a check node; and the vertical step, where these messages are collected across all parity equations defined by the code structure to form a final posterior LLR estimate.
The Sum-Product algorithm defines the horizontal step equation [9]:

$$L^{(j)}_{row,col} = 2 \tanh^{-1} \left( \prod_{col' \in N(row) \setminus col} \tanh \left( \frac{Z^{(j-1)}_{row,col'}}{2} \right) \right), \qquad (3)$$

with symbols N(row) and M(col) already defined in Sec. 2; (j) is the index of the decoding iteration, and symbols $L_{row,col}$ and $Z_{row,col}$ are the messages exchanged between variable nodes. $Z_{row,col}$ is the log-likelihood ratio that the col-th bit of the input data has the value 0 versus 1, given the information obtained via the check nodes other than check node row. $L_{row,col}$ is the LLR that the condition for check node row is satisfied when the input data bit col is fixed to value 0 versus value 1 and the other bits are independent with LLRs $Z_{row,col'}$, $col' \in N(row) \setminus col$ [10]. It is necessary to realize that equation (3) is really a shorthand for many equations: one for each check node and each incident variable node set. This is elaborated in further subsections.
For the majority of practical decoders this equation is approximated by a computationally much simpler equation omitting the expensive hyperbolic tangent functions [9]:

$$L^{(j)}_{row,col} = \prod_{col' \in N(row) \setminus col} \operatorname{sgn} \left( Z^{(j-1)}_{row,col'} \right) \cdot \min_{col' \in N(row) \setminus col} \left| Z^{(j-1)}_{row,col'} \right|. \qquad (4)$$

The second step in the iterative LDPC decoding algorithm is the simple vertical (per variable node) summation of extrinsic messages to produce the final posterior LLR estimate:

$$Z^{(j)}_{col} = z_{col} + \sum_{row \in M(col)} L^{(j)}_{row,col}. \qquad (5)$$

The effect of the infinite confidence of fixed-value nodes on the Min-Sum decoding procedure can be nicely demonstrated by a simple example E1 in the following subsection.
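The two horizontal-step rules and the vertical summation described above can be compared numerically. The sketch below is illustrative only: the function names and the input LLR values are assumptions, and each function computes a single message from the LLRs of the other variable nodes in one parity equation.

```python
import math

def sum_product_check(z_others):
    """Tanh-rule message computed from the LLRs of the other variable nodes."""
    prod = 1.0
    for z in z_others:
        prod *= math.tanh(z / 2.0)
    # Clamp so that atanh stays finite when all inputs are very confident.
    prod = max(min(prod, 1.0 - 1e-15), -1.0 + 1e-15)
    return 2.0 * math.atanh(prod)

def min_sum_check(z_others):
    """Min-Sum approximation: product of signs times the smallest magnitude."""
    sign = 1.0
    for z in z_others:
        sign = -sign if z < 0 else sign
    return sign * min(abs(z) for z in z_others)

def posterior(z_channel, incoming):
    """Vertical step: channel LLR plus all incoming check-node messages."""
    return z_channel + sum(incoming)

others = [1.5, -0.7, 3.0]          # hypothetical incoming LLRs
print(sum_product_check(others))   # exact rule
print(min_sum_check(others))       # approximation: -0.7
```

Both rules agree on the sign of the message; the Min-Sum magnitude is never smaller than the Sum-Product one, which is the price paid for avoiding the hyperbolic tangent.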

Example E1
For the LDPC code defined in Fig. 1 and the second row of the parity check matrix H (row = 2), equation (4) can be rewritten as follows. First, let variable nodes $v_2$ and $v_3$ contain real channel observations so that their LLR metrics, denoted $z_2$ and $z_3$, will be finite values. On the other hand, let $v_6$ be a fixed node with infinite absolute confidence $z_6$. This translates to the messages propagated between the variable nodes in the horizontal step. Substituting row = 2 into (4) really defines three equations, one for each variable node in the role of receiver of the message $L_{row,col}$. For simplicity, we now ignore the signs of the messages:

$$\left| L^{(1)}_{2,col} \right| = \min_{col' \in N(2) \setminus col} \left| Z^{(0)}_{2,col'} \right|, \quad N(2) = \{2, 3, 6\}. \qquad (6)$$

Values $Z_{row,col}$ are initialized to the channel observation $z_{col}$ before the first iteration, which makes all values $Z_{row,6} = z_6$ infinite.
Further, by substituting the values of col we get three equations:

$$\left| L^{(1)}_{2,2} \right| = \min \left( |z_3|, |z_6| \right) = \min \left( |z_3|, \infty \right) = |z_3|, \qquad (7)$$

$$\left| L^{(1)}_{2,3} \right| = \min \left( |z_2|, |z_6| \right) = \min \left( |z_2|, \infty \right) = |z_2|, \qquad (8)$$

$$\left| L^{(1)}_{2,6} \right| = \min \left( |z_2|, |z_3| \right). \qquad (9)$$

From (7) to (9) it is now clear that whenever the infinite value of $Z_{row,6}$ enters the minimum operator, it will be discarded in favor of the smaller amplitude values. Thus the infinite confidence of the fixed nodes, which intuitively seemed very promising in delivering potentially extra error correcting capability, is effectively discarded right in the first half (horizontal step) of the first decoder iteration.
The horizontal step is followed by the vertical summation (5) for each variable node, independent of other variable nodes. Therefore the variable $Z_6^{(1)}$ remains infinite, while all others $Z_{col}^{(1)}$ remain unaffected by the infinite confidence of $Z_6^{(0)}$. This remains true for all other iterations, regardless of their number.
Example E1 can be easily generalized for any LDPC code, provided that the degree of each check node is larger than 1, which is very much the case for all practical LDPC codes. This shows how and why the infinite confidence doesn't really translate into improved decoding under the Min-Sum algorithm, which is later confirmed by simulations.
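Example E1 can be reproduced numerically with a few lines of code. The sketch below performs the Min-Sum horizontal step for a single check node connected to $v_2$, $v_3$ and the fixed node $v_6$; the LLR values 1.2 and -0.4 are hypothetical, and the function name is an assumption.

```python
INF = float("inf")

def min_sum_row(z_row):
    """Horizontal Min-Sum step for one check node.

    z_row maps variable-node index -> incoming LLR Z_{row,col}.
    Returns variable-node index -> outgoing message L_{row,col}.
    """
    out = {}
    for col in z_row:
        others = [z for c, z in z_row.items() if c != col]
        sign = -1.0 if sum(z < 0 for z in others) % 2 else 1.0
        out[col] = sign * min(abs(z) for z in others)
    return out

# Check node row = 2 with neighbors v2, v3 (noisy) and v6 (fixed to bit 0).
z = {2: 1.2, 3: -0.4, 6: INF}
L = min_sum_row(z)
print(L)  # {2: -0.4, 3: 1.2, 6: -0.4}

# The same check node with v6 removed gives identical messages to v2 and v3.
print(min_sum_row({2: 1.2, 3: -0.4}))  # {2: -0.4, 3: 1.2}
```

None of the outgoing messages is infinite, and the messages to $v_2$ and $v_3$ are the same as if $v_6$ were not connected at all, which matches the conclusion drawn from the example.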

Practical Considerations
The simple design presented in the previous section has some interesting implications for communication stack implementations on a resource-constrained SoC. One of the goals of a modern Physical (PHY) layer implementation is to provide a good Adaptive Coding and Modulation (ACM) scheme that responds to dynamically changing mobile channel conditions by adjusting the parameters of the error control code and the modulation scheme, in order to provide a consistent Bit Error Rate (BER) and Frame Error Rate (FER) to upper layers. In industry-wide communication standards such as IEEE 802.16e [5] and IEEE 802.11ac [6] this is achieved by specifying several codes along with the set of their supported parameters. For instance, there are 6 different LDPC codes with 4 different code rates specified for use in [5], along with a set of 19 different codeword lengths. Each one of the codes is defined by its parity check matrix, which needs to be stored in memory. Given the limited resources of a SoC, this may be a considerable problem. In this section we analyze the possibility of implementing just one of the LDPC codes and using the proposed method to obtain codes with different parameters by shortening the single implemented code. The main purpose is to analyze all code parameters given in a communication standard and to evaluate what percentage of these codes can be replaced. A second goal is to provide a simple tabular overview of exactly which code parameters can be implemented in this way.
First it is necessary to review the structure of LDPC codes used in modern communication standards. The H matrix is quite large, with the number of variable nodes going into the thousands. Therefore a compressed form of the H matrix is usually required. The structure of the H matrix is defined by its partitioning into smaller square submatrices P, each either all-zero or a rotated identity matrix. The H matrix can be easily written in the compressed form as shown in Fig. 5 [5]. The whole code structure can then be expressed by a much smaller integer-valued compact model matrix $H_{b,p}$, where each integer value represents a circular shift in the appropriate rotation matrix P. Six different LDPC codes and 19 different codeword lengths are standardized in [5]. Table 1 provides a basic overview of the codes, along with their compressed model matrix sizes. The different codeword (and also dataword) lengths are implemented by expanding these matrices by a factor z ranging from 24 to 96. This defines a total set of 114 similar LDPC codes with a very flexible range of parameters.
Tab. 1. Overview of basic codes in standard [5]. The minimum and maximum code sizes are determined by possible values of the expansion factor z.
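The expansion of a compact model matrix into the full binary H matrix can be sketched as follows. The toy 2 x 3 model matrix, the use of -1 to mark an all-zero submatrix, and the shift direction are assumptions made for illustration; they are not taken from the tables of [5].

```python
def expand(model, z):
    """Expand an integer model matrix into a binary H with z x z submatrices.

    A negative entry marks an all-zero submatrix (assumed convention);
    a non-negative entry s marks an identity matrix cyclically shifted by s.
    """
    rows, cols = len(model), len(model[0])
    H = [[0] * (cols * z) for _ in range(rows * z)]
    for br, block_row in enumerate(model):
        for bc, shift in enumerate(block_row):
            if shift < 0:
                continue  # all-zero submatrix, nothing to set
            for i in range(z):
                H[br * z + i][bc * z + (i + shift) % z] = 1
    return H

model = [[0, 1, -1], [2, -1, 0]]  # toy 2 x 3 model matrix
H = expand(model, 3)
for row in H:
    print(row)
```

Only the six integers of the model matrix need to be stored instead of the 54 entries of H; for the matrix sizes used in practice the saving is correspondingly larger.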

While the flexibility of code parameters is appreciated, practical applications can perform quite well with only a small subset of this very large code parameter range. For the actual full-size matrix H, parameters k and n are determined from the size of the associated model matrix $H_{b,p}$ and the size of the expansion factor z by (11):

$$n = p \cdot z, \qquad k = (p - b) \cdot z, \qquad (11)$$

where b and p are the numbers of rows and columns of the model matrix $H_{b,p}$, and the conceptual expansion operation was described before. The possibility of implementing various LDPC code parameters by using just one code (with code rate $R_{cb}$ = 5/6) was analyzed, with a complete summary of results covering all the 114 codes provided in Tab. 2. The table is organized in four composite columns, each one giving code parameters n, k, (n - k) or f of a target code, with target code rates $R_c$ = {5/6, 3/4, 2/3, 1/2}. The purpose of this table is to provide information on whether or not the target code can be implemented by shortening the standard-defined highest code rate code ($R_{cb}$ = 5/6) defined in the first composite column. If it can be replaced, the value f is also nonzero and specifies the number of filler bits by which the basis code must be shortened.
For example, the first code of rate $R_c$ = 3/4 has parameters n = 576 and k = 432, defined by the value z = 24. As indicated in the table, this can be replaced by shortening a basis code with $R_{cb}$ = 5/6 and parameters $n_s$ = 864 and $k_s$ = 720. This must be shortened by f = 288 bits to get the desired (576, 432) code. Since such a basis code is part of the standard (such code parameters exist in the first composite column), it is possible to replace the original code with the shortened version of the basis code. The second code of rate $R_c$ = 3/4 has parameters n = 672 and k = 504, defined by the value z = 28. To replace this code with an equivalent code, a basis $R_{cb}$ = 5/6 code with codeword size $n_s$ = 1008 and $k_s$ = 840, shortened by f = 336 bits, would have to be used. However, such a code doesn't belong to the set of standard-mandated codeword sizes, and therefore this $R_c$ = 3/4 code cannot be replaced. This is indicated by setting the values of $n_s$, $k_s$ and f in the second row of the second composite column in Tab. 2 to zeros.
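The arithmetic behind the first example can be written out explicitly. The lines below only restate the numbers given in the text and check that the shortened basis code has exactly the target parameters:

```latex
\begin{align*}
n_s - k_s &= 864 - 720 = 144 = 576 - 432 = n - k, \\
f &= k_s - k = 720 - 432 = 288, \\
n_s - f &= 864 - 288 = 576 = n, \\
R'_c &= \frac{k_s - f}{n_s - f} = \frac{432}{576} = \frac{3}{4}.
\end{align*}
```

The same check applied to the second example gives f = 840 - 504 = 336 and $n_s - f$ = 1008 - 336 = 672, so the parameters would match; the replacement fails only because a basis code with $n_s$ = 1008 is not among the standardized codeword sizes.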
Given a target code with a code rate smaller than the base rate $R_{cb}$, an appropriate basis code with $R_{cb}$ = 5/6 can be found by a simple algorithm referred to as Algorithm A2.

Algorithm A2:
1. Compute the values n, k, (n - k) based on the model matrix dimensions given in Tab. 1 and the expansion factor z.
2. Find compatible basis code parameters in the first composite column of Tab. 2 by comparing the (n - k) column.
3. If such a code exists in the first composite column of Tab. 2, use its parameters $n_s$ and $k_s$. Compute the number of filler bits f by subtracting k from $k_s$.
4. If such a code doesn't exist, indicate this by setting the code parameters to zero.
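The steps of Algorithm A2 can be sketched in code as follows. Several values are assumptions inferred from numbers quoted in the text rather than read from Tab. 1: a model matrix with p = 24 columns (n = 576 at z = 24), b = 4, 6, 8 and 12 rows for the four code rates, and 19 expansion factors from 24 to 96 in steps of 4. The function and constant names are hypothetical.

```python
P = 24                                         # model matrix columns (assumed)
B = {"5/6": 4, "3/4": 6, "2/3": 8, "1/2": 12}  # model matrix rows per rate (assumed)
Z_VALUES = range(24, 97, 4)                    # 19 expansion factors (assumed step of 4)

def code_params(rate, z):
    """n and k of the full-size code for a given rate and expansion factor."""
    return P * z, (P - B[rate]) * z

# Basis codes of rate 5/6, indexed by their number of parity bits n - k.
BASIS = {n - k: (n, k) for n, k in (code_params("5/6", z) for z in Z_VALUES)}

def find_basis(rate, z):
    """Algorithm A2: return (n_s, k_s, f), or (0, 0, 0) if no basis code fits."""
    n, k = code_params(rate, z)
    if (n - k) not in BASIS:
        return (0, 0, 0)
    n_s, k_s = BASIS[n - k]
    return (n_s, k_s, k_s - k)

print(find_basis("3/4", 24))  # (864, 720, 288)
print(find_basis("3/4", 28))  # (0, 0, 0)

covered = sum(find_basis(r, z)[0] != 0 for r in B for z in Z_VALUES)
print(covered, len(B) * len(Z_VALUES))  # 35 76
```

Under these assumptions the sketch reproduces both worked examples from the text and the count of 35 covered combinations out of 76; the 19 rate-5/6 codes are counted as covered with f = 0, since they are the basis codes themselves.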
Tab. 2 provides the results of these computations for all the standard-defined codes. Out of the 76 combinations of code rate and codeword length, 35 can be implemented using only one of the 6 specified codes: the basis code with $R_c$ = 5/6. That means 46% of the code parameters required by the standard are covered while saving design complexity. This seems like a reasonable tradeoff to be considered in future standards.
Nonzero values in Tab. 2 give the parameters of the basis $R_{cb}$ = 5/6 code that must be used to obtain a code with equivalent parameters, including the number of bits to shorten, f. All desired code parameters are defined by the expansion factor z.

Simulation Results
As already mentioned in the previous sections, the infinite confidence of the filler nodes can lead to a wrong expectation of improved error performance. This was already demonstrated to be wrong for the Min-Sum decoding algorithm and further analyzed from the code structure perspective in [2], [3]. Effects of shortening on different decoding schemes have to be analyzed separately. This analysis is to be performed in our future work. Despite this claim, there is indeed some shift in the waterfall curve present if the simulation is not designed carefully. This stems from the very simple fact that when a code is shortened by f data bits while the number of parity bits is kept intact, the code rate of the new shortened code is lower than the code rate of the original code. This was already explicitly stated in equation (1). Therefore, in the following simulations, codes with the same parameters are compared: one original code and one shortened code with the same parameters after shortening. Further simulation settings are classical: we compare the performance of a BPSK system in an AWGN channel under the same Min-Sum decoding algorithm with 5 decoder iterations, without any special decoder optimization such as layered decoding. No optimized selection of the meaningless bit positions to be filled with zeros was implemented: the zero fill is one continuous block, as shown in Fig. 4. Even without any optimization of fill positions, the performance of the two codes is the same.
As shown in Fig. 7, the resulting waterfall curves almost overlap again.The minor difference comes from the slightly different code structures and may be considered negligible.This means that the shortened code is just as good as the original one, which is not surprising.
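The reason for comparing codes with equal parameters can be seen from how the noise level of such a simulation is usually derived. The sketch below assumes unit-energy BPSK symbols (bit 0 mapped to +1) and the common relation σ² = 1 / (2 R_c E_b/N_0); it is not the authors' simulation code, and all names and parameter values are hypothetical.

```python
import math
import random

def noise_variance(ebn0_db, code_rate):
    """Noise variance per real dimension for unit-energy BPSK symbols."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return 1.0 / (2.0 * code_rate * ebn0)

def transmit(bits, ebn0_db, code_rate, rng):
    """BPSK (0 -> +1, 1 -> -1) over AWGN; returns the channel LLRs."""
    var = noise_variance(ebn0_db, code_rate)
    sigma = math.sqrt(var)
    return [2.0 * ((1 - 2 * b) + rng.gauss(0.0, sigma)) / var for b in bits]

rng = random.Random(1)  # fixed seed for repeatability
llrs = transmit([0, 1, 1, 0], ebn0_db=3.0, code_rate=3 / 4, rng=rng)

# At the same Eb/N0, a lower-rate code sees a larger noise variance per symbol.
print(noise_variance(3.0, 5 / 6), noise_variance(3.0, 3 / 4))
```

Two codes with identical (n, k) see exactly the same noise variance at a given E_b/N_0, so any remaining difference between their waterfall curves can be attributed to the code structure alone.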

Conclusion
In this paper, we have shown how the time-proven method of code shortening, used in the area of RS codes, can also be successfully applied in the novel context of LDPC decoding. We expanded the existing analysis of the effects of shortening on code structure [2], [3] by providing a simple analysis of the effects of this code modification scheme on practical LDPC decoding algorithms, such as Min-Sum. A detailed analysis of the potential simplification of the LDPC code set currently used in modern communication standards is also provided, with results tabulated. Our analysis is then complemented by simulation results, confirming the theoretically expected effects of the proposed modification on system error performance.

Fig. 2. Tanner graph for the code defined by H matrix in Fig. 1.

Fig. 4. Principle of LDPC code shortening by fill and removal of meaningless filler bits fixed in value. Standard system (left). System with LDPC shortening (right). The "n." abbreviates "noisy": for SISO decoding this means LLR values constructed from noisy channel observations. $R_c$: code rate.

Fig. 5. Compressed format of the H matrix using rotation submatrices P; b: number of submatrix rows, p: number of submatrix columns.

Figure 6 provides a comparison between the $R_c$ = 3/4A code and an $R_c$ = 5/6 code shortened so that the parame-
About the Authors...
Tomáš PÁLENÍK (Ing., Ph.D.) was born in 1980. He received his Ing. degree in Telecommunications in 2006 from the Faculty of Electrical Engineering and Information Technology, Slovak University of Technology in Bratislava (FEI STU), Slovakia. In 2010 he received his Ph.D. degree in Telecommunications, also from FEI STU. Currently, he is an assistant at the Dept. of Telecommunications, FEI STU in Bratislava, Slovakia. He is a Member of the IEEE. His research interests include digital communication systems simulations, Orthogonal Frequency Divi-

Tab. 2. Potential for implementation of the standard-defined codes with different code rates with a single code and the proposed shortening scheme. Zero values indicate that these code parameters cannot be implemented by shortening of the basic $R_{cb}$ = 5/6 code described in Sec. 4.