Performance of Turbo Interference Cancellation Receivers in Space-Time Block Coded DS-CDMA Systems

We investigate the performance of turbo interference cancellation receivers in the space-time block coded (STBC) direct-sequence code division multiple access (DS-CDMA) system. Depending on the concatenation scheme used, we divide these receivers into the partitioned approach (PA) and the iterative approach (IA) receivers. The performance of both the PA and IA receivers is evaluated in Rayleigh fading channels for the uplink scenario. Numerical results show that the MMSE front-end turbo space-time IA receiver effectively combats the mixture of multiple access interference (MAI) and intersymbol interference (ISI). To further investigate the achievable data rates in the turbo interference cancellation receivers, we introduce puncturing of the turbo code through the use of rate compatible punctured turbo codes (RCPTCs). Simulation results suggest that combining interference cancellation, turbo decoding, STBC, and RCPTC can significantly improve the achievable data rates for a synchronous DS-CDMA system for the uplink in Rayleigh flat fading channels.


INTRODUCTION
The presence of multiple access interference (MAI) in CDMA systems has led many researchers to investigate ways of exploiting the MAI to improve the system performance. The optimum multiuser detector (MUD) proposed in [1], which consists of a maximum-likelihood sequence estimator (MLSE) based on the Viterbi decoding algorithm, has shown huge improvements over the conventional correlation receiver. Unfortunately, its computational complexity grows exponentially with the number of active users and the constraint length of the code, making any practical implementation prohibitive. Various suboptimum detectors have been proposed, which include, but are not limited to, the decorrelator, minimum mean squared error (MMSE), successive interference cancellation (SIC), and parallel interference cancellation (PIC) receivers [2,3].
The demand for higher system capacity and higher data rates has led researchers to the investigation of MIMO wireless systems [4]. STBC is particularly appealing because of its relative simplicity of implementation and the feasibility of multiple antennas at the base station, where the MIMO costs can be evenly shared by the system users [5]. When many users are in the system, strong MAI will occur. In this case, diversity processing alone cannot improve the system performance.
Joint detection and decoding in multiuser systems has been an active research area in recent years. Papers like [6] have investigated the performance of the optimum detector [1] combined with convolutional decoding. Due to the exponential complexity of the receiver in [1], the authors of [6] propose a suboptimal MUD with convolutional coding in [7]. By integrating a combination of various suboptimal MUDs with iterative channel decoding, the authors of [8] introduce a convolutionally coded iterative interference canceller.
The powerful error correction ability of turbo codes [9] has been combined with interference cancellation in [10] to produce the turbo interference cancellation detection approach. The work of [10] has been studied further in [11,12], with additional work being done in [13,14]. Though the above work investigates the combined MUD and error control coding performance, it does not investigate these in conjunction with diversity techniques. Recently, much work has been done on combining diversity techniques with MUD algorithms [15,16,17]. Some authors, such as those of [18], have proposed iterative MUD techniques using error control coding and antenna arrays, while in [19] a soft iterative multisensor array receiver for the coded MUD CDMA wireless uplink is proposed. Most recently, the work in [20] investigates the joint DS-CDMA space-time MUD system with error control coding over a multipath fading channel. The authors of [20] use convolutional coding for error control and a space-time MMSE detector at the receiver end. In [21], the authors of this paper investigate the performance of IA and PA schemes for a turbo coded asynchronous DS-CDMA system that employs space-time multiuser detection in a Rayleigh fading channel. However, as seen in [22], a non-MMSE front-end turbo receiver does not provide as much capacity gain as its MMSE front-end counterpart.
The objective of this paper is to investigate, through simulation, the performance of a synchronous turbo coded DS-CDMA system that transmits over a Rayleigh fading channel and employs an MMSE front-end turbo space-time multiuser detector at reception. We use an MMSE/PIC MUD coupled with STBC to achieve space-time multiuser detection. Depending on the concatenation scheme used, we divide these into MMSE front-end partitioned approach and MMSE front-end iterative approach receivers, hereafter referred to as PA and IA receivers, respectively. We further study these receivers in conjunction with rate compatible punctured turbo codes (RCPTC) in turbo space-time coded MIMO-CDMA systems and investigate possible ways of achieving higher data rates in the DS-CDMA uplink.
The remainder of this paper is organized as follows. In Section 2, we present the turbo space-time coded MIMO-CDMA system model. Section 3 presents MMSE space-time receivers for coded MIMO-CDMA systems. In Section 4, we present turbo space-time receivers that employ rate compatible punctured turbo codes. The numerical results are discussed in Section 5, and Section 6 concludes this paper.

TURBO SPACE-TIME CODED MIMO-CDMA SYSTEM
A MIMO-CDMA system that employs turbo codes and space-time block codes is investigated. The main focus is at the receiver end, where two multiuser receiver structures are investigated and compared. The turbo space-time MIMO-CDMA system depicted in Figure 1 is considered. The system has $K$ active users, with each $k$th user's data $b_k$, of duration $T_b$, being first encoded by a rate $r = 1/3$ turbo encoder, resulting in coded bits $d_k$.
The coded symbols are then passed through the channel interleaver. All the interleaved data is demultiplexed by the space-time demultiplexer (ST-Demux) into substreams. For the $k$th user, the demultiplexed symbols are then spread before transmission using that user's spreading sequence $c_k$ of duration $T_c$. All substreams are BPSK modulated.
Each user transmits its substream through $n_T$ transmit antennas. The transmitted data per symbol time can be described as [equation missing], where [definition missing]. Each transmit antenna, $m$, has an average transmitter power of $(B_k^m)^2$, where $(B_k)^2$ is the $k$th user's overall average power. It is assumed that all transmit antennas have equal transmit power of $(B_k^m)^2 = (B_k)^2 / n_T$. All the $n_T$ transmitted data streams for all $K$ users are combined during the wireless transmission process. A synchronous Rayleigh flat fading uplink MIMO-CDMA channel is considered.
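The per-user transmit chain described above (BPSK mapping, demultiplexing into $n_T$ substreams, equal power split, spreading) can be sketched as follows. This is only an illustration under stated assumptions: the function name `transmit_substreams`, the round-robin demultiplexing rule, and the bit mapping 0 to +1 and 1 to -1 are hypothetical choices, and the turbo encoder and channel interleaver are omitted.

```python
import numpy as np

def transmit_substreams(coded_bits, spreading_code, n_tx=2, total_power=1.0):
    """Map one user's coded bits to n_tx spread BPSK substreams.

    Returns an array of shape (n_tx, symbols_per_antenna, code_length).
    """
    bits = np.asarray(coded_bits)
    if bits.size % n_tx != 0:
        raise ValueError("number of coded bits must be a multiple of n_tx")

    # BPSK mapping (assumed convention): bit 0 -> +1, bit 1 -> -1.
    symbols = 1.0 - 2.0 * bits

    # Hypothetical ST-Demux: round-robin assignment of symbols to antennas.
    substreams = symbols.reshape(-1, n_tx).T

    # Equal power split: each antenna gets total_power / n_tx.
    amplitude = np.sqrt(total_power / n_tx)

    # Spread every symbol with the user's chip sequence.
    code = np.asarray(spreading_code, dtype=float)
    return amplitude * substreams[:, :, None] * code[None, None, :]
```

With `n_tx=2` and `total_power=1.0`, each antenna transmits with amplitude $\sqrt{1/2}$, so the per-antenna powers sum to the user's overall power.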

MMSE SPACE-TIME RECEIVERS FOR CODED MIMO-CDMA SYSTEMS
The received signal on the $\eta$th receiver antenna is given by [equation missing], where $c_k(t)$ is the $k$th user's spreading sequence, and $n_\eta(t)$ is the AWGN on the $\eta$th receiver antenna.
Here, $H_k^{\eta,m}$ represents the fading factor from the $k$th user's $m$th antenna to the $\eta$th receiver antenna. To express (4) in discrete form, we write $H$ as a $K n_R \times K n_T$ diagonal matrix whose elements are the submatrices $H_k$: [equation missing]. The MIMO-CDMA spreading matrix can be represented by an $N n_R \times N n_R$ matrix as [equation missing], where [definition missing]. Furthermore, the MIMO-CDMA amplitude matrix can be represented by a $K n_T \times K n_T$ matrix as [equation missing], where [definition missing]. The discrete-time representation of the received signal is expressed in the conventional matrix form as [equation missing].
Each of the $n_R$ receiver antennas captures the transmitted signals from the fading channel. The received signals are combined and despread by a bank of matched filters (MFs). The bank of MIMO MFs is matched to the corresponding user's signature waveform and also to the fading factors of all receiver antennas. The maximum-ratio combining (MRC) technique is used to combine all the MF outputs. This combining and despreading process is repeated for all $n_T$ transmit antennas.
The MIMO MF output is written as [equation missing], where $H$, $C$, and $B$ are given by (5), (7), and (9), respectively, and where [definition missing]. In (15), $y_k^{\mathrm{MF},m}$ represents the $k$th user's MF output for the signal received from transmit antenna $m$, given by [equation missing]. The correlation between the $k$th and $j$th user is [equation missing]. From (12), the combined correlation matrix can be expressed as [equation missing]. The MF output signals, $y^{\mathrm{MF}}$, are fed into the MMSE multiuser-antenna detector to suppress the MAI. The output of the MMSE multiuser-antenna detector is given by [equation missing], where $I$ is a $K n_T \times K n_T$ identity matrix.

Figure 2: MMSE front-end turbo space-time PA receiver structure.

The multiplexed signal $y_k^{\mathrm{MMSE}}$ is then deinterleaved before it is decoded by the turbo decoder. Here, $p$ decoder iterations may be performed before a hard decision is taken on the turbo decoder output. However, the focus of this work is the use of the MMSE space-time receiver in a turbo PIC receiver configuration.
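As a rough illustration of the MMSE stage, the sketch below applies the textbook linear MMSE filter to matched-filter outputs. It assumes the model $y^{\mathrm{MF}} = R d + n$ with unit-variance symbols, noise covariance $\sigma^2 R$, and a combined correlation matrix $R$ that already absorbs spreading, fading, and amplitudes, for which the filter is $(R + \sigma^2 I)^{-1}$. The function name `mmse_detect` is hypothetical, and this form is not confirmed to match the paper's own expression.

```python
import numpy as np

def mmse_detect(y_mf, corr, noise_var):
    """Linear MMSE estimate from matched-filter outputs.

    Assumes y_mf = corr @ d + n with cov(n) = noise_var * corr, so the
    MMSE filter is (corr + noise_var * I)^{-1}.
    """
    y_mf = np.asarray(y_mf)
    corr = np.asarray(corr)
    identity = np.eye(corr.shape[0])
    # Solve the linear system instead of forming the explicit inverse.
    return np.linalg.solve(corr + noise_var * identity, y_mf)


# Usage: two interfering streams with cross-correlation 0.3, no noise.
R = np.array([[1.0, 0.3], [0.3, 1.0]])
d = np.array([1.0, -1.0])
estimate = mmse_detect(R @ d, R, noise_var=0.0)
```

When `noise_var` is zero the filter reduces to the decorrelator, so `estimate` recovers `d` exactly in this noise-free example.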

MMSE front-end turbo space-time partitioned approach receiver
The MMSE front-end turbo space-time PA receiver for the MIMO-CDMA system is shown in Figure 2.
The outputs of the MMSE receiver are passed on to the PIC detector, where $p$ IC stages are performed on the multiplexed MMSE output signals $y_k^{\mathrm{MMSE}}$. After $p$ IC stages, the signals $y_{k,m}^{\mathrm{PIC},p}$ are multiplexed by the ST-Mux before being deinterleaved.
The PIC detection output after multiplexing is given by [equation missing].

MMSE front-end turbo space-time iterative approach receiver
The MMSE front-end turbo space-time IA receiver structure is shown in Figure 3. The PIC estimates the signal interference present on the received signal by reconstructing it from the data estimates $d_j^i$ and the cross-correlation values $\chi_{k,j}^{m,i}$ and removing it from the MMSE output signal (note: on the first iteration there will be no reconstructed estimates of the signal interference).
The output of the PIC detection process is given by [equation missing], where [definition missing]. The resultant signal $y^{\mathrm{PIC}}$ is expected to be improved after the reconstructed interference is subtracted from the $y^{\mathrm{MMSE}}$ signal. This signal is multiplexed and fed into the turbo decoder. A soft decision is taken on the decoded signal (which consists of both information and parity LLR values). These data estimates are demultiplexed by the ST-Demux to recover the space-time MIMO-CDMA form.
These demultiplexed data estimates are used in the MAI reconstruction process. The reconstructed interference is subtracted from the $y^{\mathrm{MMSE}}$ signal on the next iteration. This iterative process is repeated for $p$ iterations.
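One cancellation step of the kind described above can be sketched as follows, under simplifying assumptions: real-valued signals, a correlation matrix whose off-diagonal entries are the cross-correlation values, and soft symbol estimates obtained from decoder LLRs as $\tanh(\mathrm{LLR}/2)$ (with the convention that a positive LLR favors +1). The function names `soft_symbols` and `pic_stage` are hypothetical.

```python
import numpy as np

def soft_symbols(llr):
    """Soft BPSK symbol estimates from LLRs (positive LLR favors +1)."""
    return np.tanh(np.asarray(llr, dtype=float) / 2.0)

def pic_stage(y_front_end, cross_corr, symbol_estimates):
    """Subtract reconstructed interference from the front-end output."""
    cross_corr = np.asarray(cross_corr, dtype=float)
    # Keep only the cross terms: a stream does not cancel itself.
    off_diagonal = cross_corr - np.diag(np.diag(cross_corr))
    interference = off_diagonal @ np.asarray(symbol_estimates, dtype=float)
    return np.asarray(y_front_end, dtype=float) - interference
```

On the first iteration the estimates are all zero, so `pic_stage` returns its input unchanged, which mirrors the note that no reconstructed interference is available at that point. In the IA receiver the estimates would be refreshed from the turbo decoder output before each subsequent call.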

TURBO SPACE-TIME RECEIVERS WITH RATE COMPATIBLE PUNCTURED TURBO (RCPT) CODES
The RCPT encoder turbo encodes the input data sequence of length $L_{\mathrm{in}}$ into a coded sequence of length $L_{\mathrm{out}}$. The length of the coded sequence $L_{\mathrm{out}}$ depends on whether the zero-termination bits (tail bits) used for trellis termination are included or not. $L_{\mathrm{out}}$ is given as [equation missing], where [definition missing] (25). The tail bits are included in this paper, since excluding them from the transmission can result in degradation in the MAP decoder performance and/or increased delay in iterative decoding [23].

For a rate $r = 1/M$ parent encoder, a family of higher rate codes is given by

$$r_l = \frac{P}{P + l}, \qquad l = 1, \ldots, (M - 1)P,$$

where $P$ is called the puncturing period. These codes are constructed by employing an $M \times P$ puncturing matrix $P_M(l)$, which indicates the subblocks to be transmitted. An entry of 1 in $P_M(l)$ indicates a column to be transmitted, where the first row of $P_M(l)$ refers to the systematic matrix and the subsequent rows (i.e., 2 to $M$) refer to the parity matrices from the constituent encoders RSC1 to RSC$(M - 1)$. We consider an example of a rate 1/3 turbo encoder with two rate 1/2 RSC encoders and a puncturing period $P = 4$:

$$P_3(2) = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}.$$

From the first row of $P_3(2)$, we note that all $P = 4$ columns of systematic bits are sent. From the second row, only the third column of RSC1's parity bits is sent, and from the last row, only the first column of RSC2's parity bits is sent. The reader is referred to [23] for a complete list of possible puncturing tables for different turbo code generators, and their derivation.
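The $P = 4$ example can be sketched in code. The pattern below is the one described in the text (all systematic columns, the third RSC1 parity column, the first RSC2 parity column); the function names, the row layout of the coded array, and the column-by-column transmission order are hypothetical choices made for illustration, and tail bits are ignored.

```python
from fractions import Fraction
import numpy as np

# Rows: systematic bits, RSC1 parity, RSC2 parity. Columns: one period (P = 4).
PATTERN = np.array([[1, 1, 1, 1],
                    [0, 0, 1, 0],
                    [1, 0, 0, 0]], dtype=bool)

def puncture(coded, pattern):
    """Keep only the bits whose pattern entry is 1.

    coded has shape (M, L): one row per encoder output stream.
    Bits are emitted time step by time step.
    """
    coded = np.asarray(coded)
    length = coded.shape[1]
    period = pattern.shape[1]
    repeats = -(-length // period)  # ceiling division
    mask = np.tile(pattern, (1, repeats))[:, :length]
    return coded.T[mask.T]

def punctured_rate(pattern):
    """Code rate: information bits per period over transmitted bits."""
    return Fraction(pattern.shape[1], int(pattern.sum()))
```

For this pattern `punctured_rate(PATTERN)` evaluates to 2/3: four information bits per period against six transmitted bits, which agrees with $P/(P + l)$ for $P = 4$ and $l = 2$.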
The optimal puncturing tables with puncturing period $P = 8$, given in [24, Table IV], are used to achieve the higher code rates.
If no parity symbols have been received for two or more RSC encoders, then iterative decoding will not be possible as the corresponding decoders will be excluded in the iterative process [23]. In order to take advantage of the iterative MAP decoders, more parity symbols will be transmitted, and the possibility of puncturing some of the systematic symbols arises [24].

NUMERICAL RESULTS
In this section, we consider the simulated performance of a synchronous turbo coded DS-CDMA system that employs an MMSE front-end turbo space-time multiuser detector at reception. The communication model considered consists of $K$ active users that transmit simultaneously and synchronously through a Rayleigh fading channel. Monte Carlo simulations are used to obtain the performance of the turbo receivers. The receivers all assume perfect knowledge of the channel state information. The maximum number of active system users is $K = 15$, and each user transmits an information frame of $L_{\mathrm{in}} = 1024$ data bits. The FEC code used is a rate $r = 1/3$ turbo code with a component encoder with generator polynomial $(7, 5)_{\mathrm{octal}}$. All spreading codes are of length $N = 15$ and are generated in a pseudorandom manner for each user.
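To give a feel for the loading in this setup, the sketch below draws pseudorandom spreading codes of length $N = 15$ for $K = 15$ users and forms their cross-correlation matrix. The binary $\pm 1$ chip alphabet, the unit-energy normalization, the seed, and the function names are assumptions; the source states only that the codes are pseudorandom.

```python
import numpy as np

def random_spreading_codes(num_users, code_length, seed=0):
    """Draw one pseudorandom +/-1 chip sequence per user, unit energy."""
    rng = np.random.default_rng(seed)
    chips = rng.choice([-1.0, 1.0], size=(num_users, code_length))
    return chips / np.sqrt(code_length)

def correlation_matrix(codes):
    """Synchronous cross-correlations between all pairs of users."""
    return codes @ codes.T


codes = random_spreading_codes(num_users=15, code_length=15)
R = correlation_matrix(codes)
```

The diagonal of `R` is 1 by construction, while the off-diagonal entries are the cross-correlations that give rise to MAI. With $K = N = 15$ there are as many users as chips per symbol, which is the heavily loaded case studied below.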
The uplink of the above system is considered with a maximum of 2 transmit antennas at the mobile station and a maximum of 2 receive antennas at the base station.

Comparison on simulated nonpunctured PA and IA receiver performances
For each approach, we perform four iterative cancellation stages (or joint cancellation stages in the case of the IA). This gives a fair comparison between the two systems in terms of complexity, as both are viewed as performing the same number of floating point operations per user per symbol; in-depth complexity issues are not discussed in this paper. Figure 4 shows the performance comparison of the PA and IA receivers over four receiver iterations for a system with $K = 5$ users. The results show that the IA achieves marginal gains in 4 iterations: for the 2 × 2 diversity system it reaches a BER of $10^{-3}$ at an SNR of 1.4 dB, while the PA receiver attains the same BER at an SNR of 1.6 dB. The IA advantage in terms of capacity for lightly loaded systems therefore seems to be very marginal. This observation holds even for the case of a no-diversity system. However, as the system load increases, the performance gains of the IA receiver become more obvious, as indicated in Figure 5. This graph shows the capacity performance of both IA and PA receivers in a 2 × 2 diversity system configuration evaluated over four receiver iterations. It can be noted that the IA receiver maintains a considerable capacity gain over the PA receiver for a BER performance of $10^{-3}$ at an SNR value of 2 dB for this diversity system configuration. A more in-depth look into the performance of both PA and IA receivers as a function of the number of iterations is shown in Figure 6. Worth noting are the observations made from Figure 6 for highly loaded systems: the PA receiver reaches an error floor just under a BER of $10^{-1}$, and no number of additional iterations can improve the performance of this receiver. In contrast, the IA receiver in a highly loaded system shows that further performance improvement is attainable with an increase in the number of iterations.

Comparison on simulated punctured PA and IA receiver performances
In this section, we investigate the RCPTC scheme based on a rate $r = 1/3$ mother code for a Rayleigh fading channel model. The data bits of each user for the rate $r = 1/3$ encoder are punctured according to [24, Table IV] with puncturing period $P = 8$. For performance evaluation purposes, we consider values of $l = 2$, 8, and 16, thus giving rates $r = 4/5$, 1/2, and 1/3, respectively. These code rates are adopted for the PA and IA receivers. Since full rate space-time block codes are being used, they do not affect the overall code rate of either system; the puncturing pattern used therefore determines the total system code rate. Furthermore, we assume that the effects of puncturing on the overall system complexity are negligible. This assumption can be justified by reasoning that puncturing merely involves the removal of a subset of the encoded bits at transmission and the addition of dummy bits at the receiver end.
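The rate arithmetic and the receiver-side insertion of dummy bits can be sketched as follows. The relation $r = P/(P + l)$ reproduces the three rates quoted above. Since the tables of [24, Table IV] are not reproduced here, the example pattern is an illustrative period-4 one, and using zero as the dummy value (a soft value carrying no information) is an assumption, as are the function names.

```python
from fractions import Fraction
import numpy as np

def code_rate(period, extra_parity_columns):
    """Rate of the punctured code: P / (P + l)."""
    return Fraction(period, period + extra_parity_columns)

def depuncture(received, pattern, length):
    """Re-insert dummy zeros where coded bits were punctured.

    received: soft values in transmission order (time step by time step).
    Returns an array of shape (M, length), one row per encoder stream.
    """
    pattern = np.asarray(pattern, dtype=bool)
    repeats = -(-length // pattern.shape[1])  # ceiling division
    mask = np.tile(pattern, (1, repeats))[:, :length]
    full = np.zeros((length, pattern.shape[0]), dtype=float)
    full[mask.T] = np.asarray(received, dtype=float)
    return full.T


# Rates for P = 8 and l = 2, 8, 16, as Fraction objects.
rates = [code_rate(8, l) for l in (2, 8, 16)]
```

Here `rates` holds 4/5, 1/2, and 1/3. The cost of `depuncture` is a single masked copy per frame, which is in line with the assumption that puncturing adds negligible complexity.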
Simulations were conducted to investigate the degree of performance degradation due to the implementation of punctured rates $r = 4/5$ and $r = 1/2$ for a single user system with no diversity and also for a 2 × 2 diversity system, both of which are benchmarked against the equivalent rate $r = 1/3$ system. Figures 7 and 8 show the punctured IA receiver BER versus SNR performance graphs for the $K = 5$ and $K = 15$ user systems, respectively. Simulations are considered for a synchronous system with $N = 15$ for both nondiversity and 2 × 2 diversity turbo coded systems employing an iterative approach detection scheme at reception.
In both graphs, there is expected system degradation due to MAI. The higher code rate shows a further loss in performance for both the $K = 5$ and $K = 15$ systems.

Figure 10: Punctured multiple user BER performance as a function of the code rate at SNR = 2 dB for IA receiver.
The effects of increasing the system load coupled with an increase in the system code rate can be better observed in Figure 9 for the PA system and Figure 10 for the IA system. Figure 9 shows the punctured BER performance as a function of the code rate at SNR = 2 dB for the PA receiver with system loads of $K = 5$ and $K = 15$. The single user performance graphs for both the nondiversity and 2 × 2 diversity systems are also given for comparison. From Figure 9, it is clear that at such a low SNR value, the multiple user systems fail to reach the $10^{-3}$ BER performance threshold for both nondiversity and 2 × 2 diversity systems.

EURASIP Journal on Wireless Communications and Networking
This poor performance can, however, be attributed to the choice of receiver used. Figure 10 illustrates the simulated punctured multiuser BER performance as a function of the code rate at SNR = 2 dB for an IA receiver.
From Figure 10, it is observed that the nondiversity systems for all system loading values perform similarly to the PA receiver and fail to achieve the performance threshold. However, as the diversity is increased to 2 × 2, the IA system performs much better than the PA system and attains the performance threshold at code rates of $r = 0.39$ and $r = 0.32$ for the $K = 5$ and $K = 15$ systems, respectively.

CONCLUSION
In this paper, two turbo interference cancellation receivers are discussed: the MMSE front-end turbo space-time partitioned approach (PA) receiver and the MMSE front-end turbo space-time iterative approach (IA) receiver. Numerical results reveal that for an equal number of receiver iterations both IA and PA receivers achieve approximately the same performance for a lightly loaded system at any given performance threshold. However, as the system load increases, the IA receiver achieves sizable performance and capacity gains over the PA receiver. It is important to note that the PA receiver, unlike the IA receiver, is seen to attain no further performance or capacity gains with an increased number of iterations for the case of a highly loaded system. This poor PA performance can possibly be attributed to the poor parity data decoding performance characteristic of turbo codes.
Rate compatible punctured turbo codes are investigated in a turbo space-time coded MIMO-CDMA system as a possible way of achieving higher data rates in the DS-CDMA uplink. Results show that by using two transmitting antennas and two receiving antennas, there is a higher attainable data rate when compared with the nondiversity system. There is, however, a limit to the degree of puncturing that can be done; this limit is generally dictated by the required system performance threshold.
With an increase in SNR, the stipulated system performance can even be attained by using higher code rates, thus significantly increasing the achievable data rates. However, it is observed that as the system load increases the degree of freedom on puncturing becomes greatly reduced. This is attributed to the choice of receiver being employed at reception. The IA receiver is observed to be a better receiver choice than the PA receiver when considering the achievable data rates in a heavily loaded CDMA system.