EURASIP Journal on Applied Signal Processing 2005:5, 626–634
© 2005 Hindawi Publishing Corporation

Improved Iterative Parallel Interference Cancellation Receiver for Future Wireless DS-CDMA Systems

We present a new turbo multiuser detector for turbo-coded direct sequence code division multiple access (DS-CDMA) systems. The proposed detector is based on a parallel interference cancellation (PIC) stage and a bank of turbo decoders. The PIC is split so that interference cancellation is performed after each constituent decoder of the turbo decoding scheme. Moreover, we propose a new enhanced algorithm that provides a more accurate estimate of the signal-to-interference-plus-noise ratio used in the tentative decision device and in the MAP decoding algorithm. The performance of the proposed receiver is evaluated by computer simulations for medium to very high system loads, in AWGN and multipath fading channels, and compared with recently proposed interference cancellation-based iterative multiuser detectors (MUDs), taking into account the number of iterations and the complexity involved. The proposed receiver outperforms the others, especially for highly loaded systems.


INTRODUCTION
One of the main shortcomings of code division multiple access (CDMA) communication systems is their vulnerability to multiple access interference (MAI). Hence, during the 1990s, great attention was devoted to multiuser detection strategies, which are the natural answer to this problem. Receivers based on this technique aim to exploit the interference as an additional information source but are generally characterized by very high complexity. Multiuser detection research first focused on uncoded systems; however, practical CDMA communications rely on error control coding and interleaving, so that more and more attention has recently been paid to coded systems. Optimal joint decoding/detection is an excellent solution to this problem, as shown in [1]. However, this scheme has a prohibitive computational complexity for actual implementation. In contrast, suboptimal solutions, which separate the operations of symbol detection and channel decoding, appear more attractive for practical applications.
In particular, the successful proposal of turbo codes [2] naturally leads to investigating the feasibility of iterative (turbo) processing techniques in the design of multiuser receivers. In iterative multiuser detection, extrinsic information is determined in each detection and decoding stage and used as a priori information for the next iteration. This procedure is repeated at each iteration, as in turbo codes: this detection philosophy is known as turbo MUD, and its advantages are remarkable, even for heavily loaded systems. In recent years, many iterative receivers that can achieve near-optimal performance [3,4] have been investigated, but their complexity is still exponential in the number of users. This is the reason for the attention paid to interference cancellation schemes [5,6,7,8,9]. In this paper, we propose a new iterative multiuser detector based on a parallel interference cancellation (PIC) stage and a bank of turbo decoders. In the proposed structure the PIC detector is split so that interference cancellation can be performed after each constituent convolutional decoder. Because of the tight relationship between the proposed receiver and the MAP decoders, we call it the MAP decoder aided PIC (MPIC): this solution aims to benefit from interference cancellation (IC) from the first iterations. Moreover, the variance of the noise-plus-(residual-)interference is determined by a new algorithm: only the most reliable symbols are used in the variance estimate, and all the others are neglected. This solution improves performance for all the considered systems.

SYSTEM MODEL
We consider an uplink DS-CDMA communication system with $K$ synchronous turbo-coded users. Timing, amplitudes, carrier phases, and spreading sequences of all the users are assumed to be perfectly known at the base station receiver. Each user encodes blocks of information bits $u_{k,i}$ with a parallel concatenated convolutional code (PCCC) and transmits the resulting codewords, composed of $I$ coded bits, over a common AWGN channel with BPSK modulation. The equivalent baseband received signal is the sum of the $K$ users' spread BPSK signals and the channel noise, where (i) $T_b$ is the bit interval; (ii) $E_{bk}$ is the $k$th user received energy; (iii) $c_{k,i} \in \{+1, -1\}$ is the bit transmitted by the $k$th user during the $i$th bit period; (iv) $p(t)$ is the unit-power rectangular pulse shape with duration $T_b$; (v) $s_k(t)$ is the $k$th user unit-power spreading sequence; (vi) $n(t)$ is an additive white Gaussian noise (AWGN) process with two-sided power spectral density $\sigma^2 = N_0/2$ (W/Hz).
In the receiver, a bank of matched filters is used for despreading. Without loss of generality, we can assume that the first bit interval is observed; therefore, the index $i$ is dropped from all these variables. As a result, the output of the $k$th matched filter is given by
$$y_k = \sqrt{E_{bk}}\, c_k + \sum_{j=1,\, j \neq k}^{K} \sqrt{E_{bj}}\, \rho_{jk}\, c_j + n_k, \tag{2}$$
where $\rho_{jk}$ is the normalized cross-correlation coefficient between the $j$th and $k$th users and $n_k$ is the Gaussian noise sample of the $k$th user, with distribution $N(0, \sigma^2)$. The second term in (2) represents the MAI that has to be cancelled.
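The matched-filter output model described above can be illustrated numerically. The following is a minimal sketch, not the authors' simulator: it assumes chip-synchronous users, random binary short codes normalized to unit energy, and a chip-rate discrete-time model; the function name `matched_filter_outputs` and its arguments are hypothetical.

```python
import numpy as np

def matched_filter_outputs(codes, energies, bits, noise_var, rng):
    """Matched-filter bank outputs for one bit interval (synchronous users).

    codes:     (K, G) array of +/-1 chips (hypothetical random short codes)
    energies:  (K,) received bit energies E_bk
    bits:      (K,) coded bits c_k in {+1, -1}
    noise_var: variance sigma^2 of each noise sample n_k
    """
    codes = np.asarray(codes, dtype=float)
    K, G = codes.shape
    s = codes / np.sqrt(G)                 # unit-energy spreading sequences
    rho = s @ s.T                          # normalized cross-correlations rho_jk
    amp = np.sqrt(np.asarray(energies, dtype=float))
    # Chip-rate received vector: all users' spread bits plus white noise.
    chip_noise = rng.normal(0.0, np.sqrt(noise_var), size=G)
    r = s.T @ (amp * np.asarray(bits, dtype=float)) + chip_noise
    # Despreading: y_k = sqrt(E_bk) c_k + sum_{j != k} sqrt(E_bj) rho_jk c_j + n_k
    y = s @ r
    return y, rho

rng = np.random.default_rng(0)
codes = rng.choice([-1.0, 1.0], size=(4, 16))   # K = 4 users, G = 16 chips
bits = np.array([1.0, -1.0, 1.0, 1.0])
y, rho = matched_filter_outputs(codes, np.ones(4), bits, 0.5, rng)
```

Because the sequences have unit energy, each noise sample at the filter output keeps variance $\sigma^2$, while the off-diagonal entries of `rho` determine how strongly the other users leak into `y[k]`.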

A NEW IC ITERATIVE RECEIVER
An iterative canceller consists of an interference cancellation (IC)-based multiuser detector (MUD) followed by $K$ single-user turbo decoders. Each constituent block iteratively provides soft information to the others, as shown in Figure 1.
In the first multiuser detection iteration, the a priori information on the coded bits is not available, that is, $L_{ap}(c_k) = 0$, $k = 1, 2, \ldots, K$. The IC stage delivers interference-cancelled soft outputs $\tilde{y}_k$ to the input of the turbo decoders. After a fixed number of turbo decoder iterations, the extrinsic information on the coded bits at the output of the turbo decoders is fed back to the input of the IC detector as the a priori information for the next receiver iteration.
The considered turbo codes are composed of two recursive systematic convolutional (RSC) codes connected by an interleaver, and a MAP-based algorithm is used for iterative decoding [10]. Since the IC receiver requires soft information about the reliability of both the systematic and the parity bits, the decoding algorithm is suitably modified to also produce extrinsic information about the latter [11]. At each new iteration, the iterative structure gives the multiuser receiver more reliable a priori information and lets the decoders operate on soft inputs from which a greater amount of interference has been cancelled.

The conventional iterative PIC receiver
In the conventional iterative parallel interference cancellation (PIC) receiver [12], the MAI is removed simultaneously from each user at each IC stage. Therefore, at the $m$th receiver iteration, the PIC soft output, that is, the turbo decoder input, can be expressed as
$$\tilde{y}_k(m) = y_k - \sum_{j \neq k} \sqrt{E_{bj}}\, \rho_{jk}\, \hat{c}_j(m) = \sqrt{E_{bk}}\, c_k + \sum_{j \neq k} \sqrt{E_{bj}}\, \rho_{jk} \left( c_j - \hat{c}_j(m) \right) + n_k, \tag{3}$$
where $\hat{c}_j(m)$ is the estimate at the $m$th iteration of the bit $c_j$ transmitted by the $j$th user. Note that the second summation represents the residual MAI after cancellation. The data estimates $\hat{c}_k(m)$ have been chosen in [12] as the expectation of the coded bits, that is,
$$\hat{c}_k(m) = E\{c_k\} = \tanh\!\left( \frac{L_{ap}(c_k(m))}{2} \right), \tag{4}$$
in order to take advantage of the turbo decoder outputs. The term $L_{ap}(c_k(m))$ is the a priori log-likelihood ratio of bit $c_k$ at the $m$th iteration, which is defined as
$$L_{ap}(c_k(m)) = \ln \frac{P(c_k(m) = +1)}{P(c_k(m) = -1)}. \tag{5}$$
In the first receiver iteration, no a priori information is available from the decoder output. Hence, as an initial condition, it is assumed that $L_{ap}(c_k(0)) = 0$, $k = 1, 2, \ldots, K$.
In the successive iterations, instead, the extrinsic information coming from the decoders can be used, leading to $L_{ap}(c_k(m)) = L_{ex}(c_k(m-1))$.
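One cancellation stage of the conventional PIC described above can be sketched as follows. This is an illustrative sketch under stated assumptions: a single bit interval, a known cross-correlation matrix `rho` with unit diagonal, soft bit estimates given by the hyperbolic tangent of half the a priori log-likelihood ratio, and hypothetical function names `soft_bit_estimate` and `pic_stage`.

```python
import numpy as np

def soft_bit_estimate(l_ap):
    """Expectation of a +/-1 bit given its a priori LLR: tanh(L_ap / 2)."""
    return np.tanh(0.5 * np.asarray(l_ap, dtype=float))

def pic_stage(y, rho, energies, l_ap):
    """One parallel interference cancellation stage for one bit interval.

    y:        (K,) matched-filter outputs
    rho:      (K, K) normalized cross-correlations (unit diagonal assumed)
    energies: (K,) received bit energies E_bk
    l_ap:     (K,) a priori LLRs fed back by the decoders
    """
    amp = np.sqrt(np.asarray(energies, dtype=float))
    contrib = amp * soft_bit_estimate(l_ap)      # sqrt(E_bj) * c_hat_j
    # Sum over all users, then remove each user's own term (j == k).
    mai_hat = rho @ contrib - np.diag(rho) * contrib
    return np.asarray(y, dtype=float) - mai_hat

rho = np.array([[1.0, 0.25, -0.25],
                [0.25, 1.0, 0.5],
                [-0.25, 0.5, 1.0]])
energies = np.array([1.0, 4.0, 1.0])
bits = np.array([1.0, -1.0, -1.0])
y = rho @ (np.sqrt(energies) * bits)             # noiseless outputs
first_pass = pic_stage(y, rho, energies, np.zeros(3))   # nothing cancelled yet
ideal_pass = pic_stage(y, rho, energies, 50.0 * bits)   # near-perfect feedback
```

With zero a priori information the stage leaves the matched-filter outputs unchanged, whereas with fully reliable feedback the MAI is removed and only the useful term remains.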

The proposed iterative MPIC receiver
It is well known that the coding gain offered by a turbo decoder grows with the number of decoding iterations. However, the performance improvement is remarkable in the first iterations and becomes more and more negligible in the successive ones. This suggests concentrating the significant part of the interference cancellation in the first iterations: for this reason, many IC-based iterative receivers with a first linear stage have been proposed [7,8,13]. Nevertheless, a linear MUD has the drawback of extremely high computational complexity. In this paper we present an iterative PIC receiver that performs most of the interference cancellation in the first iterations. A conventional iterative receiver tries to cancel the MAI from all the parity and systematic bits only once, at the end of each decoding iteration. Note that, in a turbo decoding iteration, the first MAP decoder provides extrinsic information about all the systematic bits and half of the parity bits, while the second decoder gives extrinsic information about the interleaved systematic bits and the second half of the parity bits. As a consequence, in the proposed receiver the interference cancellation is performed after each constituent decoder, and the systematic bits are cancelled twice per decoding iteration. Because of the tight connection between the proposed receiver and the MAP decoders, we have named it the MAP decoder aided PIC (MPIC). The structure of this receiver is shown in Figure 2. Note that the subscripts $s$, $p_1$, and $p_2$ have been added in this figure to describe the cancellation operation thoroughly.
Assuming that $R_c$ is the turbo encoder code rate and that only the parity bits are punctured at the transmitter, the computational complexity needed by the MPIC to perform interference cancellation at each iteration is $(1 + R_c)$ times that of the conventional PIC. However, it will be shown by simulation that the proposed iterative receiver performs better than the conventional one even with equal complexity. Equal complexity is obtained by removing from the last iterations as many cancellation stages as were added in the first iterations.
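The factor $(1 + R_c)$ can be checked with a simple count, sketched here under the assumption that the cancellation cost is proportional to the number of coded bits processed. Per information bit, the conventional PIC cancels the MAI once on all $1/R_c$ coded bits, while the MPIC cancels it a second time on the single systematic bit:

```latex
\frac{\text{MPIC cost}}{\text{PIC cost}}
  = \frac{\dfrac{1}{R_c} + 1}{\dfrac{1}{R_c}}
  = 1 + R_c ,
\qquad
R_c = \tfrac{1}{2} \;\Longrightarrow\; \frac{2 + 1}{2} = 1.5 .
```

For the rate-1/2 code used in the simulations, each MPIC iteration therefore costs 1.5 times a conventional PIC iteration in cancellation operations.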

An improved nonlinear MMSE estimator
The tentative decision device (4), which is used to estimate the data bits and to reconstruct and cancel the MAI, exploits only the extrinsic information from the decoders and not the channel outputs; hence, only a fraction of the available information is used in the decision. An improved data estimator can be obtained by jointly exploiting the decoder and channel outputs, as in [6], where an ML estimator producing tentative hard decisions [14] is used. Unlike [6], we derive a nonlinear MMSE estimator, which can also provide the reliability value of each estimated bit. The decision $\hat{c}_k(m)$ for the $k$th user at the $m$th receiver iteration is taken as the expectation of $c_k$ given the channel output $\tilde{y}_k$, that is,
$$\hat{c}_k(m) = E\{ c_k \mid \tilde{y}_k \}. \tag{6}$$
Under the assumption that the interference can be modelled as a Gaussian process, it can be demonstrated that
$$\hat{c}_k(m) = \tanh\!\left( \frac{L_{ap}(c_k(m))}{2} + \frac{\sqrt{E_{bk}}\, \tilde{y}_k}{\sigma_k^2} \right), \tag{7}$$
where $\sigma_k^2$ is the thermal noise-plus-interference variance, given by $\sigma_k^2 = \sigma^2 + \sigma_{k,\mathrm{MAI}}^2$. The proof of (7) can be found in the appendix. As shown in [15], combining the channel output and the extrinsic information in the decision statistic yields a biased residual interference term, which tends to cancel the useful signal. However, the computer simulations confirm that better performance is achieved by using all the information sources and that the bias effect is mitigated after a few iterations.
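A nonlinear MMSE bit estimator of this kind can be sketched in a few lines. The sketch assumes the scalar model $\tilde{y}_k = \sqrt{E_{bk}}\, c_k + w$ with $w$ Gaussian of variance $\sigma_k^2$, for which the posterior log-likelihood ratio is the a priori value plus the channel term $2\sqrt{E_{bk}}\,\tilde{y}_k/\sigma_k^2$; the function name `mmse_bit_estimate` is hypothetical.

```python
import numpy as np

def mmse_bit_estimate(y_tilde, l_ap, energy, sigma2):
    """Soft estimate E{c | y_tilde} of a +/-1 bit.

    y_tilde: interference-cancelled channel output(s)
    l_ap:    a priori LLR(s) from the decoder
    energy:  received bit energy E_bk
    sigma2:  noise-plus-interference variance sigma_k^2 (Gaussian assumption)
    """
    l_channel = 2.0 * np.sqrt(energy) * np.asarray(y_tilde, dtype=float) / sigma2
    # Posterior LLR = a priori LLR + channel LLR; the conditional mean is tanh(LLR / 2).
    return np.tanh(0.5 * (np.asarray(l_ap, dtype=float) + l_channel))

# The sign gives the tentative decision, the magnitude its reliability.
c_hat = mmse_bit_estimate(np.array([0.9, -0.1, -1.4]),
                          np.array([0.0, 1.2, -0.5]),
                          energy=1.0, sigma2=0.8)
```

Unlike a hard decision, the output shrinks toward zero when the channel output and the decoder feedback disagree, so unreliable bits contribute less to the reconstructed MAI.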

Variance estimation
The tentative decision device needs to know the noise-plus-(residual-)interference variance, as does the channel reliability value used in MAP decoding [2]; hence, an appropriate estimation method has to be used. In [5,16] a simple algorithm based on the data samples has been proposed; in particular, the variance is calculated there as
$$\hat{\sigma}_k^2(m) = \frac{1}{I} \sum_{i=1}^{I} \left( |\tilde{y}_{k,i}(m)| - \overline{\tilde{y}_{k,i}(m)} \right)^2, \tag{8}$$
where the mean $\overline{\tilde{y}_{k,i}(m)}$ has been either derived from the samples [16] or assumed equal to the square root of the received energy $E_{bk}$ [5]. As will be confirmed by the simulation results, this algorithm underestimates the variance when the signal-to-interference-plus-noise ratio (SINR) is low. The reason for this behaviour is the following: assuming that the (residual-)interference is Gaussian, the conditional probability density functions (pdfs) of the variable $\tilde{y}_{k,i}(m)$, that is, $p(\tilde{y}_{k,i}(m) \mid c_k = \pm 1)$, are also Gaussian. As a consequence, the unconditional pdf of $\tilde{y}_{k,i}(m)$ is given by
$$p(\tilde{y}_{k,i}(m)) = \sum_{c = \pm 1} P(c_k = c)\, p(\tilde{y}_{k,i}(m) \mid c_k = c) = \frac{1}{2} \left[ p(\tilde{y}_{k,i}(m) \mid c_k = +1) + p(\tilde{y}_{k,i}(m) \mid c_k = -1) \right], \tag{9}$$
where the last equality follows from the assumption that the transmitted bits are independent and identically distributed (i.i.d.). As can be seen from Figure 3, the values $\tilde{y}_{k,i}(m)$ satisfying the inequality $|\tilde{y}_{k,i}(m)| < \sqrt{E_{bk}}$ are ambiguous, in the sense that they belong to the region of maximum overlap of the conditional pdfs. For this reason we have chosen to evaluate the variance using only the values satisfying $|\tilde{y}_{k,i}(m)| \geq \sqrt{E_{bk}}$. Restricting the sample set in this way improves the variance estimate through the formula
$$\hat{\sigma}_k^2(m) = \frac{1}{I_1} \sum_{i:\, |\tilde{y}_{k,i}(m)| \geq \sqrt{E_{bk}}} \left( |\tilde{y}_{k,i}(m)| - \sqrt{E_{bk}} \right)^2, \tag{10}$$
where $I_1 \leq I$ indicates the number of such values in a codeword.
Figure 3: Decision variable probability density function.
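The difference between the two variance estimators can be seen in a short numerical experiment. The sketch below is a plausible reading of the text, not the authors' code: it assumes the basic estimator averages the squared deviation of $|\tilde{y}|$ from $\sqrt{E_{bk}}$ over all samples, while the enhanced one uses only samples with $|\tilde{y}| \geq \sqrt{E_{bk}}$; the function names and the fallback for an empty sample set are hypothetical.

```python
import numpy as np

def basic_variance(y_tilde, energy):
    """Variance estimate from every sample, with the mean assumed to be sqrt(E_bk)."""
    dev = np.abs(np.asarray(y_tilde, dtype=float)) - np.sqrt(energy)
    return np.mean(dev ** 2)

def enhanced_variance(y_tilde, energy):
    """Variance estimate from the reliable samples only, |y| >= sqrt(E_bk)."""
    amp = np.sqrt(energy)
    mag = np.abs(np.asarray(y_tilde, dtype=float))
    reliable = mag[mag >= amp]             # the I_1 samples outside the overlap region
    if reliable.size == 0:
        # Fallback chosen for this sketch; it is not described in the source.
        return basic_variance(y_tilde, energy)
    return np.mean((reliable - amp) ** 2)

# Low-SINR example: unit amplitude, unit noise-plus-interference variance.
rng = np.random.default_rng(1)
n = 200_000
bits = rng.choice([-1.0, 1.0], size=n)
y_tilde = bits + rng.normal(0.0, 1.0, size=n)
basic = basic_variance(y_tilde, 1.0)
enhanced = enhanced_variance(y_tilde, 1.0)
```

In this setting the true variance is 1. Taking the magnitude folds samples from the overlap region back toward the mean, which pulls the basic estimate well below 1, whereas the restricted sample set stays close to the true value.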

ITERATIVE MPIC FOR MULTIPATH SLOWLY FADING CHANNELS
Wireless CDMA systems usually have to cope with multipath fading environments. The multipath fading channel is often assumed to be wide-sense stationary with uncorrelated scattering [17]. Under this assumption, the received baseband DS-CDMA signal is the sum, over the $K$ users and the propagation paths, of attenuated and delayed replicas of the transmitted signals plus noise, where $L$ is the number of propagation paths and $\alpha_k^l(t)$ and $\tau_k^l$ are, respectively, the time-varying complex attenuation factor and the relative delay of the $l$th path of the $k$th user signal. In addition, we assume that $E\{\sum_{l=1}^{L} |\alpha_k^l|^2\} = 1/T_b$, so that $E_{bk}$ is the average received energy of the $k$th user, and that the channel fading is so slow that $\alpha_k^l(t)$ does not change within a chip interval and can thus be denoted as $\alpha_k^l(h)$, where $h$ refers to the chip position in the codeword. The chip matched filter output for the $l$th path of the $k$th user is then computed over the chip interval $T_c$, which is tied to the bit interval through the processing gain $G = T_b/T_c$.
If a maximal ratio combiner (MRC) RAKE receiver is used, the decision statistic is obtained by combining the chip matched filter outputs of the $L$ paths. In the $m$th iteration of the iterative PIC receiver, the MAI reconstructed from the tentative bit decisions $\hat{c}_{j,i}(m)$ given by (7) is subtracted from the $k$th received signal, while the variance estimation formula has to be modified to take the fading attenuation into account. The conditional pdfs $p(\tilde{y}_{k,i}(m) \mid c_{k,i} = \pm 1, \alpha_k^l)$ are Gaussian, and (10) is rewritten in terms of $\alpha_k(i)$, the bit-averaged fading attenuation after recombination.
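The combining step of an MRC RAKE receiver can be sketched as follows. This is a generic illustration of maximal ratio combining, not the paper's exact decision statistic: it assumes perfectly known complex path gains, finger outputs that are already despread and time-aligned, and no inter-path interference; the name `mrc_combine` is hypothetical.

```python
import numpy as np

def mrc_combine(finger_outputs, path_gains):
    """Maximal ratio combining of L RAKE fingers for one bit.

    finger_outputs: (L,) complex despread outputs, one per path
    path_gains:     (L,) complex path attenuations, assumed perfectly known
    """
    finger_outputs = np.asarray(finger_outputs, dtype=complex)
    path_gains = np.asarray(path_gains, dtype=complex)
    # Weight each finger by the conjugate gain so the paths add coherently,
    # then keep the real part as the BPSK decision statistic.
    return np.real(np.sum(np.conj(path_gains) * finger_outputs))

gains = np.array([0.8 + 0.1j, -0.3 + 0.4j, 0.1 - 0.2j])   # 3-path example
bit = -1.0
z = mrc_combine(gains * bit, gains)                       # noiseless fingers
```

In the noiseless case the statistic equals the bit scaled by the total path power, which is why the variance estimator has to account for the fading attenuation after recombination.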

SIMULATION RESULTS
In this section we investigate the effectiveness of the proposed receiver through computer simulations. In order to reduce the complexity of implementing a nonlinear decision device, the $\tanh(\cdot)$ function has been approximated by an eight-value look-up table. For all simulations we use a rate $R_c = 1/2$ turbo code, composed of two 8-state RSC codes with generator polynomials $G_0 = (13)_8$ and $G_1 = (15)_8$, and the block interleaver recommended by the UMTS standard [18]. We compare the proposed iterative MPIC receiver with the conventional PIC [6] and the iterative partial PIC (PPIC) receiver proposed in [6] and derived from [14].
First, we analyze the performance in a synchronous AWGN channel. The system has 10 equal-power users with pseudonoise short codes, processing gain $G = 16$, and frame length 800. The quantized log-MAP algorithm is used for decoding [10]. Figure 4 shows the histogram of the decision variables at the output of the MF bank and of the PIC receiver after 1, 2, 3, and 6 iterations for a signal-to-noise ratio $E_b/N_0 = 10$ dB, compared with the Gaussian distribution with variance calculated through (10). The interference reduction through the iterations and the accuracy of the enhanced variance estimation are apparent, while a bias is evident only in the first iteration. Figure 5 shows the performance improvement obtained with the enhanced variance estimator in comparison with the basic one, derived from (8). Figure 6 shows the performance gain achieved by the proposed receiver as the number of iterations increases. At a BER of $10^{-4}$ and with 16 iterations, the loss from the single-user bound is about 0.8 dB. A comparison of performance versus iterations between the iterative PIC, MPIC, and PPIC is made in Figure 7 at $E_b/N_0 = 3$ dB. The cancellation weights for the PPIC have been found by trial and error as $p_0 = 0.55$, $p_1 = 0.65$, $p_2 = 0.75$, $p_3 = 0.85$, $p_4 = 0.92$, $p_5 = 0.97$, $p_{i \geq 6} = 1.0$.
Then, we illustrate the performance of the proposed receiver in an asynchronous 3-path fading channel with perfect power control: the replica amplitudes are assumed to be Rayleigh distributed, with relative attenuations equal to 0, −1, and −9 dB. The processing gain is $G = 15$ and the spreading operation is performed by a set of pseudonoise short codes. The frame length is 500, an MRC RAKE receiver is used as path combiner, and the max-log-MAP algorithm is used for decoding. The channel has a normalized Doppler frequency $f_d T_b = 0.0002$ and only 2 receiver iterations are performed. The other transmission parameters are the same as in the AWGN case. Figure 8 shows that in a system with 20 active users, that is, with a system load $\beta = K/G$ greater than one ($\beta = 1.33$), the MPIC outperforms the conventional PIC: a gain of 3.5 dB for the MPIC over the PIC receiver is observed at a BER of $10^{-3}$. This figure also shows the great benefit obtained for both receivers by using a correct estimate of the noise-plus-(residual-)interference variance rather than approximating it by the ambient noise, assumed perfectly known, as in [6]. Figure 9 shows that the iterative MPIC receiver can support heavily loaded DS-CDMA systems. This behaviour can be enhanced with the help of spreading sequences with low cross-correlations, as shown in Figure 10, where the MPIC receiver reaches single-user performance in a fully loaded DS-CDMA system with normalized Doppler frequency $f_d T_b = 0.0004$, processing gain 31, and 31 equal-power users using Gold31 codes.
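The eight-value look-up table used in place of $\tanh(\cdot)$ in these simulations could be built in many ways, and the paper does not give the table. The sketch below shows one hypothetical design: eight equal-width input bins on $[-3, 3]$, each mapped to the hyperbolic tangent of its centre, with clipping outside that range. The range `X_MAX`, the bin placement, and the name `tanh_lut` are assumptions made for illustration.

```python
import numpy as np

X_MAX = 3.0        # assumed input range; inputs beyond it are clipped
N_LEVELS = 8       # eight table entries, as in the simulations

_EDGES = np.linspace(-X_MAX, X_MAX, N_LEVELS + 1)
_CENTRES = 0.5 * (_EDGES[:-1] + _EDGES[1:])
TANH_LUT = np.tanh(_CENTRES)               # the eight stored output values

def tanh_lut(x):
    """Piecewise-constant approximation of tanh(x) with N_LEVELS outputs."""
    x = np.clip(np.asarray(x, dtype=float), -X_MAX, X_MAX)
    # Map the input to its bin index; the top edge falls into the last bin.
    idx = ((x + X_MAX) / (2.0 * X_MAX) * N_LEVELS).astype(int)
    idx = np.minimum(idx, N_LEVELS - 1)
    return TANH_LUT[idx]

grid = np.linspace(-5.0, 5.0, 1001)
max_error = np.max(np.abs(tanh_lut(grid) - np.tanh(grid)))
```

A table of this kind replaces the transcendental function by one comparison chain or index computation per bit, at the price of a coarse reliability value near zero, where the slope of the hyperbolic tangent is largest.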

CONCLUSION
In this paper, a new turbo multiuser detector for turbo-coded DS-CDMA systems has been presented. In the proposed detector, the parallel interference cancellation (PIC) has been split in order to perform interference cancellation after each constituent MAP decoder. Moreover, an improved algorithm to estimate the variance of the noise-plus-residual-interference has been derived. Computer simulation results for different system loads, in AWGN and multipath fading channels, confirm the excellent behaviour of the proposed receiver, which outperforms recently proposed alternative IC-based turbo MUDs with the same complexity.

APPENDIX
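As stated in Section 3.3, the decision $\hat{c}_k(m)$ for the $k$th user at the $m$th receiver iteration is taken as the expectation of $c_k$ given the channel output, that is, $\hat{c}_k(m) = E\{c_k \mid \tilde{y}_k\}$.
By invoking Bayes' theorem, the decision variable can be expressed in terms of the a priori probabilities $P(c_k = \pm 1)$ and the conditional pdfs $p(\tilde{y}_k \mid c_k = \pm 1)$, which are Gaussian under the assumption made in Section 3.3.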
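The intermediate steps can be sketched as follows. This is a reconstruction under the assumptions of Section 3.3 (BPSK bits, Gaussian noise-plus-interference of variance $\sigma_k^2$, useful amplitude $\sqrt{E_{bk}}$), not a transcription of the original equations; the shorthand $x$ is introduced here for brevity.

```latex
\begin{aligned}
E\{c_k \mid \tilde{y}_k\}
  &= P(c_k = +1 \mid \tilde{y}_k) - P(c_k = -1 \mid \tilde{y}_k) \\
  &= \frac{p(\tilde{y}_k \mid c_k = +1)\,P(c_k = +1) - p(\tilde{y}_k \mid c_k = -1)\,P(c_k = -1)}
          {p(\tilde{y}_k \mid c_k = +1)\,P(c_k = +1) + p(\tilde{y}_k \mid c_k = -1)\,P(c_k = -1)}, \\[4pt]
P(c_k = \pm 1)
  &= \frac{e^{\pm L_{ap}(c_k(m))/2}}{e^{L_{ap}(c_k(m))/2} + e^{-L_{ap}(c_k(m))/2}}, \qquad
p(\tilde{y}_k \mid c_k = \pm 1)
   = \frac{1}{\sqrt{2\pi\sigma_k^2}}
     \exp\!\left( -\frac{\bigl(\tilde{y}_k \mp \sqrt{E_{bk}}\bigr)^2}{2\sigma_k^2} \right), \\[4pt]
E\{c_k \mid \tilde{y}_k\}
  &= \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}},
  \qquad x = \frac{L_{ap}(c_k(m))}{2} + \frac{\sqrt{E_{bk}}\,\tilde{y}_k}{\sigma_k^2}.
\end{aligned}
```

The last line follows after cancelling the factor $\exp\bigl(-(\tilde{y}_k^2 + E_{bk})/(2\sigma_k^2)\bigr)$ and the normalizing constants common to numerator and denominator.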
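Hence, the decision rule for the nonlinear MMSE estimator is
$$\hat{c}_k(m) = \tanh\!\left( \frac{L_{ap}(c_k(m))}{2} + \frac{\sqrt{E_{bk}}\, \tilde{y}_k}{\sigma_k^2} \right). \tag{A.6}$$
1 The proof of (A.4) can be found in [11, equation (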