EURASIP Journal on Applied Signal Processing 2005:5, 611–625
© 2005 Hindawi Publishing Corporation

A Kalman-Filter Approach to Equalization of CDMA Downlink Channels

An efficient method for equalization of downlink CDMA channels is presented. By describing the observed signal in terms of a state-space model, the method employs the Kalman filter (KF) to achieve an unbiased signal estimate satisfying the linear minimum mean-squared error (LMMSE) criterion. The state-space model is realized at the symbol and chip levels. With the symbol-level model, the KF estimates the transmitted chips that correspond to each symbol interval, whereas at the chip level the transmitted chips are estimated individually. The symbol-level KF has a built-in tracking capability that takes advantage of the a priori known scrambling sequence, which renders the transmitted signal nonstationary. The chip-level KF reduces the complexity of the symbol-level KF significantly by ignoring the nonstationarity introduced by scrambling. A simple method for further reducing the KF complexity is also presented. The computational complexity of the proposed technique is analyzed and compared with that of several linear approaches based on finite-impulse response (FIR) filtering. Simulations under realistic channel conditions indicate that the KF-based approach outperforms FIR equalizers by 1-2 dB in error-rate performance.


INTRODUCTION
The existence of dispersive effects of the channel, such as multipath, destroys the orthogonality of the spreading codes in CDMA systems. As a result, the conventional RAKE receiver in many cases reaches a noise floor at a fairly high frame error rate [1,2]. There are two major categories of methods by which channel dispersion can be dealt with, namely multiuser detection and single-user detection. In multiuser detection, which generally requires knowledge of all user codes, the channel is equalized and every user is detected so that this approach is suitable for uplink CDMA. In downlink CDMA, however, all users share the same physical channel and multiuser detection is inefficient since only one user needs to be demodulated. To efficiently demodulate just the desired user, the channel is equalized in order to restore the code orthogonality inherent in the signal structure. Restoration of code orthogonality is thus motivated by the fact that to demodulate the desired user, (i) the receiver does not need to know the spreading codes of the interfering users, and (ii) the computational complexity of the detection mechanism does not increase with the number of users.
Well-known methods in the literature for orthogonality restoration rely on the LMMSE criterion attained via FIR filtering, which can be realized at the chip or symbol level in adaptive or batch form. For each segment of data over which the observation sequence is considered stationary, the batch implementation requires the solution to a matrix equation which can be obtained by means of LU decomposition or matrix inversion. Frequent matrix solution imposes a heavy computational burden. In an effort to reduce this computational cost, an approximate solution to the matrix inversion problem is developed in [3] based on FFT and IFFT operations by taking advantage of the near-circulant Toeplitz structure of the observed-data autocorrelation matrix. We note parenthetically that symmetric Toeplitz equations can also be solved by means of the Levinson recursion [4,5,6] whose complexity is on the order of the square of the system dimension. Even for stationary channels, the symbol-level LMMSE equalizer of [7] requires one matrix solution per symbol period. This equalizer in essence is a chip-level FIR (cFIR) LMMSE equalizer that encompasses the descrambling and despreading operations, normally done outside the equalizer. Its objective, however, is to minimize the error variance of the symbol estimate rather than that of the chip estimate. On the other hand, matrix inversion can be avoided by means of chip-adaptive FIR equalizers. Adaptive implementations typically employ well-known algorithms such as stochastic gradient, LMS, and RLS. Adaptive FIR equalizers unfortunately tend to be unstable, sensitive to initialization, and slow to converge. Examples of stochastic gradient descent implementations of the LMMSE criterion can be found in [8,9].
An attractive alternative to FIR filtering is to use the Kalman filter (KF) which we consider in this paper. The KF is preferred to FIR approaches for several important reasons. First, the KF has the capability to track nonstationarities which result typically from the time-variant characteristics of the channel, noise processes, and the underlying signal to be estimated. Also, the KF is well known as an optimal linear estimation method in the mean-squared error (MSE) sense.
To clearly differentiate our proposed method from other applications of the KF, it is worth discussing CDMA state-space formulations proposed by several researchers. Most notable perhaps is the work of Iltis et al. [10,11] where a state-space model is developed for estimation of the path gains and delays of multipath channels. The observation equation of this model depends nonlinearly on the path delays and is therefore linearized with respect to these variables so that the extended Kalman filter (EKF) [12] is suitable for parameter tracking. Further, in [11], multiuser detection at each time epoch is performed based on the maximum a posteriori probability (MAP) criterion evaluated from the estimated model up to the previous time epoch; in this respect, one can view the multiuser detector as one of decision-feedback type. Similar use of the EKF with linearized state-space models for estimation and tracking of multipath gains and delays can also be found in [13,14,15,16]. For the case of flat Rayleigh fading channels, relative delay estimation is no longer necessary since there is only one path to consider. If the fade process admits a Gauss-Markov model, then it is possible to employ the KF in the decision-feedback mode to track the fade [17,18,19].
Apart from the above applications of the KF, we discuss several state-space formulations which are more closely related to our proposed approach. In particular, the state-space models of [20,21,22,23] have a resemblance to one another as seen from the fact that the measurement matrix consists of all spreading codes and channel coefficients, while the state vector consists of multiuser data symbols. Similarly, the models of [24,25] multiplicatively lump the channel coefficients and user symbols into the state vector while the spreading codes are incorporated in the measurement matrix. It should be noted that the state-space model of [21] for multiuser detection (not equalization) is a special case of that developed in [20] without channel dispersion.
In consideration of complexity, it is worth paying special attention to the state-space model of [24]. In this model, each element of the state vector is the product of a channel tap value and a user-transmitted symbol. What makes this formulation interesting, in particular, is the fact that it yields the symbol estimate, rather than chip estimate, while the KF for this model operates at the chip level rather than symbol level since the model supplies chip-wise observations. This strategy avoids matrix inversion in computing the Kalman gain, while taking advantage of the ability of the KF to handle nonstationary state dynamics which occurs at the symbol boundaries.
In conclusion, we note that all of the above-mentioned state-space formulations require knowledge of all spreading codes, normally unavailable in practical downlink CDMA receivers. Such models are appropriate for multiuser detection and equalization, perhaps in the uplink, but inefficient for channel equalization and detection of just one desired user in a downlink multiple-access channel. We observe also that none of the aforementioned state-space models encompass a scrambling code, except for [13,14] which include the possibility of handling long codes but no scrambling is incorporated explicitly. Scrambling codes are used in the original and evolving CDMA standards such as IS-95 and 1X EV-DV to mitigate intercell interferences; their use, however, renders the transmitted signal nonstationary. This nonstationarity is taken into account by our symbol-level state-space formulation.
An anonymous referee has drawn our attention to [26], published during the review phase of this paper. In [26], a nonlinear state-space model is developed for joint channel estimation and equalization via the EKF. Though the model is valid, there appears to be an implementation error in [26, Section V.B] in applying the EKF to the linearized model: in addition to replacing the measurement matrix H k in [26, (8), (9), and (10)] with the Jacobian H k as indicated, the observation z [k] in [26, (8)] must also be replaced with No mention of this latter replacement is made in [26]; if this was indeed overlooked, it could be the main reason for the performance failure of the EKF reported in [26, Figure 1]. Nonetheless, the model presented in [26, Section IV] represents a special case of our chip-level state-space model applied to single-input single-output systems.
In this paper, we elaborate on the results presented in [27,28] where we apply the Kalman filter to single-user detection in downlink CDMA. The objective is to combat channel distortions by restoring the signal orthogonality so that demodulation can be done for just the desired user. To achieve this objective, we develop two state-space models for downlink CDMA channels, namely the symbol- and chip-level models. The symbol-level model incorporates the nonstationarity of the transmitted signal and, hence, has an enhanced tracking capability. On the other hand, the chip-level model ignores this nonstationarity in order to achieve a lower computational complexity. In addition, the computational complexity of the proposed method is analyzed for each state-space model and compared with several FIR approaches.
To take advantage of slowly fading channels which exist in many environments, a complexity reduction technique is described which yields a simple mechanism for making tradeoffs between tracking and complexity. We demonstrate also that the proposed state-space models have a signal delay structure so that the KF naturally yields fixed-lag signal estimates. In this paper, we assume that the channel impulse response is available since it can be estimated relatively accurately from the CDMA pilot tone having sufficiently high power.
The rest of the paper is organized as follows. In Section 2, we develop the symbol- and chip-level state-space models for the downlink CDMA channels. In Section 3, a complexity reduction technique is described. Fixed-lag smoothing via several ways of model augmentation is discussed in Section 4. Section 5 contains a complexity analysis and a brief description of several FIR filtering methods to which the performance of the KF techniques is compared. In Section 6, results of numerical simulations obtained in accordance with industrial standards under realistic multipath fade conditions are given. Finally, concluding remarks are given in Section 7.

STATE-SPACE DESCRIPTIONS OF DOWNLINK CDMA CHANNELS
For a multiple-input multiple-output (MIMO) system with M transmit and N receive antennas, we assume that the transmit antennas emit uncorrelated data streams. Nonetheless, the same set of spreading codes is used for every transmit antenna. In the sequel, the superscript or subscript t, such as in $x_u^t(i)$, indexes the tth transmit antenna. Since the Kalman filter is well known, no discussion thereof is given here. A comprehensive treatment of the Kalman filter and its properties can be found in [12,29]. The one-step-ahead prediction and the filtered estimates of the state vector $\mathbf{x}(k)$ are denoted by $\hat{\mathbf{x}}(k|k-1)$ and $\hat{\mathbf{x}}(k|k)$, respectively, where k represents a generic time index which may have different interpretations for different state-space models.

MIMO CDMA signal model
Consider a U-user system where the transmitted chip sequence of the uth user is

$$x_u^t(i) = A_u\, a_u^t(\lfloor i/F \rfloor)\, s_u(|i|_F). \tag{1}$$

Here, i denotes the chip index; $A_u$ is the signal amplitude; $a_u^t(n)$ is a possibly coded sequence of i.i.d. data symbols; $\mathbf{s}_u = [s_u(0), \ldots, s_u(F-1)]^T$ is the spreading sequence; and F represents the spreading factor. The notations $\lfloor\cdot\rfloor$ and $|\cdot|_F$ denote the floor function (rounding toward −∞) and the modulo-F reduction, respectively. Let c(i) denote the base-station-dependent scrambling sequence, which has a constant modulus. The total signal transmitted via the tth transmit antenna takes the form

$$x^t(i) = c(i) \sum_{u=1}^{U} x_u^t(i), \tag{2}$$

which is the sum of all user signals scrambled by the scrambling code. The signal observed at the rth receive antenna can be expressed as

$$y_r(i) = \sum_{t=1}^{M} \mathbf{h}_r^{tT}(i)\, \mathbf{x}^t(i) + v_r(i), \tag{3}$$

where for t = 1, . . . , M and r = 1, . . . , N,

$$\mathbf{h}_r^t(i) = [h_{r,0}^t, h_{r,1}^t, \ldots, h_{r,D}^t]^T, \tag{4}$$

$$\mathbf{x}^t(i) = [x^t(i), x^t(i-1), \ldots, x^t(i-D)]^T. \tag{5}$$

Here, $h_{r,l}^t$ represents the lth tap of the composite channel impulse response between the tth transmit and the rth receive antennas sampled at the chip rate, and D denotes the channel time span in chip durations. The measurement noises $v_r(i)$ are assumed to be uncorrelated white Gaussian processes with variance $\sigma_v^2 \equiv N_o$. The superscript T denotes matrix transposition; for example, $\mathbf{x}^{tT}(i)$ is the transpose of $\mathbf{x}^t(i)$. Note that for each transmit antenna, the pilot tone is contained among the U user signals defined by (1). Typically, the pilot tones account for about 10% of the total transmit power.
It is worth pointing out that the above signal model holds for the case of fractional sampling with an integer oversampling factor, denoted by P. In that case, N represents the number of "virtual" receive antennas, which is equal to P times the number of actual receive antennas. Fractional sampling, however, colors the noise processes $v_r(i)$ spatially (over r) and temporally (over i) if the nominal bandwidth of the receive filter is kept at the chip rate $1/T_c$, with $T_c$ denoting the chip duration. Assuming additive white Gaussian noise (WGN) at the receiver front end, the statistics of $v_r(i)$ are completely determined by the receive filter and therefore it is possible in principle to whiten the measurement noise by employing spectral factorization [4, page 104], which requires passing the sampled signal through a whitening filter. Instead of noise whitening, which is difficult for a general process, we describe in the appendix how the measurement noise $v_r(i)$ can be treated as the output of a discrete-time filter driven by WGN. We also show that the filtered noise model can then be incorporated into the state-space model of the downlink signal, yielding a white-noise model to which the ordinary KF is applicable. Owing to space limitations, however, we restrict our treatment of fractional sampling to the appendix.
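The chip-rate signal model can be illustrated with a short simulation. The following is a minimal sketch, not the authors' simulator: it assumes a single transmit and a single receive antenna, Walsh-Hadamard spreading codes, a time-invariant channel, and zero chips before time 0. The function names (`walsh_hadamard`, `downlink_chips`, `channel_output`) are hypothetical and introduced only for illustration.

```python
import numpy as np

def walsh_hadamard(F):
    """Return an F x F Walsh-Hadamard matrix of +/-1 (F must be a power of 2)."""
    W = np.array([[1.0]])
    while W.shape[0] < F:
        W = np.block([[W, W], [W, -W]])
    return W

def downlink_chips(symbols, amps, codes, scramble):
    """Scrambled multiuser chip stream x(i) for one transmit antenna.

    symbols  : (U, n_sym) data symbols a_u(n)
    amps     : (U,) amplitudes A_u
    codes    : (U, F) spreading sequences s_u
    scramble : (n_sym * F,) constant-modulus scrambling sequence c(i)
    """
    U, n_sym = symbols.shape
    F = codes.shape[1]
    # x_u(i) = A_u * a_u(floor(i / F)) * s_u(i mod F), arranged as (U, n_sym, F)
    per_user = amps[:, None, None] * symbols[:, :, None] * codes[:, None, :]
    # Sum over users, flatten so that chip index i = n * F + j, then scramble
    return scramble * per_user.sum(axis=0).reshape(n_sym * F)

def channel_output(x, h, noise_var, rng):
    """y(i) = sum_l h[l] * x(i - l) + v(i), with x(i) = 0 for i < 0 (assumption)."""
    y = np.convolve(x, h)[: len(x)]
    v = np.sqrt(noise_var / 2) * (
        rng.standard_normal(len(x)) + 1j * rng.standard_normal(len(x))
    )
    return y + v
```

On an ideal channel (a single unit tap and no noise), descrambling with the conjugate of c(i) and despreading with the desired user's code recovers that user's symbols, which is the orthogonality that equalization aims to restore on dispersive channels.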

Symbol-level state-space model
The symbol-level state-space model is developed in this section. We call the estimator operating based on this model the symbol Kalman filter (SKF). During the nth symbol interval, we seek the best linear unbiased estimate (BLUE) The qualifier "best" here means minimum error variance. Note that, once an estimate $\hat{\mathbf{x}}^t(n|n)$ of the chip-level sequence has been obtained, an estimate of the nth symbol $a_u^t(n)$ of the desired user u transmitted from antenna t can be produced from a simple despreading and descrambling operation on the first F elements of $\hat{\mathbf{x}}^t(n|n)$. We define the observation at time n as where $\triangleq$ denotes definition and for r = 1, . . . , N, From (3) and (9) we can write where In (13) we have assumed for notational simplicity that the channel is constant over each symbol period. Observe that the channel matrix $\mathbf{H}_r^t(n)$ has dimensions F × (F + D). The measurement equation can be written as where To describe the state dynamics, we define where $\mathbf{I}_a$ denotes the identity matrix of size a; $\mathbf{0}_{a \times b}$ represents an a × b zero matrix; and ⊗ denotes the Kronecker product operator. By considering $\mathbf{x}(n)$ as the state vector, the state dynamics admits the description The state noise process $\mathbf{w}(n)$ is temporally white but nonstationary. In particular, its covariance matrix is time varying and has the form where Observe that the transmitted signal is nonstationary due to the a priori known scrambling sequence c(i). The measurement noise $\mathbf{v}(n)$ is assumed to be a white Gaussian process with covariance matrix It is reasonable to assume that the measurement noise variance $N_o$ is known from the front-end noise figure. The expectation (23) can be thought of as a long-term average taken over all unknown quantities such as the spreading codes of interfering users. The expectation operator can be dropped if the argument is given. However, if $R_{ss}$ is unknown to the receiver, it can be approximated by where which represents the total transmitted power.
Simulations indicate that $R_{ss}$ and its approximation (27) yield similar performances. The approximation (27) is motivated by the fact that when the system is fully loaded (i.e., U = F) and the users have equal powers, there holds for any orthogonal set of spreading codes. The rows or columns of an F × F Walsh-Hadamard matrix composed of ±1's form such a set of codes (see, e.g., [30, page 48], [31, page 422]).
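Two properties of the symbol-level model can be checked numerically: the banded Toeplitz structure of the F × (F + D) channel matrix, and the fact that a fully loaded, equal-power set of orthogonal codes has a code correlation matrix proportional to the identity. The sketch below is illustrative only. It assumes a newest-chip-first ordering of the state and observation vectors and takes the code correlation as $\sum_u A_u^2 \mathbf{s}_u \mathbf{s}_u^T$ with unit-variance symbols; neither convention is confirmed by the text above, and both function names are hypothetical.

```python
import numpy as np

def symbol_channel_matrix(h, F):
    """F x (F + D) banded Toeplitz channel matrix for one antenna pair.

    Assumed ordering (newest chip first):
      state       = [x(nF+F-1), ..., x(nF), x(nF-1), ..., x(nF-D)]
      observation = [y(nF+F-1), ..., y(nF)]
    """
    D = len(h) - 1
    H = np.zeros((F, F + D), dtype=complex)
    for row in range(F):
        # Each row holds the tap vector, shifted one column per older sample
        H[row, row:row + D + 1] = h
    return H

def code_correlation(codes, amps):
    """sum_u A_u^2 s_u s_u^T for real spreading codes stored as rows of `codes`."""
    return (codes.T * amps**2) @ codes
```

With this ordering, multiplying the matrix by the F + D chips spanning one symbol interval reproduces the F channel outputs of that interval, and for U = F Walsh-Hadamard codes of common amplitude A the correlation matrix equals $A^2 F\,\mathbf{I}_F$.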

Chip-level state-space model
The chip-level state-space model developed in this section is used to obtain the BLUE of each transmitted chip $x^t(i)$ based on all past observations $\{y_r(k) \mid k \le i,\ r = 1, \ldots, N\}$ measured from an N-antenna array. We call the estimator operating based on this model the chip Kalman filter (CKF). To construct this model, we define by the array observation, channel matrix, and state vector at chip time i, respectively. Recall that, for t ∈ {1, . . . , M}, $\mathbf{x}^t(i)$ is defined by (5). In (31), where $\mathbf{h}_r^t(i)$ is given by (4). By defining the state vector at chip epoch i as $\mathbf{x}(i)$, the chip-level state-space model can be written as with for t = 1, . . . , M. For this model, we assume that the scrambled transmitted signals are uncorrelated white processes so that where δ(·) denotes the Kronecker delta function and $N_t = E[|x^t(i)|^2]$, which represents the power of the tth transmit antenna. With (38), the covariance matrix of the state noise process $\mathbf{w}(i)$ takes the form where By assuming that the scrambled signal driving each transmit antenna is white, we ignore its nonstationarity and, hence, forgo part of the tracking capability of the symbol-level model. In the FIR-equalizer framework (see, e.g., [3,7,8,9]), the whiteness assumption is used extensively and is justified by viewing the scrambling sequence as pseudorandom. This assumption holds relatively well when the scrambling sequence is persistently exciting, that is, sufficiently white. The nonstationarity caused by the time variation of the channel is still retained in the model. The autocorrelation matrix of the measurement noise is given by
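To make the chip-level model concrete, the following is a minimal sketch of a chip Kalman filter for the single-antenna case (M = N = 1). It is written under stated assumptions rather than taken from the paper: the state is $[x(i), \ldots, x(i-D)]^T$, the dynamics matrix is a pure down-shift, the only random input is the new chip (white, with power `chip_power`), and the channel taps `h` are known and constant. The function name is hypothetical.

```python
import numpy as np

def chip_kalman_filter(y, h, chip_power, noise_var):
    """Filtered chip estimates x_hat(i|i) for a SISO link with known taps h[0..D]."""
    D = len(h) - 1
    n = D + 1
    A = np.eye(n, k=-1)                  # shift: each chip moves one slot down
    Q = np.zeros((n, n))
    Q[0, 0] = chip_power                 # only the newest chip is unpredictable
    H = np.asarray(h, dtype=complex).reshape(1, n)
    x = np.zeros((n, 1), dtype=complex)
    P = chip_power * np.eye(n, dtype=complex)
    out = np.empty(len(y), dtype=complex)
    for i, yi in enumerate(y):
        # Time update: shift the state, predict the new chip as zero
        x = A @ x
        P = A @ P @ A.conj().T + Q
        # Gain: the innovation is scalar here, so no matrix inversion is needed
        S = (H @ P @ H.conj().T).real + noise_var
        K = P @ H.conj().T / S
        # Measurement update
        x = x + K * (yi - H @ x)
        P = P - K @ H @ P
        out[i] = x[0, 0]                 # first element estimates the current chip
    return out
```

Because the dynamics matrix is a shift, the time update amounts to moving elements and costs essentially nothing; the cost is concentrated in the covariance and gain computations.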

REDUCED-COMPLEXITY KALMAN FILTER
As mentioned earlier, LMMSE equalization based on FIR filtering requires the solution to a linear matrix equation in order to obtain the FIR equalizer. For stability and reliability, FIR equalizers are usually implemented in a block-by-block manner. If the time span of each block is small compared to the coherence time of the channel, then the channel can be assumed constant over each block so that the equalizer needs to be recomputed only once per block. Hence, as the channel coherence time increases, the equalizer is updated less frequently so that the complexity of the block FIR equalizer decreases accordingly. By following this strategy, the operation of the Kalman filter can be modified slightly to trade away some tracking capability in exchange for a lower computational complexity in the presence of long coherence time. In the extreme case where the signal model is perfectly stationary, the FIR equalizer needs to be computed only once. Similarly, the Kalman filter approaches the causal Wiener filter (see [12, page 254], [4, page 378]) and therefore its steady-state solution can be synthesized off-line and used to filter the entire observation record. The state prediction error covariance matrix P(k|k − 1), the Kalman gain K(k), and the filtering error covariance matrix P(k|k) converge to their respective steady-state values.
Consider the state-space model We assume that the channel remains relatively constant over a long time interval so that H(k) = H over this time period. We assume also that the transmitted signal is stationary so that $Q_w(k) = Q_w$. This assumption is satisfied by the chip-level state-space model of Section 2.3. If approximation (27) is used, then the stationarity assumption is implied so that the symbol-level state-space model of Section 2.2 is considered stationary by noticing that the scrambling code has a constant modulus. The time index k stands for the chip and symbol index when applied to the chip-level and symbol-level state-space models, respectively. It is easy to verify that P(k|k − 1) converges to a steady-state value P that satisfies the well-known Riccati equation Therefore, K(k) and P(k|k) also converge to their respective steady-state solutions which follow directly from the Kalman filter recursion, that is, $K(k) \to K = P H^H (H P H^H + Q_v)^{-1}$ and $P(k|k) \to P^+ = (I - KH)P$. In practice, it takes only a small number of steps for the Kalman filter to converge. Figure 1 shows an example where the settling time is only about 5 chip periods (see also [24, Figure 2] and [32, Example 4.6]).
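The convergence of the covariance recursion can be observed by simply iterating it for a time-invariant model. The sketch below assumes the model form x(k+1) = A x(k) + w(k), y(k) = H x(k) + v(k), with state-noise covariance Q and measurement-noise covariance Qv; this form, the function name, and the tolerance-based definition of settling are illustrative assumptions.

```python
import numpy as np

def riccati_iterate(A, H, Q, Qv, n_iter=200, tol=1e-6):
    """Iterate the prediction-covariance recursion of a time-invariant model.

    Returns the (approximate) steady-state P = P(k|k-1), the gain K, and the
    first step at which the gain changed by less than `tol` (None if never).
    """
    n = A.shape[0]
    P = np.eye(n, dtype=complex)
    K_prev, settle = None, None
    for k in range(1, n_iter + 1):
        S = H @ P @ H.conj().T + Qv
        K = P @ H.conj().T @ np.linalg.inv(S)
        if K_prev is not None and settle is None:
            if np.max(np.abs(K - K_prev)) < tol:
                settle = k
        K_prev = K
        # P(k+1|k) = A (I - K H) P(k|k-1) A^H + Q
        P = A @ (P - K @ H @ P) @ A.conj().T + Q
    S = H @ P @ H.conj().T + Qv
    K = P @ H.conj().T @ np.linalg.inv(S)
    return P, K, settle
```

The returned P can be substituted back into the recursion to confirm that it is a fixed point, and the returned gain agrees with $K = P H^H (H P H^H + Q_v)^{-1}$ by construction.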
To reduce the complexity of the Kalman filter, let $l_s$ denote the number of consecutive time steps for which the state-space model can be considered stationary. Let $l_t$ represent the settling time of the KF for the given model. Since $l_s$, which is on the order of the coherence time of the channel, is generally large and $l_t$ is generally small, we have $l_t \ll l_s$. Then, in each data segment of $l_s$ time steps, P(k|k − 1), K(k), and P(k|k) need to be computed only during the first $l_t$ steps. In other words, the subroutine for updating these quantities can be turned on for a period of $l_t$ units of time and then off for a period of $l_s - l_t$ units of time, and the process is repeated periodically. The resulting recursion can be summarized as follows.
(1) Initialize $P^+ = P(0|0)$ and $\hat{\mathbf{x}}(0|0) = \mathbf{0}$. For k = 1, 2, . . ., execute the following recursion.
(2) Compute the time update (44).
(3) Filter update: let $\tilde{l}_t$ and $\tilde{l}_s$ satisfy $l_t \le \tilde{l}_t \le \tilde{l}_s \le l_s$. If $|k|_{\tilde{l}_s} \le \tilde{l}_t$, compute (45). Recall that $|\cdot|_l$ denotes the modulo-l reduction.
(4) Compute the measurement update.

Note that the full-complexity KF ignores the condition $|k|_{\tilde{l}_s} \le \tilde{l}_t$ in step (3) and performs the filter update (45) at every time epoch. Thus, if we set $\tilde{l}_s = \tilde{l}_t = 1$, then the above recursion operates just like the original KF. We observe that the complexity of the routine for updating P(k|k − 1), K(k), and P(k|k) is reduced by a factor of $f = \tilde{l}_s / \tilde{l}_t$. Thus, tradeoffs between complexity and tracking can be easily achieved by controlling the values of $\tilde{l}_s$ and $\tilde{l}_t$. Furthermore, the time update (44) involves only a shift operation and, hence, incurs essentially no computing cost. On the other hand, the key computing cost is incurred by the filter update (45) and this cost is reduced by a factor of f.
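The recursion above can be sketched as follows. This is an illustrative implementation under assumptions, not the authors' code: the model is taken as x(k+1) = A x(k) + w(k), y(k) = H x(k) + v(k) with constant A, H, Q, and Qv; the standard Kalman equations stand in for the time, filter, and measurement updates; and the parameters `ls` and `lt` play the roles of $\tilde{l}_s$ and $\tilde{l}_t$. The function name is hypothetical.

```python
import numpy as np

def periodic_update_kf(y, A, H, Q, Qv, ls=1, lt=1):
    """Kalman filter whose gain/covariance update runs only when (k mod ls) <= lt.

    `y` is an iterable of measurements (scalars or vectors). Returns the
    filtered state estimates and the number of gain updates performed.
    Assumes lt >= 1 so that the gain is computed at the first step.
    """
    n = A.shape[0]
    x = np.zeros((n, 1), dtype=complex)
    P_plus = np.eye(n, dtype=complex)                   # P(0|0), an assumption
    K = np.zeros((n, H.shape[0]), dtype=complex)
    states, n_updates = [], 0
    for k, yk in enumerate(y, start=1):
        x = A @ x                                       # time update
        if k % ls <= lt:                                # filter update, when enabled
            P = A @ P_plus @ A.conj().T + Q
            K = P @ H.conj().T @ np.linalg.inv(H @ P @ H.conj().T + Qv)
            P_plus = P - K @ H @ P
            n_updates += 1
        innovation = np.reshape(yk, (-1, 1)) - H @ x
        x = x + K @ innovation                          # measurement update
        states.append(x[:, 0].copy())
    return np.array(states), n_updates
```

With `ls = lt = 1` the condition always holds and the function behaves as an ordinary Kalman filter. With, say, `ls = 240` and `lt = 5`, the gain is frozen for most of each period, and on a stationary model the loss relative to the full filter is expected to be small once the gain has settled.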

FIXED-LAG SMOOTHING
The signal estimate can be improved by employing fixed-lag smoothing which, for a fixed delay Similarly, for the symbol-level state-space model, we see from (12) and (14) that during the nth symbol period the smoothed estimates $\hat{x}^t(nF-1|nF+F-1), \hat{x}^t(nF-2|nF+F-1), \ldots, \hat{x}^t(nF-D|nF+F-1)$ of the respective chips $x^t(nF-1), x^t(nF-2), \ldots, x^t(nF-D)$ are automatically available from $\hat{\mathbf{x}}(n|n)$. Note that they are smoothed estimates because they belong to one or more symbols transmitted before the nth symbol period while they depend on the future observations $\{y_r(l) \mid l = nF, \ldots, nF+F-1;\ r = 1, \ldots, N\}$ collected during the nth symbol period. These smoothed estimates can be substituted for the filtered estimates $\hat{x}^t(nF-1|nF-1)$ when making a decision on the (n − 1)th symbol. In general, they can be used to make a decision on the $(n - \lceil D/F \rceil)$th symbol, where $\lceil\cdot\rceil$ denotes the ceiling function (rounding toward ∞). Thus, the decision lag can be as large as $\lceil D/F \rceil$ symbol periods. From the above discussion, we see that a simple augmented state-space model can be obtained by replacing D with $\tilde{D} = D + \Delta$, where Δ > 0, and padding the channel impulse response with Δ zeros for each pair of transmit-receive antennas. Specifically, (4) and (5) are replaced by

$$\mathbf{h}_r^t(i) = [h_{r,0}^t, h_{r,1}^t, \ldots, h_{r,\tilde{D}}^t]^T, \qquad \mathbf{x}^t(i) = [x^t(i), x^t(i-1), \ldots, x^t(i-\tilde{D})]^T,$$

respectively. When replacing D with $\tilde{D}$, one needs only to note that $h_{r,l}^t = 0$ for $l = D+1, \ldots, \tilde{D}$. Using this augmented state-space model gives the lag-$\tilde{D}$ smoothed estimate of $x^t(i-\tilde{D})$ based on all observations up to time i.
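The zero-padding augmentation can be illustrated for the single-antenna chip-level case. In the sketch below (an assumption-laden illustration, with a hypothetical function name), the channel is padded with `delta` zeros, the state grows to $\tilde{D} + 1$ elements, and the last element of the filtered state at time i is read out as the estimate of the chip transmitted $\tilde{D}$ chips earlier. The chip stream is assumed white and the taps known and constant.

```python
import numpy as np

def fixed_lag_chip_kf(y, h, delta, chip_power, noise_var):
    """Chip-level KF on a model whose channel is padded with `delta` zeros.

    Returns (filtered, smoothed): filtered[i] estimates x(i), and smoothed[i]
    estimates x(i - D_tilde) with D_tilde = D + delta, both using y(0..i).
    """
    h_pad = np.concatenate([np.asarray(h, dtype=complex), np.zeros(delta)])
    n = len(h_pad)                       # D_tilde + 1 state elements
    A = np.eye(n, k=-1)                  # shift dynamics
    Q = np.zeros((n, n))
    Q[0, 0] = chip_power
    H = h_pad.reshape(1, n)
    x = np.zeros((n, 1), dtype=complex)
    P = chip_power * np.eye(n, dtype=complex)
    filtered = np.empty(len(y), dtype=complex)
    smoothed = np.empty(len(y), dtype=complex)
    for i, yi in enumerate(y):
        x = A @ x
        P = A @ P @ A.conj().T + Q
        S = (H @ P @ H.conj().T).real + noise_var   # scalar innovation variance
        K = P @ H.conj().T / S
        x = x + K * (yi - H @ x)
        P = P - K @ H @ P
        filtered[i] = x[0, 0]            # newest chip, zero lag
        smoothed[i] = x[-1, 0]           # oldest chip in the state, lag D_tilde
    return filtered, smoothed
```

The benefit of the lag is most visible on channels whose strongest tap is not the first one, where the zero-lag estimate has seen little of the chip's energy by the time it is produced.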
The augmentation method described above is different in several aspects from that of Anderson and Moore (AM) [12, page 176] and [33]. First, the AM's augmentation increases the state dimension by an integer factor whereas our augmentation, by taking advantage of the signal delay structure, allows the state vector to be extended by an arbitrary number of chips. Second, if the chip-level state-space model above is augmented according to AM's approach, then each additional lag of ∆ chips increases the state dimension by (D + 1)M∆ chips and the new state vector contains some redundancy; specifically, many elements of the new state vector occur multiple times. For the symbol-level state-space model, the amount of redundancy is small compared to the state dimension, and the state vector is extended by (F +D)M chips for each additional lag of one symbol, which can be larger than necessary.
We note that for a general state-space model, the augmentation method of AM results in a stable and efficient fixed-lag smoother algorithm. Furthermore, our augmented model is guaranteed to be stable since all the eigenvalues of the dynamics matrix A lie within the unit circle; in fact they are all zero. This fact guarantees that, for a stationary signal model, the closed-loop gain A − AKH of the prediction filter has all its eigenvalues inside the unit circle so that the Kalman filter for the given state-space model is asymptotically stable [12, page 77].
Another augmentation arrangement described by AM [12, page 184] can also be used in lieu of the previous one. This augmentation involves a delay structure for the model input, instead of the state; a chain of time-delayed copies of the state vector; and padding the measurement matrix with zeros. The resulting model has approximately the same computational complexity as the one described earlier.

COMPUTATIONAL COMPLEXITY
In Tables 1 and 2, we summarize the complexity of the Kalman and FIR filtering methods measured in terms of the number of complex multiplications (CMs) and complex additions (CAs) per chip period. Typical complexity values are listed in Tables 3 and 4 for the Veh-A channel model. The complexity of the SKF and recursive FIR methods is shown for the single-input single-output (SISO) case only, since they are computationally more intensive than the CKF and the block LMMSE filter. However, we use the SKF as a reference for performance comparison. The performance of the KF techniques will be compared with the indicated FIR filtering methods which we briefly describe next. All FIR methods based on the LMMSE criterion employ linear FIR filtering. Specifically, the underlying principle is to minimize the MSE objective function where and the estimate $\hat{x}_i$ of $x_i$ is obtained by passing the observations through a multidimensional linear FIR filter $G_i$, that is, with Note that $\mathbf{y}(i)$ is defined by (30). Here, L and δ are design parameters which indicate the equalizer order and the number of precursor taps, respectively. The filter $G_i$ is selected to minimize the MSE criterion (51) so that where Assuming that the scrambled transmitted sequence is white, obtaining $R_{xy}(i)$ is straightforward, so we will not discuss it further (see, e.g., [27]). Methods based on FIR filtering differ essentially in the way the inverse autocorrelation matrix $R_i^{-1}$ appearing in (55) is estimated. In the next section, we compare the error performance of the KF approach with the following cFIR methods.

Block LMMSE
In this approach, $R_i$ is assumed to be blockwise constant and is estimated by the time average where K and b denote the block length and block index, respectively. Matrix inversion is thus required once per block in order to compute the filter (55).

Recursive FIR
This algorithm estimates $R_i$ by the weighted average which yields the recursion where $\Gamma_i = R_{i-1}^{-1} y_i$, λ ∈ (0, 1) is called the forgetting factor, and $R_0^{-1}$ can be initialized via direct matrix inversion or with a diagonal matrix.
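A recursion of this kind follows from the matrix inversion lemma. The sketch below assumes the exponentially weighted form $R_i = \lambda R_{i-1} + y_i y_i^H$ with Hermitian $R_i$; the exact weighting and normalization used in the paper may differ, and the function name is hypothetical.

```python
import numpy as np

def rls_inverse_update(R_inv_prev, y, lam):
    """One step of R_i^{-1}, assuming R_i = lam * R_{i-1} + y y^H (Hermitian R)."""
    y = np.asarray(y, dtype=complex).reshape(-1, 1)
    gamma = R_inv_prev @ y                            # Gamma_i = R_{i-1}^{-1} y_i
    denom = lam + (y.conj().T @ gamma).real.item()    # scalar, positive for R > 0
    # Sherman-Morrison rank-one update, then undo the forgetting-factor scaling
    return (R_inv_prev - gamma @ gamma.conj().T / denom) / lam
```

Each update costs on the order of the square of the matrix dimension, compared with the cube for a direct inversion, which is why the recursive form is attractive when the estimate must be refreshed at every chip.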

Sliding-window RLS
The complexity of the sliding-window recursive least squares (swRLS) algorithm is not shown here (see [27]), but for performance comparison in Section 6, we briefly describe this approach whereby $R_i$ is estimated by the sliding time average Then, $\tilde{R}$ where The inverse of (61) can be expressed in the recursive form where $\tilde{\Gamma}_i = R_i^{-1} E_i$. As for $R_0^{-1}$, the initial value $\tilde{R}_0^{-1}$ can be obtained from direct matrix inversion or approximated by a diagonal matrix.

NUMERICAL SIMULATIONS
In this section, we compare the performance of the Kalman filtering techniques developed in this paper with those of the FIR filtering methods described in Section 5. We also show the performance of the RAKE receiver. We use bit error rate (BER) and block error rate (BLER) obtained from link-level simulations as the criteria for comparison. The simulations were performed in accordance with the downlink packet data channel (F-PDCH) format of the CDMA2000 1X EV-DV standard [34]. The standard allows for several combinations of modulation schemes, coding rates, spreading code assignments, and frame durations based on packet size, channel conditions, and scheduling considerations. From these allowable combinations, we have chosen two formats, one using QPSK and another using 16 QAM. Note that the 1X EV-DV standard employs forward error correcting (FEC) codes of variable rates, all of which are achieved by puncturing a rate-1/5 parallel concatenated convolutional code (PCCC), also known as a turbo code. The simulations were performed under realistic channel conditions mandated by ITU. The settings and parameters used in the simulations are listed in Table 5. Of the 32 available Walsh codes, 25 are used by various users and 1 by the pilot. Hence the simulation represents an almost fully loaded situation. Out of the total of U = 25 user Walsh codes, the desired user utilizes 3 of them in the case of QPSK, and 4 in the case of 16 QAM. In the following plots, dotted and solid curves are used to indicate bit error rates and block error rates, respectively. The performance curves for the block LMMSE, recursive FIR, and swRLS methods are labelled respectively by "block," "adptv" with a λ value, and "swRLS." A fixed-lag value of ∆ = 3 is used for the CKF in the 16 QAM case, and ∆ = 0 for all other cases.

SISO links
Simulation results for SISO systems are shown in Figure 2 for QPSK and Figure 3 for 16 QAM. It is interesting to see from Figure 2a that the CKF and the SKF have similar performances and are about 2 dB better than the FIR filters. Furthermore, the SKF exhibits a small performance loss when the code correlation matrix R ss of (23) is replaced by approximation (27). In Figures 2b and 3b, the simulation is performed for a high vehicle speed of v = 120 km/h, where the CKF has a substantial performance gain over the FIR filtering methods. In Figure 2a, a complexity reduction factor of f = 240/5 = 48 degrades the performance of the CKF by about 0.5 dB as compared to the case of no complexity reduction. Furthermore, the loss of performance of the CKF due to complexity reduction is also quite small in the other cases studied.
For the simulation results shown in Figure 3, where the CKF is used to estimate a 16 QAM signal transmitted over a SISO channel, the soft-demodulated bits (centered at ±1) associated with the nth symbol estimate are scaled by $1/s_n$ prior to decoding. The quantity $s_n$ is the symbol error variance estimate computed from the diagonal elements of P(i|i), i = nF, . . . , nF + F − 1. In the soft decoding of the FEC code, the soft-demodulated bit sequence is treated as if it were the coded bit sequence observed in additive white Gaussian noise with time-dependent variance. This approximate noise model is adopted in consideration of the fact that the error variance of the signal estimate is a decreasing function of the channel strength, which varies over time, and such variations are indicated by the magnitude of the diagonal elements of P(i|i). This noise model holds well as long as there is sufficient interleaving introduced between the encoder output and the input of the bits-to-symbol mapper. The aforementioned scale factor $1/s_n$ transforms the nonstationary noise process into one of constant variance so that the ordinary soft decoder, which is designed for stationary noise, is applicable without modification. Note that the transition from symbols to bits is a soft-mapping demodulation operation.

MIMO links
Simulation results for MIMO systems are shown in Figures 4 and 5 with QPSK and in Figure 6 with 16-QAM. We see again that the SKF and the CKF perform comparably. Also, the swRLS and block LMMSE filters have almost the same performance. Note that the results in Figure 5 are for a multiple-input single-output (MISO) system with two transmit antennas (M = 2) and one receive antenna (N = 1). For this system, the FIR filtering methods exhibit a noise floor at a block error rate slightly above 10^−2, while the Kalman filtering techniques show no noise floor in the given BLER range. Comparing the KF performance curves in Figure 2a with their counterparts in Figure 5b indicates that with one receive antenna no diversity gain is seen when the number of transmit antennas is increased from one to two, even though in doing so the effective FEC code rate is reduced by a factor of 2 (the total transmit energy per information bit is unchanged). On the other hand, the diversity gain is large when two receive antennas are used instead of one, as is evident from Figures 4 and 5.
Note, in particular, from Figure 4b that a complexity reduction factor of f = 240/5 results in negligible performance loss for the CKF. In Figure 4a, the performance loss is almost unnoticeable for f = 256/32, indicating that the reduction factor can be increased to a higher value such as the one used in Figure 4b.
It is not surprising to see that the CKF performs as well as the SKF based on the estimated code correlation matrix R_ss given by (27). It can be concluded by examining the state-space models that the two techniques should perform exactly the same since, with the use of (27), the nonstationary behavior of the transmitted signal is masked out by the constant-modulus property of the scrambling sequence, as can be seen from (22), yielding a time-invariant Q_w(n).

CONCLUSIONS
In this paper, we have developed symbol-level and chip-level state-space models for downlink CDMA channels. The Kalman filter is employed to estimate the transmitted signal and restore the code orthogonality so that single-user detection is feasible in multiple-access channels. Simulations indicate that the proposed Kalman filtering techniques outperform the FIR filtering approaches by 1 to 2 dB. Due to the tracking capability of the KF, its performance gain over FIR methods tends to increase with the channel fading rate. Like methods based on FIR filtering, which can take advantage of slow fading to reduce the computational complexity via block-by-block filter synthesis, the KF can periodically skip the update of the Kalman gain and the error covariance matrices. For moderate reduction factors, the periodic-update operation reduces the KF complexity but essentially retains the performance of the full-complexity KF.
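The periodic-update idea can be illustrated with a generic Kalman filter in which the gain and error covariance are recomputed only every `period` steps and held fixed in between. This is a minimal sketch, not the paper's CKF: the model matrices are taken as time-invariant, the function name `kalman_filter_periodic` is hypothetical, and holding P frozen between updates is an assumption about how the skipped steps are handled.

```python
import numpy as np

def kalman_filter_periodic(y, A, C, Q, R, x0, P0, period=1):
    """Kalman filter for x(i+1) = A x(i) + w(i), y(i) = C x(i) + v(i).

    The gain K and covariance P are refreshed only when i % period == 0
    (ASSUMPTION: both are simply held constant on the skipped steps).
    Returns the filtered state estimates and the last gain used.
    """
    x = np.array(x0, dtype=complex)
    P = np.array(P0, dtype=complex)
    I = np.eye(len(x))
    K = None
    estimates = []
    for i, y_i in enumerate(y):
        x = A @ x                                  # state prediction
        if i % period == 0:                        # full update step
            P_pred = A @ P @ A.conj().T + Q
            S = C @ P_pred @ C.conj().T + R        # innovation covariance
            # K = P_pred C^H S^{-1}, computed without an explicit inverse
            K = np.linalg.solve(S, C @ P_pred).conj().T
            P = (I - K @ C) @ P_pred
        x = x + K @ (np.atleast_1d(y_i) - C @ x)   # measurement update
        estimates.append(x.copy())
    return np.array(estimates), K
```

With `period=1` this reduces to the ordinary filter. For a slowly varying model, a larger `period` saves most of the covariance arithmetic while the state update still runs at every step.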
Due to the signal delay structure, the KF automatically produces fixed-lag smoothed estimates of the transmitted signal corresponding to all lags up to the channel order. We have demonstrated that an augmented state-space model can be obtained by taking advantage of the chip-delay structure of the channel. Therefore, an arbitrary fixed lag larger than the channel order can be achieved.
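To make the fixed-lag property concrete, the sketch below builds a single-antenna, chip-spaced shift-register model in which the state holds the most recent chips. It is an assumed simplification of the chip-level model, not a reproduction of (34) and (35), and the helper `augmented_chip_model` is hypothetical. When the requested lag exceeds the channel order, the state is simply lengthened and the observation row is zero-padded.

```python
import numpy as np

def augmented_chip_model(h, lag):
    """State x(i) = [d(i), d(i-1), ..., d(i-n+1)]^T with n = max(D, lag) + 1,
    where D = len(h) - 1 is the channel order.

    x(i) = A x(i-1) + B d(i),   y(i) = C x(i) + noise.
    ASSUMPTION: one receive antenna, chip-rate sampling, known taps h.
    """
    h = np.asarray(h, dtype=complex)
    n = max(len(h) - 1, lag) + 1
    A = np.eye(n, k=-1)                 # shift every stored chip down by one
    B = np.zeros(n)
    B[0] = 1.0                          # the new chip enters at the top
    C = np.zeros((1, n), dtype=complex)
    C[0, :len(h)] = h                   # taps act on the newest D + 1 chips
    return A, B, C

# Channel order 1, requested lag 3: the state keeps four chips.
A, B, C = augmented_chip_model([1.0, 0.5], lag=3)
x = np.zeros(4, dtype=complex)
for d in [1, -1, 1, 1, -1]:
    x = A @ x + B * d
```

Because element `lag` of the state is the chip transmitted `lag` steps earlier, the corresponding element of a Kalman filter's filtered state estimate is the fixed-lag smoothed estimate of that chip.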
We have assumed that the channel information is available. In actual receiver operation, however, the channel is estimated from the pilot tones embedded in the downlink signals and is subject to estimation error due to multiple-access interference. The channel estimate can be improved by feeding the soft signal estimate at the output of the KF back to the channel estimator and combining it with the pilots, thereby giving rise to a stronger effective pilot. The soft feedback thus accounts for all traffic interference. On the other hand, hard decision feedback from the output of the decoder through a reencoder is also possible, which, however, accounts only for the desired user traffic.
The KF can be implemented in the square-root form. This implementation is better at keeping the error covariance matrices nonnegative definite, improves the matrix condition number, and enhances the numerical stability.
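One common way to realize a square-root KF is the QR-based array form sketched below, which propagates a factor S with P = S S^H instead of P itself. The text does not state which square-root algorithm is intended, so this choice and the function names are assumptions made for illustration.

```python
import numpy as np

def sqrt_time_update(S, A, Sq):
    """Return S_pred with S_pred S_pred^H = A S S^H A^H + Sq Sq^H."""
    M = np.vstack([(A @ S).conj().T, Sq.conj().T])
    R_fac = np.linalg.qr(M, mode='r')   # M^H M = R_fac^H R_fac
    return R_fac.conj().T

def sqrt_measurement_update(S_pred, C, Sr):
    """Return (S_post, K) for measurement noise covariance Sr Sr^H.

    The triangular factor of the pre-array holds a factor of the innovation
    covariance (X), a scaled gain (Y), and the posterior factor (Z).
    """
    m, n = C.shape
    M = np.block([[Sr.conj().T, np.zeros((m, n))],
                  [S_pred.conj().T @ C.conj().T, S_pred.conj().T]])
    Rm = np.linalg.qr(M, mode='r')
    X, Y, Z = Rm[:m, :m], Rm[:m, m:], Rm[m:, m:]
    K = np.linalg.solve(X, Y).conj().T  # K = Y^H X^{-H} = P C^H (C P C^H + R)^{-1}
    return Z.conj().T, K
```

Since the covariance is only ever formed as a product S S^H, it stays nonnegative definite by construction, which is the property mentioned above.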
Finally, the error covariance matrix P(k|k) contains useful information. Since the diagonal elements of the matrix are a time-varying reliability measure for the chip estimates and, hence, the symbol estimates, incorporating this information into the soft demodulator/decoder may further improve the error performance.

COLOR NOISE MODEL FOR FRACTIONAL SAMPLING
Let $g_c(t)$ and $P$ denote the impulse response of the (continuous-time) receive filter and the integer oversampling factor, respectively. Here, we consider only the case where $P > 1$, since the observation noise is automatically white when $P = 1$ if the deterministic autocorrelation of $g_c(t)$ is a Nyquist pulse. To avoid out-of-band interference, the two-sided nominal bandwidth of $g_c(t)$ is limited to the chip rate $1/T_c$. Therefore, for all practical purposes, $g_c(t)$ is considered band-limited. Since the two-sided bandwidth of $g_c(t)$ is smaller than the sampling rate $1/T_s$, with $T_s = T_c/P$, from the sampling theorem we can represent $g_c(t)$ by its discrete-time (DT) samples $g(i) \triangleq g_c(iT_s)$, that is,

$$g_c(t) = \sum_{i} g(i)\,\mathrm{sinc}\!\left(\frac{t}{T_s} - i\right),$$

where $\mathrm{sinc}(x) \triangleq \sin(\pi x)/(\pi x)$. Assume that the DT measurement noise results from the zero-mean additive white Gaussian noise (WGN) process $w_c(t)$ present at the input of the receive filter. By definition, the autocorrelation of $w_c(t)$ is $E[w_c(t + \tau)\,w_c^*(t)] = N_o\,\delta_c(\tau)$, where $\delta_c(\tau)$ is the Dirac delta function. The response of $g_c(t)$ to the input noise is given by

$$v_c(t) = g_c(t) * w_c(t) = \int g_c(\tau)\,w_c(t - \tau)\,d\tau,$$

where $*$ denotes the convolution operator. When the filtered signal is oversampled by an integer factor of $P$, the DT output noise is

$$v(k) = v_c(kT_s + t_o)$$

for all integer $k$, where $t_o$ is an arbitrary timing offset. We will show that $v(k)$ can be modelled as the output of a DT filter driven by WGN. We define the following:

$$\tilde{w}(k) \triangleq \int \mathrm{sinc}\!\left(\frac{\tau}{T_s}\right) w_c(kT_s + t_o - \tau)\,d\tau.$$

It is easy to show that the DT process $\tilde{w}(k)$ is wide-sense stationary and admits the autocorrelation

$$E[\tilde{w}(k + m)\,\tilde{w}^*(k)] = N_o T_s\,\delta(m),$$

and that substituting the sinc expansion of $g_c(t)$ into the expression for $v(k)$ gives

$$v(k) = \sum_{j=0}^{D_g} g(j)\,\tilde{w}(k - j), \tag{A.7}$$

where we have assumed that $g_c(t)$ has a finite time span from $t = 0$ to $t = D_g T_s$ for some integer $D_g$. Equation (A.7) shows that $v(k)$ is the output of a DT filter driven by WGN.

We now demonstrate how (A.7) can be incorporated into the state-space model for the downlink CDMA signal. We revisit the chip-level state-space model given by (34) and (35). For simplicity, consider the case of one receive antenna sampled at the rate of $P/T_c$, so $N$ is replaced by $P$ and the channel structure remains the same.
The only difference is that the noise vector $v(i)$ in (35) is no longer a white process with respect to $i$ because it is $v(k)$ demultiplexed into $P$ parallel channels. That is, during the $i$th chip interval,

$$v(i) = G\,\tilde{w}(i), \tag{A.9}$$

where $\tilde{w}(i)$ collects the samples of $\tilde{w}(k)$ that contribute to $v(i)$ and

$$G \triangleq \begin{bmatrix} 0 & \cdots & 0 & 0 & g(0) & g(1) & \cdots & g(D_g) \\ 0 & \cdots & 0 & g(0) & g(1) & \cdots & g(D_g) & 0 \\ \vdots & & & & & & & \vdots \\ g(0) & g(1) & \cdots & g(D_g) & 0 & 0 & \cdots & 0 \end{bmatrix}.$$

Note that the upper part of (A.13) comes directly from (34) and the lower part is obvious from the delay structure of (A.11). In addition, (A.14) is obtained by substituting (A.9) into (35) and, from the KF viewpoint, contains no measurement noise. Instead, the source of measurement noise is contained in the new state vector $[x^T(i), \tilde{w}^T(i)]^T$.

Several observations are in order. First, the above development introduces no unknown parameters, as $g(i)$ is completely known since it is the DT-sampled version of the front-end filter. Second, the above noise model allows $g_c(t)$ to be an arbitrary band-limited filter. Third, the new state-dynamics matrix has all its eigenvalues inside the unit circle; in fact, all the eigenvalues are zero. Therefore, the model is stable. Fourth, if the noise processes at different antenna front ends are uncorrelated, the above state-space model can be cascaded for multiantenna receivers. Finally, this model admits the ordinary KF as the optimal linear unbiased estimator since the state noise process $w(i)$ is white over $i$.
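The noise part of this model can be checked numerically with a short sketch. It assumes a particular ordering that the surviving text does not spell out: the noise state stores the latest P + D_g white samples with the newest first, and v(i) lists the P noise samples of chip i with the oldest first. The helper names `build_G`, `block_shift_matrix`, and `colored_noise_via_state` are hypothetical.

```python
import numpy as np

def build_G(g, P):
    """P x (P + D_g) Toeplitz matrix; row r has P - 1 - r leading zeros."""
    g = np.asarray(g)
    Dg = len(g) - 1
    G = np.zeros((P, P + Dg), dtype=g.dtype)
    for r in range(P):
        G[r, P - 1 - r : P - r + Dg] = g
    return G

def block_shift_matrix(P, Dg):
    """Noise-state dynamics: shift stored samples down by P places per chip."""
    return np.eye(P + Dg, k=-P)

def colored_noise_via_state(g, P, w):
    """Generate v(k) chip by chip from white samples w (length multiple of P)."""
    Dg = len(g) - 1
    G, Fm = build_G(g, P), block_shift_matrix(P, Dg)
    s = np.zeros(P + Dg, dtype=complex)      # noise state, newest sample first
    v = []
    for i in range(len(w) // P):
        s = Fm @ s                           # age the stored samples
        s[:P] = w[i * P:(i + 1) * P][::-1]   # P new white samples enter
        v.extend(G @ s)                      # v(i) = G * noise state
    return np.array(v)

g = np.array([1.0, 0.5, 0.25])               # example filter samples (assumed)
rng = np.random.default_rng(0)
w = rng.standard_normal(20) + 1j * rng.standard_normal(20)
v = colored_noise_via_state(g, P=2, w=w)     # matches np.convolve(w, g)[:20]
```

The shift matrix is strictly lower triangular and therefore nilpotent, which is another way of seeing that all eigenvalues of the noise-state dynamics are zero.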