EURASIP Journal on Applied Signal Processing 2005:5, 683–697
© 2005 Hindawi Publishing Corporation

Computationally Efficient Blind Code Synchronization for Asynchronous DS-CDMA Systems with Adaptive Antenna Arrays

A novel space-time adaptive near-far robust code-synchronization array detector for asynchronous DS-CDMA systems is developed in this paper. Its basic requirements are the same as those of the conventional matched filter of an asynchronous DS-CDMA system. For real-time applicability, a computationally efficient architecture of the proposed detector is developed that is based on the concept of the multistage Wiener filter (MWF) of Goldstein and Reed. This multistage technique results in a self-synchronizing detection criterion that requires no inversion or eigendecomposition of a covariance matrix. As a consequence, this detector achieves a complexity that is only a linear function of the size of the antenna array ($J$), the rank of the MWF ($M$), the system processing gain ($N$), and the number of samples in a chip interval ($S$), that is, $O(JMNS)$. The complexity of the equivalent detector based on the minimum mean-squared error (MMSE) or the subspace-based eigenstructure analysis is a function of $O((JNS)^3)$. Moreover, this multistage scheme provides rapid adaptive convergence under limited observation-data support. Simulations are conducted to evaluate the performance and convergence behavior of the proposed detector with respect to the size of the $J$-element antenna array, the amount of $L$-sample support, and the rank of the $M$-stage MWF. The performance advantage of the proposed detector over other DS-CDMA detectors is investigated as well.


INTRODUCTION
Spread-spectrum communication systems have been used successfully in military applications for several decades. Recently, direct-sequence (DS) code-division multiple access (CDMA), a specific form of spread-spectrum transmission, has become an important component in third-generation (3G) mobile communication systems, such as wideband CDMA (W-CDMA) or multicarrier CDMA (MC-CDMA) for 3G cellular radio systems, because of its many advantages compared with the conventional frequency- and/or time-division multiple-access (FDMA/TDMA) systems. In a DS-CDMA communication system, all users are allowed to transmit information simultaneously and independently over a common channel using preassigned spreading waveforms or signature sequences that uniquely identify the users. In [1], Verdú demonstrates that a DS-CDMA receiver is not fundamentally multiple-access interference (MAI) limited and can be near-far resistant. The optimal multiuser detector proposed there for DS-CDMA signals comprises a bank of matched filters followed by a maximum-likelihood sequence detector whose decision algorithm is the Viterbi algorithm. Unfortunately, the computational complexity of Verdú's detector grows exponentially with the number of users, which is much too complex for practical DS-CDMA systems. A variety of suboptimal DS-CDMA receivers resistant to MAI have been proposed over the last decade or so (e.g., [2] and additional references therein), such as the decorrelating receiver [3], the MMSE receiver [4], and the multistage successive interference cancellation (SIC) [5] and parallel interference cancellation (PIC) [6] receivers. However, most DS-CDMA multiuser receivers use detection systems that require precise time-delay knowledge of all the users, which is usually not known to the receiver a priori. To use such algorithms, the time delays have to be estimated, and the receivers that rely on these estimates suffer from high complexity and from errors in the estimated propagation delays.
The effect of imperfect time-delay estimation, that is, delay mismatch, degrades dramatically the capability of such a receiver to adequately establish code acquisition and demodulation [7]. Hence, synchronization has become an essential part of all communication systems.
In a nonorthogonal CDMA system, the sliding correlator [8] for time-delay estimation often suffers from the so-called near-far problem. Reliable communication links based on the conventional correlator can only be achieved by utilizing a stringent power control mechanism and by increasing the transmit-power level or the ratio of the spreading factor (SF) to the number of users. Fortunately, the acquisition performance can be enhanced considerably if the MAI is mitigated or suppressed effectively. Existing MAI-resistant propagation-delay acquisition techniques include the following. A modified correlator-type timing estimator based on the minimum mean-squared error (MMSE) criterion is proposed in [9]. The MMSE scheme substantially outperforms the conventional correlator-based methods, especially in a near-far environment. However, an all-one training sequence is required for it to function properly. In [10], a maximum-likelihood synchronization method for single users is developed, but this method again requires a training period. Subspace-based code-timing estimators that use a single antenna element are presented in [11,12,13]. However, these timing estimators involve intensive computations due to the requirement of an eigendecomposition. Additionally, knowledge of the exact number of active users is needed.
The incorporation of adaptive-array antennas in cellular systems to mitigate the MAI, time dispersion, and multipath fading that occur in mobile communications has received considerable attention in recent research. This is due to the fact that base stations are being equipped with a number of antenna elements. The spacing between antenna elements at the base station is assumed to be close enough, typically half the signal-carrier wavelength. This type of antenna array can be used as a beamforming array, where the received signal's envelope correlation at each antenna element is equal to one. In other words, the same signal is received by all elements of the beamforming array. A $J$-element beamforming array antenna is known to be able to perform beamforming with $J - 1$ degrees of freedom to control the directions of $J - 1$ nulls of the antenna. Hence, a better acquisition and demodulation performance for asynchronous DS-CDMA signals can be expected in comparison to the single-antenna case. Multiple-element antenna algorithms that utilize the large-sample maximum-likelihood (LSML) estimation in [14,15] and the subspace-based multiple signal classification (MUSIC) in [16] are used to perform code-timing acquisition over a time-varying fading channel. The resulting computational cost of a covariance matrix inversion or an eigendecomposition is $O((JNS)^3)$ [17]. Here the big $O(\cdot)$ notation indicates that the complexity in number of operations is proportional to the argument. This requirement is computationally expensive in a nonstationary environment because the receiver filter coefficients need to be recalculated quite often. In [18], a decoupled multiuser acquisition (DEMA) algorithm for code-timing estimation is introduced. It provides improved timing accuracy and a lower computational cost than LSML. However, the DEMA algorithm has restricted applicability because it needs the code sequences and the transmitted data bits of all users.
A filterbank-based blind code-synchronization scheme that requires only the signature vector of the desired user is proposed in [19]. This filterbank scheme can be used to perform code acquisition and code tracking in frequency-flat and frequency-selective, time-invariant, and time-varying fading channels. However, this algorithm again requires the formation and inversion of a covariance matrix. As a consequence, the computational complexities of these proposed systems remain high, and the systems are thus of limited practical use.
In the present paper, an adaptive near-far robust synchronization array detector for space-time asynchronous DS-CDMA signals is developed. The primary requirement of the proposed timing synchronization system is knowledge of the spreading-code signature vector of the desired user, making it ideal for a decentralized implementation. There is no need for a pilot signal, a side channel, a long training period, or signal-free observations. Furthermore, a computationally efficient implementation of the proposed detector that utilizes the recently developed reduced-rank multistage Wiener filter (MWF) of Goldstein et al. [20] is presented. By exploiting the low-rank MWF structure, one can not only avoid the computationally expensive matrix inversion operation, but also maintain a performance close to that of the full-rank counterpart with a much smaller number of data samples. Consequently, the computational complexity of the system is reduced substantially from $O((JNS)^3)$ to $O(JMNS)$ for each computing cycle of clock time, where $1 \le M \le JNS - 1$. In fact, the multistage structure can achieve near full-rank detection and estimation performance with often only a small number of stages, that is, $M \ll JNS$. Therefore, the computational complexity achieved by the proposed array detector is comparable to the complexity $O(JNS)$ of the MMSE CDMA detector that uses the adaptive least-mean-square (LMS) coefficient-update algorithm [21]. However, the proposed detector does not have the drawbacks of convergence instability and sluggishness of an LMS-based algorithm, because it does not depend on the eigenvalue spread. Moreover, the achieved computational efficiency is better than that of the adaptive recursive least-squares (RLS) tap-update algorithm used in the linear MMSE CDMA detector (with $O((JNS)^2)$ operations) [21].
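To give a feel for the complexity orders quoted above, the following worked arithmetic plugs in illustrative values. The choices $J = 2$, $N = 31$, and $M = 4$ match values used later in the simulations, while $S = 1$ is an assumption made here for illustration only. The numbers are orders of growth, not exact operation counts.

```latex
% Illustrative values: J = 2, N = 31, M = 4; S = 1 is an assumption.
\begin{align*}
JNS     &= 2 \cdot 31 \cdot 1 = 62
        && \text{(LMS order)}, \\
JMNS    &= 2 \cdot 4 \cdot 31 \cdot 1 = 248
        && \text{(rank-$M$ MWF order)}, \\
(JNS)^2 &= 62^2 = 3\,844
        && \text{(RLS order)}, \\
(JNS)^3 &= 62^3 = 238\,328
        && \text{(matrix inversion or eigendecomposition order)}.
\end{align*}
```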
Also this multistage adaptive filtering scheme provides a rapid adaptive convergence and tracking capability under limited observation-data support. These important features contribute significantly to the reduction of the computational cost and amount of data sample support needed to accurately estimate a covariance matrix.
The material included in this paper is organized as follows: in Section 2, an asynchronous DS-CDMA signal model is outlined. Section 3 develops the test statistic for the proposed code-synchronization detector and derives an equivalent structure of the classical generalized sidelobe canceler (GSC) as well. In particular, an effective decision-feedback (DF) adaptive scheme for the steering vector is detailed in Section 3.3. In Section 4, an adaptive batch-mode truncated MWF realization is introduced and its performance is evaluated via computer simulations in Section 5. The comparison between the proposed reduced-rank multistage scheme with other timing estimation techniques is also evaluated in Section 5. Finally, concluding remarks are given in Section 6.

ASYNCHRONOUS DS-CDMA SIGNAL MODEL
In DS-CDMA systems, all users transmit simultaneously in the same frequency band. Consider an asynchronous DS-CDMA mobile radio system with $K$ users that employs $K$ spreading waveforms $s_1(t), s_2(t), \ldots, s_K(t)$ and their transmitted sequences of BPSK symbols. The received baseband continuous-time signal, which impinges on the receiving antenna array with $J$ sensors in an additive white Gaussian noise (AWGN) channel, is a superposition of all $K$ signals:

$$\mathbf{r}(t) = \sum_{l=1}^{K} \mathbf{r}_l(t) + \mathbf{n}(t), \quad (1)$$

where $\mathbf{n}(t)$ is an AWGN vector and $\mathbf{r}_l(t)$ is the signal of user $l$, defined in (2). The parameters of (2) include (v) $T_b$: information (data) symbol interval; (vi) $\tau_l$: propagation delay of user $l$.
We assume that different symbols of the same user, as well as symbols of different users, are uncorrelated. The $s_l(t)$ in (2) is the spreading waveform of user $l$, given by

$$s_l(t) = \sum_{n=0}^{N-1} c_{l,n}\, p(t - nT_c), \quad (3)$$

where $T_c$ is the chip interval and $p(t)$ represents the rectangular chip waveform of duration $T_c$. In one symbol period, there are $N = T_b/T_c$ chips, modulated with the spreading code sequence $(c_{l,0}, c_{l,1}, \ldots, c_{l,N-1})$. Here $N$ is called the spreading factor. The spreading sequences are repeated periodically in each symbol duration (i.e., length-$N$ short spreading codes are employed).
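The relation between the chip sequence and a sampled spreading waveform can be illustrated with a minimal sketch. It assumes a rectangular chip pulse sampled $S$ times per chip and ignores any shaping introduced by the receive filter; the function name and the length-7 code are hypothetical and are not taken from the paper.

```python
def discretize_spreading_code(chips, samples_per_chip):
    """Repeat each chip S times (rectangular chip pulse, S samples per chip)."""
    if samples_per_chip < 1:
        raise ValueError("samples_per_chip must be >= 1")
    return [c for c in chips for _ in range(samples_per_chip)]


# Hypothetical length-7 code, used only for illustration.
chips = [1, -1, 1, 1, -1, -1, 1]
s1 = discretize_spreading_code(chips, samples_per_chip=2)
print(len(s1))  # number of samples per symbol, N * S
```

With $N$ chips and $S$ samples per chip, the resulting vector has $NS$ entries, which is the length of the discretized spreading-code vector used in the detector.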

STRUCTURE OF SYNCHRONIZATION DETECTOR
The proposed receiver is described by means of a baseband-equivalent structure. Such baseband complex signal processing is physically achieved by the combination of quadrature demodulation and a phase-locked loop (PLL) (see [22, Chapter 6]), which converts the received radio-frequency (RF) modulated signal to a baseband complex-valued signal. The received signal of each individual antenna sensor is then passed through a chip matched filter (CMF), one for each antenna element $k = 1, 2, \ldots, J$. Subsequently, the output of the CMF for each antenna element is sampled every $T_s$ seconds, where $S = T_c/T_s$ is an integer and $S \ge 1$. Assume that the output signals of the CMFs are sampled at the time instant $iT_s$. The tapped delay lines (TDLs) for the $J$-element antenna array are expressed as a $J \times NS$ data array $\mathbf{Z}[i]$. The data matrix $\mathbf{Z}[i] \in \mathbb{C}^{J \times NS}$ is then "vectorized" by sequencing all matrix rows in the form of a vector $\mathbf{x}[i]$, as in (6). The vector $\mathbf{x}[i]$ in (6) and the weight vector $\mathbf{w}[i]$ in (7) are both $JNS$-vectors. The components of the weight vector $\mathbf{w}[i]$ as an optimum Wiener filter are determined later in (30). The output of the TDL filter is the inner product $\mathbf{w}^{\dagger}[i]\,\mathbf{x}[i]$ of the vectors in (6) and (7), where superscripts $(\cdot)^{\dagger}$ and $(\cdot)^{*}$ denote the conjugate (Hermitian) transpose of a matrix and the conjugate of a complex number, respectively. This output is passed through the time-synchronization acquisition system to obtain the synchronization information. This time acquisition system can be modeled conceptually as a filter bank constructed of $NS$ filters in sequence, each of the type shown above, in order to identify the time phase of the desired user.
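The vectorization and inner-product steps described above can be sketched as follows. This is a minimal illustration in plain Python, assuming the data matrix is stored as a list of $J$ rows of $NS$ complex samples; the function names and the toy values are hypothetical.

```python
def vectorize_rows(Z):
    """Stack the rows of a J x NS matrix (list of lists) into one JNS-vector."""
    return [z for row in Z for z in row]


def filter_output(w, x):
    """Inner product w^dagger x, i.e. the sum of conj(w_k) * x_k."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    return sum(wk.conjugate() * xk for wk, xk in zip(w, x))


# Toy example with J = 2 antenna rows and NS = 3 samples per row.
Z = [[1 + 1j, 2 + 0j, 0 - 1j],
     [0 + 0j, 1 - 1j, 3 + 0j]]
x = vectorize_rows(Z)
w = [1 + 0j, 0j, 1j, 0j, 1 + 0j, 0j]
y = filter_output(w, x)
print(y)
```

Stacking row by row keeps the samples of each antenna element contiguous, which matches the ordering of a Kronecker product of a direction vector with a spreading-code vector.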

Test statistic of synchronization detector
In this paper, the detection of a single desired user's signature vector embedded in MAI plus noise is modeled as a binary-hypothesis testing problem, where $H_0$ corresponds to target-signal absence and $H_1$ corresponds to target-signal presence. Thus, at each time phase of the $JNS$-vector $\mathbf{x}[i]$, the time-synchronization detector must distinguish between two hypotheses for the desired user, say user 1. The target-signal vector under hypothesis $H_1$ is given by the $JNS$-vector $A_1 a_1 d_1 (\mathbf{b}_1 \otimes \mathbf{s}_1)$, where $A_1$ is the amplitude of user 1, $a_1$ denotes the complex gain introduced by the channel, $d_1$ is the information bit of user 1, $\mathbf{b}_1$ represents the direction $J$-vector of user 1, and $\mathbf{s}_1 = [c_{1,0}, c_{1,1}, \ldots, c_{1,NS-1}]$ is the discretized spreading-code $NS$-vector of user 1. The notation $(\cdot) \otimes (\cdot)$ represents the Kronecker product of vectors. For a linear array and identical element patterns, $\mathbf{b}_1$ is determined by the signal-carrier wavelength $\lambda$, the spacing $d$ between antenna elements, and the angular antenna-boresight bearing $\theta_1$ of user 1 in radians. The two hypotheses that the adaptive detector must distinguish at each sampling time are given by

$$H_0: \mathbf{x}[i] = \mathbf{v}[i], \qquad H_1: \mathbf{x}[i] = g_1 d_1 (\mathbf{b}_1 \otimes \mathbf{s}_1) + \mathbf{v}[i], \quad (12)$$

where the complex scalar $g_1 = A_1 a_1$ in (12) collects the amplitude and channel gain of user 1, and $\mathbf{v}[i]$ represents the interference-plus-noise environment without the target signal $g_1 d_1 (\mathbf{b}_1 \otimes \mathbf{s}_1)$. The interference-plus-noise process is assumed to approximate zero-mean, colored, complex Gaussian noise [15,21], with the associated covariance matrix $\mathbf{R}_v[i] = E\{\mathbf{v}[i]\,\mathbf{v}^{\dagger}[i]\}$. The random vector $\mathbf{x}[i]$, when conditioned on the information symbol $d_1$, is an approximate complex Gaussian process under both hypotheses. The conditional probability density of $\mathbf{x}[i]$ given $H_1$ can be expressed in terms of the conditional probabilities $P(\mathbf{x}[i] \mid H_1, d_1)$ for $d_1 = 1$ or $-1$ as

$$P(\mathbf{x}[i] \mid H_1) = \tfrac{1}{2} P(\mathbf{x}[i] \mid H_1, d_1 = 1) + \tfrac{1}{2} P(\mathbf{x}[i] \mid H_1, d_1 = -1),$$

where it is assumed that $P(d_1 = 1) = P(d_1 = -1) = 1/2$. The Bayes-optimum likelihood-ratio test (LRT) [23] then reduces to the test in (15), a threshold test on the hyperbolic cosine of a real-valued linear statistic of $\mathbf{x}[i]$, where $\mathrm{Re}\{\cdot\}$ denotes the real part.
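As an illustration of how the space-time signature $\mathbf{b}_1 \otimes \mathbf{s}_1$ is assembled, the sketch below builds a direction vector for a uniform linear array and forms the Kronecker product with a spreading-code vector. The phase convention $e^{-j 2\pi (d/\lambda) k \sin\theta}$ for element $k$ is an assumption made here (the paper's exact expression for $\mathbf{b}_1$ is not reproduced), and all function names are hypothetical.

```python
import cmath
import math


def ula_direction_vector(num_elements, spacing_over_wavelength, theta):
    """Direction vector of a uniform linear array.

    Assumed convention: element k has phase
    exp(-1j * 2*pi * (d / lambda) * k * sin(theta)), k = 0..J-1.
    """
    phase = 2.0 * math.pi * spacing_over_wavelength * math.sin(theta)
    return [cmath.exp(-1j * k * phase) for k in range(num_elements)]


def kron(a, b):
    """Kronecker product of two vectors: [a_0 * b, a_1 * b, ...]."""
    return [ai * bj for ai in a for bj in b]


# Half-wavelength spacing, J = 2 elements, user at 30 degrees off boresight.
b1 = ula_direction_vector(2, 0.5, math.radians(30.0))
s1 = [1, -1, 1]  # hypothetical discretized code with NS = 3
signature = kron(b1, s1)
print(len(signature))  # J * NS entries
```

Because the first array element is taken as the phase reference, the first $NS$ entries of the signature equal the spreading-code vector itself; the remaining blocks are phase-rotated copies, one per antenna element.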
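Evidently, this test no longer depends on the value of $d_1$. Since the hyperbolic cosine function $\cosh(\cdot)$ is monotonically increasing in the magnitude of its argument, the test in (15) is clearly equivalent to the test in (16), in which the magnitude of that argument is compared with the detection threshold $\gamma_1$. Define what is called the steering vector as

$$\mathbf{u} = g_1 (\mathbf{b}_1 \otimes \mathbf{s}_1). \quad (17)$$

Thus, the test statistic in (16) can be reexpressed as

$$\left| \mathrm{Re}\{ \mathbf{u}^{\dagger} \mathbf{R}_v^{-1}[i]\, \mathbf{x}[i] \} \right| \underset{H_0}{\overset{H_1}{\gtrless}} \gamma_1. \quad (18)$$

To perform the test in (18), an estimate of the steering vector $\mathbf{u}$ is needed. The identity in (19) implies that the quantity $\mathbf{u}_d[i]$ under the expected value in (19) is an unbiased estimate of $d_1 \mathbf{u}$, with $\mathbf{u}$ defined in (17). That is, $\mathbf{u}_d[i]$ in (20) is the desired estimate of $d_1 \mathbf{u}$. Even though a sign difference may exist between $\mathbf{u}_d[i]$ in (20) and the vector $\mathbf{u}$ in (17) when $d_1 = -1$, they can be used interchangeably in the magnitude test in (18), which is used for time-synchronization acquisition [24].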
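The step from the likelihood ratio to a magnitude test can be checked with a short derivation. The sketch below assumes that the interference-plus-noise vector is zero-mean circular complex Gaussian with covariance $\mathbf{R}_v$, so that its density is proportional to $\exp(-\mathbf{x}^{\dagger}\mathbf{R}_v^{-1}\mathbf{x})$, and it writes the target signal as $d_1\mathbf{u}$ with $d_1 = \pm 1$. The time index $[i]$ is dropped for brevity, and the constants may be arranged differently in the paper's own equations.

```latex
% Assumption: interference-plus-noise ~ CN(0, R_v); target signal d_1 u with d_1 = +1 or -1.
\begin{align*}
\frac{P(\mathbf{x}\mid H_1, d_1)}{P(\mathbf{x}\mid H_0)}
  &= \exp\!\big(-(\mathbf{x}-d_1\mathbf{u})^{\dagger}\mathbf{R}_v^{-1}(\mathbf{x}-d_1\mathbf{u})
      + \mathbf{x}^{\dagger}\mathbf{R}_v^{-1}\mathbf{x}\big) \\
  &= \exp\!\big(2 d_1\,\mathrm{Re}\{\mathbf{u}^{\dagger}\mathbf{R}_v^{-1}\mathbf{x}\}
      - \mathbf{u}^{\dagger}\mathbf{R}_v^{-1}\mathbf{u}\big),
      \qquad d_1^2 = 1, \\
\frac{P(\mathbf{x}\mid H_1)}{P(\mathbf{x}\mid H_0)}
  &= \tfrac{1}{2}\sum_{d_1=\pm 1}\frac{P(\mathbf{x}\mid H_1, d_1)}{P(\mathbf{x}\mid H_0)}
   = e^{-\mathbf{u}^{\dagger}\mathbf{R}_v^{-1}\mathbf{u}}\,
     \cosh\!\big(2\,\mathrm{Re}\{\mathbf{u}^{\dagger}\mathbf{R}_v^{-1}\mathbf{x}\}\big).
\end{align*}
```

Since the exponential factor does not depend on the data and $\cosh(\cdot)$ is even and increasing in the magnitude of its argument, thresholding this ratio is equivalent to thresholding $|\mathrm{Re}\{\mathbf{u}^{\dagger}\mathbf{R}_v^{-1}\mathbf{x}\}|$.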

An equivalent GSC-form structure
Note that the likelihood-ratio test in (18) has been proven in [24] to be conserved by any invertible linear transformation $\mathbf{T}$. Therefore, in order to avoid the computationally cumbersome estimation of the matrix $\mathbf{R}_v[i]$, a nonsingular $JNS \times JNS$ linear transformation $\mathbf{T}_1[i]$ is introduced. Its structure combines the steering vector $\mathbf{u}$ in (17) with a blocking matrix $\mathbf{B}_1[i]$, which annihilates the signal components in the direction of the vector $\mathbf{u}$ such that $\mathbf{B}_1[i]\,\mathbf{u} = \mathbf{0}$. Hence, the transformation of the vector $\mathbf{x}[i]$ by the operator $\mathbf{T}_1[i]$ splits the data vector into two channels or paths, namely, $\delta_1[i]$ and $\mathbf{x}_1[i]$. The $\delta_1[i]$ channel carries the same process as that obtained from the conventional cross-correlation detector. The "auxiliary" channel $\mathbf{x}_1[i]$ is used to cancel MAI with a Wiener filter that estimates the nonwhite residual noise process in the $\delta_1[i]$ channel. Thus, the subsequent multistage decomposition process for a Wiener filter provides a natural and optimal way to accomplish such a stage-by-stage interference cancellation task. The correlation matrix of the transformed vector is partitioned accordingly.

The signal-free correlation matrix $\mathbf{R}_v[i]$, needed in (18), is expressed in terms of $\mathbf{R}_x[i]$ under hypothesis $H_1$ by the relation

$$\mathbf{R}_v[i] = \mathbf{R}_x[i] - \mathbf{u}\mathbf{u}^{\dagger}, \quad (25)$$

where $\mathbf{u}\mathbf{u}^{\dagger}$ in (25) is the $JNS \times JNS$ outer-product matrix of the vector $\mathbf{u}$ in (17) with itself. If one defines the positive scalar (norm) $\Delta_1[i] = \sqrt{\mathbf{u}^{\dagger}\mathbf{u}}$, one obtains, using (25), the corresponding relations for the transformed quantities. The inverse of $\mathbf{R}_v$ is then determined with the aid of the matrix inversion lemma for partitioned matrices [25], which leads to the test statistic in (30), expressed through the quantity $q[i]$ in (31). Evidently, this test statistic has the form of the classical GSC [26], as shown in Figure 1, which was used originally to suppress or cancel interferers or jammers in radar and communication systems.
When hypothesis $H_0$ is true, $\mathbf{R}_v[i]$ is equivalent to $\mathbf{R}_x[i]$ due to the absence of the target signal $g_1 d_1 (\mathbf{b}_1 \otimes \mathbf{s}_1)$ in (26). For this case, the correlation matrix $\mathbf{R}_v[i]$ may therefore be replaced by $\mathbf{R}_x[i]$. The integer time phase $\hat{i} \in \{i, i-1, \ldots, i-NS+1\}$ at which coarse synchronization is most likely to occur within the interval $(i-NS+1, i)$ is determined by (33).
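The coarse-synchronization step amounts to a search over the $NS$ candidate time phases of one symbol window. The sketch below assumes that the per-phase statistic is the magnitude of the real part of a complex quantity $q$, as suggested by the magnitude test; the exact statistic in (33) is not reproduced here, and the function name and values are hypothetical.

```python
def coarse_sync_phase(q_values):
    """Return the window index whose statistic |Re{q}| is largest.

    q_values holds one complex detector output per candidate time phase.
    """
    if not q_values:
        raise ValueError("need at least one statistic")
    return max(range(len(q_values)), key=lambda k: abs(q_values[k].real))


# Hypothetical outputs for three candidate phases.
q_window = [0.1 + 5.0j, -2.0 + 0.0j, 1.5 - 1.0j]
print(coarse_sync_phase(q_window))
```

Only the real part enters the decision, so a large imaginary component (as in the first entry) does not trigger acquisition, and the sign of the real part is irrelevant because the unknown information bit may be either $+1$ or $-1$.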

Decision-feedback adaptation scheme
One of the cornerstones of the proposed algorithm is the estimation accuracy of the steering vector $\mathbf{u}$ in (17). In [27], the vector $\mathbf{u}$ is defined by the cross-correlation between the received space-time data vector $\mathbf{x}[i]$ and the desired information bit $d_1$, as follows:

$$\mathbf{u} = E\{\mathbf{x}[i]\, d_1\}, \quad (34)$$

under the assumption of $\tau_1 = 0$, that is, equivalently, hypothesis $H_1$. In other words, attention is focused only on a synchronous DS-CDMA channel. The statistical expectation in (34) is taken with respect to the information bits $d_1$. In practice, the vector $\mathbf{u}$ in (34) is realized by (35), in the form of a sample average in a "supervised" mode over a sequence $\{\mathbf{x}_p[i]\}_{p=1}^{P}$ of joint space-time data vectors.
In this paper, an accurate estimate of $\mathbf{u}$ in (17) is achieved by means of an initial training symbol followed by decision-directed adaptation, and it is then applied to an asynchronous DS-CDMA scenario. Thus, the estimated information symbol $\hat{d}_1$ is utilized as feedback information to provide an accurate estimate of the vector $\mathbf{u}$ in (17). An efficient recursive formula, given in (36), can be used to update the estimate of the vector $\mathbf{u}$ within the $p$th symbol interval, where $\mathbf{u}^{(p-1)}[i]$ is the estimate of the vector $\mathbf{u}$ in the $(p-1)$th symbol interval, and the decision $\hat{d}_1^{(p-1)}$ is updated by the $(p-1)$th observed data. Here $\hat{d}_1^{(0)} = 1$ denotes an initial training symbol used as a preamble. In addition, the vector $\mathbf{u}^{(p)}[i]$ in (36) can serve as the space-time RAKE filter for a slowly fading channel. To examine the adaptive learning capability of the iterative procedure proposed in (36) for the steering vector $\mathbf{u}$, an asynchronous DS-CDMA system with the parameters $J = 2$, $K = 6$, $N = 31$, SNR = 10 dB, and NFR = $10^{\Gamma_l/10}$, where $\Gamma_l \sim N(4, 16)$, is considered. In Figure 2, the normalized correlation coefficient of the steering-vector estimate at time phase $\hat{i}$, where $\hat{i}$ is defined and derived in (33), is shown versus the number of iterations $p$ used in the recursive adaptation. Note that the detector is developed using only the minimum required information, with only the desired spreading code vector being known at the receiver, and has a limited computational complexity. Therefore, it is suitable not only for base stations (on uplinks) but also for mobile users (on downlinks). The performance could be improved further by utilizing a more precise estimate of the steering vector that is derived from the correlations between users or from estimates of the $K$ spatial channels. Any method that uses channel estimation [28,29] could be used to obtain a more precise estimate of the vector $\mathbf{u}$, but at the expense of extra computational complexity.
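The decision-feedback idea can be sketched as a running average in which each new snapshot is weighted by a bit decision. This is a minimal illustration and not the paper's recursion (36): it assumes the update $\mathbf{u}^{(p)} = \frac{p-1}{p}\mathbf{u}^{(p-1)} + \frac{1}{p}\hat{d}\,\mathbf{x}_p$, a known training symbol of $+1$ for the first snapshot, and a plain correlator decision $\hat{d} = \mathrm{sgn}(\mathrm{Re}\{\mathbf{u}^{\dagger}\mathbf{x}\})$; all function names are hypothetical.

```python
def inner(a, b):
    """Return a^dagger b for two equal-length complex vectors."""
    return sum(ak.conjugate() * bk for ak, bk in zip(a, b))


def sign(value):
    """Sign operator with sign(0) taken as +1."""
    return 1.0 if value >= 0.0 else -1.0


def update_steering_estimate(u_prev, x_new, d_hat, p):
    """Running average over p snapshots (assumed form of the recursion)."""
    if p < 1:
        raise ValueError("p must be >= 1")
    return [((p - 1) * uk + d_hat * xk) / p for uk, xk in zip(u_prev, x_new)]


def track_steering_vector(snapshots):
    """Decision-directed estimate of u from a sequence of snapshots."""
    u_est = [0j] * len(snapshots[0])
    d_hat = 1.0  # assumed training symbol for the first snapshot
    for p, x in enumerate(snapshots, start=1):
        if p > 1:
            d_hat = sign(inner(u_est, x).real)  # decision fed back
        u_est = update_steering_estimate(u_est, x, d_hat, p)
    return u_est


# Noise-free toy data: x_p = d_p * u with a hypothetical u.
u_true = [1 + 1j, -1 + 0j]
bits = [1, -1, 1, -1]
snapshots = [[b * uk for uk in u_true] for b in bits]
u_hat = track_steering_vector(snapshots)
print(u_hat)
```

In the noise-free toy case every decision is correct, so each weighted snapshot equals the true vector and the average stays on it. With noise and MAI present, wrong decisions would bias the average, which is why the quality of the initial estimate matters.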
The decision statistic of the information symbol $d_1$ based on the MMSE technique [4] is shown next to be generated by the GSC-form structure developed in Section 3.2. Let $\mathbf{w}_{\mathrm{MMSE}}[\hat{i}]$ be the filter-weight vector based on the MMSE criterion and let $\mathbf{x}[\hat{i}]$ denote the observation vector at time phase $\hat{i}$ obtained upon coarse synchronization in (33). Then the estimate of the information symbol $d_1$ has the form

$$\hat{d}_1 = \mathrm{sgn}\big(\mathrm{Re}\{\mathbf{w}_{\mathrm{MMSE}}^{\dagger}[\hat{i}]\,\mathbf{x}[\hat{i}]\}\big), \quad (37)$$

where sgn denotes the sign operator. The decision statistic in (37) can be modified by the techniques used in Section 3.2 to the test function given in (38). The quantity $\omega_1[i]$ in (30) can be proved to be strictly positive, due primarily to the fact that the scalar $\kappa_1^{-1}[i]$ is one of the diagonal elements of a positive-definite matrix. This fact is also demonstrated experimentally in [30]. The term $\omega_1[i]$ is a positive scalar over the symbol period, and as a consequence it can be ignored in the test in (38) for the determination of the information-bearing symbol. Thus, the estimate of the information symbol $d_1$ in (39) is obtained by ignoring the positive scalar in (38). By (31) and (39), the term $q[\hat{i}]$ is needed in common by both the coarse synchronization and the demodulation operations. This term can be computed and stored during the adaptive acquisition and synchronization process; it does not need to be recomputed for demodulation. However, to launch this DF adaptive estimation algorithm, an initial rough estimate of the time delay is required, which is determined by the term $|\mathrm{Re}\{q[i]\}|$ in (31). In other words, the same test as in (30) is utilized with the term $\omega_1[i]$ ignored, because $\omega_1[i]$ does not vary significantly over the symbol interval [30].

Reduced-complexity multistage analysis
To derive the desired reduced-rank multistage decomposition of the test statistic in (30), a sequence of orthogonal projections is applied to the observed data vector. Thus, the same procedure used for the multistage decomposition in the first stage is repeated in the second stage of this process. Define a new nonsingular transformation $\mathbf{T}_2[i]$, analogous to $\mathbf{T}_1[i]$, so that the test statistic in (30) can be rewritten in the two-stage form (41). The variance of the error signal $\varepsilon_2[i]$ in (42) is then computed readily, and the variance $\xi_1[i]$ of the scalar process $\varepsilon_1[i]$ can be expressed further in terms of the second-stage quantities. A continuation of this decomposition process, extending (41), yields the $JNS$-stage test statistic in terms of a sequence of only scalar quantities [23]. Hence, this filter-bank structure is optimal in terms of reducing the MSE for a given rank, and if the multistage orthogonal decomposition is carried out for the full $JNS$ stages, then the multistage filter is exactly equivalent to the full-rank classical Wiener filter. Rank reduction is concerned with finding a low-rank subspace, say of rank $M$, described by the matrix $\mathbf{Q}_M[i]$ given in (46). With the $\mathbf{Q}_M[i]$ given in (46), the low-dimensional filter-weight vector $\mathbf{w}_M[i] \in \mathbb{C}^{M \times 1}$ is obtained. The error-synthesis filterbank of the $M$-stage MWF is composed of $M$ nested scalar Wiener filters. It is evident that the observation vector is projected onto a lower-dimensional subspace, and the proposed reduced-rank Wiener filter is then constructed to lie in this subspace. This procedure makes possible optimal signal detection and accurate signal estimation while allowing for a lower computational complexity and a smaller sample support. Remarkably, this multistage Wiener filter does not require an estimate of the covariance matrix or its inverse when the statistics are unknown, since the only requirements are estimates of the cross-correlation vectors and scalar correlations, which can be obtained directly from the observed data vectors.
From (46) and (49), the MWF with full $JNS$ stages can be mapped to the equivalent $JNS$-dimensional Wiener filter.

[Algorithm 1: forward recursion of the multistage decomposition]

The $JNS \times 1$ correlated random vector $\breve{\mathbf{d}}_{JNS}[i]$ is computed first. Finally, an equivalent Gram-Schmidt matrix of the error-synthesis filterbank, defined in (55), is applied to $\breve{\mathbf{d}}_{JNS}[i]$ to produce the uncorrelated error $JNS$-vector $\breve{\boldsymbol{\varepsilon}}_{JNS}[i]$ [20].

BATCH-MODE TRUNCATED MWF REALIZATION
In Algorithm 1, the jth-stage signal blocking matrix, B j [i] = null(u j [i]), may be computed using the methods detailed in [31, Appendices A and C], or any other method which results in a valid transformation matrix T j . Here a training-based (batch-mode or FIR) algorithm in [32,33,34] for the multistage decomposition is used. The dimension of the blocking matrix B j [i] is kept the same for every stage in this algorithm.
To make this possible, a blocking matrix of fixed dimension is employed. In this manner, the lengths of the registers needed to store the blocking matrices and vectors can be kept the same at every stage, a fact that is very desirable for either a hardware or a software realization. To obtain this algorithm, let the matrix $[\mathbf{x}^{(1)}[i], \mathbf{x}^{(2)}[i], \ldots, \mathbf{x}^{(L)}[i]]$ denote the initial $L$ approximately independent snapshots of the observation vectors. The estimate of the cross-correlation vector $\mathbf{r}_{x_j \delta_j}[i]$ is computed as a sample average over these snapshots, and the estimated variance of $\delta_j[i]$ is computed in the same way. The variance $\xi_j[i]$ of the error $\varepsilon_j[i]$ can then be obtained from a difference equation. Using the above results, a simplified version of Algorithm 1 is given in Algorithm 2. This new structure no longer requires the calculation of a blocking matrix, and the computational burden is reduced significantly.
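A batch-mode, blocking-matrix-free recursion of this kind can be sketched as follows. The code is an illustrative correlation-subtraction form of a multistage Wiener filter and not a transcription of Algorithm 2: it assumes that the blocking operation at each stage is the projection $\mathbf{x} - \mathbf{h}\,(\mathbf{h}^{\dagger}\mathbf{x})$ with a unit-norm direction $\mathbf{h}$, that correlations are replaced by sums over the $L$ snapshots, and that the scalar weights are least-squares fits. All names are hypothetical.

```python
def inner(a, b):
    """Return a^dagger b for two equal-length complex vectors."""
    return sum(ak.conjugate() * bk for ak, bk in zip(a, b))


def energy(seq):
    """Sum of squared magnitudes of a sequence of scalars."""
    return sum(abs(v) ** 2 for v in seq)


def mwf_error(snapshots, u, num_stages):
    """Error sequence of an M-stage MWF over L snapshots (batch mode).

    snapshots: list of L observation vectors; u: steering vector.
    With num_stages = 0 the output is the plain correlator output.
    """
    norm_u = energy(u) ** 0.5
    h = [uk / norm_u for uk in u]            # first direction: u / ||u||
    x = [list(s) for s in snapshots]
    d_list = []
    # Forward recursion: one desired sequence per stage.
    for _ in range(num_stages + 1):
        d = [inner(h, xl) for xl in x]
        # Remove the component along h (no explicit blocking matrix).
        x = [[xk - hk * dl for xk, hk in zip(xl, h)] for xl, dl in zip(x, d)]
        d_list.append(d)
        # Next direction: normalized sample cross-correlation of x and d.
        r = [sum(xl[k] * dl.conjugate() for xl, dl in zip(x, d))
             for k in range(len(h))]
        norm_r = energy(r) ** 0.5
        if norm_r < 1e-12:
            break                            # nothing left to cancel
        h = [rk / norm_r for rk in r]
    # Backward recursion: nested scalar least-squares (Wiener) weights.
    eps = d_list[-1]
    for d in reversed(d_list[:-1]):
        e = energy(eps)
        w = sum(dl * el.conjugate() for dl, el in zip(d, eps)) / e if e > 1e-12 else 0.0
        eps = [dl - w * el for dl, el in zip(d, eps)]
    return eps


# Toy data: L = 3 snapshots of dimension 2, hypothetical steering vector.
snaps = [[1 + 0j, 2 + 0j], [1j, 1 + 0j], [2 + 0j, -1 + 0j]]
u = [1 + 0j, 0j]
print(energy(mwf_error(snaps, u, 0)), energy(mwf_error(snaps, u, 1)))
```

Each stage needs only vector inner products and sums over the snapshots, so no covariance matrix is formed or inverted. Because every backward step is a scalar least-squares fit, adding a stage cannot increase the output error energy on the training block.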

NUMERICAL RESULTS
In this section, simulations are conducted to demonstrate the performance of the proposed code-timing detector for asynchronous space-time joint DS-CDMA signals. Here an asynchronous 6-user ($K = 6$) BPSK DS-CDMA system is considered. The spreading sequence of each user is a Gold sequence of length $N = 31$. The simulated detector employs a uniformly spaced linear-array antenna with multiple elements at half-wavelength ($\lambda/2$) spacing. The user signals are assumed to have different directions of arrival (DOAs) uniformly distributed in $(-\pi/2, \pi/2)$. The performance of the asynchronous DS-CDMA detector equipped with a single antenna is also obtained for the purpose of comparison. The power ratios between each of the five interfering users and the desired user are randomly chosen from the log-normal distribution with a mean 6 dB larger than that of the desired signal and a standard deviation of 6 dB. This power ratio is denoted by a quantity called the near-far ratio (NFR), defined by NFR $= g_l^2 / g_1^2 = 10^{\Gamma_l/10}$, $\Gamma_l \sim N(4, 16)$. Here $N(\cdot, \cdot)$ represents the Gaussian distribution and the subscript "$l$" denotes user $l$ ($l \ne 1$). The relative transmission delays of the different users, denoted by $\bar{\tau}_l$ for $l = 2, 3, \ldots, K$, are the delays relative to user 1, that is, $\bar{\tau}_l = \tau_l - \tau_1$. For simplicity, $\bar{\tau}_l$ is assumed to be a multiple of $T_c$. All experimental curves are obtained by performing 1000 independent trials.
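The log-normal near-far ratios can be generated with a few lines of code. The sketch below converts a Gaussian value in dB to a linear power ratio through NFR $= 10^{\Gamma/10}$. The mean and standard deviation are left as arguments because they are a modeling choice; the call shown uses a mean of 4 dB and a standard deviation of 4 dB (variance 16), which is one reading of the $N(4, 16)$ notation. The function names and the seed are hypothetical.

```python
import random


def nfr_from_db(gamma_db):
    """Convert a power ratio in dB to a linear near-far ratio."""
    return 10.0 ** (gamma_db / 10.0)


def draw_nfrs(num_interferers, mean_db, std_db, seed=0):
    """Draw log-normal NFRs: Gaussian in dB, then converted to linear."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    return [nfr_from_db(rng.gauss(mean_db, std_db))
            for _ in range(num_interferers)]


# Five interferers (K - 1 = 5), assumed mean 4 dB and std 4 dB.
nfrs = draw_nfrs(5, mean_db=4.0, std_db=4.0)
print(len(nfrs))
```

A value of 0 dB maps to a ratio of 1 (equal received powers), and 10 dB maps to an interferer ten times stronger than the desired user.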

[Algorithm 2: simplified forward recursion without blocking matrices]
First, the acquisition performance of the proposed detector as a function of the signal-to-noise ratio (SNR, E b /N 0 ) is shown in Figure 4 for a J-element antenna array, data size L = 6JN, and NFR = 0 dB, under the assumption that the channel parameters of all users are known at the detector. Hence, the precise covariance matrix is assumed to be available at the detector. The simulations in Figure 4 provide an upper bound on the acquisition performance of the proposed DS-CDMA detector.
In Figure 5, the acquisition-error-rate performance of a rank-2 filter using $\mathbf{u}_d[i]$ in (20) (i.e., without the decision-feedback adaptation mechanism) is presented versus SNR for various numbers of antenna elements, with data size $L = 6JN$ and NFR = 3 dB. A better acquisition performance is achieved when a larger antenna array is employed. This is possible because MAI can be mitigated successfully by placing spatial nulls, formed by the $J$-element adaptive beamforming array, in the directions of the interferers. Moreover, a 2-element antenna detector not only performs competitively with detectors that use a larger antenna array ($J = 4$ and 6) but also achieves a substantial improvement in acquisition in comparison with a single antenna element ($J = 1$). Figure 6 shows the acquisition performance versus the number of stages $M$ of the MWF. The performance of the proposed detector improves as the size of the $J$-element antenna array increases. The full-rank performance is achieved at remarkably low ranks and is nearly independent of the number of signals. In Figure 7, the acquisition performance of a rank-4 filter versus the signal-to-noise ratio is presented for (a) different sizes of the $J$-element antenna array and (b) different amounts of $L$-sample support. A better acquisition performance is achieved when a larger antenna array or a larger number of training data samples is available. Again, a 2-element antenna array detector accomplishes a substantial improvement in acquisition in comparison with a single-element antenna. Moreover, a comparison of the results shown in Figures 5 and 7a makes evident that an additional performance gain in acquisition is achieved. This is due primarily to the incorporation of the decision-feedback adaptation mechanism.
In Figure 8a, the probability of correct acquisition of a 2-element antenna detector is presented as a function of the number of training data samples $L$ for SNR = 14 dB and NFR = 3 dB. The figure demonstrates that the filters of lower rank provide a fast adaptive convergence rate, while a much larger number of training data samples is required for a full-rank filter. Thus, the adaptive reduced-dimension multistage filters converge substantially faster than an adaptive full-rank filter. In Figure 8b, the probability of correct acquisition of a 2-element array receiver is presented as a function of the number of training data samples $L$ for fixed NFR values of 0 dB, 3 dB, 6 dB, and 9 dB. No significant degradation is observed in this figure for the rank-4 case over this wide range of NFR values, 0 dB to 9 dB. This shows that the proposed receiver still performs well under conditions of poor power control. Hence, a stringent power control mechanism is not required for the proposed detector. Figure 9a compares the convergence and acquisition capabilities of the training-based LMS and RLS algorithms with those of the proposed multistage algorithm. Besides having low complexity, the proposed algorithm also achieves a better acquisition performance and has a faster convergence rate, especially with limited training support. Results in Figure 9b show the probability of correct acquisition for the subspace-based MUSIC algorithm and the proposed multistage filter with an extremely low rank and $L = 6JN$. The MUSIC algorithm outperforms the proposed multistage detector, especially at smaller values of SNR. This is due principally to the dependence of the proposed reduced-rank multistage detector on the accuracy of the steering vector.
In Figure 10a, the probability of correct acquisition of a 2-element antenna detector is presented as a function of the signal-to-noise ratio for L = 6JN and NFR = 10 dB. In this simulation, each interfering user is assumed to have a received power 10 dB larger than that of the desired user. Results demonstrate that the performance of the proposed multistage detector improves as the rank M of the proposed MWF increases. In this figure, a rank-5 MWF achieves almost the same acquisition performance as the full-rank Wiener filter. It is evident that near full-rank performance can be achieved by the proposed MWF at an extremely low rank. When the MAI becomes more severe, the proposed detector performs significantly better than the conventional detector. In Figure 10b, the probability of correct acquisition of the proposed detector is presented as a function of the signal-to-noise ratio for M = 3 and NFR = 0 dB, 3 dB, and 10 dB with either one or two antenna elements. These results show that the proposed detector performs well at a remarkably low rank under conditions of poor power control, that is, severe MAI, when a 2-element antenna array is employed. Moreover, the proposed multistage detector equipped with 2 antenna elements significantly outperforms both the conventional detector and the multistage detector with only a single antenna element.
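As a quick reference for the NFR values quoted in these simulations, the following minimal sketch converts an NFR in dB into the corresponding interferer-to-desired power and amplitude ratios. It assumes, as the text states, that NFR is the ratio of each interferer's received power to that of the desired user; the function name `nfr_to_ratios` is hypothetical and not part of the paper.

```python
import math

def nfr_to_ratios(nfr_db):
    """Convert a near-far ratio in dB to (power ratio, amplitude ratio).

    Assumption: NFR = 10*log10(P_interferer / P_desired), matching the
    description of the simulations in the text.
    """
    power_ratio = 10.0 ** (nfr_db / 10.0)
    amplitude_ratio = math.sqrt(power_ratio)
    return power_ratio, amplitude_ratio

# NFR values used in the simulations discussed above
for nfr_db in (0.0, 3.0, 10.0):
    p, a = nfr_to_ratios(nfr_db)
    print(f"NFR = {nfr_db:4.1f} dB: power x{p:.2f}, amplitude x{a:.2f}")
```

Under this definition, NFR = 10 dB means each interferer arrives with ten times the power of the desired user, while NFR = 0 dB corresponds to equal received powers.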
In Figure 11, the acquisition-error-rate performance of a rank-2 filter for various numbers of antenna elements is presented in terms of the number of users K for data size L = 6JN and SNR = 14 dB. In this simulation, each interfering user is assumed to have a received power equal to that of the desired user, that is, NFR = 0 dB. Clearly, a better acquisition performance is achieved when a larger antenna array is employed. The results also demonstrate that, at a fixed performance requirement, a significant increase in system capacity can be achieved when a larger antenna array is employed. Figure 12 shows the simulation results of the acquisition performance versus the signal-to-noise ratio for different timing estimation techniques. In [35], the timing acquisition problem with an antenna array is formulated as a binary-hypothesis test on the assumption that the noise process is spatially correlated and temporally uncorrelated. Under the binary hypotheses, an adaptive generalized likelihood ratio test (GLRT) is developed to acquire the code timing of the user of interest in a fading environment. The GLRT of the desired spreading code vector denoted by Ω(Z[i]), originally due to Neyman and Pearson (NP) [ The test statistic Ω is used to test, at each timing phase within the time period NT_c, for the existence of the desired signal. The timing phase at which code synchronization is most likely to occur is determined by finding the maximum over the filter bank of tests in a symbol interval. The figure demonstrates that the proposed multistage array detector with M = 4 significantly outperforms both the conventional detector and the spatial-only NP-type array detector proposed in [35]. This is because space-time processing can cancel more interferers than time-only or space-only processing.
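The peak-picking decision described above (evaluate a test statistic at every timing phase in a symbol interval and choose the maximum) can be sketched as follows. This is only an illustration under strong assumptions: a single antenna, chip-rate sampling, and a plain squared-correlation statistic standing in for the GLRT statistic of [35], which is not reproduced here. The function name `acquire_timing` and the length-7 code are hypothetical.

```python
import numpy as np

def acquire_timing(received, code):
    """Pick the timing phase that maximizes a per-phase test statistic.

    Assumption: the statistic is the squared correlation between one
    symbol interval of received chips and a cyclically shifted copy of
    the spreading code. It is a placeholder, not the GLRT of [35].
    """
    n = len(code)
    window = np.asarray(received)[:n]
    # One statistic per candidate timing phase (a bank of n tests)
    stats = np.array([abs(np.vdot(np.roll(code, p), window)) ** 2
                      for p in range(n)])
    return int(np.argmax(stats)), stats

# Toy usage: a length-7 code delayed by 3 chips, plus mild noise
code = np.array([1, 1, 1, -1, -1, 1, -1], dtype=float)
rng = np.random.default_rng(0)
received = np.roll(code, 3) + 0.1 * rng.standard_normal(7)
phase, stats = acquire_timing(received, code)
print("estimated timing phase:", phase)
```

In the detectors compared in Figure 12, the per-phase statistic would be replaced by the respective detection criterion, while the maximization over phases stays the same.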
However, the disadvantage of space-time processing is that the associated computation and processing speed requirements are substantially greater than those of time-only or space-only processing. To decrease the required computation and processing speed, a reduced-complexity implementation of the space-time filter based on the MWF is presented in this paper. The essential property of the MWF is its orthogonal decomposition structure, which extracts the most correlated information necessary to estimate the filter-weight vector at the initial stages of the filter. This technique eliminates the need for a full decomposition of the space-time correlation matrix. These features make the proposed space-time adaptive truncated MWF well suited to the high voice and data rates required in wireless communications.
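To illustrate the orthogonal decomposition described above, the following is a minimal sketch of a truncated multistage Wiener filter in its correlation-subtraction form, which needs neither a matrix inversion nor an eigendecomposition. It is a generic textbook-style illustration, not the paper's detector: it assumes a reference sequence `d` is available, whereas the proposed blind detector relies on the desired signature vector instead. The function name `mwf_residual` and all test data are hypothetical.

```python
import numpy as np

def mwf_residual(X, d, rank):
    """Rank-`rank` MWF error sequence e0 = d - d_hat.

    X: (n, L) array of observation snapshots (columns).
    d: (L,) reference sequence (an assumption of this sketch).
    """
    x = np.asarray(X, dtype=complex).copy()
    d_stages = [np.asarray(d, dtype=complex)]
    # Forward recursion: each stage extracts the direction of the data
    # most correlated with the previous stage's scalar output.
    for _ in range(rank):
        r = x @ d_stages[-1].conj()        # sample cross-correlation
        norm = np.linalg.norm(r)
        if norm < 1e-12:                   # nothing left to extract
            break
        h = r / norm
        d_next = h.conj() @ x              # d_i[k] = h^H x_{i-1}[k]
        x = x - np.outer(h, d_next)        # remove that direction
        d_stages.append(d_next)
    # Backward recursion: nested scalar Wiener filters
    e = d_stages[-1]
    for d_prev in reversed(d_stages[:-1]):
        c = np.vdot(e, d_prev) / np.vdot(e, e)
        e = d_prev - c * e
    return e

# Toy usage: residual energy versus rank for random data
rng = np.random.default_rng(1)
n, L = 6, 200
X = rng.standard_normal((n, L)) + 1j * rng.standard_normal((n, L))
w = rng.standard_normal(n) + 1j * rng.standard_normal(n)
d = w.conj() @ X + 0.5 * (rng.standard_normal(L) + 1j * rng.standard_normal(L))
for M in range(1, n + 1):
    print(M, np.sum(np.abs(mwf_residual(X, d, M)) ** 2))
```

Each stage costs only vector operations on the data block, which is why the work grows linearly with the rank. In this sketch the residual energy does not increase with the rank and, at full rank, matches the least-squares solution.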

CONCLUSIONS
A low-complexity version of the proposed asynchronous DS-CDMA time-delay detector is developed that utilizes the concept of the multistage reduced-rank Wiener filter. This structure results in a substantial reduction of the computational burden and a rapid adaptive convergence of the filter coefficients without any need for a matrix inversion. Also, owing to its near-far resistance, the proposed detector does not require the stringent power control mechanism that is needed by the conventional detector. Only knowledge of the desired signature vector is needed. A separate training period of signal-free observations is not necessary. These are the same requirements as those of the conventional DS-CDMA detector that uses a standard matched filter. Moreover, the proposed acquisition detector can be expected to combine with most previous multiuser DS-CDMA algorithms that require precise knowledge of the propagation delays of all users. The proposed DS-CDMA timing detector achieves superior performance with a low filter rank and a small number of data samples. This makes it possible to design a lower-complexity detector without a large loss in performance in comparison with the full-rank system. Furthermore, simulation results show that the proposed array detector, at a considerably reduced rank, substantially outperforms the conventional detector and the spatial-only NP-type code-timing array detector in all simulations, and achieves a substantial improvement in propagation-delay acquisition when a larger antenna array is employed. The proposed multistage algorithm also shows better convergence and tracking capability than DS-CDMA systems that use the training-based adaptive LMS or RLS algorithm. These facts allow the novel space-time adaptive truncated MWF to meet the requirements of a lower-complexity, small-size, and light-weight detector that mobile users demand today.