A Robust Parametric Technique for Multipath Channel Estimation in the Uplink of a DS-CDMA System

The problem of estimating the multipath channel parameters of a new user entering the uplink of an asynchronous direct sequence-code division multiple access (DS-CDMA) system is addressed. The problem is described via a least squares (LS) cost function with a rich structure. This cost function, which is nonlinear with respect to the time delays and linear with respect to the gains of the multipath channel, is proved to be approximately decoupled in terms of the path delays. Due to this structure, an iterative procedure of 1D searches is adequate for time delays estimation. The resulting method is computationally efficient, does not require any specific pilot signal, and performs well for a small number of training symbols. Simulation results show that the proposed technique offers a better estimation accuracy compared to existing related methods, and is robust to multiple access interference.


INTRODUCTION
Direct sequence-code division multiple access (DS-CDMA) is a widely accepted multiple access technique already in use in several real-life systems, such as the Universal Mobile Telecommunications System (UMTS). Among its properties (low power, high capacity, and resistance to multipath), the last is perhaps the most favourable. However, in many cases, in order to perform equalization, diversity combining, or multiuser detection at the receiver of a DS-CDMA system, knowledge of the multipath channel impulse response (CIR) is necessary. Thus, an efficient and accurate estimation of the CIR is highly desirable, in order to mitigate interference and achieve reliable data detection.
The wireless channel can be characterized either by the conventional tapped-delay line (TDL) model or by a parametric model where the CIR is expressed in terms of time delays and gains of dominant paths. As the chip rate increases, the channel experienced by DS-CDMA systems becomes sparse, making the parametric model more effective, since fewer parameters are adequate for accurate channel representation. Moreover the parametric model is more suitable for receiver structures such as RAKE [1], and for positioning purposes.
The channel estimation task becomes more difficult at the uplink due to the multiple access nature of DS-CDMA systems. In the presence of multipath, it is difficult to time synchronize mobile transmitters so that their signals arrive simultaneously at the base station (BS). Thus, the uplink of DS-CDMA systems is usually asynchronous, the orthogonality of signature sequences is violated, and multiple access interference (MAI) seriously affects channel estimation accuracy.
To combat MAI and multipath fading, joint multiuser detection and parametric channel estimation approaches have been proposed in [2][3][4]. The increased complexity of these algorithms renders them impractical in systems accommodating a large number of users in rich multipath environments. Thus, the channel estimation problem is usually treated separately from the detection one. Blind subspace-based channel estimation methods have been developed, which estimate either the parameters of all active users jointly [5][6][7][8][9], or the parameters of a single user [10]. The above methods require long observation intervals, which limit their tracking capability in rapidly varying channels. Maximum likelihood (ML) optimization is another approach usually adopted for multipath channel parameter estimation of a single user. ML-based methods make use of training signals and model MAI as colored noise. In [11,12] interfering users are considered unknown at the BS, whereas in [13][14][15] channel estimates from MAI users are exploited during the estimation of a new user, but specific PN sequences are required. The only method that uses relatively few training symbols, exploits available information concerning other active users, and does not require specific signals to be employed is the one proposed in [16]. The method in [16] follows an ML-based approach and employs a deflation scheme originating from the SAGE algorithm [17]. Specifically, the optimization is performed with respect to a single path, and after this path has been estimated, its contribution is subtracted from the received data. The deflation scheme applies similarly to the rest of the paths.
In this paper we propose a new method for estimating the multipath delays and gains in the uplink of a DS-CDMA system. First, we show that the estimation problem can be described via a nonlinear least squares (LS) cost function, which is separable with respect to the unknown parameter sets, that is, time delays and gains. Then, we prove that the time delays' cost function is approximately decoupled, which allows the development of a computationally efficient linear search method for the estimation of the unknown time delays. Finally, the gain parameters are estimated by solving a low-order linear LS problem. The new method constitutes an interesting alternative interpretation of the channel parameter estimation problem. Moreover, the problem is formulated in a novel way allowing for easier analysis and manipulations. Simulation results show that the proposed method exhibits a lower mean squared estimation error than the method of [16], at the expense of a negligible increase in computational complexity.
The outline of this paper is as follows. In Section 2, the signal model is defined and the estimation problem is formulated. In Section 3, the LS cost function is derived and the proposed algorithm is developed. Simulation results are presented in Section 4, while some conclusions are drawn in Section 5.

PROBLEM FORMULATION
Let us consider the reverse link of a DS-CDMA system accommodating $K$ simultaneously active users. If $T$ is the symbol period, $\{b_k(i)\}$ the transmitted symbols, and $p_k(t)$ the spreading waveform of the $k$th user, then the baseband signal transmitted by this user can be expressed as
$$s_k(t) = \sum_{i} b_k(i)\, p_k(t - iT). \tag{1}$$
Let $N$ be the spreading factor, $T_c = T/N$ the chip period, $\{c_k(n),\ n = 0, \ldots, N-1\}$ the chip sequence, and $g(t)$ the chip pulse. Then, the spreading waveform $p_k(t)$ is given by
$$p_k(t) = \sum_{n=0}^{N-1} c_k(n)\, g(t - nT_c). \tag{2}$$
The signal $s_k(t)$ of each user is transmitted over a specular multipath channel with $P$ discrete paths having impulse response
$$h_k(t) = \sum_{p=1}^{P} a_{k,p}\, \delta(t - \tau_{k,p}), \tag{3}$$
where $a_{k,p}$ and $\tau_{k,p}$ are the gain and the delay of the $p$th path, respectively, and $\delta(\cdot)$ is the Dirac function. The signal received by the BS is the superposition of the signals from all users, that is,
$$r(t) = \sum_{k=1}^{K} \sum_{p=1}^{P} a_{k,p}\, s_k(t - \tau_{k,p}) + w(t), \tag{4}$$
contaminated by additive, white, Gaussian noise $w(t)$ of power spectral density $N_0$. The received signal is oversampled by a factor of $Q$ samples per chip period, while a raised cosine function is used as the chip pulse. The delay spread of the physical channel $h_k(t)$, usually encountered in the applications of interest, is restricted to a few chip periods [18]. Also, taking into account the asynchronous access of the $k$th user to the channel, the first delay $\tau_{k,1}$ could appear anywhere in the interval $[0, NT_c)$ of the BS timing. Thus, a time support of two symbols can be adequate for the total CIR, which is the convolution of the physical channel, $h_k(t)$, with the chip sequence $\{c_k(n)\}$.
Our goal is the estimation of the physical channel parameters for one user assuming that the parameters of all other $(K-1)$ users have already been estimated. To this end and using the formulation presented above, the samples collected at the BS receiver over a period of $M$ symbols can be written in vector form as
$$r = \sum_{k=1}^{K} S_k(\tau_k)\, a_k + w, \tag{5}$$
where $a_k$, $\tau_k$ are the vectors of gains and delays of user $k$, respectively, $w$ is the $MQN \times 1$ noise vector, and $S_k(\tau_k)$ is given by (6). In (6), $B_k$ is a $2 \times M$ data matrix with Hankel structure, $C_k$ is a $2N \times 2N$ convolution matrix whose first row contains the chip sequence, and $G(\tau_k)$ is a $2QN \times P$ matrix whose columns contain the oversampled delayed chip pulses denoted in vector form as $g(\tau_{k,p})$, $p = 1, \ldots, P$. Note that each column of $G(\tau_k)$ is a function of a single delay parameter only. Symbol $\otimes$ stands for the Kronecker product and $I_Q$ is the $Q \times Q$ identity matrix.
Considering that a new user (called hereafter the desired user) is entering the system, (5) can be rewritten as
$$r = S(\tau)\, a + \eta, \tag{7}$$
where the user index has been dropped for simplicity and $\eta$ comprises the MAI from previously estimated users and thermal noise. We assume that the spreading sequences of all the users are known at the BS, while the desired user is in training mode and has been synchronized to the BS. Although the channel parameters of the interfering users have already been estimated, their symbol sequences have not been detected yet. Hence, MAI can be treated as a random process [16]. Specifically, the MAI vector $\eta$ can be modelled as a zero mean Gaussian vector with covariance matrix $R_\eta = E[\eta\eta^H]$. Since the channel parameters and the signature sequences of the interfering users are deterministic, the expectation operator is applied over the transmitted symbols and thermal noise.
Having defined the problem, we proceed with the definition of the cost function appearing in the estimation problem and the derivation of the new algorithm.

The new cost function
As can be seen from (7), the data available for the estimation of channel parameters are contaminated by colored noise $\eta$ with covariance matrix $R_\eta$ (the estimation of $R_\eta$ is further discussed in the appendix). Hence, a first step for the derivation of the new cost function would be the prewhitening of additive noise as
$$R_\eta^{-1/2}\, r = R_\eta^{-1/2} S(\tau)\, a + R_\eta^{-1/2}\, \eta, \tag{8}$$
where $R_\eta^{-1/2}$ is a square root factor of $R_\eta^{-1}$. Now, the required channel parameters may be estimated by minimizing the following least squares (LS) cost function with respect to $\tau$ and $a$:
$$\{\hat{\tau}, \hat{a}\} = \arg\min_{\tau,\, a} \left\| R_\eta^{-1/2} r - R_\eta^{-1/2} S(\tau)\, a \right\|^2. \tag{9}$$
The cost function in (9) is linear with respect to the path gains and nonlinear with respect to the delays. Since the two sets of parameters are independent, the optimization problem can be split up with respect to each set [19], that is,
$$\hat{\tau} = \arg\max_{\tau}\; r^H R_\eta^{-1} S(\tau) \left( S^H(\tau) R_\eta^{-1} S(\tau) \right)^{-1} S^H(\tau) R_\eta^{-1} r, \tag{10}$$
$$\hat{a} = \left( R_\eta^{-1/2} S(\hat{\tau}) \right)^{\dagger} R_\eta^{-1/2}\, r, \tag{11}$$
where symbol $\dagger$ denotes the pseudoinverse of a matrix. It is apparent that the most difficult part of the above optimization procedure is the maximization in (10). After the optimum delay parameters have been estimated, path gain parameters can be easily computed through (11). The nonlinear problem (10) can be treated either by performing a multidimensional search over the parameter space of $\tau$, or by applying an iterative Newton-type method. In the former case, the computational cost is prohibitive, whereas in the latter, the method can be trapped in a local maximum away from the global solution.
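The separation of the gains from the delays can be made explicit with a short derivation. The shorthand $\Phi(\tau) = R_\eta^{-1/2} S(\tau)$ and $x = R_\eta^{-1/2} r$ is introduced here only for compactness and is an assumption of this sketch, as is the full column rank of $\Phi(\tau)$.

```latex
\begin{align*}
J(\tau, a) &= \left\| x - \Phi(\tau)\, a \right\|^2 , \\
\hat{a}(\tau) &= \Phi^{\dagger}(\tau)\, x
              = \left( \Phi^H(\tau) \Phi(\tau) \right)^{-1} \Phi^H(\tau)\, x , \\
J\big(\tau, \hat{a}(\tau)\big) &= \| x \|^2
   - x^H \Phi(\tau) \left( \Phi^H(\tau) \Phi(\tau) \right)^{-1} \Phi^H(\tau)\, x .
\end{align*}
```

Since $\|x\|^2$ does not depend on $\tau$, minimizing the LS cost over the delays is equivalent to maximizing the second term, which is the energy of the projection of the whitened data onto the column space of $\Phi(\tau)$.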
In the following, we show that the estimation of each delay parameter $\tau_p$, $p = 1, \ldots, P$, can be performed separately, leading to a much more efficient estimation algorithm. We begin by rewriting the cost function in (10) as
$$F(\tau) = y^H(\tau)\, D(\tau)\, y(\tau), \tag{12}$$
where
$$y(\tau) = S^H(\tau) R_\eta^{-1} r, \qquad D(\tau) = \left( S^H(\tau) R_\eta^{-1} S(\tau) \right)^{-1}.$$
It is readily seen from (6) that each column of $S(\tau)$ depends on a single delay parameter, that is, $S(\tau) = [s(\tau_1) \cdots s(\tau_P)]$. The same property therefore holds for the elements of vector $y(\tau)$ as well. Based on this observation, we deduce that the cost function $F(\tau)$ would be decoupled with respect to the delay parameters if matrix $D(\tau)$ were diagonal and each element $[D(\tau)]_{i,i}$ were associated only with the corresponding delay parameter $\tau_i$. Even though matrix $D(\tau)$ is not exactly diagonal, we show that it is strongly diagonally dominant, leading to an approximate decoupling of the cost function (10) with respect to the delay parameters.
To this end, we invoke a proposition proved in [20,21].

Proposition 1.
Let $A \in \mathbb{C}^{n \times n}$ be a matrix and let $r_A$ be the mean ratio of its off-diagonal and diagonal elements. If this matrix is pre/post multiplied by a unitary matrix $Q \in \mathbb{C}^{n \times m}$ and $m \ll n$, then the resulting matrix $B = Q^H A Q$ (and its inverse) have smaller mean ratios, upper bounded by $r_B \le (m/n)\, r_A$.
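Consequently, if matrix $A$ has diagonal elements of much higher amplitude than the off-diagonal ones, and $m \ll n$, then matrix $B$ and its inverse are strongly diagonally dominant. To apply the aforementioned proposition to our problem, for example, to matrix $D(\tau)$ in (12), three conditions should be satisfied: (i) the number of paths is much smaller than the data length, $P \ll MQN$; (ii) matrix $R_\eta^{-1}$ has diagonal elements of much higher amplitude than its off-diagonal ones; (iii) matrix $S(\tau)$ has a near-to-unitary structure.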
The second condition is proved in the appendix, where we show that the amplitude of the diagonal elements of $R_\eta^{-1}$ is much higher than the amplitude of the off-diagonal ones. Concerning the last condition, from (6), after some algebra, we arrive at (14).

EURASIP Journal on Wireless Communications and Networking
The term $BB^H$ is the sample covariance matrix of the information symbols, and can be approximated asymptotically by the identity matrix $I_2$, so (14) reduces to (15). Moreover, the term $CC^H$ approximates the $2N \times 2N$ covariance matrix of a PN code sequence. Given that PN sequences have favourable autocorrelation properties [1], this term can also be approximated by an identity matrix $I_{2N}$. Thus, (15) is simplified as follows:
$$S^H(\tau)\, S(\tau) \approx G^H(\tau)\, G(\tau). \tag{16}$$
Recall that the columns of $G(\tau)$ contain delayed versions of a raised cosine pulse shaping filter. The inner product of two columns of $G(\tau)$, that is, $g(\tau_i)$ and $g(\tau_j)$, approximates the value of the autocorrelation function of the raised cosine pulse for a lag equal to $\Delta\tau = |\tau_i - \tau_j|$ [21]. (Similar analysis can be carried out for other pulse shaping functions as well.) As shown in [21], the raised cosine autocorrelation function very closely resembles the raised cosine function itself. As a result, if $\Delta\tau = 0$, the inner product takes its maximum value, whereas it decays rapidly as $\Delta\tau$ increases. Even for $\Delta\tau$ as small as a chip period, the inner product is one order of magnitude smaller than its maximum. Accordingly, $S(\tau)$ has a structure very similar to a unitary matrix and the proposition can be applied to our problem. Thus, the cost function in (10) can be considered approximately decoupled with respect to the delay parameters. Clearly, for delay spacings much smaller than a chip period, the near-to-unitary structure of $G(\tau)$ is violated. Despite this fact, by properly extending the above proposition, it can be shown [21] that delay decoupling may still be attained. This is also verified by simulation results in Section 4.
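The decay of the inner product between two delayed chip pulses can be checked numerically. The following is a minimal sketch, assuming a raised cosine pulse with roll-off 0.22 sampled twice per chip (the values quoted for the simulations) and a finite 32-chip observation window; the function names and the window length are illustrative choices, not taken from the paper.

```python
import math

def raised_cosine(t, beta=0.22):
    """Raised-cosine chip pulse; t is measured in chip periods."""
    if abs(t) < 1e-12:
        return 1.0
    sinc = math.sin(math.pi * t) / (math.pi * t)
    denom = 1.0 - (2.0 * beta * t) ** 2
    if abs(denom) < 1e-9:          # removable singularity at |t| = 1/(2*beta)
        return (math.pi / 4.0) * sinc
    return sinc * math.cos(math.pi * beta * t) / denom

def pulse_vector(tau, q=2, span=32):
    """Samples of g(t - tau), q per chip, over a window of `span` chips."""
    return [raised_cosine(k / q - span / 2 - tau) for k in range(q * span)]

def normalized_inner_product(dtau, q=2):
    """Inner product of two pulse vectors whose delays differ by dtau chips."""
    g0 = pulse_vector(0.0, q)
    g1 = pulse_vector(dtau, q)
    num = sum(a * b for a, b in zip(g0, g1))
    den = math.sqrt(sum(a * a for a in g0) * sum(b * b for b in g1))
    return num / den

if __name__ == "__main__":
    for dtau in (0.0, 0.25, 0.5, 1.0, 2.0):
        print(f"lag {dtau:4.2f} chips: {normalized_inner_product(dtau):+.3f}")
```

Printing the values for a few lags shows how quickly the off-diagonal entries of $G^H(\tau)G(\tau)$ shrink relative to the diagonal as the path spacing grows.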

Decomposed form of the cost function
Next we consider a modification of the cost function (10) in order to derive an efficient estimation algorithm. To this end, matrix $S(\tau)$ in (7) is partitioned as
$$S(\tau) = [\, S_{(P-1)} \;\; s_P \,], \tag{17}$$
where $S_{(P-1)}$ corresponds to the first $(P-1)$ columns of $S(\tau)$ and $s_P \equiv s(\tau_P)$ is its last column. We also define matrix $\Phi(\tau)$ in (18), which is partitioned similarly to $S(\tau)$. Hence, matrix $D(\tau)$ in (14) may be partitioned as in (19), and, using the matrix inversion lemma for partitioned matrices, $D(\tau)$ takes the form (20). Then, by expressing vector $y(\tau)$ in (12) in the partitioned form (21), and after some algebra, the cost function can be written as
$$F(\tau) = F(\boldsymbol{\tau}_{P-1}) + F(\tau_P \mid \boldsymbol{\tau}_{P-1}), \tag{22}$$
where $\boldsymbol{\tau}_{P-1} = [\tau_1, \ldots, \tau_{P-1}]$ and the two terms are given by (23) and (24), respectively. Notice that the cost function consists of two nonnegative terms. The first term, $F(\boldsymbol{\tau}_{P-1})$, depends only on the first $(P-1)$ delays, and it is actually the cost function (12) of order $(P-1)$. The $P$th path delay appears only in the second term. Provided that the cost function (12) is almost decoupled with respect to the delays, each path can be estimated separately. Let us now assume that $(P-1)$ path delays have already been acquired and their estimates $\hat{\boldsymbol{\tau}}_{P-1}$ are accurate enough. Then according to (22)-(24), the estimation of the last delay $\tau_P$ is reduced to the maximization of the second term, while keeping the rest of the delays fixed, that is, $F(\tau_P \mid \hat{\boldsymbol{\tau}}_{P-1})$. Some interesting comments on the cost function should be made here.
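The two-term split can be sketched with the standard order-recursive property of orthogonal projections. The notation below is an assumption of this sketch rather than a quotation of the paper's equations: $\Phi = R_\eta^{-1/2} S(\tau) = [\,\Phi_{(P-1)} \;\; \phi_P\,]$ is the whitened signal matrix, $x = R_\eta^{-1/2} r$ the whitened data, and $P^{\perp}$ the projector onto the orthogonal complement of the columns of $\Phi_{(P-1)}$.

```latex
\begin{align*}
P^{\perp} &= I - \Phi_{(P-1)} \left( \Phi_{(P-1)}^H \Phi_{(P-1)} \right)^{-1} \Phi_{(P-1)}^H , \\
F(\tau) &= x^H \Phi \left( \Phi^H \Phi \right)^{-1} \Phi^H x \\
        &= \underbrace{x^H \Phi_{(P-1)} \left( \Phi_{(P-1)}^H \Phi_{(P-1)} \right)^{-1}
           \Phi_{(P-1)}^H x}_{\text{depends on } \tau_1, \ldots, \tau_{P-1} \text{ only}}
         \;+\;
           \underbrace{\frac{\left| \phi_P^H P^{\perp} x \right|^2}
                            {\phi_P^H P^{\perp} \phi_P}}_{\text{involves } \tau_P} .
\end{align*}
```

Both terms are nonnegative, and the second one is the energy that the new column contributes after the part already explained by the first $(P-1)$ columns has been projected out, which is how the deflation ends up inside the cost function itself.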
(1) The form of the cost function in (22) holds equivalently for any permutation of the columns of $S(\tau)$. This implies that if any $(P-1)$ delays have been estimated, the remaining delay can be estimated through (24).
(2) The term $F(\boldsymbol{\tau}_{P-1})$ in (23) can be further decomposed through the same procedure we applied to $F(\tau)$. It can be shown that $F(\tau)$ can be finally decomposed in $P$ terms as
$$F(\tau) = \sum_{i=1}^{P} F(\tau_i \mid \boldsymbol{\tau}_{i-1}), \tag{25}$$
where the first term is understood as $F(\tau_1)$. Provided that $F(\tau)$ is approximately decoupled with respect to the delays, it is easily shown that the contribution of the $i$th delay to the cost function lies mainly in the $i$th term of (25). Thus, in case only $(i-1)$ out of $P$ path delays have been estimated, the estimation of the $i$th delay can be performed by using the corresponding $i$th term of (25).

Algorithm 1: Summary of the decoupled parametric estimation (DPE) algorithm.
(b) Else if $i = P$, then a cycle has been completed. If one more estimation cycle is needed, go to step 3.
(7) Obtain the path gain vector $\hat{a}$ by substituting $\hat{\tau}$ in (11).

The new algorithm
Having analysed the cost function, we present a new estimation algorithm for the multipath parameters of the desired user. First, we assume that the number of dominant paths P is already known: either specified by the system, or detected by an information theoretic criterion. The channel parameters and signature sequences of MAI users are also assumed known to the BS receiver, and hence the covariance matrix R η can be constructed. The proposed decoupled parametric estimation (called hereafter DPE) algorithm is organized in steps and cycles. At each step, one delay parameter is estimated using the information of already acquired delays. A cycle consists of P steps and at the end of a cycle all delays have been estimated. During the first cycle and while searching for τ i , only (i − 1) delay estimates are available, and thus the optimization involves only the ith term of (25). In the next cycles, the estimates of the other (P − 1) delays obtained in the current and the previous cycles are exploited for the estimation of a single delay, and then (24) is used for maximization.
During each step, the estimation of one delay is performed by a line search: the $i$th term of (25), or (24), is evaluated over the points of a grid, and the point attaining the maximum value is taken as the corresponding delay estimate. Since the desired user has been synchronized with the BS and the delay spread of the physical channel is restricted to a number of chip periods, it is sufficient to scan the delay range $[0, NT_c/4)$ with a linear step size $\delta$. Simulation results show that two or three cycles are adequate for the method to converge. After all cycles have been completed, path gains are computed through (11). The DPE algorithm is summarized in Algorithm 1, where matrix $S(\hat{\tau}_J)$ is constructed in a way similar to $S(\tau)$ based on the already estimated path delays.
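The organisation into steps and cycles can be illustrated with a deliberately simplified sketch. It assumes white noise ($R_\eta = I$), real-valued gains, noise-free data, and columns $s(\tau)$ that are just delayed raised cosine pulses (no spreading or data matrices), so it is not the full DPE algorithm; all function names, the window, and the test delays are hypothetical. Each line search maximizes the energy of the candidate column after the already estimated columns have been projected out, which is the standard projection form of the conditional term.

```python
import math

def raised_cosine(t, beta=0.22):
    """Raised-cosine chip pulse; t is measured in chip periods."""
    if abs(t) < 1e-12:
        return 1.0
    sinc = math.sin(math.pi * t) / (math.pi * t)
    denom = 1.0 - (2.0 * beta * t) ** 2
    if abs(denom) < 1e-9:          # removable singularity at |t| = 1/(2*beta)
        return (math.pi / 4.0) * sinc
    return sinc * math.cos(math.pi * beta * t) / denom

def column(tau, n=80, q=2, start=-8.0):
    """Stand-in for one column s(tau): pulse delayed by tau chips, q samples/chip."""
    return [raised_cosine(start + k / q - tau) for k in range(n)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def remove_projection(v, basis):
    """Return v minus its projection onto the orthonormal vectors in basis."""
    for e in basis:
        c = dot(e, v)
        v = [vi - c * ei for vi, ei in zip(v, e)]
    return v

def orthonormalize(columns):
    """Gram-Schmidt basis for the span of the already estimated columns."""
    basis = []
    for col in columns:
        w = remove_projection(col, basis)
        norm = math.sqrt(dot(w, w))
        if norm > 1e-9:
            basis.append([wi / norm for wi in w])
    return basis

def line_search(x, fixed_delays, grid):
    """One step: grid search for a single delay, the other delays kept fixed."""
    basis = orthonormalize([column(t) for t in fixed_delays])
    x_perp = remove_projection(x, basis)
    best_tau, best_val = None, -1.0
    for tau in grid:
        s_perp = remove_projection(column(tau), basis)
        energy = dot(s_perp, s_perp)
        if energy < 1e-6:          # candidate coincides with a fixed delay
            continue
        val = dot(s_perp, x_perp) ** 2 / energy
        if val > best_val:
            best_tau, best_val = tau, val
    return best_tau

def dpe_delays(x, num_paths, grid, cycles=2):
    delays = []
    for _ in range(num_paths):     # first cycle: only earlier estimates are known
        delays.append(line_search(x, delays, grid))
    for _ in range(cycles - 1):    # later cycles: all other delays are fixed
        for i in range(num_paths):
            others = delays[:i] + delays[i + 1:]
            delays[i] = line_search(x, others, grid)
    return delays

true_delays, true_gains = [3.0, 5.5, 9.25], [1.0, 0.8, 0.5]
x = [sum(a * s for a, s in zip(true_gains, samples))
     for samples in zip(*(column(t) for t in true_delays))]
grid = [0.125 * k for k in range(128)]   # step of T_c/8 over 16 chips
estimates = sorted(dpe_delays(x, 3, grid))
print(estimates)
```

The gains would then follow from an ordinary LS fit on the columns at the estimated delays, as in (11); that step is omitted here to keep the sketch short.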
The value of the search step size $\delta$ affects the estimation accuracy of the maximization procedure. In any case, the estimates obtained through the line search over the grid are not optimum, although they lie close to the optimum. Obviously, as $\delta$ decreases, the estimation accuracy is improved, while the computational complexity is increased. A further refinement of the estimates can be achieved by running some Gauss-Newton iterations or an interpolation method.
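One possible interpolation refinement is a three-point parabolic fit around the best grid point. The text does not specify which interpolation method is meant, so the choice of a parabola and the function below are assumptions made purely for illustration.

```python
def parabolic_refine(tau0, step, f_minus, f_zero, f_plus):
    """Refine a grid maximum at tau0 using cost values at tau0 - step, tau0, tau0 + step.

    Fits a parabola through the three points and returns the location of its vertex.
    """
    curvature = f_minus - 2.0 * f_zero + f_plus
    if curvature >= 0.0:           # not a strict local maximum: keep the grid point
        return tau0
    return tau0 + 0.5 * step * (f_minus - f_plus) / curvature

if __name__ == "__main__":
    cost = lambda t: 5.0 - 2.0 * (t - 1.03) ** 2   # toy cost with its peak off the grid
    step = 0.125
    print(parabolic_refine(1.0, step, cost(1.0 - step), cost(1.0), cost(1.0 + step)))
```

For a cost that is exactly quadratic near its peak the vertex is recovered exactly; for the actual cost function the fit only reduces the grid quantization error.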
Having shown the approximate decoupling of the cost function in (25), the delay estimates acquired through the line search during the first cycle of the algorithm are expected to be close to the optimum point. In fact, if the cost function were perfectly decoupled and an infinite precision search grid were utilized, these first estimates would coincide with the true values. After the first cycle, a single delay is estimated based on the other delay estimates obtained in the current and the previous cycles. If these estimates are closer to their optimum values compared to the respective estimates of the previous cycle, the new delay estimate is likely to also lie closer to its optimum point. Thus, estimation accuracy improves from cycle to cycle and DPE is expected to converge. Of course, when path delays are closely spaced, estimates may not converge to the actual values. Simulations conducted for such scenarios and presented in Section 4 show that although some estimates may not reach their optimum values, the algorithm does not diverge and the total channel estimate, $\hat{h} = G(\hat{\tau})\hat{a}$, remains close to $h$.
Among all methods proposed so far for the estimation of channel parameters in a CDMA system, the one most relevant to DPE is the method presented in [16]. The algorithm presented there (whitening sliding correlator with cancellation, called hereafter WSCC) stems from an ML cost function, while the subtraction of each estimated path from the received data comes as a natural application of the SAGE algorithm. On the other hand, our method depends on an LS cost function, which is proven to be almost decoupled with respect to the delay parameters. Hence, the maximization can be performed on every delay parameter separately. The deflation procedure (i.e., extracting the contribution of already resolved paths) is encapsulated naturally in the cost function, yielding better estimation results. One of the main differences between the two methods concerns the estimation of path gains. WSCC estimates each path gain exploiting only the corresponding delay parameter, while DPE estimates the path gains jointly after all path delays have been estimated. Of course, such an approach could be easily adopted as a final step in WSCC as well. Even then, the two methods would not have the same performance, since the joint estimation of path gains in DPE is being exploited while estimating each delay parameter. As will be shown by simulation, DPE exhibits a lower estimation error at the expense of a slight increase in computational complexity compared to WSCC. More specifically, the computational complexity of both algorithms per iteration of the line search is $(MQN)^2 + O(MQN)$. Moreover, both algorithms require as an initial step the inversion of the block diagonal matrix $R_\eta$, which is $O(MQ^2N^3)$. The extra computational cost of DPE is related to the computation of matrix $R_\eta^{-1}$ at the beginning of each step, that is, at the beginning of the line search for a delay parameter. Without taking into consideration the block diagonal form of $R_\eta$, as well as the order recursive form of $S_{(P-1)}$ between consecutive steps of the algorithm, this extra computation requires at most $P(MQN)^2 + O(MQN)$ operations, which can be considered insignificant. Notice here that direct inversion of the block diagonal matrix $R_\eta$ can be avoided by using the approximation (A.7) provided in the appendix. Although this approximation has a significant computational advantage, it may limit the robustness of the scheme to MAI, and it is an issue of current investigation.
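A rough count helps to see why the extra cost can be called insignificant. The sketch below plugs in the simulation values of Section 4 ($N = 256$, $Q = 2$, $M = 5$, $P = 4$, $\delta = 0.125T_c$) and the scan range $[0, NT_c/4)$; it keeps only the leading-order terms and reads the extra $P(MQN)^2$ operations as being spent once per step, both of which are assumptions of this illustration.

```python
# Leading-order operation counts for one line search (one step) of DPE.
M, Q, N, P = 5, 2, 256, 4          # training symbols, oversampling, spreading factor, paths
delta = 0.125                      # grid step in chip periods

data_length = M * Q * N                    # length of the received vector
per_grid_point = data_length ** 2          # cost of one cost-function evaluation
grid_points = int((N / 4) / delta)         # points scanned in [0, N*Tc/4)
line_search_cost = grid_points * per_grid_point
extra_cost = P * per_grid_point            # upper bound on the DPE-specific overhead
extra_ratio = extra_cost / line_search_cost

print(data_length, grid_points, extra_ratio)
```

Under these assumptions the overhead is the ratio $P$ over the number of grid points, so it shrinks further as the search grid becomes finer.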

SIMULATION RESULTS
In this section, we investigate the performance of the new algorithm through computer simulations. Most of the system parameters used in the simulations were in agreement with the UMTS specifications for FDD (frequency division duplexing) [18]. Specifically, the scrambling codes were of length N = 256, the modulation used was BPSK, the chip pulse was a raised cosine function with roll-off equal to 0.22, and the oversampling factor Q was equal to 2. The pilot signal consisted of 5 to 8 symbols, in accordance with the UMTS specifications for channel estimation and other purposes.
ITU vehicular channel A [22], described in Table 1, was used in our simulations. The channel impulse response consisted of four paths (P = 4). The path gains for all users were random variables following a zero mean Gaussian distribution with variances [0, −1, −9, −10] dB, while the path delays of the desired user were fixed to the values [0, 1.19, 2.72, 4.18]T c . Considering the asynchronous nature of the system, the delays of the interfering users were modelled as random variables. The first delay of kth user, τ k,1 , followed a uniform distribution in the interval [0, NT c ), while the remaining three delays were uniformly distributed in the interval [τ k,1 , τ k,1 + 10T c ].
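The random channel of one interfering user, as described above, could be drawn as follows. This is a minimal sketch: the function name is hypothetical, delays are expressed in chip periods, and the gains are taken to be circular complex Gaussian, which is an assumption since the text only states that they are zero mean Gaussian.

```python
import math
import random

def draw_interferer_channel(rng, spreading_factor=256,
                            variances_db=(0.0, -1.0, -9.0, -10.0)):
    """Draw delays (in chips) and complex gains for one interfering user."""
    tau1 = spreading_factor * rng.random()             # uniform in [0, N*Tc)
    delays = [tau1] + [rng.uniform(tau1, tau1 + 10.0)  # within 10 chips of tau1
                       for _ in variances_db[1:]]
    gains = []
    for v_db in variances_db:
        variance = 10.0 ** (v_db / 10.0)
        sigma = math.sqrt(variance / 2.0)              # per real/imaginary component
        gains.append(complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)))
    return delays, gains

if __name__ == "__main__":
    delays, gains = draw_interferer_channel(random.Random(1))
    print(delays, gains)
```

Passing an explicit `random.Random` instance keeps each Monte Carlo run reproducible.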
The estimation accuracy of the proposed algorithm was evaluated in terms of the normalized mean squared channel estimation error (NMSE), that is, the NMSE between actual and estimated total CIR, where $h_{tot}$ is a $2QN \times 1$ vector containing $T_c/2$-spaced samples of the actual total CIR, and $\hat{h}_{tot}$ is defined similarly as the estimated total CIR. The results presented in this section were obtained through 1000 Monte Carlo simulation runs.
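An NMSE of this kind can be computed over Monte Carlo runs as sketched below. The normalization used here (total error energy divided by total channel energy, accumulated over all runs) is an assumption, since other conventions such as averaging per-run ratios are also common; the function names are illustrative.

```python
import math

def nmse(true_channels, estimated_channels):
    """Total squared error over total channel energy, accumulated over all runs."""
    error_energy = 0.0
    channel_energy = 0.0
    for h, h_hat in zip(true_channels, estimated_channels):
        error_energy += sum(abs(a - b) ** 2 for a, b in zip(h, h_hat))
        channel_energy += sum(abs(a) ** 2 for a in h)
    return error_energy / channel_energy

def to_db(value):
    """Express a power ratio in decibels."""
    return 10.0 * math.log10(value)

if __name__ == "__main__":
    runs_true = [[1 + 1j, 0.5j], [0.8, -0.2j]]
    runs_est = [[0.9 + 1j, 0.4j], [0.7, -0.25j]]
    print(to_db(nmse(runs_true, runs_est)))
```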
Comparisons are made with the WSCC algorithm, since this is the most relevant method to DPE among all existing ones. The asymptotic Cramér-Rao bound (CRB) is also presented. Notice here that the parameter estimates $\hat{\tau}$, $\hat{a}$ were obtained by running the basic versions of the two algorithms, that is, without any further refinement by Gauss-Newton iterations or interpolation. The step size used during the maximization procedure for both algorithms was set to $\delta = 0.125T_c$, and two estimation cycles were performed.
In Figures 1-2, the NMSE versus $E_b/N_0$ is presented for a pilot signal of $M = 5$ and 8 symbols, respectively. $E_b$ is defined as the received bit energy for the desired user. There were $K = 64$ active users and the signal-to-interference ratio (SIR), defined as the received power ratio between the desired user and one interfering user (as specified for the UMTS in [18]), was set to SIR = 0 dB. It can be seen that the two algorithms exhibit similar behaviour in the low SNR region (below 15 dB). In the medium to high SNR region, however, DPE outperforms WSCC. Specifically, above 20 dB, each cycle of DPE has a 2 dB gain in NMSE compared to the corresponding cycle of WSCC. Moreover, the first cycle of DPE attains the same NMSE as the second cycle of WSCC. The gain in estimation error is higher for increasing SNR.
Vassilis Kekatos et al.
To evaluate the channel estimation accuracy of the proposed algorithm under different system load conditions, we conducted simulations with K = 16, 64, and 128 active users. Figure 3 shows the NMSE achieved after the second cycle of each algorithm. As expected, heavier system loads result in performance degradation, while DPE still shows higher estimation accuracy. In Figure 4, the robustness of the two algorithms to the near-far problem is investigated. The system here accommodated K = 16 active users, and each of them had an SIR ranging from −20 to 10 dB. The SNR was kept fixed at 20 dB, and M = 5 training symbols were used. Notice that both algorithms are robust to MAI, since their accuracy remained almost constant for all tested SIR values. DPE algorithm exhibits again superior performance.
The simulation results presented before were obtained based on perfect channel estimates for the interfering users and thus perfect knowledge of the MAI covariance matrix. In a more realistic scenario, the BS may not have all this information, either because of Doppler fading, or because one or more interfering users become active before the desired user parameters are estimated. To assess the effects of a time-varying channel, we assumed a maximum mobile velocity of 50 km/h, which at the operating band of 2 GHz leads to a Doppler frequency of around 100 Hz. The worst-case scenario would be when all channel estimates stored at the BS were the ones obtained at the previous slot (0.66 millisecond old [18]). Concerning the problem of unknown users, we tested the case where one or two out of $K = 64$ active users entered the system and the BS did not exploit their contributions in the MAI covariance matrix. The NMSE curves of Figure 5 show that for both Doppler fading and unknown users, the method can still be applied with an inevitable performance loss.
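The quoted Doppler frequency follows from the standard relation between mobile speed, carrier frequency, and the speed of light. The worked numbers below use only the values stated above (50 km/h, 2 GHz, a 0.66 ms slot); the product $f_D T_{slot}$ is added here as a rough indication of how much the channel changes between slots.

```latex
\begin{align*}
f_D &= \frac{v\, f_c}{c}
     = \frac{(50/3.6)\ \text{m/s} \times 2 \times 10^{9}\ \text{Hz}}{3 \times 10^{8}\ \text{m/s}}
     \approx 92.6\ \text{Hz}, \\
f_D\, T_{slot} &\approx 92.6\ \text{Hz} \times 0.66 \times 10^{-3}\ \text{s} \approx 0.061 .
\end{align*}
```

A stored estimate that is one slot old therefore corresponds to a small fraction of a Doppler cycle, which is consistent with the method remaining usable in this scenario.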
The proposed algorithm assumes that the number of dominant channel paths $P$ has been already estimated at the BS, for example, by using an information theoretic criterion (AIC, MDL). However, in practice, $P$ can be overestimated or underestimated. To this end, we evaluated the performance of DPE for $P = 2$ and $P = 6$ paths, while the actual channel consisted of $P = 4$ paths. The simulation results illustrated in Figure 6 indicate that the new method is only slightly affected in case of overestimation with respect to the number of paths, while for high SNRs its performance may be even improved. This is intuitively justified by the fact that searching for more than the actual number of path delays increases the chance of detecting the ensemble of the true delays, especially those of low power. On the other hand, as expected, underestimation of $P$ can result in severe performance degradation, since a part of the channel energy is not captured.

As shown in Section 3.1, decoupling of the delay parameters is based primarily on two conditions: matrix $R_\eta^{-1}$ should possess a "heavy" diagonal, and matrix $S(\tau)$ a near-to-unitary structure. To verify the validity of these assumptions, we plot in Figure 7 the maximum normalized amplitude across the diagonals of the main block of $R_\eta^{-1}$ for three completely different scenarios with respect to SNR, SIR, and number of users. The amplitudes for the first and third scenarios almost coincide, while the second scenario exhibits off-diagonal elements of lower amplitude. In all three cases, the off-diagonal elements of the matrix are one order of magnitude smaller than the diagonal ones. As far as the second condition is concerned, in Figure 8, we plot the normalized amplitude of $S^H(\tau)S(\tau)$ by projecting a 3D mesh plot on the proper sideview. Matrix $S^H(\tau)S(\tau)$ was generated according to the four test environment channel models with different delay spreads, which are described in Table 1. Channel (a), used in the previous simulations, as well as channel (d), have a comparatively large delay spread, and thus matrix $S(\tau)$ is near-to-unitary. However, channels (b) and (c) consist of closely spaced delays and the near-to-unitarity condition is violated.

To investigate DPE's robustness for closely spaced delays, we also simulated ITU indoor office channel B described in Table 1. Since path delays were closely spaced, the algorithm fails to estimate all paths correctly. A single path located at an intermediate delay and one more path of negligible power are usually the estimates for two closely spaced paths. As shown in Figure 9, the performance of the proposed algorithm is not actually affected and $\hat{h}_{tot}$ remains a good estimate of $h_{tot}$. The only possible drawback could be a diversity order loss in case of a RAKE receiver which naturally exploits multipath channel parameters.

CONCLUSIONS
In this paper, a new method for estimating the multipath channel parameters of a single user in the uplink of a DS-CDMA system has been proposed. The estimation procedure is performed at the BS, and multiple access interference from other active users is treated as colored noise. The new method is based on a proper description of the problem via a nonlinear LS cost function which is separable with respect to time delays and gains of the multipath channel. An approximate decoupling of the nonlinear cost function in terms of the delay parameters leads to an iterative procedure of 1D optimizations. At each step of the algorithm, a single delay is estimated while the rest are kept fixed. Additional cycles of the algorithm allow for further improvement of the estimates. The suggested method does not require any specific pilot signal and performs well for a short training interval (5-8 symbol periods). Simulation results have shown its robustness to multiple access interference, as well as its higher estimation accuracy compared to an existing method, at the expense of an insignificant increase in computational complexity. Moreover, in case of unknown users, severe Doppler fading, or underestimation, the method still maintains acceptable performance with an inevitable loss.