EURASIP Journal on Applied Signal Processing 2005:5, 698–708
© 2005 Hindawi Publishing Corporation

A Highly Efficient Generalized Teager-Kaiser-Based Technique for LOS Estimation in WCDMA Mobile Positioning

Line-of-sight (LOS) signal delay estimation is a crucial element of any mobile positioning system. Correctly estimating the delay of the first arriving path is challenging in severe propagation environments, such as closely spaced multipaths in multiuser scenarios. Previous studies showed that many linear and nonlinear techniques are able to resolve closely spaced multipaths when the system is not bandlimited. However, root raised cosine (RRC) pulse shaping introduces additional errors in the delay estimation process, compared to rectangular pulse shaping, because of the inherent bandwidth limitation. In this paper, we introduce a novel technique for asynchronous WCDMA multipath delay estimation based on deconvolution with a suitable pulse shape, followed by the Teager-Kaiser operator. The deconvolution stage is employed to reduce the effect of the bandlimiting pulse shape.


INTRODUCTION
Even though mobile phone positioning is a rather new concept, various potential technologies enabling the development of mobile location techniques have emerged and some of them are already on the market [1]. Positioning technologies have recently been devised using either cellular network-based, mobile-based, or hybrid approaches [1,2,3,4]. In wideband CDMA (WCDMA) networks, mobile positioning is performed based on signal delay measurements from three or more base stations (BSs). In downlink transmission, the signal received from a remote BS can be quite weak, especially when the mobile terminal is close to the serving BS [5]. This situation is usually referred to as the hearability problem. One idea to overcome this problem, initially proposed in [6], is that each BS turns off its transmission for a well-defined period of time to let the terminals within its coverage measure the other BSs. This technique is known as idle period downlink (IPDL) transmission [7]. Hence, the estimation of the first arriving path, which hereinafter will be assumed to be the LOS signal, is done during these idle periods.
At the mobile terminal side, a typical received WCDMA signal is composed of a sum of multiple propagation paths that may arrive at subchip delay intervals, generating closely spaced multipaths [8,9,10]. This scenario of subchip overlapping multipath propagation causes a major degradation of the positioning accuracy [11,12,13]. Many techniques have been proposed in the literature to resolve closely spaced paths. A subspace-based approach, the so-called MUSIC, was proposed in [14,15,16] and proved to have good performance. However, it suffers from high implementation complexity in WCDMA systems. Another family of techniques applied to resolve closely spaced multipath components is based on constrained inverse filtering. The best known ones are the least squares (LS) techniques [17,18,19] and the projection onto convex sets (POCS) algorithm [20,21,22]. The performance of all these techniques is significantly affected by the presence of root raised cosine (RRC) pulses, and further methods should be derived to improve the delay estimates. The use of the nonlinear quadratic Teager-Kaiser (TK) operator for multipath delay estimation in such a scenario was introduced for the first time in [23], where the impact of the pulse shape on the capability of correctly acquiring the multipath delays was also emphasized.
In this paper, we propose a highly efficient generalized Teager-Kaiser (GTK) approach for multipath delay estimation in WCDMA network mobile positioning with RRC pulse shape. The main emphasis is on the closely spaced multipath scenario (i.e., successive paths are at most one chip apart) because this is one of the most challenging situations in the delay estimation process. Also, resolving closely spaced paths leads to other applications, such as maximum ratio combining (MRC) with increased diversity via subchip-spaced components. In Section 2, the channel and signal models are first described. Then, the GTK-based delay estimation technique is introduced in Section 3 with emphasis on the optimization of the filter function in the frequency domain. Simulation results using different Rayleigh and Rician fading channels are provided in Section 4, comparing the proposed GTK-based technique with the conventional TK-based method as well as with well-established techniques such as MUSIC, LS, POCS, and the conventional matched filter (MF). Finally, conclusions are drawn in Section 5.

CHANNEL AND SIGNAL MODEL
The received WCDMA signal at the output of the receiver matched filter (i.e., the filter matched to the transmitter RRC pulse shape), via an $L$-path fading multipath channel and $N_{BS}$ BSs, can be written as [5,10]
$$ r(t) = r_s(t) \otimes p_{rc}(t) + \eta(t), \tag{1} $$
where $\otimes$ stands for the convolution operator, $p_{rc}(t)$ is the raised cosine (RC) pulse shape, and $r_s(t)$ is the signal without the pulse shaping, defined as
$$ r_s(t) = \sum_{u=1}^{N_{BS}} \sqrt{E_{b_u}} \sum_{m=-\infty}^{\infty} \sum_{l=1}^{L} \alpha^{(m)}_{l,u}(t)\, s^{(m)}_{u}\bigl(t - \tau^{(m)}_{l,u}(t)\bigr). \tag{2} $$
Here $E_{b_u}$ is the bit energy of the $u$th BS (we assume that all bits of the same BS have the same energy), $L$ is the number of discrete multipath components (we assume the same number of paths for all BSs), $\alpha^{(m)}_{l,u}(t)$ and $\tau^{(m)}_{l,u}(t)$ represent, respectively, the instantaneous complex-valued time-varying channel coefficient and the delay of the $l$th path of BS $u$ during symbol $m$, $\eta(t)$ is an additive white Gaussian noise of double-sided spectral power density $N_0$ filtered with the root raised cosine filter (pulse shaping at the receiver side), and $s^{(m)}_{u}(\cdot)$ is the signature of BS $u$ including data modulation, defined as (for clarity, we assume that all BSs have the same symbol period $T$ and the same chip period $T_c$)
$$ s^{(m)}_{u}(t) = \sum_{k=1}^{S_F} c^{(m)}_{k,u}\, \delta(t - mT - kT_c), \tag{3} $$
where $c^{(m)}_{k,u}$ is the $k$th chip of BS $u$ during the $m$th symbol, $\delta(\cdot)$ is the Dirac function, and $S_F$ is the spreading factor, assumed to be the same for all BSs. The signatures of all BSs are assumed to be known at the receiver. This corresponds to the situation where pilot signals are available, for example, the common pilot channel (CPICH) signals in the downlink WCDMA environment [7].
The output of the matched filter (or correlator) during the $m$th symbol with lag $\tau$, denoted $y^{(rc)}_{u}(m, \tau)$, is given by (4). Inserting (1) into (4), and after some manipulations, $y^{(rc)}_{u}(m, \tau)$ can be rewritten in the form (5), where $\tilde{\eta}(\cdot)$ is the filtered noise and the expression involves a function $\gamma(\cdot)$ and a term $\beta^{(m,m')}_{k,k',u,v}$.

GENERALIZED TEAGER-KAISER-BASED DELAY ESTIMATION
The nonlinear quadratic Teager-Kaiser (TK) operator was first introduced for measuring the real physical energy of a system [24]. It was found that this operator is simple, efficient, and able to track instantaneously varying spatial modulation patterns [25]. Since its introduction, several other applications have been found for the TK operator, one of the most recent being the estimation of closely spaced paths in DS-CDMA systems, introduced by the authors for GPS and WCDMA systems [23,26,27]. We found that the TK operator performs well in separating closely spaced paths when rectangular pulse shaping is used. However, the performance degrades with a bandlimiting pulse shape such as RRC, which is the case in the WCDMA system.
The continuous-time TK energy operator of a complex signal $\varphi_c(t)$ is defined by
$$ \Psi_c[\varphi_c(t)] = \dot{\varphi}_c(t)\,\dot{\varphi}_c^{*}(t) - \tfrac{1}{2}\bigl[\ddot{\varphi}_c(t)\,\varphi_c^{*}(t) + \varphi_c(t)\,\ddot{\varphi}_c^{*}(t)\bigr], \tag{8} $$
and similarly the discrete-time TK operator applied to a discrete complex signal $\varphi_d(n)$ is readily defined by [27,28]
$$ \Psi_d[\varphi_d(n)] = \varphi_d(n-1)\,\varphi_d^{*}(n-1) - \tfrac{1}{2}\bigl[\varphi_d(n-2)\,\varphi_d^{*}(n) + \varphi_d(n)\,\varphi_d^{*}(n-2)\bigr]. \tag{9} $$
In [23], the authors demonstrated the good performance and low computational complexity of the TK approach, especially for ideal rectangular pulse shapes, when compared to well-known techniques (e.g., MUSIC) for estimating closely spaced multipath delays in CDMA systems. However, the probability of acquisition of all compared techniques deteriorates dramatically when the RC pulse shape filter is used. In this contribution, we generalize the TK-based multipath delay estimation technique to bandlimited pulse shaping. The idea is to introduce a new deconvolution-type filter function $x(t)$ with which we filter the correlation function $y^{(rc)}_{u}(m, \tau)$ obtained via RC pulse shaping, in order to recover an approximation of the correlation function $y^{(triang)}_{u}(m, \tau)$ ideally obtained via rectangular pulse shaping.
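As a rough illustration of the discrete-time TK operator, the following Python sketch applies it to a complex exponential and to a triangular correlation peak of the kind produced by rectangular pulse shaping. The function name `teager_kaiser`, the zero-filling of the first two outputs, and both test signals are assumptions made for this example only; they are not part of the simulation setup of this paper.

```python
import cmath
import math


def teager_kaiser(phi):
    """Discrete-time Teager-Kaiser energy of a (possibly complex) sequence.

    Output n combines phi[n-2], phi[n-1] and phi[n]:
        phi[n-1]*conj(phi[n-1]) - 0.5*(phi[n-2]*conj(phi[n]) + phi[n]*conj(phi[n-2]))
    so a peak of the input at index p appears at output index p + 1.
    The first two outputs are set to 0.0 (an assumption of this sketch).
    """
    out = [0.0] * min(2, len(phi))
    for n in range(2, len(phi)):
        centre = phi[n - 1] * phi[n - 1].conjugate()
        cross = phi[n - 2] * phi[n].conjugate() + phi[n] * phi[n - 2].conjugate()
        out.append((centre - 0.5 * cross).real)
    return out


# Complex exponential A*exp(j*w*n): the operator returns 2*A^2*sin(w)^2.
amp, omega = 2.0, 0.3
tone = [amp * cmath.exp(1j * omega * n) for n in range(16)]
tone_energy = teager_kaiser(tone)
expected_tone = 2 * amp ** 2 * math.sin(omega) ** 2

# Triangular correlation peak with its apex at index 8.
triangle = [max(0.0, 1.0 - abs(n - 8) / 8.0) for n in range(17)]
triangle_energy = teager_kaiser(triangle)
peak_index = triangle_energy.index(max(triangle_energy))  # apex index + 1
```

On the linear slopes of the triangle the output is small and constant, while at the apex it is markedly larger, which is the property that makes the operator attractive for locating overlapping triangular correlation peaks.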

Basic deconvolution approach
Assuming that the WCDMA transmitter and receiver were using ideal rectangular pulse shaping filters, the matched filter (or correlator) output $y^{(triang)}_{u}(m, \tau)$ could be expressed, similarly to (5), as in (10), where $p_{triang}(\cdot)$ is the triangular pulse shape (i.e., the convolution of two rectangular pulses) and $\tilde{\eta}(\cdot)$ is an additive white Gaussian noise of double-sided spectral power density $N_0$ filtered with a rectangular filter (pulse shaping at the receiver side). The desired deconvolution filter impulse response $x(t)$ should satisfy the following equation:
$$ y^{(triang)}_{u}(m, \tau) = y^{(rc)}_{u}(m, \tau) \otimes x(\tau). \tag{11} $$
In the absence of noise, substituting (5) and (10) into (11) shows that a sufficient condition on $x(\tau)$ is
$$ p_{rc}(\tau) \otimes x(\tau) = p_{triang}(\tau). \tag{13} $$
By applying the Fourier transform to (13), it follows that
$$ P_{rc}(f)\, X(f) = P_{triang}(f), \tag{14} $$
where $P_{rc}(\cdot)$, $P_{triang}(\cdot)$, and $X(\cdot)$ are the Fourier transforms of $p_{rc}(\cdot)$, $p_{triang}(\cdot)$, and $x(\cdot)$, respectively. Finally, the desired frequency response $X(f)$ should satisfy
$$ X(f) = \frac{P_{triang}(f)}{P_{rc}(f)}. \tag{15} $$
Some modifications in the frequency domain are needed to avoid the division by zero in (15). Basically, the filter function $X(f)$ should be optimized to maximize the probability of LOS acquisition. First, we consider a simple rectangular window in the frequency domain to eliminate the division by zero, and then the effect of a linear transition between passband and stopband is also considered. With the obtained filter $x(t)$ and given $y^{(rc)}_{u}(m, \tau)$, we derive the filtered correlation function $\hat{y}_{u}(m, \tau) = y^{(rc)}_{u}(m, \tau) \otimes x(\tau)$.

Frequency-domain optimization
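The frequency responses of the raised cosine and triangular (i.e., convolution of two rectangular pulses) pulses are given by [10]
$$ P_{rc}(f) = \begin{cases} T_c, & |f| \le \dfrac{1-\alpha}{2T_c}, \\[4pt] \dfrac{T_c}{2}\left\{1 + \cos\left[\dfrac{\pi T_c}{\alpha}\left(|f| - \dfrac{1-\alpha}{2T_c}\right)\right]\right\}, & \dfrac{1-\alpha}{2T_c} < |f| \le \dfrac{1+\alpha}{2T_c}, \\[4pt] 0, & \text{otherwise}, \end{cases} \tag{16} $$
and
$$ P_{triang}(f) = T_c\, \mathrm{sinc}^2(f T_c), \tag{17} $$
respectively, where $\alpha$ is the rolloff factor (in the WCDMA system, $\alpha = 0.22$ [7]), and the function $\mathrm{sinc}(\cdot)$ is defined by
$$ \mathrm{sinc}(x) = \frac{\sin(\pi x)}{\pi x}. \tag{18} $$
We point out here that at the frequency $f_z = (1 + \alpha)/(2T_c)$ the raised cosine frequency response reaches zero whereas the triangular one does not, so that $X(f)$ in (15) grows without bound as $|f| \to f_z$. Therefore, the optimization of the deconvolution filter $X(f)$ consists of finding the frequency $f^{max}_{z}$ beyond which the deconvolution filter frequency response is designed to approximate zero.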
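The behaviour of the unconstrained ratio between the two spectra can be checked numerically. The sketch below assumes the standard raised cosine spectrum and the $T_c\,\mathrm{sinc}^2(fT_c)$ spectrum of a triangular pulse, with the chip interval normalised to 1; the function names are illustrative choices and not taken from the paper.

```python
import math

ALPHA = 0.22  # WCDMA rolloff factor quoted in the text
T_C = 1.0     # chip interval, normalised to 1 for this illustration


def sinc(x):
    """Normalised sinc, sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def p_rc(f, alpha=ALPHA, t_c=T_C):
    """Raised cosine frequency response (standard textbook form)."""
    f = abs(f)
    f_low = (1 - alpha) / (2 * t_c)
    f_high = (1 + alpha) / (2 * t_c)
    if f <= f_low:
        return t_c
    if f < f_high:
        return 0.5 * t_c * (1 + math.cos(math.pi * t_c / alpha * (f - f_low)))
    return 0.0


def p_triang(f, t_c=T_C):
    """Frequency response of the triangular pulse (two convolved rectangles)."""
    return t_c * sinc(f * t_c) ** 2


def x_ideal(f):
    """Unconstrained ratio P_triang/P_rc; raises ZeroDivisionError for |f| >= f_z."""
    return p_triang(f) / p_rc(f)


f_low = (1 - ALPHA) / (2 * T_C)
f_z = (1 + ALPHA) / (2 * T_C)
```

With these definitions, `p_rc(f_z)` is exactly zero while `p_triang(f_z)` is still roughly a quarter of its value at the origin, which is why the ratio has to be truncated below $f_z$.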
In Figure 2, we show the frequency responses of both triangular and raised cosine pulses, together with a possible value for the frequency $f^{max}_{z}$, which must satisfy
$$ \frac{1-\alpha}{2T_c} \le f^{max}_{z} < \frac{1+\alpha}{2T_c}. \tag{19} $$
Here, we notice that when $f^{max}_{z}$ is close to $(1 + \alpha)/(2T_c)$, the overall frequency response $X(f)$ has a very large discontinuity at this frequency. To reduce the discontinuity, we set $f^{max}_{z}$ in such a way that $X(f^{max}_{z})$ is in the same range as $X(0)$ (in our case $X$ is normalized so that $X(0) = 1$). Hence, for a chosen value of $X(f^{max}_{z})$, $f^{max}_{z}$ should approximately satisfy
$$ X(f^{max}_{z}) \approx \beta\, X(0) = \beta, $$
where $\beta$ is a design parameter, which will be discussed in Section 4.
The value retained for $f^{max}_{z}$ can be determined empirically in an iterative manner, as the search range is limited (see (19)). Once the value of $f^{max}_{z}$ is set, a rectangular transition
$$ X(f) = \begin{cases} \dfrac{P_{triang}(f)}{P_{rc}(f)}, & |f| \le f^{max}_{z}, \\[4pt] 0, & \text{otherwise}, \end{cases} $$
can be used. This kind of solution introduces an abrupt change in the frequency response $X(f)$ and also discards the range of frequencies between $f^{max}_{z}$ and $f_z = (1 + \alpha)/(2T_c)$ (the first zero of the frequency response of the raised cosine pulse), which may decrease the performance of GTK. One alternative to the abrupt transition is to use a linear transition between $f^{max}_{z}$ and $f_z$, whose coefficients are denoted $A_0$ and $B_0$. We point out here that for an abrupt transition we have $A_0 = B_0 = 0$.
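The two truncation strategies can be sketched as follows. This is an illustration under stated assumptions: the chip interval is normalised to 1, the cutoff is located by bisection so that the ideal ratio equals a chosen level `beta` (taken here as the value of $X$ at the cutoff), and the linear transition is modelled as a straight ramp from that value down to zero at $f_z$. The exact transition coefficients used by the authors are not reproduced here, so the ramp is a hypothetical stand-in.

```python
import math

ALPHA, T_C = 0.22, 1.0
F_LOW = (1 - ALPHA) / (2 * T_C)  # lower edge of the raised cosine rolloff
F_Z = (1 + ALPHA) / (2 * T_C)    # first zero of the raised cosine response


def ideal_x(f):
    """Ratio P_triang(f)/P_rc(f); only valid for |f| < F_Z."""
    f = abs(f)
    arg = math.pi * f * T_C
    num = T_C if f == 0 else T_C * (math.sin(arg) / arg) ** 2
    if f <= F_LOW:
        den = T_C
    else:
        den = 0.5 * T_C * (1 + math.cos(math.pi * T_C / ALPHA * (f - F_LOW)))
    return num / den


def find_f_max(beta, iterations=60):
    """Bisection for ideal_x(f) = beta on [F_LOW, F_Z).

    Assumes beta is larger than ideal_x(F_LOW), which is about 0.59 here.
    """
    lo, hi = F_LOW, F_Z - 1e-6
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if ideal_x(mid) < beta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def x_filter(f, f_max, linear=False):
    """Truncated deconvolution response with abrupt or (assumed) linear transition."""
    f = abs(f)
    if f <= f_max:
        return ideal_x(f)
    if linear and f < F_Z:
        # Hypothetical ramp: from ideal_x(f_max) at f_max down to 0 at F_Z.
        return ideal_x(f_max) * (F_Z - f) / (F_Z - f_max)
    return 0.0


f_max = find_f_max(beta=1.0)
f_mid = 0.5 * (f_max + F_Z)
```

For example, `x_filter(f_mid, f_max)` is zero with the abrupt transition, whereas `x_filter(f_mid, f_max, linear=True)` returns half of the value at the cutoff.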

SIMULATION RESULTS
The simulations are carried out in a downlink multiuser WCDMA environment with both Rayleigh and Rician fading channel paths. The delay estimation is based on the CPICH signal available at the mobile terminal. The channel is assumed to be Rayleigh with probability $p_R$ and Rician with probability $1 - p_R$. The delay separation between successive paths is uniformly distributed in $[T_c/N_s, T_c]$, where $N_s$ denotes the oversampling factor and $T_c$ is the chip interval. This corresponds to the case of closely spaced paths. The path delays are modeled as integer multiples of the sampling period. Our choice of $N_s$ follows from this way of modeling the delays and the separation between successive paths. For a given $N_s$, the resolution is $1/N_s$ chips. Therefore, for $N_s = 8$ the minimum path separation is 32.5 nanoseconds, which is quite reasonable in outdoor propagation. If we use a smaller oversampling factor ($N_s = 2$ or 4), all the algorithms show the same behavior, but the estimation error is higher. In order to achieve sufficient accuracy, we can either increase the oversampling factor or use interpolation techniques to estimate delays that are not multiples of the sampling interval (using, e.g., polynomial-based interpolation filters, which can be efficiently implemented using Farrow structures [27]).
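The delay model described above can be sketched in a few lines of Python. The function names, the choice of placing the first path at delay 0, and the seeded generator are assumptions of this example; only the chip rate, the oversampling factor, and the separation range come from the text.

```python
import random

CHIP_RATE = 3.84e6  # chips per second (WCDMA)


def sample_period_ns(n_s):
    """Sampling period in nanoseconds for an oversampling factor n_s."""
    return 1e9 / (CHIP_RATE * n_s)


def draw_path_delays(num_paths, n_s, rng):
    """Path delays in samples, successive separations uniform in {1, ..., n_s}.

    One sample equals T_c/n_s, so the separation lies in [T_c/n_s, T_c].
    The first path is placed at delay 0 (an assumption of this sketch).
    """
    delays = [0]
    for _ in range(num_paths - 1):
        delays.append(delays[-1] + rng.randint(1, n_s))
    return delays


rng = random.Random(1)
delays = draw_path_delays(num_paths=4, n_s=8, rng=rng)
resolution_ns = sample_period_ns(8)  # about 32.55 ns
```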
In all the simulations, we considered $N_{BS}$ base stations with $N_u$ users per BS, and the total power of the dedicated physical data channels (DPDCH) is 10 dB higher than the power of the CPICH. We used noncoherent averaging over several symbols after the square envelope detection to reduce the effect of noise (see Figure 1). We assumed that the fading is independent from one block of symbols to another, which corresponds to the situation with high mobile velocity.
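Noncoherent averaging after square envelope detection can be illustrated as follows. The function name and the input layout (one list of complex correlator outputs per symbol, indexed by lag) are assumptions made for this sketch.

```python
def noncoherent_average(correlations):
    """Average the squared envelope over symbols, separately for each lag.

    correlations[m][k] is the complex correlator output of symbol m at lag k.
    """
    num_symbols = len(correlations)
    num_lags = len(correlations[0])
    return [
        sum(abs(symbol[k]) ** 2 for symbol in correlations) / num_symbols
        for k in range(num_lags)
    ]


# Two symbols whose peaks have opposite phase: a coherent sum would cancel,
# whereas the squared-envelope average keeps the peak at lag 0.
symbols = [[1 + 0j, 0j], [-1 + 0j, 0j]]
averaged = noncoherent_average(symbols)
coherent_sum = sum(symbol[0] for symbol in symbols)
```

Because the phase of each symbol is discarded before averaging, the method remains usable when the fading changes from one block of symbols to the next.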
The threshold for LOS component estimation is set adaptively based on estimates of the signal to interference-plus-noise ratio (SINR) of the system [29]. In weak interference conditions, we found that the statistics of the MF output (mean and median) efficiently characterize the received signal power, and thus they are sufficient to build the threshold. However, in strong interference conditions, the statistics of the MF output are very noisy and cannot be used for accurate estimation of the delay of the LOS signal. We showed that a threshold depending only on the inverse of the estimated SINR performs well in terms of correctly acquiring the delay of the LOS signal. This kind of threshold was found empirically [29]. It depends on $\widehat{\mathrm{SINR}}$, an estimate of the signal to interference-plus-noise ratio, on $G_m$, the geometrical mean between the mean and the median of the MF output, and on $\xi_T$, a threshold parameter that defines the boundary between weak and strong interference (here, $\xi_T = 14$) [29].
First, we show the effect of the choice of $f^{max}_{z}$ on the performance of GTK. Then, we present the effect of the linear transition used in the expression of $X(f)$ when compared to the abrupt transition, and finally we compare GTK to TK and other known algorithms such as MUSIC, LS, POCS, and MF in terms of the capability of correctly acquiring the LOS component.
The effect of the deconvolution filters with abrupt transition on GTK performance, in the case of 4 closely spaced paths with average powers −2, 0, 0, and −3 dB, is presented in Figure 4. The channel profile is Rayleigh with probability $p_R = 0.7$ and Rician with probability $1 - p_R$ (the Rician factor is exponentially distributed with mean $\mu_{Rice} = 4$). In this figure, we show the probability of LOS acquisition within ±1 sample error. The number of BSs used is 3 (i.e., 2 interfering BSs) with equal spreading factor $S_F = 256$ [7].
The near-far ratio (NFR) is defined as in [14,15], $\mathrm{NFR} = 10 \log_{10}(P_k/P_1)$, where $P_k$ is the power of the interfering BSs (all interfering BSs have the same power), and $P_1$ is the power of the desired BS. The probability of LOS acquisition is computed over $N_{random}$ realizations of the channel ($N_{random} = 200$). We can see that the filters FA$_2$ and FA$_3$ achieve the highest probability of LOS acquisition. If we choose $f^{max}_{z}$ to be very close to the frequency $(1 + \alpha)/(2T_c)$ (the zero of the RC response), where the discontinuity is very high, GTK performs worse than TK. Also, if $f^{max}_{z}$ is very close to the frequency $(1 - \alpha)/(2T_c)$ (filter type FA$_4$), the discontinuity is low and GTK performs better than TK, but the probability of acquisition is still lower than in the cases of FA$_2$ and FA$_3$. Therefore, the optimum filters with abrupt transition correspond to the situation where the discontinuity is moderate (i.e., $\beta \in [0.7, 1.3]$).
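The two figures of merit used in this section can be computed as in the following sketch. The function names and the toy delay values are hypothetical; they only illustrate the NFR definition and the counting of acquisitions within ±1 sample.

```python
import math


def nfr_db(p_interferer, p_desired):
    """Near-far ratio in dB: 10*log10(P_k / P_1)."""
    return 10 * math.log10(p_interferer / p_desired)


def los_acquisition_probability(estimated, true, tolerance=1):
    """Fraction of realizations whose LOS delay estimate is within the tolerance.

    estimated[i] and true[i] are delays in samples for channel realization i.
    """
    hits = sum(1 for e, t in zip(estimated, true) if abs(e - t) <= tolerance)
    return hits / len(true)


# Toy example with four realizations (values chosen only for illustration).
estimated_delays = [10, 12, 15, 9]
true_delays = [10, 10, 14, 12]
p_los = los_acquisition_probability(estimated_delays, true_delays)
```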

Frequency response optimization
Next, we consider the effect of a linear transition in the range of frequencies $[f^{max}_{z}, f_z]$. The level of the discontinuity is kept the same as for the abrupt transition. Figure 5 shows the four different types of filters considered. The effect of these deconvolution filters on GTK performance, under the same simulation parameters as those described for the abrupt transition, is characterized by the probability of LOS acquisition shown in Figure 6. We can see that the filters FL$_2$ and FL$_3$ give the highest probability of LOS acquisition. For example, at NFR = 0 dB the probability increases from 0.5 with TK to approximately 0.68 with both FL$_2$ and FL$_3$.
Therefore, in both cases, with abrupt transition and linear transition the optimum filters correspond to the situation where the discontinuity is moderate (cases of FA 2 , FA 3 , FL 2 , and FL 3 ).
The comparison of optimum filters with abrupt transition to the optimum filters with linear transition is shown in Figure 7. We can see that the probability of LOS acquisition is higher with linear transition than it is with abrupt transition. Therefore, the optimum filter which will be retained is FL 2 (linear transition and moderate discontinuity).

Performance comparison
To compare the performance of GTK with other delay estimation methods (TK, LS, POCS, MUSIC, and MF), we selected the best filter among the tested ones, which is FL$_2$. Figures 8 and 9 show the probability of LOS acquisition for TK, GTK, MUSIC, MF, LS, and POCS in the case of 5 closely spaced paths with average powers −1, 0, 0, −2, and −3 dB. The first channel tap is Rayleigh fading with probability $p_R = 0.9$, and Rician with probability 0.1. The number of BSs used is 3 with 8 users in each. In Figure 8 the spreading factor is $S_F = 64$ and in Figure 9 it is $S_F = 256$. We used different spreading factors in Figures 8 and 9 because with $S_F = 256$, which is the value given by the standard, the simulation of the MUSIC algorithm is time and memory consuming. Therefore, for $S_F = 64$ the MUSIC algorithm is included in the comparison, and for $S_F = 256$ only GTK, TK, POCS, MF, and LS are included. The noncoherent block averaging length is equal to 30 symbols and $E_b/N_0 = 0$ dB. It is clear that GTK outperforms TK, POCS, MF, LS, and MUSIC in terms of probability of LOS acquisition. At high NFR, the MF and LS algorithms give very poor results (NFR ≥ 10 dB, $P_{LOS} \le 0.01$). At low NFR, GTK achieves an improvement of 5 to 15% in the probability of LOS acquisition when compared to TK. For example, at NFR = −10 dB, the probability increases from 0.55 with POCS to 0.58 with TK, and to 0.63 with GTK (see Figure 8). POCS and TK have almost the same performance. The MUSIC algorithm performs better than MF and LS, especially at moderate to high NFR (when MUSIC is compared to the LS algorithm, an improvement of 7 to 25% can be achieved: 7% at NFR = 20 dB and 25% at NFR = 0 dB).
When we compare Figures 8 and 9, we can see slightly better performance for $S_F = 64$. This can be explained by the fact that different channel realizations were used. However, the overall behavior is the same in both cases.
Also, we mention here that the measure of performance considered in this paper is the probability of LOS acquisition within ±1 sample error. This means that the first arriving path is detected within an error of 32.5 nanoseconds (given a chip rate of $3.84 \times 10^6$ chips per second and an oversampling factor $N_s = 8$ [7]). If we assume that the first arriving path from at least 3 BSs is always the LOS signal, then 1 sample error corresponds to a range error of less than 10 meters (32.5 ns multiplied by the speed of light). However, when the first path is an NLOS component, the error is larger and depends on the maximum delay spread of the channel. To present the results in terms of location error, a procedure for detecting LOS and NLOS situations should be provided. This topic is of utmost importance in the positioning procedure, but it is a separate research topic, which is outside the scope of this paper. The authors have presented two different techniques for detecting whether the first path received by the mobile station is a LOS or NLOS signal [30]. However, in this paper we chose to concentrate on the multipath delay estimation.
To assess the robustness of these algorithms to the noise level, we fixed the NFR to −10 dB. Figure 10 shows the probability of LOS acquisition for TK, GTK, MUSIC, MF, LS, and POCS in the case of 5 closely spaced paths with average powers −3, −1, 0, −3, and −4 dB. The first channel tap is Rayleigh fading with probability $p_R = 0.7$, and Rician with probability 0.3. The number of BSs used is 3 with 32 users in each. We can see that the LS algorithm is very sensitive to the noise level. The superiority of GTK over all other algorithms is maintained for all values of $E_b/N_0$.
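The relation between a timing error in samples and the corresponding range error follows from simple arithmetic, sketched below. The function name is illustrative; the chip rate and oversampling factor are those quoted in the text, and the conversion assumes free-space propagation at the speed of light.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second
CHIP_RATE = 3.84e6              # chips per second (WCDMA)


def range_error_m(num_samples, n_s=8):
    """Range error in metres for a delay error of num_samples samples."""
    sample_period_s = 1.0 / (CHIP_RATE * n_s)
    return num_samples * sample_period_s * SPEED_OF_LIGHT


one_sample_error = range_error_m(1)  # roughly 9.76 m for N_s = 8
```

A one-sample timing error at $N_s = 8$ therefore maps to a range error of a little under 10 meters per BS measurement, before any geometric dilution in the position solution.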

Complexity comparison
In this paper, we consider mostly the delay estimation performance in the design of the deconvolution filter $x(t)$. In a practical implementation, the system can be simplified in many ways. In a frequency-domain implementation, the overall cost depends mainly on the TK operator complexity, which is very low as seen in (9), and on the FFT/IFFT blocks. The required FFT/IFFT length needs further investigation to obtain a good tradeoff between low complexity and best performance (in our simulations we used a pulse length of $2^{14}$). The deconvolution filter impulse response can also be approximated in the time domain and implemented by an FIR filter. With the performance of current digital signal processors (DSPs) and field-programmable gate arrays (FPGAs), a quite efficient, low-complexity implementation can be achieved using either a programmable or a nonprogrammable solution.
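A frequency-domain implementation of the filtering step can be sketched as below. For self-containment the sketch uses a naive $O(N^2)$ DFT from the standard library; a practical implementation would use an FFT. The function names and the `response` callable, which stands for a sampled filter response such as $X(f)$, are assumptions of this example.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT/IFFT block)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def filter_in_frequency(y, response, sample_rate):
    """Multiply the spectrum of y by response(f) and transform back."""
    n = len(y)
    spectrum = dft(y)
    for k in range(n):
        # Bin k corresponds to a signed frequency in [-sample_rate/2, sample_rate/2).
        f = (k if k < n / 2 else k - n) * sample_rate / n
        spectrum[k] *= response(f)
    return dft(spectrum, inverse=True)


y = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 0.0, 0.0]
unchanged = filter_in_frequency(y, lambda f: 1.0, sample_rate=8.0)
dc_only = filter_in_frequency(y, lambda f: 1.0 if f == 0 else 0.0, sample_rate=8.0)
```

Since the response depends only on the frequency grid, its samples can be precomputed once and reused for every correlation function, so the per-estimate cost is dominated by the two transforms.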
Considering mobile positioning applications where the mobile station is actively involved, which is the case for WCDMA systems, a very good tradeoff between delay estimation accuracy and implementation complexity should be achieved. Taking into account the simulation results, we can see that the most promising techniques able to resolve closely spaced paths are the GTK, TK, and POCS algorithms. The complexity of POCS (i.e., the number of required multiplications per delay estimate) is of the order of $O(N_{iter}\, \tau_{max}^{3})$, where $N_{iter}$ is the number of POCS iterations (e.g., equal to 5 in our simulations) and $\tau_{max}$ is an integer related to the expected maximum delay spread of the channel. The high complexity of POCS is mainly due to the matrix inversion operations, and can be a limiting factor for implementation in a mobile terminal. However, the complexity can be substantially reduced if the matrices and their inverses are computed only once and stored in memory throughout the iterative process. The TK algorithm, as shown by its expression in (9), has very low implementation complexity, but its performance in estimating the first arriving path is not as good as that of the GTK algorithm. The main difference between the TK and GTK implementations, as shown in Figure 1, is the FFT/IFFT blocks (the coefficients of $X(f)$ need to be computed only once, as they are independent of the received signal). The good performance of GTK can justify the small increase in implementation complexity.
In Table 1, we show a comparative summary of all the presented algorithms from the point of view of the estimation accuracy and implementation complexity.

CONCLUSIONS
In this contribution, a new generalized TK technique for LOS signal delay estimation in downlink WCDMA transmission, using root raised cosine pulse shaping, over closely spaced multipath fading channels was introduced and its performance was compared with well-established methods. The correlation function between the received signal and the locally generated code is filtered with the proposed deconvolution filter to approximate the rectangular-pulse-shape-based correlation function. Then, the nonlinear TK operator is applied to the filtered correlation function to estimate the LOS delay through the use of an adaptive threshold. Simulation results confirmed the good performance of the proposed GTK deconvolution-based technique compared to MF, MUSIC, POCS, LS, and the conventional TK method. With the proposed modifications, the performance of the GTK technique approaches that of TK in systems with rectangular pulse shape. The overall structure has reasonable computational complexity and is simple to implement. However, further work is still needed to find an efficient practical implementation structure.