EURASIP Journal on Applied Signal Processing 2005:3, 359–368 © 2005 Hindawi Publishing Corporation

Comparison between Coherent and Noncoherent Receivers for UWB Communications

We present a comparison between coherent and noncoherent UWB receivers under a realistic propagation environment that also takes into account the effect of path-dependent pulse distortion. As far as coherent receivers are concerned, both maximal ratio combining (MRC) and equal gain combining (EGC) techniques are analyzed, considering a limited number of estimated paths. Furthermore, two classical noncoherent schemes, a differential detector and a transmitted-reference receiver, are considered together with two iterative solutions recently proposed in the literature. Finally, we extend the multisymbol approach to the UWB case and propose a decision-feedback receiver that reduces the complexity of the multisymbol strategy while still maintaining good performance. While traditional noncoherent receivers exhibit a performance loss compared to coherent detectors, the iterative and decision-feedback receivers achieve an error probability close to that of an ideal RAKE, without requiring channel estimation, in the presence of a static indoor channel and limited multiuser interference.


INTRODUCTION
Ultra-wideband (UWB) systems are based on the transmission of subnanosecond pulses, typically obtained by directly driving an antenna with short electrical pulses. According to the FCC regulation of February 2002, signals belonging to this category are required to possess a −10 dB bandwidth that exceeds 500 MHz or a fractional bandwidth that exceeds 20% [1].
Recently, this technology has been considered for both ad hoc [2] and indoor wireless personal area networks (IEEE 802.15.3a). UWB characteristics are claimed to meet the requirements of these applications, in particular, low complexity, low cost, low power consumption, and high data rate connectivity [3]. Furthermore, the fine delay resolution guaranteed by the large signal bandwidth provides high robustness in dense multipath environments [4].
On the other hand, to fully exploit the channel diversity, a conventional coherent RAKE receiver must be able to capture and track the energy associated with a high number of multipath replicas. In [5], it is shown that the number of paths needed to capture 85% of the overall energy can sometimes exceed 100. In addition, the radiation and propagation processes can act on the transmitted pulse as a filter whose characteristics vary from path to path. Therefore, the received signal can be seen as a train of distorted waveforms that often show little resemblance to the transmitted pulse [6,7].
Due to complexity constraints, only a small subset of the received replicas is expected to be selected and combined, a fact that justifies the performance loss illustrated in [4,8,9,10] for various selection combining methods. Furthermore, the presence of pulse distortion increases the complexity of the channel estimation algorithm [11,12], a topic that has not been fully analyzed in the literature yet. In general, it can be expected that complexity constraints will impose suboptimal solutions and determine a further performance loss.
A different approach to overcome all the above-mentioned disadvantages is based on the use of noncoherent reception techniques. These techniques do not require channel estimation and make it possible to capture a large amount of the received energy, despite distortions and multipath propagation. They represent, however, a suboptimal solution compared to coherent receivers, because a noisy signal is adopted as the reference waveform for the demodulation process.
A technique belonging to this category is based on the transmitted-reference (TR) or the autocorrelation principle (see [10,13,14]). According to this technique, the reference waveform is obtained by averaging over a preamble of unmodulated signals.
The same principle is employed by differential receivers (DRs) [15]. In this case, since the data is differentially modulated, the signal associated with the information transmitted at time n − 1 represents a valid template for the demodulation of the signal at time n.
Finally, in [2], some reception schemes, based on the adoption of energy detectors and orthogonal modulations, are presented.
All these techniques lead to low-complexity receivers that are able, in principle, to capture a large portion of the transmitted energy and are less sensitive than coherent demodulators to channel variations and synchronization mismatch [15].
Some strategies have been recently proposed to reduce the suboptimality of noncoherent detectors [13]. Assuming the channel static over a block of N, N > 1, transmitted signals, the premise of these strategies is that each received signal contains information that can be used to improve the estimation of the reference waveform. In [13], in particular, TR systems are considered and two maximum likelihood (ML) iterative strategies for template estimation are analyzed. The complexity of these techniques is increased by the fact that the iteration process involves the correlation operation, so that the samples of the received signals must be stored and reprocessed.
As far as traditional differential demodulators are concerned, a well-known technique to reduce their suboptimality is multisymbol detection, developed in [16] for narrowband systems. This method does not require iterations; its complexity, however, is exponential in the block length and quadratic in the number of correlation operations. An established strategy to reduce the complexity of this technique is based on feeding back to the demodulator the estimates of a certain number of previously received symbols. This drastically simplifies the demodulator operations.
In this paper, we present a comparison in terms of complexity and performance of coherent and noncoherent receivers for UWB communications. To this aim, we propose a simple channel model, based on [5], able to take into account the path-dependent pulse distortion due to propagation. After briefly describing the principal coherent and noncoherent strategies available in the literature and discussing their complexity, we theoretically analyze how to extend the multisymbol concept to the UWB case. In addition, we propose a decision-feedback (DF) strategy to overcome the complexity issue shown by the above-mentioned receiver.
Finally, we compare the performance of these systems in terms of error probability. To this end, we obtained by simulation the bit error rate (BER) curves for both coherent and noncoherent detectors. In particular, both a single-user and a multiple-access time-hopping scenario are considered.
The paper is organized as follows. In Section 2, the analyzed UWB system is described in its three main components, the transmitter, the channel, and the receiver. In Section 3, the coherent and noncoherent schemes are presented, analyzing, in particular, the architectural complexity of each solution. In Section 4, the simulation results are presented and, finally, some concluding remarks are given in Section 5.

Introduction
We consider a UWB system employing binary pulse amplitude modulation (2PAM). The signal transmitted by user $k$, $s^{(k)}(t)$, is divided into blocks of length $T$ seconds, each one carrying $N_d$ 2PAM symbols $\{\alpha_j^{(k)}\}$. In formulae,
$$s^{(k)}(t) = \sum_i s_i^{(k)}(t - iT). \tag{1}$$
The expression of $s_i^{(k)}(t)$ depends on the transmission technique and will be detailed in the next sections. From now on, we assume that $T$ is chosen such that the propagation channel can be assumed static over this interval.
If $N_u$ users are active and $h_i^{(k)}(t)$ denotes the channel impulse response associated with the $i$th signal block transmitted by user $k$, the received signal corresponding to the $i$th block can be written as
$$r_i(t) = \sum_{k=1}^{N_u} s_i^{(k)}(t - \tau^{(k)}) * h_i^{(k)}(t) + n_i(t), \tag{2}$$
where $*$ denotes convolution; $\tau^{(k)}$ is a random variable modelling the delay between transmitter $k$ and the reference transmitter 1, for which $\tau^{(1)}$ is assumed equal to zero; and $n_i(t)$ is a white Gaussian noise process with two-sided power spectral density $N_0/2$.

Channel model
A fairly general expression to describe a block-static UWB channel is the following one [7]:
$$h(t) = \sum_{l=0}^{N-1} a_l\, p_l(t - \tau_l), \tag{3}$$
where $p_l(t)$ is the impulse response of the filter associated with the $l$th propagation path. This makes it possible to take into account the distortions caused by the physical phenomena related to the propagation of the pulses. In our analysis, the set of delays and amplitudes $\{a_l, \tau_l\}_{l=0}^{N-1}$ is generated according to the model proposed by the IEEE 802.15.3a Working Group [5]. This model is based on a modification of the Saleh-Valenzuela model [17] and is able to reproduce the clustering phenomena observed in several UWB channel measurements. In particular, in [5], it is underlined that the number of paths with nonvanishing energy can exceed 50.
The exact characterization of the propagation distortions is a complex task; it is analyzed, for example, in [7], in which the shape of the received pulse is obtained via numerical integration for different propagation conditions. Following the final remarks of [7], we adopt in this paper a rather simplified model, according to which the path impulse response $p_l(t)$ is approximated by an ideal lowpass filter with bandwidth $B_{w,l}$. Figure 1 illustrates the effect of the lowpass filtering operation on the transmitted pulse $x(t)$, modelled as the second derivative of a Gaussian pulse with time duration equal to 0.7 nanoseconds. It is worth noting that a reduction of the filter bandwidth causes an enlargement of the time duration of the received pulse. This translates into an increase of the interpulse interference and a reduction of the correlation between the transmitted and received waveforms.
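The effect of the lowpass path model on the pulse can be illustrated numerically. The following is a minimal sketch, not the simulation code of the paper: the pulse shape factor `tau`, the sampling rate `fs`, and the FFT-based (circular) realization of the ideal lowpass filter are all assumptions made for illustration.

```python
import numpy as np

def gaussian_second_derivative(t, tau=0.2877e-9):
    """Second derivative of a Gaussian pulse (unnormalized); tau is an assumed shape factor."""
    u = (t / tau) ** 2
    return (1.0 - 4.0 * np.pi * u) * np.exp(-2.0 * np.pi * u)

def ideal_lowpass(x, fs, bandwidth):
    """Zero every FFT bin above `bandwidth` (ideal lowpass, circular convolution)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[freqs > bandwidth] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def normalized_correlation(a, b):
    """Correlation coefficient between two waveforms at zero lag."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

fs = 50e9                                   # assumed sampling rate: 50 GHz
t = (np.arange(1024) - 512) / fs            # 20.48 ns window centred on the pulse
x = gaussian_second_derivative(t)

# Correlation between the transmitted pulse and its lowpass-filtered version
correlations = {bw: normalized_correlation(x, ideal_lowpass(x, fs, bw))
                for bw in (5e9, 3.5e9, 2.5e9)}
for bw, rho in correlations.items():
    print(f"B_w = {bw / 1e9:.1f} GHz -> correlation with x(t): {rho:.3f}")
```

Under these assumptions, the correlation with the transmitted pulse drops as the filter bandwidth shrinks, which is the loss a template built from the undistorted pulse would suffer.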

Correlation receivers
A general symbol-by-symbol correlation receiver structure for UWB is shown in Figure 2. The received signal is multiplied by a locally available template waveform $v_n(t)$, $0 < t < T_b$, where $T_b$ is the bit time, with $T_b < T$, and $n$ is the symbol index. The template signal is generated according to the information acquired by the bit and multiple-access code synchronization algorithms and, possibly, channel estimation. The result of the multiplication is finally passed through an integrator with integration time $T_w$ and through a decision block.
Without loss of generality, we focus on the first transmitted block and, for notational simplicity, drop the block index $i$. Assuming that user 1 is the user of interest, the correlator output $\hat{\alpha}_n^{(1)}$ corresponding to the $n$th transmitted symbol inside the block is obtained by integrating the product of the received signal and the template $v_n(t)$ over the integration window $T_w$.

In order to compare the complexity of the correlation structures analyzed in the next sections, we characterize them according to four complexity parameters:
(i) the length of the buffer used at the receiver to store the information necessary for the computation;
(ii) the operations needed to construct and update the template signal;
(iii) the number of correlation operations required to demodulate the data block;
(iv) the decision rule.
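The symbol-by-symbol correlation structure of Figure 2 can be sketched in discrete time as follows. This is only an illustration: the window sizes, the stand-in waveform, and the noise level are hypothetical, and perfect synchronization is assumed.

```python
import numpy as np

def correlator_output(r, template, n, samples_per_bit, window_samples):
    """Soft output for symbol n: sum of r(t) * v_n(t) over the integration window."""
    start = n * samples_per_bit
    segment = r[start:start + window_samples]
    return float(np.dot(segment, template[:window_samples]))

def decide(soft):
    """Zero-threshold decision for 2PAM symbols in {-1, +1}."""
    return 1 if soft >= 0 else -1

rng = np.random.default_rng(0)
samples_per_bit, window_samples = 64, 48      # hypothetical T_b and T_w, in samples
pulse = np.hanning(window_samples)            # stand-in for the received waveform
symbols = np.array([1, -1, -1, 1, 1])

# Build the received signal: one modulated waveform per bit interval plus noise
r = np.zeros(len(symbols) * samples_per_bit)
for n, a in enumerate(symbols):
    r[n * samples_per_bit:n * samples_per_bit + window_samples] += a * pulse
r += 0.05 * rng.standard_normal(r.size)       # additive white Gaussian noise

decisions = [decide(correlator_output(r, pulse, n, samples_per_bit, window_samples))
             for n in range(len(symbols))]
print(decisions)
```

The receivers discussed in the following sections differ mainly in how the template passed to the correlator is obtained.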

Pseudocoherent RAKE receivers
In case of perfect channel estimation and absence of intersymbol and multiuser interference, it is well known [18] that a RAKE receiver is the optimal detection scheme in the sense that it minimizes the probability of error.
However, if the knowledge of the channel is not ideal but, rather, is acquired through a suitable estimation algorithm, this structure reduces to a heuristic approximation of the optimal detection scheme. Adopting the same terminology as in [19], we will refer to this class of receivers as pseudocoherent RAKE receivers. With a different notation, this particular structure was analyzed also in [4,10].
We assume that the transmitted signal $s_i^{(k)}(t)$ consists of a train of $N_d$ amplitude-modulated waveforms, as described by (5), where $E_x$ is the energy per pulse, equal in this case to the energy per bit $E_b$. The block repetition time $T$ spans the $N_d$ waveforms of the block, and the position of each waveform is set by the time-hopping (TH) code employed by user $k$ for multiple-access purposes. The codes considered in this paper are the sequences based on quadratic congruence described in [20].

We assume that a suitable channel estimation algorithm (see, e.g., [11]) provides the template generator with reliable information about the delays and amplitudes of the $N_p$ strongest paths. However, for complexity reasons, it does not attempt to estimate the received waveform shape and assumes no distortion during propagation. This estimation algorithm may require the transmission of pilot symbols, which would modify the structure of the transmitted signal described by (5). However, we do not take this training sequence into account, in order to keep the complexity of our model low.
If a maximal ratio combining (MRC) technique is employed, the template signal $v_{\mathrm{MRC}}(t)$ for user 1 is built from the estimated delays and amplitudes of the paths whose indices belong to $B_{N_p} = \{l_0, l_1, \ldots, l_{N_p-1}\}$, the ordered set of the indices of the $N_p$ strongest paths. A soft estimate of the $n$th transmitted symbol is then obtained by correlating the received signal with this template. The integration time $T_w$ must be chosen long enough to span the selected paths.

If, instead, the information passed by the channel estimator is only partially used by the template generator, simpler combining techniques can be employed. It is possible, for example, to adopt EGC, which exploits only the delays and the signs of the amplitudes of the strongest received paths.
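The difference between the two combining rules can be made concrete with a small sketch. It assumes that the template is a sum of undistorted pulses placed at the estimated delays, weighted by the path amplitudes (MRC) or by their signs only (EGC); the pulse, delays, and amplitudes below are hypothetical toy values, and the function name is introduced here for illustration.

```python
import numpy as np

def rake_template(pulse, delays, amplitudes, n_fingers, length, mode="mrc"):
    """Sum of undistorted pulses at the delays of the n_fingers strongest paths."""
    strongest = np.argsort(-np.abs(amplitudes))[:n_fingers]
    template = np.zeros(length)
    for l in strongest:
        # MRC uses the estimated amplitude, EGC only its sign
        weight = amplitudes[l] if mode == "mrc" else np.sign(amplitudes[l])
        template[delays[l]:delays[l] + len(pulse)] += weight * pulse
    return template

pulse = np.array([1.0, -2.0, 1.0])                 # toy pulse, not the paper's x(t)
delays = np.array([0, 5, 11, 20])                  # hypothetical path delays (samples)
amplitudes = np.array([0.9, -0.6, 0.3, -0.1])      # hypothetical path amplitudes
length = 32

# Noiseless, undistorted channel output for a single +1 symbol
received = np.zeros(length)
for d, a in zip(delays, amplitudes):
    received[d:d + len(pulse)] += a * pulse

v_mrc = rake_template(pulse, delays, amplitudes, 2, length, "mrc")
v_egc = rake_template(pulse, delays, amplitudes, 2, length, "egc")
print(np.dot(received, v_mrc), np.dot(received, v_egc))
```

Because the template assumes undistorted pulses, any path-dependent distortion in the real channel reduces the energy each finger collects.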

Transmitted-reference receiver
A different approach is based on the transmitted-reference principle. This concept avoids the channel estimation step, as a previously received signal is employed as the template waveform. This approach is clearly suboptimal compared to coherent reception; however, it possesses an inherent architectural simplicity that makes it an ideal candidate for low-cost implementations [21]. With this strategy, the transmitted signal within a block consists of $N$ transmitted waveforms, grouped into a preamble of $N_r$ reference signals followed by $N_d$ data signals, so that $N = N_r + N_d$. The signal transmitted by user $k$ is expressed using the same notation as in [13], where this time $E_b = N_d E_x/(N_r + N_d)$. In order to limit the effect of the noise in the demodulation process, a bandpass filter $z(t)$, of bandwidth $B_f$, is employed at the receiver. Since the template waveform is a noisy signal, the bandwidth of the filter must be chosen so as to trade signal energy against noise reduction.
Denoting by $\tilde{r}(t)$ the signal at the output of the bandpass filter, $\tilde{r}(t) = r(t) * z(t)$, the template signal is calculated as an average of the reference pulses.
The description of the demodulation process can be simplified if an equivalent discrete model of the received signal is considered. We denote by $\tilde{\mathbf{r}}_n$ the vector containing the $N_w$ samples of the filtered received signal associated with the $n$th transmitted pulse inside the integration window $T_w$, where $T_w = N_w T_s$ and $T_s$ is the inverse of the Nyquist sampling rate. The template signal can then be expressed as the average of the reference vectors, $\mathbf{v} = \frac{1}{N_r}\sum_{n=0}^{N_r-1}\tilde{\mathbf{r}}_n$, and the soft output of the demodulator for each data pulse is the inner product between $\mathbf{v}$ and the corresponding vector $\tilde{\mathbf{r}}_n$, $N_r \le n < N$.

From an implementation complexity point of view, this receiver requires a buffer capable of containing at least $N_w N_r$ samples. The template waveform is calculated by averaging the data contained in the buffer and is updated on a block basis. The number of correlation operations for each block is equal to $N_d$. Finally, a symbol-by-symbol threshold comparison is adopted as the decision strategy.
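The discrete TR receiver just described can be sketched in a few lines. The block sizes, the random stand-in waveform, and the noise level are hypothetical; bandpass filtering and synchronization are assumed to have been carried out already.

```python
import numpy as np

def tr_demodulate(r, n_ref):
    """r: (N, N_w) array of filtered samples, one row per received pulse.
    The first n_ref rows are the unmodulated reference pulses."""
    template = r[:n_ref].mean(axis=0)          # average of the reference pulses
    soft = r[n_ref:] @ template                # one correlation per data symbol
    return np.where(soft >= 0, 1, -1), soft

rng = np.random.default_rng(1)
n_ref, n_data, n_w = 2, 8, 40                  # hypothetical N_r, N_d, N_w
waveform = rng.standard_normal(n_w)            # stand-in for the received pulse shape
symbols = rng.choice([-1, 1], size=n_data)

# Block of N = N_r + N_d pulses: preamble first, then the modulated data pulses
clean = np.vstack([np.tile(waveform, (n_ref, 1)), symbols[:, None] * waveform])
r = clean + 0.1 * rng.standard_normal(clean.shape)

decisions, _ = tr_demodulate(r, n_ref)
print("errors:", int(np.sum(decisions != symbols)))
```

The sketch uses exactly $N_d$ correlations per block and stores only the reference rows to form the template, in line with the complexity figures given above.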

Differential receiver
In the DR, the template waveform employed in the demodulation process consists of a delayed replica of the previously received signal. The correlation operation reveals the amplitude variations from one pulse to the next, which carry the transmitted information. The transmitted pulses are modulated by differentially encoded symbols $\beta_j^{(k)}$, where $\beta_0^{(k)}$ is an arbitrary phase and each subsequent $\beta_j^{(k)}$ is obtained from the previous one and the current information symbol, so that the output of the integrator for a given pulse is the correlation between the corresponding received signal and the preceding one.

It is reasonable to expect that the performance of this receiver, in the presence of random hopping codes, will be similar to that of the TR with $N_r = 1$. The architecture of this receiver consists of an $N_w$-sample buffer, fed by a delay block and updated on a symbol basis. The number of correlations and the decision strategy are the same as in the TR case.
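A minimal sketch of differential encoding and detection in the discrete model follows. The encoding rule (each transmitted amplitude equals the previous one times the current data symbol) is the standard binary differential rule and is assumed here; the waveform and noise level are hypothetical.

```python
import numpy as np

def differential_encode(alpha, beta0=1):
    """Each transmitted amplitude is the previous one times the current data symbol."""
    return np.concatenate(([beta0], beta0 * np.cumprod(alpha)))

def dr_demodulate(r):
    """Correlate every received pulse (row of r) with the pulse received just before it."""
    soft = np.einsum("ij,ij->i", r[1:], r[:-1])
    return np.where(soft >= 0, 1, -1)

rng = np.random.default_rng(2)
alpha = rng.choice([-1, 1], size=12)             # information symbols
beta = differential_encode(alpha)                # 13 transmitted amplitudes
waveform = rng.standard_normal(40)               # stand-in for the received pulse shape
r = beta[:, None] * waveform + 0.1 * rng.standard_normal((len(beta), 40))

decisions = dr_demodulate(r)
print("errors:", int(np.sum(decisions != alpha)))
```

Only the previous row needs to be kept in memory, which corresponds to the $N_w$-sample buffer mentioned above.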

Iterative transmitted-reference receivers
We focus on the single-user case and assume that the system parameters are set so as to avoid intersymbol interference. As described in [13], a strategy to improve the performance of TR receivers is based on an ML estimation of the template signal, given the observed block of $N$ vectors $\mathbf{r}_n$. We will refer to this receiver as ITR-ML. Calculating this estimator is equivalent to solving
$$\mathbf{v} = \arg\max_{\mathbf{z}\in\mathbb{R}^{N_w}} P_{R_0,\ldots,R_{N-1}}(\mathbf{r}_0,\ldots,\mathbf{r}_{N-1}\mid\mathbf{z}). \tag{15}$$
This expression can be solved iteratively, leading to the recursive equation (16) for the template estimate $\mathbf{v}^{(m)}$ [13], where $m$ is the iteration counter. The interpretation of (16) is straightforward: the template signal is obtained as an average of the received signals, where the modulated ones are weighted according to their reliability.

In [13], another estimation strategy, based on the generalized likelihood ratio test (GLRT), is presented. We will refer to this receiver as ITR-GLRT. According to this strategy, the template waveform for the $n$th transmitted symbol is given by
$$\mathbf{v}_n^{(\pm 1)} = \arg\max_{\mathbf{z}\in\mathbb{R}^{N_w}} P_{R_0,\ldots,R_{N-1}}(\mathbf{r}_0,\ldots,\mathbf{r}_{N-1}\mid\mathbf{z}, b_n = \pm 1). \tag{19}$$
As in the previous case, the equation can be solved iteratively, starting from an initial estimate $\mathbf{v}^{(0)}$ and producing a refined estimate $\mathbf{v}^{(m)}$ at each iteration.

Evidently, the architectural complexity of these receivers is rather high. In particular, they both require a buffer of length $N_w N$ samples to store the whole received sequence for each block. In addition, the ITR-ML requires an extra buffer of $N_w$ samples to store the template waveform. In contrast, the extra buffer length required by the ITR-GLRT is equal to $N_d N_w$, as the receiver constructs a template waveform for each of the received symbols. The template waveform is obtained by combining, during each iteration, the information contained in the buffer with the correlator outputs of the previous step. Denoting by $N_i$ the number of iterations, the ITR-ML requires $N_d N_i$ correlation operations per block, while the ITR-GLRT requires $N_d(N_d - 1)(N_i - 1) + N_d$. This increase is again due to the fact that a different template waveform is associated with each of the received symbols. Finally, both receivers adopt a symbol-by-symbol threshold comparison decision rule.
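The idea of weighting the modulated pulses by their reliability can be illustrated with a soft decision-directed refinement of the TR template. The sketch below is not the recursion of [13]: the `tanh` weighting, the normalization, and all parameter values are assumptions chosen to show the mechanism.

```python
import numpy as np

def iterative_template(r, n_ref, noise_var, n_iter=5):
    """Refine the TR template by adding data pulses weighted by tanh reliabilities."""
    refs, data = r[:n_ref], r[n_ref:]
    v = refs.mean(axis=0)                                 # plain TR template as start
    for _ in range(n_iter):
        weights = np.tanh(data @ v / noise_var)           # soft symbol estimates in (-1, 1)
        v = (refs.sum(axis=0) + weights @ data) / len(r)  # reliability-weighted average
    return v

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(3)
n_ref, n_data, n_w, noise_std = 1, 19, 50, 0.7            # hypothetical N_r, N_d, N_w
waveform = rng.standard_normal(n_w)                       # stand-in received pulse shape
symbols = rng.choice([-1, 1], size=n_data)
clean = np.vstack([np.tile(waveform, (n_ref, 1)), symbols[:, None] * waveform])
r = clean + noise_std * rng.standard_normal(clean.shape)

initial = r[:n_ref].mean(axis=0)                          # template of the plain TR receiver
refined = iterative_template(r, n_ref, noise_std ** 2)
print(cosine(initial, waveform), cosine(refined, waveform))
```

Each iteration re-correlates every stored data pulse with the current template, which is why the whole block must be buffered and the number of correlations grows with the number of iterations.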

Multisymbol receivers
Up to now, we analyzed receivers based on a symbol-by-symbol decision strategy. However, as noted by Divsalar and Simon in their milestone work [16], this strategy is not optimal if the random parameter that prevents us from using coherent detection (in [16], the channel phase rotation, in our case, the entire channel impulse response) is constant over an interval in which more than two symbols are transmitted. The basic idea in [16] is to exploit this time invariance, making a joint decision on several symbols, simultaneously, through an ML sequence estimator.
Our approach is similar to the one adopted in [16]; however, two main differences must be underlined. In our case, the modulation technique is not limited to be differential. Furthermore, as already mentioned, the random parameter, under which the receiver minimizes the probability of erroneously detecting the entire information sequence, is not the channel phase (whose distribution can be assumed uniform) but the entire channel impulse response. Therefore, the strategy developed in [16], based on the existence of a least favorable a priori distribution for the unknown parameter [22], cannot be applied directly to our case. We tackle this inconvenience by considering the problem of jointly detecting the information sequence and the channel impulse response. Before analyzing the receiver structure in detail, it is worth noting that, since the decision strategy is not a symbol-by-symbol one in this case, the model depicted in Figure 2 does not apply. However, we will continue using the same notation as in the previous sections.
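We focus on the single-user case. Each received vector $\tilde{\mathbf{r}}_i$ in (10) can be decomposed into a useful signal part $\mathbf{s}$ and a Gaussian noise part $\mathbf{n}_i$. In particular, according to (9), the following expression holds for $0 \le i < N$:
$$\tilde{\mathbf{r}}_i = \begin{cases} \mathbf{s} + \mathbf{n}_i, & 0 \le i < N_r,\\ \alpha_{i-N_r}\,\mathbf{s} + \mathbf{n}_i, & N_r \le i < N. \end{cases} \tag{22}$$
For simplicity of notation, we denote by $\mathbf{r}$ the concatenation of the vectors $\tilde{\mathbf{r}}_i$, that is, $\mathbf{r} = [\tilde{\mathbf{r}}_0, \tilde{\mathbf{r}}_1, \ldots, \tilde{\mathbf{r}}_{N-1}]$, and by $\mathbf{a}$ the vector containing the $N_d$ 2PAM symbols $\{\alpha_i\}_{i=0}^{N_d-1}$ transmitted inside the block.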
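Starting from the a posteriori probability of $\mathbf{r}$, given $\mathbf{s}$ and $\mathbf{a}$, and using standard techniques [22], the log-likelihood function (LLF) of the couple $(\mathbf{a}, \mathbf{s})$ can be derived (23). The maximization of $\mathrm{LLF}(\mathbf{a}, \mathbf{s})$ can be carried out in two steps [11]. First, $\mathbf{s}$ is varied while $\mathbf{a}$ is kept constant. A maximum is then found for $\hat{\mathbf{s}} = \hat{\mathbf{s}}(\mathbf{a})$. Finally, in the second step, the function $\mathrm{LLF}(\mathbf{a}, \hat{\mathbf{s}}(\mathbf{a}))$ is maximized with respect to $\mathbf{a}$. The first step yields the estimate $\hat{\mathbf{s}}(\mathbf{a})$ in (24). Substituting (24) in (23) gives the expression of $\mathrm{LLF}(\mathbf{a}, \hat{\mathbf{s}}(\mathbf{a}))$ in (25). Consequently, the decision rule for the transmitted sequence $\mathbf{a}$ becomes the following one: choose $\hat{\mathbf{a}}$ such that $\hat{\mathbf{a}} = \arg\max_{\mathbf{a}} \mathrm{LLF}(\mathbf{a}, \hat{\mathbf{s}}(\mathbf{a}))$.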
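The two-step maximization can be sketched as follows under the white Gaussian noise model. The notation $a_i$ for the amplitude carried by the $i$th pulse ($a_i = 1$ for reference pulses, $a_i = \pm 1$ for data pulses) and the constant $C$, which collects the terms independent of $(\mathbf{a}, \mathbf{s})$, are assumptions introduced here; positive scale factors are dropped, so the expressions may differ from (23)-(25) by constants.

```latex
\begin{align*}
\mathrm{LLF}(\mathbf{a},\mathbf{s})
  &= C - \sum_{i=0}^{N-1}\bigl\lVert \tilde{\mathbf{r}}_i - a_i\,\mathbf{s}\bigr\rVert^2
   = C' + 2\,\mathbf{s}^{T}\sum_{i=0}^{N-1} a_i\,\tilde{\mathbf{r}}_i - N\lVert\mathbf{s}\rVert^2,\\
\hat{\mathbf{s}}(\mathbf{a})
  &= \frac{1}{N}\sum_{i=0}^{N-1} a_i\,\tilde{\mathbf{r}}_i,\\
\mathrm{LLF}\bigl(\mathbf{a},\hat{\mathbf{s}}(\mathbf{a})\bigr)
  &= C' + \frac{1}{N}\Bigl\lVert \sum_{i=0}^{N-1} a_i\,\tilde{\mathbf{r}}_i\Bigr\rVert^2 .
\end{align*}
```

The second line follows from setting the gradient with respect to $\mathbf{s}$ to zero, using $a_i^2 = 1$. In this form, the joint decision amounts to choosing the sign pattern that makes the received pulses add up most coherently.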
We will term the receiver adopting the above-mentioned decision rule the multisymbol receiver (MSR). The computation of the LLF of the pair $(\mathbf{a}, \mathbf{s})$ can be easily extended to the case of differential modulation. We will refer to this receiver as the multisymbol differential receiver (MSDR). Using the same notation as in Section 3.3, one can show that, in this case, the estimate $\hat{\mathbf{a}}$ of the binary input symbols $\mathbf{a}$ is given by (27).

As far as complexity is concerned, we focus for simplicity on the differential case. As in the previous section, the buffer length is equal to $N_w N$ samples, the number of samples necessary to store all the received signals for each block. In contrast, this time the receiver does not require a template construction or update. The number of correlation operations for each block can be derived from (27) and is equal to $N^2$. Finally, the joint decision algorithm consists of finding the maximum of the function $\mathrm{LLF}(\mathbf{a}, \hat{\mathbf{s}}(\mathbf{a})): \{\pm 1\}^{N_d} \to \mathbb{R}$. The complexity of this operation is exponential in the block length $N$ [23].
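The exponential search can be illustrated with a brute-force sketch for the differential case. It assumes that, under the Gaussian model, the joint decision is equivalent to picking the amplitude sequence that maximizes the energy of the coherently combined pulses; this metric and the function name are assumptions made for illustration and are not taken from (27).

```python
import itertools
import numpy as np

def msdr_detect(r):
    """Exhaustive multisymbol differential detection over one block.
    r: (N, N_w) array; returns the N-1 data symbols whose amplitude
    sequence beta maximizes ||sum_i beta_i r_i||^2."""
    n = len(r)
    gram = r @ r.T                                   # all N^2 pairwise correlations
    best_metric, best_beta = -np.inf, None
    for tail in itertools.product((-1, 1), repeat=n - 1):
        beta = np.array((1,) + tail)                 # beta_0 fixed: a global sign is irrelevant
        metric = beta @ gram @ beta
        if metric > best_metric:
            best_metric, best_beta = metric, beta
    return best_beta[1:] * best_beta[:-1]            # data symbol = product of adjacent amplitudes

rng = np.random.default_rng(5)
alpha = rng.choice([-1, 1], size=7)                  # information symbols
beta = np.concatenate(([1], np.cumprod(alpha)))      # differentially encoded amplitudes
waveform = rng.standard_normal(30)                   # stand-in received pulse shape
r = beta[:, None] * waveform + 0.2 * rng.standard_normal((len(beta), 30))

detected = msdr_detect(r)
print("errors:", int(np.sum(detected != alpha)))
```

The Gram matrix holds the $N^2$ correlations, while the loop visits $2^{N-1}$ candidate sequences, which is what makes the approach impractical for long blocks.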

Decision-feedback differential receiver
The main increase of complexity associated with the MSR is due to the maximization of the LLF. A well-known strategy to overcome this problem is based on the DF technique [24]. This concept consists of a symbol-by-symbol decision strategy, obtained by feeding back the decisions taken on a certain number of previous symbols.
We start with (27) and assume that an estimate of the first $N_d - 1$ symbols of the block is available at the receiver. We denote these estimated values by $\{\hat{\alpha}_i\}_{i=1}^{N_d-1}$. Substituting them into (27) yields the decision rule (28) for the $N_d$th symbol $\alpha_{N_d}$. Evidently, this is equivalent to employing the same zero-threshold decision rule adopted by all the other symbol-by-symbol techniques analyzed so far. Generalizing (28), the soft outputs $\hat{\alpha}_n$ for the $n$th symbol in the block can be calculated as in (29).

It is interesting to note that, owing to the block structure of the transmitted signal, the demodulator operation in (29) requires the knowledge of the decisions taken over a progressively increasing number of previously received symbols. A slightly different approach, similar to the one employed in [19], consists of keeping the dimension of the feedback vector constant for each symbol. To do so, the transmitted signal format must be modified such that the block structure is removed, with $E_b \approx E_x$ for an infinitely long transmitted sequence. In this case, after an initial transient period, it is possible to employ a demodulation rule with a constant DF window of $N - 1$ symbols. In particular, the decision-feedback differential receiver (DF-DR) turns out to be equivalent to a correlation receiver whose template waveform (31) combines the $N$ most recently received signals, each realigned in sign by the fed-back decisions.

The complexity of the template update operation in (31) can be reduced by noting that it admits a recursive solution. In other words, it is possible to calculate the template signal for symbol $n+1$ starting from the template signal for symbol $n$. In formulae, it is possible to show that
$$\mathbf{v}_{n+1} = \hat{\alpha}_n\Bigl(\mathbf{v}_n - \mathbf{r}_{n-N}\prod_{j=n-N+1}^{n-1}\hat{\alpha}_j\Bigr) + \mathbf{r}_n, \tag{32}$$
where $\hat{\alpha}_n$ is the estimated value of the symbol $\alpha_n$. As in [19], the absence of a block structure has the convenient side effect of allowing a time-varying channel model.
As far as complexity is concerned, this receiver has the attractive feature of preserving the same number of correlation operations and the same decision rule as the DR. However, it requires a buffer capable of containing all the samples belonging to the feedback window ($N_w N$ samples), plus the memory necessary for storing the template signal and the decisions taken on the previously received $N - 1$ symbols. The template update operation must be performed symbol by symbol, employing the low-complexity operation described in (32). We will refer to this as DF-DR.
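The recursive template update can be sketched as follows. The sketch assumes a sliding window of `window` received pulses, each realigned in sign by the fed-back decisions, and starts from the plain differential template during the initial transient; names and parameter values are hypothetical.

```python
import numpy as np

def dfdr_demodulate(r, window):
    """Decision-feedback differential detection with a sliding window of `window` pulses."""
    decisions = []
    v = r[0].copy()                                   # template for the first symbol: plain DR
    for n in range(1, len(r)):
        a_hat = 1 if r[n] @ v >= 0 else -1            # zero-threshold decision
        decisions.append(a_hat)
        v = a_hat * v + r[n]                          # realign old template, add newest pulse
        if n >= window:                               # drop the pulse leaving the window
            oldest = n - window
            v -= r[oldest] * np.prod(decisions[oldest:])
    return np.array(decisions)

rng = np.random.default_rng(4)
alpha = rng.choice([-1, 1], size=30)                  # information symbols
beta = np.concatenate(([1], np.cumprod(alpha)))       # differentially encoded amplitudes
waveform = rng.standard_normal(40)                    # stand-in received pulse shape
r = beta[:, None] * waveform + 0.1 * rng.standard_normal((len(beta), 40))

decisions = dfdr_demodulate(r, window=10)
print("errors:", int(np.sum(decisions != alpha)))
```

Each symbol costs one correlation, as in the DR, plus a constant number of vector additions for the update, while the template averages the noise over up to `window` pulses.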

SIMULATION RESULTS
In order to compare the receiver structures described in the previous sections also in terms of performance (BER), we simulated a UWB system operating in an indoor environment. The system parameters are chosen so as to obtain a pulse rate of around 36 Mpulse/s per user. In particular, we set $T_c = 4$ nanoseconds and $N_h = 7$. With this parameter choice, the effect of intersymbol interference should be reduced on average if the TX-RX pair is linked by a channel defined in [5] as type 1 or 2. These channels are in fact characterized by an average delay spread of 5 and 10 nanoseconds, respectively. For simplicity, we do not consider here the effect of channel coding.
For the block noncoherent schemes, we considered two settings. In the first one, we fix the block length $N$ equal to 10. This corresponds to assuming that the multipath channel is static over an interval of 0.28 microseconds. All the noncoherent block schemes are characterized by a bit rate of around 32 Mbps per user, which corresponds to $N_r = 1$. The bit rate is instead equal to 36 Mbps for the coherent schemes and the DF-DR. In this setting, the small value of $N$ allows us to evaluate by simulation also the performance of the MSDR. In the second setting, we enlarged the value of $N$ to 20, keeping $N_r = 1$; an improvement in the performance of the iterative and adaptive noncoherent schemes is therefore expected. However, we were not able to simulate the performance of the MSDR because of its high computational complexity.
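The rates quoted above can be checked with a few lines of arithmetic. The sketch assumes that one pulse is transmitted every $N_h$ chips, so that the pulse repetition period is $N_h T_c$; the variable names are introduced here.

```python
T_c = 4e-9            # chip time (s), from the text
N_h = 7               # number of chips per pulse period, from the text
N_r = 1               # reference pulses per block, from the text
T_p = N_h * T_c       # assumed pulse repetition period: one pulse every N_h chips

pulse_rate = 1.0 / T_p
print(f"pulse rate: {pulse_rate / 1e6:.1f} Mpulse/s")

results = {}
for N in (10, 20):
    block_duration = N * T_p                  # interval over which the channel must be static
    bit_rate = (N - N_r) / block_duration     # N - N_r data pulses per block
    results[N] = (block_duration, bit_rate)
    print(f"N = {N}: block = {block_duration * 1e6:.2f} us, "
          f"bit rate = {bit_rate / 1e6:.1f} Mbps")
```

Under this assumption, the pulse rate and the $N = 10$ figures agree with the values stated in the text (about 36 Mpulse/s, 0.28 microseconds, and about 32 Mbps).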
As a multipath channel, we considered the model described in Section 2.2 to take into account the effect of pulse distortions. In particular, the filter $p_l(t)$ in (3) is randomly chosen among ideal lowpass filters with bandwidths of 5, 3.5, and 2.5 GHz. The transmitted waveform $x(t)$ is the second derivative of a Gaussian pulse, with time duration equal to 0.7 nanoseconds.
As far as coherent RAKE receivers are concerned, we compare MRC and EGC, assuming that the strongest 10 and 5 paths are perfectly estimated. For the noncoherent receivers, the window duration $T_w$ is set to $N_h T_c = 28$ nanoseconds and a −10 dB bandwidth bandpass filter is considered. This choice was found to guarantee a good compromise between the amount of captured energy and noise reduction.
The delays $\tau^{(k)}$ are modelled as uniformly distributed random variables over $[0, T]$, and a different realization of the IEEE 802.15.3a channel model, modified as described in Section 2.2, is assigned to each user. The BER curves presented in this section are obtained by averaging over the results relative to 20 different indoor scenarios. In the single-user case, the hopping code has not been simulated.
As far as noncoherent receivers are concerned, it is possible to derive an asymptotic BER curve, against which the performance of iterative or DF structures should be compared. In fact, a successful template estimation should lead to a reconstructed waveform equal to the convolution between the transmitted signal and the channel impulse response of the desired user. This limiting template, given by (33), is reached when the block length $N$ and the number of iterations $m$ tend to infinity (the latter obviously only for iterative receivers). More precisely, the limiting template in (33) can be seen as a filter matched to the convolution of the transmitted signal and the channel impulse response, truncated over $T_w$. We will refer to the receiver employing it as template waveform as an ideal RAKE receiver (ARAKE, following the notation of [4], where "A" stands for "all"). In practical cases, the convergence process is limited by the effect of the intersymbol and multiuser interference. For comparison purposes, we also derived the performance obtained by employing a filter (MF) matched to the transmitted signal only.

In Figure 3, the BER curves for the single-user case are plotted for pseudocoherent receivers. The MF shows rather poor performance due to the low amount of energy the receiver is able to collect. As already noted in [9], the channel statistics allow EGC to achieve nearly the same performance as MRC. Compared to the performance obtained by employing an ideal receiver, able to perfectly reconstruct the received waveform in the observation window $T_w$, the suboptimal RAKE receivers present a loss, at a BER of $10^{-4}$, equal to 5.5 and 8.5 dB for 10 and 5 fingers, respectively.
In Figures 4 and 5, we plot the BER curves for noncoherent receivers for $N = 10$ and $N = 20$, respectively. In both cases, the TR and DR receivers show a loss of approximately 7.5 dB. Therefore, in our setting, these low-complexity receivers are already able to outperform a pseudocoherent RAKE receiver equipped with 5 fingers.
The ITR-ML technique shows a limited improvement in performance for $N = 10$, where its loss from the ARAKE curve at a BER of $10^{-4}$ is equal to 6.5 dB, while the loss reduces to 4.5 dB when the block length is enlarged to 20. The results were obtained by stopping the iterative process after 5 iterations (this value is kept constant for all the iterative structures analyzed from now on). Moving to the higher-complexity ITR-GLRT technique, the loss reduces to 4 dB and 2.5 dB for $N = 10$ and $N = 20$, respectively.
As far as the DF-DR is concerned, its performance is better than that of all the noncoherent receivers analyzed so far, its loss from the ARAKE curve being equal to 3 and 2 dB, respectively. This is an interesting result, as the complexity of this receiver is definitely lower than that of the iterative ones. Finally, in Figure 4, we plot the performance of the MSDR, which performs close to the ARAKE receiver and shows a good convergence to this curve for high $E_b/N_0$, at the cost of a very high computational complexity.
It is interesting to point out that for N = 20, the DF-DR is able to perform as well as the pseudocoherent receivers with 10 fingers also in the low signal-to-noise ratio region (corresponding to BER of around 10 −2 ). This is an important result, as it suggests that this receiver is able to guarantee performance similar to a pseudocoherent receiver when adequate channel coding techniques are employed. Therefore, it represents a promising candidate for UWB communications, having good performance and limited complexity.
In Figures 6 and 7, the results for the multiuser case are illustrated, for both pseudocoherent and noncoherent schemes, for a block length of 20. All curves show a BER floor caused by the multiuser interference. TR, ITR-ML, and DR are characterized by very poor performance, similar to that of the MF. In contrast, both ITR-GLRT and DF-DR are capable of achieving a BER similar to the one obtained by employing a RAKE receiver equipped with 10 fingers. In this case, the convergence to the ideal ARAKE curve is less evident because of the effects of the stronger interference.

CONCLUSIONS
In this paper, a comparison between coherent and noncoherent receivers for UWB systems has been presented under a realistic propagation environment that also takes into account the effect of path-dependent pulse distortion. Two classical noncoherent schemes were considered: a differential detector and a transmitted-reference receiver. In addition, the iterative noncoherent schemes recently proposed in [13] were analyzed and their complexity was discussed. Furthermore, the multisymbol principle was theoretically extended to the UWB case and a decision-feedback differential receiver was considered as a low-complexity suboptimal implementation of the multisymbol technique.
Through simulation, we assessed the performance of all the analyzed techniques in terms of BER, both in the single-user and multiple-access environments. The results show that the DF-DR is able to outperform coherent RAKE receivers equipped with up to 10 fingers, obtaining performance similar to the MSDR while maintaining a low complexity. Therefore, it can be considered a promising candidate for low-cost UWB communications.
Future work will be oriented to the design of efficient synchronization algorithms for the noncoherent receivers, and to the study of the effect of channel coding at low signal-to-noise ratios.