Impact of the Gaussian Approximation on the Performance of the Probabilistic Data Association MIMO Decoder

The probabilistic data association (PDA) decoder is investigated for use in coded multiple-input multiple-output (MIMO) systems, and its strengths and weaknesses are determined. The conventional PDA decoder relies on two approximations: the received symbols are assumed to be statistically independent, and a Gaussian approximation is applied to the interference and noise term. We provide an analytical formula for the exact probability density function (PDF) of the interference and noise term, which is used to discuss the impact of the Gaussian approximation in the presence of a soft-input soft-output channel decoder. The results resemble those obtained for the well-known PDA multiuser detector in coded CDMA systems, for which similar investigations have been carried out before.


INTRODUCTION AND BACKGROUND
Probabilistic data association (PDA) was originally developed for target tracking by Yaakov Bar-Shalom in the 1970s. Since then, it has been applied in many different areas, including digital communications, where the PDA algorithm serves as a reduced-complexity alternative to the a posteriori probability (APP) decoder/detector/equalizer. Near-optimal results were demonstrated for a PDA-based multiuser detector (MUD) for code division multiple access (CDMA) systems [1,2]. Recently, probabilistic data association has been shown to achieve good results in multiple-input multiple-output (MIMO) systems [3,4]. In [5], a PDA was presented for the turbo equalization of a single-antenna system. It should also be noted that the Gaussian assumption made in the PDA decoder is used in several other MUD schemes, especially in iterative detection and decoding, for example, [6,7,8]. In [9], it was shown that the performance of a coded CDMA system with a PDA decoder degrades if the number of users is not large enough.
In this paper, results for a PDA MIMO decoder in combination with a soft-input soft-output channel decoder are presented, where the two decoders do not form an iterative detection and decoding scheme (see Figure 1). This setup is chosen in order to demonstrate the impact of unreliable soft outputs, which is far less obvious in an iterative decoding scheme. Because the PDA decoder inherently provides estimates of the a posteriori probabilities of the transmitted data symbols, it seems well suited for use in conjunction with a soft-input channel decoder. However, the results presented in the following show that the PDA MIMO decoder does not always work as well as expected. We provide an exact formula for the probability density function (PDF) of the interference and noise term in order to calculate the exact symbol probabilities for the symbol-by-symbol detection done in the PDA. Simulations based on these probabilities show that the Gaussian approximation made in the PDA decoder has a large impact on the quality of the soft outputs provided to the channel decoder, and therefore on the channel decoding itself. It can be concluded that the quality of the Gaussian approximation, and hence of the soft outputs, depends on the number of transmit antennas and on the cardinality of the symbol alphabet. To the best of our knowledge, such an analysis of the PDA MIMO decoder has not been presented before.
The remainder of this paper is organized as follows. We first introduce the system model under investigation in Section 2. In Section 3, a PDA decoder for use in a coded MIMO system is presented, followed by an analysis of the Gaussian approximation and its impact on the decoding process. A confirmation of the analytical results in the form of simulations is given in Section 4. Conclusions are drawn in Section 5.

SYSTEM MODEL
Consider a MIMO system with M transmit and N receive antennas. As in V-BLAST [10], a single data stream is demultiplexed into M parallel data streams and then mapped onto complex modulation symbols. The M symbols are transmitted simultaneously by the corresponding antennas. Before the multiplexing is done, the data stream is encoded by a channel encoder and interleaved by a channel interleaver. Assuming flat fading, the equivalent discrete-time channel model can be written in complex baseband notation as

r = H x + v,  (1)

where baud-rate sampling is assumed. The channel matrix H ∈ C^(N×M) is assumed to be constant during one data block (block-fading assumption) and perfectly known at the receiver. The channel matrix coefficients h_{n,m} represent the gain between transmit antenna m (1 ≤ m ≤ M) and receive antenna n (1 ≤ n ≤ N). The vector x ∈ Q^(M×1) consists of the complex-valued transmitted modulation symbols taken from a symbol alphabet Q with cardinality Q, while the vector r ∈ C^(N×1) contains the received samples. Additive noise is given by v ∈ C^(N×1), whose elements are independent and identically distributed complex white Gaussian noise samples with zero mean and variance σ_v² = E{|v_n|²}. At the receiver, the demultiplexing, or MIMO decoding, operation is performed by the PDA, followed by a deinterleaver and a soft-input channel decoder. An overview of the system is given in Figure 1. Note that no turbo equalization as in [4,5] or feedback from the channel decoder to the PDA as in [11] is used.
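As an illustration, the flat-fading model r = H x + v can be simulated in a few lines. This is a minimal NumPy sketch; the antenna counts, the QPSK alphabet, and the noise level are example choices, not parameters taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

M, N = 2, 2                       # transmit / receive antennas (example values)
# QPSK alphabet, normalized to unit average symbol power
alphabet = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
sigma_v2 = 0.1                    # noise variance (example value)

# Block-fading channel: i.i.d. complex Gaussian coefficients, known at the receiver
H = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)

# One channel use: M simultaneously transmitted symbols, then r = H x + v
x = rng.choice(alphabet, size=M)
v = np.sqrt(sigma_v2 / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
r = H @ x + v
```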

Basic algorithm
The conventional PDA decoder uses two approximations. First, the decoder considers only one transmitted symbol at a time, treating the received symbols as statistically independent. Second, the PDF of the interference and noise is forced to be Gaussian ("Gaussian forcing"). The PDA decoder approximates the a posteriori probabilities Pr(x_m | r) for every element x_m of x. All symbols interfering with x_m and the noise are modeled as a single vector

n_m = Σ_{k=1, k≠m}^{M} h_k x_k + v,  (2)

where h_k denotes the kth column of H, and x_k the kth element of x. The interference and noise term in (2) is assumed to be an N-variate Gaussian distributed random variable with mean

μ_m = Σ_{k≠m} h_k E{x_k}  (3)

and covariance

Φ_m = Σ_{k≠m} h_k h_k^H (E{|x_k|²} − |E{x_k}|²) + σ_v² I.  (4)

If no a priori information is available, the PDA decoder initializes the symbol probabilities with a uniform distribution. Assuming the Gaussian distribution of the noise and interference term, the a posteriori probabilities for the candidate symbols x_m = q, q ∈ Q, can be computed using (3) and (4):

Pr(x_m = q | r) ∝ exp( −(r − h_m q − μ_m)^H Φ_m^{−1} (r − h_m q − μ_m) ).  (5)

For an estimate of the symbol x_m, no information on the symbols x_k, k ≥ m, is available. In order to provide information on these symbols, the PDA decoder may use multiple iterations, each iteration reusing the symbol probabilities obtained in the previous one. As in [1], the mean (3) and the covariance (4) are updated for every symbol probability estimate, incorporating the new information gained from symbol probabilities already computed in the current or previous iterations. Given the PDF in (5), log-likelihood ratios (LLRs) can be computed after the last iteration of the PDA decoder to serve as soft input for the channel decoder:

L(b_{m,i}) = ln( Σ_{q ∈ Q_i^0} Pr(x_m = q | r) / Σ_{q ∈ Q_i^1} Pr(x_m = q | r) ),  (6)

where b_{m,i} denotes the ith bit mapped onto x_m, and Q_i^0 and Q_i^1 are the subsets of Q whose ith bit label equals 0 and 1, respectively.
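The update loop described above can be sketched as follows. This is a minimal illustrative implementation of the Gaussian-forcing PDA using the soft interference statistics from (3) and (4) and the forced-Gaussian posterior (5); the variable names and the immediate-update schedule are our assumptions, not the authors' code:

```python
import numpy as np

def pda_decode(r, H, alphabet, sigma_v2, n_iter=3):
    """Sketch of the conventional PDA MIMO decoder (Gaussian forcing).

    Returns an (M, Q) array of approximate a posteriori symbol
    probabilities Pr(x_m = q | r).  Illustrative only.
    """
    N, M = H.shape
    Qn = len(alphabet)
    P = np.full((M, Qn), 1.0 / Qn)          # uniform initialization

    for _ in range(n_iter):
        for m in range(M):
            # Soft statistics of each symbol under the current probabilities
            mean_x = P @ alphabet                              # E{x_k}
            var_x = P @ (np.abs(alphabet) ** 2) - np.abs(mean_x) ** 2

            # Mean and covariance of the interference-plus-noise term
            others = [k for k in range(M) if k != m]
            mu = sum(H[:, k] * mean_x[k] for k in others)
            Phi = sigma_v2 * np.eye(N, dtype=complex)
            for k in others:
                Phi += var_x[k] * np.outer(H[:, k], H[:, k].conj())

            # Gaussian-forced posterior for each candidate symbol
            Phi_inv = np.linalg.inv(Phi)
            metrics = np.empty(Qn)
            for i, q in enumerate(alphabet):
                d = r - H[:, m] * q - mu
                metrics[i] = -np.real(d.conj() @ Phi_inv @ d)
            p = np.exp(metrics - metrics.max())
            P[m] = p / p.sum()               # updated probabilities reused at once
    return P
```

At high SNR and with a well-conditioned channel, the row-wise maxima of the returned probabilities recover the transmitted symbols.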

Actual PDF of interference and noise term
The actual PDF of the interference and noise term is a sum of Q^(M−1) Gaussian distributions, each caused by one possible constellation of the interfering symbols; it results from the convolution of the discrete symbol probabilities with the PDF of the Gaussian noise vector v. Let X_s be the set of all possible combinations of interfering symbol vectors for a fixed x_m. It can easily be shown that, for equiprobable symbols, the actual PDF of the interference and noise term is

p(n_m) = (1/Q^(M−1)) Σ_{x_s ∈ X_s} (π σ_v²)^(−N) exp( −‖n_m − Σ_{k≠m} h_k x_{s,k}‖² / σ_v² ),  (9)

a summation of Q^(M−1) = |X_s| single Gaussian distributions whose means depend on the channel as well as on the interfering modulation symbols. It is not the PDF used for optimal (APP) detection: being the exact PDF of the interference for one detected symbol, it avoids the Gaussian approximation but still treats the symbols as statistically independent. A derivation for the CDMA case can be found in [12, Chapter 3.1] and was also published in [6].

Figure 2: BER performance of a turbo-coded M × N MIMO system with PDA decoder. As a benchmark, the BER performance for an APP decoder is shown as well.

According to the central limit theorem, the quality of the Gaussian approximation used in the PDA decoder improves as the number of transmit antennas increases. On the other hand, the approximation becomes worse when modulation schemes with more constellation points are used: with an increasing number of constellation points, a soft bit according to (6) is calculated from a larger number of (approximated) probabilities and is therefore more likely to be unreliable. It should also be noted that the approximation is better in the presence of strong noise. As can be seen in (9), the variance of each single Gaussian distribution is larger for a larger σ_v², which makes the sum more Gaussian-like.
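The Gaussian-mixture structure of the exact PDF in (9) can be evaluated directly for small M and Q, since the cost grows as Q^(M−1). A sketch assuming equiprobable symbols (function and variable names are ours):

```python
import itertools
import numpy as np

def exact_interference_pdf(n, H, m, alphabet, sigma_v2):
    """Exact PDF of the interference-plus-noise vector seen by symbol x_m:
    an equally weighted mixture of Q^(M-1) complex Gaussians, cf. Eq. (9).
    Uniform symbol priors are assumed; illustrative sketch only.
    """
    N, M = H.shape
    others = [k for k in range(M) if k != m]
    Qn = len(alphabet)
    norm = 1.0 / (np.pi * sigma_v2) ** N      # complex Gaussian normalizer
    total = 0.0
    # Enumerate every possible combination of the M-1 interfering symbols
    for combo in itertools.product(alphabet, repeat=M - 1):
        mean = sum(H[:, k] * s for k, s in zip(others, combo))
        d = n - mean
        total += norm * np.exp(-np.real(d.conj() @ d) / sigma_v2)
    return total / Qn ** (M - 1)
```

For M = 1 (no interference) the mixture collapses to the pure noise PDF, which gives a quick sanity check.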

Consequences for soft-input channel decoder
Soft-input channel decoders use reliability information on their input in the form of LLRs. The reliability of the LLRs is essential for channel decoding; unreliable soft inputs cause wrong estimates of the information bits. The LLRs delivered by the PDA decoder are calculated from symbol probabilities that are based on the approximated PDF of the interference and noise term. As shown above, the Gaussian approximation, and therefore the soft input of the channel decoder, can be quite poor and thus inhibits the channel decoder from achieving good performance. Similar results were obtained for a coded CDMA system in [9].
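The conversion from (approximated) symbol probabilities to the per-bit LLRs handed to the channel decoder, as in (6), can be sketched as follows; the bit labeling and the LLR sign convention are illustrative assumptions:

```python
import numpy as np

def symbol_probs_to_llrs(p, bit_labels):
    """Turn the Q symbol probabilities of one antenna into per-bit LLRs.

    p          : length-Q array of (approximate) symbol probabilities
    bit_labels : (Q, q) 0/1 array, row i holding the bit pattern of symbol i
    Convention used here: LLR = ln Pr(bit = 0) - ln Pr(bit = 1).
    """
    p = np.asarray(p, dtype=float)
    labels = np.asarray(bit_labels)
    q = labels.shape[1]
    eps = 1e-300                              # guard against log(0)
    llrs = np.empty(q)
    for i in range(q):
        p0 = p[labels[:, i] == 0].sum()       # mass of symbols whose ith bit is 0
        p1 = p[labels[:, i] == 1].sum()
        llrs[i] = np.log(p0 + eps) - np.log(p1 + eps)
    return llrs
```

If the Gaussian approximation distorts the symbol probabilities, the distortion propagates directly into these LLR magnitudes, which is exactly what degrades the channel decoder.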

NUMERICAL RESULTS
In order to illustrate the influence of the Gaussian assumption on the performance of the PDA decoder, an M × N MIMO system in conjunction with a turbo code has been investigated. As a benchmark, the BER performance of an APP decoder has been simulated as well. A block length of 2304 information bits is used. The bit energy to noise ratio is defined as E_b/N_0 = σ_x² / (q R σ_v²), with q being the number of bits per modulation symbol and R denoting the code rate. The average power per symbol constellation point is denoted by σ_x². The elements h_{n,m} of H are statistically independent random variables, each complex Gaussian distributed with zero mean and variance σ_h² = E{|h_{n,m}|²}. A rate-1/2 turbo code with polynomials (5,7) and 4 iterations in the turbo decoder is applied. The rate matcher ensures that the coded block length is a multiple of qM and can therefore be multiplexed onto the M transmit antennas.

Figure 3: BER performance of a turbo-coded 2 × 2 MIMO system with the conventional PDA decoder and with a PDA decoder using the actual PDF of the interference and noise term. As a benchmark, the BER performance for an APP decoder is shown as well.
The numbers of iterations given in Figures 2 and 3 are the iterations performed in the PDA algorithm before the soft estimates of the bits are passed to the channel decoder. While the PDA achieves good results when no channel code is used [3], the results for the coded system can be far from optimum. In Figure 2, it can be seen that the gap between the APP and the PDA decoder is largest for the 2 × 2 system and shrinks with an increasing number of antennas. Especially for the 2 × 2 system, the gap between the APP and the PDA decoder widens with increasing E_b/N_0. Furthermore, the third iteration is not, as it should be, the best one. This is explained by the quality of the soft output generated by the PDA decoder, which degrades with every iteration as (unreliable) probabilities computed in the previous iteration are reused.
To demonstrate the impact of the Gaussian approximation on the performance of the coded PDA system, Figure 3 compares, for the 2 × 2 system, the PDA decoder using the Gaussian approximation with a decoder using the actual PDF of the interference and noise. It is clearly seen that the problems arise from the Gaussian approximation made in the PDA, as the PDA decoder using the nonapproximated PDF achieves near-optimal results. We have found similar results for convolutional codes and different code rates.

CONCLUSION
The impact of the Gaussian approximation in the conventional PDA MIMO decoder on the performance of a MIMO system with a soft-input channel decoder was analyzed. It was shown that the Gaussian approximation is most accurate for a large number of transmit antennas and a small number of constellation points in the modulation scheme. Its influence on the quality of the soft outputs, and therefore on the channel decoding, was investigated. Furthermore, it was illustrated that the main source of degradation in the PDA decoder is the Gaussian approximation and not the symbol-by-symbol decoding. The results of this paper also hold, in principle, for a multiuser detection scenario, where the usually large number of interferers results in a good approximation. The PDA decoder has been applied in iterative decoding schemes for CDMA [2] and MIMO [11] systems; in such iterative schemes, it may achieve a performance close to the optimum. A formula for the actual PDF of the interference and noise for CDMA MUD can be found in [12]. A way to improve the performance of the PDA MIMO decoder with a soft-input channel decoder might be importance sampling as proposed in [11] or the combination with sphere decoding [13].
