Multi-user MIMO MMSE non-regenerative relaying using local channel state information

In this article, we investigate a two-hop relaying communication where all nodes are equipped with antenna arrays. We derive the multiple-input multiple-output (MIMO) processing matrices using the mean-squared-error cost function and assuming that each node uses only locally available channel state information (CSI) estimates. Spatial processing at the base station and at the user terminals is the same as in the case of a direct communication. The emphasis is on the design of the MIMO precoding matrix at the relay, as it has to process the noise and the interference on the first and on the second hop at the same time. The resulting system performance is close to the performance of the system that jointly optimizes the matrices at the source and at the relay. The proposed solution requires significantly less computational power and feedback overhead than the solutions proposed in the literature.


Introduction
An important part of future wireless communication systems is multi-user (MU) multiple-input multiple-output (MIMO) processing. It has been shown that the data rate of MU MIMO systems can increase linearly with the number of transmit antennas when users are served simultaneously using space-division multiple access (SDMA) [1]. In multi-hop systems, additional intermediate radio access points, or relay nodes (RNs), are used to reduce the distances between individual nodes and simultaneously improve the channel conditions. Relays have traditionally been used to mitigate the effect of path loss and obtain robust communication. The three-terminal relay channel, where a single intermediate node supports a single communication pair, was introduced in the seminal paper [2]. Different relaying protocols, which still serve as a basis for many relaying strategies, were proposed later in [3]. The idea of relaying was first applied to wireless fading channels in [4]. Wireless relays are essential to provide reliable transmission, high throughput, and broad coverage for next generation wireless networks.
http://asp.eurasipjournals.com/content/2012/1/186
The MU MIMO communication system, where a single source, a base station (BS), communicates with a group of user terminals (UTs) over a single RN, was investigated in [11][12][13][14]. The optimum design of non-regenerative relays for MU MIMO relay systems in [11] is based on sum rate optimization. Assuming zero-forcing (ZF) dirty paper coding (DPC) at the BS and linear operations at the RN, [11] proposes upper and lower bounds on the achievable sum rate, neglecting the direct links from the BS to the UTs. The authors in [12] investigate different power allocation algorithms assuming MIMO processing only at the RN. At the BS, either no processing or a simple eigendecomposition is used. At the RN, the authors use QR decomposition in combination with DPC.
In [13], the authors extend the MIMO two-way relaying scheme with XOR precoding to a MU cellular relaying scenario, where a BS communicates with K UTs via a single decode-and-forward (DF) relay. Different UTs are spatially multiplexed using ZF beamforming or ZF DPC. A novel iterative semidefinite programming-based algorithm is used for sum rate maximization. The problem of joint linear optimization of the MIMO processing matrices at the RN and at the BS for both downlink (DL) and uplink (UL) in MU non-regenerative MIMO relay systems based on the MMSE criterion was investigated in [14]. The resulting MIMO processing matrices are calculated iteratively, the nodes require knowledge of the global CSI, and the solution for one matrix is a function of the other MIMO processing matrix. As a consequence, either a central node has to use the global CSI to find the optimum MIMO processing matrices, or the BS and the RN have to exchange the CSI and the respective MIMO processing matrices. This exchange of information between the BS and the RN consumes part of the system throughput. Because the algorithm is iterative, both the computational complexity and the part of the system throughput used for the information exchange are higher. This significantly reduces the practicality of the algorithm. Additionally, the antenna configuration in [14] considers only single-antenna UTs and requires the number of antennas at the BS, the number of antennas at the RN, and the number of UTs to be equal. In this article, we consider a more general scenario where the number of antennas at the BS, RN, and UTs is not limited.
In our article, we derive the MIMO processing matrices at the BS, RN, and the UTs using the MMSE criterion, as opposed to the ZF criterion used in [11][12][13]. Unlike [14], we assume that these matrices are designed using only the local CSI available at the nodes, and the feedback overhead is used only to provide the transmitters with the additive noise variances at the receivers. The MMSE criterion is motivated by robustness to channel estimation errors and a lower implementation complexity. Moreover, MIMO processing matrices designed using the MMSE criterion do not share the limitation of those designed using the ZF criterion, namely that the total number of antennas at the UTs must be less than or equal to the number of antennas at the BS/RN. The use of only local CSI means that the MIMO processing matrix at the BS is designed using only the MIMO channel matrix from the BS to the RN, and the MIMO processing matrix at the RN is designed using the MIMO channel matrices from the BS to the RN and from the RN to the UTs. Our goal is also to design the MIMO processing matrices at the BS and at the RN independently of each other. In [11,14], the authors assume that the BS and the RN each have multiple antennas, but that the UTs have only a single receive antenna. We do not have any restrictions regarding the total number of antennas at the UTs.
The article is organized as follows. In the section "System model", we describe the relaying system. In the section "Design of MIMO processing matrices", we derive the MIMO processing matrices, and in the section "Numerical results", we present the results of simulations. A short summary follows in the section "Conclusions".

System model
We consider a MU MIMO DL system, where a BS communicates with $K$ UTs over a single RN. The direct links from the BS to the UTs are neglected, assuming large path loss. There are $M_B$ antennas at the BS, $M_R$ antennas at the RN, and $M_{U_k}$ receive antennas at the $k$th UT, $k = 1, 2, \ldots, K$. The total number of antennas at the UTs is $M_U = \sum_{k=1}^{K} M_{U_k}$. We use the notation $\{M_{U_1}, \ldots, M_{U_K}\} \times M_R \times M_B$ to describe the antenna configuration of the system. A block diagram of such a system is depicted in Figure 1. The channel matrix from the RN to the $k$th UT is denoted as $\mathbf{H}_{2,k} \in \mathbb{C}^{M_{U_k} \times M_R}$, and the combined channel matrix from the RN to the UTs is given by $\mathbf{H}_2 = [\mathbf{H}_{2,1}^T \; \cdots \; \mathbf{H}_{2,K}^T]^T \in \mathbb{C}^{M_U \times M_R}$. The channel matrix from the BS to the RN is denoted as $\mathbf{H}_1 \in \mathbb{C}^{M_R \times M_B}$. The transmit data vectors $\mathbf{x}_k \in \mathbb{C}^{r_k \times 1}$ and the receive data vectors $\mathbf{y}_k \in \mathbb{C}^{r_k \times 1}$, $k = 1, \ldots, K$, for the $K$ UTs are stacked in the vectors $\mathbf{x} = [\mathbf{x}_1^T \; \cdots \; \mathbf{x}_K^T]^T$ and $\mathbf{y} = [\mathbf{y}_1^T \; \cdots \; \mathbf{y}_K^T]^T$, where $r_k$ denotes the number of spatially multiplexed data streams to the $k$th user.
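The input-output signal model is given by the following equation: [...], where $\mathbf{n}_2 = [\mathbf{n}_{2,1}^T \; \cdots \; \mathbf{n}_{2,K}^T]^T \in \mathbb{C}^{M_U \times 1}$ is the stacked vector of the zero mean additive white Gaussian noise at the input of the UT antenna arrays, and $\mathbf{n}_1 \in \mathbb{C}^{M_R \times 1}$ is the zero mean additive white Gaussian noise vector at the RN antenna array. The MIMO precoding matrix at the BS is denoted as $\mathbf{F} \in \mathbb{C}^{M_B \times r}$ and the MIMO receive matrix at the RN is denoted as $\mathbf{D}_R \in \mathbb{C}^{r \times M_R}$. The combined MIMO precoding matrix at the RN and the combined MIMO receive matrix at the UTs are denoted as $\mathbf{F}_R = [\mathbf{F}_{R_1} \; \cdots \; \mathbf{F}_{R_K}] \in \mathbb{C}^{M_R \times r}$ and $\mathbf{D} = \operatorname{blkdiag}(\mathbf{D}_1, \ldots, \mathbf{D}_K) \in \mathbb{C}^{r \times M_U}$, respectively, where $\mathbf{F}_{R_k} \in \mathbb{C}^{M_R \times r_k}$ is the RN MIMO precoding matrix corresponding to the $k$th UT and $\mathbf{D}_k \in \mathbb{C}^{r_k \times M_{U_k}}$ is the $k$th UT MIMO receive matrix. The parameters $\beta_1$ and $\beta_2$ are chosen to set the total transmit power at the BS and at the RN to $P_{T_B}$ and $P_{T_R}$, respectively. The total number of spatially multiplexed data streams is denoted as $r = \sum_{k=1}^{K} r_k \le \min(\operatorname{rank}(\mathbf{H}_1), \operatorname{rank}(\mathbf{H}_2)) \le \min(M_B, M_R, M_U)$. The elements of the vectors $\mathbf{x}$, $\mathbf{n}_1$, and $\mathbf{n}_2$ are assumed to be statistically independent.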
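The dimension bookkeeping of the system model can be checked numerically. The following is a minimal sketch, not the article's model: the processing matrices are random placeholders, the power scalings $\beta_1$ and $\beta_2$ are omitted, and the signal chain in the last lines is an assumed two-hop cascade chosen only to be consistent with the matrix dimensions given above. The helper names `crandn` and `block_diag` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def crandn(rows, cols):
    """Matrix of i.i.d. zero-mean, unit-variance complex Gaussian entries."""
    return (rng.standard_normal((rows, cols))
            + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2)

def block_diag(blocks):
    """Place the given matrices on the main block diagonal of a zero matrix."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols), dtype=complex)
    i = j = 0
    for b in blocks:
        out[i:i + b.shape[0], j:j + b.shape[1]] = b
        i += b.shape[0]
        j += b.shape[1]
    return out

# Illustrative configuration {2, 2} x 4 x 4 with one stream per UT.
M_B, M_R = 4, 4
M_U_k, r_k = [2, 2], [1, 1]
M_U, r = sum(M_U_k), sum(r_k)

H1 = crandn(M_R, M_B)                               # BS -> RN
H2 = np.vstack([crandn(m, M_R) for m in M_U_k])     # RN -> all UTs, stacked

# Random placeholders: only the shapes matter here.
F = crandn(M_B, r)
D_R = crandn(r, M_R)
F_R = np.hstack([crandn(M_R, rk) for rk in r_k])
D = block_diag([crandn(rk, m) for rk, m in zip(r_k, M_U_k)])

x = crandn(r, 1)
n1 = 0.1 * crandn(M_R, 1)
n2 = 0.1 * crandn(M_U, 1)

# ASSUMED cascade (power scalings beta_1, beta_2 left out).
x_R = D_R @ (H1 @ F @ x + n1)       # RN receive processing
y = D @ (H2 @ F_R @ x_R + n2)       # UT receive processing
print(H2.shape, F_R.shape, D.shape, y.shape)
```

The block-diagonal structure of `D` reflects that the UTs cannot cooperate: each UT processes only its own antenna outputs.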

Channel estimation
We assume the system operates in time division duplex (TDD) so that we can exploit the estimated UL channel for DL transmission due to the reciprocity principle. In general, on the DL the UTs need only to estimate the effective MIMO matrix that includes the MIMO processing at the BS to perform the MIMO receive processing. However, on the UL, the BS requires both the effective channel matrix and the over-the-air MIMO channel estimates to perform resource allocation and MU MIMO processing. Therefore, on the DL, we need only one type of pilot symbols for CSI estimation, while on the UL we need two types of pilots that are used for estimation of the UTs' over-the-air MIMO channel matrices and the effective channel matrices. Different types of pilot symbols used in MIMO channel estimation are described in [15]. In our case, the BS has the estimate of $\mathbf{H}_1$, the RN has the estimates of $\mathbf{H}_1\mathbf{F}$ and $\mathbf{H}_2$, and the $k$th UT has the estimate of $\mathbf{H}_{2,k}\mathbf{F}_{R_k}$.
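As an illustration of how a node could obtain an effective channel such as $\mathbf{H}_1\mathbf{F}$ from precoded pilots, the sketch below uses a plain least-squares estimator with orthogonal DFT pilots. This is an assumption made for illustration; the pilot schemes referenced in [15] are not reproduced here, and the function name `ls_effective_channel` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def crandn(rows, cols):
    """Matrix of i.i.d. zero-mean, unit-variance complex Gaussian entries."""
    return (rng.standard_normal((rows, cols))
            + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2)

def ls_effective_channel(Y, P):
    """Least-squares estimate of A from Y = A P + N for known pilots P (r x T)."""
    return Y @ P.conj().T @ np.linalg.inv(P @ P.conj().T)

M_B, M_R, r, T = 4, 4, 2, 8
H1 = crandn(M_R, M_B)
F = crandn(M_B, r)                       # placeholder BS precoder

# Orthogonal pilots: the first r rows of a T-point DFT matrix (P P^H = T I).
P = np.fft.fft(np.eye(T))[:r, :]

Y_clean = H1 @ F @ P                     # precoded pilots seen at the RN, no noise
Y_noisy = Y_clean + 0.01 * crandn(M_R, T)

A_hat_clean = ls_effective_channel(Y_clean, P)
A_hat_noisy = ls_effective_channel(Y_noisy, P)
```

Without noise the estimate reproduces `H1 @ F` exactly; with noise the error shrinks as the pilot length `T` grows.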

Design of MIMO processing matrices
The design of the MIMO precoding matrix at the BS and of the MIMO receive matrices at the RN and the UTs is straightforward because we use only local CSI.
At the RN, we have the estimate of $\mathbf{H}_1\mathbf{F}$, and the optimum MIMO receive matrix $\mathbf{D}_R$ is obtained from [...], where $\mathbf{R}_x = E\{\mathbf{x}\mathbf{x}^H\}$ denotes the transmit vector correlation matrix, $\mathbf{R}_{n_1} = E\{\mathbf{n}_1\mathbf{n}_1^H\}$ denotes the additive noise correlation matrix, and $(\cdot)^H$ denotes conjugate transpose.
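For orientation, the textbook linear MMSE (Wiener) receive matrix for a model of the form $\mathbf{y}_R = \mathbf{A}\mathbf{x} + \mathbf{n}_1$, with $\mathbf{A}$ standing in for the effective channel $\mathbf{H}_1\mathbf{F}$ and with uncorrelated $\mathbf{x}$ and $\mathbf{n}_1$, is $\mathbf{R}_x\mathbf{A}^H(\mathbf{A}\mathbf{R}_x\mathbf{A}^H + \mathbf{R}_{n_1})^{-1}$. The sketch below assumes this model (any power scaling is taken to be absorbed into `A`) and checks the orthogonality principle numerically; it is not claimed to match the article's equation term by term.

```python
import numpy as np

rng = np.random.default_rng(2)

def crandn(rows, cols):
    """Matrix of i.i.d. zero-mean, unit-variance complex Gaussian entries."""
    return (rng.standard_normal((rows, cols))
            + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2)

def herm(A):
    """Conjugate transpose."""
    return A.conj().T

def mmse_receive_matrix(A, R_x, R_n):
    """Wiener filter D minimizing E||x - D (A x + n)||^2 (x, n uncorrelated)."""
    R_y = A @ R_x @ herm(A) + R_n
    return R_x @ herm(A) @ np.linalg.inv(R_y)

def mse(D, A, R_x, R_n):
    """Total mean squared error tr E{(x - D y)(x - D y)^H} for y = A x + n."""
    E = np.eye(A.shape[1]) - D @ A
    return float(np.real(np.trace(E @ R_x @ herm(E) + D @ R_n @ herm(D))))

M_B, M_R, r = 4, 4, 2
A = crandn(M_R, M_B) @ crandn(M_B, r)    # stands in for H_1 F (assumption)
R_x = np.eye(r)
R_n = 0.1 * np.eye(M_R)

D_opt = mmse_receive_matrix(A, R_x, R_n)
D_pert = D_opt + 0.05 * crandn(r, M_R)   # any perturbation should raise the MSE

# Orthogonality principle: the error x - D y is uncorrelated with y.
residual = R_x @ herm(A) - D_opt @ (A @ R_x @ herm(A) + R_n)
```

Because the MSE is a strictly convex quadratic in the receive matrix when the noise covariance is positive definite, every perturbation of `D_opt` increases it.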
Let us define the singular value decomposition (SVD) of the channel matrix $\mathbf{H}_1$ as $\mathbf{H}_1 = \mathbf{U}_1 \boldsymbol{\Sigma}_1 \mathbf{V}_1^H$. From [16,17], we can assume that the matrix $\mathbf{F}$ has the form of $\mathbf{V}_1^{(r)}$, which contains the first $r$ columns of the matrix $\mathbf{V}_1$, multiplied from the right by a matrix in $\mathbb{C}^{r \times r}$. Then, from Equation (5), the matrix $\mathbf{D}_R$ can also be written as $\mathbf{D}_R = \mathbf{R}\,\mathbf{U}_1^{(r)H}$, where $\mathbf{R} \in \mathbb{C}^{r \times r}$ and $\mathbf{U}_1^{(r)}$ contains the first $r$ columns of $\mathbf{U}_1$. At the BS, we assume we have the estimate of the channel matrix $\mathbf{H}_1$. The MIMO precoding matrix at the BS is derived from the following optimization: [...] The MIMO precoding matrix $\mathbf{F}$ can be obtained in several ways. Using the approach presented in [16,17], we can substitute the solution for $\mathbf{D}_R$ from (5) in (9) and then find the optimum $\mathbf{F}$. Another approach, used in [10], designs the matrices $\mathbf{F}$ and $\mathbf{D}_R$ iteratively: we start with some solution for $\mathbf{F}$, then calculate $\mathbf{D}_R$, then use this solution to update the matrix $\mathbf{F}$, and so on. Unlike these approaches, in this article we want to be able to design the spatial processing matrices at the transmitter and at the receiver independently. The transmit MIMO processing matrices are designed assuming only eigenmode decomposition at the receiver, regardless of the actual spatial processing used at the receiver. This is the worst case assumption, as only the transmitter would have to deal with the noise and spatial interference. Therefore, the matrix $\mathbf{F}$ is designed in a non-iterative way by assuming at the BS that $\mathbf{R} = \mathbf{I}_r$, where $\mathbf{I}_r \in \mathbb{R}^{r \times r}$ denotes the identity matrix. At high signal-to-noise ratios (SNRs) this assumption is true. Equation (9) can then be written as [...], where $\boldsymbol{\Sigma}_1^{(r)} \in \mathbb{C}^{r \times r}$ is a diagonal matrix with the $r$ largest singular values of $\mathbf{H}_1$ on the main diagonal and $\bar{\mathbf{n}}_1 = \mathbf{U}_1^{(r)H}\mathbf{n}_1$. Using the method of Lagrangian multipliers, from Equation (10) it can easily be shown that the optimum $\mathbf{F}$ is in the form of [...] From Equation (11), it follows that the optimum $r \times r$ factor of $\mathbf{F}$ is a diagonal positive definite power-loading matrix.
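The eigenmode structure used in this design can be illustrated numerically. In the sketch below, the precoder is built from the first $r$ right singular vectors of $\mathbf{H}_1$ and the RN receive matrix from the first $r$ left singular vectors (the $\mathbf{R} = \mathbf{I}_r$ assumption). The power-loading matrix, here called `Phi` (a hypothetical name), is set to a scaled identity purely as a placeholder; the article's MMSE power loading is not reproduced, and unit-variance uncorrelated data symbols are assumed.

```python
import numpy as np

rng = np.random.default_rng(3)

def crandn(rows, cols):
    """Matrix of i.i.d. zero-mean, unit-variance complex Gaussian entries."""
    return (rng.standard_normal((rows, cols))
            + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2)

M_B, M_R, r, P_T = 4, 4, 2, 1.0
H1 = crandn(M_R, M_B)

# SVD H1 = U1 diag(s1) V1^H; NumPy returns V1^H and sorts s1 in descending order.
U1, s1, V1h = np.linalg.svd(H1)
U1_r = U1[:, :r]                 # first r left singular vectors
V1_r = V1h.conj().T[:, :r]       # first r right singular vectors

# PLACEHOLDER power loading (equal power per stream), not the MMSE solution.
Phi = np.sqrt(P_T / r) * np.eye(r)
F = V1_r @ Phi

# Eigenmode decomposition at the RN, i.e. the R = I_r assumption.
D_R = U1_r.conj().T

G = D_R @ H1 @ F                                  # effective first-hop channel
tx_power = float(np.real(np.trace(F @ F.conj().T)))
```

With this structure the effective first-hop channel `G` is diagonal, so the streams do not interfere and only the per-stream power loading remains to be chosen.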
If the elements of the additive noise vector at the input of the RN antenna array are independent and identically distributed (i.i.d.) zero mean complex Gaussian random variables with variance $\sigma_{n_1}^2$, then the projected noise $\bar{\mathbf{n}}_1 = \mathbf{U}_1^{(r)H}\mathbf{n}_1$ satisfies $\operatorname{tr}\{\mathbf{R}_{\bar{n}_1}\} = r\sigma_{n_1}^2$. Then, we only need to feed back the noise variance $\sigma_{n_1}^2$ from the RN to the BS to design the MIMO precoding matrix $\mathbf{F}$.
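The trace identity follows in one step from the orthonormality of the singular vectors. Writing the projected noise with an overbar (notation assumed here) and using $\mathbf{R}_{n_1} = \sigma_{n_1}^2\mathbf{I}_{M_R}$ for i.i.d. noise:

```latex
\begin{aligned}
\mathbf{R}_{\bar{n}_1}
  &= E\{\bar{\mathbf{n}}_1 \bar{\mathbf{n}}_1^H\}
   = \mathbf{U}_1^{(r)H}\, \mathbf{R}_{n_1}\, \mathbf{U}_1^{(r)}
   = \sigma_{n_1}^2\, \mathbf{U}_1^{(r)H} \mathbf{U}_1^{(r)}
   = \sigma_{n_1}^2\, \mathbf{I}_r, \\
\operatorname{tr}\{\mathbf{R}_{\bar{n}_1}\}
  &= r\, \sigma_{n_1}^2 .
\end{aligned}
```

Only the scalar $\sigma_{n_1}^2$ therefore has to be known at the BS, which is why a single fed-back value suffices.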
Under the assumption that the estimate of $\mathbf{H}_{2,k}\mathbf{F}_{R_k}$ is available at the $k$th UT, the $k$th UT MIMO receive matrix is obtained from [...] as [...], where $\mathbf{R}_{x_R,k} = E\{\mathbf{x}_{R_k}\mathbf{x}_{R_k}^H\}$ denotes the $k$th UT's RN transmit vector correlation matrix, and $\mathbf{R}_{n_2,k} = E\{\mathbf{n}_{2,k}\mathbf{n}_{2,k}^H\}$ denotes the correlation matrix of the additive noise at the input of the $k$th UT antenna array.
Our goal is to use as much as possible of the available users' spatial resources and at the same time minimize the MU interference (MUI) between different users. Let us consider the MSE at the UTs: [...] We can rewrite this equation as [...] The block-diagonal part of $\mathbf{D}\mathbf{H}_2\mathbf{F}_R$ has the matrices $\mathbf{D}_k\mathbf{H}_{2,k}\mathbf{F}_{R_k}$ on its main diagonal, while the remaining off-diagonal part, with blocks $\mathbf{D}_k\mathbf{H}_{2,k}\mathbf{F}_{R_j}$, $j \neq k$, represents the MUI. In order to design the MU MIMO precoding matrix at the RN, we have to meet two contradictory requirements. First, we need to minimize the co-channel interference between different users by reducing the overlap of the row spaces spanned by the effective channel matrices of different users. However, to maximize the spatial processing gains we need to use as much as possible of the available UTs' channel row vector subspaces. Therefore, we factor the MU MIMO precoding matrix at the RN as [...], where the matrix $\mathbf{F}_{R_a,k}$ is used to minimize the MUI from the $k$th UT to the co-channel UTs, the matrix $\mathbf{F}_{R_b,k}$ is used to maximize the received power of the $k$th UT, and the matrix $\mathbf{F}_{R_c,k}$ is used to optimize the $k$th UT performance according to a specific criterion. Matrix $\mathbf{F}_{R_a}$ is obtained from Equation (17) using the following optimization: [...], assuming that the matrices $\mathbf{D}_k$, $\mathbf{F}_{R_b,k}$, and $\mathbf{F}_{R_c,k}$ are unitary, that $r_k = \operatorname{rank}(\mathbf{H}_{2,k} \cdot \mathbf{H}_1)$, and, without loss of generality, that the elements of the vectors $\mathbf{x}_R$ and $\mathbf{n}_R$ are i.i.d. zero mean unit variance random variables. These assumptions correspond to the initial requirement that all UTs use as much as possible of the available subspace for communication. Equation (20) can be written as [...] The joint co-channel UTs channel matrix $\tilde{\mathbf{H}}_{2,k} \in \mathbb{C}^{(M_U - M_{U_k}) \times M_R}$ is defined as $\tilde{\mathbf{H}}_{2,k} = [\mathbf{H}_{2,1}^T \; \cdots \; \mathbf{H}_{2,k-1}^T \; \mathbf{H}_{2,k+1}^T \; \cdots \; \mathbf{H}_{2,K}^T]^T$. Let us define the SVD of $\tilde{\mathbf{H}}_{2,k}$ as [...]; then the non-trivial solution for $\mathbf{F}_{R_a}$ in Equation (21) is given by [17] [...], assuming that the matrices $\mathbf{D}_k$ and $\mathbf{F}_{R_c,k}$ are unitary and $r_k = \operatorname{rank}(\mathbf{H}_{2,k} \cdot \mathbf{H}_1)$.
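The roles of the first two factors can be illustrated with classical block diagonalization, which is the zero-interference special case and not the regularized solution derived in the article. In this sketch the first factor is an orthonormal basis of the null space of the joint co-channel matrix, and the second factor selects the strongest eigenmodes of the resulting single-user channel. It assumes that the RN has enough antennas for a null space of dimension at least $r_k$ to exist; the function name `bd_precoders` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)

def crandn(rows, cols):
    """Matrix of i.i.d. zero-mean, unit-variance complex Gaussian entries."""
    return (rng.standard_normal((rows, cols))
            + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2)

def bd_precoders(H_list, r_list):
    """Per-UT precoders F_a F_b using plain (non-regularized) block diagonalization."""
    blocks = []
    for k, (H_k, r_k) in enumerate(zip(H_list, r_list)):
        # Joint channel of all co-channel UTs (every UT except the k-th).
        H_tilde = np.vstack([H for j, H in enumerate(H_list) if j != k])
        rank = np.linalg.matrix_rank(H_tilde)
        _, _, Vh = np.linalg.svd(H_tilde)            # full V^H is returned
        F_a = Vh.conj().T[:, rank:]                  # null-space basis: no MUI
        # Strongest eigenmodes of the interference-free single-user channel.
        _, _, Vh_eff = np.linalg.svd(H_k @ F_a)
        F_b = Vh_eff.conj().T[:, :r_k]
        blocks.append(F_a @ F_b)
    return np.hstack(blocks)

# Illustrative configuration {2, 2} x 4 with one stream per UT.
M_R, M_U_k, r_k = 4, [2, 2], [1, 1]
H2_k = [crandn(m, M_R) for m in M_U_k]
H2 = np.vstack(H2_k)

F_R = bd_precoders(H2_k, r_k)
H_eff = H2 @ F_R              # rows 0:2 belong to UT 1, rows 2:4 to UT 2
```

The combined effective channel `H_eff` is block diagonal: the column intended for one UT produces no signal at the antennas of the other UT.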
Again, without loss of generality, we can assume that the elements of the vectors $\mathbf{x}_R$ and $\mathbf{n}_R$ are i.i.d. zero mean unit variance random variables. Equation (25) is rewritten as [...] The non-trivial solution of (26) is given by [...] Finally, we can design the optimum matrix $\mathbf{F}_{R_c,k}$ according to a specific optimization criterion. In our case, we use the MMSE criterion, so the optimum $\mathbf{F}_{R_c,k}$ is obtained from [...], assuming the MU MIMO channel is decomposed into a set of parallel single-user (SU) MIMO channels using the matrices $\mathbf{F}_{R_a,k}$. Let us define the SVD of $\mathbf{H}_{2,k}\mathbf{F}_{R_a,k}\mathbf{F}_{R_b,k}$ as [...] Again, we can assume in the worst case scenario that at the UTs we perform only eigenmode decomposition of the effective UTs' channel matrices, i.e., $\mathbf{D}_k = \mathbf{U}_{R_k}^{(r_k)H}$. We can rewrite Equation (28) as [...], where we have assumed that the optimum $\mathbf{F}_{R_c}$ is in the form [17] [...] and $\bar{\mathbf{n}}_{2,k} = \mathbf{U}_{R_k}^{(r_k)H}\mathbf{n}_{2,k}$. After setting the derivative of (30) to zero, we have [...] From Equation (32) we have [...] Finally, the parameter $\beta_2$ is chosen to set the total transmit power at the RN to $P_{T_R}$: [...]
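The power normalization at the RN can be sketched as follows. The code assumes that the RN transmits a scaled version of its precoded receive-filter output, with the received vector modeled as the effective first-hop channel times the data plus noise; the exact placement of $\beta_2$ in the article's equations is not reproduced, and `relay_power_scaling` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(5)

def crandn(rows, cols):
    """Matrix of i.i.d. zero-mean, unit-variance complex Gaussian entries."""
    return (rng.standard_normal((rows, cols))
            + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2)

def relay_power_scaling(F_R, D_R, A, R_x, R_n1, P_TR):
    """Scalar beta_2 such that beta_2 * F_R D_R (A x + n_1) has total power P_TR."""
    R_yR = A @ R_x @ A.conj().T + R_n1          # covariance of the RN input
    W = F_R @ D_R                               # unscaled RN processing
    unscaled = np.real(np.trace(W @ R_yR @ W.conj().T))
    return float(np.sqrt(P_TR / unscaled))

M_B, M_R, r, P_TR = 4, 4, 2, 1.0
A = crandn(M_R, M_B) @ crandn(M_B, r)           # stands in for H_1 F (assumption)
R_x = np.eye(r)
R_n1 = 0.1 * np.eye(M_R)
D_R = crandn(r, M_R)                            # placeholder RN receive matrix
F_R = crandn(M_R, r)                            # placeholder RN precoder

beta_2 = relay_power_scaling(F_R, D_R, A, R_x, R_n1, P_TR)

# Verify the transmit power after scaling.
W = beta_2 * F_R @ D_R
tx_power = float(np.real(np.trace(W @ (A @ R_x @ A.conj().T + R_n1) @ W.conj().T)))
```

Note that the first-hop noise enters the normalization: the RN spends part of its power budget forwarding noise, which is one reason the relay precoder has to account for both hops.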

Numerical results
In this section, we compare the performance of the proposed algorithm to the performance of a system using hard decision DF relaying and to the optimal joint (OJ) MMSE algorithm proposed in [14] that jointly optimizes the MIMO processing matrices at the BS and at the RN. We denote the algorithm proposed in this article as regularized block diagonal amplify-and-forward (RBD AF) because, at high SNRs and when the total number of antennas at the UTs is less than or equal to the number of antennas at the RN, the combined effective channel matrix from the RN to the UTs, $\mathbf{H}_2\mathbf{F}_R$, is block diagonal, since the RN transmits to each UT only in the null subspace of the co-channel UTs. We assume that the RN is placed half-way between the BS and the UTs, and that the path loss exponent is $n = 4$. The transmit power at the BS and the transmit power at the RN are equal, $P_{T_B} = P_{T_R} = P_T$. The additive noise variances at the input of the RN and the UTs' antenna arrays are also assumed to be equal, $\sigma_{n_1}^2 = \sigma_{n_2}^2 = \sigma_n^2$. The MIMO channel matrices between the BS and the RN, and between the RN and the UTs, are modeled as spatially white uncorrelated MIMO channels $\mathbf{H}_w$. The elements of the channel matrices are zero mean, unit variance complex Gaussian variables.
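The effect of placing the RN half-way can be quantified with a simple distance power law. The sketch below assumes that the received power scales as $d^{-n}$ and compares one hop with the neglected direct link; how the article normalizes the path loss in its simulations is not stated in this section, so this is only an illustrative calculation.

```python
import math

def relative_hop_gain_db(distance_fraction, path_loss_exponent):
    """Power gain in dB of a hop covering a fraction of the direct-link distance,
    assuming received power proportional to distance ** (-path_loss_exponent)."""
    return -10.0 * path_loss_exponent * math.log10(distance_fraction)

# RN half-way between BS and UTs, path loss exponent n = 4.
gain_db = relative_hop_gain_db(0.5, 4)
gain_linear = 0.5 ** -4
print(round(gain_db, 2), gain_linear)   # prints 12.04 16.0
```

Under this assumption each hop enjoys a 16-fold (about 12 dB) lower path loss than the direct BS-to-UT link, which is consistent with neglecting the direct links.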
First, in Figure 2, we compare the bit error rate (BER) performance of RBD AF and OJ MMSE under the assumption used in [14] that all UTs are equipped with only one antenna. The system antenna configuration is $\{1, 1, 1\} \times 4 \times 4$, i.e., there are $K = 3$ UTs in the system, each equipped with a single antenna, there are $M_R = 4$ antennas at the RN, and $M_B = 4$ antennas at the BS. Data are uncoded and mapped using 4-ary quadrature amplitude modulation (4QAM). As we can see from the figure, the OJ MMSE algorithm, which jointly optimizes the MIMO processing matrices at the BS and the RN, has only a slight advantage over RBD AF at low SNRs. At high SNRs the difference between RBD AF and OJ MMSE is negligible.
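The BER measurement itself can be sketched independently of the relay processing. The example below maps uncoded bits to unit-energy 4QAM symbols, applies hard decisions, and counts bit errors. The Gray labeling and the plain additive noise channel are assumptions made for illustration; the article does not specify its bit labeling, and the helper names are hypothetical.

```python
import numpy as np

def qam4_map(bits):
    """Map pairs of bits to unit-energy 4QAM symbols (Gray labeling assumed)."""
    b = np.asarray(bits).reshape(-1, 2)
    return ((1 - 2 * b[:, 0]) + 1j * (1 - 2 * b[:, 1])) / np.sqrt(2)

def qam4_demap(symbols):
    """Hard decision: one bit from the sign of each quadrature component."""
    symbols = np.asarray(symbols)
    out = np.empty((symbols.size, 2), dtype=int)
    out[:, 0] = np.signbit(symbols.real)
    out[:, 1] = np.signbit(symbols.imag)
    return out.reshape(-1)

def bit_error_rate(tx_bits, rx_bits):
    """Fraction of positions in which the two bit sequences differ."""
    return float(np.mean(np.asarray(tx_bits) != np.asarray(rx_bits)))

rng = np.random.default_rng(6)
tx_bits = rng.integers(0, 2, size=2000)
symbols = qam4_map(tx_bits)

# Additive complex Gaussian noise with standard deviation 0.3 per real dimension.
noise = 0.3 * (rng.standard_normal(symbols.size)
               + 1j * rng.standard_normal(symbols.size))
rx_bits = qam4_demap(symbols + noise)
ber = bit_error_rate(tx_bits, rx_bits)
```

In a full simulation the noisy symbols would be replaced by the outputs of the UT receive matrices, and the BER would be averaged over many channel realizations per SNR point.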
However, if we consider UTs equipped with multiple antennas, then the RBD AF algorithm gains significantly over OJ MMSE. In Figure 3, RBD AF transmits one data stream per UT using 4QAM modulation. In order to keep the comparison fair, in the case of OJ MMSE we have two data streams per UT modulated using binary phase shift keying (BPSK), so that both schemes carry the same number of bits per UT. The RBD AF algorithm extracts higher array and diversity gains than OJ MMSE.
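The fairness of this comparison rests on a simple bit count per UT and channel use, which can be written out as:

```latex
\underbrace{1 \cdot \log_2 4}_{\text{RBD AF: one 4QAM stream}}
  \;=\; 2\ \text{bits}
  \;=\; \underbrace{2 \cdot \log_2 2}_{\text{OJ MMSE: two BPSK streams}}
```

Both schemes therefore deliver the same uncoded rate per UT, and the BER curves differ only through the spatial processing.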
In Figure 4, we compare the performance of the RBD AF algorithm and a system using hard decision DF relaying in an overloaded system, i.e., a system where the total number of antennas at the UTs is greater than the number of antennas at the RN. In the case of the DF system, we again use the RBD algorithm to design the MIMO precoding matrix at the RN. However, we omit the influence of the additive noise at the input of the RN antenna array. The DF system has slightly higher spatial processing gains, and an SNR gain over RBD AF of around 3 dB at BER = $10^{-3}$.

Conclusions
In this article, we investigated a two-hop communication from the BS to the UTs over one RN. We derived the MIMO processing matrices at the BS, RN, and the UTs using only local CSI. In order to be able to design the spatial processing matrices at the transmitter and the receiver independently, the transmit MIMO processing matrices are designed assuming only eigenmode decomposition at the receiver, regardless of the actual spatial processing used at the receiver. This is the worst case assumption, as only the transmitter would have to deal with the noise and spatial interference. The emphasis is on the design of the MIMO precoding matrix at the relay, as it has to process noise and interference on the first and the second hop at the same time. The MU MIMO precoding matrix at the relay is designed using a criterion that minimizes the MU interference while at the same time trying to exploit as much as possible of the available UT spatial processing gains. In our simulations, we have shown that the proposed system has a negligible performance loss compared to the system that iteratively and jointly optimizes the MIMO processing matrices at the BS and at the RN, and around 3 dB SNR loss at the BER of interest compared to the system that uses hard decision DF relaying.