On the user performance of orthogonal projection signal alignment scheme in MIMO relay systems

In this article, an orthogonal projection signal alignment (OP-SI) scheme is proposed for multiple-input multiple-output (MIMO) relay systems, and the user performance is studied with and without perfect channel state information (CSI). In particular, we focus on an important scenario in cellular systems, where the base station exchanges messages with multiple users via a relay. By jointly designing the precoding matrices at the base station and the relay, the information exchange can be accomplished within two time slots. When perfect CSI is available, closed-form expressions of the ergodic sum rate are developed for the proposed scheme, which demonstrate that the proposed scheme yields better performance than the existing zero-forcing signal alignment scheme. When the channel estimation is not perfect, neither the intra-stream nor the inter-stream interference can be completely removed. To evaluate the impact of channel estimation error, an approximation of the ergodic sum rate for the OP-SI scheme is developed, which shows that the system performance can decrease significantly as the covariance of the channel estimation error increases. The outage capacity is also analyzed. Furthermore, an enhanced relay precoding scheme is introduced to improve the transmission performance. Numerical results are provided to show the accuracy of the developed analytical results.


Introduction
To increase the system throughput and enlarge the coverage for cellular systems, relaying is introduced as a key technology for the next generation mobile telecommunication systems [1]. However, the extra radio resource consumed by relay transmissions could lead to a loss of the spectrum efficiency and system throughput as well. As a promising method to overcome such shortcomings of relaying, the application of network coding in wireless communications has been studied recently, specifically in cellular communication scenarios [2][3][4][5]. Network coding was first introduced to two-way relaying systems, where two source nodes exchange messages via a relay. By asking the relay to broadcast the mixture of the source messages, the communication between two source nodes can be accomplished within two time slots, where each http://jwcn.eurasipjournals.com/content/2012/1/308 decode-and-forward (DF) MIMO relay systems was investigated in [7]. The application of analog network coding to MIMO two-way relaying channel was researched in [8], where the optimal design of beamforming was provided. Most of these existing works assume the perfect CSI either at the transmitters or the receivers.
In this article, we propose an orthogonal projection signal alignment (OP-SI) scheme for MIMO relaying channels. The main contributions of this article are as follows. First, we propose a new signal alignment scheme for a classical cellular communication scenario, where a base station exchanges messages with multiple users with the help of one relay. By jointly designing the precoding at the base station and the relay, the messages from and to the same user can be aligned together. Unlike [10], the precoding matrix at the relay is constructed by projecting each aligned message onto a single carefully chosen direction of the null space, so that the inversion of large matrices can be avoided. Since the co-channel interference can be eliminated, the multiple uplink and downlink transmissions can be accomplished within two time slots.
Second, analytical results, such as the user ergodic sum rate and the outage capacity with and without channel estimation error, are derived to evaluate the performance of the proposed OP-SI scheme. When the channel estimation is free of error, our analytical results clearly show that the OP-SI scheme achieves a higher ergodic sum rate at the user nodes than the zero-forcing signal alignment (ZF-SI) scheme proposed in [10]. Furthermore, the impact of channel estimation error on the OP-SI scheme is also evaluated. Such analysis is rare in the existing works. Unlike the work in [12][13][14], we focus on MIMO relaying channels in this article, which is a more challenging scenario than the one-hop MIMO system studied in previous works. Moreover, many existing analyses of signal alignment transmissions are based on the assumption that perfect global CSI is available, such as in [6,7,10]. However, channel estimation error cannot be avoided completely in practical wireless systems. Such error causes severe interference, especially in the context of signal alignment transmissions, where the incompletely removed self-interference, a phenomenon unique to signal alignment, severely degrades the transmission reliability. Based on practical channel estimation error models, we are able to fully characterize the impact of channel estimation error in this article. Finally, an improved OP-SI scheme with optimal relay precoding selection is introduced, which provides further significant performance gains in the presence of channel estimation error.
The rest of this article is organized as follows. Section 'System model and protocol description' describes the system model and introduces the proposed OP-SI transmission scheme. In Section 'Performance analysis for the proposed OP-SI scheme', the performance analysis with and without channel estimation error is derived, and the comparison with the ZF-SI scheme is also presented. Then, in Section 'The enhanced precoding design for the OP-SI scheme at the relay', the enhanced relay precoding design for OP-SI is provided, which can improve the transmission performance. The numerical results are shown in Section 'Numerical results', followed by the conclusions in Section 'Conclusion'.
Notation: Vectors and matrices are denoted by boldface lowercase and uppercase letters, respectively, e.g., $\mathbf{b}$ and $\mathbf{A}$. The trace of $\mathbf{A}$ is denoted as $\mathrm{tr}(\mathbf{A})$, and $D_{\mathbf{A}}$ is the determinant of $\mathbf{A}$. $(\mathbf{A})_{i,j}$ represents the element located at the ith row and the jth column of $\mathbf{A}$. $E\{x\}$ is the expectation of the random variable x. $|x|$ denotes the norm of x, where x can be a number, a vector or a matrix. $\lfloor a \rfloor$ denotes the floor function of a. $\Gamma(\cdot)$ is the Gamma function. ci(·) and si(·) are the cosine integral and the sine integral, respectively. ψ(·) is the Euler psi function, and Ei(·) is the exponential integral function.

System model and protocol description
Consider a communication scenario including M users, one base station and one relay. As shown in Figure 1, the base station and the relay are equipped with M and N antennas, respectively, and each user is equipped with a single antenna. It is assumed that the number of relay antennas satisfies N > M in order to meet the power constraint and to ensure that enough degrees of freedom are provided. The half-duplex constraint is applied to all nodes. All the channels are assumed to be quasi-static, independent and identically distributed Rayleigh fading, and there is no direct link between the base station and the users. Both the base station and the relay have access to the global CSI.
For such a scenario, the base station is required to transmit M messages to the M users individually, while each user wants to send its own information to the base station simultaneously. By jointly designing the precoding matrices at the base station and the relay, the messages from and to the same user can be aligned together and transmitted to the users without co-channel interference, provided that perfect CSI can be obtained. In the following part, we introduce such a signal alignment scheme, which can accomplish the transmission between the base station and the M users in two time slots. Unlike the ZF-SI scheme proposed in [10], orthogonal projection is applied for the relay precoding design in the proposed OP-SI scheme, which avoids the severe relay noise amplification caused by the ZF-SI scheme and achieves significant performance gains.

Figure 1 System diagram for the addressed bidirectional communication scenario. This figure is provided as an illustration for the system model, which helps the readers better understand the transmission procedure.
During the first time slot, the base station and all the users transmit their messages to the relay simultaneously. To ensure that the matched messages from the base station and the users can be grouped together at the relay, the base station transmits precoded messages, where $\mathbf{P}$ denotes the $M \times M$ precoding matrix at the base station. Then the relay receives

$$\mathbf{r} = \mathbf{G}\mathbf{P}\mathbf{s} + \mathbf{H}\mathbf{x} + \mathbf{n}_R, \tag{1}$$

where $\mathbf{r}$ is the $N \times 1$ observation vector at the relay, $\mathbf{G}$ is the $N \times M$ channel matrix between the base station and the relay, $\mathbf{H}$ is defined similarly for the relay and the users, $\mathbf{n}_R$ is the $N \times 1$ additive Gaussian noise vector at the relay, $\mathbf{s} = [s_1, \ldots, s_M]^T$ and $\mathbf{x} = [x_1, \ldots, x_M]^T$ are the $M \times 1$ power-normalized message vectors from the base station and the users, and $x_i$ and $s_i$ denote the messages from and to the ith user, respectively. During the second time slot, the relay broadcasts the precoded observation to the base station and all the users. Denoting $\mathbf{Q}$ as the precoding matrix at the relay, the relay transmission is $\mathbf{t}_R = \mathbf{Q}\mathbf{r}$. Assuming reciprocal channels, as the use of $\mathbf{h}_i^H$ for the relay-to-user link indicates, the observation at the base station can be given as

$$\mathbf{y}_B = \mathbf{G}^H\mathbf{Q}\mathbf{r} + \mathbf{n}_B, \tag{2}$$

and the observation at the ith user can be given as

$$y_i = \mathbf{h}_i^H\mathbf{Q}\mathbf{r} + n_i, \tag{3}$$

where $\mathbf{h}_i$ is the $N \times 1$ channel vector between the relay and the ith user, and $\mathbf{n}_B$ and $n_i$ are the receive Gaussian noises at the base station and the ith user, respectively.

The precoding design at the base station
The purpose of designing the precoding matrix P is to ensure that the relay can align the messages from and to the same user. To achieve such a target, the precoding matrix at the base station is defined as Note that a major part of this article focuses on the performance evaluation of the following proposed precoding scheme, which is complicated and intractable. To simplify the derivation, the long-term power constraint is used in this article, which has been commonly used as shown in [15,16]. In this article, we assume that the transmission power at each antenna is constrained at 1, and therefore the total transmission power at the base station can be derived as To obtain the simplified expression of α B , the following lemma is introduced.

Lemma 1.
Denoting that A = GG H and B = HH H , we can derive , and the numbers of antennas must follow the constraint that N > M.
Proof. Please refer to the Appendix 1.
Applying the upper bound provided by Lemma 1, the power constraint factor $\alpha_B$ is defined in (7) as a constant that depends only on M and N. With the precoding matrix proposed in (4), the relay obtains the aligned messages, which can be expressed as

$$\mathbf{r} = \mathbf{H}(\alpha_B\mathbf{s} + \mathbf{x}) + \mathbf{n}_R. \tag{8}$$

The precoding design at the relay
As given by (1), the aligned messages for all the users are mixed together, which causes strong interference at the user side. To eliminate such interference, we introduce the relay precoding design in the following part. First, we can rewrite (1) as the mixture of the aligned messages for the users,

$$\mathbf{r} = \sum_{i=1}^{M}\mathbf{h}_i(\alpha_B s_i + x_i) + \mathbf{n}_R. \tag{9}$$

The key idea of the OP-SI scheme at the relay is to project the desired message onto an orthogonal direction of the null space, i.e., the aligned message for the ith user should be projected onto a vector $\mathbf{u}_i$ that belongs to the null space generated by $\mathbf{h}_1, \ldots, \mathbf{h}_{i-1}, \mathbf{h}_{i+1}, \ldots, \mathbf{h}_M$. Such a vector can be generated by the Gram-Schmidt process applied to $\mathbf{h}_1, \ldots, \mathbf{h}_M$, which can be presented as the determinant formula (10) [17], where $\mathbf{W}_i$ is the $N \times (M-1)$ submatrix of the channel matrix $\mathbf{H}$ obtained by removing its ith column, and $\mathbf{W}_j$ is defined similarly. Since $\mathbf{u}_i$ ensures that $\mathbf{u}_i^H\mathbf{h}_j = 0$ $(i \neq j)$, the precoding matrix $\mathbf{Q}$ can be generated as

$$\mathbf{Q} = \alpha_R\mathbf{U}\mathbf{U}^H, \tag{11}$$

where $\alpha_R$ is defined as the long-term relay power constraint, and $\mathbf{U} = [\mathbf{u}_1, \ldots, \mathbf{u}_M]$. The precoding matrix diagonalizes the channel matrix, i.e., $\mathbf{H}^H\mathbf{Q}\mathbf{H}$ is diagonal, which ensures that the interference at each user can be eliminated. When the relay has a large number of antennas, the total transmission power at the relay can be approximated as in (13), and $\alpha_R$ is set as in (14) to satisfy the transmit power constraint. With both precoding matrices constructed, we can rewrite the observation at the ith user as

$$y_i = \alpha_R\mathbf{h}_i^H\mathbf{U}\mathbf{U}^H\mathbf{h}_i(\alpha_B s_i + x_i) + \alpha_R\mathbf{h}_i^H\mathbf{U}\mathbf{U}^H\mathbf{n}_R + n_i, \tag{15}$$

and the observation at the base station follows in the same way by applying $\mathbf{Q}$ to the aligned observation at the relay.
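The null-space construction described in this section can be checked numerically. The following is a minimal sketch, not the determinant formula of [17]: it obtains each $\mathbf{u}_i$ by removing from $\mathbf{h}_i$ its component in the span of the other channel vectors (one Gram-Schmidt step against an orthonormal basis of $\mathbf{W}_i$). The function name `op_vectors`, the unit-norm scaling of $\mathbf{u}_i$ and the unit-variance channel entries are assumptions made for illustration.

```python
import numpy as np

def op_vectors(H):
    """Return U = [u_1, ..., u_M] with u_i orthogonal to every h_j, j != i.

    H is the N x M relay-to-user channel matrix (M >= 2, N >= M assumed).
    """
    N, M = H.shape
    U = np.zeros((N, M), dtype=complex)
    for i in range(M):
        W = np.delete(H, i, axis=1)            # W_i: H without its ith column
        Qw, _ = np.linalg.qr(W)                # orthonormal basis of span(W_i)
        h = H[:, i]
        residual = h - Qw @ (Qw.conj().T @ h)  # part of h_i outside span(W_i)
        U[:, i] = residual / np.linalg.norm(residual)  # unit norm: assumption
    return U

rng = np.random.default_rng(0)
N, M = 4, 3
# i.i.d. Rayleigh fading with unit-variance entries (normalization assumed)
H = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)

U = op_vectors(H)
D = U.conj().T @ H                   # u_i^H h_j for all (i, j)
off_diag = D - np.diag(np.diag(D))   # should vanish: no co-channel interference
print(np.max(np.abs(off_diag)))
```

Because $\mathbf{U}^H\mathbf{H}$ is diagonal, any relay precoder built from $\mathbf{U}\mathbf{U}^H$ delivers to user i only its own aligned message, without inverting an $M \times M$ matrix.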

Performance analysis for the proposed OP-SI scheme
Due to the single antenna setting and the limited processing capability, the reception reliability at the user nodes becomes the bottleneck for the system transmissions. Therefore we will focus on the evaluation of the performance at the user node. Particularly both the ergodic sum rate and the outage capacity for the OP-SI scheme are studied in this section.

Ergodic sum rate at the users with perfect CSI
Due to the observed signal given by (15), the SNR for the ith user can be expressed as To obtain the ergodic sum rate at the user, the joint PDF must be derived first, whose approximation is provided by the following lemma.
Lemma 2. $x_1$ and $x_2$ are independent when the number of relay antennas N is large enough, and their joint PDF is the product of a chi-squared PDF and an exponential PDF.
Proof. The key idea is to prove that the two Gaussian variables $\mathbf{h}_i$ and $\mathbf{h}_i^H\mathbf{u}_i$ are independent when the number of relay antennas is large, which ensures that their squared norms $|\mathbf{h}_i|^2$ and $|\mathbf{h}_i^H\mathbf{u}_i|^2$ are also independent. $\mathbf{h}_i$ and $\mathbf{h}_i^H\mathbf{u}_i$ are jointly normally distributed, since each linear combination of their components is normally distributed [18]. Denoting $\mathbf{z}_1 = \mathbf{h}_i$ and $z_2 = \mathbf{h}_i^H\mathbf{u}_i$, their joint PDF is a multivariate Gaussian PDF with an $(N+1) \times (N+1)$ covariance matrix. Recalling the generation of $\mathbf{u}_i$, $\mathbf{H}^H\mathbf{H}$ in (10) is a complex Wishart matrix, whose determinant is distributed as the product of M independent chi-squared random variables $v_1, \ldots, v_M$. When the number of relay antennas N tends to infinity, the degrees of freedom of $v_k$ also approach infinity for a fixed M. By the law of large numbers, $v_k$ becomes a constant, which can be given as $v_k = 2(N + 1 - k)$. Since $\mathbf{W}_i$ is an $N \times (M-1)$ submatrix of $\mathbf{H}$, a similar expression for $\mathbf{W}_i$ can be obtained as well. Under this condition, $\mathbf{u}_i$ in (10) can be further simplified. Substituting the simplified $\mathbf{u}_i$ into the covariance matrix, it becomes an identity matrix, which ensures the independence of $\mathbf{h}_i$ and $\mathbf{h}_i^H\mathbf{u}_i$ [19], and therefore of their squared norms. It is well known that $|\mathbf{h}_i|^2$ and $|\mathbf{h}_i^H\mathbf{u}_i|^2$ are chi-squared distributed and exponentially distributed, respectively, so the joint PDF is the product of these two marginal PDFs. This completes the proof.
Based on the conclusion in Lemma 2, the following theorem about the ergodic sum rate at the users with perfect CSI can be presented.

Theorem 3. When the number of relay antennas N is large, the ergodic sum rate for the ith user with perfect CSI can be given as
Proof. Using the SNR given by (17) and the joint PDF in Lemma 2, the ergodic sum rate $C_i^{\mathrm{OP-SI}}$ can be written as the sum of two terms, as in (24). To achieve a closed-form expression for the first term, an approximation for a large number of relay antennas is applied. Note that the base station power constraint $\alpha_B$ becomes large as N is large, and thus $\rho\alpha_B^2\alpha_R^2x_1^2 + 1$ grows large as well, so that the first term can be approximated in closed form [20]. The second term of $C_i^{\mathrm{OP-SI}}$ in (24) can be evaluated in closed form as well. Substituting (26) and (27) into (24) proves the theorem.
To further evaluate the performance of our proposed OP-SI scheme, the ZF-SI scheme introduced in [10] is selected for comparison. The analysis result is presented by the following corollary.

Corollary 4.
The proposed OP-SI scheme can achieve a higher ergodic sum rate than the ZF-SI scheme presented in [10], and the ergodic capacity gap between the two schemes at high SNR is provided as follows, where $C_i^{\mathrm{ZF-SI}}$ denotes the ergodic sum rate for the ZF-SI scheme.
Proof. To achieve the comparison results, we first derive the ergodic capacity for the ZF-SI scheme. Compared to OP-SI scheme, the same precoding is applied at the base station, while the ZF beamforming is utilized at the relay, which can be given as And then the SNR for ZF-SI scheme at the ith user can be derived as where β 2 B and β 2 R are the power constraints at the base station and the relay, respectively. To simplify the derivation, the long-term power constraint is also used for the ZF-SI scheme, and the average power at the relay can be presented as where the last equation follows the fact that (H H H) −1 is an inverse Wishart matrix [21]. Then β R can be given as and β B = α B at the base station for the ZF-SI scheme.
In the high-SNR region, the ergodic sum rate of the ZF-SI scheme and that of the OP-SI scheme can each be approximated in closed form, and the difference between the two approximations gives the gap between the ergodic capacities stated in the corollary. This completes the proof.
Corollary 4 shows clearly that our proposed scheme can achieve a higher ergodic sum rate than the ZF-based scheme in [10]. According to the protocol description, the key idea of the relay precoding design in this article is quite similar to the block diagonalization (BD) scheme, where the co-channel interference is eliminated by using the null space of the channel matrix. For the addressed scenario, BD-based precoding schemes can outperform the ZF-based scheme, and such a phenomenon has been previously reported in [23].
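The inverse-Wishart step used in the proof of Corollary 4 can be illustrated with a short Monte Carlo check. The sketch below does not reproduce the closed form of $\beta_R$; it only verifies the standard identity $E\{\mathrm{tr}[(\mathbf{H}^H\mathbf{H})^{-1}]\} = M/(N-M)$ for an $N \times M$ matrix with i.i.d. unit-variance complex Gaussian entries and $N > M$. The unit-variance normalization, the antenna numbers and the function name `mean_trace_inv_wishart` are assumptions for illustration.

```python
import numpy as np

def mean_trace_inv_wishart(N, M, trials, rng):
    """Monte Carlo estimate of E{ tr[(H^H H)^{-1}] } for an N x M Rayleigh channel."""
    total = 0.0
    for _ in range(trials):
        H = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
        gram = H.conj().T @ H                      # M x M complex Wishart matrix
        total += np.trace(np.linalg.inv(gram)).real
    return total / trials

rng = np.random.default_rng(1)
N, M = 8, 2                      # illustrative sizes, chosen so the variance is finite
estimate = mean_trace_inv_wishart(N, M, 20000, rng)
expected = M / (N - M)           # first moment of the inverse complex Wishart matrix
print(estimate, expected)
```

The identity shows why ZF relaying is costly when N is close to M: the average noise amplification scales as $1/(N-M)$, which is the effect the orthogonal projection design is intended to avoid.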

Ergodic sum rate at the users with channel estimation error
Based on the perfect CSI assumption, the ergodic sum rate for the OP-SI scheme has been derived in the previous part, where the interference can be eliminated completely. However, restricted by imperfect feedback or signal processing, it is hard to obtain perfect CSI in practice. Typically, the channel estimation error can be modeled as follows. The perfect CSI $\mathbf{h}_i$ can be expressed as the sum of the channel estimate $\hat{\mathbf{h}}_i$ and the estimation error $\mathbf{e}_i$,

$$\mathbf{h}_i = \hat{\mathbf{h}}_i + \mathbf{e}_i. \tag{36}$$

As discussed in [12][13][14], $\hat{\mathbf{h}}_i$ and $\mathbf{e}_i$ are independent, and their entries are complex Gaussian distributed with variances $(1 - \sigma_e^2)$ and $\sigma_e^2$, respectively. When the channel estimation is imperfect, the interference at the users cannot be removed, since the precoding matrices are derived from the channel estimates. The observations at the relay, at the base station and at the ith user then contain residual terms caused by the estimation errors, and the signal to interference plus noise ratio (SINR) for the ith user can be expressed as in (40), where $\xi_s = \alpha_B^2\alpha_R^2\sigma_e^2|\hat{\mathbf{U}}\hat{\mathbf{U}}^H\hat{\mathbf{H}}|^2$ and $\xi_x = \alpha_R^2\sigma_e^2\left(|\hat{\mathbf{U}}\hat{\mathbf{U}}^H\hat{\mathbf{H}}|^2 + 3|\hat{\mathbf{h}}_i^H\hat{\mathbf{U}}\hat{\mathbf{U}}^H|^2\right)$ denote the interference caused by the imperfect reception of $\mathbf{s}$ and $\mathbf{x}$, respectively, and $\tilde{n}_i = \frac{1}{\rho}\left(\alpha_R^2|\hat{\mathbf{h}}_i^H\hat{\mathbf{U}}\hat{\mathbf{U}}^H|^2 + \sigma_e^2|\hat{\mathbf{U}}\hat{\mathbf{U}}^H|^2 + 1\right)$ is the total noise at the ith user. We then have the following theorem about the ergodic sum rate of the OP-SI protocol.

Theorem 5. When the number of base station antennas M is large, the ergodic capacity with channel estimation error can be approximately expressed as
Proof. First, we focus on the signal part of $\mathrm{SINR}_i^{\mathrm{OP-SI}}$, where $|\hat{\mathbf{h}}_i|^2$ can be expressed as $|\hat{\mathbf{h}}_i|^2 = \sum_{k=1}^{N}|\hat{h}_{ik}|^2$, and each $|\hat{h}_{ik}|^2$ is exponentially distributed with mean $(1 - \sigma_e^2)$. By the strong law of large numbers, the average of $|\hat{h}_{ik}|^2$ is close to its expectation $E(|\hat{h}_{ik}|^2) = 1 - \sigma_e^2$ when N is large, so the fourth-order channel term in (40) can be replaced by its deterministic large-N value. Then we turn to the noise part $\tilde{n}_i$ in (40), which can be simplified in the same way. Based on the foregoing approximations, $\mathrm{SINR}_i^{\mathrm{OP-SI}}$ can be rewritten as in (42), where $\hat{x}_1 = |\hat{\mathbf{h}}_i^H\hat{\mathbf{u}}_i|^2$ and $\hat{x}_2 = \sum_{j \neq i}|\hat{\mathbf{h}}_j^H\hat{\mathbf{u}}_j|^2$ follow the exponential distribution and the chi-squared distribution, respectively. Similar to the signal part, $\hat{x}_2$ approaches $(M-1)(1 - \sigma_e^2)$ when the base station is equipped with a large number of antennas. Averaging the resulting rate over $\hat{x}_1$ then gives the ergodic sum rate $C_i^{\mathrm{OP-SI}}$ for the ith user with imperfect CSI stated in the theorem. This completes the proof.
By using $C_i^{\mathrm{OP-SI}}$ introduced in the above theorem, we can analyze the impact of channel estimation error on the ergodic sum rate. When the channel estimation is free of error, both intra-stream and inter-stream interference can be perfectly avoided, and $C_i^{\mathrm{OP-SI}}$ equals the ergodic sum rate with perfect CSI. When the channel estimation error exists with a fixed variance $\sigma_e^2$, the high-SNR limit of the expression in Theorem 5 shows that $C_i^{\mathrm{OP-SI}}$ tends to a constant as the SNR ρ increases, which means that the gap between the ergodic sum rates with and without channel estimation error grows without bound. Therefore, the system capacity is severely limited by the existence of channel estimation error, and it is necessary to improve the channel estimation accuracy.
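The estimation-error model of this subsection, and the law-of-large-numbers step used in the proof of Theorem 5, can be sketched numerically. The code below draws a channel estimate and an independent error with per-entry variances $(1 - \sigma_e^2)$ and $\sigma_e^2$, and checks that the sample average of $|\hat{h}_{ik}|^2$ concentrates around $1 - \sigma_e^2$. The function name `split_channel`, the vector length and the value $\sigma_e^2 = 0.05$ are illustrative assumptions; the length is chosen very large only to make the concentration visible.

```python
import numpy as np

def split_channel(N, sigma_e2, rng):
    """Draw h_hat and e so that h = h_hat + e has unit-variance entries."""
    scale_hat = np.sqrt((1.0 - sigma_e2) / 2.0)   # per real/imaginary component
    scale_err = np.sqrt(sigma_e2 / 2.0)
    h_hat = scale_hat * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
    e = scale_err * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
    return h_hat, e, h_hat + e

rng = np.random.default_rng(2)
N, sigma_e2 = 100_000, 0.05
h_hat, e, h = split_channel(N, sigma_e2, rng)

var_hat = np.mean(np.abs(h_hat) ** 2)   # approaches 1 - sigma_e2
var_err = np.mean(np.abs(e) ** 2)       # approaches sigma_e2
var_h = np.mean(np.abs(h) ** 2)         # approaches 1
print(var_hat, var_err, var_h)
```

Since the precoders are built from $\hat{\mathbf{h}}_i$ only, the component $\mathbf{e}_i$ is what leaks through as the residual interference terms $\xi_s$ and $\xi_x$.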

Outage capacity at the users for OP-SI
To further evaluate the robustness of proposed OP-SI scheme, the analysis of outage capacity for each user is necessary. In this section, the outage capacities with and without channel estimation error are both studied. First, the definition of outage capacity can be presented as follows,

Definition 1. Outage capacity is the maximum transmission rate with a specified outage probability, which can be defined as
where R is the transmission data rate, p tar is the threshold for outage probability, and P out (R) is the outage probability that can be defined as follows where C is the capacity.
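Definition 1 can be illustrated numerically. The following sketch assumes that $P_{\mathrm{out}}(R) = \Pr(C < R)$ and takes the outage capacity as the largest rate R whose outage probability does not exceed $p_{\mathrm{tar}}$, estimated as an empirical quantile of capacity samples. The capacity samples come from a single Rayleigh-faded link, $C = \log_2(1 + \rho g)$ with exponentially distributed g, which is a placeholder and not the OP-SI SNR in (17); the function names and the value $p_{\mathrm{tar}} = 0.1$ are also assumptions.

```python
import numpy as np

def outage_probability(capacity_samples, rate):
    """Empirical P_out(R) = Pr(C < R)."""
    return np.mean(capacity_samples < rate)

def outage_capacity(capacity_samples, p_tar):
    """Largest rate whose empirical outage probability is about p_tar."""
    return np.quantile(capacity_samples, p_tar)

rng = np.random.default_rng(3)
rho = 10 ** (15 / 10)                         # SNR of 15 dB in linear scale
gain = rng.exponential(1.0, size=200_000)     # placeholder fading gain (assumption)
capacity = np.log2(1.0 + rho * gain)

p_tar = 0.1
c_out = outage_capacity(capacity, p_tar)
p_check = outage_probability(capacity, c_out)

# Closed form for this placeholder link: 1 - exp(-(2^R - 1)/rho) = p_tar
c_out_exact = np.log2(1.0 - rho * np.log(1.0 - p_tar))
print(c_out, c_out_exact, p_check)
```

The same two functions apply unchanged to capacity samples generated from any SNR or SINR model, which is how a symbolic inverse $(P_{\mathrm{out}})^{-1}(\cdot)$ can be evaluated when no closed form is available.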
According to the definition of outage capacity, the outage probability should be derived at first. Since the proposed OP-SI scheme aims to remove the co-channel interference at the users, the outage probability for each user is considered independently in this article, and then we can provide the following theorem about outage probability with perfect CSI.

Theorem 6. The closed form expression of outage probability for the ith user can be expressed as
where R tar i is the target transmission data rate for the ith user.
By setting R tar i = C out−i and P i out (R tar i ) = p i , which is a specified value of outage probability, the outage capacity for the ith user with perfect CSI can be presented as where (P i out ) −1 (·) is the inverse function of the outage probability P i out (R).
Proof. When perfect CSI is available, the SNR for the ith user is provided by (17). According to the definition of outage probability in (46), the outage probability for the ith user, $P^i_{\mathrm{out}}(R^{\mathrm{tar}}_i)$, is the probability in (49) that the rate supported by this SNR falls below $R^{\mathrm{tar}}_i$, where $R^{\mathrm{tar}}_i$ is the target transmission data rate for the ith user, and $x_1$ and $x_2$ follow the notation in the previous section. Substituting the joint PDF of $x_1$ and $x_2$ into (49) gives the expression of $P^i_{\mathrm{out}}(R^{\mathrm{tar}}_i)$ stated in the theorem, where the result holds when ρ is large enough. Since the expression of $P^i_{\mathrm{out}}(R^{\mathrm{tar}}_i)$ is complicated, a closed-form expression of $C_{\mathrm{out}-i}$ is not easily tractable. In fact, the outage capacity is still intractable even in simpler scenarios. Therefore, we can only obtain a symbolic expression of $C_{\mathrm{out}-i}$. In particular, by setting $R^{\mathrm{tar}}_i = C_{\mathrm{out}-i}$, the outage capacity for the ith user can be expressed through the inverse function of the outage probability for a given outage probability $p_i$. This completes the proof.
Then we focus on the outage capacity for the user with channel estimation error. Similar to the analysis above, the closed-form expression of the outage probability needs to be derived first, and then the outage capacity equals the target data rate with a fixed outage probability. The analysis results are presented in the following theorem.

Theorem 7. When the channel estimation error exists, the outage probability P i out (R tar i ) can be presented as R σ 2 e ρ and all the other notations follow the previous subsection.
By setting R tar i = C out−i and P i out (R tar i ) = p i , which is a specified value of outage probability, the outage capacity for the ith user with channel estimation error can be presented as where (P i out ) −1 (·) is the inverse function of the outage probability P i out (R).
Proof. Similar to the proof of Theorem 6, the outage probability P i out (R tar i ) is first derived. By using the SINR i for the ith user given by (42), P i out (R tar i ) can be expressed as where all the notations follow the previous subsection.
As described in the proof of Theorem 5,x 1 andx 2 are two independent random variables that follow exponential and Chi-squared distributions, respectively. Then the derivation of P i out (R tar i ) can be given as follows, Then the outage capacity C out−i can be derived by deriving the inverse function of P i out (C out−i ) with a fixed outage probability p i , which can be expressed as And the theorem has been proved.

The enhanced precoding design for the OP-SI scheme at the relay
In Section 'System model and protocol description', the design for the precoding matrix Q at the relay was introduced, whose vectors are generated by the Gram-Schmidt process. In fact, there exist other available vectors for Q when N > M. To further improve the performance gain, an enhanced OP-SI (eOP-SI) scheme is described in this section, where the relay follows an improved precoding generation with optimal precoding selection.

Figure 2 Total transmit power at the base station and the relay for OP-SI with different power constraints. The total transmit power at the base station and at the relay is plotted with the long-term power constraints in (7) and (14) and with the instantaneous power constraints. The numbers of antennas at the relay and the base station are set as N = 3 and M = 2, respectively, and the noise power is fixed as −10 dBm. As shown in the figure, the differences between the schemes using the long-term and instantaneous power constraints are quite small.

Figure 3 Ergodic sum rate of OP-SI with different power constraints. The ergodic sum rate of OP-SI with different power constraint schemes is plotted by Monte Carlo simulation, where M = 2, 4, 6 and N = M + 1. As shown in the figure, the ergodic sum rate curves of OP-SI with the two types of power constraints are very close in the high SNR region, and the performance gap is less than 0.2 bit/s/Hz.
To eliminate the interference at the users, the relay precoding matrix can be designed as follows. Based on a submatrix of H, an orthogonal projection matrix U † i can be generated for the aligned message of s i and x i , Due to the definition of the orthogonal projection matrix, the null space dimension for U † i is (M − 1). Therefore, the dimension of the signal space equals to (N − M + 1), and the fact that can be easily observed [6]. Thus U † i is qualified to group the messages from and to the ith user together without interference, and M precoding matrices are needed. It will put a heavy burden on the relay. To derive a simplified precoding matrix at the relay, U † i can be further decomposed as where u † i,1 , . . . , u † i,(N−M+1) are the normalized basis vectors of the non-null space. The decomposition follows from the fact that the eigenvalues of U † i are either 1 or 0.
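The projection matrix and its decomposition can be sketched numerically. The code below assumes the standard orthogonal projector onto the complement of the span of $\mathbf{W}_i$, namely $\mathbf{I} - \mathbf{W}_i(\mathbf{W}_i^H\mathbf{W}_i)^{-1}\mathbf{W}_i^H$, as the form of $\mathbf{U}^\dagger_i$; this choice, the function name `projection_and_basis` and the antenna numbers are illustrative assumptions. It then recovers the $(N - M + 1)$ candidate vectors as the eigenvectors with eigenvalue 1.

```python
import numpy as np

def projection_and_basis(H, i):
    """Projector onto the orthogonal complement of span(W_i) and its unit-eigenvalue basis."""
    N, M = H.shape
    W = np.delete(H, i, axis=1)                       # W_i: H without column i
    P = np.eye(N) - W @ np.linalg.inv(W.conj().T @ W) @ W.conj().T
    P = (P + P.conj().T) / 2                          # enforce Hermitian symmetry numerically
    eigvals, eigvecs = np.linalg.eigh(P)              # eigenvalues are (numerically) 0 or 1
    basis = eigvecs[:, eigvals > 0.5]                 # candidate precoding vectors
    return P, eigvals, basis

rng = np.random.default_rng(4)
N, M = 5, 3
H = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)

P, eigvals, basis = projection_and_basis(H, 0)
leak = basis.conj().T @ np.delete(H, 0, axis=1)       # candidates against other users' channels
print(basis.shape, np.round(eigvals, 6), np.max(np.abs(leak)))
```

Each column of `basis` is orthogonal to the channels of all users other than user i, so any of them can serve as a relay precoding vector for that user.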
Therefore, each column vector of $\mathbf{U}^\dagger_i$ can remove the interference. Assuming that a qualified vector $\mathbf{u}^\dagger_i$ has been selected for each user, the precoding matrix $\mathbf{Q}$ can be generated in the same way as in the OP-SI scheme, with $\alpha_R$ defined as the relay power constraint and the selected vectors $\mathbf{u}^\dagger_1, \ldots, \mathbf{u}^\dagger_M$ taking the place of $\mathbf{u}_1, \ldots, \mathbf{u}_M$. This precoding matrix again diagonalizes the channel matrix, which ensures that the interference at each user can be eliminated.

Figure 4 Comparison of the ergodic sum rate between the OP-SI scheme and the ZF-SI scheme with perfect CSI. The number of users is set as 2, 4 and 6, respectively, and the number of relay antennas is M + 1. As shown in the figure, the OP-SI scheme always achieves a higher ergodic sum rate than the ZF-SI scheme.
According to the SNR given by (17), the transmission performance of the ith user is decided by the chosen precoding vector $\mathbf{u}^\dagger_i$. To improve the transmission performance, an appropriate precoding matrix $\mathbf{Q}_{\mathrm{opt}}$ should be generated, and the optimal precoding vector for each user is selected as the candidate that maximizes the achieved SNR,

$$\mathbf{u}^\dagger_i = \arg\max_{\mathbf{u}^\dagger_{i,k}} \mathrm{SNR}_{i,k},$$

where $\{\mathbf{u}^\dagger_{i,1}, \ldots, \mathbf{u}^\dagger_{i,(N-M+1)}\}$ is the set of available precoding vectors for the ith user, and $\mathrm{SNR}_{i,k}$ denotes the SNR achieved by the ith user when $\mathbf{u}^\dagger_{i,k}$ is applied. Furthermore, the power is allocated equally to each antenna in OP-SI. To further improve the transmission performance, power optimization can be utilized at the relay. Since the ergodic sum rate is studied in this article, the objective of power allocation can be set as maximizing the transmit sum rate, and the globally optimal solution can be achieved by water filling [24].
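The water-filling allocation mentioned above can be sketched in its textbook form. The code maximizes $\sum_k \log_2(1 + p_k g_k)$ subject to $\sum_k p_k = P$ over parallel interference-free streams. The effective gains $g_k$ and the power budget are hypothetical numbers, not values derived from the OP-SI SNR, and the function name `water_filling` is an assumption.

```python
import numpy as np

def water_filling(gains, total_power):
    """Classic water filling over parallel channels with strictly positive gains."""
    gains = np.asarray(gains, dtype=float)
    order = np.argsort(gains)[::-1]          # strongest channel first
    inv = 1.0 / gains[order]                 # "floor heights" 1/g_k
    alloc = np.zeros_like(inv)
    for k in range(len(inv), 0, -1):         # try k active channels, weakest dropped first
        level = (total_power + inv[:k].sum()) / k
        if level > inv[k - 1]:               # weakest active channel still gets power
            alloc[:k] = level - inv[:k]
            break
    powers = np.zeros_like(alloc)
    powers[order] = alloc                    # undo the sorting
    return powers

gains = np.array([2.0, 1.0, 0.1])            # hypothetical per-stream gains
total_power = 1.0
powers = water_filling(gains, total_power)

rate_wf = np.sum(np.log2(1.0 + powers * gains))
rate_eq = np.sum(np.log2(1.0 + (total_power / len(gains)) * gains))
print(powers, rate_wf, rate_eq)
```

For these gains the weakest stream receives no power and the sum rate exceeds that of the equal allocation, which is the kind of gain the relay power optimization targets.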
When the channel estimation error exists, which is unpredictable, the exact SINR cannot be provided for the relay node. A feasible solution is to select the relay precoding matrix according to the estimated SNR, which can also improve the transmission performance as shown in the following section.

Numerical results
In this section, the performance of the proposed OP-SI scheme is evaluated based on Monte Carlo simulations. In Figure 2, the total transmit power at the base station and at the relay is plotted, and the scheme used for comparison is based on the instantaneous power constraint. The numbers of antennas at the relay and the base station are set as N = 3 and M = 2, respectively, and the noise power is fixed as −10 dBm. As shown in the figure, the transmit power with the long-term power constraint is very close to that with the instantaneous power constraint, which implies that the bound used is quite tight. Similarly, the difference in relay transmit power between the schemes using the long-term and instantaneous power constraints is also quite small.

Figure 5 Ergodic sum rate for the OP-SI scheme with channel estimation error. The number of users is fixed as M = 2, and the number of relay antennas is N = 3, while $\sigma_e^2$, the variance of the channel estimation error, is set as 0, 0.01, 0.05 and 0.1, respectively. The simulation results show that the performance of the OP-SI scheme is severely impacted by the channel estimation error.
The ergodic sum rate of OP-SI with different power constraint schemes is also plotted by Monte Carlo simulation in Figure 3. As shown in the figure, the ergodic sum rate curves of OP-SI with the two types of power constraints are very close in the high SNR region, and the performance gap is less than 0.2 bit/s/Hz. Therefore, although the ergodic sum rate is derived by using the long-term power constraints in this article, it also provides some insight into the performance of the scheme with the instantaneous power constraint. Figure 4 shows the ergodic sum rate of each user with perfect CSI, where the numbers of users and relay antennas are set as M = 2, 4, 6 and N = M + 1, respectively. To show the performance gains clearly, the ZF-SI scheme is selected for comparison. As shown in the figure, the capacity of each user decreases as the number of users increases, and the OP-SI scheme always achieves higher capacity than the ZF-SI scheme. Such a result verifies the analysis given by Corollary 4. Moreover, the numerical results based on Theorem 3 are provided as well, and the simulation results confirm that our derived closed-form expression for the ergodic sum rate closely matches the Monte Carlo results, specifically in the high SNR region.
In Figure 5, the number of users is fixed as M = 2, and the number of relay antennas is N = 3, while different values of $\sigma_e^2$, the variance of the channel estimation error, are set. When the OP-SI scheme is free of channel estimation error, the ergodic sum rate increases linearly as the SNR rises. Note that a ceiling of the ergodic sum rate appears when the CSI is not perfect, and the curves saturate faster as $\sigma_e^2$ increases, which means that the system capacity is seriously affected by the channel estimation error. In addition, the approximation presented by Theorem 5 is also shown. The numerical results demonstrate that the provided approximation is quite close to the curves plotted by Monte Carlo simulation. Figures 6 and 7 provide the performance evaluation for the eOP-SI scheme proposed in the previous section, where the number of users is set as M = 2. Figure 6 presents the ergodic sum rate of the eOP-SI scheme with and without channel estimation error. Compared with OP-SI, the eOP-SI scheme always achieves higher ergodic capacity, whether or not perfect CSI is available. The capacity gap between the two schemes first grows as the SNR increases, and then stabilizes in the high SNR region. In Figure 7, the ergodic sum rate with different numbers of relay antennas is presented, where the number of relay antennas is set as N = 3, 4, 5, respectively. As shown in the figure, the ergodic sum rate of the eOP-SI scheme rises as the number of relay antennas increases.

Figure 6 Ergodic sum rate for the eOP-SI scheme. The number of users is set as 2, and the number of relay antennas is 4. Compared with the OP-SI scheme, the eOP-SI scheme always achieves higher ergodic capacity.
Based on the simulation results, the optimal relay precoding selection of the eOP-SI scheme achieves a higher ergodic sum rate than the fixed relay precoding design of the OP-SI scheme, and this performance gain grows as the number of relay antennas increases. Figure 8 shows the relationship between outage probability and outage capacity. In particular, the numbers of antennas at the base station and the relay are set as M = 2 and N = 3, respectively, and the SNR is fixed as ρ = 15 dB. As shown in the figure, the outage capacity increases as the tolerated outage probability increases, and the outage performance becomes worse as the variance of the channel estimation error $\sigma_e^2$ increases. The simulation results also show that the derived analytical results are quite close to the Monte Carlo simulation.
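The relationship between outage probability and outage capacity can be sketched numerically by treating the ε-outage capacity as the ε-quantile of the instantaneous rate. The example below does not reproduce the article's outage analysis: it assumes a scalar Rayleigh-fading link with SINR ρ|h|² / (1 + ρ σ_e²), and the function names `simulate_rates` and `outage_capacity` are hypothetical.

```python
import numpy as np

def simulate_rates(snr_db, sigma_e2, n_trials=200_000, seed=0):
    """Draw instantaneous rates (bit/s/Hz) for an assumed scalar Rayleigh link."""
    rng = np.random.default_rng(seed)
    rho = 10.0 ** (snr_db / 10.0)
    # |h|^2 of a unit-variance Rayleigh-fading channel is exponential with mean 1.
    gain = rng.exponential(1.0, size=n_trials)
    # Assumed toy model: estimation error adds interference that scales with rho.
    sinr = rho * gain / (1.0 + rho * sigma_e2)
    return np.log2(1.0 + sinr)

def outage_capacity(rates, eps):
    """Empirical eps-outage capacity: the eps-quantile of the rate samples."""
    return float(np.quantile(rates, eps))

eps_grid = [0.01, 0.05, 0.1, 0.2]
rates_perfect = simulate_rates(15.0, 0.0)   # no estimation error
rates_noisy = simulate_rates(15.0, 0.05)    # sigma_e^2 = 0.05 (illustrative value)
caps_perfect = [outage_capacity(rates_perfect, e) for e in eps_grid]
caps_noisy = [outage_capacity(rates_noisy, e) for e in eps_grid]
for e, cp, cn in zip(eps_grid, caps_perfect, caps_noisy):
    print(f"eps={e:.2f}  perfect CSI: {cp:.3f}  imperfect CSI: {cn:.3f}")
```

Because a quantile is non-decreasing in ε, tolerating a larger outage probability always yields a larger outage capacity, and in this toy model a nonzero σ_e² lowers every quantile, which is qualitatively the behaviour described for Figure 8.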

Conclusion
In this article, an OP-SI scheme is proposed for MIMO relaying channels, where the base station exchanges messages with multiple users with the help of a relay. When perfect global CSI is available, the desired messages can be aligned at the relay by carefully constructing the precoding at the base station and the relay, and the co-channel interference can be removed completely. The derived closed-form expression demonstrates that the proposed OP-SI scheme can achieve a higher ergodic sum rate than the existing ZF-SI scheme. To evaluate the effect of channel estimation error, the ergodic sum rate with imperfect CSI is also investigated. Both the analytical and numerical results indicate that the system performance degrades seriously as the covariance of the channel estimation error increases. To further improve the transmission performance, an eOP-SI scheme is also introduced, which uses precoding selection at the relay.
Figure 8 provides the relationship between outage probability and outage capacity, where M = 2 and N = 3, and the SNR is fixed as ρ = 15 dB.

Proof of Lemma 1
It can be easily verified that A and B are two independent Wishart matrices, and thus $A^{-1}$ and $B$ are positive semidefinite matrices whose traces are positive. By Von Neumann's trace inequality [25], an upper bound on the trace of their product can be derived in terms of $\lambda_i$ and $\omega_i$, the eigenvalues of A and B, respectively, indexed in the orders $\lambda_1, \ldots, \lambda_M$ and $\omega_M, \ldots, \omega_1$, and of $\lambda_{\min}$, the minimal eigenvalue of A. The joint probability density function (PDF) of the ordered eigenvalues of A, together with the definition in (65), is then used to bound the PDF $f_{\lambda_{\min}}(\lambda)$ of $\lambda_{\min}$, from which the expectation of $1/\lambda_{\min}$ can be bounded. It is important to point out that the integral of $\lambda^{N-M-1}\exp\left(-\frac{1}{2}\lambda\right)$ tends to infinity when N = M. Therefore, to obtain the upper bound on $1/\lambda_{\min}$ in (68), the numbers of antennas must satisfy N > M. It is easy to see that tr(B) is chi-square distributed, and the lemma is proved.
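The role of the condition N > M can be checked directly from the integral mentioned in the proof. The worked step below uses only the standard Gamma-function identity (a textbook fact, not an expression taken from the article) and the integrand as stated above.

```latex
% Substituting t = \lambda/2 gives, for N > M,
\int_0^{\infty} \lambda^{N-M-1} e^{-\lambda/2}\, d\lambda
  = 2^{N-M} \int_0^{\infty} t^{N-M-1} e^{-t}\, dt
  = 2^{N-M}\, \Gamma(N-M) < \infty .
% For N = M the integrand is \lambda^{-1} e^{-\lambda/2}, and
\int_0^{1} \lambda^{-1} e^{-\lambda/2}\, d\lambda
  \;\ge\; e^{-1/2} \int_0^{1} \frac{d\lambda}{\lambda} = \infty .
```

The integral is therefore finite exactly when the exponent N − M − 1 is at least 0, that is, when the relay has at least one more antenna than the number of users.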

Endnotes
a Due to the proof of Lemma 2 in the following section, $u_i$ approaches 0 as N increases, and $u_i^H u_j$ ($i \neq j$) also tends to 0. Thus $UU^H$ is approximately treated as an identity matrix.
b The approximation follows from the inequality $|\hat{U}\hat{U}^H|^2 \le |\hat{U}|^2 |\hat{U}^H|^2 = M^2$, and an upper bound on $\tilde{n}_i$ is derived here. In fact, this bound is quite close to $\tilde{n}_i$, and the following derivation of $\mathrm{SINR}_i^{\text{OP-SI}}$ cannot ensure this inequality. Thus it is treated as an approximation.