Optimal Multiuser MIMO Linear Precoding with LMMSE Receiver

The adoption of multiple antennas at both the transmitter and the receiver exploits additional spatial resources and, with the spatial division multiple access (SDMA) technique, provides a substantial gain in system throughput. Optimal multiuser MIMO linear precoding is regarded as a key issue in multiuser MIMO research. The challenge in such a multiuser system is to design the precoding vectors that maximize the system capacity. This paper proposes an optimal multiuser MIMO linear precoding scheme with LMMSE detection based on particle swarm optimization. The proposed scheme aims to maximize the system capacity of a multiuser MIMO system with linear precoding and linear detection. A simplified function is derived to solve the optimization problem. With the particle swarm optimization algorithm, the optimal linear precoding vector can be searched easily according to the simplified function. The proposed scheme provides a significant performance improvement compared to the multiuser MIMO linear precoding scheme based on the channel block diagonalization method. Copyright © 2009


Introduction
In recent years, with the increasing demand for high data rates, the multiple-input multiple-output (MIMO) technique, a promising method for achieving high capacity, has attracted enormous interest [1,2]. When multiple antennas are equipped at both base stations (BSs) and mobile stations (MSs), the space dimension can be exploited for scheduling multi-user transmission in addition to the time and frequency dimensions. Therefore, the traditional MIMO technique, focused on point-to-point single-user MIMO (SU-MIMO), has been extended to the point-to-multipoint multi-user MIMO (MU-MIMO) technique [3,4]. It has been shown that time division multiple access (TDMA) systems cannot achieve the sum rate capacity of the MU-MIMO broadcast channel (BC) [5], whereas MU-MIMO with spatial division multiple access (SDMA) can, where one BS communicates with several MSs within the same time slot and the same frequency band [6,7]. MU-MIMO based on SDMA improves the system capacity by taking advantage of multi-user diversity and by precanceling the multiuser interference at the transmitter.
Traditional MIMO techniques focus on point-to-point transmission, such as the STBC technique based on space-time coding and the VBLAST technique based on spatial multiplexing. The former efficiently combats channel fading, but its spectral efficiency is low [8,9]. The latter can transmit parallel data streams, but its performance degrades under spatially correlated channels [10,11]. When the MU-MIMO technique is adopted, both the multiuser diversity gain, which improves the BER performance, and the spatial multiplexing gain, which increases the system capacity, are obtained. Since the receive antennas are distributed among several users, spatial correlation has less effect on a multi-user MIMO system. Besides, because the multiuser MIMO technique uses precoding at the transmit side to precancel the cochannel interference (CCI), the complexity of the receiver can be significantly reduced. However, multi-user CCI remains one of the main obstacles to improving MU-MIMO performance. The challenge is that the receiving antennas associated with different users are typically unable to coordinate with each other. To mitigate or, ideally, completely eliminate the CCI, the BS exploits the channel state information (CSI) available at the transmitter to cancel the CCI at the transmitter. It is essential to have CSI at the BS, since it allows joint processing of all users' signals, which results in a significant performance improvement and increased data rates.
EURASIP Journal on Wireless Communications and Networking
The sum capacity of a multiuser MIMO broadcast channel is defined as the maximum aggregate of all the users' data rates. For Gaussian MIMO broadcast channels (BCs), it was proven in [12] that dirty paper coding (DPC) achieves the capacity region. The optimal precoding for multi-user MIMO is therefore based on DPC theory with nonlinear precoding. DPC theory, developed by Costa, proves that when a transmitter has advance knowledge of the interference, it can design a code to compensate for it; the interference is eliminated by iterative precoding at the transmitter, and the broadcast MIMO channel capacity is achieved [13,14]. The well-known Tomlinson-Harashima precoding (THP) is a nonlinear precoding based on DPC theory. It was first developed independently by Tomlinson [15] and by Miyakawa and Harashima [16], and has since been used [17][18][19][20] to combat the multiuser CCI with nonlinear precoding. Although THP performs well in a multi-user MIMO scenario, deploying it in real-time systems is difficult because of the high complexity of the precoding at the transmitter. Many suboptimal MU-MIMO linear precoding techniques have emerged recently, such as the channel inversion method [21] and the block diagonalization (BD) method [22][23][24]. The channel inversion method [25] applies traditional MIMO detection criteria, such as zero forcing (ZF) and minimum mean squared error (MMSE), as precoding at the transmitter to suppress the CCI. Channel inversion based on ZF can suppress the CCI completely; however, it may lead to noise amplification since the precoding vectors are not normalized. Channel inversion based on MMSE trades off the noise against the CCI and outperforms the ZF algorithm, but it still cannot obtain good performance.
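As an illustration of the channel inversion idea, the following sketch builds a ZF and a regularized (MMSE-type) precoder for the simplified case of single-antenna users, so that the stacked channel is a K × M matrix. The function name `channel_inversion`, the regularization value `alpha`, the assumed SNR of 10, and the total-power scaling are assumptions made for this example; they are not taken from [21] or [25].

```python
import numpy as np

def channel_inversion(H, p0=1.0, alpha=0.0):
    """Channel-inversion precoder for a K x M stacked channel H.

    alpha = 0 gives zero forcing; alpha > 0 gives a regularized
    (MMSE-type) inversion. Both choices are illustrative assumptions.
    """
    K = H.shape[0]
    gram = H @ H.conj().T + alpha * np.eye(K)
    W = H.conj().T @ np.linalg.inv(gram)          # M x K, unnormalized
    # Scale so that the total transmit power trace(W W^H) equals p0.
    scale = np.sqrt(p0 / np.trace(W @ W.conj().T).real)
    return scale * W, scale

rng = np.random.default_rng(0)
K, M = 3, 4
H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)

W_zf, g_zf = channel_inversion(H)                      # zero forcing
W_mmse, g_mmse = channel_inversion(H, alpha=K / 10.0)  # alpha = K / SNR, SNR = 10 assumed

# ZF removes all CCI: H @ W_zf is a scaled identity. A small scale g_zf
# means little received signal power, which is the noise-amplification effect.
effective_zf = H @ W_zf
```

The scalar `g_zf` makes the cost of a poorly conditioned channel visible: the worse the conditioning, the smaller the scale that the power constraint allows.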
The BD method decomposes a multi-user MIMO channel into multiple parallel single-user MIMO channels and completely cancels the CCI by making use of the null space. With BD, each user's precoding matrix lies in the null space of all other users' channels, so the CCI is completely canceled. The generated null-space vectors are normalized, which efficiently avoids the noise amplification problem. The BD method therefore performs much better than the channel inversion method. However, since the BD method only aims to cancel the CCI and suppress the noise, its precoding gain is not optimized.
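The null-space step of BD can be sketched as follows. This is a minimal illustration, not the coordinated Tx-Rx BD algorithm of [22]: the helper name `bd_null_space`, the rank tolerance, and the dimensions (M = 4, K = 2, N = 2, chosen so that a nontrivial null space exists) are assumptions.

```python
import numpy as np

def bd_null_space(H_list, k, tol=1e-10):
    """Orthonormal basis of the common null space of all channels except user k."""
    H_others = np.vstack([H for i, H in enumerate(H_list) if i != k])
    _, s, Vh = np.linalg.svd(H_others)            # Vh is M x M (full matrices)
    rank = int(np.sum(s > tol * s[0]))
    # Right singular vectors beyond the rank span the null space of H_others.
    return Vh[rank:].conj().T                     # M x (M - rank)

rng = np.random.default_rng(0)
M, N, K = 4, 2, 2
H_list = [(rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
          for _ in range(K)]

V0 = bd_null_space(H_list, 0)     # candidate precoding directions for user 0
leakage = H_list[1] @ V0          # what user 1 receives from user 0's streams
```

Any precoder for user 0 built from the columns of `V0` causes no interference at user 1, and the columns are orthonormal, which matches the normalization argument in the text.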
Clearly, the CCI, the noise, and the precoding gain are the factors that affect the performance of MU-MIMO with preprocessing. The above linear precoding methods do not take all of these factors into account jointly. A rate maximization linear precoding method is proposed in [26]. This method aims to maximize the sum rate of the MU-MIMO system with linear preprocessing. However, the optimization function in [26] is too complex to compute. In this paper, we solve the problem of optimal linear precoding with a linear MMSE receiver in a simpler way, and propose an optimal MU-MIMO linear precoding scheme with a linear MMSE receiver based on particle swarm optimization (PSO). The PSO algorithm has been used in many complex optimization tasks, especially for optimization over continuous spaces [27,28]. In this paper, PSO is introduced into MIMO research for the first time to solve such optimization problems, which provides a new method for MIMO processing. We first analyze the optimal linear precoding vector with a linear MMSE receiver and establish a simplified function to measure the optimal linear precoding problem. Then, we employ the PSO algorithm to search for the optimal linear precoding vector according to the simplified function. The proposed scheme achieves a significant MU-MIMO system capacity gain and outperforms the channel block diagonalization method.
This paper is organized into seven parts. The system model of MU-MIMO is given in Section 2. Then the analysis of optimal linear precoding with linear MMSE receiver is given in Section 3. The particle swarm optimization algorithm is given in Section 4. In Section 5, the proposed optimal linear precoding MU-MIMO scheme with LMMSE detection based on particle swarm optimization is introduced. In Section 6, the simulation results and comparisons are given. Conclusions are drawn in the last section. The channel block diagonalization algorithm is given in the appendix.

System Model of MU-MIMO
The MU-MIMO system can transmit the data streams of multiple users in the same cell over the same time and frequency resources, as Figure 1 shows.
We consider an MU-MIMO system with one BS and $K$ MSs, where the BS is equipped with $M$ antennas and each MS with $N$ antennas, as shown in Figure 2. The point-to-multipoint MU-MIMO system is employed in downlink transmission.
Because MU-MIMO aims to transmit the data streams of multiple users over the same time and frequency resources, we discuss the algorithm for the single-carrier case; each subcarrier of a multicarrier system is processed in the same way. Since the OFDM technique turns frequency-selective fading into flat fading on each subcarrier, we model the channel of user $k$ as a flat-fading MIMO channel $H_k \in \mathbb{C}^{N \times M}$. The received signal at the $k$th user is
$$y_k = H_k \sum_{i=1}^{K} \sqrt{p_i}\, T_i s_i + n_k,$$
where $y_k$ is the received signal of user $k$, and $T_i$ and $s_i$ are the precoding vector and the transmitted symbol of the $i$th data stream. The elements of the additive noise $n_k$ follow the distribution $\mathcal{CN}(0, N_0)$ and are spatially and temporally white. $p_k$ is the transmit signal power of the $k$th data stream, and $p_0$ is the total transmit power.
The received signal at the $k$th user can also be expressed as
$$y_k = H_k W s + n_k,$$
where $s$ is the transmitted symbol vector with $K$ data streams, $W$ is the precoding matrix with $K$ precoding vectors, and $[\cdot]^T$ denotes the matrix transposition:
$$s = [\sqrt{p_1}\, s_1, \ldots, \sqrt{p_K}\, s_K]^T, \quad W = [T_1, \ldots, T_K], \quad \hat{H}_k = H_k W.$$
The channel matrix $\hat{H}_k$ can be regarded as the virtual channel matrix of user $k$ after precoding. At the receiver, a linear receiver $G_k$ is used to detect the transmitted signal for user $k$. The detected signal of the $k$th user is
$$\hat{s}_k = G_k y_k.$$
The linear receiver $G_k$ can be designed by the ZF or the MMSE criterion; the linear MMSE receiver obtains better performance.
In order to simplify the analysis, equal power allocation is assumed, $\beta = p_k / N_0 = p_0 / (K N_0)$, and linear MMSE MIMO detection is used in this paper as
$$G_k = [\hat{H}_k]_k^H \left( \hat{H}_k \hat{H}_k^H + \beta^{-1} I_N \right)^{-1},$$
where $(\cdot)^{-1}$ indicates the inverse of the matrix, $(\cdot)^H$ denotes the matrix conjugate transposition, and $I_N$ is the $N \times N$ identity matrix:
$$[\hat{H}_k]_k = H_k T_k,$$
where $[\cdot]_k$ denotes the $k$th column of the matrix. Then the detected SINR for user $k$ with the linear detection is
$$\mathrm{SINR}_k = \frac{\beta\, |G_k H_k T_k|^2}{\beta \sum_{i \neq k} |G_k H_k T_i|^2 + \|G_k\|_2^2},$$
where $\|\cdot\|_2$ denotes the matrix two-norm.
Because a nonnormalized precoding vector amplifies the noise at the receiver, the precoding vectors $T_k$ are assumed to be normalized as follows: $\|T_k\|_2 = 1$ for $k = 1, \ldots, K$.
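The receiver model of this section can be checked numerically with a short sketch. It assumes equal power allocation, unit-norm precoding vectors, and the standard LMMSE filter for the virtual channel $H_k W$; the function name `lmmse_sinr` and the random test setup are hypothetical and only serve the illustration.

```python
import numpy as np

def lmmse_sinr(H_k, W, k, beta):
    """SINR of stream k at user k after an LMMSE receiver (equal power).

    H_k : N x M channel of user k, W : M x K precoding matrix,
    beta : per-stream transmit power divided by the noise power N0.
    """
    N = H_k.shape[0]
    H_eff = H_k @ W                                   # N x K virtual channel
    R = H_eff @ H_eff.conj().T + np.eye(N) / beta
    g = H_eff[:, k].conj() @ np.linalg.inv(R)         # LMMSE filter for stream k
    gains = np.abs(g @ H_eff) ** 2                    # |g H_k T_i|^2 for every i
    signal = gains[k]
    interference = gains.sum() - signal
    noise = np.linalg.norm(g) ** 2 / beta
    return signal / (interference + noise)

rng = np.random.default_rng(0)
M, N, K, beta = 4, 2, 2, 10.0
H = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
W = rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))
W /= np.linalg.norm(W, axis=0, keepdims=True)         # normalized precoding vectors

sinr_0 = lmmse_sinr(H, W, 0, beta)
```

With a single stream precoded by the dominant right singular vector, this function returns $\beta$ times the squared largest singular value, which is a convenient sanity check for the interference-free case.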

Optimal Multiuser MIMO Linear Precoding
We assume that the MIMO channel matrices $H_k$ ($k = 1, \ldots, K$) are available at the BS. This can be achieved either by channel reciprocity in time-division-duplex (TDD) mode or by feedback in frequency-division-duplex (FDD) mode. The channel matrix $H_k$ is also known at receiver $k$ through channel estimation. Only the equal power allocation case is discussed in this paper; the optimal power allocation would be achieved through water-filling according to the SINR of each user. The MIMO channel of user $k$ can be decomposed by the singular value decomposition (SVD) as
$$H_k = U_k \Sigma_k V_k^H.$$
If we apply $T_k = [V_k]_1$ to precode for user $k$, the maximal precoding gain is obtained as follows.

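Lemma 1. One has
$$\|H_k [V_k]_1\|_2 = \lambda_k^{\max},$$
where $[V_k]_1$ denotes the first column of $V_k$, and $\lambda_k^{\max}$ denotes the maximal singular value of $H_k$.

Proof. One has
$$H_k [V_k]_1 = U_k \Sigma_k V_k^H [V_k]_1 = \lambda_k^{\max} [U_k]_1.$$
So
$$\|H_k [V_k]_1\|_2 = \lambda_k^{\max} \, \|[U_k]_1\|_2 = \lambda_k^{\max},$$
where $[U_k]_1$ denotes the first column of the unitary matrix $U_k$.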
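The statement of Lemma 1 is easy to verify numerically. The sketch below assumes a random 2 × 4 complex Gaussian channel; the helper name `dominant_precoder` is hypothetical. It compares the gain of the dominant right singular vector with the gains of random unit-norm precoders.

```python
import numpy as np

def dominant_precoder(H):
    """Return ([V]_1, largest singular value, [U]_1) of the channel H."""
    U, s, Vh = np.linalg.svd(H)
    # Rows of Vh are conjugated right singular vectors, so conjugate row 0.
    return Vh[0].conj(), s[0], U[:, 0]

rng = np.random.default_rng(1)
N, M = 2, 4
H = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)

v1, lam_max, u1 = dominant_precoder(H)
gain = np.linalg.norm(H @ v1)                  # precoding gain of T = [V]_1

# Gains of random unit-norm precoders, for comparison with lam_max.
T = rng.standard_normal((M, 200)) + 1j * rng.standard_normal((M, 200))
T /= np.linalg.norm(T, axis=0, keepdims=True)
random_gains = np.linalg.norm(H @ T, axis=0)
```

No random unit-norm precoder exceeds `lam_max`, in line with the fact that the largest singular value is the spectral norm of the channel.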
Thus, precoding with the singular vector corresponding to the maximal singular value is a natural first idea for obtaining good performance. However, if the singular vector is used directly at the transmitter as the precoding vector, the CCI will be large, and the performance will be degraded severely.
Only in the special case where the MIMO channels of all these users are orthogonal will the CCI be zero when the singular vector of each user is used directly as its precoding vector. In realistic cases, however, the users' channels are almost always nonorthogonal, so the singular vector cannot be used directly. We analyze three cases as follows.
(1) Ideal channel case. The ideal channel case is that the MIMO channels of the transmitting users are mutually orthogonal. If we apply $T_k = [V_k]_1$ to precode for user $k$, the maximal precoding gain is obtained as (13) shows, and the CCI becomes zero, as follows.

Lemma 2. One has
Proof. One has Because we assume that Since where

After linear MMSE detection at the receiver, user $k$ obtains the maximal SINR as follows.

Lemma 3. One has
Proof. One has According to (13) and (22)

(2) Ill channel case. The ill channel case is that all these transmitting users' channels are highly correlated. If we still apply $T_k = [V_k]_1$ to precode for user $k$, the multiuser CCI will be very large, and the system performance will be degraded severely. The SINR after MMSE detection with equal power allocation for user $k$ is as follows.

Lemma 4. One has
Proof. Since we have proven that when $T_k = [V_k]_1$ to precode for user $k$, According to (19) Since we assume that Let the diagonal matrix $\tilde{\Sigma}_k = \Sigma_k V_k^H W W^H V_k \Sigma_k$, and so there is where $\tilde{\lambda}_k$ indicates the first diagonal element of the diagonal matrix $\tilde{\Sigma}_k$. So there is So the SINR for user $k$ is

(3) Practical case. The practical case is that the transmitting users' channels are neither orthogonal nor ill-conditioned; this is the usual situation in a realistic environment. If we apply a precoding vector $T_k$ for user $k$, $|T_k^H [V_k]_1|$ can be the parameter to measure the precoding gain, and $\rho_i = |T_k^H [V_i]_1|$ can be the parameter to measure the CCI. According to the above analysis, the SINR for user $k$ can be approximately expressed by the simplified function (40).
The system capacity is determined by the SINRs of the transmitting users $k$ ($k = 1, \ldots, K$), so in order to obtain the system capacity we should first obtain $\mathrm{SINR}_k$. Thus, when the optimal precoding vector is obtained by the PSO algorithm, the system capacity of the MU-MIMO system can be calculated by (41). We aim to maximize the system capacity of the MU-MIMO system in this paper. The optimal MU-MIMO linear precoding vector is the vector that maximizes the SINR at each receiver, where $U$ denotes a unitary vector, that is, $U^H U = I$. It is clear that if we want to maximize the system capacity of MU-MIMO, then the SINR of each user should be maximized. The SINR of user $k$ depends on three factors: the precoding gain and the CCI, which are both determined by the singular vectors corresponding to the maximal singular values of all users, and the noise.
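The two parameters of the practical case and the capacity computation can be sketched as follows. The sum capacity is written here as the usual sum of $\log_2(1 + \mathrm{SINR}_k)$ over the users, which is an assumption about the form of (41); the function names `gain_and_cci` and `sum_capacity` are hypothetical.

```python
import math
import numpy as np

def gain_and_cci(t_k, v1_list, k):
    """Precoding-gain and CCI parameters of a candidate precoder t_k.

    v1_list[i] is the dominant right singular vector [V_i]_1 of user i.
    Returns |t_k^H [V_k]_1| and the list of rho_i = |t_k^H [V_i]_1|, i != k.
    """
    gain = abs(np.vdot(t_k, v1_list[k]))          # np.vdot conjugates t_k
    rho = [abs(np.vdot(t_k, v)) for i, v in enumerate(v1_list) if i != k]
    return gain, rho

def sum_capacity(sinrs):
    """Sum of log2(1 + SINR_k) in bit/s/Hz (assumed form of the capacity)."""
    return sum(math.log2(1.0 + s) for s in sinrs)

# Ideal (orthogonal) toy example: the singular vectors are unit basis vectors.
e0 = np.array([1.0, 0.0, 0.0, 0.0], dtype=complex)
e1 = np.array([0.0, 1.0, 0.0, 0.0], dtype=complex)
gain, rho = gain_and_cci(e0, [e0, e1], 0)
capacity = sum_capacity([1.0, 3.0])
```

In the orthogonal toy example the gain parameter is one and the CCI parameter is zero, which corresponds to the ideal channel case; correlated channels push the CCI parameter toward one.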

The Particle Swarm Optimization Algorithm
The particle swarm optimization algorithm was originally proposed by Kennedy and Eberhart [27] in 1995. It searches for the optimal solution of a problem through cooperation and competition among the individuals of a population. Imagine a swarm of bees in a field. Their goal is to find the location in the field with the highest density of flowers. Without any prior knowledge of the field, the bees begin in random locations with random velocities, looking for flowers. Each bee remembers the location where it found the most flowers and somehow knows the locations where the other bees found an abundance of flowers. Torn between returning to the location where it had personally found the most flowers and exploring the location reported by others to have the most flowers, the ambivalent bee accelerates in both directions and flies somewhere between the two points. The goodness of a position is evaluated by a function called the fitness function. Along the way, a bee might find a place with a higher concentration of flowers than it had found previously. The bees constantly check the concentration of flowers, hoping to find the absolute highest concentration.
Suppose that the size of the swarm and the dimension of the search space are $C$ and $D$, respectively. Each individual in the swarm is referred to as a particle. The location and velocity of particle $i$ ($i = 1, \ldots, C$) are represented as the vectors $x_i = [x_{i1}, x_{i2}, \ldots, x_{iD}]^T$ and $v_i = [v_{i1}, v_{i2}, \ldots, v_{iD}]^T$. Each bee remembers the location where it personally encountered the most flowers, denoted as $P_i = [p_{i1}, p_{i2}, \ldots, p_{iD}]^T$, which is the flight experience of the particle itself. The highest concentration of flowers discovered by the entire swarm is denoted as $P_g = [p_{g1}, p_{g2}, \ldots, p_{gD}]^T$, which is the flight experience of all particles. Each particle searches for the best location according to $P_i$ and $P_g$. Particle $i$ updates its velocity and location according to the following two formulas [27]:
$$v_{id}^{t+1} = w\, v_{id}^{t} + c_1 \varphi_1 \left( p_{id}^{t} - x_{id}^{t} \right) + c_2 \varphi_2 \left( p_{gd}^{t} - x_{id}^{t} \right),$$
$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1},$$
where $t$ is the current iteration number; $v_{id}^{t}$ and $x_{id}^{t}$ denote the velocity and location of particle $i$ in the $d$th dimension; $p_{id}^{t}$ is the individual best location of particle $i$ in the $d$th dimension; and $p_{gd}^{t}$ is the population best location in the $d$th dimension. $\varphi_1$ and $\varphi_2$ are random numbers between 0 and 1, $c_1$ and $c_2$ are the learning factors, and $w$ is the inertia factor. The learning factors determine the relative "pull" of $P_i^t$ and $P_g^t$ and are usually set to $c_1 = c_2 = 2$. The inertia factor determines to what extent the particle remains on its original course, unaffected by the pull of $P_g^t$ and $P_i^t$, and is usually between 0 and 1. After this process has been carried out for each particle in the swarm, it is repeated until the maximal iteration number is reached or the termination criteria are met.
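The velocity and location updates described above can be sketched as a small generic maximizer. This is an illustrative implementation, not the authors' code: the function name `pso_maximize`, the uniform initialization in [-1, 1], the inertia value w = 0.5, and the toy quadratic fitness are assumptions made for the example.

```python
import numpy as np

def pso_maximize(fitness, dim, n_particles=20, n_iter=50,
                 w=0.5, c1=2.0, c2=2.0, seed=0):
    """Basic PSO: returns the best location found and its fitness value."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, (n_particles, dim))    # locations
    v = rng.uniform(-1.0, 1.0, (n_particles, dim))    # velocities
    p_best = x.copy()                                 # individual best locations
    p_val = np.array([fitness(xi) for xi in x])
    j = int(np.argmax(p_val))
    g_best, g_val = p_best[j].copy(), p_val[j]        # population best
    for _ in range(n_iter):
        phi1 = rng.random((n_particles, dim))
        phi2 = rng.random((n_particles, dim))
        # Inertia term plus the pulls toward the individual and population bests.
        v = w * v + c1 * phi1 * (p_best - x) + c2 * phi2 * (g_best - x)
        x = x + v
        val = np.array([fitness(xi) for xi in x])
        better = val > p_val
        p_best[better], p_val[better] = x[better], val[better]
        j = int(np.argmax(p_val))
        if p_val[j] > g_val:
            g_best, g_val = p_best[j].copy(), p_val[j]
    return g_best, g_val

# Toy usage: the "flower density" peaks at the (assumed) point (0.3, -0.2).
target = np.array([0.3, -0.2])
best_x, best_val = pso_maximize(lambda x: -np.sum((x - target) ** 2), dim=2)
```

Because the population best is only replaced by a strictly better location, the returned fitness never decreases over the iterations, whatever values are chosen for the learning and inertia factors.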

The Optimal Linear Precoding Multiuser MIMO with LMMSE Detection Based on Particle Swarm Optimization
With the adoption of the PSO algorithm and the simplified function (40), the optimal linear precoding vector $T_k$ ($k = 1, \ldots, K$) can be searched easily.
The proposed optimal MU-MIMO linear precoding scheme based on the PSO algorithm searches for the optimal precoding vector of each user in the following six steps.
The search starts from an initial velocity $v^1_{i,k}$ and an initial location $x^1_{i,k}$ for each particle $i$ of user $k$. In order to accelerate the search, the initial location $x^1_{i,k}$ can be set to $[V_k]_1$, while the initial velocity $v^1_{i,k}$ is generated randomly. The real and imaginary parts of the initial velocity follow a normal distribution with zero mean and unit standard deviation.
(3) The BS begins the search with the initial location $x^1_{i,k}$ and velocity $v^1_{i,k}$. The goodness of a location is measured by the fitness function $f^t_{i,k}$ of (44), which indicates the SINR obtained for user $k$ when precoded by $x^t_{i,k}$. The PSO algorithm finds $P^t_{i,k}$ and $P^t_{g,k}$, the individual best location and the population best location as measured by (44), for the next iteration. $P^t_{i,k}$ denotes the best location of particle $i$ at the $t$th iteration for the $k$th user, and $P^t_{g,k}$ denotes the best location of all particles at the $t$th iteration for the $k$th user.
(4) For the $t$th iteration, the algorithm finds a $P^t_{i,k}$ and a $P^t_{g,k}$. The location and velocity of each particle are updated according to (43) for the next iteration. In order to obtain a normalized optimal precoding vector that suppresses the noise, the location $x^t_{i,k}$ should be normalized in each iteration.
(5) When the maximal iteration number $I$ is reached, the algorithm stops, and $P^I_{g,k}$ is the obtained optimal precoding vector for user $k$.
(6) For an MU-MIMO system with $K$ users, the scheme searches for the precoding vectors of each user according to the above criteria.
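The steps above can be sketched as follows. Since the fitness function (44) is not reproduced in this text, the example swaps in a hypothetical SLNR-style surrogate built from the precoding-gain and CCI parameters of Section 3; it is a stand-in, not the paper's fitness function. The names `dominant_pairs`, `surrogate_fitness`, and `pso_precoder`, the inertia value w = 0.5, and the use of one random factor per particle are likewise assumptions.

```python
import numpy as np

def dominant_pairs(H_list):
    """Largest singular value and matching right singular vector per user."""
    pairs = []
    for H in H_list:
        _, s, Vh = np.linalg.svd(H)
        pairs.append((s[0], Vh[0].conj()))
    return pairs

def surrogate_fitness(t, k, pairs, beta):
    """Hypothetical stand-in for (44): gain toward user k over leakage plus noise."""
    lam_k, v_k = pairs[k]
    gain = beta * (lam_k * abs(np.vdot(t, v_k))) ** 2
    leak = sum(beta * (lam * abs(np.vdot(t, v))) ** 2
               for i, (lam, v) in enumerate(pairs) if i != k)
    return gain / (1.0 + leak)

def pso_precoder(k, pairs, beta, C=20, I=20, w=0.5, c1=2.0, c2=2.0, seed=0):
    """Search a unit-norm precoding vector for user k with PSO."""
    rng = np.random.default_rng(seed)
    M = pairs[k][1].size
    x = np.tile(pairs[k][1], (C, 1))                  # all particles start at [V_k]_1
    v = rng.standard_normal((C, M)) + 1j * rng.standard_normal((C, M))
    p_best = x.copy()
    p_val = np.array([surrogate_fitness(xi, k, pairs, beta) for xi in x])
    g_best, g_val = p_best[0].copy(), p_val[0]
    for _ in range(I):
        phi1, phi2 = rng.random((C, 1)), rng.random((C, 1))   # one draw per particle
        v = w * v + c1 * phi1 * (p_best - x) + c2 * phi2 * (g_best - x)
        x = x + v
        x /= np.linalg.norm(x, axis=1, keepdims=True)  # normalize in each iteration
        val = np.array([surrogate_fitness(xi, k, pairs, beta) for xi in x])
        better = val > p_val
        p_best[better], p_val[better] = x[better], val[better]
        j = int(np.argmax(p_val))
        if p_val[j] > g_val:
            g_best, g_val = p_best[j].copy(), p_val[j]
    return g_best, g_val

rng = np.random.default_rng(2)
M, N, K, beta = 4, 2, 4, 10.0
H_list = [(rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
          for _ in range(K)]
pairs = dominant_pairs(H_list)
precoders = [pso_precoder(k, pairs, beta) for k in range(K)]  # one search per user
```

Starting every particle at $[V_k]_1$ guarantees that the returned vector is never worse, under the chosen fitness, than direct SVD precoding; any other fitness with the same signature can be passed through the same loop.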

Simulation Results
We simulated the proposed MU-MIMO scheme, the BD algorithm in [22] (coordinated Tx-Rx BD), and the channel inversion algorithm in [25] to compare their performance under the same simulation environment. The results in Figure 5 are based on the PSO parameters with particle number C = 20 and iteration number I = 20. It can be seen that the proposed MU-MIMO scheme effectively increases the system capacity compared to the BD algorithm and the channel inversion algorithm. Figure 6 shows the average BER performance of the proposed MU-MIMO scheme and the coordinated Tx-Rx BD algorithm with M = 4, K = 4, N = 4. Figure 7 shows the average BER performance of the proposed MU-MIMO scheme and the coordinated Tx-Rx BD algorithm with M = 4, K = 4, N = 2.
From the simulation results, it is clear that the proposed MU-MIMO linear precoding scheme with LMMSE detection based on particle swarm optimization outperforms the BD algorithm and the channel inversion algorithm. The reason is that the BD algorithm only aims to use normalized precoding vectors to cancel the CCI and suppress the noise, and the channel inversion algorithm likewise only aims to suppress the CCI and the noise. The users' transmit signal covariance matrices of these schemes are therefore generally not optimal, which results in an inferior precoding gain. The proposed MU-MIMO optimal linear precoding scheme aims to find the optimal precoding vector that maximizes each user's SINR at its receiver and thereby improves the total system capacity.
Figure 8 shows the BER performance of the proposed MU-MIMO scheme with the same number of particles and different numbers of iterations when M = 4, K = 4, N = 4. It adopts equal power allocation, MMSE detection, QPSK, and no channel coding. The particle number C is 20, and the iteration number ranges from 5 to 30. When the iteration number is small, the proposed scheme cannot obtain the best performance. As the iteration number increases, the performance improves, but the computational complexity increases as well. However, once the iteration number exceeds 20 in this case, the algorithm obtains no further performance gain. In general, the best iteration number differs from case to case. It is related to the number of transmit antennas M at the BS and the number of transmitting users K (K ≤ M). As M or K increases, the iteration number should be increased in order to obtain the best performance.

Conclusion
This paper solves the optimal linear precoding problem with LMMSE detection for the MU-MIMO system in downlink transmission. A simplified optimization function is proposed and proved to maximize the system capacity. With the adoption of the particle swarm optimization algorithm, the optimal linear precoding vector with LMMSE detection can be searched for each user. The proposed scheme obtains a significant system capacity improvement compared to the multi-user MIMO scheme based on channel block diagonalization under the same simulation environment.