Sparse Channel Estimation for Large-Scale MISO-OFDM Systems: A Bayesian VMP Approach

This paper studies channel estimation in large-scale multiple-input single-output orthogonal frequency division multiplexing (MISO-OFDM) systems. To take full advantage of channel sparsity, an intermediate random vector is introduced to control the sparsity of the channel state information (CSI) estimate under the maximum a posteriori (MAP) criterion. After carefully designing the prior probability density function (PDF) of the intermediate random vector and the PDF of the unknown CSI conditioned on it, a sparse optimization problem over the CSI is constructed. Bayesian inference is then applied to relax the optimization problem by computing an approximate PDF of simpler form, and variational message passing (VMP) is used to obtain the solution in iterative analytical form. Furthermore, a block-sparse structure is exploited to improve performance. Simulation results demonstrate the merit of the proposed algorithm over traditional ones.


Introduction
One of the key problems in wireless communication is improving spectrum efficiency, since spectrum is scarce [1], [2]. Recently, multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) has been developed extensively to address this problem. Recent research has found that MIMO systems with a large number of antennas can improve transmission performance significantly. Against this background, large-scale multiple-input single-output (MISO) technology is viewed as promising for current and future wireless communication systems [3]. However, the complexity of obtaining accurate channel state information (CSI) grows rapidly with the number of antennas, which makes the technology difficult to use in practical systems [4][5][6]. One important reason is that the number of unknown variables becomes so large that too many spectrum resources must be spent on estimation. More efficient CSI estimation algorithms are therefore needed. Fortunately, the problem can be relaxed by the fact that the outdoor wireless channel typically exhibits a sparse structure: only a small fraction of the taps of the channel impulse response have significant values [7]. In this situation, channel estimation can be transformed into a sparse estimation problem [8].
Many published works have addressed the sparse estimation problem. Compressed sensing (CS) is viewed as a promising method for this kind of problem. Based on the sparse structure of the CSI, large-scale MISO channel estimation can be formulated as an $\ell_1$-norm constrained optimization problem known as the Least Absolute Shrinkage and Selection Operator (LASSO). Greedy algorithms, such as Orthogonal Matching Pursuit (OMP) and Subspace Pursuit (SP), are also useful approaches [9], [10]. In addition, sparse Bayesian learning (SBL) is another approach to obtaining sparse solutions [11]. It aims at finding a sparse maximum a posteriori (MAP) estimate of the unknowns, resulting in higher estimation accuracy.
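To make the greedy baseline concrete, the following is a minimal sketch of OMP for a generic model $y = Ax$. It is not the implementation used in [9] or in the simulations of this paper; the function name `omp`, the matrix sizes, and the test vector are illustrative assumptions.

```python
import numpy as np

def omp(A, y, n_nonzero):
    """Greedy OMP sketch: pick the column most correlated with the
    residual, then refit by least squares on the selected columns."""
    residual = y.copy()
    support = []
    x = np.zeros(A.shape[1], dtype=complex)
    for _ in range(n_nonzero):
        # Column with the largest correlation against the current residual.
        k = int(np.argmax(np.abs(A.conj().T @ residual)))
        if k not in support:
            support.append(k)
        # Least-squares refit restricted to the selected columns.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x[support] = coef
    return x

# Illustrative (assumed) problem: 40 measurements, 100 unknowns, 3 nonzeros.
rng = np.random.default_rng(0)
A = (rng.standard_normal((40, 100)) + 1j * rng.standard_normal((40, 100))) / np.sqrt(80)
x_true = np.zeros(100, dtype=complex)
x_true[[5, 37, 81]] = [1.0 + 1.0j, -2.0, 1.5j]
y = A @ x_true
x_hat = omp(A, y, n_nonzero=3)
```

The sparsity level `n_nonzero` must be supplied in advance, which is one practical difference from the Bayesian approach discussed later, where sparsity is controlled through the prior.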
Bayesian sparse estimation has been shown to be an effective tool for sparse channel estimation in single-input single-output (SISO) OFDM systems [12]. Beyond SISO-OFDM, recent research indicates that MISO systems also have a sparse CSI structure. Moreover, the different transmit-receive antenna pairs typically share a common channel support, because the closely spaced antennas experience similar path delays [13]. Furthermore, the statistical properties of the channel, such as spatial covariance and time correlation, can be used to improve estimation performance. Taking this into consideration, a model of the channel vector based on a block-sparse structure is also studied in this paper.

The contributions of this paper are as follows:
Firstly, an optimization problem for sparse CSI estimation is constructed by means of intermediate random variables.
Secondly, algorithms for both the general case and the block-sparse-structure case are introduced, using the theory of variational message passing (VMP).
Finally, an iterative analytical form of the channel estimator is obtained.
Notation: Let $\mathcal{CN}(y \mid J, C)$ be the complex multivariate Gaussian probability density function (PDF) with mean $J$ and covariance $C$. Also, $\mathrm{diag}(x)$ denotes the diagonal matrix created from the vector $x$. Let $A_m$ be the matrix consisting of the first $m$ columns of the matrix $A$. Let $\mathrm{Ga}(x \mid a, b) = \frac{b^a}{\Gamma(a)} x^{a-1} e^{-bx}$ be the Gamma PDF. Let $K_v(\cdot)$ be the modified Bessel function of the second kind of order $v$, and let $A \otimes C$ represent the Kronecker product of the two matrices $A$ and $C$. Finally, let $\langle f(x) \rangle_{p(x)}$ denote the expectation of the function $f(x)$ with respect to the density $p(x)$.

System Model
We consider a downlink transmission scenario with a base station and several terminals. The base station is equipped with $N_t$ transmit antennas, and each terminal is equipped with a single receive antenna. The transmission symbols are organized in frames in which the preamble signals and data symbols are time multiplexed. Each frame contains one preamble signal and several data symbols. At the beginning of each frame, the pilot symbols $r_l$ are assigned to the $M$ subcarriers in every transmission block, where $l$ denotes the $l$-th antenna. These frequency-domain pilot signals are transformed into the time domain by the inverse discrete Fourier transform (IDFT). The time-domain preamble signal $s_l$ of the $l$-th antenna is therefore obtained by applying the IDFT to the pilot symbols, with a normalization factor that ensures unit transmit power, where $F$ denotes the order-$M$ normalized DFT matrix. Without loss of generality, in order to distinguish the preamble signals from different antennas, it is assumed that $s_l \neq s_i$ for $l \neq i$. A cyclic prefix (CP) is also inserted to avoid inter-symbol interference (ISI).
After removing the CP, the received time-domain signal can be written in matrix form as $q = \sum_{l=1}^{N_t} H_l s_l + n$, where $q$ is the received preamble signal of length $M$, and $H_l$ is the circulant matrix formed from the $l$-th CSI vector, i.e., that of the $l$-th transmit-receive antenna pair.
The $l$-th CSI vector is denoted by $h_l$, a column vector of length $L$, where $L$ denotes the maximum delay spread of the CSI. Additionally, $n$ denotes the sampled noise vector, whose elements are independent and identically distributed complex Gaussian random variables with zero mean and variance $\sigma_n^2$.
After performing the DFT on the received signal $q$, its frequency-domain form $y = Fq$ is given by (1). Since $H_l$ is a circulant matrix, it is diagonalized by the DFT matrix, and (1) can be rewritten as (2), where $w = Fn$ is the noise vector in the frequency domain.
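The step from (1) to (2) relies on the fact that a circulant matrix is diagonalized by the DFT. The following is a minimal numerical check of that property, assuming a unitary DFT matrix and a channel zero-padded to length $M$. The sizes `M = 8`, `L = 3` and all variable names are illustrative, and the $\sqrt{M}$ scaling follows from the unitary normalization assumed here rather than from the paper.

```python
import numpy as np

M, L = 8, 3
rng = np.random.default_rng(1)
h = rng.standard_normal(L) + 1j * rng.standard_normal(L)

# Circulant channel matrix: the first column is h zero-padded to length M,
# and each further column is a cyclic shift of the previous one.
c = np.concatenate([h, np.zeros(M - L)])
H = np.column_stack([np.roll(c, k) for k in range(M)])

# Normalised (unitary) M-point DFT matrix.
F = np.fft.fft(np.eye(M)) / np.sqrt(M)

# F H F^H is diagonal; its diagonal is the channel frequency response,
# which only involves the first L columns of F.
D = F @ H @ F.conj().T
freq_response = np.sqrt(M) * F[:, :L] @ h
```

Because only the first $L$ columns of $F$ act on the channel taps, the frequency response is a linear function of the short vector $h$, which is what allows the model to be written linearly in the unknown taps.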
Using the identity $\mathrm{diag}(a)\,b = \mathrm{diag}(b)\,a$ for diagonal matrices, the term $\mathrm{diag}(F_m h_l)\, r_l$ in (2) can be further rewritten as $\mathrm{diag}(r_l)\, F_m h_l$. Furthermore, by stacking all of the channel coefficients into an aggregate channel vector $h$, the received signal can be expressed as $y = Ph + w$ (3), where $P$ is the corresponding measurement matrix. Equation (3) describes the linear relationship between the unknown vector $h$ and the observation vector $y$. Since $M < LN_t$, the system is underdetermined, and it is difficult to recover the unknowns exactly from the observations.
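The construction of the linear model can be sketched numerically as follows. This is an illustration under stated assumptions, not the paper's exact definition of $P$: the pilots are taken as random unit-modulus symbols, the DFT matrix is unitary, any power-normalization constant is omitted, and the sizes and names (`R`, `H_taps`, `F_L`) are hypothetical.

```python
import numpy as np

M, L, Nt, G = 16, 8, 3, 2          # note M < L * Nt, as in the text
rng = np.random.default_rng(2)

F = np.fft.fft(np.eye(M)) / np.sqrt(M)
F_L = F[:, :L]                      # first L columns of the DFT matrix

# Assumed pilots: one length-M unit-modulus vector r_l per antenna.
R = np.exp(2j * np.pi * rng.random((Nt, M)))

# Sparse channel taps: G nonzero entries in each length-L vector h_l.
H_taps = np.zeros((Nt, L), dtype=complex)
for l in range(Nt):
    idx = rng.choice(L, size=G, replace=False)
    H_taps[l, idx] = rng.standard_normal(G) + 1j * rng.standard_normal(G)

# diag(F_L h_l) r_l == diag(r_l) F_L h_l, so the blocks of P are diag(r_l) F_L.
P = np.hstack([np.diag(R[l]) @ F_L for l in range(Nt)])   # M x (L * Nt)
h = H_taps.reshape(-1)                                    # [h_1; ...; h_Nt]

y_sum = sum(np.diag(F_L @ H_taps[l]) @ R[l] for l in range(Nt))
y_lin = P @ h                                             # noiseless y = P h
```

With these sizes $P$ has 16 rows and 24 columns, so the noiseless system is underdetermined and a sparsity prior is needed to single out a solution.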

Sparse Channel Estimation
In this section, we introduce a new method based on the VMP algorithm to solve problem (3).

Bayesian Prior Model for Sparse Channel Estimation
Using the maximum a posteriori (MAP) criterion, the sparse channel estimate of $h$ is obtained as the solution of the MAP problem (4). Since $h$ is sparse, constraints must be introduced in (4) to exploit the sparse structure of $h$ and obtain better performance. What is needed is a mechanism that controls the sparsity of the solution of (4). Inspired by [12], a Bayesian prior model can be used to achieve this. According to this model, we first introduce an intermediate random vector $\chi$.
Based on $\chi$, the prior $p(h)$ is represented by a two-layer hierarchical structure consisting of a conditional prior $p(h \mid \chi)$ and a prior $p(\chi)$. The core of this model is to find suitable distributions $p(\chi)$ and $p(h \mid \chi)$. By carefully designing $p(h \mid \chi)$ and $p(\chi)$, the sparsity of the solution can be controlled by $\chi$; e.g., the larger an element of $\chi$ is, the closer to zero the corresponding element of $h$ is. Meanwhile, when $p(\chi)$ and $p(h \mid \chi)$ have suitable forms, inference algorithms with analytical expressions can be constructed, which also simplifies the computation.
The joint PDF is then decomposed into the likelihood and the priors, where $p(y \mid h, \sigma) = \mathcal{CN}(y \mid Ph, \sigma^2 I)$ based on (3). Conjugate priors are chosen for the remaining variables, and a prior $p(\chi)$ is assigned to the intermediate vector. The prior of $h$ can then be computed using a product of generalized inverse Gaussian (GIG) PDFs.

VMP Based Sparse Channel Estimation
In this section, we present a VMP algorithm for estimating $h$ in (3).
To clearly encode the factorization of $p(\Omega, y)$, the factor graph is shown in Fig. 1.
Based on Fig. 1, $q(\Omega)$ can be computed by the VMP algorithm as in (6). In (6), $S_{\Omega_i}$ is the set of all factor nodes that neighbour the variable node $\Omega_i$, and $m(g_n \rightarrow \Omega_i)$ denotes the message from the factor node $g_n$ to the variable node $\Omega_i$, given by (7), where $S_{g_n}$ is the set of variable nodes that neighbour the factor node $g_n$.
Using (6), the unknowns can be calculated iteratively, updating each variable in turn while treating the others as constant, as shown in the following.
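For reference, the generic mean-field VMP rule is commonly written in the form below. This is the textbook expression transcribed into the notation of this section ($S_{\Omega_i}$, $S_{g_n}$, $m(g_n \rightarrow \Omega_i)$); it is given as an assumption about what (6) and (7) state, not as a verbatim copy of them.

```latex
\begin{align}
q(\Omega_i) &\propto \prod_{g_n \in S_{\Omega_i}} m(g_n \rightarrow \Omega_i), \\
m(g_n \rightarrow \Omega_i) &= \exp\!\Big( \big\langle \ln g_n \big\rangle_{\prod_{\Omega_j \in S_{g_n} \setminus \{\Omega_i\}} q(\Omega_j)} \Big).
\end{align}
```

Each message averages the log-factor over the current beliefs of all the other neighbouring variables, which is why each update can treat those variables as fixed.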

Fig. 2. A factor graph of the signal model of the joint PDF of (17).
The expression of $q(\chi)$ can then be obtained, where the auxiliary PDF $q(\chi)$ can be viewed as a product of GIG PDFs of order $\varepsilon - \rho$. Therefore, the moment $\langle \chi_l^i \rangle_{q(\chi)}$ is given by [15] and depends on $\langle |h_l|^2 \rangle_{q(h)}$.
3) Update of $q(\sigma)$: From (7) and Fig. 1, the expression of $q(\sigma)$ can be easily obtained.

Block-Structured Channel Model
As the number of transmit antennas increases, channel estimation performance may degrade significantly, due to the increased length of the aggregate channel vector and the limited pilot signal size. Specifically, due to the close antenna spacing at the base station, the times of arrival from different transmit antennas are similar to each other. With the limited sampling resolution at the receiver, the nonzero tap locations of all CSI vectors are considered identical [16], that is, $\mathrm{supp}(h_i) = \mathrm{supp}(h_j)$, $i \neq j$ (14), where $\mathrm{supp}(h_i)$ is defined as the length-$L$ support vector of the $i$-th vector $h_i$, whose $k$-th element $\mathrm{supp}(h_i)(k)$ indicates whether the $k$-th element of $h_i$ is nonzero. Based on this analysis, let $G$ be the number of nonzero values in $h_l$. Without loss of generality, it is assumed that the number of nonzero values of $h$ satisfies $(G \times N_t) \ll (L \times N_t)$ and $(G \times N_t) < M$ in (3). Thus, the aggregate channel vector $h$ can be rearranged into a vector $b$ as in (15). Furthermore, (3) can be rewritten as $y = Pb + w$ (16), where the measurement matrix is rearranged accordingly and $a_l$ denotes the $(l+1)$-th column vector of a matrix $A$. In this situation, the new vector $b$ exhibits block sparsity: among the $L$ blocks $\{b_l\}_{l=0}^{L-1}$ of the vector $b$, only a small number are nonzero vectors. Under the wide-sense stationary uncorrelated scattering (WSSUS) model, the blocks can be viewed as independent of each other, which implies that only the correlation within each block needs to be considered.
We aim at obtaining an estimate of $b$ in (16). Since $b$ is block sparse, it is meaningful to introduce the block-sparse structure into the sparse estimation problem (16) to improve the system performance. In this section, we propose a variational message passing (VMP) based Bayesian block-sparse channel estimation algorithm to estimate $b$ from the given observation $y$.
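The rearrangement of $h$ into the block-sparse vector $b$ can be illustrated with a small sketch. It assumes that $h$ stacks the per-antenna vectors $h_1, \ldots, h_{N_t}$ and that block $l$ of $b$ collects the $l$-th tap of every antenna (consistent with blocks of size $N_t \times 1$); the sizes, the common support `[1, 3]`, and the random matrix standing in for $P$ are illustrative only.

```python
import numpy as np

L, Nt, M = 4, 3, 8
rng = np.random.default_rng(3)

# All antennas share the same nonzero tap positions (common support).
support = [1, 3]
H_taps = np.zeros((Nt, L), dtype=complex)
H_taps[:, support] = (rng.standard_normal((Nt, len(support)))
                      + 1j * rng.standard_normal((Nt, len(support))))

h = H_taps.reshape(-1)      # antenna-major: [h_1; ...; h_Nt]
b = H_taps.T.reshape(-1)    # tap-major: block l = [h_1(l), ..., h_Nt(l)]

# Permutation taking h to b; the same permutation reorders the columns of P.
perm = np.arange(L * Nt).reshape(Nt, L).T.reshape(-1)
P_h = rng.standard_normal((M, L * Nt)) + 1j * rng.standard_normal((M, L * Nt))
P_b = P_h[:, perm]

# Blocks of b: only the blocks on the common support are nonzero.
blocks = b.reshape(L, Nt)
nonzero_blocks = [l for l in range(L) if np.any(blocks[l])]
```

The rearrangement is a pure permutation, so the observation is unchanged; what changes is that the nonzero entries of $b$ are now grouped into whole blocks, which a block-sparse prior can exploit.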

Bayesian Prior Model for Block-Sparse Estimation
Since $b$ has a block-sparse structure, a covariance matrix $B_l$ is introduced to describe the correlation between the elements of $b_l$. Similar to [17], it is assumed that $B = B_l$ for all $l$. In this situation, the PDF of $b$ is expressed in terms of $B$ and $\chi$, and $p(\chi)$ is written as $p(\chi) = \prod_{l} p(\chi_l)$ with $p(\chi_l) = \mathrm{Ga}(\chi_l \mid \varepsilon, \eta_l)$. The prior of $b$ is then computed from these two layers. The factor graph of the block-sparse model is shown in Fig. 2.

VMP-Block-Sparse Channel Estimation
Using the VMP algorithm, each variable is updated as follows.
1) Update of $q(b)$: The messages arriving at the variable node $b$ are computed first. Multiplying these messages, the auxiliary PDF $q(b)$ can be written as in (19), with covariance matrix $\Sigma_b$ that depends on $\langle \sigma \rangle_{q(\sigma)}$ and on the matrix $V(\chi)$ appearing in (19).
2) Update of $q(\chi)$: The resulting $q(\chi)$ can be viewed as a product of GIG PDFs of order $\varepsilon - \rho N_t$. Therefore, the moment $\langle \chi_l^i \rangle_{q(\chi)}$ is given by [15], with $f = \langle \eta_l \rangle_{q(\eta)}$. Using the properties of the modified Bessel function, the moment with $i = -1$, $\langle \chi_l^{-1} \rangle_{q(\chi)}$, can be written in closed form in terms of $\rho N_t - \varepsilon$ and a quantity $e$.
3) Update of $B$: To improve the estimation performance, the covariance matrix $B$ can be constrained, as inspired by [17]. One possible form in [17] assumes that the covariance matrix is a Toeplitz matrix. In this paper, $B$ is treated as a non-stochastic parameter because of its statistical characteristics. According to (17), the objective for updating $B$ is derived as (25). Setting the derivative of (25) with respect to $B$ to zero yields the update rule (26) for $B$, where $b_l$ is the $l$-th block in $b$ (of size $N_t \times 1$), and $\Sigma_b^l$ is the $l$-th diagonal block in $\Sigma_b$ (of size $N_t \times N_t$). Furthermore, $B$ can be replaced by a Toeplitz matrix parameterized by $r$, where $a_0$ and $a_1$ are the average values of the elements along the main diagonal and the sub-diagonal of the matrix in (26), respectively.
4) Update of $q(\sigma)$: The expression of $q(\sigma)$ can be easily obtained from the corresponding messages.
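The Toeplitz constraint on $B$ in step 3) can be sketched as follows. The sketch assumes the first-order form used in the block sparse Bayesian learning literature that [17] belongs to, namely $r = a_1 / a_0$ and entries $r^{|i-j|}$, and a real-valued estimate; the function name `toeplitz_approx` and the example matrix are illustrative, and the exact expression in the paper may differ.

```python
import numpy as np

def toeplitz_approx(B_hat):
    """Replace an estimated covariance by a first-order Toeplitz matrix.

    Assumed form: r = a1 / a0, entries r**|i - j|, where a0 and a1 are the
    averages of the main diagonal and the sub-diagonal of B_hat.
    """
    n = B_hat.shape[0]
    a0 = np.mean(np.diag(B_hat))         # average of the main diagonal
    a1 = np.mean(np.diag(B_hat, k=-1))   # average of the sub-diagonal
    r = a1 / a0
    idx = np.arange(n)
    return r ** np.abs(idx[:, None] - idx[None, :])

# Illustrative check: a matrix that already has this structure is unchanged.
idx = np.arange(4)
B_example = 0.5 ** np.abs(idx[:, None] - idx[None, :])
B_toeplitz = toeplitz_approx(B_example)
```

Constraining $B$ to a single parameter $r$ reduces the number of quantities estimated from the data, which is the usual motivation for this kind of regularization when the blocks are short.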

Simulation
In this section, the performance of the proposed VMP algorithm is evaluated. We compare the normalized mean square error (NMSE) of the proposed algorithm with that of the traditional ones. In Fig. 3, the measurement matrix $P$ of Section II is generated with $M = 560$, $N_t = 8$, and $L = 128$, and the number of nonzero values in $h_l$ is $G = 20$.
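The NMSE metric can be computed as in the short sketch below. It uses the standard definition $\|\hat{h} - h\|^2 / \|h\|^2$ expressed in dB; the paper does not state its exact formula or averaging procedure, so the function name `nmse_db` and the single-realization form are assumptions.

```python
import numpy as np

def nmse_db(h_true, h_hat):
    """Normalized squared error ||h_hat - h_true||^2 / ||h_true||^2 in dB."""
    err = np.linalg.norm(h_hat - h_true) ** 2
    return 10 * np.log10(err / np.linalg.norm(h_true) ** 2)

# Illustrative example: an estimate that is 10% off in amplitude.
h_true = np.array([1.0, 0.0, 1.0j, 0.0])
h_hat = 0.9 * h_true
value = nmse_db(h_true, h_hat)
```

In a Monte Carlo evaluation this quantity would typically be averaged over many channel and noise realizations at each SNR point.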
Figure 3 shows the NMSE of the OMP algorithm, the SP algorithm, and the proposed VMP algorithm versus the signal-to-noise ratio (SNR). It can be observed that the proposed VMP channel estimation algorithm outperforms the other traditional estimation methods as the SNR increases. As shown in Fig. 3, the proposed VMP algorithm performs better than the SP algorithm when the SNR is relatively small. As the SNR increases, the gap between the proposed algorithm and the SP algorithm becomes smaller, because the effect of the noise is almost negligible when the SNR is large enough. In Fig. 3, it can also be seen that the SP algorithm performs better than both the OMP algorithm and the VMP algorithm when $\varepsilon = 1$. However, with an appropriate choice of $\varepsilon$, the proposed VMP algorithm achieves a significant performance gain over the others.
Additionally, the performance for different values of $\varepsilon$ is shown in Fig. 4. It can be seen that the performance of the VMP algorithm improves as $\varepsilon$ decreases. This is because the parameter $\varepsilon$ controls the sparsity properties, and the case $\varepsilon = 0$ encourages a sparser solution than $\varepsilon > 0$. Moreover, by considering the block-sparse structure of the channel vector, the complexity of the proposed method can be reduced significantly.
In Fig. 5, $P$ is generated with $M = 560$, $N_t = 60$, and $L = 30$, and the number of nonzero values in $h_l$ is $G = 4$. Comparing Fig. 4 and Fig. 5, it can be seen that the performance of all of the algorithms degrades as the number of transmit antennas increases. However, the proposed algorithm is more robust than the others, and it achieves significantly better performance than the others even when $\varepsilon = 1$.

Conclusion
This paper studies a Bayesian VMP sparse channel estimation algorithm for large-scale MISO-OFDM systems. By introducing an intermediate random vector and carefully choosing the PDFs based on it, a sparsity-inducing prior for the unknown vector is modelled. An optimization problem over the CSI is then constructed, and the VMP algorithm is shown to be an effective way to solve it. Simulations indicate the merits of the proposed algorithm over traditional ones and show that it is an effective solution for estimating sparse channel parameters in large-scale MISO-OFDM systems.