Implicit Pilots for an Efficient Channel Estimation in Simplified Massive MIMO Schemes with Precoding

This paper proposes a new channel estimation scheme based on implicit pilots, optimized for a simplified massive multiple input, multiple output (MIMO), implemented with precoding, combined with Single-Carrier with Frequency-Domain Equalization (SC-FDE) modulations. We propose an iterative receiver that considers an iterative detection with interference cancellation and channel estimation. The channel estimates are usually obtained with the help of pilot symbols and/or training sequences multiplexed with data symbols. Since the required overheads in massive MIMO schemes can be too high, leading to spectral degradation, the use of superimposed pilots (i.e., pilots added to data) is an efficient alternative. Three different types of preprocessing algorithms are considered in this paper: Zero-Forcing Transmitter (ZFT), Maximum Ratio Transmitter (MRT), and Equal Gain Transmitter (EGT). The main advantage of MRT and EGT is that they do not require matrix inversions. Nevertheless, some level of interference is generated in the decoding process. Such interference is mitigated by employing an optimized iterative receiver. By employing the proposed implicit pilots, the performance of MRT and EGT is very close to the Matched Filter Bound just after a few iterations, even when the number of transmit or receiver antennas is not much higher than the number of data streams.


Introduction
Massive MIMO (m-MIMO) schemes are a key technique in emerging wireless communications, as they tend to achieve higher network capacity and spectral efficiency [1,2]. m-MIMO is expected to be utilized in 5G (Fifth Generation) systems, alongside Millimeter Wave (mm-Wave) communications, due to its increased channel coherence bandwidth, as compared to centimeter wave. Moreover, the short wavelength allows the installation of a high number of antenna elements in a reduced area, facilitating the implementation of m-MIMO [3], especially in small cell networks (pico or femto).
Other systems consider the same combination of m-MIMO with mm-Wave, such as IEEE 802.11ad [4], using bands in the vicinity of 40 GHz up to 70 GHz, or even above [3,5]. Nevertheless, the high path losses, reduced diffraction effects, and more complex power amplification implementations are common limitations experienced at such high frequencies [6]. This is, however, mitigated by the strong reflections, which tend to support an increased coverage.
To cope with frequency-selective channels, m-MIMO schemes can be combined with prefix-assisted techniques like OFDM (Orthogonal Frequency Division Multiplexing) or SC-FDE (Single-Carrier with Frequency-Domain Equalization) [7,8].
Block transmission techniques with a cyclic prefix long enough to cope with the channel length are commonly employed to mitigate intersymbol interference, whose effect grows with the symbol rate [9]. OFDM or Single-Carrier (SC) modulations, with Frequency-Domain Equalization (FDE), are among the most used block transmission techniques. However, these techniques need accurate channel estimates. These estimates can be obtained with the help of pilots multiplexed with data, either in time or in frequency [10], but this leads to spectral degradation. (Although "pilots" are commonly associated with training signals multiplexed in frequency, such as those used in OFDM subcarriers for channel estimation, while "training sequences" is a term widely employed for training signals multiplexed in time, this paper uses the term "pilots" to refer to any signal used for the purpose of channel estimation.)
Pilots can also be multiplexed with data subcarriers so that there is no pilot/data interference, but in the SC-FDE case, one needs to create nulls in the apparent channel frequency response seen by the data, leading to performance degradation [11]. As an alternative, we can use superimposed pilots (also known as implicit pilots or pilot embedding), i.e., pilots that are added to data [12,13]. The major problem associated with superimposed pilots is the interference between data and training signals: on the one hand, the channel estimates are corrupted by the data signal, leading to irreducible noise floors; on the other hand, the detection performance is degraded because of the interference from the training block.
An additional problem associated with m-MIMO is that one needs to employ simple techniques to separate data streams, namely, avoiding the matrix inversions inherent to conventional MIMO receivers, to reduce the implementation complexity, as proposed in [14] as a postprocessing approach.
Improved spectral efficiencies are commonly associated with constellations of a higher order, which also require higher powers to compensate for the reduced minimum Euclidean distances. Such modulations tend to correspond to a higher peak-to-average power ratio (PAPR), which translates into a reduced amplification efficiency [15]. Moreover, OFDM signals, composed of a sum of many independent and parallel subcarriers, also tend to present a high PAPR. This can be mitigated by using SC-FDE schemes, instead of OFDM signals, which present lower envelope fluctuations, translating into lower complexity and more efficient power amplification [9].
A very efficient receiver commonly associated with SC-FDE schemes is the Iterative Block-Decision Feedback Equalization (IB-DFE) technique [15-17]. Such an iterative receiver makes use of feedforward and feedback coefficients to process the signals in the frequency domain, reaching a performance typically much better than that of a noniterative receiver. IB-DFE can be viewed as turbo equalization [18,19].
The Zero-Forcing (ZF) algorithm tends to be complex, because it requires the computation of the pseudoinverse of the channel matrix, for each frequency component. In this paper, we avoid this complexity by implementing the m-MIMO with the MRT and EGT algorithms. These follow the preprocessing approach, instead of the traditional postprocessing, which simplifies the receiver, and they are combined with an iterative receiver and SC-FDE transmissions. Note that MRT corresponds to the well-known Maximum Ratio Combiner (MRC) algorithm [20], with the difference being the location where it is implemented: the former uses the preprocessing approach, whereas the latter is employed following the traditional postprocessing approach. The same rationale applies to EGT versus the Equal Gain Combiner (EGC), as described in [20]. Since these algorithms generate a certain level of interference, we include an interference cancellation process in the receiver, whose design is based on the IB-DFE receiver.
In this paper, we propose an iterative receiver that considers an iterative detection with interference cancellation and channel estimation for m-MIMO, using an efficient precoding, applied to broadband mm-Wave communications that can employ highly efficient, low-cost saturated amplifiers. This paper compares implicit pilots, conventional pilots, and ideal channel estimation, under the same transmission and receiver scenarios for m-MIMO.
This paper is organized as follows: Section 2 describes the system characterization associated with generic SC-FDE signals; Section 3 considers the transmitter structure for the proposed m-MIMO using precoding; Section 4 describes the channel estimation using multiplexed or implicit pilots; Section 5 deals with the receiver design for MIMO detection and channel estimation; Section 6 analyzes the performance results; and Section 7 concludes the paper.

System Characterization
This paper considers SC-FDE schemes, using Quadrature Phase Shift Keying (QPSK) modulation. The time-domain block signal to be transmitted is $\{x_n;\ n = 0, 1, \ldots, N-1\}$, where $N$ corresponds to the length of the data block. The frequency-domain block is generated from the time-domain block as $\{X_k;\ k = 0, 1, \ldots, N-1\} = \mathrm{DFT}\{x_n;\ n = 0, 1, \ldots, N-1\}$, i.e., by performing the DFT (Discrete Fourier Transform) of the time-domain block.
By assuming that the cyclic prefix is longer than the overall channel impulse response of each channel, after removing the cyclic prefix, the received frequency-domain signal becomes

$$Y_k = H_k X_k + N_k, \tag{1}$$

where $H_k$ denotes the channel frequency response for the $k$th subcarrier (the channel is assumed invariant in the frame) and $N_k$ is the frequency-domain block channel noise for that subcarrier. Moreover, from (1) we obtain the received time-domain signal through the inverse DFT of the block $\{Y_k\}$.
At the output of the FDE we have the samples
We assume a frame structure with $N$ subcarriers per block and $N_T$ time-domain blocks, each one corresponding to an "FFT block."
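As an illustration of the frequency-domain model in (1), the following minimal Python sketch passes one QPSK block through a multipath channel directly in the frequency domain and then applies a one-tap equalizer per subcarrier. The function names, the example channel taps, and the regularization parameter `beta` are assumptions made for illustration only; with `beta = 0` the equalizer reduces to zero forcing, which may differ from the FDE coefficients used in the paper.

```python
import cmath
import random


def dft(x):
    """Unnormalized DFT: X_k = sum_n x_n * exp(-j*2*pi*k*n/N)."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts)
                for n in range(n_pts)) for k in range(n_pts)]


def idft(X):
    """Inverse of dft(), including the 1/N factor."""
    n_pts = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / n_pts)
                for k in range(n_pts)) / n_pts for n in range(n_pts)]


def qpsk_block(n_pts, rng):
    """Random QPSK symbols with real and imaginary parts in {-1, +1}."""
    return [complex(rng.choice((-1, 1)), rng.choice((-1, 1)))
            for _ in range(n_pts)]


def sc_fde_link(x, h, noise_std, beta, rng):
    """Apply Y_k = H_k X_k + N_k, then a one-tap FDE; return time samples.

    Working directly in the frequency domain assumes a cyclic prefix longer
    than the channel impulse response h, so the convolution is circular.
    """
    n_pts = len(x)
    X = dft(x)
    H = dft(list(h) + [0j] * (n_pts - len(h)))
    Y = [H[k] * X[k]
         + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
         for k in range(n_pts)]
    # Assumed FDE coefficient (not taken from the paper):
    # F_k = H_k^* / (beta + |H_k|^2); beta = 0 gives zero forcing.
    X_eq = [Y[k] * H[k].conjugate() / (beta + abs(H[k]) ** 2)
            for k in range(n_pts)]
    return idft(X_eq)


rng = random.Random(0)
x = qpsk_block(16, rng)
# Noise-free zero-forcing case: the transmitted block is recovered.
x_hat = sc_fde_link(x, [1, 0.5, 0.25j], noise_std=0.0, beta=0.0, rng=rng)
```

In the noise-free case the equalized block matches the transmitted one up to rounding; with noise, a nonzero `beta` trades residual intersymbol interference against noise enhancement on weak subcarriers.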

International Journal of Antennas and Propagation

Assuming the conventional linear FDE for SC schemes, the postprocessing comes as where As expected,

Transmitter Structure for the Proposed Massive MIMO Using Precoding
This paper focuses on a massive MIMO scenario, as depicted in Figure 1, which concerns the transmission between a transmitter with $T$ antennas and a receiver with $R$ antennas. This system can be employed between a Base Station (BS) and a Mobile Terminal (MT) with $R$ receiving antennas, to send multiple streams of data. This paper focuses on the scenario with $R$ data streams, where $T \gg R$. Moreover, the uplink scenario can also be considered by the proposed system, as long as the MT has enough processing power to implement the precoding. Since the proposed preprocessing algorithms based on MRT and EGT are very simple, such an uplink scenario can easily be implemented.
Using the matrix-vector representation, we can express (1) for m-MIMO, using the corresponding frequency-domain block as where $H_k$ denotes the $R \times T$ channel matrix for the $k$th frequency, with the $(r, t)$th element $H$ where $B_k$ denotes the $T \times R$ precoding matrix, and the data symbols
Depending on the algorithm employed, the precoding matrix $B_k$ can be computed as follows:
(1) Using the Zero-Forcing Transmitter (ZFT) algorithm (ZFT refers to the ZF algorithm implemented as precoding, at the transmitter side), $B_k$ becomes
(2) Using the MRT algorithm, $B_k$ becomes where $T$ stands for the number of transmitting antennas
(3) Using the EGT algorithm, $B_k$ becomes
A disadvantage of the ZFT is the need to compute the pseudoinverse of the channel matrix, for each frequency component, which demands high processing power. This paper mitigates this limitation by using the MRT and EGT algorithms. Nevertheless, a certain level of interference is generated, which degrades the performance, especially for moderate values of $T/R$. Such interference can be mitigated by employing the proposed iterative receiver that performs interference cancellation, as detailed in [20], resulting in an improved performance.
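To make the trade-off between the inversion-free precoders concrete, the sketch below builds MRT-like and EGT-like precoding matrices for a single subcarrier and measures how much inter-stream interference remains in the effective channel $H_k B_k$. The forms used here (conjugate transpose of the channel for MRT, phase-only conjugation for EGT) and the $1/T$ normalization are assumptions based on the usual definitions of these techniques, not a reproduction of the paper's equations; all function names are hypothetical.

```python
import cmath
import random


def rayleigh_matrix(rows, cols, rng):
    """R x T matrix of unit-power complex Gaussian (Rayleigh) gains."""
    s = 0.5 ** 0.5
    return [[complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(cols)]
            for _ in range(rows)]


def mrt_precoder(H):
    """Assumed MRT form: B = H^H / T (T x R matrix)."""
    R, T = len(H), len(H[0])
    return [[H[r][t].conjugate() / T for r in range(R)] for t in range(T)]


def egt_precoder(H):
    """Assumed EGT form: B[t][r] = exp(-j * arg(H[r][t])) / T."""
    R, T = len(H), len(H[0])
    return [[cmath.exp(-1j * cmath.phase(H[r][t])) / T for r in range(R)]
            for t in range(T)]


def effective_channel(H, B):
    """G = H B, the R x R channel seen by the data streams."""
    R, T = len(H), len(H[0])
    return [[sum(H[r][t] * B[t][c] for t in range(T)) for c in range(R)]
            for r in range(R)]


def interference_ratio(G):
    """Total off-diagonal power divided by total diagonal power."""
    R = len(G)
    diag = sum(abs(G[r][r]) ** 2 for r in range(R))
    off = sum(abs(G[r][c]) ** 2
              for r in range(R) for c in range(R) if r != c)
    return off / diag


rng = random.Random(1)
H = rayleigh_matrix(2, 32, rng)          # 32 x 2 scenario (T = 32, R = 2)
G_mrt = effective_channel(H, mrt_precoder(H))
G_egt = effective_channel(H, egt_precoder(H))
```

Neither precoder needs a matrix inversion. The diagonal of the effective channel is real and positive in both cases, while the off-diagonal terms are the residual interference that the iterative receiver is meant to cancel; they shrink relative to the diagonal as $T/R$ grows.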
The iterative receiver (interference canceller), depicted in Figure 2, considers where the frequency-domain estimated data symbols are
The interference cancellation matrix $C_k$ can be computed by where

This interference canceller is implemented using $\{\bar{X}_k\} = \{\bar{X}_0, \ldots, \bar{X}_{N-1}\}$, with $\bar{X}_k$ denoting the frequency-domain average values conditioned to the FDE output for the previous iteration [7], with $\{\bar{X}_k\} = \mathrm{DFT}\{\bar{x}_n\}$. Note that $\bar{x}_n$ can be obtained as defined in [20].
For the first iteration, there is no information about the transmitted symbols and $\bar{X}_k = 0$. The signal and system description of the proposed m-MIMO using postprocessing, instead of preprocessing, is given in [20].

Channel Estimation
4.1. Channel Estimation Using Conventional Pilots. Let us first assume that $X_k = 0$, i.e., there is no data overlapping the training block, as in conventional schemes. In that case, we could estimate the channel frequency response as follows [7]:
then there is no interference between antennas when estimating the corresponding channels (e.g., by using disjoint sets of subcarriers for different antennas). This leads to
The channel estimation error $\varepsilon_k$ is Gaussian-distributed, with zero mean and in [17].
Since the channel impulse response is shorter than the cyclic prefix (which is just a fraction of the block duration), we could employ training blocks that are shorter than the standard data blocks. Alternatively, we could use the enhanced channel estimates as [9,21,22] where $w_n = 1$ if the $n$th time-domain sample is inside the cyclic prefix and 0 otherwise. In this case, the SNR at the channel estimates is improved by a factor $T/T_G$, with $T$ and $T_G$ denoting the duration of the useful part of the block and the cyclic prefix, respectively.
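The windowing idea behind the enhanced estimates can be sketched as follows: a noisy frequency-domain estimate is taken to the time domain, every tap outside the cyclic prefix is zeroed by the window $w_n$, and the result is taken back to the frequency domain. This is a minimal illustration under the assumption that the true impulse response lies entirely inside the first `n_cp` samples; the function names and the example taps are hypothetical.

```python
import cmath
import random


def dft(x):
    """Unnormalized DFT of a list of complex samples."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_pts)
                for n in range(n_pts)) for k in range(n_pts)]


def idft(X):
    """Inverse of dft(), including the 1/N factor."""
    n_pts = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / n_pts)
                for k in range(n_pts)) / n_pts for n in range(n_pts)]


def enhance_estimate(H_est, n_cp):
    """Keep only the first n_cp time-domain taps (w_n = 1), zero the rest."""
    h_est = idft(H_est)
    h_win = [tap if n < n_cp else 0j for n, tap in enumerate(h_est)]
    return dft(h_win)


def mean_sq_error(A, B):
    """Average squared magnitude of the difference between two blocks."""
    return sum(abs(a - b) ** 2 for a, b in zip(A, B)) / len(A)


rng = random.Random(2)
N, N_CP = 32, 8                          # block length and prefix length
h_true = [0.8, 0.5j, -0.3, 0.1]          # example taps, shorter than N_CP
H_true = dft(h_true + [0j] * (N - len(h_true)))
H_noisy = [Hk + complex(rng.gauss(0, 0.1), rng.gauss(0, 0.1))
           for Hk in H_true]
H_enh = enhance_estimate(H_noisy, N_CP)
```

Because the window keeps only `N_CP` of the `N` time-domain noise samples while leaving the channel taps untouched, the noise power in the estimate drops by roughly `N / N_CP`, which mirrors the $T/T_G$ gain stated above.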

4.2. Channel Estimation Using Implicit Pilots.
Let us now consider the use of superimposed pilots, i.e., $X_k \neq 0$ for the subcarriers with pilots. In the following we will assume that [19] where $\sigma_D^2$ stands for the power of the data. Moreover, it is assumed that [19]
Clearly, we will have interference between data symbols and pilots, which leads to performance degradation. To overcome this problem, we can employ pilots with relatively low power and average the pilots over a large number of blocks so as to obtain accurate channel estimates. This is very effective since the data symbols usually have zero mean and different data blocks are uncorrelated. Naturally, there are limitations on the length of this averaging window, since the channel should be constant within it (not to mention the associated delays). Once we have an accurate channel estimate, we can detect the data symbols, removing first the signal associated with the pilots.
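The benefit of averaging superimposed pilots over several blocks can be illustrated with a simplified single-antenna model per subcarrier, $Y_k = H_k (X_k + P_k) + N_k$, where $P_k$ is the known pilot. In the sketch below the frequency-domain data are drawn as zero-mean complex Gaussian samples, which is an assumption (the DFT of a QPSK block is only approximately Gaussian), and all names and parameter values are hypothetical.

```python
import random


def complex_gauss(power, rng):
    """Zero-mean complex Gaussian sample with the given average power."""
    s = (power / 2) ** 0.5
    return complex(rng.gauss(0, s), rng.gauss(0, s))


def estimate_from_superimposed_pilots(H, pilot, n_blocks, data_power,
                                      noise_power, rng):
    """Average Y_k over n_blocks, then divide by the known pilot."""
    estimates = []
    for Hk in H:
        acc = 0j
        for _ in range(n_blocks):
            data = complex_gauss(data_power, rng)    # zero-mean data term
            noise = complex_gauss(noise_power, rng)
            acc += Hk * (data + pilot) + noise
        estimates.append(acc / (n_blocks * pilot))
    return estimates


def mean_sq_error(A, B):
    """Average squared magnitude of the difference between two blocks."""
    return sum(abs(a - b) ** 2 for a, b in zip(A, B)) / len(A)


rng = random.Random(3)
H = [complex_gauss(1.0, rng) for _ in range(64)]    # channel is kept fixed
PILOT = 0.5       # amplitude; power 0.25, about -6 dB below the data power
est_1 = estimate_from_superimposed_pilots(H, PILOT, 1, 1.0, 0.01, rng)
est_256 = estimate_from_superimposed_pilots(H, PILOT, 256, 1.0, 0.01, rng)
mse_1 = mean_sq_error(est_1, H)
mse_256 = mean_sq_error(est_256, H)
```

With a single block the data term dominates the estimation error; since the data are zero mean and independent across blocks, the error power falls roughly in proportion to the number of averaged blocks, provided the channel stays constant over the averaging window.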
Let us assume a frame with $N_T$ time-domain blocks, each with $N$ subcarriers. If the cyclic prefix of each FFT block has $N_G = N T_G/T$ samples, we will need $N_G$ equally spaced frequency-domain pilots for the channel estimation. For pilot spacing in time and frequency, $\Delta N_T$ and $\Delta N_F$, respectively, the total number of pilots in the frame is given by
This means that we have a pilot multiplicity or redundancy of
To avoid significant performance degradation due to channel estimation errors, the SNR associated with the channel estimation, given by $\mathrm{SNR}_{est} \approx N_R \sigma_{TS}^2/\sigma_D^2$, should be much higher than the SNR for the data, $\mathrm{SNR}_{data} = \sigma_D^2/\sigma_N^2$.
For the first iteration, an initial channel estimation is obtained just by correlating the received signal $Y_k^{(r),Rx}$ with the pilots, by using (12) and then (16).
For the second iteration, the data symbols are removed from the received blocks, i.e., the signal regenerated from the data is subtracted from $Y_k^{(r),Rx}$ to give $Y_k^{(r),TS}$, and the channel estimation is improved by using (12) and (16).
For the third and further iterations, the pilots are removed from the received blocks, as and the average values of the data symbols will be used as pilots for obtaining the channel frequency response estimate where , the matrix of the frequency-domain precoded signals generated by $W_k = B_k X_k$, as defined by (6), and where $B_k$ denotes the $T \times R$ precoding matrix, as defined by (7), (8), or (9), depending on the precoding algorithm. Since using $\alpha = 0$ might lead to noise enhancement effects in the channel estimates when $|W_k^{i}|^2$ is small, we will consider
If we have moderate to high SNR, then and we could use $\alpha = 0$.
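The role of $\alpha$ can be illustrated with a scalar, per-subcarrier version of a data-aided estimate. The form used below, $\tilde{H}_k = Y_k W_k^{*}/(\alpha + |W_k|^2)$, is a common regularized estimator and is an assumption here; it is not necessarily the exact expression of the paper. The numerical values are chosen only to show the effect.

```python
def data_aided_estimate(Y, W, alpha):
    """Per-subcarrier estimate H_k = Y_k * conj(W_k) / (alpha + |W_k|^2)."""
    return [y * w.conjugate() / (alpha + abs(w) ** 2) for y, w in zip(Y, W)]


H_TRUE = 1 + 0j
# First subcarrier: strong reference, no noise.
# Second subcarrier: very weak reference with comparatively large noise.
W = [1 + 0j, 0.01 + 0j]
noise = [0j, 0.1 + 0j]
Y = [H_TRUE * w + n for w, n in zip(W, noise)]

est_plain = data_aided_estimate(Y, W, alpha=0.0)   # no regularization
est_reg = data_aided_estimate(Y, W, alpha=0.1)     # assumed alpha value
```

With $\alpha = 0$ the strong subcarrier is estimated exactly, but on the weak subcarrier the noise is divided by a tiny $|W_k|^2$ and the estimate is off by about a factor of ten. A positive $\alpha$ bounds that error at the cost of biasing the estimates toward zero, which is why $\alpha = 0$ is only attractive at moderate to high SNR.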
Finally, $\tilde{H}_k^{TS}$ and $\tilde{H}_k^{D}$, i.e., the channel estimates obtained from the training sequences and from the data (data aided), respectively, can be combined to provide the normalized channel estimates with minimum error variance, as defined by where

In this section, we present a receiver with m-MIMO detection and channel estimation for SC-FDE with superimposed pilots. Without loss of generality, it is assumed that there is a pilot for each subcarrier of each block of the frame (and for each transmit antenna), i.e., $\Delta N_F = \Delta N_T = 1$, leading to $N_{TS}^{Frame} = N N_T$ and a pilot multiplicity or redundancy of
The principles behind this receiver are the following (see indexes in the chains of Figure 3):
(i) We first obtain the preliminary channel frequency response estimate, as in (12). This initial phase of detection and channel estimation is depicted in Figure 3, where different types of dashed lines represent the first and second iterations, while solid lines represent the third and further iterations. Note that $\tilde{H}_k^{(t,r),i}$ denotes the channel estimate between the $t$th transmit and the $r$th receive antenna, at the $i$th iteration
(ii) Then, and as in (16), the channel estimate is and the blocks of detected samples $\tilde{X}_k^{1}$ are generated using the receiver with interference cancellation, as defined in Section 3. Thus, considering SC-FDE signals with interference canceller in the receiver, the decoded symbols become where $C_k$ is defined in (11). Note that denotes the average signal conditioned to the output of the interference canceller, for the previous iteration.
These average data values are then used to generate the average values of the transmitted symbols $\bar{X}_k^{1}$ (as in (ii) to (v)) that will be used in the next iteration
(v) For the next iterations, the pilots are removed from the received blocks, and the average values of the data symbols will be used as pilots for obtaining the channel frequency response estimate using data-aided $\tilde{H}_k^{(t,r),D,i}$, as defined by (23)
(vi) As in (ii), we use the approach of (16) to enhance the channel estimates
(vii) Finally, we combine the channel estimations obtained from the training sequences with those obtained from the data (data aided), to provide the normalized channel estimates with minimum error variance, as defined by (26)
(viii) Steps (iv) to (vii) are repeated for each iteration of the receiver
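The combination in step (vii) can be sketched with inverse-variance weighting, which is the standard way to merge two unbiased estimates with independent errors so that the resulting error variance is minimized. Whether this coincides exactly with the paper's combining rule is an assumption; the function names and the variances used in the example are hypothetical.

```python
def combine_estimates(H_ts, H_d, var_ts, var_d):
    """Weight each estimate by the error variance of the other one."""
    w_ts = var_d / (var_ts + var_d)      # weight of the pilot-based estimate
    w_d = var_ts / (var_ts + var_d)      # weight of the data-aided estimate
    return [w_ts * a + w_d * b for a, b in zip(H_ts, H_d)]


def combined_variance(var_ts, var_d):
    """Error variance after combining; never above the smaller input."""
    return var_ts * var_d / (var_ts + var_d)


# Equal reliabilities: plain average of the two estimates.
equal = combine_estimates([1 + 0j], [3 + 0j], var_ts=1.0, var_d=1.0)
# Data-aided estimate three times noisier: result leans to the pilot one.
skewed = combine_estimates([1 + 0j], [3 + 0j], var_ts=1.0, var_d=3.0)
```

The weights sum to one, so the combined estimate stays unbiased, and the less reliable estimate contributes less. In early iterations, when the data decisions are poor, the pilot-based estimate would therefore dominate.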

Performance Results
This section studies the Bit Error Rate (BER) performance obtained with m-MIMO using precoding, after the estimation of the channel parameters, using implicit pilots. The SC-FDE transmission technique is considered. The BER is evaluated as a function of $E_b/N_0$, where $N_0$ is the one-sided power spectral density of the noise and $E_b$ is the energy of the transmitted bits (i.e., the degradation due to the useless power spent on the cyclic prefix is not included).
The BER was evaluated using Monte Carlo simulations, with QPSK modulation and with a block length of $N = 256$ symbols (similar results were observed for other values of $N$, provided that $N \gg 1$). We considered 10000 independent channel realizations for obtaining the average error rates. A Rayleigh fading channel was considered with 16 uncorrelated equal power paths (it was assumed invariant during the block duration). The duration of the useful part of the blocks ($N$ symbols) is 1 μs, and the cyclic prefix has a duration of 0.125 μs. The impact of the CP duration on the performance is residual as long as the impulse response presents a high number of separated multipath components (with different delays with regard to the symbol period), which is the case in the current paper. For SC-FDE systems, up to four iterations of the receiver (detection and interference cancellation) were considered. Beyond four iterations, the performance improvement was almost negligible.
Perfect synchronization is assumed, as well as a transmitter with linear power amplification.

6.1. Ideal Channel Estimation. Figure 4 considers BER results for massive MIMO with 32 transmitting antennas and 2 receiving antennas (32 × 2), using precoding with ZFT, MRT, and EGT, and ideal channel estimation. The Matched Filter Bound (MFB) performance is also shown. Results with and without interference cancellation are shown in Figure 4. In the scenario of Figure 4, the ZFT achieves a performance close to the MFB. Note that a regular SC-FDE receiver, without interference cancellation, is employed in the case of the ZFT. This receiver is used because the ZFT algorithm does not generate interference. It is observed that, with 4 iterations of the interference cancellation, the performance obtained with the MRT approximates that of the MFB and ZFT, without having to invert matrices. Moreover, in the same scenario, the MRT algorithm achieves a performance better than that of the EGT.
Figure 5 considers BER results for massive MIMO with 32 transmitting antennas and 8 receiving antennas (32 × 8), using precoding, and ideal channel estimation. The conclusions that can be extracted from Figure 5 are similar to those obtained from Figure 4. Nevertheless, by increasing the number of receive antennas (which corresponds to an increase in the number of MTs or number of parallel streams of data sent to a certain MT), we are increasing the level of interference. Consequently, the performance results obtained with the MRT and EGT without interference cancellation are worse than those obtained in the 32 × 2 scenario. Nevertheless, by employing the new proposed interference cancellation, the performance results improve. The performance results obtained with the MRT and with 4 iterations of the interference canceller get closer to those obtained with the MFB.
Figure 6 considers BER results for massive MIMO with 128 transmitting antennas and 8 receiving antennas (128 × 8), using precoding, and ideal channel estimation. As compared to Figure 5, we have increased the number of transmit antennas, while the number of receive antennas was kept unchanged, which resulted in a performance improvement for MRT/EGT, without and with interference cancellation.
6.2. Channel Estimation Effects. This section presents results using implicit pilots for channel estimation, for m-MIMO using precoding. A comparison is also performed with the results obtained with ideal channel estimation and with the results obtained with conventional pilots.
Figure 7 shows the BER with implicit pilots, considering a single data block, a single iteration of the iterative channel estimator, and a pilot power of −12 dB, as compared to the power of the data. Results with ideal channel estimation are also plotted in Figure 7. As can be seen, the results obtained with implicit pilots, using such parameters, are very poor. This occurs for several reasons:
(i) There is interference between data symbols and pilots, which leads to performance degradation. To overcome this problem, we can employ pilots with relatively low power and average the pilots over a large number of blocks so as to obtain accurate channel estimates. This is very effective since the data symbols usually have zero mean and different data blocks are uncorrelated. Naturally, there are limitations on the length of this averaging window, since the channel should be constant within it (not to mention the associated delays). In the following, results with 5 and 15 blocks are shown
(ii) A single iteration of the iterative channel estimator is probably not enough to achieve an acceptable estimate of the channel
(iii) The power of the pilot is possibly too low. In the case of implicit pilots, a trade-off of the pilot power needs to be achieved because (1) the power of the pilot must be sufficiently low such that the interference generated by the pilots over the data symbols is not high and (2) the power of the pilot must be sufficiently high such that the initial channel estimate obtained with the pilots is good enough for the further iterations of the channel estimation to perform well
Figure 8 shows the BER performance results with implicit pilots, considering 5 data blocks, three iterations of the iterative channel estimator, and a pilot power of −6 dB, −3 dB, and 0 dB, as compared to the power of the data. Results with ideal channel estimation are also plotted. As can be seen, the results obtained with implicit pilots, using such parameters, are better than those obtained in the scenario of Figure 7 (pilot power of −12 dB and −9 dB). The results obtained with a pilot power of −3 dB are slightly better than those of power −6 dB. Except for the ZFT (that performs very poorly), the results obtained with a pilot power of 0 dB are very close to those of power −3 dB. Using implicit pilots, the best overall performance is always achieved by the MRT, followed by the EGT. It is noted that the performances obtained with MRT and EGT and a pilot power of −3 dB and 0 dB are very close to the ideal channel estimation counterparts. Finally, it is worth noting that while the ZFT with ideal channel estimation performs almost as well as the MFB, by considering nonideal channel estimation, the performance obtained with the ZFT is always poor. Consequently, one can conclude that ZFT is not a good choice, either because it performs poorly under nonideal channel estimation or because it is very demanding from the processing point of view (it requires computing the pseudoinverse of the channel matrix, for each frequency component).

Figure 8: BER results with 32 × 2 using implicit pilots with 5 data blocks, 3 iterations, and a pilot power of −6 dB, −3 dB, and 0 dB.
Figure 9 shows the BER performance results with implicit pilots, considering 5 data blocks and a pilot power of −3 dB, with two versus three iterations of the iterative channel estimator.Results with ideal channel estimation are also plotted.As can be seen, with the exception of the ZFT, the performances obtained with 2 and 3 iterations of the channel estimator are very similar.This occurs for both MRT and EGT.It is also noted that such results are very close to those obtained with the ideal channel estimation.
Figure 10 shows the BER performance results with implicit pilots, considering 5 versus 15 data blocks, with pilot power of 0 dB, and with three iterations of the iterative channel estimator.Results with ideal channel estimation are also plotted.As can be seen, the difference of performance between 5 and 15 blocks is almost imperceptible.Results obtained with ZFT, MRT, and EGT with 5 and 15 blocks are very close to the performance obtained with ideal channel estimation.
Figure 11 shows the BER performance results with implicit pilots, considering 5 data blocks, with a pilot power of 0 dB, and with three iterations of the iterative channel estimator. The results of the implicit pilots are compared with those obtained with conventional pilots. Moreover, results with ideal channel estimation are also plotted. As can be seen, in the scenario of Figure 11, the results obtained with the implicit pilots are very similar to those obtained with conventional pilots. An exception is the ZFT, where the conventional pilots achieve a performance worse than that obtained with implicit pilots.

Conclusions
The channel estimation for the m-MIMO system was considered in this paper, using the SC-FDE transmission technique. m-MIMO using precoding with different algorithms was adopted, facilitating its usage in mm-Wave communications.
We considered both conventional multiplexed pilots and implicit pilots for channel estimation purposes.To overcome the difficulties inherent to the interference levels between data and pilots that occur with implicit pilots (superimposed pilots), we proposed an iterative receiver structure with interference cancellation and channel estimation.The results of implicit pilots were compared against those obtained with conventional pilots and with ideal channel estimation.
It was shown that by using the proposed MRT and EGT, we avoid the need to compute the pseudoinverse of the channel matrix, for each frequency component, as required for the ZF algorithm. Since MRT and EGT generate a certain level of interference, a novel iterative interference canceller was proposed, which suppresses such interference.
With the proposed MRT and EGT, applied to m-MIMO as precoding algorithms, a performance very close to the MFB is achieved, especially with 4 iterations of the interference canceller.
Moreover, our performance results show that the use of implicit pilots, combined with the proposed iterative receiver, allows performances close to those obtained with ideal channel estimation, as well as close to those obtained with conventional pilots. While the ZFT is the scheme that achieves the best overall performance under ideal channel estimation conditions (although demanding high processing to invert matrices), the ZFT degrades heavily when actual (nonideal) channel estimation is considered. The MRT tends to achieve the best overall performance, followed by the EGT, under nonideal channel estimation, either with implicit pilots or with conventional pilots.
to precoding, defined by
$\sigma_{TS}^2$ stands for the power of the training sequences, i.e., pilots (TS stands for training sequence). Moreover, $Y_k^{(r)}$ stands for the signal at the $r$th receive antenna ($r = 1, 2, \ldots, R$), and $X_k^{(t),TS}$ denotes the training sequence transmitted by the $t$th transmit antenna ($t = 1, 2, \ldots, T$). If the training sequences associated with different transmit antennas are orthogonal, i.e.,

Figure 2: Block diagram of m-MIMO chain with receiver and interference cancellation associated with the precoder.

Figure 3: Block diagram of m-MIMO receiver with interference canceller and channel estimation, using preprocessing.
(iii) Then we improve the channel estimation with the pilots, as defined by (21), estimate the signal regenerated from the data, subtract it from $Y_k^{(r),Rx}$ and repeat (i) and (ii), using $Y_k^{(r),TS}$ instead of $Y_k^{(r),Rx}$
(iv) The pilots/training sequences are removed from the received frequency-domain blocks, leading to [23]

Figure 7: BER results with 32 × 2 using implicit pilots with 1 data block, 1 iteration, and a pilot power of −12 dB.

Figure 9: BER results with 32 × 2 using implicit pilots with 5 data blocks and pilot power −3 dB, with 2 versus 3 iterations in the channel estimator.

Figure 10: BER results with 32 × 2 using implicit pilots with 5 versus 15 data blocks, 3 iterations, and a pilot power of 0 dB.

Figure 11: BER results with 32 × 2 using implicit pilots with 5 data blocks, 3 iterations, and a pilot power of 0 dB versus conventional pilots.