Subspace-based self-interference cancellation for full-duplex MIMO transceivers

This paper addresses self-interference (SI) cancellation at baseband for full-duplex MIMO communication systems, taking practical transmitter imperfections into account. In particular, we develop a subspace-based algorithm to jointly estimate the SI and intended channels and the nonlinear distortions. By exploiting the covariance and pseudo-covariance of the received signal, we can increase the dimension of the observation space while keeping the dimension of the signal subspace constant, and hence, the proposed algorithm can be applied to most full-duplex MIMO configurations with arbitrary numbers of transmit and receive antennas. The channel coefficients are estimated, up to an ambiguity term, without any knowledge of the intended signal. A joint detection and ambiguity identification scheme is proposed. Simulation results show that the proposed algorithm can properly estimate the channel with only one pilot symbol and offers superior SI cancellation performance.


Introduction
Half-duplex transmission is commonly used in current communication systems by transmitting and receiving over orthogonal channels. Full-duplex communication represents an attractive alternative to save channel resources or to increase the transmission efficiency. The main deterrent to employing full-duplex is the large self-interference (SI) from the simultaneous transmission and reception over the same frequency band. The SI is usually several orders of magnitude stronger than the intended signal received from the other transmitter, because the latter travels a longer distance than the former. Recent works have shown that, using different cancellation stages, the SI can be sufficiently suppressed to properly detect the intended signal [1,2].
The SI is first cancelled at the radio-frequency (RF) level, prior to the low-noise amplifier (LNA) and the analog-to-digital converter (ADC), to avoid overloading/saturation of these devices [1][2][3]. In other words, the SI should be sufficiently suppressed at RF to maintain the receiver's limited dynamic range. Then, further SI suppression can be done after the ADC at the baseband [4,5].
In the following, we assume that a cancellation stage at RF is available and we concentrate on the SI cancellation in the baseband.
To further reduce the SI, channel state information of the interference link should be available. Therefore, estimating the SI channel is a critical issue in full-duplex systems. In [6], the SI channel estimation is performed in the frequency domain using a least-squares (LS) technique. LS and minimum mean square error (MMSE) channel estimators are proposed in [7] to estimate the SI channel at the relay station. However, these approaches ignore the intended signal coming from the other transceiver and treat it as additive noise. An adaptive least mean square algorithm to estimate the SI channel is proposed in [8], where the large power of the SI compared to the intended signal and additive noise is exploited to obtain an estimate of the SI channel. A more elaborate LS-based estimator was presented in [9], where a first estimate of the SI channel is obtained by considering the intended signal as additive noise. Then, detection of the intended signal and channel estimation are iterated to obtain a better estimate of the channel. On the other hand, spatial domain cancellation attempts to reduce the SI by precoding at the transmit chain and decoding at the receive chain. Spatial domain cancellation is formulated in the frequency domain in [10][11][12]. An alternative time domain formulation was presented in [13], where the transmitted signal is precoded so that the SI falls in the null space of the SI channel. These techniques are based on the knowledge of both the SI and intended channels at the two transceivers, which further motivates the development of channel estimators for full-duplex systems. A novel cancellation method is proposed in [14] by adding a cancelling signal to the original signal.
In addition to the SI channel information for SI cancellation, intended channel knowledge is an important prerequisite for signal detection. Motivated by this fact, channel estimation has been the subject of intense research. In the case of data-aided transmissions, training-based techniques can be applied [15,16]. However, the amount of training increases dramatically with the number of antennas and the channel order. Blind approaches have been proposed as more bandwidth-efficient techniques [17,18], among which subspace methods, initially presented in [19], have great potential. By decomposing the covariance matrix of the received signal, subspace methods exploit the orthogonality between the noise and the signal subspaces in the observation space to express the channel coefficients as a linear combination of a basis of the signal subspace. Although previous research has shown the potential of this procedure to give an accurate estimate of the channel, it remains of limited practical interest, because the noise subspace needs to be nondegenerate and it is not obvious how to satisfy this condition. Previous works rely on oversampling of the received signal or on using more receive antennas than transmit antennas [20,21]. However, such solutions increase the receiver cost and need additional hardware. Moreover, they may result in correlated noise, which makes the subspace technique inappropriate. A maximum likelihood estimator was presented in [22] by exploiting the pilots in the intended signal.
In the full-duplex context, the transmitter impairments, including power amplifier (PA) nonlinearity and IQ mixer imbalance, become limiting factors and need to be reduced to properly detect the intended signal. In practice, the in-band image resulting from the IQ mixer in mobile user equipment is about 28 dB lower than the direct signal [23]. In the presence of strong SI, about 50 dB higher than the intended signal, this IQ image represents additional interference for the intended signal. The effects of transceiver impairments are illustrated in detail in [3,24]. Due to the importance of the nonlinearities, a digital cancellation procedure has been proposed in [25] to reduce the effects of the PA by estimating its nonlinear coefficients, and another algorithm has been proposed to deal with the IQ mixer imbalance [26]. However, these works do not discuss the intended signal, and treating it as additive noise limits the estimation performance.
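As a rough worked example of the power budget implied by the two figures quoted above (a sketch that simply combines the 50 dB and 28 dB values in decibels; the symbols $P_{\text{SI}}$, $P_{\text{int}}$ and $P_{\text{image}}$ are introduced here only for illustration):

```latex
\begin{align*}
P_{\text{SI}} - P_{\text{int}} &\approx 50~\text{dB}, \\
P_{\text{image}} - P_{\text{SI}} &\approx -28~\text{dB}, \\
P_{\text{image}} - P_{\text{int}} &\approx 50 - 28 = 22~\text{dB}.
\end{align*}
```

Under these numbers, the IQ image alone would remain roughly 22 dB above the intended signal, which is why it cannot be ignored even when the linear part of the SI is cancelled.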
In this work, we incorporate the intended signal in the estimation process. We also take into account the transmitter impairments when modelling the SI signal. For realistic multipath propagation channels, we need to estimate the SI channel, the intended channel and the distorted SI. Noting that the intended signal is unknown, we propose a novel subspace method to efficiently estimate the different parameters. Since the received signal consists of the SI and intended signals, the dimension of the signal subspace in full-duplex operation is at least twice that in traditional half-duplex operation [5,27]. Thus, an essential shortcoming of the existing subspace-based technique is that it can be applied only when the number of receive antennas is larger than the number of transmit antennas. In the following, we circumvent this condition and develop a subspace-based algorithm suitable for MIMO full-duplex systems with larger or equal numbers of transmit and receive antennas. We exploit both the covariance and pseudo-covariance matrices of the received signal to effectively increase the dimension of the observation space while keeping the dimension of the signal subspace unchanged. The joint processing of the received signal and its complex conjugate has been used in many works to improve the detection performance of various systems [28,29]. Also, in an entirely different context, the improper property of the received signal was first exploited for channel identification in [30] to obtain a virtual SIMO model from a SISO one. Preliminary results can be found in [31] for real-valued symbols, which enable the application of widely linear processing techniques but entail a loss in spectral efficiency compared to complex-valued symbols. In this paper, we propose a method to apply widely linear processing to complex symbols by forcing the transmit signal to be improper.
We justify the advocated time domain approach, compare its performance to that of a frequency domain approach, and generalize the PA model to any nonlinearity order. In practice, we cannot blindly recover the channel coefficients since an ambiguity term always appears in the final estimate [5]. This ambiguity is resolved using a sequence of pilot symbols, considerably shorter than that needed in training-based techniques. In the following, we propose a joint data detection and estimation of the ambiguity term to considerably reduce the length of the pilot sequence. We show through simulation that just one pilot symbol is sufficient to accurately estimate the channel.
The paper is organized as follows. In Section 2, the full-duplex system model is presented. The subspace-based channel estimation is described in Section 3. In Section 4, we describe the joint decoding and ambiguity removal procedure. Illustrative simulation results are given in Section 5, and Section 6 presents the conclusion.
The notations commonly used in this paper are as follows. Superscripts $(\cdot)^*$, $(\cdot)^T$ and $(\cdot)^H$ refer to the conjugate, transpose and conjugate transpose of matrices or vectors, respectively. For a given vector $x$, $\mathrm{diag}(x)$ returns a diagonal matrix whose diagonal elements are the entries of $x$. $\mathrm{rank}(M)$ returns the rank of a given matrix $M$, $\det(M)$ returns the determinant of $M$ and $\mathrm{vect}(M)$ stacks the columns of $M$ into one vector. The operator $\otimes$ refers to the Kronecker product of two matrices. $\Re(\cdot)$ and $\Im(\cdot)$ return the real and imaginary parts of complex numbers. $E(\cdot)$ denotes the mathematical expectation. $\|\cdot\|_2$ returns the Euclidean norm of a vector. $I_p$ refers to the $p \times p$ identity matrix and $1_p$ to the $p \times 1$ vector with all elements equal to 1. A term accented by a hat, $\hat{x}$, denotes an estimate of $x$.

Full-duplex MIMO system model
Consider two transceivers communicating in a full-duplex fashion. The simultaneous transmission and reception creates self-interference (SI), which has to be cancelled before the demodulation process. The SI signal is first suppressed at RF, prior to the low-noise amplifier (LNA) and analog-to-digital converter (ADC), to avoid overloading/saturation of these components [2,3,32]. In [5], we proposed an efficient compressed-sensing (CS)-based algorithm for the RF SI cancellation stage. In this work, we concentrate on the development of a subspace-based algorithm to jointly estimate the SI and intended channels and the nonlinear distortions for the baseband SI cancellation stage of a full-duplex MIMO transceiver with arbitrary numbers of transmit and receive antennas. The output signal of the RF SI cancellation stage consists of the residual SI, the intended signal received from the other transceiver and the additive thermal noise. Figure 1 shows a simplified block diagram of a MIMO transceiver. The residual SI can be further suppressed at baseband after the ADC using digital signal processing (DSP). The advantage of working in the digital domain, as compared to RF, is that sophisticated DSP methods can be applied.
Fig. 1 Simplified block diagram of the full-duplex transceiver with RF and baseband SI cancellation stages
Both transceivers are equipped with $N_t$ transmitting antennas and $N_r$ receiving antennas. At transmitting antenna $q$, a group of $N$ data symbols $X_q = [X_q(0), \ldots, X_q(N-1)]^T$ is first modulated by the IFFT matrix to form an OFDM block, then the time domain vector $x_q = [x_q(0), \ldots, x_q(N-1)]^T$ is extended by a cyclic prefix of length $N_{cp}$ (see note 1) and the resulting vector is sent sequentially. In transmit stream $q$, the complex signal $x_q(t)$ after the digital-to-analog conversion (DAC) is passed through an imbalanced IQ mixer whose output is as follows:
\[ x_q^{IQ}(t) = k_{1,q}\, x_q(t) + k_{2,q}\, x_q^*(t), \tag{1} \]
where $k_{1,q}$ and $k_{2,q}$ are the responses of the IQ mixer at antenna $q$ to the direct signal and the image, respectively. Then, the signal is amplified by a nonlinear PA. In the following, we model the PA with a Hammerstein model whose response is:
\[ x_q^{PA}(t) = \Big( \sum_{p=0}^{P} \alpha_{2p+1,q}\, x_q^{IQ}(t)\, |x_q^{IQ}(t)|^{2p} \Big) \star f(t), \tag{2} \]
where $\alpha_{2p+1,q}$, for $p = 0, \ldots, P$, are the nonlinearity coefficients of the PA at transmit antenna $q$, $P$ is the nonlinearity order and $f(t)$ is the memory of the PA. In (2), $\star$ denotes the convolution operator.
The transmitted signal is coupled to produce SI in the receiver. Considering multipath channels, the received signal at antenna $r$ is as follows:
\[ y_r(t) = \sum_{q=1}^{N_t} h^c_{r,q}(t) \star x_q^{PA}(t) + \sum_{q=1}^{N_t} h^s_{r,q}(t) \star s_q(t) + w_{th,r}(t), \tag{3} \]
where $s_q(t)$ is the transmitted signal from the $q$th antenna of the other intended transceiver, $h^c_{r,q}(t)$ is the response of the SI channel from transmitting antenna $q$ to receiving antenna $r$ of the same transceiver, $h^s_{r,q}(t)$ is the response of the intended channel from transmitting antenna $q$ of the other intended transceiver to receiving antenna $r$ of the same transceiver, and $w_{th,r}(t)$ is the additive thermal noise in Rx stream $r$. To reduce the SI before the LNA and ADC, the RF cancellation stage is performed as follows:
\[ y_r^{RF}(t) = y_r(t) - \sum_{q=1}^{N_t} \hat{h}^c_{r,q}(t) \star x_q^{PA}(t), \tag{4} \]
where $\hat{h}^c_{r,q}(t)$ is a first estimate of the SI channel [1,6]. $\hat{h}^c_{r,q}(t)$ is used to adjust the phase, amplitude and delay of the SI to the main propagation path. To include the transmitter distortion in the RF cancellation process, the reference signal is taken from the output of the PA. This RF SI cancellation can attenuate the SI by 30 dB, as reported in practical experiments [6,33]. Then, the received signal passes through the LNA:
\[ y_r^{LNA}(t) = k_{LNA}\, y_r^{RF}(t) + w_{LNA}(t), \tag{5} \]
where $w_{LNA}(t)$ is the additive noise caused by the LNA and $k_{LNA}$ is the gain of the LNA. Finally, the received signal is adjusted by the variable gain amplifier (VGA) to match the dynamic range of the ADC. For simplicity, we suppose that the linear gains $k_{1,q}$ and $\alpha_{1,q}$ of the IQ mixer and PA are equal to 1.
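The transmit chain described above (IQ mixer followed by a Hammerstein PA) can be sketched numerically. The following is a minimal illustration only, assuming a discrete-time version of the model, an image coefficient derived from a 28 dB image rejection, and hypothetical values for the third-order coefficient and for the PA memory filter `f`; none of these numerical values come from the paper.

```python
import numpy as np

def iq_imbalance(x, k1=1.0, k2=10 ** (-28 / 20)):
    """IQ mixer model: direct term plus conjugate image, k1*x + k2*conj(x)."""
    x = np.asarray(x, dtype=complex)
    return k1 * x + k2 * np.conj(x)

def hammerstein_pa(x_iq, alphas, f=(1.0,)):
    """Odd-order static nonlinearity followed by a linear filter f (PA memory).

    alphas[p] multiplies x_iq * |x_iq|**(2p), for p = 0..P.
    """
    x_iq = np.asarray(x_iq, dtype=complex)
    static = sum(a * x_iq * np.abs(x_iq) ** (2 * p) for p, a in enumerate(alphas))
    # Truncate the convolution so the output has the same length as the input.
    return np.convolve(static, np.asarray(f, dtype=complex))[: len(x_iq)]

# Hypothetical example: unit linear gains, a small third-order term, no memory.
x = np.array([1 + 1j, -1 + 0.5j, 0.2 - 0.3j])
x_pa = hammerstein_pa(iq_imbalance(x), alphas=[1.0, -0.05])
```

With `k2 = 0`, `alphas = [1.0]` and `f = (1.0,)` the chain reduces to the identity, which is a convenient sanity check before adding impairments.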
Combining (2), (3) and (5), the received samples are given by where x q,ip,p (n) = x IQ q (n)|x IQ q (n)| 2p resulting from the cascade of IQ mismatch and PA (2p + 1) rd order nonlinearity and w r (n) collects the thermal noise, the LNA noise and the quantization noise. In (6), the global channel responses are given by To have a homogeneous notation, all channels are supposed to have the same order L and the channels of order lower than L are zero-padded so that the different channels have the same order and L still satisfies L < N cp . The received vector y(n) = [y 1 (n), . . . , y N r (n)] T over the N r antennas is given by where for l = 0, 1, . . . , L and w(n) = [ w 1 (n), w 2 (n), . . . , w N r (n)] T . For a more compact representation, we gather the transmitted signals from the N t antennas to obtain (10) where the N r ×N t matrices H (i) (l) and H (s) (l) are given by for l = 0, . . . , L and We then group the channel matrices and gather all the channel coefficients in the following N r M × 2N t N block Toeplitz matrix:
The received OFDM block on the N r antennas is: (14) where M = N + L, the 2N t N × 1 data vector u is given by and For multi-block transmission, the received vector in (14) is indexed by the block number t, i.e., y t . For convenience, we omit this indexation and we will consider later a given number of transmitted blocks to compute the covariance matrix of the received vector.

Subspace-based channel estimator
We propose to apply a subspace-based algorithm to jointly estimate the SI and intended channel coefficients along with the nonlinear coefficients. Subspace methods rely on the orthogonality between the signal and noise subspaces. These two subspaces are obtained from the eigendecomposition of the covariance matrix of the received signal $y$. Denoting by $R_u$ the covariance of $u$, the covariance matrix $R_y$ of the received vector $y$ is given by
\[ R_y = E(y y^H) = H R_u H^H + \sigma_w^2 I_{N_r M}, \tag{16} \]
where $\sigma_w^2$ is the noise variance, as long as the signal samples are uncorrelated with the noise samples (see note 2).
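The orthogonality property that subspace methods rely on can be illustrated on a toy problem. The sketch below is not the paper's estimator: it assumes a random tall matrix `H` standing in for the channel matrix, an identity data covariance, white noise of variance `sigma2`, and an exactly known covariance matrix. All sizes and values are hypothetical.

```python
import numpy as np

def noise_subspace(R_y, signal_dim):
    """Eigenvectors of R_y associated with its smallest eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(R_y)   # eigenvalues in ascending order
    n_noise = R_y.shape[0] - signal_dim
    return eigvecs[:, :n_noise]

rng = np.random.default_rng(0)
m, n, sigma2 = 6, 3, 0.1                     # hypothetical toy dimensions
H = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
R_y = H @ H.conj().T + sigma2 * np.eye(m)    # data covariance assumed identity

V_noise = noise_subspace(R_y, n)
# Every noise eigenvector is orthogonal to every column of H.
leakage = np.abs(V_noise.conj().T @ H).max()
```

In this idealized setting `leakage` is at the level of floating-point rounding. With a covariance estimated from a finite number of blocks the orthogonality only holds approximately, which is why the estimation accuracy depends on the number of observed OFDM symbols.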
The signal subspace is spanned by the columns of the matrix $H$. Since the columns of $H$ are, by construction, linearly independent as soon as there exists an $l \in [0, L]$ such that $H(l)$ is full rank (see note 3), the matrix $H$ is a full-rank matrix. Therefore, the dimension of the signal subspace is $2NN_t$. It follows that, to obtain a nondegenerate noise subspace, its dimension $N_r M - 2N_t N$ should be larger than zero, and thus, the number of receiving antennas should be larger than the number of transmitting antennas for the subspace method to work; in [5], we developed the linear subspace algorithm for this setting. In the following, we develop the subspace-based algorithm for general numbers of transmit and receive antennas. When $N_t = N_r$, the matrix $R_y$ cannot be directly used to find the noise subspace. As an alternative, we consider the augmented received vector
\[ \tilde{y} = [\, y^T, \; y^H \,]^T. \]
The use of the augmented received vector is usually referred to as widely linear processing. In this case, the augmented covariance matrix $R_{\tilde{y}}$ of $\tilde{y}$ has the following structure:
\[ R_{\tilde{y}} = \tilde{H} R_{\tilde{u}} \tilde{H}^H + \sigma_w^2 I_{2N_r M}, \qquad \tilde{H} = \begin{bmatrix} H & 0 \\ 0 & H^* \end{bmatrix}, \]
where $R_{\tilde{u}}$ denotes the covariance matrix of the augmented data vector $\tilde{u} = [\, u^T, \; u^H \,]^T$ and $\sigma_w^2$ is the noise variance. It is worth mentioning that proper noise has a vanishing pseudo-covariance [34]. The main purpose of using the extended received signal is to increase the dimension of the received signal and thus avoid a degenerate noise subspace. Hence, the subspace identification procedure can be derived only if the signal part $\tilde{H} R_{\tilde{u}} \tilde{H}^H$ of the covariance matrix $R_{\tilde{y}}$ is singular. The matrix $R_{\tilde{u}}$ can be expressed in block form in terms of the covariance matrix of $u$, $R_u = E(uu^H)$, the pseudo-covariance matrix $C_u = E(uu^T)$ and their complex conjugates as
\[ R_{\tilde{u}} = \begin{bmatrix} R_u & C_u \\ C_u^* & R_u^* \end{bmatrix}. \]
In the following, we distinguish the two cases of real and complex modulated symbols.
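The dimension argument can be made concrete with a small calculation. The sketch below uses the noise subspace dimension N_r M − 2 N_t N stated above for the linear case; for the widely linear case it assumes that the observation dimension doubles to 2 N_r M while the signal subspace dimension stays at 2 N N_t, as the rank discussion for real symbols suggests. The function name and the example configuration (a 2 × 2 system with N = 64 and L = 7) are chosen here for illustration.

```python
def noise_subspace_dims(n_t, n_r, n_fft, order):
    """Noise subspace dimensions for linear and widely linear processing.

    A non-positive value means that no noise subspace is available.
    """
    m = n_fft + order                        # M = N + L
    signal_dim = 2 * n_t * n_fft             # dimension of the signal subspace
    linear = n_r * m - signal_dim            # observation y
    widely_linear = 2 * n_r * m - signal_dim # augmented observation [y; y*]
    return linear, widely_linear

linear, widely_linear = noise_subspace_dims(n_t=2, n_r=2, n_fft=64, order=7)
```

For this configuration the linear count is negative (142 − 256), whereas the augmented count is positive (284 − 256), which is the situation the widely linear formulation is meant to address.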
For real modulated symbols, it can be shown that $R_{\tilde{u}} = \alpha^2 M \otimes I_{2N_t}$ with the $2N \times 2N$ matrix $M$ having the following form:
\[ M = \begin{bmatrix} I_N & J_N \\ J_N & I_N \end{bmatrix}, \qquad J_N = \begin{bmatrix} 1 & 0^T \\ 0 & \bar{I}_{N-1} \end{bmatrix}, \tag{22} \]
where $\bar{I}_{N-1}$ is the $(N-1) \times (N-1)$ exchange matrix with ones on its anti-diagonal. From (22), we note that each column of $M$ appears exactly twice (the first column of $M$ is the same as the $(N+1)$th column, and the $i$th column of $M$ is the same as the $(2N-i+2)$th column, for $i = 2, \ldots, N$). Therefore, the matrix $M$ has exactly $N$ linearly independent columns and thus its rank is $N$. It follows that the rank of $R_{\tilde{u}}$ is $2NN_t$. In Appendix 1, we show that $R_{\tilde{u}}$ has the eigenvalue zero with multiplicity $2NN_t$ and the eigenvalue $2\alpha^2$ also with multiplicity $2NN_t$. Then, the matrix $R_{\tilde{u}}$ is decomposed as $UDU^H$, where $D$ is the $4NN_t \times 4NN_t$ diagonal matrix with zeroes in the first $2NN_t$ diagonal elements and $2\alpha^2$ in the last $2NN_t$ diagonal elements, and $U$ is an orthogonal matrix whose columns are the corresponding eigenvectors of $R_{\tilde{u}}$.
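The rank and eigenvalue statements for M can be checked numerically. The sketch below assumes a single transmit stream, a unitary IFFT and unit-variance uncorrelated real symbols (so α² = 1), and builds M from the covariance and pseudo-covariance of the time domain vector; the function name is introduced here for illustration.

```python
import numpy as np

def augmented_covariance_real_symbols(n_fft):
    """Covariance of [x; conj(x)] for x = IDFT of unit-variance real symbols."""
    F = np.fft.fft(np.eye(n_fft)) / np.sqrt(n_fft)  # unitary DFT matrix
    Fh = F.conj().T                                  # unitary IDFT matrix
    R = Fh @ Fh.conj().T   # covariance E[x x^H] = F^H F
    C = Fh @ Fh.T          # pseudo-covariance E[x x^T] = F^H F^*
    M = np.block([[R, C], [C.conj(), R.conj()]])
    return np.round(M.real).astype(int)  # entries are numerically 0 or 1

N = 8
M = augmented_covariance_real_symbols(N)
rank = np.linalg.matrix_rank(M)
eigvals = np.linalg.eigvalsh(M)  # ascending order
```

For this toy size, `rank` equals N and `eigvals` consists of N zeros followed by N values equal to 2, in line with the multiplicities quoted from Appendix 1 once the Kronecker factor is accounted for.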
For complex symbols, the pseudo-covariance matrix $C_u$ is generally equal to the zero matrix, which makes the matrix $R_{\tilde{u}}$ full rank. To avoid this problem, we apply a simple precoding at the input of the IFFT. It transforms the data symbol $X_q$ to
\[ \tilde{X}_q = P X_q + Q X_q^*, \tag{23} \]
where $P$ and $Q$ are two matrices. By combining the data symbol $X_q$ and its complex conjugate, we force the pseudo-covariance matrix to be different from zero.
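The effect of combining a symbol with its conjugate can be illustrated with scalars. The sketch below assumes a precoder of the form p·X + q·conj(X) with hypothetical scalar weights p = 0.8 and q = 0.6 (the paper uses matrices whose values are not reproduced here) and computes the exact second-order statistics over an equiprobable 4-QAM alphabet.

```python
import numpy as np

QAM4 = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def second_order_stats(points):
    """Exact variance E|X|^2 and pseudo-variance E[X^2] for equiprobable points."""
    return np.mean(np.abs(points) ** 2), np.mean(points ** 2)

def precode(points, p, q):
    """Scalar widely linear precoding p*X + q*conj(X)."""
    return p * points + q * np.conj(points)

var_in, pvar_in = second_order_stats(QAM4)
var_out, pvar_out = second_order_stats(precode(QAM4, p=0.8, q=0.6))
```

The plain 4-QAM alphabet is proper (its pseudo-variance is zero), while the precoded alphabet has a pseudo-variance of 2pq times the symbol variance, so the augmented covariance is no longer block diagonal. The weights were chosen with p² + q² = 1 so that the average power is unchanged.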
From (24) By dividing ν i into two MN r × 1 vectors, i.e., for i = 1, 2, . . . , p. The matrix H is completely defined by the set of matrices H(l), for l = 0, 1, . . . , L. Therefore, the specific structure of H should be taken into consideration when solving the equations in (27) to obtain a more accurate estimate of the channels. To that end, we divide the two vectors ν i,1 and ν i,2 as follows: where each ν i,j (n), for n = 1, 2, . . . , M, is a N r × 1 vector. From (13) and (28) and ν H i,2 H * can also be partitioned in the same manner. By introducingȟ(l) = vect(H(l)) and V i,j (n) = I 2N t ⊗ ν H i,j (n), for i = 1, . . . , p and j = 1, 2, it is easy to verify

Then, using the previous notations, (27) is rearranged to obtain
or, by taking the transpose of the previous equation: for i = 1, 2, . . . , p. Note that the difference between (27) and (32) is that (32) takes into account the Toeplitz blocks structure of H. Now, collecting all the previous equations, we obtain 1ȟ + 2ȟ * = 0, Separating the real and imaginary parts of (33), we have From (35), the vector h belongs to the right null space of . In practice, h is a linear combination of the 4N t N r right singular vectors of the matrix , denoted by β i , which are equal to the eigenvector of the Gramian H corresponding to the zero eigenvalue. Therefore, an estimate of h is given by where =[ β 1 , β 2 , . . . , β 4N t N r ], and the 4N t N r × 1 vector c represents the ambiguity term to be estimated. The complex channel vector can also be obtained as where is obtained by combining the lines of in the following way: and j is the complex number satisfying j 2 = −1.
We mention that the matrices U 2 and U 4 do not depend on the received signal and can be computed offline prior to the transmission. It is also seen that the overestimated channel order L does not affect the estimation process. This is a common property with other subspace-based estimators [17].

Resolving the ambiguity term
As mentioned above, the subspace that contains the channels is obtained and the ambiguity term needs to be estimated to extract the exact coefficients. Different approaches can be applied to solve the ambiguity term c. To do so, we highlight the contribution of c on the received vector y. First, we separate the matrix in two N t N r (L + 1) × 4N t N r matrices i and s which contribute in the SI and intended channels, respectively i.e., h (i) = i c andȟ (s) = s c . By rearranging the elements where each i,q (l) is a N r × 4N t N r matrix,Ȟ  andˇ i are used to build the matrices H (i) and i , respectively, having the same block structure as H in (13).
Next, we define the diagonal matrices K and A p whose diagonal elements are k =[ k 2,1 , . . . , k 2,N t ] T and α p = [ α 2p+1,1 , . . . , α 2p+1,N t ] T , respectively, and we denote by Using the previous notations and by developing x = x i + (I N ⊗ K )x * i + P p=1 (I N ⊗ A p )x ip,p in terms of the transmitter impairments, one can express the received signal in (14) as where s and H (s) are defined in the same way as i and H (i) , respectively, and s =[ s T (0), . . . , s T (N − 1)] T . After some manipulations, one can easily verify that ( Then, the received vector in (41) is rewritten as In (42), the received vector y is expressed as a linear function of the unknown vector c. This formulation makes the estimation of c more tractable. While the transmitted SI is known, the distorted parts (I N ⊗ A p )x ip,p and (I N ⊗ K )x * i of the SI from the cascade of the IQ mixer and PA need to be estimated. We begin by writing the following cost function f (c, s, K , A on c, K , A p (for p = 1, . . . , P) and s. Given  an initial estimate c of c, the minimization of f ( c, s, K , A p ) with respect to s, K and A p can be recast as a least square (LS) problem. Then, using the solutions s, K and A p , we minimize f (c, s, K , A p ) with respect to c. We iterate this procedure until the estimated parameters converge. An initial estimate of c is obtained using the LS criteria as where the operator (·) # returns the pseudo-inverse of a given matrix. At the k th iteration, the estimate c k−1 obtained at the previous iteration is used to find s, K and A p (or equivalently k and α p ) as follows: where, for clarity, we introduce B = 1 N ⊗ I N t and C k−1 = I NN t ⊗ c k−1 and we use the equality ( Then, s k is transformed in the frequency domain and each element of the frequency domain vector is projected to its closest discrete constellation point. The obtained vector is converted back to the time domain to obtain a better estimate s k of s.
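The decision step described above (transform to the frequency domain, project each element onto the closest constellation point, transform back) can be sketched as follows. This is a simplified illustration for a single stream, assuming a unitary FFT convention and a plain 4-QAM alphabet without the precoding; the function name is hypothetical.

```python
import numpy as np

QAM4 = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def project_to_constellation(s_time, constellation=QAM4):
    """Hard-decide the frequency domain symbols and return to the time domain."""
    n = len(s_time)
    S = np.fft.fft(s_time) / np.sqrt(n)      # unitary transform to frequency
    # Index of the closest constellation point for every subcarrier.
    nearest = np.abs(S[:, None] - constellation[None, :]).argmin(axis=1)
    return np.fft.ifft(constellation[nearest]) * np.sqrt(n)

# Hypothetical check: a lightly perturbed OFDM block is restored exactly.
rng = np.random.default_rng(0)
symbols = QAM4[rng.integers(0, 4, size=16)]
clean = np.fft.ifft(symbols) * np.sqrt(16)
noisy = clean + 0.05 * (rng.standard_normal(16) + 1j * rng.standard_normal(16))
refined = project_to_constellation(noisy)
```

When the LS estimate is close enough to the true block, the projection removes the residual error entirely, which is what allows the alternating procedure to improve from one iteration to the next.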
Then, an update of c at iteration k is obtained as: If a set of P pilot , pilot symbols are available at subcarriers indexed by P = {p 1 , . . . , p P pilot }, the intended transmit signal at antenna q can be represented as the sum of two signals: where the first sequence s p q (n) contains the pilot symbols and the second sequence s d q (n) contains the unknown data symbols transmitted by other intended transmitter. Then, the received vector in (42) is rearranged as follows: where s p and s d are constructed in the same way as s and contain the pilot symbols and unknown symbols, respectively. The initial estimate of c is modified to incorporate the pilot symbols as and the estimates of s d , K and A p at iteration k are given by As before, s d k is converted to the frequency domain, demodulated then transformed to the time domain to obtain s d k . The updated estimate of c at iteration k is obtained as: In the following, we summarize the different steps of the proposed algorithm: 1. Compute the augmented covariance matrix R y by time averaging of T received samples as:

Simulation results
In this section, we provide some simulation results on the performance of the proposed estimation algorithm for a 2 × 2 MIMO full-duplex system. The transmitted bits are mapped to 4-QAM symbols, then passed through an OFDM modulator of length N = 64. The wireless channel is represented as a Rayleigh multipath fading channel with five equal-variance resolvable paths. Since the exact number of paths is supposed to be unknown, the algorithm is parametrized as if there are eight paths. In the following, the SNR is defined as the average intended-signal-to-thermal-noise power ratio and the estimation mean square error (MSE) of $H$ is $\mathrm{MSE} = E(\|\hat{H} - H\|^2)$. To model the RF impairments, a complete transmission chain is simulated. The PA coefficients are derived from the intercept points by taking IIP3 = 20 dBm. For the IQ mixer, the ratio between the direct signal and the image is set to 28 dB, as specified in the 3GPP LTE specifications [23]. The ADC is modelled as a 14-bit quantizer to incorporate the quantization noise. Therefore, no simplifications are made regarding the different impairments. Antenna separation can attenuate the SI by 40 dB while the RF cancellation stage reduces the direct path by 30 dB [1], leaving the weaker reflections and transceiver impairments to be reduced by the proposed digital algorithm. The proposed algorithm is compared to different channel estimators: the least-squares (LS) and the maximum likelihood (ML) algorithms. For the LS estimator, the channel coefficients are obtained using the known self signal and the pilot symbols in the intended signal. It simply considers the unknown symbols as additive noise. The ML estimate is obtained by maximizing the following cost function: An iterative procedure to find the ML estimate was proposed in [35]. The covariance matrix is obtained by averaging 60 OFDM blocks. Figures 2 and 3 plot the MSE vs. SNR curves for the SI and intended channel estimations, respectively.
In both figures, one pilot symbol, from the intended transceiver, is used to solve the ambiguity term. For comparison purposes, a perfect estimate of the ambiguity term c is obtained as c perfect = arg min c ||ȟ − c|| 2 2 and the corresponding curves are labelled clairvoyant subspace. It is seen that, when one pilot symbol is used in the ML and LS estimators, the proposed subspace algorithm offers notably lower MSE over a large SNR range. We also represent the performance of the ML and LS estimators when 20% of the transmit symbols are known (pilot symbols equally spaced within one OFDM symbol) while keeping 4 . In this case, the three algorithms give comparable performance in the low SNR region, at the expense of lower bandwidth efficiency. As the SNR increases, the performance of the LS and ML estimators saturates due to the reduced number of pilot symbols and the presence of the unknown transmit signal from the intended transceiver, which acts as additive noise. The subspace algorithm, in contrast, exploits the information carried by the unknown data to find the signal subspace. The ambiguity term is first solved using the known transmit symbols, then the iterative decoding and ambiguity estimation is applied to improve the estimation performance. From Figs. 2 and 3, three to four iterations are sufficient to converge and the performance is close to that obtained when the ambiguity term c is perfectly known. Note that the ML solution is also obtained in an iterative way; for a fair comparison, we simulate the performance of the ML estimator after four iterations. As can be expected, the estimate of the SI channel is more accurate than the estimate of the intended channel. This can be explained by the fact that the self-signal is fully known while only one pilot symbol is known in the intended signal.
The number of pilot symbols is a critical issue in channel estimation since a long pilot sequence provides better estimation performance but reduces the bandwidth efficiency of the system. In Figs. 4 and 5, we compare the impact of the number of pilot symbols on the performance of the three estimators. We place the pilot symbols periodically within an OFDM symbol. Optimal pilot placement requires checking all combinations of P pilot positions among the N subcarriers and hence leads to an NP-hard problem, which is beyond the scope of this paper and is left for future work. It can be seen from these figures that the subspace method is not greatly affected by the number of pilot symbols since the subspaces are obtained using the second-order statistics of the received signal and not the transmit signal itself. Clearly, the proposed algorithm outperforms the ML and LS estimators when the number of pilots is small, while this tendency is inverted when the number of pilots increases. However, a system with a large number of pilot symbols is not of practical interest.
In Figs. 6 and 7, we evaluate the impact of the number of observed OFDM symbols on the estimation performance. For the three algorithms, we consider the transmission scheme where the number of pilot symbols is set to one and the SNR is 10 dB. As the subspace algorithm is based on estimates of the second-order statistics of the received signal, its performance varies with the number of OFDM symbols. All three algorithms are able to estimate the SI channel, with an error floor for the LS estimator. The ML and subspace algorithms offer similar performance. On the other hand, the LS estimator fails to recover the intended channel for any number of OFDM symbols. This can be explained by the fact that the number of unknowns (intended channel coefficients) is larger than the number of pilot symbols. Hence, it is not possible to use this method when the number of pilot symbols is small. The ML estimator also shows poor estimation performance for the intended channel, while the subspace method is able to return a good channel estimate, with better bandwidth efficiency than the other estimators, as soon as there are enough OFDM symbols to compute the covariance matrix.
Our primary motivation in this work is to develop an accurate channel estimator to cancel the SI signal. The performance of the SI canceller is measured by its output signal-to-residual-SI-and-noise power ratio (SINR) after SI cancellation vs. the input SNR. Ideally, if the SI could be completely cancelled, the residual SI after cancellation would be zero and, consequently, the output SINR would equal the input SNR, as shown by the dashed line "perfect cancellation" in Fig. 8. In other words, the "perfect cancellation" is considered as the ideal upper bound for the SINR. As shown in Fig. 8, with three iterations, the proposed subspace-based SI canceller can offer an output SINR very close to the upper bound over a large SNR range. At low SNR, the large estimation error results in a larger residual SI after cancellation, which ultimately affects the output SINR.
Fig. 8 Output SINR [dB] vs. input SNR after SI cancellation (curves: subspace with one, two and three iterations; frequency domain LS with 20% pilots and with one pilot; widely linear LS; perfect cancellation)
We also investigate in Fig. 8 a frequency domain method to estimate the different parameters using the pilot symbols on some subcarriers. We resort to the LS estimator to find the channel responses at the pilot subcarriers. Since the remaining subcarriers contain unknown symbols from the intended transceiver, the complete channel responses are obtained by linear interpolation of the estimated coefficients. Thus, the frequency domain approach uses only the portion of the signal containing pilots, while the proposed approach exploits the whole received signal through the second-order statistics. Clearly, the performance of the frequency domain approach depends strongly on the number of pilots (as shown in Fig. 8) since the interpolation cannot capture the variation of the channel in the frequency domain. We also compare the proposed method with the widely linear LS estimator in [26]. Note that the algorithm in [26] ignores the PA nonlinearities and does not incorporate the intended signal in the estimation process. Some time frames are dedicated to transmitting orthogonal pilot symbols for estimation purposes, during which the transceiver receives only its own signal. Therefore, the widely linear LS estimator incurs an overhead and requires synchronization between the two transceivers. Besides, it shows a noise floor at high SNR because the PA nonlinearity is not considered during the estimation process. On the other hand, by exploiting the whole received signal through its second-order statistics, the proposed method offers good performance even with one pilot and still outperforms the frequency domain approach (even with a much larger number of pilots). Figure 9 plots the bit error rate (BER) vs. SNR curves of the two approaches. For comparison, we include the case of a perfect channel estimate. To improve the BER, the SINR should be kept as high as possible at the demodulator.
To conclude, while the frequency domain approach is more intuitive, it needs a large number of pilots and is outperformed by the proposed method.
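The pilot-based frequency domain baseline discussed above can be illustrated with a minimal sketch. It is not the simulation setup of the paper: it assumes a single noise-free link with a scalar channel per subcarrier, and the number of subcarriers, the 4-tap channel, the QPSK symbols and the helper names `ls_interp_channel` and `mse_for` are all hypothetical. It only shows why a sparse pilot grid limits the interpolated estimate.

```python
import numpy as np

def ls_interp_channel(rx, pilot_idx, pilot_sym, n_sub):
    """LS estimates at the pilot subcarriers, linearly interpolated to all subcarriers."""
    h_pilot = rx[pilot_idx] / pilot_sym  # scalar LS estimate per pilot subcarrier
    k = np.arange(n_sub)
    # Interpolate real and imaginary parts separately; subcarriers outside the
    # pilot range are held at the nearest pilot estimate.
    return (np.interp(k, pilot_idx, h_pilot.real)
            + 1j * np.interp(k, pilot_idx, h_pilot.imag))

n_sub = 64                                 # hypothetical number of subcarriers
h_time = np.array([1.0, 0.5j, -0.3, 0.1])  # hypothetical 4-tap channel
h_freq = np.fft.fft(h_time, n_sub)         # true frequency response

rng = np.random.default_rng(0)
qpsk = (rng.choice([-1.0, 1.0], n_sub)
        + 1j * rng.choice([-1.0, 1.0], n_sub)) / np.sqrt(2)
rx = h_freq * qpsk                         # noise-free received subcarriers

def mse_for(pilot_idx):
    # Only the symbols at pilot_idx are treated as known to the estimator.
    h_hat = ls_interp_channel(rx, pilot_idx, qpsk[pilot_idx], n_sub)
    return np.mean(np.abs(h_hat - h_freq) ** 2)

mse_20pct = mse_for(np.arange(0, n_sub, 5))  # equally spaced pilots, about 20%
mse_one = mse_for(np.array([0]))             # a single pilot
print(f"MSE with ~20% pilots: {mse_20pct:.4f}, with one pilot: {mse_one:.4f}")
```

With a single pilot the interpolated estimate is flat across all subcarriers, so any frequency selectivity of the channel appears directly as estimation error, whereas the denser grid follows the channel between pilots.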
We evaluate the performance of the system in the presence of phase noise by simulation. Figures 10 and 11 plot, respectively, the SINR and the BER vs. the phase noise 3 dB bandwidth $f_{3\mathrm{dB}}$ for SNR = 20 dB and a common oscillator at the transmitter and the receiver. The residual SI obviously depends on the quality of the oscillator, represented by its $f_{3\mathrm{dB}}$. A higher $f_{3\mathrm{dB}}$ results in a faster-varying phase noise process. Clearly, the proposed method still offers good cancellation performance, which degrades as $f_{3\mathrm{dB}}$ increases.
The effects of the PA nonlinearity on the performance of the proposed algorithm are also investigated through simulations. Figure 12 plots the resulting SINR after cancellation vs. the value of the PA input third-order intercept point (IIP3) for SNR = 20 dB. For perfect cancellation, the resulting SINR after cancellation would equal the SNR of 20 dB. A lower IIP3 indicates higher PA distortions (i.e., a poorer PA) and hence reduces the resulting SINR after cancellation. Figure 12 shows that as the IIP3 value increases, the cancellation performance improves. However, for a sufficiently high IIP3 (e.g., 18 dBm or higher), the PA distortions are no longer dominant and the resulting SINR after cancellation is unchanged. This behavior can be explained by the fact that, when developing the algorithm, the third-order component of the signal $x_{q,\mathrm{ip3}}(n) = x_q^{IQ}(n)\,|x_q^{IQ}(n)|^2$ is approximated by $x_q(n)\,|x_q(n)|^2$ to simplify the algorithm. This approximation only affects the algorithm performance when the nonlinear coefficients are sufficiently large, i.e., at low IIP3.
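To make the link between IIP3 and the strength of the third-order term concrete, the following sketch evaluates the power of $x(n)|x(n)|^2$ relative to the linear term for a few IIP3 values. It is an illustration under stated assumptions, not the PA model of the paper: it assumes unit linear gain and a third-order coefficient $a_3 = -1/P_{\mathrm{IIP3}}$ (power in mW), so that the third-order term matches the linear term in magnitude when $|x|^2 = P_{\mathrm{IIP3}}$. The 0 dBm transmit power and the function names are hypothetical.

```python
import numpy as np

def third_order_term(x):
    """Third-order component x(n)|x(n)|^2 of a complex baseband signal."""
    return x * np.abs(x) ** 2

def signal_to_distortion_db(x, iip3_dbm):
    """Ratio (in dB) of linear-term power to third-order distortion power."""
    # Assumption: unit linear gain and a3 = -1 / P_IIP3, with P_IIP3 in mW.
    p_iip3_mw = 10.0 ** (iip3_dbm / 10.0)
    a3 = -1.0 / p_iip3_mw
    distortion = a3 * third_order_term(x)
    p_signal = np.mean(np.abs(x) ** 2)
    p_distortion = np.mean(np.abs(distortion) ** 2)
    return 10.0 * np.log10(p_signal / p_distortion)

rng = np.random.default_rng(0)
tx_power_dbm = 0.0  # hypothetical average transmit power
scale = np.sqrt(10.0 ** (tx_power_dbm / 10.0) / 2.0)
# Complex Gaussian samples as a rough stand-in for an OFDM waveform.
x = scale * (rng.standard_normal(4096) + 1j * rng.standard_normal(4096))

for iip3_dbm in (5.0, 10.0, 15.0, 20.0):
    sdr = signal_to_distortion_db(x, iip3_dbm)
    print(f"IIP3 = {iip3_dbm:4.1f} dBm -> signal-to-distortion ratio = {sdr:5.1f} dB")
```

Under this assumed model, every 1 dB increase in IIP3 lowers the third-order distortion power by 2 dB, which is consistent with the PA distortions ceasing to be the dominant impairment once the IIP3 is high enough.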

Conclusions
In this paper, a subspace-based estimation method has been proposed to jointly estimate the SI channel, the intended channel and the transmitter impairments for MIMO full-duplex systems. By exploiting the covariance and pseudo-covariance matrices of the received signal, an effective way has been formulated to apply the subspace method to symmetric MIMO systems. The complete characterization of the second-order statistics of the received signal avoids the need for oversampling, which is required in traditional subspace methods. The subspace that contains the channels is blindly estimated, and a short pilot sequence is needed to extract the channel coefficients from this subspace. The proposed method dramatically reduces the number of pilot symbols needed to identify the channel coefficients. Simulation results show that one pilot symbol is enough to obtain an accurate estimate while other methods are not able to recover the channel.

1 The length of the cyclic prefix $N_{cp}$ should be larger than the delay spread of the channel to eliminate the inter-symbol interference and inter-carrier interference. Therefore, if we know the length of the channel, we can set the cyclic prefix to be sufficiently large to satisfy $N_{cp} > L$. Since this information is in general not available, $N_{cp}$ is chosen conservatively to guarantee $N_{cp} > L$. For example, if the distance between the two transceivers is 1 km, a cyclic prefix of 4 μs is sufficient.
2 Physically, the additive noise arises from the thermal agitation of the charge carriers in an electronic device and is independent of the input. It can also contain interference from other systems whose signals are independent of the transmit signal of the considered system.
3 The previous condition is verified for independent channels between different antennas.
4 The pilot symbols are equally spaced within one OFDM symbol.