A Bayesian Tensor Decomposition Method for Joint Estimation of Channel and Interference Parameters

Bayesian tensor decomposition has been widely applied in channel parameter estimation, particularly in the presence of interference. However, existing Bayesian tensor decomposition methods do not account for the types of interference, which makes it difficult to estimate the interference parameters accurately. In this paper, we present a robust tensor variational method using a CANDECOMP/PARAFAC (CP)-based additive interference model for multiple-input multiple-output (MIMO) systems with orthogonal frequency division multiplexing (OFDM). The interference model, which is more realistic than traditional colored noise, covers co-channel interference (CCI) and front-end interference (FEI). In contrast to conventional algorithms that filter out interference, the proposed method jointly estimates the channel and interference parameters in the time–frequency domain. Simulation results validate the correctness of the proposed method through the evidence lower bound (ELBO) and show that it outperforms traditional information-theoretic methods, tensor decomposition models, and the robust CP-based model (RCP) in terms of estimation accuracy. Further, the interference parameter estimation technique has profound implications for anti-interference applications and dynamic spectrum allocation.


Introduction
In the past decade, OFDM with MIMO has become a widely adopted wireless transmission technique due to its ability to achieve high data rates [1,2] and enhance diversity gain and system capacity, particularly in scenarios with dynamic, time-varying, and frequency-selective channels [3,4]. In the context of big data processing, tensor-decomposition-based channel estimation methods [5,6] have attracted significant attention in MIMO-OFDM systems due to their high efficiency in processing large complex datasets with improved estimation accuracy for high-dimensional problems.
Tensor-decomposition-based channel estimation algorithms generally consist of two steps: the first step estimates the rank, which corresponds to the number of multipath components, and the second step uses the obtained rank to estimate the multipath component parameters. It is widely acknowledged that determining the tensor rank is an NP-hard problem, as discussed in [7]. The predominant approaches to rank estimation use information-theoretic methods, among which the most popular are the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), but these have the drawbacks of oversimplification and overfitting [8]. Further, the minimum description length (MDL) method [9] relies heavily on prior knowledge and exhibits sensitivity. Based on the obtained rank, tensor decomposition can be applied to estimate the channel parameters. Based on the Tucker model, M. Haardt [10] extended the high-order singular-value decomposition (HOSVD) to the estimation of channel parameters using the estimation of signal parameters via rotational invariance techniques (ESPRIT). This application has been further expanded to 5G localization mapping, as described in [11].
Based on the CP model, an enhanced approach was proposed in [12] to address the downlink channel estimation problem in MIMO-OFDM systems with large antenna arrays. A further enhancement was presented in [13], where a tensor-space-assisted estimation scheme was proposed by exploiting the Vandermonde structure of the factor matrix. In addition, based on the two aforementioned models, the sequential unfolding singular-value decomposition (SUSVD) was proposed in [14]; it uses a distinctive hierarchical tree structure to obtain orthogonal factor matrices and is also called the PARATREE method.
In the presence of interference, the performance of traditional tensor-decomposition-based channel estimation methods is severely degraded, as the actual channel interference cannot be simply modeled as colored noise. The degradation of rank estimation performance significantly reduces the performance of channel parameter estimation. This is particularly true in MIMO systems, where interference exhibits high correlation. Interference primarily arises from imperfect designs in MIMO-OFDM systems and front-end circuits, leading to signal distortion, including harmonic distortion, intermodulation distortion, and phase distortion, as demonstrated by radio frequency FEI [15,16]. Additionally, frequency reuse and bandwidth congestion may also result in CCI [17][18][19]. When the same frequency bands are allocated to multiple transmitters, signal overlap and degradation usually occur. Traditional methods for addressing interference have relied on additional hardware and post-processing algorithms to filter out interference. However, emerging efficient spectrum allocation technologies based on spectrum sensing and interference identification have significant research implications [20,21].
Based on a non-Gaussian and non-stationary interference model, tensor decomposition can be employed to reduce the dimensionality of multi-dimensional data. Qibin Zhao proposed a tensor-based variational Bayesian approach in [22] for channel estimation by eliminating interference in the channel matrix and utilizing the spatial coupling relationships of partially received tensors. References [23,24] proposed threshold-based interference exclusion methods for low-rank approximation and channel parameter estimation with incomplete data. In [25], a multiplicative Gamma process (MGP) was used to reduce the complexity and enhance the speed of automatic rank determination (ARD). Similarly, the use of a generalized hyperbolic (GH) distribution can achieve more flexible sparsity awareness [26]. It is worth mentioning that variational methods with incomplete observations still suffer from information entropy loss in the presence of interference.
Therefore, it is essential to incorporate channel information, including the additive interference structure. Traditional methods such as adaptive filtering [27], prior-knowledge-based MIMO systems [28], and radio frequency (RF) front-end feedback networks [29] focus on removing rather than estimating the interference. Moreover, these conventional approaches have the drawbacks of high complexity and cost. An additive RCP [30] was proposed that infers an interference term for each pixel to enhance the precision of image processing, and it was extended to channel parameter estimation in [31]. However, the actual types of interference in the tensor have not yet been thoroughly considered in the model, which makes accurate interference estimation difficult.
In this paper, we propose an RCP based on alternate prior hypotheses (APH) for channel estimation in MIMO-OFDM systems, hereinafter referred to as RCP-APH. We first separate the interference tensor space and then construct spatial correlations according to the actual interference, either FEI or CCI. We then perform variational iterations in the separated tensor space by alternately modifying the interference prior hypotheses. The main contributions of this paper are summarized as follows:

• We adopt an additive interference model whose parameters are jointly estimated with the channel parameters rather than mitigating the interference. As such, it has profound implications for anti-interference applications and dynamic spectrum allocation.

• We propose to jointly estimate the channel and interference parameters without increasing the complexity or degrading the estimation performance. The proposed method simultaneously estimates the number of paths and the channel and interference parameters in MIMO-OFDM systems.
The structure of this paper is as follows. Section 2 describes the preliminaries and basic concepts. Section 3 presents the MIMO-OFDM system model. In Section 4, we propose the RCP-APH algorithm, and Section 5 shows the experimental analysis of the proposed algorithm. Finally, Section 6 summarizes this paper.

Preliminaries and Notations
In this paper, we use the term "mode", indexed by n, to refer to one of the dimensions of a tensor; the number of modes is the order of the tensor, which is also referred to as the dimension in various disciplines. An N-th order complex tensor is represented using calligraphic letters, as illustrated by $\mathcal{X} \in \mathbb{C}^{I_1 \times I_2 \times \cdots \times I_N}$. Furthermore, the unfolding of the tensor $\mathcal{X}$ with respect to the n-th mode is represented by $\mathbf{X}_{(n)}$ in accordance with [10].
Tensors are sliced along different dimensions to form a sub-tensor, which is also known as tensor slicing. Slices of a three-dimensional tensor are matrices and are denoted by uppercase bold letters. The data along a single dimension of the tensor, with all other indices fixed, is referred to as a fiber; it is a vector and is denoted by lowercase bold letters. Therefore, for a three-dimensional tensor, the relationship between a slice and its fibers is expressed as $\mathcal{X}_{:,:,i_3} = \mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_{i_1}, \ldots, \mathbf{x}_{I_1}]^{T}$, where $\mathbf{x}_{i_1}^{T}$ is the $i_1$-th row vector of the slice. Throughout this paper, we use the symbols $^{*}$, $^{T}$, $^{H}$, $^{-1}$, $\tilde{\ }$, $\{\cdot\}$, and $\|\cdot\|_F$ to denote the conjugate, transposition, Hermitian transposition, matrix inversion, estimated value, a set, and the Frobenius norm, respectively.
For multilinear mathematical operations, the complex inner product of vectors is denoted by $\langle \cdot, \cdot \rangle$. The Hadamard product is performed entrywise between two items of the same size; for matrices $\mathbf{A} \in \mathbb{C}^{I \times J}$ and $\mathbf{B} \in \mathbb{C}^{I \times J}$, the result is the $I \times J$ matrix with entries $[\mathbf{A} \circledast \mathbf{B}]_{i,j} = a_{i,j} b_{i,j}$.
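The slicing, fiber, and unfolding operations described above can be illustrated with a minimal sketch in plain Python. The helper names (`make_tensor`, `frontal_slice`, `unfold`) are hypothetical, and the column ordering used in `unfold` (the earlier remaining mode varies fastest) is an assumption; the convention of [10] may order the columns differently.

```python
def make_tensor(I1, I2, I3):
    """Toy third-order tensor whose entries encode their own index (i1, i2, i3)."""
    return [[[complex(100 * i1 + 10 * i2 + i3, 0) for i3 in range(I3)]
             for i2 in range(I2)]
            for i1 in range(I1)]

def frontal_slice(X, i3):
    """Matrix X[:, :, i3]: rows are indexed by i1, columns by i2."""
    return [[row[i3] for row in plane] for plane in X]

def mode1_fiber(X, i2, i3):
    """Vector X[:, i2, i3]: all other indices are fixed."""
    return [plane[i2][i3] for plane in X]

def unfold(X, mode):
    """Mode-n unfolding with rows indexed by i_n (mode is 0, 1, or 2).

    Assumed column ordering: among the two remaining modes, the earlier
    one varies fastest.
    """
    dims = (len(X), len(X[0]), len(X[0][0]))
    lo, hi = [m for m in range(3) if m != mode]
    rows = []
    for i_n in range(dims[mode]):
        row = []
        for j_hi in range(dims[hi]):        # slow index
            for j_lo in range(dims[lo]):    # fast index
                idx = [0, 0, 0]
                idx[mode], idx[lo], idx[hi] = i_n, j_lo, j_hi
                row.append(X[idx[0]][idx[1]][idx[2]])
        rows.append(row)
    return rows

X = make_tensor(2, 3, 4)
S = frontal_slice(X, 2)    # 2 x 3 matrix
f = mode1_fiber(X, 1, 2)   # length-2 vector
U0 = unfold(X, 0)          # 2 x 12 matrix
U2 = unfold(X, 2)          # 4 x 6 matrix
```

Because each toy entry encodes its own index, it is easy to check by eye which tensor entry ends up in which row and column of a slice or an unfolding.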

MIMO-OFDM System Model
We consider a typical traffic multipath scenario in the presence of interference, as depicted in Figure 1. The transmit and receive arrays consist of $N_{BS-T}$ and $N_{BS-R}$ antennas with equidistant spacing of $d_t$ and $d_r$, respectively. The linear arrays at both ends form a MIMO system designed to estimate channel parameters, including the angle of departure (AoD) θ, the angle of arrival (AoA) ϕ, the delay τ, and the complex amplitude α. The parameter set for the l-th multipath is $\{\theta_l, \phi_l, \tau_l, \alpha_l\}$, and the (l + 1)-th path differs from the l-th path by ∆θ in the departure angle and ∆ϕ in the arrival angle. In this paper, we use an OFDM signal with a bandwidth of B modulated on K subcarriers for transmission. For convenience, all K subcarriers, with a spacing of B/K, are used to transmit periodic known training pilots. The periodicity of the signal ensures that the end of each OFDM symbol connects naturally with the beginning of the next symbol. We assume that the signal has been detected and synchronized, so that the whole signal symbol is recovered for channel estimation [32], and that there are L paths in the propagation channel. At the receiver, by utilizing the orthogonality of the transmission symbols and stacking the channel matrices of the K frequency points, we obtain the channel tensor $\mathcal{H} \in \mathbb{C}^{N_{BS-R} \times N_{BS-T} \times K}$ in the form of a CP factorization as follows:

$$\mathcal{H} = \sum_{l=1}^{L} \mathbf{a}^{(1)}_{\cdot l} \circ \mathbf{a}^{(2)}_{\cdot l} \circ \mathbf{a}^{(3)}_{\cdot l} = [[\mathbf{A}^{(1)}, \mathbf{A}^{(2)}, \mathbf{A}^{(3)}]],$$

where "$\circ$" indicates the outer product, $\mathbf{a}^{(n)}_{\cdot l}$ is the l-th column of $\mathbf{A}^{(n)}$, and the factor matrices $\{\mathbf{A}^{(n)}\}_{n=1,2,3}$ are composed of the corresponding antenna array responses and the delay-dependent vector $\mathbf{g}(\tau_l)$. We assume additive interference, as seen in Figure 2.
The received tensor in the frequency domain is composed as $\mathcal{Y} = \mathcal{H} + \mathcal{S} + \mathcal{W}$, where $\mathcal{H}$, $\mathcal{S}$, and $\mathcal{W}$ represent the channel tensor carrying the channel information, the interference tensor, and the noise tensor, respectively, all of which are assumed to be independent and identically distributed (i.i.d.). Note that the yellow lightning inside the red circle in Figure 1 indicates the FEI of the transmitter antenna, denoted as $\mathcal{S}^{FEI-T}$. Similarly, the yellow lightning inside the red square represents the FEI of the receiver antenna, denoted as $\mathcal{S}^{FEI-R}$. The yellow lightning appearing on both sides of the road indicates CCI generated by other electronic devices and neighboring cells, referred to as $\mathcal{S}^{CCI}$. Therefore, the interference tensor of FEI-R is made of row fibers with a size of $1 \times N_{BS-T}$, indicating that this FEI-R originates from a particular receiving antenna and affects all transmitting sub-channels. In the same way, the interference tensor of FEI-T is made of column fibers with a size of $N_{BS-R} \times 1$, indicating that this FEI-T originates from a particular transmitting antenna and affects all receiving sub-channels. In addition, the interference is assumed to occur at any possible spectrum location and to have an arbitrary amplitude and phase. Section 5.4 describes the characteristics of the proposed algorithm for different interference bandwidths.
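To make the CP structure of the channel tensor concrete, the following sketch builds a small two-path tensor in plain Python. The closed forms of the array and delay responses are not given in the text, so the uniform-linear-array phase progression in `steering`, the subcarrier phase ramp in `delay_vector`, the absorption of the amplitude into the third factor matrix, and the amplitude values are all assumptions made for illustration.

```python
import cmath
import math

def steering(n_ant, angle_deg, spacing_wl=0.5):
    """Assumed ULA response: exp(-j*2*pi*spacing*m*sin(angle)), m = 0..n_ant-1."""
    s = math.sin(math.radians(angle_deg))
    return [cmath.exp(-2j * math.pi * spacing_wl * m * s) for m in range(n_ant)]

def delay_vector(n_sub, tau_s, bandwidth_hz):
    """Assumed delay response over subcarriers spaced by B/K."""
    df = bandwidth_hz / n_sub
    return [cmath.exp(-2j * math.pi * k * df * tau_s) for k in range(n_sub)]

def columns_to_matrix(cols):
    """Stack per-path vectors as the columns of a factor matrix."""
    return [list(row) for row in zip(*cols)]

def cp_tensor(A1, A2, A3):
    """H[i1][i2][i3] = sum_l A1[i1][l] * A2[i2][l] * A3[i3][l]."""
    n_paths = len(A1[0])
    return [[[sum(A1[i1][l] * A2[i2][l] * A3[i3][l] for l in range(n_paths))
              for i3 in range(len(A3))]
             for i2 in range(len(A2))]
            for i1 in range(len(A1))]

# Angles and delays follow the two-path setup of the simulation section;
# the complex amplitudes are illustrative placeholders.
aoa_deg, aod_deg = [45.0, 60.0], [45.0, 30.0]
tau_s = [142.13e-9, 136.60e-9]
alpha = [1.0 + 0.0j, 0.5j]
N_R, N_T, K, B = 5, 5, 64, 100e6

A1 = columns_to_matrix([steering(N_R, a) for a in aoa_deg])
A2 = columns_to_matrix([steering(N_T, a) for a in aod_deg])
A3 = columns_to_matrix([[amp * g for g in delay_vector(K, t, B)]
                        for amp, t in zip(alpha, tau_s)])
H = cp_tensor(A1, A2, A3)   # N_R x N_T x K channel tensor
```

Each path contributes one rank-one term, so the number of columns shared by the three factor matrices equals the number of paths L.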
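The three interference structures can be sketched as follows, in plain Python. The helper names and the chosen antenna and frequency indices are hypothetical; the sketch only illustrates which tensor entries each interference type touches (a whole slice for CCI, one fiber per frequency sample for FEI-R and FEI-T).

```python
def zeros(I1, I2, I3):
    """All-zero interference tensor S[i1][i2][i3] (receive, transmit, frequency)."""
    return [[[0j for _ in range(I3)] for _ in range(I2)] for _ in range(I1)]

def add_cci(S, i3, value):
    """CCI: one value spread over the whole slice S[:, :, i3]."""
    for plane in S:
        for row in plane:
            row[i3] += value

def add_fei_r(S, i1, i3, value):
    """FEI-R: receive antenna i1 affects every transmit sub-channel at i3."""
    for row in S[i1]:
        row[i3] += value

def add_fei_t(S, i2, i3, value):
    """FEI-T: transmit antenna i2 affects every receive sub-channel at i3."""
    for plane in S:
        plane[i2][i3] += value

def affected_fraction(S):
    """Fraction of tensor entries that carry any interference."""
    total = sum(len(row) for plane in S for row in plane)
    hit = sum(1 for plane in S for row in plane for v in row if v != 0)
    return hit / total

S = zeros(5, 5, 64)
for i3 in (10, 11):            # CCI over two consecutive frequency samples
    add_cci(S, i3, 2.0 + 1.0j)
for i3 in (20, 21, 22):        # FEI-R over three consecutive frequency samples
    add_fei_r(S, 1, i3, 1.5j)
for i3 in (40, 41, 42):        # FEI-T over three consecutive frequency samples
    add_fei_t(S, 3, i3, -1.0 + 0.5j)

fraction = affected_fraction(S)   # (50 + 15 + 15) / 1600
```

Adding `S` entrywise to a channel tensor and a noise tensor of the same size gives the additive model Y = H + S + W.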

Bayesian Tensor Factorization
In contrast to the traditional RCP algorithm [30], this paper adopts a strong correlation assumption in the prior information about the interference. Spatially, this correlation is established over the entire slice and over the fibers in both the horizontal and vertical directions. Under the condition of maximizing the evidence, the different interferences are estimated alternately.

Alternate Prior Hypotheses
To simplify the description, a third-order CP generative model is employed. The full set of subscripts is denoted as Ω, and the observation model is

$$p\left(\mathcal{Y}_{\Omega} \mid \{\mathbf{A}^{(n)}\}_{n=1}^{3}, \mathcal{S}_{\Omega}, \nu\right) = \prod_{(i_1,i_2,i_3) \in \Omega} \mathcal{CN}\left(\mathcal{Y}_{i_1,i_2,i_3} \mid \langle \mathbf{a}^{(1)}_{i_1}, \mathbf{a}^{(2)}_{i_2}, \mathbf{a}^{(3)}_{i_3} \rangle + \mathcal{S}_{i_1,i_2,i_3}, \nu^{-1}\right),$$

where ν denotes the noise precision, $\mathcal{S}_{\Omega}$ represents all items of interference, and one interference term is denoted as $\mathcal{S}_{i_1,i_2,i_3}$. Each vector $\mathbf{a}^{(n)}_{i_n}$ influences the sub-tensor with index $i_n$ under mode-n. The generalized inner product $\langle \mathbf{a}^{(1)}_{i_1}, \mathbf{a}^{(2)}_{i_2}, \mathbf{a}^{(3)}_{i_3} \rangle$ of the three latent vectors enables us to capture multilinear interactions reflecting the intrinsic structural characteristics of the tensor data, but it also complicates the learning process of the model. Therefore, the dimensionality of the latent space is minimized by inducing sparsity in the columns of the factor matrices:

$$p\left(\mathbf{A}^{(n)} \mid \boldsymbol{\lambda}\right) = \prod_{i_n=1}^{I_n} \mathcal{CN}\left(\mathbf{a}^{(n)}_{i_n} \mid \mathbf{0}, \boldsymbol{\Lambda}^{-1}\right), \; n = 1, 2, 3, \qquad p(\boldsymbol{\lambda}) = \prod_{l=1}^{L} \mathrm{Ga}(\lambda_l \mid c_0, d_0),$$

where $\boldsymbol{\Lambda} = \mathrm{diag}(\boldsymbol{\lambda})$ represents the inverse covariance matrix, which is shared by the factor matrices across all modes. Because the channel multipaths are uncorrelated, the hyperpriors for $\boldsymbol{\lambda}$ are assumed to be i.i.d. The Gamma distribution is denoted as $\mathrm{Ga}(x \mid m, n) = n^{m} x^{m-1} e^{-nx} / \Gamma(m)$, where $\Gamma(m)$ is the Gamma function. Furthermore, considering the zero-mean complex Gaussian distribution of the actual channel amplitudes, we can draw a conclusion for the ARD: when $\lambda_l$ of a certain path becomes sufficiently large, the corresponding l-th column of each latent factor matrix tends to zero, thereby removing the corresponding redundant path.
For the convenience of subsequent discussions, a category of interference is denoted as $\mathcal{S}^{Tp}$, as previously illustrated in Figure 2. An individual interference from this category is represented as $\mathcal{S}^{Tp}_{Ind(Tp)}$, and its specific position in the time-frequency domain is determined by the subscript index Ind(Tp), with $Ind(CCI) = \{i_3\}$, $Ind(FEI-R) = \{i_1, i_3\}$, and $Ind(FEI-T) = \{i_2, i_3\}$. The complete set of interference terms for a certain category is represented as $\mathcal{S}^{Tp}_{\Omega}$, with an individual interference term denoted as $\mathcal{S}^{Tp}_{i_1,i_2,i_3}$. Thus, taking the mutual independence between different interferences into account, the following interference prior assumptions are made:

$$p\left(\mathcal{S}^{Tp} \mid \boldsymbol{\gamma}^{Tp}\right) = \prod_{Ind(Tp)} \mathcal{CN}\left(\mathcal{S}^{Tp}_{Ind(Tp)} \mid 0, \left(\gamma^{Tp}_{Ind(Tp)}\right)^{-1}\right), \qquad p\left(\boldsymbol{\gamma}^{Tp}\right) = \prod_{Ind(Tp)} \mathrm{Ga}\left(\gamma^{Tp}_{Ind(Tp)} \mid a^{Tp}_0, b^{Tp}_0\right),$$

where the different types of Tp correspond to the different hyperparameters $\boldsymbol{\gamma}^{Tp}$. Moreover, according to the different resolution priors described in Section 3, the relationships between the interference and the interference terms are given by $\mathcal{S}^{CCI}_{:,:,i_3} = \mathcal{S}^{CCI}_{i_3} \mathbf{1}_{I_1} \mathbf{1}_{I_2}^{T}$, $\mathcal{S}^{FEI-R}_{i_1,:,i_3} = \mathcal{S}^{FEI-R}_{i_1,i_3} \mathbf{1}_{I_2}^{T}$, and $\mathcal{S}^{FEI-T}_{:,i_2,i_3} = \mathcal{S}^{FEI-T}_{i_2,i_3} \mathbf{1}_{I_1}$, where $\mathbf{1}$ is an all-ones vector. This also indicates that the interference cannot be simply considered to follow a Gaussian distribution, nor can it be simply modeled as colored noise. Finally, a hyperprior is placed on the noise precision of the environment:

$$p(\nu) = \mathrm{Ga}(\nu \mid e_0, f_0).$$

All of the above prior assumptions are captured in the probabilistic graphical model illustrated in Figure 3. In this figure, white circles and squares represent hidden random variables and hyperparameters, respectively, while yellow circles denote the observed tensor. In the blue region, it is clear that this variational method is implemented through alternating iterations between two priors, CCI and FEI.
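The ARD mechanism can be illustrated numerically with a short sketch in plain Python. The function names and the numerical values are hypothetical; the point is only that a path whose column has been driven to zero no longer contributes to the generalized inner product, so the model behaves as if it had one path fewer.

```python
import math

def gamma_pdf(x, m, n):
    """Ga(x | m, n) = n^m x^(m-1) exp(-n x) / Gamma(m), with shape m and rate n."""
    return n ** m * x ** (m - 1) * math.exp(-n * x) / math.gamma(m)

def generalized_inner(a1, a2, a3):
    """sum_l a1[l] * a2[l] * a3[l]: one entry of a third-order CP model."""
    return sum(x * y * z for x, y, z in zip(a1, a2, a3))

# One row from each factor matrix, with three candidate paths (illustrative values).
a1 = [1 + 1j, 2 - 1j, 0.5j]
a2 = [0.5 + 0j, 1j, 1 + 0j]
a3 = [2 + 0j, 1 + 1j, -1 + 0j]
full = generalized_inner(a1, a2, a3)

# A very large lambda for the third path shrinks its column toward zero.
a1_shrunk = a1[:2] + [0j]
pruned = generalized_inner(a1_shrunk, a2, a3)
two_path = generalized_inner(a1[:2], a2[:2], a3[:2])

# Density of the Gamma hyperprior at x = 1 for unit shape and rate.
density = gamma_pdf(1.0, 1.0, 1.0)
```

Here `pruned` and `two_path` coincide, which is why eliminating the column is equivalent to reducing the rank by one.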

Variational Bayesian Inference
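For simplicity, all factor matrices and hyperparameters are collected in the parameter set $\Theta = \{\mathbf{A}^{(1)}, \mathbf{A}^{(2)}, \mathbf{A}^{(3)}, \boldsymbol{\lambda}, \mathcal{S}^{\Xi}, \boldsymbol{\gamma}^{\Xi}, \nu\}$, with $\Xi = \{CCI, FEI-R, FEI-T\}$. Consequently, for the different types of interference, the likelihood function is obtained as follows:

$$p\left(\mathcal{Y}_{\Omega} - \tilde{\mathcal{S}}^{\backslash Tp}_{\Omega}, \Theta\right) = p\left(\mathcal{Y}_{\Omega} - \tilde{\mathcal{S}}^{\backslash Tp}_{\Omega} \mid \{\mathbf{A}^{(n)}\}_{n=1}^{3}, \mathcal{S}^{Tp}, \nu\right) \prod_{n=1}^{3} p\left(\mathbf{A}^{(n)} \mid \boldsymbol{\lambda}\right) p(\boldsymbol{\lambda})\, p\left(\mathcal{S}^{Tp} \mid \boldsymbol{\gamma}^{Tp}\right) p\left(\boldsymbol{\gamma}^{Tp}\right) p(\nu),$$

where the symbol "\" denotes the complement of the set (for example, when Tp = CCI, \Tp = {FEI-R, FEI-T}), and $\tilde{\mathcal{S}}^{\backslash Tp}$ represents the estimated values of the remaining two types of interference, such as $\tilde{\mathcal{S}}^{\backslash CCI} = \tilde{\mathcal{S}}^{FEI-R} + \tilde{\mathcal{S}}^{FEI-T}$. The variational approach approximates the posterior distribution $p(\Theta \mid \mathcal{Y}_{\Omega} - \tilde{\mathcal{S}}^{\backslash Tp}_{\Omega})$ with the distribution $q(\Theta)$, and the relationship is as follows:

$$\ln p\left(\mathcal{Y}_{\Omega} - \tilde{\mathcal{S}}^{\backslash Tp}_{\Omega}\right) = \mathcal{L}(q, Tp) + \mathrm{KL}\left(q(\Theta) \,\|\, p\left(\Theta \mid \mathcal{Y}_{\Omega} - \tilde{\mathcal{S}}^{\backslash Tp}_{\Omega}\right)\right), \qquad \mathcal{L}(q, Tp) = \int q(\Theta) \ln \frac{p\left(\mathcal{Y}_{\Omega} - \tilde{\mathcal{S}}^{\backslash Tp}_{\Omega}, \Theta\right)}{q(\Theta)} \, d\Theta.$$

In the above equation, the evidence $p(\mathcal{Y}_{\Omega} - \tilde{\mathcal{S}}^{\backslash Tp}_{\Omega})$ is a constant, so maximizing the ELBO $\mathcal{L}(q, Tp)$ necessarily minimizes the Kullback-Leibler (KL) divergence, thereby completing the inference of the posterior distribution. In this process, given the uncorrelated characteristics of the actual parameters, mean-field theory is employed as follows:

$$q(\Theta) = \prod_{n=1}^{3} q\left(\mathbf{A}^{(n)}\right) q(\boldsymbol{\lambda})\, q\left(\mathcal{S}^{Tp}\right) q\left(\boldsymbol{\gamma}^{Tp}\right) q(\nu).$$

Finally, by computing the expectation of the log-likelihood function $\ln p(\mathcal{Y}_{\Omega} - \tilde{\mathcal{S}}^{\backslash Tp}_{\Omega}, \Theta)$ under the posterior distribution $q(\Theta \backslash \Theta_j)$ of the remaining parameters, the posterior inference for the parameter $\Theta_j$ is obtained as:

$$\ln q_j(\Theta_j) = \mathbb{E}_{q(\Theta \backslash \Theta_j)}\left[\ln p\left(\mathcal{Y}_{\Omega} - \tilde{\mathcal{S}}^{\backslash Tp}_{\Omega}, \Theta\right)\right] + \mathrm{const}.$$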
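The relationship between the evidence, the ELBO, and the KL divergence can be checked numerically on a toy model. The sketch below uses a single discrete latent variable with made-up joint probabilities; it is not the paper's model, only an illustration of why maximizing the ELBO minimizes the KL divergence when the evidence is fixed.

```python
import math

def elbo_kl_evidence(joint, q):
    """joint[z] = p(y, z) for the observed y; q[z] is any distribution over z.

    Returns (ELBO, KL(q || p(z | y)), ln p(y)).
    """
    evidence = sum(joint)
    posterior = [p / evidence for p in joint]
    elbo = sum(qz * math.log(pz / qz) for pz, qz in zip(joint, q) if qz > 0)
    kl = sum(qz * math.log(qz / post) for qz, post in zip(q, posterior) if qz > 0)
    return elbo, kl, math.log(evidence)

joint = [0.12, 0.28]          # illustrative values of p(y, z=0) and p(y, z=1)

# An arbitrary approximation: the ELBO sits below the log evidence by the KL gap.
elbo_a, kl_a, log_ev = elbo_kl_evidence(joint, [0.5, 0.5])

# The exact posterior closes the gap: KL = 0 and ELBO = ln p(y).
exact = [p / sum(joint) for p in joint]
elbo_b, kl_b, _ = elbo_kl_evidence(joint, exact)
```

In both cases the ELBO and the KL divergence add up to the same log evidence, so any increase in the ELBO is matched by an equal decrease in the KL divergence.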

Posterior Distribution of Factor Matrices
By using Equation (8), after taking the posterior expectation over all unknown latent variables and hyperparameters except the mode-n factor matrix $\mathbf{A}^{(n)}$, each row vector of $\mathbf{A}^{(n)}$ follows a complex Gaussian posterior distribution, whose mean and variance are updated in closed form from the posterior expectations of the remaining variables, where $\mathbb{E}_q[\cdot]$ represents the posterior expectation. It should be noted that the derivation in this paper utilizes the uncorrelated characteristics between factor matrices from different modes as well as the uncorrelated characteristics among different row vectors of the same factor matrix. These assumptions are consistent with the prior assumptions on the actual channel tensor.
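For orientation, the closed-form Gaussian update can be sketched as below. This is a generic mean-field result rather than a transcription of the paper's equation: it assumes that each observed entry is modeled as the generalized inner product of the three row vectors plus interference plus circular Gaussian noise of precision ν, and that every row vector has the prior $\mathcal{CN}(\mathbf{0}, \boldsymbol{\Lambda}^{-1})$. The symbols $\mathbf{b}_{\backslash i_n}$, $\Omega_{i_n}$, $r$, and $\mathbf{V}^{(n)}_{i_n}$ are introduced here for illustration only.

```latex
% Sketch under the stated assumptions; b, \Omega_{i_n}, r, and V are illustrative names.
% b is the Hadamard product of the row vectors of the other two modes, so that
% <a^{(1)}_{i_1}, a^{(2)}_{i_2}, a^{(3)}_{i_3}> = b^T a^{(n)}_{i_n}.
\mathbf{b}_{\backslash i_n} = \circledast_{k \neq n} \mathbf{a}^{(k)}_{i_k},
\qquad
r_{i_1,i_2,i_3} = \mathcal{Y}_{i_1,i_2,i_3} - \tilde{\mathcal{S}}^{\backslash Tp}_{i_1,i_2,i_3}
                  - \mathbb{E}_q\!\left[\mathcal{S}^{Tp}_{i_1,i_2,i_3}\right]

% Posterior covariance and mean of the i_n-th row; \Omega_{i_n} collects all
% entries of \Omega whose mode-n index equals i_n.
\mathbf{V}^{(n)}_{i_n} = \left( \mathbb{E}_q[\nu] \sum_{\Omega_{i_n}}
    \mathbb{E}_q\!\left[ \mathbf{b}^{*}_{\backslash i_n} \mathbf{b}^{T}_{\backslash i_n} \right]
    + \mathbb{E}_q[\boldsymbol{\Lambda}] \right)^{-1},
\qquad
\tilde{\mathbf{a}}^{(n)}_{i_n} = \mathbb{E}_q[\nu]\, \mathbf{V}^{(n)}_{i_n}
    \sum_{\Omega_{i_n}} \mathbb{E}_q\!\left[ \mathbf{b}_{\backslash i_n} \right]^{*} r_{i_1,i_2,i_3}
```

The structure shows the two competing influences on each row: the data term scaled by the expected noise precision, and the shared sparsity term $\mathbb{E}_q[\boldsymbol{\Lambda}]$ that pulls redundant columns toward zero.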

Posterior Distribution of Hyperparameters λ
As assumed in Equation (4), the posterior of $\boldsymbol{\lambda}$ follows a Gamma distribution whose parameters are learned from the observations and updated in closed form. The posterior expectation $\mathbb{E}_q[\boldsymbol{\lambda}]$ directly determines whether a certain path should be eliminated. Because the mapping relationship between interferences disrupts the Gaussian prior distribution at the lattice level, it reduces the algorithm's generalization capability and decreases the speed of calculating the variational expectation. Therefore, in this paper, we set a multiplicative threshold η to perform principal component analysis (PCA) according to the maximum and minimum values in $\mathbb{E}_q[\boldsymbol{\lambda}]$. This method significantly improves the speed of convergence, as shown in Section 5.2.

Posterior Distribution of Hyperparameters S
In practical situations, different interferences are uncorrelated, and interferences of the same type at different frequency points are also uncorrelated. Therefore, the posterior distribution factorizes as $q(\mathcal{S}^{Tp}) = \prod_{Ind(Tp)} \mathcal{CN}\left(\mathcal{S}^{Tp}_{Ind(Tp)} \mid \tilde{\mathcal{S}}^{Tp}_{Ind(Tp)}, (\sigma^{Tp}_{Ind(Tp)})^2\right)$, where the posterior mean and variance are accumulated over the remaining indices \Ind(Tp); for example, when Tp is equal to CCI, the remaining indices are $\backslash Ind(CCI) = \{i_1, i_2\}$. It is evident that in the posterior estimation of each interference type, the global channel information $\mathbb{E}_q[\langle \mathbf{a}^{(1)}_{i_1}, \mathbf{a}^{(2)}_{i_2}, \mathbf{a}^{(3)}_{i_3} \rangle]$ must be utilized, while the interference variance is determined by the environmental Gaussian noise through $\mathbb{E}_q[\nu]$. Therefore, accurately estimating the noise is a crucial prerequisite for precisely assessing interference, as illustrated in Figure 4b.
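The role of the noise precision in the interference posterior can be illustrated with a minimal sketch in plain Python. It is not the paper's update equation: it assumes the standard conjugate result for a single zero-mean complex Gaussian interference variable that is shared by several tensor entries, each observed in circular Gaussian noise. The function name, its arguments, and the numerical values are hypothetical.

```python
def interference_posterior(residuals, noise_precision, interference_precision):
    """Posterior mean and variance of one shared interference variable.

    residuals: observed entries minus the channel estimate and the other
               interference types, taken over all entries the variable maps to
               (for CCI, every (i1, i2) in the slice at one frequency sample).
    Assumed conjugate Gaussian form (not taken from the paper):
        variance = 1 / (E[gamma] + N * E[nu])
        mean     = variance * E[nu] * sum(residuals)
    """
    n = len(residuals)
    variance = 1.0 / (interference_precision + n * noise_precision)
    mean = variance * noise_precision * sum(residuals)
    return mean, variance

# A CCI variable seen through a 5 x 5 slice whose residuals all equal 2.
slice_residuals = [2.0 + 0.0j] * 25

# Weak prior (small gamma): the posterior mean follows the data.
mean_on, var_on = interference_posterior(slice_residuals, 1.0, 1e-6)

# Strong prior (large gamma): the interference term is switched off.
mean_off, var_off = interference_posterior(slice_residuals, 1.0, 1e6)
```

With a weak prior the variance is governed almost entirely by the noise precision and the number of affected entries, which is consistent with the remark that an accurate noise estimate is a prerequisite for assessing interference.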

Posterior Distribution of Hyperparameters γ
In the prior assumption of Equation (5), the estimated interference precision directly dictates the presence of interference, so we need to set an appropriate interference power threshold (IPTH), as discussed in Section 5.2. The posterior distribution of each interference precision is again a Gamma distribution, where the prior hyperparameter $b^{Tp}_0$ is assumed for a specific type of interference, and the posterior hyperparameter $b^{\gamma_{Ind(Tp)}}_M$ varies with the index Ind(Tp).

Posterior Distribution of Hyperparameters ν
The inference of the noise precision is achieved through the three factor matrices and the observed data. Its posterior follows a Gamma distribution $q(\nu) = \mathrm{Ga}(\nu \mid e_M, f_M)$, where the expectation required for the update is expressed in Equation (16) in terms of the rows of the matrices $\mathbf{F}^{(n)}$.

Evidence Lower Bound
From Equation (8), it can be observed that the algorithm conducts variational inference from three dimensions.Naturally, under the accurate elimination of redundant paths, the evidence lower bound (ELBO) undergoes a monotonically non-decreasing iterative process.The concept of maximizing ELBO involves the posterior expectation of the joint distribution and the entropy of the posterior distribution.The derivation of ELBO can be expressed as Equation (17) (see Appendix A for details), where we divide different dimensions with Tp, allowing for the validation of each dimension separately, as shown in Figure 4a.

Computational Complexity
The time complexity of the three factor matrices in Equation (11) is $O(3L^3 + 3ML + \sum_n I_n L^2)$, where the total size of the observed data is $M = \prod_n I_n$, and L represents the number of multipaths and the model complexity. The computational cost for $\boldsymbol{\lambda}$ in Equation (12) is $O(\sum_n I_n L^2)$. Similarly, the computational cost for ν is $O(ML^2)$. For the calculation of the above parameters, the proposed method therefore has the same computational complexity as the traditional RCP algorithm. Furthermore, since this algorithm performs iterations at different resolutions for the classified interference $\mathcal{S}^{Tp}$, the computational complexities for each iteration in terms of CCI, FEI-R, and FEI-T are $O(3I_3L)$, $O(3I_1I_3L)$, and $O(3I_2I_3L)$, respectively. These values are less than the complexity of $O(3ML)$ for the RCP algorithm. In summary, compared to the traditional algorithm, the proposed RCP-APH significantly reduces computational complexity when the number of iterations is large.
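As a worked example of these orders, the sketch below evaluates the leading terms for the tensor size used in the simulations (5 × 5 × 64) with a rank of 6. Treating the big-O expressions as plain operation counts and choosing L = 6 are simplifications made here for illustration.

```python
def interference_update_costs(dims, rank):
    """Leading-order operation counts per iteration for the interference update."""
    I1, I2, I3 = dims
    M = I1 * I2 * I3
    return {
        "CCI": 3 * I3 * rank,
        "FEI-R": 3 * I1 * I3 * rank,
        "FEI-T": 3 * I2 * I3 * rank,
        "RCP": 3 * M * rank,      # per-entry interference update
    }

costs = interference_update_costs((5, 5, 64), 6)
ratios = {k: costs["RCP"] / v for k, v in costs.items() if k != "RCP"}
```

For this size, the CCI update is 25 times cheaper than the per-entry update and each FEI update is 5 times cheaper, which reflects the number of tensor entries that share one interference variable.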

Simulation Analysis
In this section, a comprehensive simulation analysis is conducted to assess the performance of our algorithm. Each testing condition is evaluated over 200 independent experiments with random noise and interference. Firstly, the rank estimation performance of the RCP-APH algorithm is compared with traditional information-theoretic methods [9] and traditional RCP [30]. Secondly, under the assumption of accurate rank estimation, the parameter estimation performance of RCP-APH is compared with that of two mainstream tensor decompositions, namely CP [12] and Tucker [10], as well as the RCP algorithm. Lastly, a detailed comparison of interference positioning performance is conducted between the two variational methods.
According to the simulation conditions illustrated in Figure 1, the configuration is set as follows. The transmitting array is located at (0 m, 0 m), while the receiving array is positioned at (30 m, 0 m). The actual number of multipaths is 2, with the line-of-sight (LOS) path being obstructed. The actual parameters for the dual-path channel are set as θ = [45°, 30°], ϕ = [45°, 60°], and τ = [142.13, 136.60] ns. The signal bandwidth is B = 100 MHz, so that ∆τ · B = 0.553; that is, the two delays are separated by less than one delay resolution cell 1/B and are therefore hard to distinguish. Omnidirectional linear array antennas are equipped at both the transmitting and receiving ends and comprise $N_{BS-R} = N_{BS-T} = 5$ antennas with a spacing of $d_t = d_r = \lambda_c/2$. Under the above conditions, the uniqueness condition for CP decomposition is satisfied, as described in [14]. Moreover, the complex gains follow a circularly symmetric Gaussian distribution $\alpha_l \sim \mathcal{CN}(0, 1/(4\pi D f_c/c)^2)$, where c is the speed of light, the LOS distance is D = 30 m, and the carrier frequency $f_c$ is 5.9 GHz. Considering the maximum aperture of the receiving array $A_p = (5 - 1)d_r$, we obtain $D \geq 2A_p^2/\lambda_c = 0.41$ m, which satisfies the far-field assumption and places the channel test in the Fraunhofer zone. Lastly and most importantly, CCI and FEI are taken into account in the simulation. Thus, at the tensor lattice level, we introduce the interference ratio β, which describes the proportion of interference terms in the received tensor.
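The derived quantities quoted above can be reproduced with a few lines of arithmetic. The sketch below uses only the stated simulation parameters; the variable names are illustrative.

```python
import math

C = 299_792_458.0                 # speed of light in m/s
f_c = 5.9e9                       # carrier frequency in Hz
wavelength = C / f_c              # about 0.0508 m
d_r = wavelength / 2.0            # antenna spacing
aperture = (5 - 1) * d_r          # maximum aperture of the 5-element array

# Fraunhofer (far-field) distance 2 * A_p^2 / lambda_c, which equals 8 * lambda_c here.
fraunhofer_m = 2.0 * aperture ** 2 / wavelength

# Delay separation of the two paths relative to the resolution 1/B.
bandwidth = 100e6
delta_tau = (142.13 - 136.60) * 1e-9
delay_bandwidth_product = delta_tau * bandwidth

# Variance of the complex path gain for the LOS distance D = 30 m.
D = 30.0
gain_variance = 1.0 / (4.0 * math.pi * D * f_c / C) ** 2
gain_variance_db = 10.0 * math.log10(gain_variance)
```

The far-field distance evaluates to roughly 0.41 m, far below the 30 m link distance, and the delay-bandwidth product of 0.553 confirms that the two paths fall within a single delay resolution cell.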

Initialization and Termination Conditions
For the variational methods, after performing variance normalization on the received tensor, we also assume the Gaussian distribution $\mathcal{CN}(\mathbf{0}, \mathbf{I})$ for the factor matrices, which allows the factor matrices to be initialized without prior information. The initial rank $R_{int}$ is chosen as three times the number of true paths, i.e., six paths, satisfying the requirement of the weak upper bound, i.e., $R_{int} \leq \min_n (\sum_{i \neq n} I_i)$. In our model, the top-level hyperparameters, including $c_0$, $d_0$, $e_0$, $f_0$, $a^{\Xi}_0$, and $b^{\Xi}_0$, are set to $1 \times 10^{-6}$, resulting in a noninformative prior, and the expectations of the hyperparameters are initialized accordingly. The entire inference process of the model is summarized in Algorithm 1, where the posterior factors in Equation (9) are sequentially updated from bottom to top, as depicted in Figure 3.
To enhance the speed of ARD for the two variational algorithms in the presence of interference, the redundant multipaths corresponding to those $\lambda_l$ that satisfy $\mathbb{E}[\lambda_l]/\min(\mathbb{E}[\boldsymbol{\lambda}]) > \eta$ are eliminated. Additionally, for CP decomposition with known rank, the initial factor matrices are obtained using SVD operations. For the Tucker decomposition with known rank, we use the unitary ESPRIT algorithm with forward smoothing and HOSVD techniques.
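The weak upper bound on the initial rank can be checked directly for the simulated tensor size. The helper name below is hypothetical.

```python
def weak_rank_upper_bound(dims):
    """min over n of the sum of all dimensions except the n-th."""
    total = sum(dims)
    return min(total - d for d in dims)

dims = (5, 5, 64)            # receive antennas, transmit antennas, subcarriers
true_paths = 2
initial_rank = 3 * true_paths
bound = weak_rank_upper_bound(dims)
rank_is_admissible = initial_rank <= bound
```

For the 5 × 5 × 64 tensor the bound is 10, set by the two antenna dimensions, so an initial rank of six is admissible.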
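The elimination rule can be sketched in a few lines of plain Python. The function name and the example precisions are hypothetical; only the ratio test against η follows the rule stated above.

```python
def surviving_paths(expected_lambda, eta):
    """Indices of paths kept after pruning those with E[lambda_l] / min(E[lambda]) > eta."""
    floor = min(expected_lambda)
    return [l for l, lam in enumerate(expected_lambda) if lam / floor <= eta]

# Six candidate paths: small precisions mark strong paths, large ones redundant paths.
expected_lambda = [1.0, 2.5, 400.0, 30.0, 9.9, 1e4]
kept = surviving_paths(expected_lambda, eta=10.0)
```

A smaller η prunes more aggressively and therefore converges in fewer iterations, at the risk of discarding a weak but genuine path; a larger η is more conservative.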

Algorithm Performance
In the following, we choose the number of iterations as $M_{Iters} = 500$, the threshold η = 10, the signal-to-noise ratio (SNR) ρ = 20 dB, and the number of subcarriers K = 64. To control the iterations, we set β = 0.2. The interference power is set to five times the noise power, i.e., $1/\gamma^{Tp} = (\sigma^{Tp})^2 = 5/\nu = 5\sigma^2_{Noise}$. The ratio of CCI items is 0.5. We consider narrowband interference, i.e., interference that appears at a limited number of consecutive frequency samplings (CFSs). In this standard setup, three CFSs are occupied by FEI, while two CFSs are occupied by CCI, as shown in Figure 2. The simulation results are depicted in Figure 4.
In Figure 4a, we primarily conduct a feasibility study of the proposed RCP-APH algorithm, where the interference power is calculated as the absolute power magnitude after variance normalization of the received tensor $\mathcal{Y}$. The blue lines (solid, dotted, and dashed) depict the variations of the ELBO for the three different interferences, while the red solid line represents the estimated rank, i.e., the number of paths. The maximum number of iterations is 166, and at the 57th iteration, the algorithm reaches the true rank, as indicated by the red dotted line. It is noteworthy that at the 57th iteration the ELBO decreases slightly, which can be explained by the fact that the redundant paths are eliminated, resulting in a loss of information entropy due to the small value of η. Although choosing a large value of η may avoid this decrease in the ELBO, it would also increase the number of iterations and, in turn, the complexity. Therefore, the threshold η must be selected appropriately to balance complexity and accuracy.
Figure 4b mainly analyzes the estimation performance of the RCP and RCP-APH algorithms. Firstly, the two curves in the figure represent the interference power distributions estimated by the two algorithms. It can be observed that, compared to the distribution estimated by the RCP-APH algorithm, the interference power estimated by the RCP shows a concentrated distribution, making it difficult to distinguish the true interference. Secondly, the solid and dashed vertical lines in the figure represent the noise power, signal power, and noise precision estimated by both algorithms. The RCP-APH estimates the SNR more accurately than the RCP, as evidenced by the difference between the estimated signal power and noise power. Finally, in selecting the interference threshold, we consider the three aforementioned estimation metrics. If the estimated noise power is used as the threshold, the RCP would be unable to capture interference information. Therefore, this paper uses the noise precision as the threshold for extracting interference terms. This threshold has the advantage of not only extracting the high-power interference estimated by the RCP but also facilitating subsequent performance comparisons of both algorithms.

Channel Estimation Performance
Within this section, we evaluate the performance of rank estimation and channel parameter estimation. Firstly, a comparison of performance under different interference power ratios is conducted, as illustrated in Figure 5. In the rank estimation of Figure 5a, it is observed that information-theoretic methods, i.e., MDL and AIC, are ineffective in the presence of strong interference. This confirms the unsuitability of traditional information-theoretic approaches in the case of interference due to overfitting. As a result, channel parameter estimation algorithms based on matrix processing that strongly depend on rank estimation are significantly degraded. It is also evident that algorithms based on the variational model outperform information-theoretic methods. Additionally, under low interference power ($(\sigma^{Tp})^2 = 5\sigma^2_{Noise}$), the RCP-APH algorithm surpasses RCP for all interference ratios. Under high interference power ($(\sigma^{Tp})^2 = 10\sigma^2_{Noise}$), RCP-APH only slightly lags behind RCP in the extremely unfavorable scenario of β = 0.8. This reveals the robustness of the proposed algorithm.
In Figure 5b,c, it can be observed that the proposed RCP-APH outperforms other algorithms and reveals its robustness against changes in interference power as indicated by the black line.Furthermore, traditional RCP exhibits a certain degree of robustness.However, due to the lack of actual interference modeling, its performance is comparatively inferior, as indicated by the green lines.Moreover, for methods that require the number of multipaths to be known, such as the CP and Tucker decomposition methods, CP shows better performance due to its effective reduction of interference in single dimensions through multidimensional iterations.On the other hand, Tucker decomposition, due to HOSVD, encompasses interference information from multiple dimensions, resulting in the poorest performance.

Interference Estimation Performance
A comparison is conducted in terms of the performance of time-frequency position estimation for interference.In this context, "time" represents the large-scale sampling time, denoted as t, not to be confused with the small-scale delay τ.The term "frequency" denotes the position of frequency sampling points for interference.Since this paper processes all sub-channel snapshots at a single sampling time t, the discussion is thereby simplified to identifying the interference position at frequency sampling points.Here, a simulation of the RCP-APH algorithm is performed under conditions of β = 0.2 and ρ = 20 dB.The received tensor is illustrated in Figure 2, and the interference parameter estimation is depicted in Figure 6.The red box visualizes an FEI-T that is composed of three vertical green lines, indicating that all receiving antennas are affected by interference from the same transmitter antenna for three CFSs.The blue box shows an FEI-R that occupies N BS−T units in the horizontal direction and that lasts for three CFSs.Further, the black block represents a CCI that spans over both the vertical and horizontal directions with two CFSs.In Figure 6c, the RCP-APH accomplishes interference estimation for a single realization.It is evident in Figure 6a that the lower-power regions, indicated by lighter colors, cannot be identified due to their power approaching the noise level.In Figure 6d, which is the unfolding of Figure 6c, interference items underestimated by the algorithm are represented in blue.Notably, the majority of interferences are accurately estimated, as indicated by the green color.
In Figure 7, the two variational algorithms are compared in a single experiment under different interference ratios. To better demonstrate the difference in performance, we use the same coordinate systems as in Figures 4b and 6b. The plots in the first row depict the probability density function (PDF) of the interference power at different values of β. The second row shows the specific positions of the true interference in unfolded form. The third and fourth rows show the estimated interference positions for both the RCP-APH and RCP algorithms. Here, the performance metric for position estimation based on the binary classification model in [33] is adopted. True positives (TPs) indicate accurately identified interference positions, depicted in green; false negatives (FNs) represent missed detections of interference positions, shown in blue; false positives (FPs) denote incorrectly identified interference positions, shown in red; and true negatives (TNs) signify correctly identified positions without interference, depicted in white. As seen in Figure 7, the proposed RCP-APH algorithm can distinguish interference by an optimal threshold of noise precision. As evident in the subsequent three rows of the figure, both algorithms exhibit a decreasing trend in red and an increasing trend in blue as β rises. This corresponds to the actual mapping: a transition from overestimation to underestimation. The distinct advantages of the proposed algorithm include: 1. Singular interference item estimates are rare, enabling a direct mapping between interference items and actual interference.

In the 200 independent experiments, the statistical characteristics of the interference are kept the same as in Section 5.3. The subsequent analysis employs three binary performance parameters as follows: 1. Precision = TP/(TP + FP) depicts the accuracy of the estimations; 2. Recall = TP/(TP + FN) signifies how many of the actual interference positions are captured; 3. F1 Score = 2 · Precision · Recall/(Precision + Recall) provides a comprehensive balance between the first two metrics.
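The three metrics above can be sketched in a few lines. This is only an illustrative sketch in plain Python, not the evaluation code used for the figures: the function names, the toy masks, and the convention of returning 0.0 when a denominator is zero are all assumptions made here.

```python
def confusion_counts(true_mask, est_mask):
    """Count TP, FP, FN, TN over two equally shaped 2-D binary masks."""
    tp = fp = fn = tn = 0
    for true_row, est_row in zip(true_mask, est_mask):
        for t, e in zip(true_row, est_row):
            if t and e:
                tp += 1      # interference present and detected
            elif e:
                fp += 1      # detected, but no interference present
            elif t:
                fn += 1      # interference present, but missed
            else:
                tn += 1      # correctly left empty
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1); a zero denominator yields 0.0 (assumed convention)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical unfolded interference masks (1 = interference item).
true_mask = [[1, 1, 0, 0],
             [1, 0, 0, 0]]
est_mask = [[1, 0, 1, 0],
            [1, 0, 0, 0]]

tp, fp, fn, tn = confusion_counts(true_mask, est_mask)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(tp, fp, fn, tn)
print(precision, recall, f1)
```

In this toy case there are two hits, one false alarm, and one miss, so precision, recall, and F1 all equal 2/3.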
As illustrated in Figure 8, the three performance metrics of the RCP algorithm increase with the growth of interference power. However, the performance gains associated with the interference power gradually diminish as β increases. A notable distinction between the RCP-APH and the RCP is that, for β ≥ 0.6, there is a decline in recall, leading to a corresponding decrease in the F1 score. Most importantly, across the various interference powers, all three performance metrics of the proposed algorithm consistently surpass those of the RCP algorithm by a significant margin.
In Figure 9, it is evident that as ρ decreases to 10 dB, all three performance metrics of both methods decline. The proposed algorithm exhibits slightly inferior performance compared to the RCP under low-ρ and high-β conditions. However, under high-ρ conditions and under low-ρ with low-β conditions, RCP-APH demonstrates superior performance. From Figure 10a, it can be observed that as the number of frequency points K increases, there is a slight decrease in precision for RCP-APH, while recall and F1 score exhibit a monotonic increase, significantly outperforming the RCP. As shown in Figure 10b, the three performance metrics remain nearly constant. However, due to incomplete observations, an outlier appears when the interference ratio reaches 50%. According to Figure 10c, widening the interference bandwidth improves precision for both algorithms, while recall and F1 score decline. Importantly, the estimation performance of RCP-APH consistently exceeds that of RCP.

Conclusions
In this paper, we propose a robust RCP based on the APH for interference. By exploiting the strong correlation of the interference, the proposed algorithm is capable of simultaneous estimation of the rank, channel, and interference parameters. In comparison with the RCP, the proposed algorithm has the following features: 1. Increasing the model sparsity reduces the computational complexity. 2. The noise precision, from which interference items can be inferred, is reasonably and accurately estimated. 3. The estimated interference items show spatial correlation, enabling more accurate identification of the type of interference. 4. The prior hypothesis aligns more closely with real interference, enhancing the overall performance of communication systems. Through a simulation analysis, a comprehensive examination was conducted using different SNRs, interference powers, tensor spatial structures, proportions of interference items occupied by CCI, and lengths of the interference bandwidth. This analysis provides conclusive evidence of the superior estimation performance of rank and channel parameters using the RCP-APH algorithm. Finally, the accurate interference time-frequency position estimation performance of the proposed algorithm is validated.

where

$$\mathbb{E}_{q(\{\mathbf{A}^{(n)}\},\,\mathcal{S}_{T_p},\,\nu)}\left[\ln p\left(\mathcal{Y}_{-S\backslash T_p} \mid \{\mathbf{A}^{(n)}\}, \mathcal{S}_{T_p}, \nu^{-1}\right)\right] = -\prod_n I_n \ln(\pi) + \prod_n I_n\,\mathbb{E}_q[\ln \nu] - \mathbb{E}_q[\nu]\,\mathbb{E}_q\left[\left\| \mathcal{Y}_{-S\backslash T_p} - [\![\mathbf{A}^{(1)}, \mathbf{A}^{(2)}, \mathbf{A}^{(3)}]\!] - \mathcal{S}_{T_p} \right\|_F^2\right]$$

$-\mathbb{E}_q[\ln q(\boldsymbol{\lambda})] =$ .

The Kronecker product of matrices $\mathbf{A} \in \mathbb{C}^{I \times J}$ and $\mathbf{B} \in \mathbb{C}^{K \times L}$ is a matrix of size $IK \times JL$, denoted by $\mathbf{A} \otimes \mathbf{B}$. The Khatri-Rao product of matrices $\mathbf{A} \in \mathbb{C}^{I \times K}$ and $\mathbf{B} \in \mathbb{C}^{J \times K}$ is $\mathbf{A} \odot \mathbf{B} \in \mathbb{C}^{IJ \times K}$, which is defined by a columnwise Kronecker product. Without loss of generality, the Hadamard product and Khatri-Rao product of a set of matrices, except the n-th matrix, can be simply denoted by

and $\mathbf{a}_{\mathrm{BS-T}}(\theta_l) = [\,1 \;\; e^{j\mu(\theta_l)} \;\; \cdots \;\; e^{j(N_{\mathrm{BS-T}}-1)\mu(\theta_l)}\,]^T$. At the same time, the phases of these are respectively represented by $\mu(\phi_l) = (2\pi/\lambda_c)\, d_r \sin\phi_l$ and $\mu(\theta_l) = (2\pi/\lambda_c)\, d_t \sin\theta_l$, where $\lambda_c$ is the signal wavelength.
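As a small illustration of the notation above, the following plain-Python sketch builds a transmit steering vector from the stated phase definition and evaluates the Kronecker and Khatri-Rao products on small matrices. The function names, the list-of-rows matrix representation, and the half-wavelength antenna spacing in the example are assumptions made here for illustration; they are not taken from the paper.

```python
import cmath
import math


def steering_vector(num_antennas, angle_rad, spacing, wavelength):
    """[1, e^{j mu}, ..., e^{j (N-1) mu}] with mu = (2 pi / wavelength) * spacing * sin(angle)."""
    mu = 2 * math.pi / wavelength * spacing * math.sin(angle_rad)
    return [cmath.exp(1j * n * mu) for n in range(num_antennas)]


def kronecker(a, b):
    """Kronecker product of an I x J and a K x L matrix (lists of rows) -> IK x JL."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def khatri_rao(a, b):
    """Columnwise Kronecker product of an I x K and a J x K matrix -> IJ x K."""
    if len(a[0]) != len(b[0]):
        raise ValueError("both matrices need the same number of columns")
    num_cols = len(a[0])
    return [[row_a[k] * row_b[k] for k in range(num_cols)]
            for row_a in a for row_b in b]


# Hypothetical example: 4 antennas, half-wavelength spacing, 30 degree angle,
# which gives mu = pi / 2 and a steering vector close to [1, j, -1, -j].
a_t = steering_vector(4, math.pi / 6, spacing=0.5, wavelength=1.0)

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(khatri_rao(A, B))  # [[5, 12], [7, 16], [15, 24], [21, 32]]
print(kronecker(A, B))
```

Each column of the Khatri-Rao result is the Kronecker product of the matching columns of the two inputs, which is why both inputs must have the same number of columns.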

Figure 2. The power composition of the received tensor.

3 .
It corresponds to the actual configuration, as $I_1 = N_{\mathrm{BS-R}}$, $I_2 = N_{\mathrm{BS-T}}$, and $I_3 = K$. In order to achieve RCP within a probabilistic framework, an observation model is introduced:

Figure 4. (a) The changes in the number of paths and the three variations of ELBO for RCP-APH. (b) The probability density function (PDF) of the interference item power distribution and other estimated parameters for RCP and RCP-APH.

Figure 5. For different interference item ratios, a comparison of rank and parameter estimation performance is conducted for interference powers of $5\sigma^2_{\mathrm{Noise}}$ and $10\sigma^2_{\mathrm{Noise}}$. (a) Rank estimation. (b) Angle estimation. (c) Delay estimation. Here, (b,c) share a common legend.

Figure 6.Study on the performance of interference estimation for the RCP-APH.Green indicates accurately estimated interference positions, while blue represents unestimated interference positions.(a) True interference; (b) matrix unfolding of true interference; (c) estimated interference; (d) matrix unfolding of estimated interference.

Figure 6b depicts the unfolding form of Figure 6a along the 1-mode pattern. The vertical axis has a size of $N_{\mathrm{BS-R}}$, and the horizontal axis has a size of $N_{\mathrm{BS-T}} \cdot K$. The green color in Figure 6b denotes the specific positions of the interference in the channel tensor.
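The mode-1 unfolding described here can be sketched as follows. This is a minimal plain-Python illustration, assuming the common convention in which the transmit-antenna index varies fastest along the columns (column index i2 + i3 · I2); the paper's exact column ordering is not stated in this passage, and the function name, tensor sizes, and interference placements are hypothetical.

```python
def unfold_mode1(tensor):
    """Unfold an I1 x I2 x I3 nested list into an I1 x (I2 * I3) matrix.

    Assumed column ordering: column = i2 + i3 * I2, so each frequency point
    contributes one contiguous block of I2 columns.
    """
    i2_size = len(tensor[0])
    i3_size = len(tensor[0][0])
    return [[row[i2][i3] for i3 in range(i3_size) for i2 in range(i2_size)]
            for row in tensor]


# Hypothetical sizes: 2 receive antennas, 3 transmit antennas, 4 frequency points.
n_r, n_t, k = 2, 3, 4
mask = [[[0] * k for _ in range(n_t)] for _ in range(n_r)]

# FEI-T-like pattern: transmit antenna 1 hits every receive antenna at frequencies 0 and 1.
for i1 in range(n_r):
    for i3 in (0, 1):
        mask[i1][1][i3] = 1

# FEI-R-like pattern: receive antenna 0 is hit on every transmit antenna at frequency 3.
for i2 in range(n_t):
    mask[0][i2][3] = 1

for row in unfold_mode1(mask):
    print(row)
```

Under this assumed ordering, the FEI-T-like pattern appears as full-height vertical lines, one per affected frequency point, while the FEI-R-like pattern appears as a horizontal run of $N_{\mathrm{BS-T}}$ entries in a single row, matching the shapes described for Figure 6b.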

Figure 7. Estimations of the interference positions are compared between two variational algorithms under different interference ratios. To clearly depict the performance differences between the algorithms, coordinate annotations for all subplots are omitted. The first row illustrates the estimated noise precision and PDF of the interference item power for both the RCP-APH and RCP algorithms. The coordinate scales are consistent with Figure 4b. The second row represents the actual interference, while the third and fourth rows depict the estimations of the interference positions for both algorithms. The coordinate scales align with those in Figure 6b.

Figure 8. Under different interference item ratios, a comparison of interference estimation is conducted for interference powers of $5\sigma^2_{\mathrm{Noise}}$ and $10\sigma^2_{\mathrm{Noise}}$. (a) Recall. (b) Precision. (c) F1 Score. Here, all subplots share a common legend.

Figure 9. Interference estimation performance is compared for different interference item ratios for both 10 dB and 20 dB of ρ. (a) Recall. (b) Precision. (c) F1 Score. Here, all subplots share a common legend.

Figure 10. Performance metrics for interference estimation for different spatial structures and interference characteristics. (a) Different sampling K. (b) Different ratio of CCI. (c) Different bandwidth of FEI. All subplots share a common legend.