Exploring coupled images fusion based on joint tensor decomposition

Data fusion has always been a hot research topic in human-centric computing and extended with the development of artificial intelligence. Generally, the coupled data fusion algorithm usually utilizes the information from one data set to improve the estimation accuracy and explain related latent variables of other coupled datasets. This paper proposes several kinds of coupled images decomposition algorithms based on the coupled matrix and tensor factorization-optimization (CMTF-OPT) algorithm and the flexible coupling algorithm, which are termed the coupled images factorization-optimization (CIF-OPT) algorithm and the modified flexible coupling algorithm respectively. The theory and experiments show that the effect of the CIF-OPT algorithm is robust under the influence of different noises. Particularly, the CIF-OPT algorithm can accurately restore an image with missing some data elements. Moreover, the flexible coupling model has better estimation performance than a hard coupling. For high-dimensional images, this paper adopts the compressed data decomposition algorithm that not only works better than uncoupled ALS algorithm as the image noise level increases, but saves time and cost compared to the uncompressed algorithm.

different information sources, and finally obtain clearer, more informative and higher quality fused images. It is that we can use different types of electronic data collection sensors to manage, analyze and integrate resources efficiently to provide clearer images to humans. However, multi-source heterogeneous image data analysis is now facing many problems. Complex data objects have multiple dimensions, and how to depict the relationship between them through data analysis is one of the challenges to solve urgently.
For example, we cannot utilize a general matrix to express the spectral image as there are multiple spectral band, (e.g. the mode 3 axis), where each spectral band represents a color image matrix. And we can see that multi-channel images have natural tensor structures from the above case. Moreover, the superposition of image matrix can be seen as a video if the mode 3 axis represents time, and above two tensor structures are shown in Figs. 1 and 2. Therefore, tensor can express multiple relationships in the real world, while this paper studies image fusion based on tensor analysis, which can abstractly describe the interaction mechanism between multiple aspects of image data. In addition, tensor structure has strong expression ability and computational properties, so it is very meaningful to study tensor analysis of images. Tensor decomposition is a very significant knowledge content, which can preserve the structural characteristics of the original image data [2].
What's more, multi-source heterogeneous semantics are much more rich. How to build a generalization model that integrates multi-source data or discover the correlation between them is another challenge for multi-source heterogeneous images. In this paper, we generally use couplings to refer to the correlation between heterogeneous images. When doctors make the diagnosis for a patient's brain, they can get some images of the patient's brain from a variety of ways, as shown in Fig. 3. And  whether there is a coupling between these brain images, and how to combine them to determine the etiology of patients are the questions to be discussed in this paper. In addition, for the fusion of hyperspectral images and multispectral images, the purpose is to fuse the information in the same scene to generate fusion image with large amount of information and high quality, as shown in Fig. 4. Similarly, what is the correlation between the information contained in these spectral images and how to combine complementary information to get a clearer image is also the significance of this paper.
The outline of this paper is organized as follows. In "Related work" section, related work about coupled images fusion is discussed. And we mainly describe some basic notations and definitions on tensors in "Tensor and related notation" section. In "Coupled image fusion" section, the proposed coupled image fusion algorithms are presented. Moreover, the experimental results on algorithms are shown in "Experiments and results" section. Finally, we give some conclusions and future research directions we need to study next.

Related work
Recently, motivated by the tensor nuclear norm(TNN), Pan Zhou and Canyi Lu proposed a novel low-rank tensor factorization method for efficiently solving the 3-way tensor completion problem, which can recover the synthetic data, inpainting images and videos with superior performance and efficiency [3]. For hyperspectral images, tensor decomposition can make full use of space and inter spectral redundancy between images, compressing the high spectral image and extracting the related feature information fast and high-quality [4][5][6]. Chia-Hsiang Lin et al proposed a convex optimization-Based coupled nonnegative matrix factorization algorithm for hyperspectral and multispectral data fusion [7]. Li and Dian proposed a coupled sparse tensor factorization (CSTF)-based approach for fusing hyperspectral images and multispectral images Fig. 3 The diagnosis process for a patient's brain to obtain a high spatial resolution hyperspectral image [8]. In addition, they consider high spatial resolution hyperspectral image (HR-HSI) as a 3D tensor and redefine the fusion problem as the estimation of a core tensor and dictionaries of the three modes [9]. Veganzones and Cohen proposed a Canonical Polyadic decomposition (CP) algorithm based on hyperspectral images, which was used to solve the problem of blind source signal separation [4]. They suggested to solve this problem as a low dimensional constrained tensor decomposition and applied kinds of fast decomposition of large nonnegative tensors which allowed a major speed up in the computation of the decomposition [10].
In the past few years, there have been many researches on the application of CP decomposition in image. In [11,12] used the CP decomposition into image compression and classification. Marcella Astrid et al used the CP decomposition into Convolutional Neural Networks (CNNs) to solve the image classification tasks [13]. Bauckhage introduced discriminant analysis to higher order data(i.e. color images) for classification [14]. For hyperspectral and multispectral images (i.e. multi-source heterogeneous data), kinds of methods exploits the Bayesian framework [15][16][17] to fuse such images. Rodrigo and Jeremy proposed a Bayesian framework to define flexible coupling models for joint tensor decompositions of multiple datasets [18,19]. They cast the problem of data fusion as the analysis of latent variable. And the latent models between data are coupled through subsets of their variables, where the coupling refers to the relationship between these variables subset. That is, there is a coupling relationship between the factor matrix after the tensor decomposition. In this paper, we hope to use the coupling relation between images to solve the problem, improving the accuracy and interpretability of the latent variables related to the other data set from the information of one data set through joint tensor decomposition algorithm.
In application, data analysis from different data sources needs to be handled the heterogeneous datasets (i.e. a matrix or a high-order tensor) [20,21]. Recently, matrix factorization and tensor-based factorization have been successfully applied to multi-frame data restoration [22][23][24][25], recognition [26,27], unmixing [28] and data fusion [18], etc. For the data analysis of some coupled matrices and tensors, corresponding coupled matrix and tensor factorization-optimization algorithm (i.e. CMTF-OPT) was proposed in [29]. The numerical experiments showed that The CMTF algorithm had better performance than CP algorithm in data recovery at a certain level of noise.
Similarly, the higher order coupled tensor decomposition problem was also studied in [10]. Model showed better performance in coupled tensor decomposition in the experiments. Rodrigo and Jérémy studied the coupling relationship between different data and the data decomposition algorithms under coupling, which showed that the decomposition algorithm based on coupled data had better convergence and the calculation of the algorithm took less time than alternating least squares (ALS) algorithm [18]. So using the coupling relations between images is the necessary measure and work to decompose images. Moreover the algorithms in [18,29] are not used to the coupled images. And S. Li and R. Dian do not make full use of the coupling relationship between images and use this coupling relation to accelerate the operation of the algorithm and restore the coupled image data [8]. Therefore, this paper proposes the coupled image data decomposition algorithms based on the CMTF-OPT algorithm [29] and the coupled tensor data decomposition algorithm [18].

Tensor and related notation
In the nineteenth century, Gauss and Riemann put forward the concept of tensors in the study of differential geometry. In 1916, Einstein applied tensors to the study of the general relativity, which made tensor analysis to be an important tool in continuum mechanics, theoretical physics and other disciplines. In 2005, characteristic polynomial was proposed for the first time in real symmetric tensor by Qi, and he presented the definition of the eigenvalues [30].
In order to study the data fusion between the coupled images better and simplify the presentation, this paper first introduces some of the following symbols and definitions. For the general tensor, this paper uses the calligraphic letters to represent them e.g. X , the matrix is denoted by capital letters e.g. X , and the scalar (or. vector) is represented by lowercase, e.g. x. The mode-n matricization of a tensor X ∈ R I 1 ×I 2 ×···×I N is denoted by X (n) , which can reduce the dimension of the tensor. For a matrix A ∈ R I×J , vectorization is to expand the matrix by column, forming a IJ Column vector, that is Given two matrices A ∈ R I×K and B ∈ R J ×K , the Khatri-Rao product is denoted as A ⊙ B , and the calculation results is a matrix of size IJ × K and defined by where ⊗ is the Kronecker product.
Given two tensors A, B ∈ R I 1 ×I 2 ×···×I N , the Hadamard product denoted as A * B , and the calculation results is a matrix of size I × J , i.e.
Given two tensors A, B ∈ R I 1 ×I 2 ×···×I N ,the inner product is defined as the sum of the product of its elements, i.e. we usually use ' • ' to represent the outer product.
Let matrix A = (a ij ) m×n ∈ C m×n , the Frobenius norm is defined as (4) �A, B� =

Tensor decomposition
In applications, the general tensor decomposition models can be divided into two categories, which are the Canonical Polyadic/PARAFAC decomposition (e.g. CP decomposition) and the Tucker decomposition model. The canonical decomposition was originally proposed by Carroll and Chang [31] and PARAFAC (parallel factors) by Harshman [32] separately. In 1966 Tucker [33] proposed the Tucker model. In particular, the CP decomposition model is a special case of the Tucker decomposition model. At the time, models were put forward to extract data characteristics from psychological tests. For the general matrix model, we can extract the potential information of matrix data, such as hyperspectral data fusion and blind source separation, by means of singular value decomposition of matrix, nonnegative matrix decomposition and so on. Similar to the idea of low rank approximation of matrix, researchers also want to extract latent information from tensor model data by means of tensor decomposition model.
For a tensor X ∈ R I 1 ×I 2 ×···×I N , the CP decomposition is expressed as where R is a positive integer, A (n) is called the factor matrix, which is a combination of rank one vector a (n) r , e.g.
for n = 1, 2, . . . , N , ∈ R R , a (n) r ∈ R I n , A (n) ∈ R I n ×R . Especially, for the three order tensor X ∈ R I×J ×K , the CP decomposition is expressed as Here, the column of factor matrix A , B and C is normalized to 1, and r is the weight. If the weight is assigned to the factor matrix, the CP decomposition can also be showed as Now we define D r composed of r is R × R × R order diagonal tensor, so the equation can be transformed into

Coupled data fusion
Data fusion, also known as collective data analysis, has been a hot topic in different fields. Data analysis from multiple sources has attracted considerable people in the Netflix Grand Prix. The goal is to accurately predict movie ratings. In order to get better ratings, additional data sources supplement the user score, such as the label information has been used. And the collective matrix factorization (CMF) proposed by Singh and Gordon [34] is based on the correlation between data sets, and the coupling matrix is factored simultaneously. Many researchers have paid their more attention to image fusion technique based on pulse coupled neural network. Literature [35] described the models and modified ones. As to the multi-focus image fusion problem, Veshki et al. utilized the sparse representation using a coupled dictionary to address the focused and blurred feature problem for higher quality [36]. In order to create spectral images with high spectral and spatial resolution, Yuan Zhou et al [37] proposed a fusion algorithm by combining linear spectral unmixing with the local low-rank property by extracting the abundance and the endmembers of Hyperspectral images usually have high spectral and low spatial resolution. Conversely, multispectral images.
For two matrices X ∈ R I×M , Y ∈ R I×L , the general CMF decomposition model is established by minimizing the following objective function where the matrices U ∈ R I×R , V ∈ R M×R and W ∈ R L×R are the factor matrices, R is the number of factors. In particular, because there exists a large number of high-order data, the data fusion between the coupled tensor and the matrix is discussed below.
For a tensor X ∈ R I×J ×K and a matrix Y ∈ R I×M , the general coupled tensor and matrx decomposition model is established by modifying the above objective function where the matrices A ∈ R I×R , B ∈ R J ×R and C ∈ R K ×R are factor matrices obtained by CP decomposition of the tensor X.
Alternating Least Squares Algorithm for coupled matrix and tensor factorization (CMTF-ALS) algorithm is proposed in [29]. The algorithm based on ALS is simple, small and effective. However, the convergence of the algorithm based ALS is not good with the missing data [38]. On the other hand, it is more robust to solve all CP factor matrices with an optimized algorithm, and is more easily extended to the missing data set [39]. Therefore, for high order data sets, with the support of the algorithm proposed in [29], this paper presents some coupled image decomposition algorithms, which describe the coupling analysis of heterogeneous image data sets.

CIF-OPT algorithm
The main purpose of this paper is exploring data fusion between coupled images. In general, images are stored in terms of tensor or matrix. Based on the CMTF optimization (CMTF-OPT) algorithm in [29], a coupled images factorization-optimization(CIF-OPT) algorithm is proposed. We firstly consider matrix image and N-order tensor image with one mode in common, where tensor image is decomposed through CP model and matrix image is decomposed through matrix decomposition. Given a tensor image X ∈ R I 1 ×I 2 ×···×I N and a matrix image Y ∈ R I 1 ×M which have the nth mode in common, where n ∈ {1, . . . , N } . Without loss of generality, we assume that two images coupled in the third mode, and the common latent structure in these images can be extracted by CIF-OPT algorithm. The objective function of the coupled analysis between the two image datasets is as follow In order to solve the above optimization problem, we can calculate its gradient and solve it by using any first order optimization algorithm of [40]. Next, this paper will discuss the gradient of the objective function, the first item of the function (11) is written as the second item of the function (11) is recorded as ,and the specific forms of the partial derivative of f 1 with respect to A (i) are as below where i = 1, 2, . . . , N.The matrices A and V ∈ R M×R are factor matrices obtained by matrix decomposition of the matrix Y.
The specific forms of the partial derivative of f 2 with respect to A (i) and V can be computed as Combined with the above calculation results, we can calculate the objective function f.
Finally, this paper calculates the gradient of the optimization function f, and its specific form is a vector e.g.
where the length of the vector is P = R N n=1 (I N + M) , which can be formed by vectorizing the partial derivatives with respect to each factor matrix and forming a column vector.
For some missing data sets, coupling analysis can still be carried out. The implementation of the algorithm can ignore missing data, and only analyze the known data elements to find the tensor or matrix model. And we applied it to the missing image and restored the original image based on the proposed coupled image decomposition algorithm through the another coupled image, which refer to [29].

Flexible coupling models
Rodrigo and Jérémy proposed the flexible coupling models based on the joint decomposition of Bayesian estimation in [18]. They mainly present two general examples of coupling priors such as joint Gaussian priors and non Gaussian conditional distributions. For two tensors with noisy measurements Y and Y ′ , Y is a second tensor (e.g. matrix) which can be the SVD Y = U V T + E , and Y ′ is a third order tensor which can be writ- Gaussian and non Gaussian models for the parameters in [18]. This paper only discusses the second example (e.g. the hybrid Gaussian model), and the other two cases are not considered by us. If readers are interested, you can refer to the literature [18].
If there is no prior information about some parameters, the joint Gauss modeling is not enough. On the contrary, we consider that these parameters are deterministic, while other parameters are still random Gauss priors. We call this model a hybrid Gaussian model. In fact, it only covers one scene which is factor matrix C is coupled to C ′ another by a transformation matrix, this coupled relation can be written by .
where H and H ′ are transformation matrices, Ŵ is noisy of independent and identically distributed (i.i.d.) Gaussian with matrix C , e.g. Ŵ ∼ N(0, σ 2 c ). Under the assumption of hybrid Gaussian model [8], the MAP estimation is obtained by minimizing the following cost function, that is, transforming function (22) into a. Hybrid gaussian modeling and alternating least squares (ALS) algorithm To minimize the above objective functions, standard algorithm matched with convex optimization can be used, Rodrigo and Jérémy proposed the modified version of the alternating least squares (ALS) which is widely used and easy to implement. We will refrain on detailing above algorithms.
Using the above algorithm to initialize the original tensor, the factor matrices are generated randomly and the tensor is formed by the tensor product operation between the factor matrices. Finally, we can estimate the effectiveness of the algorithms by comparing the mean square error between the factor matrices obtained by the above coupling tensor decomposition algorithms of the noisy tensor and the original factor matrices. If we apply the above algorithms to image, we need to initialize the original image, that is, generating the original factor matrices. In this paper, we use the following algorithm (i.e. Algorithm 1) to generate coupled images.
In this paper, the ALS algorithm is used to initialize the image, we modify the objective function and algorithm in [18] as follows: In order to apply ALS algorithm, it's necessary to calculate its gradient with respect to every factor matrix and set it to zero. For coupled factors C and C ′ , this algorithm only considers updating them simultaneously, which requires to solve the following linear equations [8]: and (23) step3: For simplification, absorbing the diagonal tensor D r and D r in a 0 andâ 0 to obtainÂ 0 andÂ 0 . Using the coupling factor σ c andĈ 0 to initial factor matricĈ 0 , e.g.
step4: Using tensor product forÂ 0 ,b 0 ,ĉ 0 andÂ 0 ,b 0 ,Ĉ 0 to obtain new tensors. Then the noise is added to the tensors to have coupled images Y, Y .
a This footnote is a zero mean white Gaussian matrix. b This footnote is the rank in tensor decomposition.

b. Joint tensor decomposition of high dimensional coupled images
For the actual high-dimensional images, a high dimensional coupled image decomposition algorithm is proposed. It is assumed that the coupling relationship between images is as follow: where Ŵ is an i.i.d. Gaussian matrix with variance of each element σ 2 c and C ′ has columns of given norm. If we disregard the coupling of the tensors, a common method to retrieve the CP models is to compress the data arrays, decomposing the compressed tensors, and then uncompress the obtained factors matrices, which is a more computationally efficient way.
For a three-order tensor X ∈ R I×J ×K ,the tucker decomposition is expressed as where U ∈ R I×P ,V ∈ R J ×Q , and W ∈ R K ×R are the factor matrices ,which are usually orthogonal. The positive integers P, Q, and R are the number of components (i.e., columns) in the factor matrices U , V , and W , respectively. The tensor G ∈ R P×Q×R is called the core tensor. Tucker decomposition is a high-order principal component analysis, which represents a tensor as a core tensor multiplied by a matrix along each mode.
If we assume two tensors are noiseless, i.e.
In this section, we consider the decomposition algorithm of the coupled high dimensional images. The high dimensional image is decomposed into the low dimensional core tensor through the Tucker decomposition, and then decomposing the two core tensors by CP decomposition to get the factor matrices, the compressed CP models will reduce the cost and consumption of the calculation. There exists the relationship below.
The factor matrix of the original tensor can be solved by matrix multiplication with less computation.
It is assumed that the coupling relationship between noiseless tensors in the compressed space as However, if one tensor is noisy(i.e. Y),there exists the similarity coupling in the compressed dimension where Ŵ c is a matrix of same dimensions as C c and with Gaussian i.i.d. entries of variance. The hybrid objective function in the compressed space is modified to the following objective function, which can be solved using the ALS algorithm.
For the factorization of some coupled images, we expect that the factor matrices obtained by factorization algorithm can be shown by images. That means to minimize the objective function under the constraint of nonnegative conditions. The MU algorithm is usually used for nonnegative matrix factorization [10].
Algorithm 2. Joint tensor decomposition algorithm for the coupled images step1: For two images, X , X . step2: Applying tucker decomposition algorithm to obtain step3: Decomposing the two core tensors by CP decomposition to get the factor matrices step4: Solving the hybrid objective function using the ALS algorithm.

Experiments and results
The main idea of this paper is applying the above coupled data fusion algorithms to deal with the image. It is well known that the stored data values of images are ordinarily large. If the tensor decomposition algorithm is applied directly to the coupled image data, the error is a big problem, which makes us use the command Im2double in Matlab to scale the values to reduce the numerical error in program operation.

CIF-OPT algorithm
In this section, the coupled matrix tensor decomposition method is applied to multispectral and panchromatic images. The original images which are the low spatial resolution multispectral image located in Beijing, China and the corresponding high spatial resolution panchromatic image are as follows.

Fig. 5 The multispectral image of area I
The experimental data are captured from Airborne Visible Infrared Imaging Spectrome-TER (AVIRIS) in Beijing. The AVIRIS data can provide 224 spectral segments with a spatial resolution of 20 m, covering a spectral range of 0.2 ∼ 2.4 m, and its spectral resolution is 10 nm. The size of multispectral and panchromatic images of area I which are shown in Figs. 5 and 6 are 300 × 300 × 3 and 300 × 300 pixels respectively. The size of multispectral and panchromatic images of area II which are shown in Figs. 7 and 8 are 256 × 256 × 3 and 256 × 256 pixels respectively. The multispectral and panchromatic image data of area I are read and initialized by MATLAB, and stored as tensor X ∈ R 300×300×3 and matrix Y ∈ R 300×300 . And the tensors of area II are generated in the same way.
According to the data generation of the CIF-OPT algorithm, the original data are sampled from the multispectral and panchromatic images, and the initial factor matrix is obtained by the ALS algorithm. Using the tensor product and the matrix multiplication to calculate the noiseless initial tensor and matrix. Without losing generality, we set the coupled factor matrix between tensor and matrix to C . In order to study the performance of the algorithm better, experiments are carried out under different noise levels. Adding noise to the obtained tensor X ∈ R 300×300×3 and the matrix Y ∈ R 300×300 after initializing the images, e.g.
where N ∈ R 300×300×3 is the stochastic Gaussian noise tensor, η is used to control the noise level. Similarly, the Gaussian noise is added to the matrix Y , that is, where N ∈ R 300×3 is the stochastic Gaussian noise matrix, η is used to control the noise level.
In this experiment, four different noise levels are set, which are 0, 0.1, 0.25, and 0.35 respectively. As the terminating condition of the CIF-OPT algorithm, it needs to be satisfied where f is the objective function, in addition, the maximum number of function values and iteration number is set to 10 4 and 10 3 respectively. In the process of experiment, the termination condition of algorithm depends on the change of function value before and after iteration.
The main purpose of this section is to apply the CIF-OPT algorithm to the coupled multispectral and panchromatic images and the specific results are shown in Table 1. From Table 1, it can be seen that the CIF-OPT algorithm has the similar fusion effect for different images, which shows that the algorithm has a certain robustness. The fusion effect will not change dramatically as the vary of image order and data elements. Moreover, the number of iterations does not alter greatly with the increase of order, which proves the feasibility of decomposition for coupled images.
The change from coupled data to the coupled image decomposition algorithm shows the iteration times and the maximum number of functions that CIF-OPT algorithm needs to convergence is larger than CMTF-OPT algorithm. The results are inevitable, because the coupled image needs to be initialized and converted to the coupled data to achieve the algorithm when the coupled image is decomposed. Therefore, the convergence of algorithm requires more iteration times. Accordingly, by observing the experiment of adding noise to the coupled image, it can be seen that the proposed algorithm can realize the decomposition of the coupled image, and the decomposition effect is better than the CMTF-OPT algorithm under the certain noise level. Consequently, the above results prove the feasibility of coupled image decomposition. Another ACMTF algorithm for coupling image decomposition conducted the same noise level experiment for region I, and compared with CMTF-OPT algorithm. The results are shown in Fig. 9. It can be seen that ACMTF algorithm is superior to CMTF-OPT algorithm in different noise levels. Therefore, based on the same parameters and noise conditions, this paper conducts the same noise level experiment for another ACIF algorithm of coupled image, and compares the target function value, iteration times, error and other parameter results between the two algorithms. The fusion effect under different noise levels is shown in Fig. 10. Experimental results show that CIF-OPT algorithm is more effective than ACIF algorithm when the added noise is less than 0.2. This is different from the data fusion effect of the two algorithms in the coupled data, and with the increase of noise level, the performance of the two algorithms in the fusion effect is almost the same, which shows that the improved image decomposition algorithm does not have better advantages than CIF-OPT algorithm.
In this paper, the number of iterations required for algorithm convergence is compared as shown in Fig. 11. It can be seen from Fig. 11 that the number of iterations required for ACIF algorithm to converge is more than CIF-OPT algorithm in most cases, and the running time from reaching convergence condition to algorithm termination is longer when the algorithm is running. In order to comprehensively consider various factors, this paper calculates the fusion errors of the four algorithms to comprehensively compare the accuracy of the algorithm for data decomposition. See Fig. 12 for the specific results. Figure 12 shows that to some extent, the decomposition effect of the coupled image is better than that of the coupled data, but the error of the decomposition algorithm of the coupled data has a certain stability, and the error difference is lower than that of the decomposition algorithm of the coupled image. For the two algorithms of coupled image decomposition, when the noise level is lower than 0.25, the error value of the two  Fig. 12 is small. Therefore, with the increase of noise level, the error value of CIF-OPT algorithm is very close to that of ACIF algorithm. So for the coupled image decomposition algorithm, CIF-OPT algorithm shows better fusion effect at low noise level, and the error difference and the number of iterations required for algorithm convergence are less.
The above algorithm is mainly based on the comparison of the result parameters of region I. We compare the iterations of the algorithm and the difference between the objective function values for two different regions based on CIF-OPT algorithm, and the results are shown in Figs. 13 and 14. It can be seen that there is no obvious linear relationship between the number of iterations and the image size. Because the magnitude of Fig. 14 is very small, the objective function values of the two regions are very close, that is to say, the function optimal value of CIF-OPT algorithm will not change significantly due to the difference of image size and pixel.
Based on the proposed CIF-OPT optimization algorithm, the multispectral images of the missing data coupled with the panchromatic image are restored. That is, all the data  of the coupled image with 25% missing data is better than the effect of image with 50% missing data. That is, the correlation between real value and estimated value is stronger when the percentage of missing data is 25% , and the linear slope is closer to 1.

Straightforward hybrid Gaussian coupled decomposition
We consider the straightforward hybrid Gaussian coupling model C = C ′ + Ŵ .
And we select the small images inside the red frame as the uncoupled original tensors of Algorithm 1 in Fig. 17. The two CP models are generated by these images with dimensions has some noise σ ′ n = 0.1 . The coupled images were generated by Algorithm 1, and decomposed by the ALS algorithm (i.e. Alg. 1) in [8]. Alg. 1 is applied to estimate the CP models under 400 different noise and coupling realizations. We also evaluate the total MSE on the C and C ′ factors and the total MSE for an ALS algorithm. And the total MSE on a factor, for example the total MSE on C with N r different noise realizations is (1/N r ) K k=1 R r=1 N r n=1 (C kr −Ĉ n kr ) 2 , where Ĉ n kr is the factor estimated in the n-th noise realization.
Due to the different coupling intensity of the image, we used the above algorithms to decompose the image and get the mean square error of the factor matrix C for Fig. 18 The coupled image Y Fig. 19 The coupled image Y ′ different coupling intensity. We can see that the factor matrix is generally better estimated by increasing the coupling density while applying a hard coupling(i.e. C = C ′ ) in Fig. 20, since more information comes from the clean tensor through the coupling. For a flexible coupling, instead of the mean square error decreases while 1/σ c > 10 2 . And in the interval 1/σ c ∈ [10; 10 4 ] , the flexible coupling model has better estimation performance than the hard couplings. For 1/σ c > 10 4 , the flexible couplings works better than a hard coupling.

Compressed coupled decomposition
For high-dimensional images, the compressed data decomposition algorithm is adopted in this paper. The selected images are the Lena noised images which are as follows in Figs Fig. 23, which showed that the mean square error of factor matrix is the smallest when R = 3 . So later in this paper, R = 3 as the best number of components is used for other experiments. Factor matrices are generated by CP decomposition algorithm similarly to the previous example. The matrix C is coupled with factor matrix C ′ with additive zero mean Gaussian noise of variance σ 2 c . Where (39)  and γ is a zero mean white Gaussian matrix. The data array Y is almost noiseless σ n = 0.001 , while Y ′ has some noise σ ′ n = 0.1.We compare the performance of the coupled algorithm in the compressed space and the uncompressed space. Results for 20 iterations of the coupled algorithms are shown in Figs. 24 and 25. Compression was computed with randomized SVD from [41]. And initializations were given by two coupled uncompressed ALS with 1000 iterations, themselves initialized by decomposing images.
As shown in the picture, the compression and ALS algorithms show similar performance in the case of coupling, and the decomposition accuracy of the two algorithms decreases with the increase of the coupling density.
Then we study the relationship between noise and algorithm estimation performance. The SNR for Y is set to 22 dB, and it varies from 4 dB to 20 dB for Y ′ .Where We compare the performance of the coupled algorithm in the compressed space and the uncompressed space, as well as standard ALS in the uncompressed space. Note that in Fig. 26, when the SNR < 18 dB, uncoupled ALS algorithm works better than the remaining two algorithms. As the noise level increases, for SNR > 18 dB, the uncoupled ALS algorithm decreases estimation performance in both coupled and compression cases. We can see that the compression decomposition algorithm obviously saves time and cost compared to the uncompressed algorithm which may be rather slow. As the noise level increases, uncoupled ALS decreases estimation performance in both coupled and compression cases. In addition, hybrid coupling model is only verified in Gaussian condition to show its effectiveness and we will do other experiments in the future to demonstrate the versatility of the algorithm.

Conclusions
In this paper, we propose two algorithms for coupled image decomposition, which mainly utilize the CMTF-OPT algorithm and the flexible Bayesian model in the coupled data decomposition. For the proposed CIF-OPT algorithm, the corresponding experiments show that the effect of the coupled image decomposition under the influence of different noise is robust, and the fusion effect is better than the CMTF-OPT algorithm, which shows that the coupled images decomposition algorithm is feasible. In addition, because the expression of a phenomenon can be different from all kinds of data sets, the link set of data decomposition should be flexible. Therefore, this paper presents the modified flexible Bayesian model. From the experiments of it, we can easily see that the factor matrix could be estimated better by increasing the coupling density. And from the aspect of algorithm, the flexible coupling model has better estimation performance than the hard coupling models. Moreover, a coupled data compression scheme is derived from tensor images of large data sets. As the noise level increases, uncoupled ALS decreases estimation performance in both coupled and compression cases. On the run time of the algorithm, the compression decomposition algorithm obviously saves time and cost compared to the uncompressed algorithm. In fact, the image matrix is nonnegative. Therefore, when considering the coupled image decomposition algorithm, adding non negative constraints is our work in the future.