Improved Coupled Tensor Factorization with Its Applications in Health Data Analysis

Coupled matrix and tensor factorizations have been successfully used in many data fusion scenarios where datasets are assumed to be exactly coupled. However, in the real world, not all datasets share the same factor matrices, which makes joint analysis of multiple heterogeneous sources challenging. For this reason, approximate coupling or partial coupling is widely used in real-world data fusion, with exact coupling as a special case of these techniques. To more fully address this challenge, in this paper we propose two improved coupled tensor factorization methods: one for approximately coupled datasets and the other for partially coupled datasets. A series of experiments using both simulated data and three real-world datasets demonstrates the improved accuracy of these approaches over existing baselines. In particular, in experiments on MRI data, our method improves accuracy by as much as 12.47% compared with traditional methods.


Introduction
With the rapid development of cyber-physical systems, a soaring amount of data from heterogeneous sources is now easily accessible. Analysing data from multiple sources has been proven to enhance knowledge discovery by capturing underlying structures that are otherwise difficult to extract. For instance, recommendation systems can rely not only on past user ratings but also on the supply chain surrounding a product, the similarity between users, and other information as additional assistance for joint analysis [1][2][3]. Drawing upon such related information can improve recommendation performance. In metabolomics, where biological fluids are studied with analytical techniques such as LC-MS (liquid chromatography-mass spectrometry) and NMR (nuclear magnetic resonance), joint analysis helps to accurately identify the various component chemicals [4,5]. Electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) are complementary modalities and, when jointly analyzed, can provide "the best of both worlds," i.e., EEG's superior temporal resolution and fMRI's superior spatial resolution. Hence, fusing models can, for example, provide deeper insights into the activities of the brain or help improve medical treatments for nervous system diseases [6][7][8].
A common and effective way to deal with multisource data is to represent the data as matrices and then use collective matrix factorization (CMF) [9] for joint analysis. Matrix-based joint analysis is used extensively in many fields, including bioinformatics [10,11], social network analysis [12,13], signal processing [14,15], and so on. However, this type of analysis only works with two-dimensional data and cannot be applied to datasets of three or more dimensions. Recent developments in sensor technology now allow more and different aspects of data to be captured, and higher-order tensors have become an important tool for representing these multidimensional datasets. Accordingly, tensor decomposition was introduced to accurately extract their underlying structures. The main contributions of this paper are as follows:

(i) The proposed CTF-AC method is the very first tensor factorization model to address data fusion with approximately coupled datasets. This model is also suitable for multisource datasets that are not coupled but where the data are highly correlated.
(ii) By combining individual decomposition and coupled decomposition, we obtain a new coupled tensor factorization method called CTF-PSF. This method handles data fusion with partially coupled datasets.
(iii) Extensive experiments on synthetic and real-world datasets verify that the two proposed methods generate more accurate results than the traditional methods.
The rest of this paper is organized as follows. Section 2 introduces some background knowledge on tensor decomposition and provides the problem definition. The details of how our approaches work are introduced in Section 3. Section 4 describes the experimental design of this paper and reports numerical experiments that illustrate the advantages of the proposed methods. Finally, we conclude our work and discuss future research directions in Section 5.

Figure 1: A CP decomposition of a third-order tensor. Figure adapted from [29].

Preliminaries and Problem Definition
Following the notation in [29], vectors (tensors of order one) are denoted by boldface lowercase letters, e.g., a, b, c. Matrices (tensors of order two) appear as boldface capital letters, e.g., A, B, C. The ith column of A is denoted as a_i. A^(n) indicates the nth matrix in a sequence; for example, A^(1), A^(2), ..., A^(N) represent a sequence of N matrices. The transpose of matrix A is denoted by A^T. Higher-order tensors (third-order or higher) appear as boldface Euler script letters, e.g., X, W. X_(n) indicates the mode-n matricization of an Nth-order tensor X, which can be obtained by permuting the dimensions of X and reshaping the permuted tensor into a matrix. ‖a‖ and ‖A‖ denote the two-norm of a and the Frobenius norm of A, respectively. The Hadamard (elementwise) product is indicated by ∗. Table 1 lists all the symbols used in this paper.
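The mode-n matricization described above can be sketched in a few lines of numpy. This is an illustrative sketch only; the function name `unfold` and the choice of Fortran ordering (which matches the column ordering in Kolda and Bader's survey [29]) are our assumptions, not code from the paper.

```python
import numpy as np

def unfold(X, mode):
    """Mode-n matricization: move `mode` to the front, then flatten the
    remaining modes into columns (Kolda & Bader ordering, hence order='F')."""
    return np.reshape(np.moveaxis(X, mode, 0), (X.shape[mode], -1), order='F')

X = np.arange(24).reshape(3, 4, 2)
print(unfold(X, 0).shape)  # (3, 8)
```

Each unfolding keeps the chosen mode as rows and merges all other modes into columns, so an I×J×K tensor unfolds to I×(JK), J×(IK), and K×(IJ) matrices.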
2.1. CANDECOMP/PARAFAC Decomposition. CANDECOMP/PARAFAC (CP) is one of the most popular tensor decompositions. The goal of CP decomposition is to factorize a tensor into a sum of rank-one tensors. For instance, given a third-order tensor X ∈ R^{I×J×K}, after CP decomposition, X can be approximately represented as

X ≈ Σ_{r=1}^{R} a_r ∘ b_r ∘ c_r,

where each a_r ∘ b_r ∘ c_r is a rank-one tensor and the symbol "∘" represents the vector outer product operator [29]. R is a positive integer, meaning that X is approximated with R rank-one tensors. This CP model can be concisely described by

X ≈ ⟦A, B, C⟧,

where ⟦·⟧ denotes the CP decomposition operator [30]. In this CP decomposition, A, B, and C are the factor matrices of X, whose columns are the vectors of the rank-one components in Figure 1; i.e., A = [a_1 a_2 ... a_R].
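The CP reconstruction ⟦A, B, C⟧ can be sketched directly with an einsum; the helper name `cp_to_tensor` and the random factor sizes are illustrative assumptions.

```python
import numpy as np

def cp_to_tensor(A, B, C):
    """Reconstruct a third-order tensor from CP factor matrices:
    X ≈ sum_r a_r ∘ b_r ∘ c_r, where R is the number of columns."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

rng = np.random.default_rng(0)
R = 3
A = rng.standard_normal((5, R))
B = rng.standard_normal((4, R))
C = rng.standard_normal((6, R))
X = cp_to_tensor(A, B, C)
print(X.shape)  # (5, 4, 6)
```

The einsum form is equivalent to summing the R outer products a_r ∘ b_r ∘ c_r explicitly, but avoids materializing each rank-one tensor.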
Later, Acar et al. improved the CP algorithm and developed an algorithm named CP-WOPT [31], which uses a first-order optimization method to solve the weighted least squares problem.
Figure 2: An illustration of a joint decomposition with a third-order tensor and a matrix factorization of R components. X and Y share one dimension; that is, A = [a_1, a_2, ..., a_R] is the common factor matrix. Figure adapted from [32].

Coupled Tensor Factorization.
Coupled factorization methods have become an effective means for jointly analyzing multisource datasets. The simplest form of coupled factorization is collective matrix factorization (CMF). For example, in a movie recommendation system, additional information about a movie, such as its genre, its actors, or the user's social network, could be used alongside the user's historical ratings to improve the accuracy of rating predictions. A user rating matrix can be expressed as a matrix X (users × movies) coupled in the movie dimension with a matrix Y (movies × genres). This CMF model can be defined as

min_{U,V,W} ‖X − UV^T‖² + ‖Y − VW^T‖²,

where U, V, and W are the factor matrices and V is shared by the two datasets.
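A minimal numpy sketch of this CMF model follows. For simplicity we solve it with alternating least squares rather than a gradient-based solver; the function name `cmf_als`, the toy dimensions, and the exact-low-rank test data are all our assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, n_movies, n_genres, R = 30, 20, 5, 3

# Ground-truth factors; V (the movie factor) is shared by both matrices.
U0, V0, W0 = (rng.standard_normal((n, R)) for n in (n_users, n_movies, n_genres))
X = U0 @ V0.T          # user-movie ratings
Y = V0 @ W0.T          # movie-genre side information

def cmf_als(X, Y, R, n_iter=50, seed=0):
    """Alternating least squares for min ||X - U V^T||^2 + ||Y - V W^T||^2."""
    rng = np.random.default_rng(seed)
    V = rng.standard_normal((X.shape[1], R))
    for _ in range(n_iter):
        U = np.linalg.lstsq(V, X.T, rcond=None)[0].T   # fit U with V fixed
        W = np.linalg.lstsq(V, Y, rcond=None)[0].T     # fit W with V fixed
        G = np.vstack([U, W])                          # both couplings of V
        Z = np.hstack([X.T, Y])                        # movies x (users+genres)
        V = np.linalg.lstsq(G, Z.T, rcond=None)[0].T   # fit shared V
    return U, V, W

U, V, W = cmf_als(X, Y, R)
err = np.linalg.norm(X - U @ V.T)**2 + np.linalg.norm(Y - V @ W.T)**2
```

The key step is the update of V: because V appears in both fit terms, its least-squares subproblem stacks the two datasets, which is exactly how the coupling transfers information between them.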
As shown in Figure 2, a high-order extension of CMF, i.e., a CMTF model, can be simply defined as

min_{A,B,C,V} ‖X − ⟦A, B, C⟧‖² + ‖Y − AV^T‖²,

where A, B, C, and V are the factor matrices and A is shared by the tensor and the matrix. To solve this problem, CMTF-OPT vectorizes all the factor matrices and their partial derivatives so that the problem can be solved by any gradient-based optimization algorithm, such as the nonlinear conjugate gradient (NCG) method. More details can be found in [23].
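The vectorization idea behind CMTF-OPT can be sketched as follows: pack all factor matrices into one parameter vector, return the objective and its vectorized gradient, and hand both to a generic optimizer. This is a sketch under our own assumptions (tiny dimensions, L-BFGS instead of NCG, and helper names `unpack`/`fg`), not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
I, J, K, L, R = 6, 5, 4, 3, 2
shapes = [(I, R), (J, R), (K, R), (L, R)]

# Exact low-rank coupled data: X = [[A,B,C]], Y = A V^T (A is shared).
A0, B0, C0, V0 = (rng.standard_normal(s) for s in shapes)
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
Y = A0 @ V0.T

def unpack(z):
    mats, pos = [], 0
    for (n, r) in shapes:
        mats.append(z[pos:pos + n * r].reshape(n, r))
        pos += n * r
    return mats

def fg(z):
    """Objective and vectorized gradient of the CMTF model."""
    A, B, C, V = unpack(z)
    E = X - np.einsum('ir,jr,kr->ijk', A, B, C)   # tensor residual
    F = Y - A @ V.T                               # matrix residual
    f = np.sum(E**2) + np.sum(F**2)
    gA = -2 * np.einsum('ijk,jr,kr->ir', E, B, C) - 2 * F @ V
    gB = -2 * np.einsum('ijk,ir,kr->jr', E, A, C)
    gC = -2 * np.einsum('ijk,ir,jr->kr', E, A, B)
    gV = -2 * F.T @ A
    return f, np.concatenate([g.ravel() for g in (gA, gB, gC, gV)])

z0 = 0.1 * rng.standard_normal(sum(n * r for n, r in shapes))
res = minimize(fg, z0, jac=True, method='L-BFGS-B', options={'maxiter': 500})
```

Because everything lives in one flat vector `z`, any off-the-shelf first-order method can drive the fit, which is the point the text makes about CMTF-OPT.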

Problem Definition.
Consider a coupled tensor factorization of two third-order tensors X_1 and X_2 that are coupled in the first dimension. A, B, and C are the factor matrices of X_1, and U, V, and W are the factor matrices of X_2. To jointly factorize X_1 and X_2, the objective function can be written as

f = ‖X_1 − ⟦A, B, C⟧‖² + ‖X_2 − ⟦U, V, W⟧‖², with A = U. (5)

Given our focus is on situations that are not exactly coupled, but rather approximately coupled, e.g., A ≈ U, function (5) is no longer applicable. However, as with soft constraints, the matrices can be coupled approximately by adding a regularization term. The objective function then becomes

f = ‖X_1 − ⟦A, B, C⟧‖² + ‖X_2 − ⟦U, V, W⟧‖² + λ‖A − U‖², (6)

where λ controls the strength of the coupling. However, function (6) has two potential issues. (i) The model loses accuracy when there is a large difference between the number of entries in tensors X_1 and X_2. The errors from approximating X_1 and X_2 will have a different impact on the objective function depending on whether there are many more or many fewer entries in X_1 than in X_2. Therefore, using the same weight ratio will, obviously, result in a loss of accuracy [33]. (ii) Further, this model is only suitable for two-tensor coupling and cannot be applied to scenarios with multiple tensors.
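A minimal numpy sketch of this regularized objective follows. The helper names (`cp3`, `coupled_objective`) and the tiny random factors are illustrative assumptions; λ appears as `lam`.

```python
import numpy as np

def cp3(A, B, C):
    return np.einsum('ir,jr,kr->ijk', A, B, C)

def coupled_objective(X1, X2, A, B, C, U, V, W, lam):
    """f = ||X1 - [[A,B,C]]||^2 + ||X2 - [[U,V,W]]||^2 + lam * ||A - U||_F^2.
    lam = 0 decouples the two fits; a large lam pushes A toward U,
    approaching the hard constraint A = U of exact coupling."""
    fit1 = np.sum((X1 - cp3(A, B, C))**2)
    fit2 = np.sum((X2 - cp3(U, V, W))**2)
    return fit1 + fit2 + lam * np.sum((A - U)**2)

rng = np.random.default_rng(3)
R = 2
A, B, C = (rng.standard_normal((4, R)) for _ in range(3))
U = A + 0.01 * rng.standard_normal(A.shape)   # approximately coupled: U ≈ A
V, W = (rng.standard_normal((5, R)) for _ in range(2))
X1, X2 = cp3(A, B, C), cp3(U, V, W)
```

With exact factors, both fit terms vanish and the objective reduces to the soft-coupling penalty alone, making the role of λ easy to see.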
A completely shared factor matrix, whether approximate or not, covers only one type of coupling, e.g., A = U. There are other coupling scenarios, such as partial coupling, where heterogeneous datasets share only some, but not all, components [34]. Methods based on function (4) may not be applicable to such situations. Hence, we turn our attention to partial coupling with an extension of CTF-AC, called CTF-PSF. CTF-PSF is based on Acar et al.'s CMTF-OPT algorithm [23], but with some modifications to allow for data reconstruction with heterogeneous data that have both shared and unshared components. More details on this model appear in Section 3.2.

The Proposed Models
In real life, many heterogeneous datasets are only approximately coupled, meaning that their shared dimensions are not exactly coupled. The CTF-AC model offers a joint decomposition solution for approximately coupled datasets. Moreover, it is relatively common for multisource datasets to be partially coupled, meaning that only some of the factors in a potentially shared matrix are shared, not all, as is the case with exact coupling and its variants. To address these situations, we have extended CTF-AC to incorporate the CMTF-OPT algorithm in a method called CTF-PSF, which offers joint decomposition for partially coupled datasets. The CTF-AC model is presented in Section 3.1, and the CTF-PSF model in Section 3.2.

CTF-AC.
To address the two potential problems associated with function (6), i.e., unbalanced tensor entries and inapplicability to multitensor scenarios, we have developed a "two birds with one stone" solution. To overcome potential inaccuracies resulting from unbalanced tensor entry distributions, we have added error weights to the objective function (Section 3.1.1), and to extend traditional models for use with more than two tensors, we have added a soft constraint to the transfer factor matrix (Section 3.1.2).

Adding Error Weights.
When a weight is assigned to the fitting error of each tensor, function (6) becomes

f = (1/(2‖W^(1)‖²)) ‖W^(1) ∗ (X_1 − ⟦A, B, C⟧)‖² + (1/(2‖W^(2)‖²)) ‖W^(2) ∗ (X_2 − ⟦U, V, W⟧)‖² + (λ/2) ‖A − U‖²,

where the factor 1/2 helps with the derivative calculations, and W^(1) and W^(2) are binary tensors of the same sizes as X_1 and X_2, respectively, whose ones mark the observed entries. Therefore, ‖W^(1)‖² and ‖W^(2)‖² indicate the numbers of observed entries of X_1 and X_2. In this way, the model eliminates the influence that an imbalanced number of tensor entries has on accuracy.
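The normalization above can be sketched in numpy to show why it balances tensors of very different sizes: two tensors with the same per-entry error receive the same weighted fit value regardless of how many entries they have. The function name `normalized_fit` and the toy sizes are illustrative assumptions.

```python
import numpy as np

def normalized_fit(X, Xhat, W):
    """(1/(2*||W||^2)) * ||W * (X - Xhat)||^2, where W is a binary tensor
    marking observed entries; ||W||^2 counts them."""
    n_obs = np.sum(W)
    return np.sum((W * (X - Xhat))**2) / (2 * n_obs)

rng = np.random.default_rng(4)
X_small = rng.standard_normal((5, 5, 5))      # 125 entries
X_big = rng.standard_normal((20, 20, 20))     # 8000 entries
err = 0.1                                     # identical per-entry error
f_small = normalized_fit(X_small, X_small + err, np.ones_like(X_small))
f_big = normalized_fit(X_big, X_big + err, np.ones_like(X_big))
```

Without the 1/‖W‖² factor, the larger tensor's error would dominate the objective by a factor of 64 here; with it, both terms contribute equally.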

Adding a Soft Constraint to the Transfer Factor Matrix.
To extend traditional models for use with more than two tensors, we modify function (8) on the assumption that the Frobenius-norm soft constraint can be applied transitively. Assume that tensor X_3 is approximately coupled with X_1 and X_2, and that D, E, and F are the factor matrices of X_3. The three tensors can then be approximately coupled by chaining the soft constraints between consecutive pairs of shared factor matrices.

Now consider a more general situation. Suppose there are M tensors from M sources. The objective function of the joint decomposition of these tensors, based on the CP model, is defined as

f = Σ_{m=1}^{M} (1/(2‖W^(m)‖²)) ‖W^(m) ∗ (X_m − ⟦A_m^(1), A_m^(2), ..., A_m^(N)⟧)‖²,

where W^(m) is the binary weight tensor of X_m. Further, assume that there are C related factors among these M relevant tensors. The soft constraint transfer then adds, for each related factor, a chain of penalties between consecutive tensors, so that the objective becomes

f = Σ_{m=1}^{M} (1/(2‖W^(m)‖²)) ‖W^(m) ∗ (X_m − ⟦A_m^(1), ..., A_m^(N)⟧)‖² + (λ/2) Σ_{c=1}^{C} Σ_{m=1}^{M−1} ‖A_m^(c) − A_{m+1}^(c)‖².

The partial derivatives of f with respect to each A_m^(c) can then be calculated from this objective. The factor matrices of multiple tensors are constrained using this soft constraint transfer method. When taking partial derivatives of a shared factor matrix, the solution for the first tensor and the last tensor is somewhat different from that for the tensors in the middle: the first and last matrices each appear in a single penalty term, whereas a middle matrix appears in two. Hence, the shared factor matrix gradients divide into three cases, i.e., first, last, and middle.
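The chained penalty and its three gradient cases can be sketched directly; the helper names `chain_penalty`/`chain_penalty_grad` and the toy matrices are our assumptions, and we show only the penalty term, not the full weighted fit.

```python
import numpy as np

def chain_penalty(mats, lam):
    """(lam/2) * sum_m ||A_m - A_{m+1}||_F^2 over consecutive tensors."""
    return 0.5 * lam * sum(np.sum((mats[m] - mats[m + 1])**2)
                           for m in range(len(mats) - 1))

def chain_penalty_grad(mats, lam):
    """Gradients split into the three cases described in the text:
    first (one penalty term), middle (two terms), last (one term)."""
    M = len(mats)
    grads = []
    for m in range(M):
        if m == 0:
            g = mats[0] - mats[1]
        elif m == M - 1:
            g = mats[-1] - mats[-2]
        else:
            g = 2 * mats[m] - mats[m - 1] - mats[m + 1]
        grads.append(lam * g)
    return grads

rng = np.random.default_rng(5)
mats = [rng.standard_normal((6, 3)) for _ in range(4)]   # M = 4 shared factors
grads = chain_penalty_grad(mats, lam=0.5)
```

A useful sanity check is that the gradients sum to zero across the chain (the penalty is invariant to shifting all matrices by the same amount), which the test below verifies alongside a finite-difference check.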
With all the gradients of the factor matrices derived, the problem can be solved with any gradient-based method. The flow of the joint nonlinear conjugate gradient algorithm with C related factors of the M relevant tensors is shown in Algorithm 1. Convergence is achieved when the relative change of the objective function is less than the set threshold; the algorithm also terminates when the number of iterations reaches its maximum.

CTF-PSF.
The CTF-AC model outlined above can be further extended into a new model that deals with partially coupled datasets, i.e., CTF-PSF [35].
As shown in Figure 3, when heterogeneous datasets share only some components rather than all of them, methods based on objective function (4) may not be applicable. Without loss of generality, we take the coupled datasets of a tensor and a matrix as an example. Figure 3 shows a third-order tensor X ∈ R^{I×J×K} and a matrix Y ∈ R^{I×L} coupled in the first dimension. Suppose they have the same low-rank structure, i.e., the same number of components R. Let A_1 ∈ R^{I×R}, B ∈ R^{J×R}, and C ∈ R^{K×R} be the factor matrices of X extracted through an individual decomposition with R components. Similarly, A_2 ∈ R^{I×R} and V ∈ R^{L×R} are the factor matrices extracted from matrix Y using a matrix factorization with R components. In partially coupled multisource datasets, the factor matrices derived from each source will not match exactly; i.e., A_1 and A_2 are likely to share only some columns. Further suppose that tensor X and matrix Y have R_s shared components and R_u (= R − R_s) unshared components. Then, objective function (4) can be modified to

f = ‖X − ⟦A_1, B, C⟧‖² + ‖Y − A_2 V^T‖²,

where A_1 = [A_s, A_1^u] and A_2 = [A_s, A_2^u], with A_1^u ∈ R^{I×R_u} and A_2^u ∈ R^{I×R_u} the unshared columns of A_1 and A_2, respectively, and A_s ∈ R^{I×R_s} the shared R_s columns. Here, the shared and unshared components are optimized separately: the unshared components of the tensor and matrix are updated using individual decompositions, and the shared components are updated using joint decompositions. Specific details of this optimization can be found in [35].
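The shared/unshared column structure can be sketched in numpy as follows; the variable names (`As`, `A1u`, `A2u`) and the toy dimensions are our assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
I, J, K, L = 8, 6, 5, 4
Rs, Ru = 2, 1                           # shared / unshared; R = Rs + Ru

As = rng.standard_normal((I, Rs))       # columns shared by X and Y
A1u = rng.standard_normal((I, Ru))      # columns unique to the tensor X
A2u = rng.standard_normal((I, Ru))      # columns unique to the matrix Y

A1 = np.hstack([As, A1u])               # factor of X in the coupled mode
A2 = np.hstack([As, A2u])               # factor of Y in the coupled mode
B = rng.standard_normal((J, Rs + Ru))
C = rng.standard_normal((K, Rs + Ru))
V = rng.standard_normal((L, Rs + Ru))

X = np.einsum('ir,jr,kr->ijk', A1, B, C)   # tensor built from [A_s | A1u]
Y = A2 @ V.T                               # matrix built from [A_s | A2u]
```

Splitting the coupled-mode factor this way is what lets the optimizer update `A1u` and `A2u` from their own datasets while updating `As` jointly.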
The pseudocode for CTF-PSF is shown in Algorithm 2. The algorithm terminates when the number of iterations or the number of function evaluations reaches its respective maximum. Convergence is judged by the relative change of the objective value and by the two-norm of the gradient of all factor matrices divided by the number of entries in the gradient. The two subroutines in Algorithm 2 denote individual and joint decompositions, respectively. First, each single dataset is decomposed individually to update the unshared columns (A_1^u, A_2^u) of the factor matrix in the shared dimension (lines 7 to 17 in Algorithm 2). The other factor matrices (B, C, and V) and the shared columns (A_s) of that factor matrix are updated through joint decomposition (line 19 in Algorithm 2). However, this does mean that the number of shared components needs to be determined in advance. To ensure proper modeling (line 20 in Algorithm 2), there is a necessary adjustment step that combines A_s, A_1^u, and A_2^u back into A_1 and A_2.

Experiments
4.1. Experimental Design. We tested and verified the advantages of both CTF-AC and CTF-PSF through a series of comparative experiments against several baselines on both simulated data and three real-world datasets. (i) CMTF-OPT first vectorizes all the factor matrices and their partial derivatives so that the problem can be solved using any gradient-based optimization algorithm.
(ii) ACMTF-OPT is an advanced version of CMTF-OPT that includes additional constraints to allow analysis of more complex coupled data. It can also be used for missing-value completion. More details can be found in [16].
(iii) CP-WOPT is an individual decomposition method for single tensors; it uses a first-order optimization method to solve the weighted least squares problem.

Performance Metrics.
The performance of all methods, including CTF-AC and CTF-PSF, was evaluated according to the accuracy of missing-value completion. Hence, missing-index tensors were added to the models to deal with incomplete data. The assessment metric is the difference between the original and the estimated entries at the missing positions, known as the tensor completion score (TCS), defined as

TCS = ‖(1 − W) ∗ (X − X̂)‖ / ‖(1 − W) ∗ X‖, (18)

where X is the initial tensor and X̂ denotes the dataset estimated by a given method. W is a binary tensor of the same size as X, with zeros representing the missing entries and ones representing the valid data. Obviously, the smaller the TCS value, the better the result.
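The TCS metric can be sketched in a few lines of numpy; the function name `tcs` and the random toy data are illustrative assumptions, and we follow the convention above that W marks valid entries with ones.

```python
import numpy as np

def tcs(X, Xhat, W):
    """Tensor completion score over the *missing* entries (where W == 0):
    ||(1 - W) * (X - Xhat)|| / ||(1 - W) * X||. Smaller is better."""
    M = 1 - W                          # indicator of the missing positions
    return np.linalg.norm(M * (X - Xhat)) / np.linalg.norm(M * X)

rng = np.random.default_rng(7)
X = rng.standard_normal((4, 4, 4))
W = (rng.random(X.shape) < 0.8).astype(float)  # 1 = observed, 0 = missing
W[0, 0, 0] = 0.0                               # ensure at least one missing entry
```

A perfect reconstruction scores 0, and only the missing positions enter the score, so a method cannot improve its TCS by merely copying the observed entries.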
In the CTF-AC experiments, we also used the RMSE over the observed entries to measure how well each method fits the observable values, defined as

RMSE = ‖W ∗ (X − X̂)‖ / ‖W‖,

where W again marks the observed entries, so that ‖W‖² counts them. Dataset2 consists of MRI brain scans. One patient was randomly selected from 38 patients, and we then randomly selected two MRI brain scans of the same site as the two sources of data. Each scan has a size of 378 × 378. The two MRI images are shown in Figure 4. Dataset2 was also used to assess CTF-AC.
Dataset3 (available at http://www.models.life.ku.dk/joda/prototype) contains 29 chemical mixtures, each comprising five chemicals measured using LC-MS (liquid chromatography-mass spectrometry) and NMR (nuclear magnetic resonance). NMR was able to detect all five component chemicals, and the results can be formulated as a tensor X ∈ R^{28×13324×8}. LC-MS, however, only detected four components, and therefore the results were formulated as a matrix Y ∈ R^{28×168}. More details about these coupled datasets can be found in [16]. This dataset was used to assess CTF-PSF.

General Experimental
Parameters. For all comparative experiments, each method was given the same termination conditions. The maximum number of iterations was set to 10^4, and the maximum number of function evaluations was set to 10^5. Additionally, the threshold on the relative change in loss function values was set to 10^-6, and the threshold on the two-norm of the gradient divided by the number of entries in the gradient was set to 10^-7. The sparsity penalty parameter for ACMTF-OPT was set to 10^-3.

Simulated Data
Experimental Set-Up. Tensor data X ∈ R^{I_1×I_2×···×I_N} and matrix data Y ∈ R^{I_1×J} were generated according to the same technique as in [32], using the following formulation:

X = ⟦A^(1), A^(2), ..., A^(N)⟧ + ⟦B^(1), ..., B^(N)⟧,
Y = A^(1) V^T + C^(1) (C^(2))^T,

where A^(1) ∈ R^{I_1×R_s} is the factor matrix shared by both datasets and the other matrices A^(n) ∈ R^{I_n×R_s} (n = 2, ..., N) are the factor matrices for the other dimensions of the tensor. V ∈ R^{J×R_s} denotes the factor matrix corresponding to the second dimension of matrix Y, and R_s denotes the number of shared components. The factor matrices B^(n) ∈ R^{I_n×R_u}, n ∈ [1, N], are the unshared factors for each dimension of the tensor. C^(1) ∈ R^{I_1×R_u} and C^(2) ∈ R^{J×R_u} are the unshared factors of the matrix Y, and R_u represents the number of unshared components in each dataset. B^(1), C^(1), and A^(1) correspond to A_1^u, A_2^u, and A_s in Section 3.2, respectively.
All matrices, except for B^(1) and C^(1), were generated randomly with entries drawn from a standard normal distribution, and all matrix columns were normalized to unit norm. Gaussian noise was then added to the tensor and matrix using X_η = X + ηN(‖X‖/‖N‖) and Y_η = Y + ηN(‖Y‖/‖N‖), respectively, where η indicates the noise level, and the noise tensor N and noise matrix N have the same sizes as X and Y, with all entries drawn from a standard normal distribution. Finally, simulated missing values were added to tensor X_η according to a sampling ratio, denoted SR.
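The generation scheme above can be sketched in numpy. This is a simplified sketch under our own assumptions: tiny dimensions, no unit-norm column normalization, and illustrative helper names (`add_noise`); it reproduces the structure X = ⟦A^(1),...⟧ + ⟦B^(1),...⟧, Y = A^(1)V^T + C^(1)(C^(2))^T and the relative noise formula.

```python
import numpy as np

rng = np.random.default_rng(8)
I1, I2, I3, J = 10, 8, 6, 12
Rs, Ru = 3, 1

# Shared factors (A, first-mode factor shared with Y) and unshared factors.
A = [rng.standard_normal((n, Rs)) for n in (I1, I2, I3)]
B = [rng.standard_normal((n, Ru)) for n in (I1, I2, I3)]
V = rng.standard_normal((J, Rs))
C1, C2 = rng.standard_normal((I1, Ru)), rng.standard_normal((J, Ru))

X = np.einsum('ir,jr,kr->ijk', *A) + np.einsum('ir,jr,kr->ijk', *B)
Y = A[0] @ V.T + C1 @ C2.T

def add_noise(T, eta, rng):
    """T_eta = T + eta * N * (||T|| / ||N||), with N i.i.d. standard normal,
    so the relative noise level is exactly eta."""
    N = rng.standard_normal(T.shape)
    return T + eta * N * (np.linalg.norm(T) / np.linalg.norm(N))

Xn = add_noise(X, 0.1, rng)
SR = 0.3                                       # fraction of entries made missing
W = (rng.random(X.shape) >= SR).astype(float)  # 1 = observed, 0 = missing
```

The scaling by ‖T‖/‖N‖ makes η a relative noise level: the perturbation's norm is exactly η‖T‖ regardless of the tensor's size.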
In the first dimension, we used a tensor of size 50×30×20 coupled with a matrix of size 50×100. The factor matrices were generated and constructed as coupled datasets using (20). Let R indicate the estimation rank of the coupled datasets; in individual decompositions, R denotes the estimation rank of the individual dataset. In the experiments with CTF-PSF, the total number of shared and unshared components was set to four unless otherwise specified; i.e., the rank of each individual dataset was set to 4 (R = R_s + R_u = 4). The noise level η and the sampling ratio of missing values (SR) were set independently for each comparative test. As previously mentioned, performance was evaluated according to the estimation accuracy of missing values as calculated by (18).

Numerical Results. Figures 5 and 6 show the TCSs of all four methods for different SR at R_s = 3, R_u = 1, and η = 0.1 when R = 3 and R = 4, respectively. Figure 5(a) shows similar performance by the three joint decomposition based methods when R_s = R. In Figure 5(b), we see that CP-WOPT, which is based on individual decomposition, was relatively stable and ostensibly equivalent to the joint decomposition methods for missing value ratios of less than 90%. However, Figure 5(c) shows that the joint decomposition methods performed well with many missing values, while individual decomposition produced completely inaccurate results. CTF-PSF considers the shared components and does not take the unshared components into account when R = R_s. In other words, CMTF-OPT is a special case of CTF-PSF when the coupled datasets do not have any unshared components. Figure 6(a) shows that CMTF-OPT and ACMTF-OPT gave almost the same performance when R_s ≠ R. However, in contrast to the previous experiment, individual decomposition had certain advantages with a missing value ratio below 90%, as shown in Figure 6(b). It is worth noting that since CTF-PSF considers both the unshared and the shared components, this method performed almost as
well as the methods based on individual decomposition. However, as can be seen in Figure 6(c), once the proportion of missing values reached 90%, the TCS for the individual decomposition method rapidly increased, while CTF-PSF continued to provide good performance. The influence of different numbers of shared components on the methods based on joint decomposition is shown in Figure 7. CTF-PSF and CMTF-OPT were tested with R = 6, R_s = {1, 3, 5}, and η = 0.1 at different missing value ratios. As shown, increasing the number of shared components helped to improve completion accuracy, mainly because the factor matrix provides more auxiliary information as the number of shared components increases. In addition, the advantages of CTF-PSF over CMTF-OPT became more obvious as the number of shared components increased.
Figure 8 shows the TCSs for the joint decomposition methods when X and Y were simultaneously sampled at R_s = 3, η = 0.1, and SR(%) = [90, 40]. Here, R = 4. S-1 and S-2 represent X and Y, respectively. The results show that our method still achieved good completion accuracy when every tensor and matrix contained at least some missing values.

Real-World Data.
Recall that X and Y in Dataset3 (described in Section 4.1.3) are partially coupled in terms of the constituent chemicals. These datasets have four shared components, and X has an unshared component. LC-MS data are often noisy and contain many irrelevant features; therefore, the noise can also be regarded as an unshared component of Y.
To compare the performance of the different methods, X was simulated with different proportions of missing values, and TCSs were evaluated for all joint decomposition baselines, as shown in Table 2. CMTF-OPT performed better than CTF-PSF with lower proportions of missing values (SR(%)), and ACMTF-OPT was superior to CTF-PSF when the missing value ratio reached 80%. The reason is that CTF-PSF does not consider the weights of the shared and unshared components. Unsurprisingly, CTF-AC did not perform as well as the other methods, including CTF-PSF, when faced with partially coupled datasets. As the figure shows, the accuracy of these methods deteriorated as noise increased, particularly with higher proportions of missing values. CTF-PSF performed better than the other methods when R_s < R, but not obviously so at high noise levels (η = 0.3), because unshared components can be very helpful for data reconstruction when R_s < R. The accuracy of these methods improved as the number of shared components (R_s) increased, and their performance was almost the same when R_s = R = 4, i.e., when all components are shared, where CMTF-OPT becomes a special case of CTF-PSF.

Simulated Data
Experimental Set-Up. The multisource datasets we generated contained two types of shared data to simulate the different kinds of shared relationships found in reality. Without loss of generality, two third-order tensors are used as an example to explain how the data were generated. Suppose the tensors X ∈ R^{I×J×K} and Y ∈ R^{I×J×K} are two related tensors. The factor matrices of tensor X derived through individual decomposition are A, B, and C, and the factor matrices of tensor Y derived through individual decomposition are U, V, and W. From these matrices, two simulated datasets were generated in two different ways.

Case 1. The unshared factor matrices B, C, V, and W and the shared factor matrix A were randomly generated from the normal distribution, and U was generated as a random linear transformation of A (a random scaling of A plus random noise, with the random arrays drawn from a specified distribution). Two third-order tensors were then formed from the generated factor matrices and normalized to arrive at X and Y. Gaussian noise was then added to the tensor, i.e., X_η = X + ηN(‖X‖/‖N‖), where N ∈ R^{I×J×K} is a random noise tensor and η adjusts the noise level. The same process applies to Y_η.
Case 2. The only difference between this case and Case 1 is the way U is generated: here the entries of U are drawn directly from a normal distribution with a shifted mean. The difference between the two datasets is thus the relationship between the factor matrices A and U. In Case 1, A has a linear relationship to U; in Case 2, the elements of A and U follow normal distributions with the same variance but different mean values. Both of these datasets approximate real-world data.
We use CMTF-OPT to realize the traditional multisource tensor decomposition, i.e., MTF. In the experiments with simulated datasets, we compared CTF-AC with a joint decomposition method (MTF) and an individual decomposition method (CP-WOPT). We set each dimension of the tensors to 50 and the estimation rank to 5. The estimated rank for the joint decomposition methods was set to R = 10, and R also serves as the estimated rank for CP-WOPT. All noise levels were set to η = 0.1, and, again, the sampling ratio of missing values is denoted as SR.
Numerical Results. Table 3 shows the completion degree of the observed values and the completion accuracy of the missing values for both simulated datasets with a missing value ratio of SR(%) = [0, 95]. A missing value ratio of SR(%) = [0, 95] means that the first tensor has no missing values, while 95% of the values in the second tensor are missing. Since the two tensors are the same size but the amounts of missing values are very different, the numbers of observed entries in the first and second tensors differ greatly, i.e., 100 : 5. Table 3 shows the root mean square errors for the observable data in the first tensor (RMSE1) and the second tensor (RMSE2); TCS1 represents the missing value results for the first tensor, and TCS2 those for the second. The results in Table 3 show that RMSE1 for MTF was similar to that of CTF-AC. In Case 2, CTF-AC's RMSE1 (2.8270e-4) was actually larger than MTF's (2.8111e-4). However, regardless of the type of dataset, RMSE2 for MTF was larger than that for CTF-AC, especially in Case 1. This indicates that traditional models like MTF sacrifice completion accuracy on tensors with a smaller number of observables to improve completion accuracy on tensors with a larger number of observables. Therefore, CTF-AC's tensor completion score for the second tensor (TCS2) is better than MTF's under this circumstance. CTF-PSF was not included in the experiments with these simulated datasets given that both are approximately coupled; however, CTF-PSF is included in the following experiments with real-world datasets. Table 4 shows the TCSs for the four methods on the electronic nose dataset with missing value sampling ratios of SR(%) = [5, 50, 90]. Here, MTF had better completion accuracy at small missing value ratios (TCS = 0.0029 at SR(%) = 5), but this accuracy was significantly reduced on tensors with large ratios (TCS = 0.1635 at SR(%) = 90). CTF-AC shows better TCSs with more missing values because, again, traditional methods like MTF sacrifice
accuracy with fewer observables in favor of better accuracy with more. At a missing value ratio of 90%, CTF-AC's TCS was half that of MTF. Summing the TCSs over all three missing value ratios, CTF-AC achieved better overall accuracy. CTF-PSF was less effective than the other methods because it is specifically designed to address partial coupling and is not well suited to approximately coupled datasets. Table 5 lists the TCSs for the four methods on the brain MRI dataset with missing value sampling ratios of SR(%) = [x, 0] and x = [10, 20, 30, 40, 50, 60, 70]. The bold results denote the best scores. Here, CTF-AC shows a significant advantage over MTF. Although there was no significant difference between the completion accuracy of CTF-AC and CP-WOPT at small missing value ratios, CTF-AC's completion accuracy on the first image improved in comparison as the ratio increased because it borrows auxiliary information from the second image. Hence, CTF-AC's TCS was superior to CP-WOPT's at high missing value ratios. Figure 11 shows the completion accuracy of the first MRI image for all methods when SR(%) = [60, 0]. Figures 12(a) and 12(b) show the completion accuracy comparison plots for each method on the two simulated multisource datasets: one with balanced missing value ratios (SR(%) = [50, 50]), the other with imbalanced ratios (SR(%) = [5, 95]). With a balanced ratio, the numbers of observable and missing values in the tensors are roughly equivalent, while with imbalanced ratios, there is great disparity.

Discussion.
From Figure 12, we see that the TCSs for the weighted model, CTF-AC, and the traditional model, MTF, are similar with balanced observable and missing values. However, as the disparity increases, the disadvantages of traditional models become very obvious. Figure 12(a) shows an even higher TCS for CP-WOPT, while the completion accuracy of CTF-AC on tensors that contain a great many missing values remains high. To further investigate the impact of missing values on each method, we sampled the MRI dataset with missing value ratios of SR(%) = [x, 10] and x = [10, 30, 50, 70] and list the results in Table 6. With a balanced ratio (SR(%) = [10, 10]), CTF-AC still performed better than the other methods, including CTF-PSF; therefore, using a model that is completely shared or partially shared based on factor matrices may introduce some errors. Overall, these experimental results demonstrate the validity and accuracy of the CTF-AC model. Figure 13 shows the completion accuracy of CTF-AC, CTF-PSF, MTF, and CP-WOPT on the two images with an imbalanced ratio of SR(%) = [50, 10]. Unlike traditional joint decomposition methods (such as MTF), CTF-AC adds error weights to the objective function and sets a discriminant factor for both datasets. The discriminant factor reflects the correlations in the data to fit each tensor. Therefore, CTF-AC provides better completion accuracy than MTF at greater levels of missing values.

Conclusions
Jointly analyzing data from multiple sources has the potential to extract underlying data structures and thereby enhance knowledge discovery. However, in data fusion, traditional coupled tensor factorization has been unable to deal with the diverse relationships found between multisource datasets, such as approximate or partial couplings; existing techniques are only appropriate for modeling exact couplings. To address this challenge, we propose two improved coupled tensor factorization methods: one for approximately coupled datasets, CTF-AC, and the other for partially coupled datasets, CTF-PSF. CTF-AC is also suitable for multisource datasets with no dimension couplings if the data are highly correlated. CTF-PSF is an extension of CTF-AC, based on the CMTF-OPT algorithm, which factorizes datasets with both shared and unshared components by combining individual and coupled decompositions. Through numerical experiments, we demonstrate that the tensor completion accuracy of the proposed methods outperforms that of traditional coupled tensor factorization methods on datasets with approximate and partial couplings. The proposed methods still have some disadvantages, which we highlight because they provide opportunities for future research.
In future work, we will pursue several research directions: (i) overcoming the increase in CTF-AC's computational cost caused by calculating the factor-matrix constraint term in the objective function; we intend to recast this calculation as a parallel algorithm to improve overall efficiency; and (ii) extending CTF-PSF, which currently relies on a predetermined number of shared components and does not consider the weights of the shared and unshared components; we will strive to make the framework more accurate and robust by adding further constraints.

Figure 3: Illustration of a joint decomposition for a third-order tensor X and a tensor Y coupled in one dimension. They share only some of their components, with the remainder unshared, rather than sharing all components.

4.1.1. Baselines. The baselines fall roughly into two categories: methods that jointly decompose multiple tensors, such as CMTF-OPT and ACMTF-OPT, and methods that individually decompose a single tensor, such as CP-WOPT. A description of each follows.
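For reference, the weighted least-squares objective that CP-WOPT-style methods minimize for a single incomplete tensor can be sketched as follows. This is a simplified NumPy illustration, not the baseline's actual code; the function names and the dense einsum reconstruction are assumptions made for clarity.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Reconstruct a third-order tensor from CP factor matrices
    A (I x R), B (J x R), C (K x R): X_hat = sum_r a_r outer b_r outer c_r."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

def cp_wopt_objective(X, W, A, B, C):
    """Weighted least-squares CP objective: only entries with W == 1
    (observed) contribute to the reconstruction error."""
    return 0.5 * np.sum((W * (X - cp_reconstruct(A, B, C))) ** 2)
```

Because missing entries are zeroed out by the weight tensor W, the factors are fitted only to the observed data, which is how CP-WOPT handles incomplete tensors.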

Figure 4: The results of two brain examinations of the same patient at the same site within the same year.

Figure 5: The TCS for each baseline with different amounts of missing data (R = 3, with all 3 components shared). The performance of the joint decomposition methods was almost the same, because CMTF-OPT is a special case of CTF-PSF when all components are shared. Figures 5(b) and 5(c) are local amplifications of Figure 5(a). The sampling ratio of missing values is denoted SR.

Figure 6: The TCS for each baseline with different amounts of missing data (R = 4, with 3 shared components and 1 unshared component). CTF-PSF performed better than the baseline joint decomposition methods and, unlike CP-WOPT, did not lose accuracy at high proportions of missing data. Figures 6(b) and 6(c) are local amplifications of Figure 6(a).

Figure 7: The TCS for CTF-PSF at different missing-value ratios (SR%), with the number of shared components set to {1, 3, 5}, R = 6, and a noise level of 0.1. As the number of shared components increases, the factor matrix provides more auxiliary information, which helps to improve completion accuracy.

Figure 8: An illustration of the joint decomposition of a third-order tensor and a matrix with R components. X and Y share one dimension, with A = [a_1, a_2, ..., a_R] as the common factor matrix. Figure adapted from [32].

Figures 9 and 10: The TCSs for the coupled factorization methods at noise levels of 0.1 and 0.3 with R = 4. The TCSs of all methods deteriorated as noise increased, but CTF-PSF remained superior to CMTF-OPT and ACMTF-OPT throughout.

Figure 11: MRI data of the brain at SR(%) = [60, 0]: the completion results of all methods on the first image.

Figure 12: The TCSs for the three methods on Cases 1 and 2 with balanced and unbalanced missing values.

Figure 13: MRI data of the brain at SR(%) = [50, 10]: the completion results of all methods on the first and second images.

Table 1: List of symbols used in this paper. The error weights for approximating X1 and X2 are set to the reciprocals of the number of entries in X1 and X2, respectively, so that each part of the objective function contributes equally to the error. The weight on the Frobenius-norm term for A and U is set to the reciprocal of the number of entries in A.
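The weight-setting rule described above amounts to the following trivial sketch; the function name and return order are assumptions.

```python
import numpy as np

def objective_weights(X1, X2, A):
    """Set each weight to the reciprocal of the number of entries in the
    corresponding array: the two data-fit weights come from X1 and X2,
    and the weight on the Frobenius-norm term comes from A."""
    return 1.0 / X1.size, 1.0 / X2.size, 1.0 / A.size
```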
info.cs.ucy.ac.cy) consists of 38 patients who underwent two MRI scans of different parts of the brain within the same year.

Table 5: The TCSs for all four methods on MRI brain scans with different ratios of missing values.