A comparison of algorithms for fitting the PARAFAC model
Introduction
The PARAFAC (PARallel FACtor analysis) model was introduced in 1970 by Harshman (1970) and simultaneously by Carroll and Chang (1970) under the name CANDECOMP. For a three-way data array , the PARAFAC model is defined aswhere is the measured value, , , and represent the parameters to estimate, are the residuals and F is the number of factors extracted.
In general terms, fitting model (1) boils down to minimising (usually in a least squares sense) the fitting error made. This means finding the parameters that minimise the loss function:Several algorithms for solving such problems are described in the literature (Faber et al., 2003, Hayashi and Hayashi, 1982, Paatero, 1997, Tomasi and Bro, 2004). Some of those tested in this work (DTLD, ASD and SWATLD) do not minimise (2) and in fact their loss functions are not strictly well-defined (Faber et al., 2003).
While the PARAFAC model remains the same, the different methods of fitting it to a three-way array can be better explained introducing distinct notations. The parameters can be gathered in three loading matrices (also referred to as scores matrix), and defined asandwhere , and denote the columns of , and respectively.
Using the matricised format for multi-way arrays (Bro, 1998) and with the introduction of the column-wise Khatri–Rao product (Rao and Mitra, 1971) (see Appendix C), the model can be written aswhere “T” means the transpose and refers to the way the multi-way array is matricised ( slabs of size are put one beside the other forming a matrix of dimension ).
Another possibility is to express the model in a slab-wise form as:where is a diagonal matrix containing the k-th row of , is the matrix representing the k-th frontal slab and is the corresponding matrix of the residuals. Eqs. (4) and (5) have to be trivially modified in case other matricisations (e.g. ) or slabs (e.g. the horizontal slab of size ) are required (Bro, 1998).
The algorithms fitting the PARAFAC model can be classified in three main groups: alternating algorithms, which update only a subset of the parameters at each step; derivative-based methods, seeking an update for all the parameters simultaneously by successive approximations; and direct (non-iterative) procedures. To the first group belong PARAFAC-ALS (Alternating Least Squares (Harshman, 1970)), ASD (Alternating Slice-wise Diagonalisation (Jiang et al., 2000)) and SWATLD (Self Weighted Alternating TriLinear Decomposition (Chen et al., 2000)). The second set is represented by PMF3 (Positive Matrix Factorisation for 3-way arrays (Paatero, 1997)) and dGN (damped Gauss–Newton, also known as Levenberg–Marquadt (Paatero, 1997, Tomasi and Bro, 2004)). As for the third grouping of algorithms, the most known implementations are the Generalised Rank Annihilation Method (GRAM (Sanchez and Kowalski, 1986)) and the Direct TriLinear Decomposition method (DTLD), both based on a generalized eigenvalue problem. All these algorithms will only be described in general terms as more details are available in the original papers.
Fitting the PARAFAC model presents numerous problems. One aspect affecting speed of convergence and retrieval of the underlying solutions is the condition number of the loading matrices, which reflects both collinearity and relative magnitude of the factors (Hopke et al., 1998; Kiers, 1998). Compression, which is outlined in Section 2.5, has been reported to be beneficial in this respect and has the added advantage of reducing the computational expense (Bro and Andersson, 1998; Kiers, 1998). Another problem is the so-called two factor degeneracy (2FD), i.e. the presence in the solution of two factors that are almost perfectly collinear but have opposite signs and almost cancel out each others’ contribution. The essential problem with 2FDs is that the degenerate factors can grow arbitrarily large while the loss function continuously (and very slowly) decreases. 2FDs may appear in the final solution as a consequence of e.g. a wrong estimation of the rank of the array, or some aspects specific to the data set at hand, (Kruskal et al., 1989, Paatero, 2000). A connection has been established between some of the possible causes of 2FDs in the final solution and the so-called swamps (Mitchell and Burdick, 1994; Rayens and Mitchell, 1997), which are sequences of iterations where the interim solutions contain features similar to 2FDs of increasing severity and the loss function decreases very slowly. When an (iterative) algorithm encounters a swamp, it either emerges from it (i.e. after an unpredictably large number of iterations, the loss function starts decreasing more rapidly again and the 2FD slowly disappears) or it reaches an earlier stop because one of the convergence criteria is suddenly met. Great attention has been given in the development of new methods capable of dealing effectively with swamps and more in general with the more frequent high collinearity case. Two examples are regularisation (Paatero, 1997; Rayens and Mitchell, 1997) and line search (Bro, 1998; Harshman, 1970) (described in Section 2.4).
Section snippets
DTLD
Ho et al., 1978, Ho et al., 1980, Ho et al., 1981 developed an algorithm called RAFA (Rank Annihilation Factor Analysis) for estimating the concentration of a chemical analyte in an unknown sample-matrix solely using the measurements of the unknown sample and of a pure standard. This property was coined the second-order advantage, as it is obtained by using the second-order or two-way structure of the individual sample measurements instead of vectorising the corresponding matrix. The
Experimental part
Nine algorithms are studied in this section using both simulated and real data: DTLD-GRAM, PARAFAC-ALS, ASD, SWATLD, dGN, PMF3, PARAFAC-ALS with compression, dGN with compression and PMF3 with compression. For space reasons the single implementations will not be discussed in detail; in general, the MATLAB 6.5 (www.themathworks.com) guidelines for improving performances were followed (e.g. maximising the use of built-in functions). DTLD and PARAFAC-ALS are part of the N-Way toolbox (Andersson
Conclusions
The tests on the nine algorithms showed quite clearly that the advantages of some of the recently proposed alternatives to PARAFAC-ALS were somewhat overestimated.
In particular, ASD appeared consistently inferior to the other iterative methods on both the simulated and the real data sets. Conversely, SWATLD showed a good capability of finding the real underlying components, albeit biased by the fact that it yields solutions with higher core consistency than the least squares one. The distortion
Acknowledgements
The authors are grateful for support from STVF through the ODIN consortium. They would also like to thank N.M. Faber for providing some of the m-files.
References (44)
- et al.
The N-way Toolbox for MATLAB
Chemometrics Intell. Lab. Systems
(2000) - et al.
Improving the speed of multiway algorithms part II
Compression. Chemometrics Intell. Lab. Systems
(1998) - et al.
A novel trilinear decomposition algorithm for second-order linear calibration
Chemometrics Intell. Lab. Systems
(2000) - et al.
Recent developments in CANDECOMP/PARAFAC algorithmsa critical review
Chemometrics Intell. Lab. Systems
(2003) - et al.
Three-way (PARAFAC) factor analysisexamination and comparison of alternative computational methods as applied to ill-conditioned data
Chemometrics Intell. Lab. Systems
(1998) - et al.
Alternating coupled matrices resolution method for three-way arrays analysis
Chemometrics Intell. Lab. Systems
(2000) A weighted non-negative least squares algorithm for three-way ‘PARAFAC’ factor analysis
Chemometrics Intell. Lab. Systems
(1997)- et al.
Two-factor degeneracies and a stabilization of PARAFAC
Chemometrics Intell. Lab. Systems
(1997) - et al.
Jack-knife technique for outlier detection and estimation of standard errors in PARAFAC models
Chemometrics Intell. Lab. Systems
(2003) - et al.
Analysis of the effect of crystal size and color distribution on fluorescence measurements of solid sugar using chemometrics
Appl. Spectrosc.
(2000)
Numerical Methods for Least Squares Problems
Multi-way analysis in the food industry
A new efficient method for determining the number of components in PARAFAC models
J. Chemometrics
Analysis of individual differences in multidimensional scaling via an N-way generalization of Eckart–Young decomposition
Psychometrika
Candelinc—a general-approach to multidimensional-analysis of many-way arrays with linear constraints on parameters
Psychometrika
Pseudo alternating least squares algorithm for trilinear decomposition
J. Chemometrics
Matrix Computations
A new algorithm to solve PARAFAC-model
Behaviormetrika
Application of method of rank annihilation to quantitative-analyses of multicomponent fluorescence data from video fluorometer
Anal. Chem.
Application of the method of rank annihilation to fluorescent multicomponent mixtures of polynuclear aromatic-hydrocarbons
Anal. Chem.
Simultaneous multicomponent rank annihilation and applications to multicomponent fluorescent data acquired by the video fluorometer
Anal. Chem.
Cited by (335)
Alternating cyclic vector extrapolation technique for accelerating nonlinear optimization algorithms and fixed-point mapping applications
2024, Journal of Computational and Applied MathematicsUsing an optimised neural architecture search for predicting the quantum yield of photosynthesis of winter wheat
2023, Biosystems EngineeringA novel estimation procedure for robust CANDECOMP/PARAFAC model fitting
2023, Econometrics and StatisticsTensor decomposition for learning Gaussian mixtures from moments
2022, Journal of Symbolic ComputationOnline Nonnegative and Sparse Canonical Polyadic Decomposition of Fluorescence Tensors
2022, Chemometrics and Intelligent Laboratory Systems