Visualising interactions in bi-and triadditive models for three-way tables

This paper concerns the visualisation of interaction in three-way arrays. It extends some standard ways of visualising biadditive modelling for two-way data to the case of three-way data. Three-way interaction is modelled by the Parafac method as applied to interaction arrays that have main effects and biadditive terms removed. These interactions are visualised in three and two dimensions. We introduce some ideas to reduce visual overload that can occur when the data array has many entries. Details are given on the interpretation of a novel way of representing rank-three interactions accurately in two dimensions. The discussion has implications regarding interpreting the concept of interaction in three-way arrays.

1. Setting the scene "… It is important that the final model or models should make sense physically: at a minimum, this usually means that interactions should not be included without main effects nor higher-degree polynomial terms without their lower-degree relatives.Furthermore, if the model is to be used as a summary of the findings of one out of several studies bearing on the same phenomenon, main effects would usually be included whether significant or not.Strict adherence to this policy makes it easier to compare the results of various studies and helps to avoid the apparent conflicts that occur when different fitted models with different sets of terms are used in each study."McCullagh and Nelder (1989, p.89)In this paper, we are concerned with three-way tables X with elements x ijk ði ¼ 1; …; I; j ¼ 1; …; J; k ¼ 1; …; K).Thus, the factors used to classify the three ways have equal status (sometimes called modes) while the body of the table contains values of a quantitative variable that may be regarded as a dependent variable -as classically typified by a threeway table arising from agricultural experiments with fertilizer treatments as factors and crop yield as the response.The factors are treated as categorical variables but if they happen to have numerical values, this may be taken into account when interpreting interactions.The primary emphasis is on the visualisation of interaction with a supplementary interest in estimation and interpretation seen in the light of the quotation from Ref. [26].To dispel any suggestion to the contrary, we emphasize that the quotation is not an expression of a mathematical fact but more an observation on how data can usually be expected to behave.In the psychometric literature, a three-way table is sometimes referred to as onemode three-way data [7,9,23] or, shorter, as (data) array, whereas in chemometrics the terminology tensor for X is more common.
Three-way tables are usually analysed by linear models containing additive terms representing main effects, two-factor interactions, and three-factor interactions.The number of factors can be readily extended to any number of "ways".The form of such models readily respects the [26] quotation.Note that with a dependent interval variable there is a fundamental need for at least one additive parameter to represent translation (e.g.Celsius to Fahrenheit).
For reference, and to establish notation, we list the basic results for additive models.The model is where the terms with a single suffix represent main effects, those with double suffices two factor interactions and g ijk represents contributions from three factor interactions.Some components of the interactions may be regarded as "error".The estimating equations are subsumed in the identity: where the expressions in braces in (2) estimate the corresponding parameters in (1).Note that we adopt the convention that a "hat" on the left-hand-side implies that the terms on the right-hand-side are parameter estimates, else they are the parameters themselves.The terms in (2) contribute to an orthogonal analysis of variance: where a; b; c are vectors of the main effects, D; E; F are matrices of the two-factor interactions and G 2 represents the sum-of-squares of the elements of the three-factor interaction.
When interactions have been estimated, there remains the problem of their interpretation.The terms in (2) represent overall contributions to each main effect and interaction.To help interpret overall representations of interaction, several simple approximations have been proposed.One possibility is to focus on the larger (positive or negative) terms.Another is to fit linear and quadratic polynomials to get, for example, linear Â linear Â quadratric estimates.Even the simpler of these can be difficult to interpret and, strictly speaking, such expressions are valid only when the classifying factors are numerical (like levels of fertilizer applications).
Another possibility is to fit product terms like a i b j .Products of two factors have bilinear regression interpretations and a nice geometrical representation that underpins useful visualisations of two-factor interaction.This possibility of biadditive modelling is discussed in Section 2. A biadditive model gives the best rank-r least-squares approximation to a two-way table/matrix but this optimal mathematical property should not necessarily be taken as an expression of an appeal to underlying substantive multiplicative effects.
In a parallel literature, models for analysing three-way data (summarised in Refs.[24,31] often include triple product terms like a i b j c k .Included are three-mode principal component analysis [34] and methods as the Candecomp [8] and Parafac models [21] (both models are equivalent and commonly denoted as the CP-model).A desirable computational requirement for fitting three-way multiplicative models is a universal algorithm for fitting a general canonical decomposition for three-way arrays.Such models are discussed in Section 3. It is clear that triple product terms may be potentially useful in many contexts and considered as a natural extension for representing triadditive interactions in a similar way that biadditive models may represent two-factor interactions.
In many psychometric and chemometric methods, the triple product term dominates the model, even to the extent of excluding lower order terms, thus not respecting the maxim of [26] cited at the start of this paper.This is because in psychometrics the methods are intended as generalisations of Principal Component Analysis and related methods that do not admit a dependent variable; such methods are beyond the scope of this paper.Nevertheless, triadditive terms may be used to approximate three-way interactions.In the following we exploit the fact that the Candecomp-Parafac algorithm can be useful for fitting three-way multiplicative interactions in three-way models.We explore the consequences for the McCullagh and Nelder dictum if this route is taken.Visualisation is important in the interpretation of biadditive interactions and we provide suggestions for its improvement: Appendix A discusses how to calibrate axes Appendix B provides details on optimising a parallel axis display of the interactions and Section 4 demonstrates these methods.Furthermore, we show how triadditive terms may be visualised and interpreted.
In the above, we have regarded the overall main effects and interaction terms in (2) as the definitive expressions of interaction.These may then be approximated as we have described, by linear, biadditive or triadditive estimates, perhaps including other parts of the interactions in an error term.For linear and biadditive estimates the procedure of estimating the biadditive part of each interaction, conditionally on the usual least-squares estimates of the linear part, usually turns out to be equivalent to unconditional estimation.However, this is not true for some of the biadditive models we discuss below and for triadditive models it is never true.Sections 2 and 3 briefly summarise some of the current insights in biadditive and triadditive models and discuss various ways of modelling and interpreting interactions using these models.These sections are not meant provide an exhaustive and complete overview of all knowledge on biadditive and triadditive models, as good sources for that already exist [24,31].Subsequently, biadditive and triadditive visualisations are constructed for an example from agricultural (Section 4) research.Although these visualisations are based on the Candecomp-model [8]; the visualisations can also be based on other techniques for analysing three-way arrays.Section 5 concludes the paper.

Biadditive models
In this section we summarise well-known results for biadditive models.This establishes notation that is needed for similar developments with triadditive models discussed in Section 3.

Biadditive models for two-way tables
For an I Â J table X with elements x ij the general biadditive model is: where a i and b j represent row and column main effects, and c ir and cjr ðr ¼ 1; …; RÞ model the multiplicative interaction.The error terms ε ij are assumed to be independently distributed with equal variances.Many classical models, such as Tukey's model for one degree of freedom for non-additivity [35], can be considered as special cases of a biadditive model.Alternative names under which (4) has appeared, are FANOVA (FActor ANalysis Of VAriance) [15] and AMMI (Additive Main effects and Multiplicative Interactions) [13].Also the GEMA-NOVA (Generalised multiplicative ANOVA) model (cf.[6]) is related.We prefer the neutral biadditive model terminology which is in line with general statistical usage [11].These authors were interested in biadditivity because they thought that substantive genetic effects were better modelled in multiplicative rather than additive terms.In general, model ( 4) is not fully identified.The simplest identification constraints for the general model are ensuring that the matrix P R r¼1 c ir cjr of interaction parameters of rank R is uniquely parameterised in the form of its singular value decomposition with singular values σ 1 ; …; σ R .
The analysis of variance corresponding to a two-way version of (4) is: where ρ ¼ rankðXÞ.

Biadditive models for three-way tables
Biadditive terms may be used to model interaction in three-way tables (cf.[16]).For an I Â J Â K table X with elements x ijk we may consider the following biadditive model: for i ¼ 1; …; I; j ¼ 1; …; J; k ¼ 1; …; K, where the ε ijk are the elements of the three-way error array E. Similar identification constraints to those already discussed for model (4) may be applied for the biadditive model (7) for three-way tables: R, together with the SVDs of the three biadditive interaction matrices as they occur in (2).The resulting analysis of variance is: where the singular values σ ps ðs ¼ 1; …; PÞ, σ qs ðs ¼ 1; …; QÞ and σ rs ðs ¼ 1; …; RÞ refer to the respective residual tables Z i , Z j and Z k defined as in (2), and σ 2 is the residual sum-of-squares obtained from all the singular values not included in the summations.The solution for the multiplicative constants is then obtained from the SVD of the two-way tables of residuals Z i , Z j and Z k .This is a simple generalisation that may be readily extended to tables of any number of "ways".
The choice of ranks P, Q and R can be made by ad hoc arguments, such as that rank 2 approximations can be visualised and communicated in an understandable way.Another option lies in more formal arguments such as obtaining corresponding degrees of freedom, for instance for the A Â B interaction, through the rule of thumb that (i) degrees of freedom for P ¼ 1; 2; …; minðIÀ 1; JÀ 1Þ should add up to that of the A Â B interaction in the two-way ANOVA table, (ii) the df for dimension i should be two less than that for dimension iÀ 1.According to ([19]; Section 6.3), this rule was first given by Ref. [29].A formal test of significance for P, Q or R ¼ 1 has been given by Ref. [10].Other approaches include cross-validation and using multiway extensions of the Kaiser criterion or scree plot [25], such as the DifFit procedure for Tucker3 models [33].See Smilde et al. (2004, Section 7.4) and Kroonenberg (2008, Section 8.5) for an overview of component-selection methods.

Visualisation for biadditive models
It is useful, especially when R ¼ 2, to plot the rows of c r ðr ¼ 1; …; RÞ to give I row-points and the rows of cr ðr ¼ 1; …; RÞ to give J columnpoints.In this biplot, the inner-product determined by a pair of points, one from each set, gives a visualisation of the corresponding interaction.This is a well-known form of biplot (see e.g.Ref. [19]).Another possibility is to present the rows as axes and the columns as points (or vice versa).The axes may be calibrated, making it trivial to find values of inner products.
Furthermore, axes may include markers for the row or column main effects.As we show in Appendix A, calibrated axes may be provided simultaneously for rows and columns and both sets of main effects may be included.In addition, the values of αþ β ¼ 1 (as defined in Appendix A) are at choice and λ-scaling is available (see Ref. [19]).In this way, a variety of equivalent representations, which may be regarded as items drawn from a toolbox, is available for presentational purposes.One may choose among the possibilities to represent only the more important interactions.Some examples are included in Section 4 of this paper.
The biplot representation of two-factor interactions is an attractive aid to interpretation.Also the biadditive model of three-way data can be visualised, now by three biplots, one for each biadditive term in (7).

Triadditive models for three-way data
For an I Â J Â K table X with elements x ijk , consider the following triadditive model: This model is an extension of ( 7) where the error array E is partitioned into a rank-S triadditive part G and a new error array E with, generally, a smaller sum of squared elements than that of (7).For identification, the usual zero-sum identification constraints may be applied to all the parameters but when applied to the triadditive parameters g is , gjs , e gks it has unexpected implications.This is because adding constants α, β, γ replaces the triadditive term by ðg is þ αÞðg js þ βÞð e gks þ γÞ which, on expansion, induces additional additive and biadditive terms.The additive terms may be absorbed into zero-sum main effects without affecting the form of the model.This is not so for the biadditive terms, where unabsorbable parts of the triadditive interaction contribute to the biadditive parameters, thus increasing their rank.Thus, this reparameterisation changes the form of the model.One consequence is that the least-squares estimates of the triadditive interaction parameters are not the same as the estimates conditional on the estimated main effects and biadditive interactions.Another, is that the usual orthogonal analysis of variance is not available.This position may be accepted and algorithms developed to fit the model but a more simple option is to fit the triadditive part conditional on the main effects and the saturated biadditive component of the model.That is, we fit the triadditive part of the model to the biadditive residual table: Triadditive interactions in (9) may be modelled in two ways.If z ijk represents a typical term of the interaction we may fix one factor, say i, and consider the I two-way tables fz 1jk g; fz 2jk g; …; fz Ijk g.Each of these tables may be fitted by a biadditive model and the results compared.This approach is consistent with the classical notion of interaction as a difference in response to a factor, or set of factors (here j and k), at different levels of another factor (here i).Of course, we may interchange the roles of i, j and k.The other approach is to fit a truly triadic model with the Candecomp-Parafac algorithm [8,21], minimising: We choose for this approach as it is a truly triadic approach.The approximation (11) may be viewed as the triadditive counterpart of the Eckart-Young theorem but lacking a nice known canonical decomposition.(See also [30]; which is said to be the first example of the SVD leastsquares property, albeit in a very different field from data analysis.)This approach is close to the classical approximation of interactions by orthogonal polynomials in linear models.Here we fit a biadditive approximation to the two-way interactions and a triadditive approximation to the three-way interaction terms of (1) and ( 2).The residuals from the triadic term contribute to the term (11), while the biadditive part contributes components what we denote by σ 2 in (8).In a good fit, these two components should be comparable giving some indication of stability and, when available, they may be compared with independent estimates of replication-error.From the statistical point of view we need some concept akin to that of degrees of freedom in linear models.What is known about this is summarised by Kroonenberg (2008, Section 8.4).Related to this is the concept of rank for three-way arrays (cf.[32] and Smilde et al. (2004, Section 2.6)).Triadditive rank is defined as the smallest value of R that gives an exact triadditive fit.The interaction array Z, with its zero marginals, generally has lower rank than the data array X [2].Since our focus lies on the visualisation of interactions, here we will not formally study rank properties of Z.

Visualisation for three-way data
As with the biaddittive model, when a rank R triadic model (11) has been fitted, there is interest in expressing the interaction in graphical form.In the rank one case ðR ¼ 1Þ, the points for u i1 ði ¼ 1; …; IÞ; v j1 ðj ¼ 1; …; JÞ; w k1 ðk ¼ 1; …; KÞ may be placed on separate orthogonal coordinate axes, which we shall label u, v and w.Then, u i1 v j1 w k1 is simply proportional to the volume of the tetrahedron with these three points on orthogonal axes and the origin as vertices (Fig. 1, left).
When R ¼ 2, the visualisation remains basically Euclidean in three dimensions and it may be interpreted in terms of tetrahedronal volume where the vertices of the tetrahedra are confined to the origin and three orthogonal planes (Fig. 1, right).The justification of this approach follows from the trilinear identity: (see also equation ( 4) in Ref. [1]).The rows of the determinant on the left hand side may be interpreted as giving the coordinates of three points, one in each of three orthogonal dimensions, while the right hand side gives a term in the rank two triadditive model.[1] give further details and show that, without loss of information, this representation may be shown in two dimensions to give a visualisation which resembles a biplot, with one set of K coplanar points and two sets of calibrated axes representing the remaining IJ factors.Thus, it is a 'triplot' rather than a biplot (see e.g.Ref. [19]).Whilst [1] explain the technical construction of these triplots, instruction on how to interpret these triplots, especially in the case of interaction arrays, is lacking.We provide such explanation Section 4. That rank-two trilinear interactions may be shown in two dimensions, gives them similar status to interactions for bilinear models and makes direct three-dimensional tetrahedral visualisations unnecessary.We believe that this is a major step forward.
Because volume is invariant to orthogonal transformations, one may deduce from the above three-dimensional representation that the parameters of rank 2 triadditive models are determined only up to arbitrary orthogonal rotations in three dimensions.This degree of arbitrariness is similar to that found in biadditive models where inner-products or, equivalently, areas [18] rather than volume are the invariants.Orthogonal transformation is not the only invariant for rank 2 triadditive models; for example, provided αβγ ¼ 1, we could also scale the three axes by α, β, γ, respectively, without affecting volume.Our experience is that visualisation that yields easiest interpretation is achieved when α, β and γ are chosen such that P Higher rank solutions to biadditive models can be shown as threedimensional images or by exhibiting several planar cross-sections of the higher-dimensional space.Neither of these is satisfactory and it is the two-dimensional approximations that are by far the most important.Nevertheless, it is interesting to see what progress can be made with representing triadditive terms for R ¼ 3. We could show this as three volumes, each of unit rank ðu i1 v j1 w k1 Þþ ðu i2 v j2 w k2 Þþ ðu i3 v j3 w k3 Þ, or of two volumes, one of unit rank and the other of rank two ðu i1 v j1 w k1 Þþ ð After equation ( 12), we explained how this determinant is equal to the volume of a single tetrahedron.Using analogous arguments, equation (13) equals three times the sum of the volumes of the tetrahedra designated by the three separate determinants.We have seen that when R ¼ 1, the three axes share a common origin and when R ¼ 2 the three planes share an orthogonal set of axes.When R ¼ 3 we retain the orthogonal axes u, v, w but, as is shown by (13), it is the projections of the points ðu i1 u i2 u 13 Þ, ðv j1 v j2 v j3 Þ, ðw k1 ; w k2 ; w k3 Þ onto the ðvwÞ, ðwuÞ, ðuvÞ planes that determine the vertices of the operative tetrahedra.The display of Fig. 2 shows that this visualisation is on the boundary of what is relevant for practical purposes.
Interestingly, when R ¼ 4 we may write ðu i1 v j1 w k1 þ u i2 v j2 w k2 Þþ ð u i3 v j3 w k3 þ u i4 v j4 w k4 Þ the sum of two rank 2 terms each representable by a single tetrahedron.However, adding even two volumes is not acceptable.Fig. 1.Rank R ¼ 1 (left) and R ¼ 2 (right) fits to the triadditive terms for Blackman's data.Blue triangles refer to Factor A (the levels of nitrogen), red circles to Factor B (trial sites) and brown squares to Factor C (varieties).For the R ¼ 1 fit, all points lie on orthogonal axes, for the R ¼ 2 fit, they all lie on orthogonal planes.The tetrahedra corresponds to the interaction "low nitrogen Â Edinburgh Â Kinsman".(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) We conclude that rank two representations of triadditive models are at the limits of useful graphical representation; higher ranks are possible but are impracticable.

Application: response of wheat varieties to the application of nitrogen fertilizer at different sites
Blackman et al. [4] studied the effect of the application of nitrogen fertilizer to several varieties of winter wheat of contrasting height grown at different trial sites.The data consists of a fully crossed design with the following three factors: A Rate of nitrogen application (I ¼ 2 levels, low and high) B Trial sites (J ¼ 7 locations in the United Kingdom) C Variety (K ¼ 12 different varieties).
The names of the factor levels for factors B and C are given in Table 1.A fourth factor, indicating whether the variety is either 'conventional' (varieties Cappelle, Ranger, Huntsman, Templar, and Kinsman) or 'semidwarf' (varieties Fundin, Durin, Hobbit, Sportsman, TJB295/95, TJB325/464, and Hustler), is excluded from our analysis as its obviously not a crossed factor.The dependent variable is grain yield, measured in grams per square meter.One trial site (Edinburgh) is located in Scotland, the six others are all located in Cambridgeshire and Oxfordshire, England.In this section we are mainly concerned with visual presentation of interactions rather than with substantive analysis.

Biadditive visualisation
First, we fit the biadditive model as outlined in Section 2.2.Table 2 shows that factor B, Trial Site, is the most important main factor and the interaction between A, rate of nitrogen application, and B is the most important two-way interaction.The main effects constitute 84% of total variation in grain yield, the two-way interactions 14% and the three-way interaction 2%.
Table 2 also provides the sums-of-squares of the low-rank approximations to the two-way interaction between B and C, according to Equation (7) with approximations to degrees of freedom as suggested by Ref. [29] (see Section 2.2).Since Factor A has two levels, df A ¼ 1.Hence, this low-rank approximation does not apply to the AB and AC interactions: the full-rank approximation is already of the lowest rank possible.Were df A > 1, the treatment of the low-rank approximations to interactions AB and AC would have been analogous to that of BC.Corresponding to BC, most information, 79%, is captured in the first two dimensions.
For this data, two-dimensional biplots of interactions with factor A Fig. 2. A demonstration of a three-way interaction for the rank R ¼ 3 fit to triadditive terms, for a constructed example with conveniently chosen coordinates.All levels of all factors now have coordinates that are not restricted to (orthogonal) axes nor planes.The three points A, B, C, are projected onto the vw, uw and uv planes, respectively.Subsequently, the polygon OA 0 B 0 C 0 is constructed (left).Similarly, polygons are constructed for projections onto uw, uv and vw (middle) and uv, vw and uw (right).The interaction ABC is proportional to the sum of the volumes of the three tetrahedra thus obtained.are not relevant: A has only two levels, thus the interactions are onedimensional.Fig. 3 gives a series of equivalent biplots for interaction BC.In all cases, interpretation is through evaluating inner-products, either directly or indirectly.Fig. 3a visualises the interaction BC in the conventional way.Often, the points are connected to the origin and perhaps endowed with arrows.The interactions of the varieties at the trial site in Edinburgh clearly deviate from those at the six English sites.A closer examination confirms that the McCullagh and Nelder dictum, cited at the beginning of this paper, holds.Interestingly, no clear distinction in interaction can be found between the regular and the semidwarf varieties.Fig. 3a is useful for assessing global patterns in the data but no numerical values can be read off.For this, calibrated axes are needed.The technicalities behind the construction of such axes simultaneously for sites and varieties is explained in Appendix A. The biplots in the other panels make use of such calibrated axes.They give the same information as Fig. 3a, but in 3b and 3c, while varieties continue to be represented by points, trial sites are represented by calibrated axes.The Figures show exclusion (3b) vs. inclusion (3c) of main effects but otherwise are identical; thus Fig. 3b displays the biadditive interactions after the main effects have been partialed out, whereas these are still included in Fig. 3c.The only difference between panels (b) and (c) is the calibration of the axes: where in panel (b) all axes have value 0 at the origin, this is not the case in panel (c).Fig. 3d shows calibrated axes for both varieties and sites.Note that a variety projected onto a site-axis gives the same calibration as the same site projected onto the corresponding variety axis.For example, consider variety Sportsman and site Edinburgh (as shown in Fig. 3d): The projection of Sportsman onto Edinburgh is À 30:33 g/sqm, which is equivalent to the projection of Edinburgh onto Sportsman.The same holds for all other pairs of sites and varieties.
Thus, with Fig. 3a inner products are not needed to rank varieties within a site or to rank sites growing the same variety but it is difficult to make numerical comparisons between sites and varieties.This problem is reduced by using the calibrations in Fig. 3b and c but the calibration markers tend to lead to problems of visual overload.
Fig. 4 is a compromise which preserves most of the useful information and is easy to use.Essentially, it consists of taking the axes of one set of calibrations (say, the seven sites) and laying them horizontally on successive lines with a common origin in a so-called parallel coordinate plot (cf.[22]).The different interval of calibration on each axis will be clear Fig. 3. Visualisation of the rank R ¼ 2 approximation to the biadditive interaction between factors B and C. First (a) a regular biplot is given (with þ indicating the origin; trial locations are denoted by '⋅' and varieties by a triangle), followed by a biplot where trial sites have been replaced by calibrated axes; where calibration is done with μ ¼ 0 (b) and μ ¼ bj (c).Finally, panel (d) shows a biplot where both varieties and trial sites are represented by calibrated axes.Abbreviations in bold font correspond to trial sites.See Table 1 for the full labels for the abbreviations.
and can be removed by normalising each line to have an equal interval of calibration.Then, the calibration markers on the successive lines can be removed and replaced by a single calibrated axis applicable to all sites, as shown in Fig. 4. Parallel coordinate plots date back to (at least) the 17th century [12] and gained popularity through the work of Inselberg in the past four decades [22].The usage of parallel coordinate plots in the context of three-way analysis is not new (cf.[24]; p. 400), but this paper is, to our knowledge, the first that employs parallel coordinate plots to visualise three-way interactions.
In this example, there is no logical ordering for the sites.Rather than the alphabetical ordering in Fig. 4, any other of the J! ¼ 5040 orderings can be used.Although all variations provide exactly the same information, some allow for easier interpretation because there is less 'clutter', such as fewer line-crossings.When J is not too large, one can resort to manual reordering but for larger values of J, an automated procedure is preferable.We propose such a procedure, based on correspondence analysis (cf.[20]).Technicalities of this procedure are provided in Appendix B and Fig. 5 shows the optimal ordering.This figure provides exactly the same information as Fig. 4 but is easier to interpret.Now, the performance of every variety at each site may be readily compared directly.The main effects may be included if desired but we have not done this with these data because of the disproportionate main effect of Edinburgh (a value þ 232 grams per square meter; whereas the other six sites have main effects between À 113 and þ 46 grams per square meter).Of course, an equivalent procedure can be used for the varieties rather than for the sites.

Triadditive visualisation
Having eliminated all main and multiplicative effects according to (10), Candecomp-Parafac approximations of different rank were fitted to b Z. Table 3 displays the breakdown of the ABC-interaction SS of 49812 into approximations of rank 1 to 6. Rank 2 and 3 approximation explain 63% and 78% of the variation in grain yield, respectively.Thus, visualisations on the basis of these approximations will yield useful insight into the structure of b Z. Fig. 1 visualises the rank 1 and 2 approximations to the three-way interaction term.The highlighted interaction in each figure is that between a low rate of nitrogen application, trial site Edinburgh variety Kinsmen.The data have been scaled by α, β, and γ in such a way that kr because this provided a satisfactory visual setting for interpretation (the dispersion in the three dimensions is made the same; without affecting the volume of the tetrahedra).Fig. 1(left) visualises the rank R ¼ 1 approximation and shows how, by looking at tetrahedra, one can quickly get an impression of a specific triadditive interaction.Fig. 1(right) displays the visualisation for R ¼ 2, via three biplots for the three factors.Each biplot may be visualised in one of the three orthogonal planes (uv, uw and vw) through the origin.The interaction of interest remains proportional to the volume of a single tetrahedron.
A three-dimensional rank R ¼ 3 visualisation is crossing the line of useful application (as outlined in Section 3.2).It is much more simple to look at a two-dimensional visualisation through a so-called triplot.Here, we use the term triplot in the same way as in Ref. [1].According to [37]; the use of the term 'triplot' in this context dates back to [3].[27]; (p.50) also use this term, in a slightly different context related to biplots.Furthermore, the term triplot is also used in for triangular diagrams, which is a unrelated field of work.As the contexts are fully different, this should not cause confusion.In this display, each IJ combination of levels is represented by a calibrated axis while each level of K is represented by a point (for the Blackman data we have I ¼ 2, J ¼ 7 and K ¼ 12).Thus, an axis combining a Site (e.g.Edinburgh) with the Higher Level of Nitrogen (e.g.denoted by H) might be labelled "Edinburgh H".While another axis might be labelled "Edinburgh L", where L denotes a Lower Level of Nitrogen.Because I ¼ 2 the two Edinburgh axes coincide, as do the axes for all other sites.Fig. 6 displays such a triplot for the interaction array b Z.All IJ combinations of nitrogen-rate and trial-site are displayed by calibrated axes but only J ¼ 7, rather than IJ ¼ 14, distinct axes are necessary.We use the convention that the label "Edinburgh" denotes not only the site but also the high rate of nitrogen.The marker for the low rate of nitrogen in Edinburgh could be placed at the other end of the axis but it is superfluous.The markers on the axis are positive in the section between the label (e.g.Edinburgh) and the origin and negative away from the origin; the opposite holds for the implicit Edinburgh Â low marker.All K ¼ 12 varieties are displayed as points.
By projecting variety k onto the combined rate-site axes, triadic ranktwo interactions can be read directly off the calibrations to give the estimation of the term for variety k and all combinations of levels of i and j.A 'projection circle' on the diameter determined by the point displaying the variety and through the origin, gives a convenient way of accessing all projections of the variety onto the J ¼ 7 rate Â trial axes together with their associated calibrations.Such projection circles have been Fig. 4. For all 7 trial sites the projections of the varieties (with μ ¼ 0) are given in this single-axis diagram.A single calibrated axis applies to all sites.Abbreviations in bold font correspond to trial sites.See Table 1 for the full labels for the abbreviations.introduced in the context of biplots in Refs.[17] and [19]; and in the context of triplots in Ref. [1].Fig. 6 shows this visualisation for the Blackman data where the point 'Cap' represents the variety Cappelle.Sites Begbroke, Trumpinton, and Earith give positive interactions, Boxworth about zero and sites Craftshill and Fowlmore give negative interactions between Cappelle and high levels of Nitrogen.The signs are reversed for interaction with low levels of Nitrogen.The intervals of calibration may be refined at will but here we give only a marker 10 g per square meter.It is important to keep in mind when interpreting these triadic interactions that these are the values after main and biadditive effects (accounting for 98:07% of variation, see Table 2) have been partialed out: The triplot focuses on the remaining 1:93% of variation and large differences in the triplot denote, in this example, only relatively small differences on an overall level.
Note that (a) although this two-dimensional visualisation may look like a biplot it involves three factors and thus is really a triplot and (b) it remains valid when I > 2, though without the simplifications of coincident axes, which might introduce visual overlad.Both [1] and [37] provide examples of such a triplot with I ¼ 3.

Discussion
Essentially, our approach is to adopt the usual linear models for representing main effects, two factor interactions and three factor interactions.The two factor interactions may be approximated by multiplicative bilinear terms and the three factor interactions may be approximated by multiplicative trilinear terms.In the bilinear case the approximations have standard least-square estimates, based on singular value decompositions, but in the trilinear case, we propose that the estimates be conditioned on the residuals from the saturated bilinear model.In principal, it would be possible to do a full unconditional least-squares solution but the conditional approach is easier and avoids difficulties with constraints.In the bilinear case identification constraints are not substantive but in the full trilinear case there is a troubling substantive interaction between the bilinear and trilinear parameter constraints.This problem is avoided when using the conditional method of analysis.The suggestion of applying a triadditive model to three-way residuals has also been made by Ref. [36]; who used a Tucker3 model rather than the Candecomp-Parafac model.[37] use an orthogonal Parafac decomposition as basis for their visualisations and arrive at figures similar to Fig. 3a on basis of geometric arguments.
We do not claim that the biadditive and triadditive models are substantive models per se, although in certain applications they could be.We make use of biadditive and triadditive models as a useful framework to base our visualisations on.A special virtue of biadditive models is the way that they lend themselves to simple biplots for visualising the interactions between rows and columns of the two classifying factors.This is particularly useful when biadditive interactions are adequately approximated in two dimensions and in this paper we have proposed how these biplots may be enhanced.It would be helpful if similar visualisations were available for triadditive interactions and, following [1]; we demonstrate how two-dimensional triplots for rank-two tridimensional interaction tables may be formed, in which all three-dimensional tetrahedronal information is retained.When one factor is at two levels, some striking simplifications occur, as is demonstrated in Section 4. When I, J, K > 2, there is a risk of visual overload.Such overload can be reduced through smart choices, constructing parallel coordinate plots (such as Fig. 5) for triplots and through interactivity.For instance, markers for calibrated axes could be displayed only when a certain axis is selected, and one could use tick boxes to select which of the IJ axes and K points should be shown.Finding out which approaches work best against visual overload is an interesting path for future research.Furthermore, additional smart choices w.r.t.calibration, (arbitrary) rotation and use of colour can enhance the interpretability [5].
Rank two triplot displays in two dimensions seem to be at the bounds of practical utility.Attempts to visualise rank-three displays in three dimensions are not promising.Fortunately, as with biadditive biplots, it is the rank-two displays that are the most useful and rank two tridimensional visualisations show similar promise.
At the outset of this paper we drew attention to the adage of McCullagh and Nelder about interactions being predicated on their main effects and lower orders of interaction.Our approach of conditioning three-order interactions on main effects and two-factor interactions is in accord with the adage.Nevertheless, at several points in our discussion we have seen that main effects and lower order interactions may be ignored when fitting a higher-order interaction.Sometimes, but not always, it seems that, as with Tukey's model of non-additivity, additive terms may be absorbed in equivalent multiplicative parameterisations of the model.It seems to us that it is always wise to keep the McCullagh and Nelder adage in mind but there are occasions, especially with multiplicative relationships, when it is less persuasive.

Software
All computations have been performed in R, using self-written code (available upon request from the first author).For the Candecomp-Parafac decompositions the R-package ThreeWay [14] has been used.For the correspondence analyses, the R-package ca [28] has been used.Fig. 6.Triplot.Axes represent rates and sites.Since I ¼ 2, the axes for low and high rates coincide.Site labels are placed at the positive end of the 'high'-axis.The signs are reversed for predicting interactions to the low rate of nitrogen.A single positive and negative marker is shown on each axes; these correspond to 10 g per square meter grain yield.A projection circle through Cappelle cuts the axes at the calibrations corresponding to the seven calibration points giving the rank-two triadic interactions.These are positive or negative, depending on whether they occur on the same or opposite side of the origin as the site label.Abbreviations in bold font correspond to trial sites.See Table 1 for the full labels for the abbreviations.
. With this degree of arbitrariness, we see little point in paying much attention to the estimated values of the parameters u, v, w but rather to focus on the invariants, such as volume and the actual fitted values b x ijk and b z ijk .

Fig. 5 .
Fig. 5.A similar visualisation as Fig. 4, now with the ordering of sites according to the correspondence analysis algorithm outlined in Appendix B.

Table 1
Overview of the trial sites (top) and varieties of wheat (bottom) of the Blackman data set, as well as the abbreviations used in later visualisations.

Table 2
(7)VA-breakdown of Blackman's data.The SS for the rows with specific values for r are obtained via(7).The corresponding degrees of freedom are obtained via the rule of thumb explained in Section 2.2.(Note that, since dfA ¼ 1, no similar breakdown for the AB and AC interaction is possible.)

Table 3
Candecomp-Parafac approximations to the three-way interaction ABC for different ranks S for Blackman's data.