Recent Advances in Visualizing Multivariate Linear Models

This paper reviews our work in the development of visualization methods (implemented in R) for understanding and interpreting the effects of predictors in multivariate linear models (MLMs) of the form Y = XB +U , and some of their recent extensions. We begin with a description of and examples from the Hypothesis-error (HE) plots framework (utilizing the heplots package), wherein multivariate tests can be visualized via ellipsoids in 2D, 3D or all pairwise views for the Hypothesis and Error Sum of Squares and Products (SSP) matrices used in hypothesis tests. Such HE plots provide visual tests of significance: a term is significant by Roy’s test if and only if its H ellipsoid projects somewhere outside the E ellipsoid. These ideas extend naturally to repeated measures designs in the multivariate context. When the rank of the hypothesis matrix for a term exceeds 2, these effects can also be visualized in a reduced-rank canonical space via the candisc package, which also provides new data plots for canonical correlation problems. Finally, we discuss some recent work-in-progress: the extension of these methods to robust MLMs, development of generalizations of influence measures and diagnostic plots for MLMs (in the mvinfluence package).


Introduction
Multivariate response data are very common in applied research. A research outcome (e.g., depression, job satisfaction, academic achievement, childhood attention deficit hypheractivily disorders-ADHD) may have several observed measurement scales or related aspects. In this framework, the primary concern of the researcher is to ascertain the impact of potential predictors on two or more response variables. For example, if adolescent academic achievement is measured by reading, mathematics, science, and history scores, do predictors such as parent encouragement, socioeconomic status and school environmental variables affect all of these outcomes? And, do they affect these outcomes in similar or different ways?
Statistically, this is easy, because the classical univariate response model for ANOVA and regression, y = Xβ + u, with u ∼ N (0, σ 2 I) generalizes directly to an analogous multivariate linear model (MLM), Y = XB + U for multiple responses, Y = (y 1 , y 2 , . . . , y p ). Happily as well, hypothesis tests for the MLM are also straight-forward generalizations of the familiar F and t tests for univariate response models.
However, due to the possibly high-dimensional nature of the data, visualizations for multivariate response models are not as straightforward as they are for simpler, univariate models, either for understanding the effects of predictors, model parameters, or for standard model diagnostics, such as influence plots. Consequently, the results of such studies are often discussed solely in terms of coefficients and significance, and visualizations of relationships are presented for one response variable at a time. This is unfortunate, because visualization affords us a window to truly see what is happening in our data, and can aid in interpretation, yet the univariate graphical methods cannot show the important relations among the multivariate responses. The aim of this paper is to present a few methods that are currently employed for the visualization of high-dimensional data, and then review a series of methods we have worked on to apply these methods to MLMs. These extensions involve the Hypothesis-Error (HE) plot framework and the use of rank reduction, and can be utilized under a range of circumstances.
The plan of this paper is as follows: In Section 2, we illustrate three basic techniques that can be utilized in presentation graphics to enhance the usefulness and applicability of traditional statistical graphic displays for multivariate data. Next, Section 3 describes the framework for generating HE plots, that are useful for visualizing the relationships found in MLMs. The idea of low-dimensional visualization introduced in Section 2.2 is extended to MLMs in Section 4. Finally, in Section 5, we describe some recent extensions of these methods, designed to make graphical methods for MLMs more closely approximate the range of techniques available for univariate response models.

Basic Approaches to Visualizing Multivariate Data
Attempts to visualize multivariate data using static graphs almost always have to proceed by reducing the complexity of the data to what can be shown on a 2D page or screen image. The most common methods involve a pairwise display of bivariate relationships in a scatterplot matrix and various dimension-reduction techniques, as we illustrate below.
An important adjunct to these basic ideas is the use of plot annotations to highlight important features of the data or the relationships among variables that might not be seen otherwise or as readily. Another consequential idea is the use of plot summaries to suppress some details of the data that obscure the features we want to see.
To frame our approach in a general context, we begin with some examples of visualizations of the well-known iris dataset, made famous by Fisher (1936). This contains data on measurements of four characteristics of iris flowers (sepal length and width, and petal length and width), for three species of the flower (setosa, virginica, and versicolor).

Bivariate Views: Scatterplot Matrices
One basic method often employed when investigating the relationship between multiple response variables is to generate a scatterplot matrix, which features every possible pairwise relationship between the variables.
These allow us to notice trends between the response categories, and can be annotated to include more information, such as via shading to differentiate between levels of a categorical variable, data ellipses for visualizing confidence intervals, or through the addition of simple regression lines. In each case, the analyst gets to observe how each variable relates to the others. For example, see Figure 1 for a scatterplot matrix concerning the four variables in the iris dataset, annotated with 68% data ellipses and simple regression lines for each type of flower. However, this pairwise display does not yield any information on how the variables interact within a higher dimensional space. With three variables, we can no longer simultaneously plot and print the simple raw data on a static graphic, and must utilize software to view an interactive data cloud that can be rotated. Yet, with more variables, even this approach fails.

Low-Dimensional Views: Biplots
One way to understand how three or more dependent variables are related at once in a static graphic is to map the data into a low-dimensional space that preserves as much information in the data as possible. One such technique is the biplot (Gabriel 1971, Gabriel 1981, Gower, Lubbe & Roux 2011, which projects the data into a 2D (or 3D) space accounting for the largest amount of variance in the data. The name "biplot" comes from the fact that this technique displays both the observations and the variables in a single, joint plot.
The standard biplot is essentially based on a principal components (or singular value) decomposition, plotting the component scores for the observations on the first two (or three) components. On such a plot, vectors (component loadings) are drawn for each variable that indicate the relationship they hold with each of the components. For example, see Figure 2. In this plot, one can observe that petal length and width are strongly related to differentiating between the species on the first component (plotted along the X axis), while sepal width differentiates between species on the vertical axis. These two dimensions account for 95.8% of the variance of the four iris variables, ensuring the adequacy of this visual summary. In less structured data, when the two dimensions account for less variance, the biplot still provides a useful 2D summary between the data, the variables, and the strongest available components. See Gower et al. (2011) for a wider presentation on this topic.

Visual Summaries: Data Ellipses
The data ellipse (Monette 1990) (or concentration ellipse, Dempster, 1969, Ch. 7) provides a remarkably simple and effective display for viewing and understanding bivariate relationships in multivariate data. The data ellipse is typically used to add a visual summary to a scatterplot, indicating the means, standard deviations, correlation, and slope of the regression line for two variables. See  for a complete discussion of the role of ellipsoids in statistical data visualization.
Formally, for two variables, Y 1 , Y 2 , the sample data ellipse E c of size c is defined as the set of points y = (y 1 , y 2 ) T whose squared Mahalanobis distance, D 2 (y) = (y −ȳ) T S −1 (y −ȳ), from the means,ȳ, is less than or equal to c 2 , where S is the sample variance-covariance matrix, When y is (at least approximately) bivariate normal, D 2 (y) has a large-sample χ 2 2 distribution (χ 2 with 2 df), so taking c 2 = χ 2 2 (0.68) = 2.28 gives a "1 standard deviation bivariate ellipse", an analog of the standard intervalȳ ± 1s, while c 2 = χ 2 2 (0.95) = 5.99 ≈ 6 gives a data ellipse of 95% coverage. A bivariate ellipse of ≈ 40% coverage has the property that its shadow on the y 1 or y 2 axes (or any linear combination of y 1 and y 2 ) corresponds to a univariateȳ ± 1s interval. These ideas generalize readily to p-dimensional ellipsoids, and we will use the term "ellipsoid" below to cover all cases.
Thus, in Figure 1, the data ellipses show clearly how the means, variances, correlations, and regression slopes differ systematically across the three iris species in all pairwise plots. The iris setosa flowers (in blue) are not only smaller on all variables than the other species, but its variances are also smaller, and the correlations differ from those of the other species.
Similarly, in the reduced-rank principal component analysis (PCA) view shown in Figure 2, you can see that the component scores are uncorrelated for setosa but slightly negatively correlated for the other two species. The data ellipse for the total sample (ignoring species), shows that the two components, PC1 and PC2, are of course uncorrelated, as guaranteed in PCA. The variable vectors in this plot show that PC1 is largely determined by three of the variables while PC2 is determined mainly by sepal width.

Hypothesis-Error (HE) Plots
Hypothesis-Error (HE) plots build upon the idea of the data ellipse, but attempt to also display indicators of significance and effect size for predictors (Friendly 2007), by visualizing hypothesis and error covariation as ellipsoids. These plots can be generated using the heplots package  in R (R Core Team 2013), as shown in the Appendix, and, by default, will display significance in terms of the Roy's largest root test.
The MLM we want to visualize here is given by: for p responses, Y = (y 1 , y 2 , . . . , y p ), and q regressors in X. The regressors can comprise quantitative predictors, factors (represented as dummy variables or contrasts), interaction terms, or any other term (e.g., a polynomial or spline) that can be represented within the framework of the general linear model.
The essential ideas here are that: • Every multivariate hypothesis test is carried out by a multivariate analog of the general linear test, H 0 : C h×q B q×p = 0 h×p , where the matrix of constants C selects subsets or linear combinations (contrasts) of the coefficients in B to be tested.
• Every such hypothesis gives rise to (p × p) matrices, H and E that are the multivariate analogs of the familiar sums of squares, SS H and SS E , used in univariate F tests.
• The univariate F test statistic, F = SS H /df h SS E /dfe has a direct multivariate analog in terms of the latent s = min(p, df h ) non-zero latent roots, The various multivariate test statistics such as Wilks' Λ, the Pillai and Hotelling trace criteria, and Roy's maximum root test are all functions of the λ i that reflect different geometric properties of the size of the H ellipsoid relative to the size of the E ellipsoid. The statistical and geometric details are described in Friendly (2007) and .
An animated display of these ideas and the relations between data ellipses and HE plots can be seen at http://www.datavis.ca/gallery/animation/manova/.

Visualizing H and E (Co)variation
A standard ellipsoid representing residual (error) variation reflected in the E matrix is simply the data ellipse of the residuals in U , scaled as E/df e . In an HE plot, we typically show this as an ellipse of 68% coverage, but centered at the overall means of the variables plotted, so that we can also show the means for factors on the same scale. This also allows you to "read" the residual standard deviation as the half-length of the shadow of the E ellipsoid on any axis.
An ellipsoid representing variation in the means of a factor (or any other term reflected in (3) in the H matrix is simply the data ellipse of the fitted values for that term. Scaling the H matrix as H/df e puts this on the same scale as the E ellipse, as shown in the left panel of Figure 3. We refer to this as effect size scaling, because it is similar to an effect size index used in univariate models, e.g., ES = (x 2 −x 2 )/s in a two-group, univariate design. The geometry of ellipsoids and multivariate tests allow us to go further with a re-scaling of the H ellipsoid that gives a visual test of significance for any term in a MLM, simply by dividing H/df e further by the α-critical value of the corresponding test statistic. Among the various multivariate test statistics, Roy's maximum root test gives H/(λ α df e ) which has the visual property that the scaled H ellipsoid will protrude somewhere outside the standard E ellipsoid if and only if Roy's test is significant at significance level α. For these data, the HE plot using significance scaling is shown in the right panel of Figure 3.
You can interpret the HE plot by recalling that they reflect data ellipsoids of the fitted values (group means, here) and the residuals. The direction of the H ellipsoid relative to that of E reflects the linear combinations of the response variables shown that depart from the null hypothesis.
In this example, the iris data has p = 4 dimensions, but with three groups, df h = 2, so the H and E ellipsoids all have s = min(p, df h ) = 2 non-zero dimensions. To see these relations for all variables together, it is easy to use a scatterplot matrix format, as shown in Figure 4. The plot shown in Figure 3 is just the panel in row 1, column 2, and the entire plot in Figure 4 is just an alternative visualization of the data-based scatterplot matrix shown in Figure 1, focussed on the evidence for differences among the species means. It is now easy to see the relationships among all four variables that are reflected in the multivariate tests of H 0 : no differences among species means. For all variables except sepal width, the variation of the species means is essentially one-dimensional; however, sepal width has an opposite pattern to the others.
As a supplementary example, let's look at the Romano-British pottery dataset (Tubb, Parker & Nickless 1980). This data pertains to 26 ancient pottery samples found at four kiln sites in Britain (Ashley Rails, Caldicot, Isle of Thorns, and Llanedryn), and their chemical makeup. Each sample was quantified in terms of it's aluminum (Al), iron (Fe), magnesium (Mg), calcium (Ca), and sodium (Na) content, which naturally yields a one-way multivariate analysis of variance (MANOVA) design with 4 groups (site) and 5 responses (chemical composition). The primary question posed by this data is: Can the chemical composition of the samples differentiate the sites? And, how can we understand the contributions of the chemical elements to such a discrimination?
Using the HE plot framework, we can visualize each pairwise combination of the chemical compounds, and assess the sites in terms of effect or significance scaling, see the left panel of Figure 5. In this view, we can see that there does appear to be significant separation between the locations, with the Caldicot and Llanedryn being relatively high in iron but low in aluminum, while the Ashley Rails and Isle of Thorns samples being the opposite. Further, we can utilize this approach to decompose a multi-degree of freedom hypothesis test into a set of orthogonal contrasts, exactly as the univariate SS H may be decomposed in an ANOVA. Each subhypothesis matrix with rank greater than 1 will have hypothesis ellipse, while those of rank 1 will plot as a vector in the HE graphic. In this example, the overall 3 df hypothesis tests the equality of the four Site mean vectors. Also overlaid on this plot is the 2 df sub-hypothesis test, which is the contrast between the average of Caldicot and Isle Thorns against Ashley Rails. This in turn can be decomposed into two 1 df tests for each of Caldicot and Isle Thorns versus Ashley Rails, in which only the former is significant. The hypothesis ellipsoids in this plot have the interesting visual property that they form conjugate directions with respect to the ellipsoid for the joint test, provided the sub-hypotheses are statistically independent. More details about this approach can be found in .

More General Models
The HE plot framework extends quite naturally to all cases of the general multivariate linear model. This includes multiple MANOVA models with two or more factors (and their interactions), multivariate analysis regression (MMRA), models with mixtures of factors and quantitative regressors (MANCOVA) (Fox, Friendly & Weisberg 2013) and repeated measure designs (Friendly 2010). More importantly, HE plots provide a visualizations of such complex models from which many features can be "read" far more directly than from the collection of many numerical tables that they summarize.
All hypothesis tests for these models can be formulated as special cases of the general linear test, H 0 : C h×q B q×p = 0 h×p , giving rise to H and E matrices as indicated in (3) and (4). For example, in a MMRA model, the test of the multivariate hypothesis H 0 : B = 0 that all predictors have no effect on any responses is given by specifying C = I q×q , while the multivariate test for the i th predictor is given by using C = (0, 0, . . . , 1, 0, . . . 0), with a value of 1 in position i.
We illustrate the flexibility of these models and HE plots using data from an experiment by William Rohwer (Timm 1975, Table 4.7.1) on kindergarten children (n = 37 of low socioeconomic status (SES), n = 32 of high SES) designed to examine how well performance on a set of paired-associate (PA) tasks can predict performance on measures of aptitude and achievement: the Peabody Picture Vocabulary Test (PPVT), a Student Achievement Test (SAT), and the Raven Progressive matrices test. The PA tasks varied in how the stimuli were presented, and are called named (n), still (s), named still (ns), named action (na), and sentence still (ss). A simple MANCOVA model, allowing different intercepts (means) for the SES groups, but assuming that the regression slopes for the PA tasks are the same for both groups can be fit in R as follows: rohwer.mod <-lm(cbind(SAT, PPVT, Raven)~SES + n + s + ns + na + ss, data=Rohwer) Multivariate hypothesis tests show that only SES, ns and na have significant effects. But, how can we understand these results and the nature of the relationships in these data?
The left panel of Figure 6 shows the HE plot for this model, for the first two variables, SAT and PPVT. The ellipsoid labeled "Regr" gives the result of an overall multivariate test for all of the PA tests jointly. As can be seen based upon its extension past the error ellipse, this test is highly significant. Note that the predictors are all 1 df h terms, so the H matrices are all of rank 1, and their H ellipsoids degenerate to lines.
We can interpret this display as follows: (a) The length of each predictor line indicates the strength of its relationship to the two responses jointly. (b) Only the predictor lines for na and ns lie outside the E ellipsoid, and the latter is just barely significant. (c) The orientation of each predictor line shows its relationship to SAT and PPVT. (d) The means for the SES groups show that the high group performs better on both the SAT and the PPVT, but more so on the PPVT. (e) The Regr ellipsoid for the overall test of the regression variables can be seen as a sum of the contributions of the individual predictors, some of which make it larger in the direction of SAT, others in the direction of PPVT. (f) The orientation of the E ellipsoid indicates a small positive correlation between the residuals for SAT and PPVT; the shadows of this ellipsoid on the horizontal and vertical axes reflect the residual standard deviations, apparently larger for SAT than PPVT. To test the assumption of equal slopes in the simple MANCOVA model, we can fit extended models that allow heterogeneous slopes and intercepts in the two groups. One way to do this is to fit separate multivariate regression models for the two groups, and overlay the HE plots on common scales. This result is shown in the right panel of Figure 6. Some additional aspects that can be seen here are: (a) The centers of the ellipsoids show the group means that were reflected in the SES factor in the MANOVA. (b) The overall regression ellipsoid testing H 0 : B = 0 is more aligned with the SAT axis for the high SES group than for the low group, reflecting the better prediction of SAT than PPVT for the high group. (c) Among the individual predictors, na and ns are more important predictors in the high SES group, and are also more important in predicting SAT and PPVT.
In closing this section, we hope we have convinced the reader that HE plots, once you learn how to read them, provide much more direct information about the relationships between predictors and responses in complex MLMs than is easy to understand from tables or univariate displays.
In Figure 6 we only showed the HE plots for two of the response variables. As shown in Figure 7, we can extend this to any number of responses using an HE plot matrix. However, the idea of a low-dimensional view for multivariate data per se discussed in Section 2.2 has a direct connection with similar low-D views for MLMs, that we will discuss in the following section.

Generalized Canonical Discriminant HE Plots
For even a one-way MANOVA design with three or more response variables, it is difficult to visualize how the groups vary on all responses together, and how the different variables contribute to discrimination among groups. In this situation, canonical discriminant analysis (CDA) is often used, to provide a low-D visualization of between-group variation, analogous to the biplot technique for purely quantitative variables.
CDA amounts to a transformation of the p responses, Y n×p into the canonical space, Z n×s = Y E −1/2 V , where V contains the eigenvectors of HE −1 and s = min(p, df h ). It is well-known (e.g., Gittins, 1985) that canonical discriminant plots of the first two (or three, in 3D) columns of Z corresponding to the largest canonical correlations provide an optimal low-D display of the variation between groups relative to variation within groups.
For a one-way design, the canonical HE plot is simply the HE plot of the canonical scores in the analogous MLM model that substitutes Z for Y . This is shown in Figure 8 for the iris data. The interpretation of this plot is the same as before: if the hypothesis ellipse extends beyond the error ellipse, then that dimension is significant. Vectors for each predictor are then superimposed and demonstrate the relation between each and the two canonical dimensions. The interpretation of this plot is quite simple: In canonical space, variation of the means for the iris species is essentially one-dimensional (99.1% of the effect of species), and this dimension corresponds to overall size of the iris flowers. All variables except for sepal width are aligned with this axis.
We have extended this to generalized canonical discriminant plots for the general MLM as follows: (a) Let t index the various hypothesized terms in an arbitrary MLM. Canonical discriminant analysis can be extended by performing the canonical analysis of the H t and E matrices for each term in the model, based on the eigenvalues and eigenvectors of H t E −1 . Then, for term t: (b) The canonical discriminant HE plot is the HE plot of canonical scores in the model Z t ∼ •, where • symbolizes all terms in the original MLM for Y . These and other extensions of canonical analysis are implemented in the candisc package in R ).
An application of this idea is shown in Figure 9 for the Rohwer data. In this case, the two dimensions account for 93.7% of the variability in achievement scores, and each predictor is plotted in relation to the two canonical dimensions. It is directly observable how both the predictors in the model and the outcome variables relate to the canonical dimensions. For instance, the SAT and PPVT extend horizontally, reflecting a strong loading on dimension 1, while the Raven is aligned more vertically, and is more strongly associated with dimension 2. For the predictors, ss is almost perfectly aligned with dimension 1, while the other variables load on both. Similar to the biplots that were previously discussed, this plot provides a compact 2D view for a complex multivariate problem.

Recent Extensions
The general goal of this work has been to extend data visualization methods for univariate response models to their counterparts for MLMs. Towards this end, we describe some recent efforts to address: The interpretation of model coefficients, the use of such plots in robust analyses, and in examining data for influential cases.

Coefficient Plots for MLMs
In presentations and articles, it is commonplace to present results of fitted models in tables of estimated coefficients and their standard errors. This practice, while convenient, is now often deprecated (Gelman, Pasarica & Dodhia 2002, Kastellec & Leoni 2007 in favor of plots of point estimates and confidence intervals that, when carefully done, can communicate findings more clearly.
The left panel of Figure 10 gives an example of a univariate coefficient plot for the results of the linear model for SAT in the Rohwer data, showing 68% and 95% confidence intervals set at β ± {1, 2}se β . The corresponding MLM would require three such plots, one for each response variable and could not indicate multivariate confidence regions for joint hypothesis tests.
However, a simple generalization of this idea is the multivariate coefficient plot, which is illustrated in bivariate form in the right panel of Figure 10. To simplify the plot, this shows only joint 68% confidence ellipses, corresponding to a multivariate version of β ± se β . This has the property that a joint, multivariate test of H 0 : β = 0 is rejected when the confidence ellipse does not cover the origin (as shown by shading). See  for more details.

Robust MLMs
All calculations and test statistics for classical, normal-theory linear models are based on standard, first and second moment summaries, such as mean vectors and covariance matrices. It is well-known that these can be distorted by multivariate outliers, particularly in smaller samples.
In principle, such effects in standard multivariate analyses can often be countered by using robust mean and covariance estimates, such as simple multivariate trim- ming (Gnanadesikan & Kettenring 1972) or the high-breakdown bound minimum volume ellipsoid (MVE) and minimum covariance determinant (MCD) methods (Rousseeuw & Leroy 1987, Rousseeuw & Van Driessen 1999. Often, these robust methods supply weights that can be used to "robustify" other multivariate methods and visualization techniques, but this integration in applied software is still quite spotty, inference for the general MLM is weak and associated graphical methods are limited. In the heplots package , we extend normaltheory HE plots to robust equivalents via similar use of weighting and robust covariance estimation, using a simple iteratively reweighted least squares approach. Other approaches do provide high breakdown bound estimates, and will be implemented in the future. As an example, we have reestimated the MANOVA model with the Romano-British pottery data utilizing the robmlm function in the heplots package. Fitting is done by iterated re-weighted least squares, using weights based on the Mahalanobis squared distances of the current residuals from the origin, and a scaling (covariance) matrix, and its visual output is shown in Figure 11.
In this example, the figure on the left illustrates changes in observation weight. It clearly shows that three data points from Llanedyrn and one from Ashley Rails are atypical and given weights close to zero. The right figure demonstrates the difference between the classical and robust estimates. It is interesting to note that there is only minimal change in the mean ellipse when using the robust estimation, but there is a substantial difference in the error covariance ellipse. This is a quick representation that visually demonstrates how effects become more visible using robust methods.

Influence Diagnostics for MLMs
A wide variety of influence diagnostics (e.g., Cook's distance, DFFITS, Hat values) and associated plots (leverage-influence plots) for detecting influential observations in univariate models have been available for a long time (Cook & Weisberg 1982). As well, a general theory of influence diagnostics for MLMs (Barrett & Ling 1992) is available to support these measures, but there is no available software implementation allowing these methods to be used. Our mvinfluence package is an initial implementation of some of these ideas (Friendly 2012).
The generalization of influence measures to multivariate response models is relatively straight-forward, and also extend to case deletion diagnostics for subsets (I) of size m > 1. For example, the multivariate analog of Hat values, that measure leverage of observations in terms of the predictors is H I = X I (X T X) −1 X T I , where X I refers to the rows of X in subset I.
The multivariate version of Cook's distance can be represented as the standardized distance between the estimated coefficients B using all cases and B (I) estimated omitting subset I.
A wide variety of other influence measures is also defined (e.g., DFFITS, COVRATIO), but all can be expressed in the general form for some function of leverage and the multivariate residuals corresponding to subset I.
We illustrate these methods using the MLM for the Rohwer data using only the low SES group in Figure 12. The left panel is a bubble plot of Cook's D I against leverage, H I for subsets of size m = 1, also showing D I by the area of the bubble symbol. It can be seen that observation 5 is highly influential, but four other cases also have Cook's distances greater than the nominal cutoff for identifying "large" values. The right panel shows a novel form of influence plot (the LR plot) suggested originally by McCulloch & Meeter (1983). From (6), it follows that a plot of log(Leverage) vs. log(Residual) will have contours of constant influence along lines with slope = −1. This plot simplifies the interpretation of influence plots by placing all observations in relation to their influence away from the origin.

Summary and Conclusion
The classical univariate general linear model is the cornerstone for the development of much of modern statistical theory and practice. A great deal of the applied usefulness of this methodology stems from development of a wide range of visualization methods allowing researchers to see and understand their data, diagnose possible problems with assumptions and model specification and communicate their results effectively.
As we have argued, the extension of this model to multivariate responses is well-developed in theory and increasingly prevalent in applied research. Yet, the analogous advancement of visualization methods for MLMs has lagged behind. In this paper we have set out a framework for filling these gaps, based on the following general ideas: Data ellipsoids provide a simple visual summary of bivariate relations under classical (Gaussian) assumptions.
• They highlight important differences among groups (means, variances, correlations) in MANOVA designs; • They can be embedded in scatterplot matrix format to show all pairwise, bivariate relations; • They extend easily to 3D visualizations, and can be modified to use robust estimators.
HE plots provide a visual summary of multivariate hypothesis tests for all MLM models.
• They show H ellipsoids with group means for MANOVA factors and 1 df h ellipsoids for quantitative predictors in ways that facilitate interpretation of multivariate effects. • They can be embedded in an HE plot matrix to show all pairwise views.
Dimension-reduction techniques provide low-dimensional (2D or 3D), approximate visual summaries for high-D data.
• The biplot shows multivariate observations and variable vectors in the low-D view that accounts for the maximum variance in the data. • Canonical HE plots are similar, but show the dimensions that account for maximal discrimination among groups or maximal canonical correlation.
In the popular children's TV show Sesame Street, it was common to sign off with "Today's show has been brought to you by the letter E", where they might have featured elephants, eagles, emus, and ellipses. In a similar vein and as a coda to this paper, we also remark that this approach has been provided by the beautiful and useful connections that exist among aspects of statistical models, matrix algebra, and geometry ). There are ellipsoids everywhere and almost all the properties of multivariate tests and dimension-reduction techniques can be understood in terms of eigenvalue decompositions. This framework provides opportunities for further extensions.
R code for some of the figures in this paper are included in the Appendix and related examples in the web supplement for a conference presentation, http: //www.datavis.ca/papers/ssc2013/.