Functional Data Analysis and Visualisation of Three-dimensional Surface Shape

The advent of high resolution imaging has made data on surface shape widespread. Methods for the analysis of shape based on landmarks are well established but high resolution data require a functional approach. The starting point is a systematic and consistent description of each surface shape. Three innovative forms of analysis are then introduced. The first uses surface integration to address issues of registration, principal component analysis and the measurement of asymmetry, all in functional form. Computational issues are handled through discrete approximations to integrals, based in this case on appropriate surface area weighted sums. The second innovation is to focus on sub-spaces where interesting behaviour such as group differences are exhibited, rather than on individual principal components. The third innovation concerns the comparison of individual shapes with a relevant control set, where the concept of a normal range is extended to the highly multivariate setting of surface shape. This has particularly strong applications to medical contexts where the assessment of individual patients is very important. All of these ideas are developed and illustrated in the important context of human facial shape, with a strong emphasis on the effective visual communication of effects of interest.


Introduction
Statistical shape analysis is a research topic which has seen very substantial growth and development in recent years. Early work in this area focused on representations of shape through carefully chosen landmarks, as point locations with an interpretation which corresponds across different shapes. Dryden and Mardia (2016) provide a very comprehensive description of methods for the analysis of landmarks, but the later chapters of the book also indicate the much wider array of data types which are becoming available, driven by rapid advances in imaging technology. A particular example is the increasing availability of sensors which employ techniques such as laser scanning or stereo-photogrammetry to create high-resolution data on surface shape in three dimensions. This has a very wide variety of applications and it is the focus of the present paper. Figure 1 shows an image of a human face as an example of the kind of 3D surface data which is now easily obtainable.
Single instances of 3D surface data can be displayed in a variety of ways; in particular, the rgl package (Adler et al., 2019) is an indispensable tool for those from the R (R Core Team, 2019) community, as it provides access to the OpenGL industry-standard tools for 3D display. However, the effective display of patterns and variation in collections of 3D objects is more challenging. Bowman and Bock (2006) gave some discussion of this for 3D points and curves, but the aim of the present paper is to provide new tools for the modeling and visualisation of samples of 3D surface data.
The starting point is a description of an individual surface which has a consistent meaning across all the surfaces in the dataset. This can be approached in different ways and the particular method adopted here is described in Section 2. Some obvious issues of analysis then commonly arise.
These include the need for methods to: • register the surfaces to a common co-ordinate system; • characterise the variation present in a sample of surfaces; • compare surface shapes across groups; • assess the surface shape of an individual against a relevant control set.
These problems are tackled here from a functional perspective. Adaptations of standard methods of Procrustes analysis are introduced in Section 2, using a metric based on an approximated surface integral rather than individual point locations. Non-linear registration through warping is also described as a means of displaying the results of analysis at higher resolution, for visual effect. Basic methods of visualising surface differences are also reviewed.
Section 3 discusses the use of principal components in exploring the variation in surface data and in comparing groups. Again a functional perspective is adopted, based on surface integration. This section also discusses how effects can be visualised by characterising the shape changes associated with appropriate subspaces, rather than through examination of individual components.
Section 4 addresses the situation where there is a need to assess the characteristics of individual surfaces, and in particular of any shape features which are not consistent with control shape. Some final discussion and reflection is provided in Section 5.
The methods proposed in the paper are illustrated throughout on images of human facial shape. There is a strong emphasis on the creation of visual displays which communicate patterns in the data, the evidence and nature of group differences, and the distinctive characteristics of individuals, as clearly as possible. Graphics are provided in static form but animations are also available in the Supplementary Information.
2 Some fundamental tools for surfaces 2.1 Facial models A model for an individual surface should provide a structured representation of shape whose components have a consistent interpretation across the other surfaces in the dataset. This then allows the investigation of pattern and variation in shape. Landmarks satisfy this criterion and so, while the information they carry is limited, they have often been used as the starting point for more complex models. Paulsen and Hilger (2003), Hammond et al. (2004) and Mao et al. (2006) give examples of this approach where a template of a human face is 'warped' onto an observed image. Landmarks on the template are transformed in a non-linear manner to match those on the image exactly, with the surface of the template then adjusted further to improve the match with the surface of the image. This might be done by locating closest points or by matching the characteristics of local surface curvature. The resulting transformed template then provides a model for the surface whose meaning corresponds across all the images in the dataset.
In an alternative approach, Vittert et al. (2019) took the view that ridge and valley curves provide the key information on shape, as these capture the locations where curvature is strongest. The two left hand images in Figure 2 give examples of facial curvature, here in the form of Gaussian curvature and shape index ; see Koenderink and van Doorn (1992) for details.
Curvature information can then be used to fit a model consisting of a set of ridge, valley or other geodesic curves, with landmarks as end-points. A full surface representation can easily be constructed by interpolation across the surface patches bounded by these curves, although Vittert et al. (2019) give examples where the focused representation based on curves alone can be more informative. An example of the resulting facial model is shown in the right hand panel of Figure 2, which uses colour and size to indicate the hierarchical nature of the information captured in landmarks, curves and surface patches.

Registration
A key issue in the analysis of shape is that the observed images do not necessarily lie in a common co-ordinate system. The process of data capture does not usually give each image the same origin or orientation. The relative sizes of the images may also be viewed as unimportant from a shape perspective. It is therefore necessary to remove these extraneous aspects before statistical analysis. Different approaches to this are outlined below. These are described in the context of transforming an image X to match a reference image Y , where X and Y are J ×3 matrices whose rows give the 3D positions of a fitted model in the discretised form of J point locations. The process of transforming X to match Y is referred to as registration.

Procrustes methods
A very effective approach is to find the rotation matrix Γ, scaling parameter β and translation parameter γ which bring X as close to Y as possible.
Adopting a similar notation to Dryden and Mardia (2016), the method is expressed as: where ||.|| is the Euclidean norm and X and Y are assumed to have centroid 0. This is a standard example of an approach referred to as Procrustes registration. The ideas and methods of implementation involved are comprehensively described by Dryden and Mardia (2016). The solution to (1) provides the basis of an iterative algorithm to match multiple images where Y represents a mean shape and the criterion is summed over multiple images X i ; i = 1, . . . , n. A constraint on the size of Y is adopted to avoid degenerate solutions.
These methods arose in the context of shape representations based on landmarks, often well-separated spatially. The representations we are now dealing with may have a discrete point-based form, for convenience, but they represent a continuous surface. This leads immediately to a functional data analysis perspective, as described by Ramsay and Silverman (1997). In the current setting, functional registration is achieved through min β,Γ,γ Sy where S y denotes the surface indexed by Y . The function x(y) indexes the point x on the surface S x which has geometrical correspondence with the point y on surface S y . The models for the two surfaces established this. The integral can now be approximated in discrete form as where the weight a j gives the surface area which surrounds point y j and A is a diagonal matrix containing the weights a j . These weights can be calculated easily. If T 1 , . . . , T T denote the set of surface triangles, N j is the set of indices of triangles which have x j as a vertex and |.| denotes area, then the weights are simply a j = 1 3 t∈N j |T t |. The divisor 3 apportions one third of the area of each triangle to each of its three vertices.
Expansion of the right hand side of expression (3), following the derivation of the unweighted case in Dryden and Mardia (2016), shows that the minimum is achieved when the matrices AX and AY are column-centred, withγ = 0, The case of matching one shape X to another Y is referred to as ordinary Procrustes registration. This provides the building block for generalised Procrustes registration which seeks a common registration of multiple shapes X 1 , . . . , X n . The aim now is to minimise the sum of the deviations of trans-formed shapes from a common mean µ. This can be expressed in functional form as with the transformation parameters β i , Γ i , γ i ; i = 1, . . . , n. The weights in the diagonal matrix A are now the areas surrounding the vertices of the mean shape µ. Following again the general structure outlined in Dryden and Mardia (2016), and beginning with each AX i column centred, minimisation can be achieved by successive ordinary weighted Procrustes registration of the adjusted shapes X P i = β i XΓ i − 1 k γ T i onto the mean µ, which itself is estimated simply as the average of the X P i . A size constraint is required to ensure that the solution does not degenerate to 0. From a functional perspective, size is expressed in the surface area of each shape and this is easily calculated in discrete form as the trace of A. Notice that integration is carried out over the mean surface µ, so that A changes with each iteration. The facial models displayed in Figure 3 includes two curves at the brow and columella (between the nostrils), to avoid noisy areas of the image around the eyes and nostrils. These curves are included in registration, and in later analysis, simply by considering a small patch around each curve point, with area set to the average of the areas surrounding the surface points.

Warping
Procrustes registration brings the co-ordinates of shape X as close as possible to those of shape Y , within the limits imposed by the use of translation, rotation and scaling only. However, there are situations where it is useful to match these two shapes exactly. This arises in some of the methods involved in constructing facial models, described in Section 2.1, where a template is initialised on an observed image by exact matching of a set of landmarks.
A further example is in the improvement of the visual comparison of shapes such as those in Figure 3. If a high resolution facial template is available, with an embedded shape model Z which corresponds to that of X and Y , then a smooth function which transforms Z to X exactly can be identified, in a process known as warping. This function can then be applied to the template to creates a visual display which has smoother and more attractive surfaces than the lower resolution model X and which adds in detailed features such as nostrils and eyes, giving a more effective and interpretable display of a human face. While care should be taken not to interpret the form of these very detailed features, the principal characteristics of the display all reflect the underlying model. Use of a template can also help to anonymise individual faces. The right hand images in Figure 3 show the effects of employing templates in this way to both male and female means.
In the analysis of 2D shapes based on landmarks, the concept of a deformation grid to describe shape change is a very old one; see Thompson (1917).
This uses a function which maps one set of landmarks to another exactly but, as it is expressed in functional form, this function can also be applied to a regular grid of locations over the first image to create a warped grid which expresses the underlying transformation. Methods based on pairs of thinplate splines were first introduced in 2D by Bookstein (1989) and developed further by Bookstein (1997). The topic is also explained clearly by Dryden and Mardia (2016). Corresponding methods in 3D were first introduced by Gunz et al. (2005) and applied to skulls by Mitteroecker et al. (2004) and Mitteroecker and Bookstein (2008), to long bones by Frelat et al. (2012), and to mice heads by Waddington et al. (2017). The literature on radial basis functions uses the same techniques but employs a different language. The technical details of warping in 3D are described in the Appendix.

Visual comparison of two shapes
The male and female example of Figure 3 raises the question of how two shapes can most effectively be compared visually. The challenge is that in addition to the 3D shapes themselves, comparison involves an additional vector field of differences, with a displacement vector at each position on the individual shapes. A helpful strategy is to display one shape and use colour to inform on the shape difference from the other shape at each location.
The lower rows of Figure  3 Visualising shape datasets

Exploring variation
While a visual comparison of means is useful, an understanding of the variation involved in a dataset is necessary for any form of statistical analysis.
A simple device is to display the size of the variation at each location on the model. Figure 5 shows the value of log | det(Σ j )|, whereΣ j is the empirical covariance matrix of the x-, y-and z-coordinates at location j after Procrustes registration. The regions of higher variability include the eyes, whose reflective surface can introduce some inaccuracy, the forehead, which lies at the edge of the facial surface, and the chin and nasal tip, where the degree of prominence can vary considerably. Effects associated with the model curves which traverse the cheeks, where flatness can induce some variability in location, are also apparent.
Descriptions which capture the correlation between locations are clearly required. These also need to deal with the difficulty that the dimensionality of the shape representation (for example, 917 3D points in a discrete representation of a surface) often exceeds by a large margin the number of The multiplier c is often set at ±2, or ±3 if some magnification is required. Dryden and Mardia (2016) give all the details.
A functional data analysis perspective can be applied in this setting by following the pattern described by Ramsay and Silverman (1997). When the data are in the form of functional objects, x(s), where s lies in an appropriate sample space S, principal components are then defined as orthonormal functions β(s) which successively maximise the variance of β(s)x(s)ds. In many settings, the sample space S is a time interval or a spatial region in standard Cartesian form. In the present setting, the functional object has the much more complex form of a 2D manifold embedded in 3D space. The immediate problem is how to parametrise this in a suitable sample space S. A solution is provided by setting this to be the Procrustes mean shape, S µ . Any other shape in the sample can then be expressed through the three functions {x(s), y(s), z(s)} which give the 3D deviations of this shape from the mean at location s.
This takes us to the realm of multivariate functional principal components which seek to maximise the variance of as discussed by Ramsay and Silverman (1997). As usual, computations are conveniently based on discrete approximations to these integrals. The model form of each shape has a consistent triangulation so, for example, a convenient approximation can be written as where a j is the area surrounding s j .   in the data may be reasonably large, with 10 components required to capture 82% of the variation in this case. Figure 6 shows the scores, vec(X i −X) T e k , on these principal components, with the diminishing widths of the boxplots illustrating the gradual reduction in variation across the components. The shape changes associated with the first four principal components are indicated by superimposing the faces which correspond to c = ±2 standard deviations (pink and green). Methods for investigating individual components are discussed below in the context of comparing groups but the variation in a single group can be helpfully displayed through the idea of a 'grand tour', proposed for general multivariate data by Asimov (1985). A very simple version of this uses a vector of p independent normal random variables z to create a random sample of locations in the space of the first p principal components, {z k λ j e k ; k = 1, . . . , p}. Turning these into shapes and tracking between successive pairs by simple interpolation creates an animation which randomly explores the variation in shape. Figure 6 illustrates five random positions which form the staging posts of a tour. This approach forms the basis of a comparison between individuals and a control dataset in Section 4.

Assessing differences between groups
When groups representing different populations are present in a dataset, principal components provide a helpful way of reducing the dimensionality of the space in which comparison takes place, while retaining as much of the variability as possible. If components are simply constructed from the combined dataset, without reference to the group structure, then the vari-ation captured by each component will contain both intra-and inter-group contributions. The top left hand plot in Figure 7 displays the scores for the principal components constructed in this way for the sexual dimorphism data. As the signs of the eigenvectors which define the components are arbitrary, these have been reversed where necessary to ensure that the male mean score is higher, for ease of interpretation. It is often the case that the first few components capture large scale variation (greater width and smaller height etc.) which is common across groups, with group differences associated with more subtle aspects of shape.
In the reduced space of the first p components, a global assessment of the evidence for mean differences in male and female shape is provided by p-values has been adopted as 0.05/10 = 0.005 and colour (red) has been used to indicate where that threshold has been exceeded. As Bedrick (2019) points out, it is important to note that the p-values associated with individual components should be interpreted in the context of the null hypothesis that the mean scores are identical for all components simultaneously.
The images in the lower part of Figure 7 indicate, for those components which exhibit evidence of differences in groups, the nature of the associated shape change in the usual form of ±2 √ λ k from the mean. The image corresponding to the positive end of the scale is more strongly associated with male shape (green) and the negative end of the scale with female shape (pink).
The association of male shape with more prominent nose, chin and eyebrows, and female shape with more prominent cheeks, is apparent. However, the individual components cannot be given special status in the description of male-female differences as they were constructed simply by maximising the variation explained across the whole dataset. It is therefore helpful to construct a combined display which corresponds to movement along all these components simultaneously. This is aided by the earlier modification of components to ensure that positive signs are more strongly associated with When group differences are of interest, an alternative approach to principal components is through the intra-group covariance. As pointed out by Dryden and Mardia (2016), the T 2 statistic can be written as where the λ k denote the eigenvalues andv k1 ,v k2 the mean principal component scores, using the eigenvectors derived from an estimate of the common covariance matrix. Dimensionality reduction follows from the truncation to  This loses the property that the global statistic is a simple average of its components but the t-statistic scale is helpful, and there is no effect on the performance of the tests. The smaller facial images show the nature of the shape change associated with the individual components (3, 5, 7) where there is strong evidence of differences between males and females. There is no reason why the differences in mean shape should align with the axes of the common covariance matrix so, again, the individual components do not have special status. The larger facial image shows the shape change associated with the combination of these three components. This characterises the subspace where the evidence for difference is strongest and it is reassuring to see that this is very similar to the sub-space identified from the principal components which do not exploit group structure, as displayed in Figure 7.
This underlines the case for identifying and interpreting the sub-space as a whole, with the components simply providing particular indexing bases.

Affine/non-affine decomposition
A further sub-space approach is available through partitioning the variation in the data into affine and non-affine components. The former involves linear transformations which apply across the whole object of interest. The latter space contains non-linear transformations which describe local and more complex effects. Rohlf and Bookstein (2003) showed that these sub-spaces can be easily created from the Procrustes aligned shapes {X i ; i = 1, . . . , n} through the regression models where the α i denote 3 × 3 matrices of regression coefficients. The affine co-ordinates X Ai are then available as the fitted values while the non-affine Chinese British tangent affine non-affine Figure 9: From left to right, the images show the mean Chinese female face, the mean British female face, and the combined principal components of shape change in tangent, affine and non-affine spaces respectively. co-ordinates X N i are obtained by adding the residuals to the mean as whereα i denotes the least squares estimates. More formally, the algebra associated with linear regression, particularly the independence of residuals and fitted values, separates the space of the Procrustes shape co-ordinates X i into two orthogonal sub-spaces which capture the affine and non-affine behaviours. Analysis can therefore proceed separately within these sub-spaces to provide complementary descriptions of the variation in the dataset.
The comparison of British and Chinese female facial shapes provides a simple example. Visual discrimination between these two ethnic groups is usually straightforward but examination of mean shapes allows the distinctive features to be identified and quantified. Figure 9 shows the nature of It is interesting to explore whether these differences can be explained by affine transformation or whether non-affine transformations are required. In the affine sub-space only the first principal component shows clear evidence of difference between the groups and it is already clear that the lower brow of the British faces is not captured in this sub-space. This is confirmed by analysis in the non-affine sub-space where there are two principal components which exhibit clear evidence of differences between the groups and whose combined effects are displayed in the right hand image of Figure 9.

Visualising the shape of individuals
In the previous section, evidence for systematic differences between groups was considered, while allowing for the presence of individual variability. This section considers situations where interest lies in the evaluation of individuals.
Traits which can be expressed in single values are considered, as well as more general characterisation of the particularities of individual shapes.

Asymmetry
For shapes whose ideal form is symmetric, deviations which disturb this symmetry are important features. The left/right symmetry of human faces is a major example, where any strong departure from symmetry creates a striking visual impression. However, real faces are all asymmetric to some degree so, as part of the process of evaluating an individual shape, it is important to characterise the asymmetries found in an appropriate reference population.
When a shape is represented by a set of point locations, some of which are paired as left/right counterparts, quantification of asymmetry is generally based on the degree of post-registration mismatch between the shape and its reflection with the left/right labels swapped. Theoretical development of this idea was undertaken by Mardia et al. (2000) and Kent and Mardia (2001) in the context of landmarks, and many authors have exploited this thinking in biological contexts. Bock and Bowman (2006) proposed a decomposition of global asymmetry which allowed local sources to be identified and separated into contributions from individual features and their configurations.
The first step in computing a functional measure of asymmetry for a surface X(s) is to apply (functional) Procrustes matching of the mirror image onto the original surface, to create the new surfaceX(s). The mirror image is created in practice by reflecting and relabelling the configuration of points which express the shape model. The integrated comparison and its discrete approximation are then easily constructed as where, to be even-handed, S is the surface formed from the average of X(s) andX ( (Jackson, 2008). The global asymmetry scores, for both pre-surgical and post-surgical facial shapes of this patient, have been superimposed on the bottom density strip. These scores are entirely typical of controls and in particular they provide reassurance that surgery has not introduced any marked asymmetry overall. The scores have also been Figure 10: The four facial images show, in clockwise order from top right, a post-surgical patient, the sub-regions used to compute asymmetry scores, the colour-coded distance between the shape and its matched reflection, and the superimposition of the shape and its matched reflection. The density strips show the asymmetry scores from control faces, with the pre-and post-surgical scores for the individual superimposed.
computed for a variety of sub-regions, indicated by the top right hand image of the four facial images. The scores and density strips indicate strong nasal asymmetry, but this is apparent both before and after surgery and so it cannot be attributed to surgical intervention.

Closest Controls
For more general assessment of the shapes of individual cases, an approach analogous to the concept of a 'normal range' for univariate data is required.
In the surgical context, characterising any differences between a post-surgical patient and a control population could provide helpful guidance on the nature of any further surgery which may be required. Bowman and Bock (2006) outlined an approach based on the concept of a 'closest control'. This identifies the shape which is as close as possible to the individual of interest but which lies on the surface of a 95% prediction ellipsoid and so lies within the 'normal range' associated with controls. Any remaining shape differences then characterise the features of the individual shape which are different from controls. Bowman and Bock (2006) derived the algebra of this in a simple case involving curve data, using a principal component regularisation to reduce dimensionality across both cases and controls. The concept is applied here to surface data but the ideas are developed further in two important ways.
Firstly, principal components are constructed from the control data only.
This gives a clear interpretation of the components which is unaffected by the particular cases available. Secondly, variation unexplained by these components is also considered, in order to give a complete description of the observed data.
If a new shape Z, such as a post-surgical patient, is registered onto the control mean then it can be projected into the space of the first p principal components, denoted by C p , by computing the score vector v(Z) = vec(Z − X) T E p , where E p is the matrix whose columns contain the first p principal component vectors derived from the control data. Within this space, the Mahalanobis distance of the new shape from the mean control shape is whereΣ is a diagonal matrix containing the variances of the principal com-ponents. The Mahalanobis distance has a χ 2 p distribution approximately. If d(Z) is less than the 95th percentile of this distribution, denoted by χ 2 p (0.95), then the new shape falls within the 'normal range' of controls in this space.
If d(Z) > χ 2 p (0.95) then the closest control in this p-dimensional space can be found by shrinking v towards 0 until its Mahalanobis distance matches χ 2 p (0.95). The shrinking factor α 1 is easily found as α 1 = The scores of this new location α 1 v(Z) are then converted into tangent co-ordinates as α 1 v(Z)E T p , and expressed as a shape by reconfiguring the tangent co-ordinates into a three column matrix in the usual manner as This finds the closest control in C p . However, the case of interest may well have shape features which cannot be captured in this space so characterisation in the complementary space, denoted by R p , is also required. The so the relevant information is found in the residual shape R(Z) = Z −Z.
The length of the residual at each model location can be quantified in the vector L(Z) = R(Z) 2 1 3 , where here the square-root and the exponent 2 are applied element-wise. A measure of variation in the lengths of the residuals at each model location for controls is then available in the vector ν whose jth element is the standard deviation of {L(X i ) j ; i = 1, . . . , n}, where L(X i ) j denotes the jth element of L(X i ). A simple measure of variation is then This averages the lengths of the residuals across the model locations, standardised at each model location by the variation in control residual length.
The value of r(Z) may be regarded as atypical if it lies beyond q 95 , the 95 quantile of {v(X i ); i = 1, . . . , n}. A closest control in the residual space, R p , can then be constructed by shrinking the residual shape to α 2 Z, where α 2 = q 95 /r(Z). An overall closest control for z can now be constructed as which combining the closest controls in the sub-spaces R p and C p . Residual scale (npc = 9) count Figure 11: The histograms show the distances of the control shapes from the mean in the space of the first 9 principal components (left) and in the residual space (right). The distances of two post-surgical cases from the control mean are superimposed. The lower images compare the facial shape of case number 1 (green) with its closest control. both through superimposition and as normal distances from case to closest control.
imposition and by normal distances. This characterises the unusual features of the case as a slightly more prominent lower face than in controls, particularly in the mandible (lower jaw). This is potentially valuable feedback on surgery which involves repositioning of the underlying bones. A display of the closest control information for case 2 is deferred to the next sub-section.

An integrated patient assessment
The methods described in this section provide valuable tools for the characterisation of individual shapes of interest. The combination of these tools forms the basis of an integrated patient assessment. This is illustrated in Figure 12, using case 2 from Figure 11. This brings together the observed pre-surgical and post-surgical shapes, comparisons of this case with control shape both for closest control analysis and for asymmetry, and illustrates differences in shape through superimposition and normal distances. An interactive display would allow those reviewing the case to inspect the shapes in 3D and to query further information. However, this static display gives a helpful summary of the effects of surgery on this particular patient.

Discussion
This paper has proposed methods of analysis for high resolution surface data and corresponding models which give consistent descriptions of each observed shape. A strong emphasis has been on the adoption of functional forms of analysis and the difficulty of identifying a common sample space has been overcome by using the mean surface as an indexing shape. This enabled functional forms of registration, principal components analysis and group comparisons to be developed. In applying these methods, strong emphasis was also placed on the use of principal components to identify sub-spaces of interest rather than inspection of individual components. Graphical displays of shape change which jointly describe these sub-spaces were also adopted.
Particular attention was given to the comparison of individual shapes with a relevant control group. In additional to univariate measures such as asymmetry scores, the concept of a 'closest control' was developed in detail, to give a powerful means of identifying any unusual characteristics of an individual shape of interest.
These methods provide powerful tools for the assessment of surface shapes  Figure 12: An integrated assessment of patient 2 from Figure 11. and the practical implications of their use, particularly in surgical contexts, will be the subject of subsequent research.

Acknowledgement
The early stages of the research carried out by Adrian Bowman and Stanislav Katina was supported by a Wellcome Trust grant (WT086901MA) to the Face3D research consortium (www.Face3D.ac.uk), under whose auspices the facial data were collected.    Appendix A 3D warping

Supplementary materials
The technical details of warping are described here because the method is not widely used in 3D. We seek a function which maps X onto Y exactly. If an interpolant of a single co-ordinate of Y as a function of the three co-ordinates of X is considered, then the elegant functional analysis described by Duchon (1977) provides an immediate solution. The aim is to find the interpolating function f which has minimal bending energy, defined as The solution can be expressed in terms of radial basis functions which parameterise the relationship between points x and y in R 3 as where β jd are parameters and d denotes the three dimensions of R 3 . Fitting this functional form to the mapping from the observed locations in X to those in Y requires Y = Sβ 1 , where S is a J × J matrix, with S ij = φ (||x i − x j ||), and β 1 is a J × 3 matrix whose (j, d)th element is β jd .
It is helpful to separate the mapping into affine and non-affine components, with the former capturing the linear part of the transformation, including possibly different scalings in different co-ordinate directions (shear), and the latter describing non-linear bending. If Q denotes the matrix (1 J X), where 1 J is a column vector of 1's, then the transformation can be written in the multivariate form where β 2 is a 4 × 3 matrix filled with parameters. This system is now overparametrised, with (J + 4) × 3 parameters but only J × 3 defining equations.
This can easily be resolved by adopting suitable constraints, for example through the extended system where the 0 entries indicate matrices filled with 0's of the dimensionality required by the context. These constraints require the sum of the entries of each column of β 1 to be 0 and the sum weighted by the co-ordinates of each dimension of X also to be 0. By applying constraints to the affine component, the interpretation of the non-affine component is left undisturbed.
The system of equations (4) can be written in the condensed form Y e = X e β, with obvious definitions of X e and Y e . If the matrix S is invertible then so is X e and, after some standard matrix manipulations, the solutions emerge as When the bending energy matrix B e is post-multiplied by X, this generates the coefficients of the non-affine part of the transformation. The bending energy itself can be expressed as tr Y T B e Y . Finally, the optimal radial basis function is shown simply to be φ(z) = − 1 8π z.