SHREC’20: Shape correspondence with non-isometric deformations

Estimating correspondence between two shapes continues to be a challenging problem in geometry processing. Most current methods assume deformation to be near-isometric; however, this is often not the case. For this paper, a collection of shapes of different animals has been curated, where parts of the animals (e.g., mouths, tails & ears) correspond yet are naturally non-isometric. Ground-truth correspondences were established by asking three specialists to independently label corresponding points on each of the models with respect to a previously labelled reference model. We employ an algorithmic strategy to select a single point for each correspondence that is representative of the proposed labels. A novel technique that characterises the sparsity and distribution of correspondences is employed to measure the performance of ten shape correspondence methods.


© 2020 Elsevier B.V. All rights reserved.

Introduction
In the decade since the release of the first Kinect, there has been a large increase in the number of low-cost 3D capturing devices available. As well as bespoke 3D scanning hardware, a combination of improvements in software solutions for photogrammetry and the pervasiveness of high quality cameras has enabled the creation of vast amounts of 3D data. With the increasing amount of new data captured, demand for methods that provide greater automation to understand relationships between shapes is increasing.
Accurately identifying correspondences between two or more surfaces automatically continues to be a challenging and relevant problem. It provides a basis to enable further analysis and applications in a variety of areas. As discussed by van Kaick et al. [1], the problem of shape retrieval is closely related to shape correspondence, as the correspondence between two shapes may be used to measure their similarity.
Recently, the problem of non-isometric shape correspondence has become increasingly popular. Strictly isometric and near-isometric deformation has been well studied. With state-of-the-art methods [2,3,4] achieving superior performance in current non-isometric scenarios, there is presently an absence of valuable benchmark datasets for non-isometric shape correspondence.
Additionally, in real-world scenarios where real objects are scanned, existing capturing techniques induce natural geometric errors (e.g., noise, self-occlusions, and fusion between parts, causing topological changes). The limitations and errors exhibited vary according to the particular scanning technique employed. Typically, most benchmark datasets consist of scans from one scanning source. This makes it unclear how well a method's measured performance may transfer to other technologies. Recently, Melzi et al. [5] published a dataset that sought to address the issue of incompatibilities between meshes, which arise when working with scans from multiple sources.
For this dataset, we have compiled a small database of quadruped shapes. Establishing correspondence between animals, whose shapes exhibit extreme non-isometries, poses a pertinent challenge that is not currently considered. As discussed in Section 2.1, most existing datasets address the problem of correspondence between humans, which has quite limited applications. From the perspective of comparative anatomy, being able to also establish correspondence with other mammalian vertebrates (focused on here) may be considered a generalisation of the human correspondence problem.
Previous work [6] has discovered deficiencies when reporting the performance of methods. Current error measurements fail to capture valuable quantitative information about the sparsity and distribution of correspondences. This is further discussed in Section 2.2.
Although topological variations exist, due to the common ancestry of tetrapod mammals, many parts are considered homologous structures, i.e., to correspond. Fig. 1 illustrates the homologous region of the hind leg between quadrupeds, all comprising primarily a femur, tibia, fibula, and metatarsal. Therefore, establishing a valid correspondence automatically is possible.
Methods capable of accurately finding correspondences between different mammals enable further avenues of research, e.g., statistical, behavioural analysis [7], and generative models [8].
For zoologists that use morphometrics (the study and development of techniques for the quantitative measurement of organisms), sparse, manually placed correspondences between animals are required to conduct statistical shape analysis.
Contribution. The main contributions of this work are as follows:
• Generation of a novel dataset of quadrupeds with sparse ground-truth correspondences labelled by three specialists.
• Development of a new measure to evaluate the coverage of correspondences on a shape's surface, discussed further in Section 5.
• Systematic evaluation of the performance of a selection of recent shape correspondence methods, with additional quantitative insights into performance from our novel measure.
Organisation. This report is organised as follows: Section 2 discusses previous works on quadruped benchmarks and their relation to current human body research, as well as discussing correspondence evaluation techniques. Section 3 describes the contents of the dataset, as well as specifying the acquisition techniques used to capture each object. Section 4 describes the correspondence methods and parameters used for this dataset. Section 5 describes the measures used to evaluate the performance of methods, including our novel measure of correspondence coverage. In Section 6 results are presented and discussed. Finally, Section 7 contains concluding thoughts arising from the outputs of this work.

Correspondence datasets
There has been a great focus on anthropometric (i.e., relating to the measurement of humans) surface deformation [9,10,11,12,13,14], with the many datasets produced opening up avenues to conduct further research. Because the field of anthropometry may be considered to generalise to the field of morphometry, we shall discuss existing datasets in both.
The notable FAUST dataset [14] contains a total of 300 real scans of 10 humans in 30 poses captured using the 3dMDbody.u System by 3dMD. Subjects were covered in sparse markers to enable shapes to be registered using a novel texture-based technique, which ensured quality alignment in areas with little geometric detail. Ground-truth correspondences between different individuals have been established, with a subset publicly released for training purposes. This was subsequently extended in a dataset that captured human body motion [15], containing 40,000 scans.
Vlasic et al. [12] propose a technique that uses multiple monocular cameras to capture a sequence of images of a human's performance from multiple angles. At each time step the image from each camera is segmented to separate the background from the actor. This is used to produce a silhouette. A template mesh rigged with a skeleton is used in combination with the silhouettes to reconstruct the human's pose. This approach enables all models using the same template to have the same connectivity and thus have dense correspondence. 10 performances have been published, in which sequences consist of between 150 and 250 watertight meshes.
The CAESAR dataset [9] is one of the largest human body datasets. 4,431 subjects were scanned in North America, the Netherlands, and Italy using laser scanning. 72 stickers were placed on each subject for use as landmarks; due to initial capturing limitations 110 subjects from the European subset do not have landmarks. Subjects were scanned in three poses: standing, sitting comfortably, and sitting with arms raised.
The CAESAR dataset is not publicly available without purchase of a license. The licensing and copyright of content is an issue present across computer vision, especially in research where it may be unclear as to whether one's work is considered to be for commercial or non-commercial purposes. Whilst still a grey area, subsequent human body datasets have been derived from the dataset [13,16,17,18]. Loper et al. [17] & Zuffi et al. [8] both use data that is not considered to be in the public domain to develop linear blend skinning algorithms to construct a model of the data. These have been subsequently used to produce synthetic datasets for evaluative and training purposes as ground-truths are easily established [19].
Several synthetic datasets have been derived from the Digital Art Zone (Daz Productions, Inc. or Daz 3D). Daz 3D is a digital model platform, as well as a software development company. They have produced a series of base (i.e., template) models which are rigged and include morphs to alter the appearance of a model (e.g., emaciated, muscular, etc.). Many datasets have been derived from these models [20,21,18]. Models that use the same template share the same connectivity, and therefore possess dense ground-truths.
Kim et al. [22] combined three existing datasets [11,20,23]. A subset of the shapes are animals, for which a volunteer selected 21 corresponding points on the quadruped shapes. The dataset contains 51 quadruped shapes. Meshes contain between approximately 3,000 and 56,000 faces.
Other datasets that contain a subset of quadruped shapes [24,25] provide ground-truth correspondences between different shapes of the same class but not different mammals, limiting the degree of non-isometric deformation that can be quantitatively evaluated.
Previous SHREC tracks have used datasets that contain non-isometric deformation [5,6]; however, the degree of non-isometry exhibited is significantly more subtle. This work contains strictly highly non-isometric deformations. Furthermore, the dataset of Dyke et al. [6] comprises scans captured solely with one commercial device, while our new dataset spans a variety of sources. This introduces additional challenging geometric and topological variations.

Correspondence measures
We shall begin with a few definitions. Let $C$ be a set of correspondences between the source surface $X$ and the target surface $Y$. For a predicted correspondence $(x_i, y_i) \in C$ between source and target surfaces, the distance between the predicted point $y_i$ and the ground-truth point $y_i^*$ is measured using the distance function $d_Y(y_i, y_i^*)$, which may represent either the Euclidean or geodesic distance.
Often geodesic distance, or a normalised variant of it, is used to directly measure performance. The average geodesic error may be formulated as
$$\bar{\varepsilon} = \frac{1}{|C|} \sum_{(x_i, y_i) \in C} d_Y(y_i, y_i^*).$$
Many previous SHREC tracks have principally used geodesic distance as a measure for error [5,26,27,28]. Note that when measuring the overall error for a collection of target shapes, it may be necessary to normalise the computed error by the surface's properties such as the area of $Y$ or the farthest geodesic on $Y$.
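As an illustration of the average error and its optional normalisation, the following is a minimal sketch. It assumes the distances $d_Y(y_i, y_i^*)$ have already been computed by some geodesic solver; the function name `average_geodesic_error` and the choice of normalising by the square root of the target area are assumptions made for this example.

```python
import math

def average_geodesic_error(distances, target_area=None):
    """Mean of d_Y(y_i, y_i*) over all correspondences.

    distances:   precomputed distances between predicted and ground-truth
                 points on the target surface Y (assumed given).
    target_area: if provided, every distance is divided by sqrt(area(Y)),
                 which is one possible normalisation (an assumption here).
    """
    if not distances:
        raise ValueError("at least one correspondence is required")
    scale = math.sqrt(target_area) if target_area is not None else 1.0
    return sum(d / scale for d in distances) / len(distances)

# Toy values, not taken from the benchmark.
toy_distances = [0.0, 0.2, 0.4]
mean_raw = average_geodesic_error(toy_distances)
mean_normalised = average_geodesic_error(toy_distances, target_area=4.0)
```

Normalising per shape before averaging keeps errors comparable across targets of different sizes.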
Measures that require each predicted correspondence to be assigned a binary classification $b(y_i) \in \{0, 1\}$ as either true positive (TP) or true negative (TN), e.g., precision, recall, and specificity, rely on an appropriate classification strategy $b(y_i) = B(y_i, y_i^*)$. A popular strategy is to measure the distance between the predicted and ground-truth correspondence points $d_Y(y_i, y_i^*)$, and points that are below a specified error threshold are considered correct. This description is the basis of the popular correspondence error measure proposed by Kim et al. [22], where a correspondence is considered to be a TP when $d_Y(y_i, y_i^*) \le \tau$ for a threshold $\tau$. The value of $\tau$ is increased to measure the number of TP over larger radii, which can be used to produce a curve. This approach fails to characterise the distribution and sparsity of correspondences on shapes where a limited number of ground-truth correspondences are available.
The benchmark protocol described by Kim et al. [22] has been considered the standard error measure for correspondences. The normalised geodesic error may be used to produce further statistics through the use of the area under the curve [6]. For a collection of shapes, Kim et al. [22] also report the average of the maximal geodesic error. Rodolà et al. [29] & Cosmo et al. [30] report the average geodesic error over a dataset of shapes with gradually reducing surface areas to measure the robustness of methods on increasingly partial scans. For functional mapping approaches, Corman et al. [31] measure the quality of ground-truth and predicted functional basis. In the case of registration methods, where one shape is deformed to align with another, it is possible to measure fitting error using the Hausdorff distance [32,33]. van Kaick et al. [1] discuss a variety of other validation methods for shape correspondences.
Qualitative techniques using visual mappings between two shapes in which topological information is transported (e.g., texture transfer) [4,34] are also used. However, these techniques are not an effective way to succinctly summarise the performance of a method on larger datasets or for comparing the performance of multiple methods. Further evaluation may be done by using the proposed algorithm in an application that requires a correspondence mapping (e.g., shape retrieval [35], consistent quadrangulation [4,36,37]).

Dataset
For this track we have identified a set of synthetic models and real-world scans of 3D shapes, specifically four-legged animals, and produced a set of ground-truth correspondences. Shapes have been mended to remove major errors such as self-intersecting faces and handles which cause an erroneously high genus. Ground-truth correspondences were acquired by asking specialists in geometry processing and animal studies to label the shapes manually using a bespoke labelling tool (see Fig. 2).
Because the dataset includes real-world scans, many of the shapes contain geometric inconsistency and topological change caused by self-contact. The real scans also contain natural noise, varying triangulation and self-occluded geometry. Some examples of challenging cases are shown in Fig. 3. The dataset contains 14 models that have been acquired using a variety of techniques (see Table 2). Because the dataset is limited to quadruped mammals, many regions share a similar shape or function; it is therefore possible to establish correspondences between homologous loci with a reasonable degree of accuracy. While the dataset size might initially be considered to be quite small, for the purposes of computing and evaluating corresponding pairs, there are P(14, 2) = 182 permutations of shape pairs, or 149 pairs when excluding full-to-partial pairs. Our benchmark experiment participants were asked to complete a subset of these pairs comprising full-to-full and partial-to-full pairs of models.
The ground-truths for this dataset are acquired using the originally sourced mesh. Three specialists labelled corresponding points on each shape based on a template shape that had initially been labelled with markers. For each point, multiple experts proposed a correspondence on the surface and a consensus was found by selecting the medoid point. Approximately 50 marker positions were initially selected on the rhinoceros. The rhinoceros was selected as the template since, although it was reconstructed from a multi-view camera array, the shape was subsequently corrected by a professional CGI artist.
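The pair count quoted above can be checked directly. The sketch below enumerates ordered (source, target) pairs of 14 placeholder shape names; the names are hypothetical. The figure of 149 is not reproduced because it depends on which models are partial, which is listed in the dataset tables rather than here.

```python
from itertools import permutations

# Placeholder identifiers standing in for the 14 models of the dataset.
shape_names = [f"shape_{k:02d}" for k in range(14)]

# Ordered pairs: (A, B) and (B, A) count separately, hence P(14, 2) = 14 * 13.
ordered_pairs = list(permutations(shape_names, 2))
pair_count = len(ordered_pairs)
```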
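The consensus step can be illustrated with a small medoid computation. This is a sketch under assumptions: the proposals are plain 3D coordinates and Euclidean distance is used for simplicity, whereas a distance measured on the surface could be passed in through the `dist` argument. The function name `medoid` and the sample coordinates are made up for the example.

```python
import math

def medoid(points, dist=math.dist):
    """Return the proposed point with the smallest total distance to the others.

    Unlike a mean, the medoid is always one of the proposed points, so the
    consensus is guaranteed to be a location that a specialist actually chose.
    """
    return min(points, key=lambda p: sum(dist(p, q) for q in points))

# Three hypothetical labels for the same landmark; one is an outlier.
proposals = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
consensus = medoid(proposals)
```

With three proposals the medoid is the one lying between the other two in terms of total distance, which limits the influence of a single outlying label.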
For the benchmark, where models have an exceedingly high triangle count, the mesh is simplified to 100,000 triangles. Participants could also submit results using a low-resolution version of the meshes with 20,000 triangles that were also made available.
Ground-truth correspondences were not made available to participants during the track and were solely reserved for evaluative purposes.
Information about the data underpinning the results presented here, including how to access them, can be found in the Cardiff University data catalogue, where the dataset has been split into two parts based on the licenses associated with the data (http://doi.org/10.17035/d.2020.0112373427 (Sketchfab), http://doi.org/10.17035/d.2020.0112716358 (AIM@SHAPE)).

Test sets
Pairs of scans were carefully selected to ensure the non-isometry present in each test-set gradually increased. A description of the contents of each test-set may be found in Table 3.

Initial correspondences
For many shape correspondence and registration algorithms, a sparse set of correspondences is required for initialisation. A set of high quality sparse correspondences enables subsequent automatic refinement of the estimated non-rigid deformation. However, a poor set of initial correspondences may cause the algorithms to fail. For the purposes of establishing correspondences automatically, it is important to select a robust initial correspondence strategy.
To produce a set of candidate correspondences, SHOT signatures [39] at two radii (2% and 5% of the square root of the total triangle area) and IWKS [40], a spectral descriptor, were examined, as well as a combination of SHOT and IWKS used together. SHOT was found to produce the most correct correspondences. A spectral pruning method proposed by Tam et al. [41] was used to remove noisy candidates and produce a set of globally consistent correspondences. For this method to work optimally, input geometry must be locally isometric; however, this was rarely the case in our dataset. Due to memory limitations and computation time, correspondences were computed with the default parameters, except K = 5 (which specifies how many initial correspondences are found for each point in the source mesh) and d = 0.25 (which corresponds to the local neighbourhood size in diffusion pruning).
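The role of K can be sketched as a nearest-neighbour search in descriptor space. The following is only an illustration under assumptions: descriptors are short tuples of numbers rather than real SHOT signatures, matching uses Euclidean distance, and the function name `candidate_matches` is hypothetical. It does not reproduce the pruning stage of Tam et al. [41].

```python
import heapq
import math

def candidate_matches(source_desc, target_desc, k=5):
    """For each source descriptor, list the indices of the k closest target descriptors.

    The result holds k candidates per source point, ordered from most to least
    similar; a later pruning stage would be expected to discard inconsistent ones.
    """
    candidates = []
    for s in source_desc:
        nearest = heapq.nsmallest(
            k, range(len(target_desc)), key=lambda j: math.dist(s, target_desc[j])
        )
        candidates.append(nearest)
    return candidates

# Toy 2D "descriptors", for illustration only.
source_desc = [(0.0, 0.0), (10.0, 10.0)]
target_desc = [(0.0, 1.0), (9.0, 9.0), (5.0, 5.0)]
matches = candidate_matches(source_desc, target_desc, k=2)
```

Keeping several candidates per point, rather than only the best one, gives the pruning stage alternatives to choose from when the closest descriptor match is wrong.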

Baseline N-ICP
Bouaziz and Pauly [42] describe a naïve non-rigid registration method. The method computes an initial set of correspondences using nearest neighbours, and formulates a data term with point-to-point and point-to-plane metrics. Deformations are regularised by global and local rigidity measures. Local regularisation uses the as-rigid-as-possible formulation proposed by Sorkine and Alexa [43]. An updated set of correspondences is estimated using the new nearest neighbours based on the present shape deformation. Shapes were rescaled according to their total triangle area before registration. The terms are combined into a single weighted energy minimisation problem. The following parameters were used: w1=1, w2=1, w3=1, w4=1000, iter=100.
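The two metrics in the data term can be illustrated as follows. This is a generic sketch of point-to-point and point-to-plane residuals, not the implementation of Bouaziz and Pauly [42]; the function names and the weights `w_point` and `w_plane` are hypothetical and are not claimed to map onto the parameters listed above.

```python
def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(p - q for p, q in zip(a, b))

def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(p * q for p, q in zip(a, b))

def point_to_point(x, y):
    """Squared Euclidean distance between a source point x and its match y."""
    d = sub(x, y)
    return dot(d, d)

def point_to_plane(x, y, normal):
    """Squared distance from x to the tangent plane at y (normal has unit length)."""
    return dot(sub(x, y), normal) ** 2

def data_term(pairs, w_point=1.0, w_plane=1.0):
    """Weighted sum of both residuals over (x, y, normal) triples."""
    return sum(
        w_point * point_to_point(x, y) + w_plane * point_to_plane(x, y, n)
        for x, y, n in pairs
    )

pairs = [
    ((1.0, 2.0, 3.0), (1.0, 2.0, 5.0), (0.0, 0.0, 1.0)),  # offset along the normal
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)),  # offset within the plane
]
energy = data_term(pairs)
```

The second pair shows the practical difference: sliding within the tangent plane is penalised by the point-to-point residual but not by the point-to-plane residual.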

Non-rigid registration under anisotropic deformations
Dyke et al. [2] propose a two-stage iterative registration framework. In the first stage a correspondence mapping between surfaces is estimated by applying a variant of non-rigid ICP with an r-ring as-rigid-as-possible constraint for regularisation of larger neighbourhoods. The second stage uses the computed mapping to estimate local anisotropy on the surface, represented by a discrete 2-tensor field. The anisotropy map is used to compute anisotropic geodesics for use in an extended spectral diffusion pruning method [41]. The authors observe that non-isometric areas may have few correspondences. This is rectified by interpolating between nearby correspondences that are considered to be good in order to provide correspondences for such problem areas.
The algorithm is initialised using the pre-computed sparse correspondences.

Robust Non-Rigid Registration with Reweighted Position and Transformation Sparsity
To address large-scale motion in non-rigid deformation, a non-rigid registration method with sparsity-regularised position and transformation constraints is proposed by Li et al. [44]. The distribution of positional errors and transformation differences for typical non-rigid deformation can be well modelled using the Laplacian distribution; equivalently, the $L_1$-norm should be used to measure both the positional errors and transformation differences. To promote sparsity, a re-weighted sparse model is adopted, which is solved by the alternating direction method of multipliers (ADMM). The model is robust against outliers as the sparsity terms allow a small fraction of regions with larger deviations. The method is evaluated on both public datasets and real datasets, captured by an RGB-D depth sensor. The results demonstrate that the method obtains better results than other state-of-the-art non-rigid registration and correspondence methods [44,6].
This method requires an initial set of sparse correspondences; the pre-computed correspondences were provided to the participating authors.

Efficient Deformable Shape Correspondence via Kernel Matching
Vestner et al. [45] consider the importance of certain properties for establishing a quality correspondence mapping, namely: ensuring the predicted mapping is a homeomorphism (bijective, and both itself and its inverse are continuous) and promoting matches of similar points.
The method is controlled primarily by two parameters, α and t. α balances the data and regularisation terms. t is the time parameter for heat diffusion. In practice, changing parameter t changes the influence of distant points during propagation. Vestner et al. [45] state the importance of selecting a large value for t in scenarios where a large amount of noise is present in the initial correspondences.

Deblurring and Denoising of Maps between Shapes
Ezuz and Ben-Chen [46] identify that, while versatile, functional maps tend to recover low fidelity correspondences. To address this, they propose an approach that aims to improve the specificity of a given functional map by introducing a novel smoothness prior. The method is also designed to work in cases of highly non-isometric deformation. A regularisation term that promotes smooth mappings, which helps to remove noise from mappings, is incorporated to compute a functional mapping. Precise vertex-to-point correspondences are then recovered using an improved ICP-based recovery method.
Due to memory limitations a subset of 200 correspondences were selected using geodesic-based farthest point sampling.
For experiments we set the number of basis functions ($k_1$ & $k_2$) to 120. The method requires an initial set of landmarks; the precomputed SHOT correspondences with diffusion pruning were used for initialisation.

Partial Functional Correspondence
Rodolà et al. [29] propose a functional mapping method that is capable of robustly finding correspondence between non-rigidly deforming partial and full shapes. Observing that the functional mapping between two full near-isometric shapes should be approximately orthogonal and full rank, they investigate how partial shapes deviate from this. The authors take advantage of the low rank and sloped nature of functional maps in partial cases. Rodolà et al. [29] incorporate novel regularisation terms into a two-step optimisation process. The first step optimises the correspondence of the functional map based on an estimation of how partial the source shape is with respect to the target shape. The second step penalises any change in area and the length of the boundary of a part. This approach is not robust to non-isometric deformations.
Shapes were re-scaled to have surface areas between $1.5 \times 10^4$ and $2.0 \times 10^4$. All other parameters remain as per their default, except n_eigen = 100.
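Re-scaling a mesh to a prescribed surface area amounts to a uniform scale by the square root of the area ratio, since area grows with the square of the scale factor. The sketch below assumes a triangle mesh given as vertex and face lists; the target value of $1.75 \times 10^4$ is an arbitrary choice inside the stated range, and the function names are hypothetical.

```python
import math

def triangle_area(p, q, r):
    """Area of the triangle (p, q, r) from the cross product of two edges."""
    u = [q[k] - p[k] for k in range(3)]
    v = [r[k] - p[k] for k in range(3)]
    cross = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    return 0.5 * math.sqrt(sum(c * c for c in cross))

def mesh_area(vertices, faces):
    """Total surface area of a triangle mesh."""
    return sum(triangle_area(*(vertices[i] for i in face)) for face in faces)

def rescale_to_area(vertices, faces, target_area):
    """Uniformly scale vertices so that the mesh has the requested surface area."""
    scale = math.sqrt(target_area / mesh_area(vertices, faces))
    return [tuple(scale * c for c in v) for v in vertices]

# A unit square made of two triangles (area 1) as a stand-in for a real mesh.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
faces = [(0, 1, 2), (0, 2, 3)]
scaled = rescale_to_area(vertices, faces, target_area=1.75e4)
scaled_area = mesh_area(scaled, faces)
```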

Continuous and Orientation-preserving Correspondences via Functional Maps
Ren et al. [47] seek to address the problem of intrinsic symmetries when estimating correspondence using functional maps. The authors incorporate an orientation preserving constraint term into the optimisation function used to compute a functional mapping, incorporating surface normal information through the use of triple products, although this part of their technique was not used in experiments. The mapping is enhanced with a novel iterative refinement method to further ensure bijectivity and continuity. Outliers are efficiently detected and removed by measuring the Euclidean distance between two pairs of points. Unmapped points are reassigned a correspondence based on neighbouring correspondences. A further step improves point-wise mappings to promote the continuity of correspondences.
The region-level correspondence method of Kleiman and Ovsjanikov [48] was used to establish an initial correspondence between regions using the default parameters, except numComponentsRange = {10, 9, 8, 7}. An initial functional mapping was computed using Nogneng and Ovsjanikov [49], with $k_1 = 120$ & $k_2 = 120$. The refinement method proposed by Ren et al. [47] was run for 10 iterations to recover point-to-point correspondences.

CMH Connectivity Transfer
We use the CMH framework proposed in Marin et al. [50] and extended to animals in Melzi et al. [51] to establish correspondences by transferring the connectivity. The method relies on extending the standard Laplace-Beltrami Operator (LBO) basis by adding three additional bases that encode extrinsic information of the meshes. This combination of intrinsic and extrinsic information permits fully encoding the geometry of the models without information loss due to a low-pass representation. A functional map is then computed, as proposed by Nogneng and Ovsjanikov [49], using six hand-placed landmarks as probe functions. Finally, the connectivity is transferred using the point-to-point correspondence and refined using an as-rigid-as-possible energy. The match is recovered by finding the nearest neighbour between the target model and the source connectivity transferred over the model. This method assumes that the target and source shapes share the same pose, and does not use the coherent-point-drift local refinement as proposed in the original paper.

ZoomOut
Similarly to CMH [50], ZoomOut [52] computes standard LBO bases, which it then refines. Correspondence and functional map computations are iterated between, increasing the dimension of the mapping at each step. The following parameters were used: 20 as input and 360 as output for the dimension of the functional map, using an incremental step of 10 and 1,000 samples with farthest point sampling for the correspondence step. As with CMH, the connectivity is transferred and the result is refined using as-rigid-as-possible optimisation.

R3DS Wrap 3.4
Russian3DScanner [53] developed Wrap 3, a commercial tool to transfer shape topology through non-rigid registration. The tool uses a variant of coarse-to-fine N-ICP with the facility to provide initial correspondences in the form of hard constraints to further help. This commercial software was found to be highly performant in a previous benchmark with lesser degrees of non-isometry by Dyke et al. [6].
The default parameters were left unchanged, and the method was provided with initial pruned correspondences.
Classification. Based on the comprehensive survey paper of Sahillioglu [54], all methods have been categorised (see Table 4) based on the criteria described. Please refer to the original survey for the precise definition of each criterion.
As shown in Table 5, most methods submitted results for each test-set, except [52] & [50]. This was mainly because these methods were designed to primarily handle cases where objects have the same genus. In this dataset, test-set 2 does not contain any topological changes.

Error measure
For convenience, we describe the protocol of Kim et al. [22] here. Consider an estimated correspondence $(x_i, y_i) \in X \times Y$ and the respective ground-truth correspondence $(x_i, y_i^*) \in X \times Y$. The geodesic distance between the corresponding points on $Y$ is $d_Y(y_i, y_i^*)$ (an example of geodesics is shown in Fig. 4). The area of shape $Y$ is used to normalise the distance. The error of the estimated correspondence is measured as
$$\varepsilon(x_i) = \frac{d_Y(y_i, y_i^*)}{\sqrt{\mathrm{area}(Y)}}.$$
Cumulative error curves are subsequently produced by counting the number of correspondences with an error $\varepsilon(x_i)$ less than a given threshold of normalised geodesic distance $\tau$, i.e., $\varepsilon(x_i) \le \tau$.
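The protocol can be sketched in a few lines. The example assumes that geodesic distances are already available, normalises them by the square root of the target area, and builds the cumulative curve over a list of thresholds. The area under the curve is approximated with the trapezoidal rule, which is an assumption of this sketch rather than a detail taken from the benchmark; all function names are hypothetical.

```python
import math

def normalised_errors(distances, target_area):
    """Divide each geodesic distance by sqrt(area(Y))."""
    root_area = math.sqrt(target_area)
    return [d / root_area for d in distances]

def cumulative_curve(errors, thresholds):
    """Fraction of correspondences whose error is <= each threshold."""
    n = len(errors)
    return [sum(e <= t for e in errors) / n for t in thresholds]

def area_under_curve(thresholds, fractions):
    """Trapezoidal approximation of the area under the cumulative curve."""
    return sum(
        (thresholds[k + 1] - thresholds[k]) * (fractions[k] + fractions[k + 1]) / 2
        for k in range(len(thresholds) - 1)
    )

# Toy distances on a target of area 16 (so sqrt(area) = 4).
toy_distances = [0.0, 1.0, 2.0, 4.0]
errors = normalised_errors(toy_distances, target_area=16.0)
thresholds = [0.0, 0.25, 0.5, 0.75, 1.0]
curve = cumulative_curve(errors, thresholds)
auc = area_under_curve(thresholds, curve)
```

A curve that rises quickly indicates that most correspondences lie close to their ground-truth positions; the AUC condenses that behaviour into a single number for ranking methods.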

Surface coverage measure
To further assess the performance of methods we develop a coverage measure. The measure is derived by first segmenting a shape's surface into discrete regions and then summing the area of regions that contain a correspondence; this value is then normalised by the shape's total area. When few regions contain a correspondence the resulting value will be low, which indicates potentially poor overall correspondence between surfaces. By being able to numerically summarise the quality of a set of correspondences, it is possible to gain valuable quantitative insights into a method's performance over a large dataset. Furthermore, as we demonstrate in this section, by varying the number of regions on a shape, it is possible to gain an even greater understanding of how corresponding points are distributed over the target shape. See Alg. 1 for a detailed description of the implementation used.
Algorithm 1
    r ← 0
    for each seed point s ∈ S do
        if a correspondence lies in the given Voronoi cell then
            r ← r + Σ_i ((Q_i == s) · A_i) // sum area of the Voronoi cell
        end
    end
    r ← r / Σ_i A_i // normalise r by the total surface area
Region segmentation. A set of seed points $S$ on the target surface are selected using a geodesic-based farthest point sampling strategy. This helps ensure a reasonably evenly distributed sampling. To obtain discrete regions, a Voronoi segmentation $Q$ is subsequently computed using the initial seeds $S$. Fig. 6 illustrates the segmentation of a shape using successively greater numbers of seed points $n$.
Segmentation density. By varying $n$ such that $\{n \in \mathbb{N}^+ \mid n \le |V|\}$ a reasonably smooth and intuitive performance measure is extracted. When $n = 1$, the shape is unsegmented and a single region covers the whole mesh. If the surface has just one correspondence to any point, the output of Alg. 1 would be $r = 1$ (i.e., 100% coverage for $n = 1$). When $n = |V|$, the barycentric cell of each vertex is a discrete region. To achieve 100% coverage for $n = |V|$, each vertex must have an associated correspondence.
Precise correspondence. When handling correspondences with sub-vertex accuracy, each correspondence is associated with the closest discrete point on the shape's surface. The barycentric cell of each vertex is treated as the vertex neighbourhood, as shown in Fig. 5. Points within a given neighbourhood are assigned to that respective vertex.
Segment weighting. Since the initial sampling of seed points does not guarantee that the area of each segmented region is uniform, the regions are weighted by their respective area, i.e., $\sum_i (Q_i == s) \cdot A_i$, where $s$ identifies a unique seed point/region. Here, the segmentation classification $Q$ is a list of indices where $i$ refers to a specific vertex, $Q_i$ is the seed point $s$ that is closest to that vertex, and $A_i$ is the area associated with vertex $i$.
Computation time. Whilst the initial segmentation may be costly to produce, this approach allows the resulting segmentation to be cached and used for further comparison of correspondences with little additional computation. The most costly operation is computing geodesics for the distance map; the complexity of the popular method proposed by Kimmel and Sethian [55] is $O(|V|^2 \log |V|)$.
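The steps described above (seed selection, Voronoi labelling, and area-weighted summation) can be sketched as follows. This is a simplified illustration under loud assumptions: Euclidean distance between vertex positions stands in for geodesic distance, every vertex is given a precomputed area, and the function names are hypothetical. It is not the implementation used for the benchmark.

```python
import math

def farthest_point_sampling(points, n, start=0):
    """Pick n seed indices, each maximising its distance to the seeds chosen so far."""
    seeds = [start]
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(seeds) < n:
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        seeds.append(nxt)
        min_dist = [
            min(min_dist[i], math.dist(points[i], points[nxt]))
            for i in range(len(points))
        ]
    return seeds

def voronoi_labels(points, seeds):
    """Q: for each vertex i, the seed index closest to it."""
    return [min(seeds, key=lambda s: math.dist(p, points[s])) for p in points]

def coverage(labels, areas, matched_vertices):
    """Area of regions containing a correspondence, divided by the total area."""
    covered = {labels[v] for v in matched_vertices}
    covered_area = sum(a for q, a in zip(labels, areas) if q in covered)
    return covered_area / sum(areas)

# Eight vertices on a line with unit areas; only the two left-most are matched.
points = [(float(i), 0.0, 0.0) for i in range(8)]
areas = [1.0] * 8
matched = [0, 1]

coverage_by_n = {}
for n in (1, 2, 8):
    seeds = farthest_point_sampling(points, n)
    labels = voronoi_labels(points, seeds)
    coverage_by_n[n] = coverage(labels, areas, matched)
```

In this toy case the coverage falls as the segmentation becomes finer, because the matched vertices are clustered at one end of the shape. The labels depend only on the shape and the seeds, so they can be cached and reused for every method being compared.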
In Fig. 7, an example shape is used to illustrate different ways in which correspondences may be distributed using a set of synthetic points. Fig. 8 complements this, demonstrating the response of the coverage measure as the number of segments is varied. The shape of each curve depends on the type of correspondence computed.
In the case of a bijective mapping, the coverage value remains at 100% regardless of how fine the segmentation is. Note that this does not mean that the reported correspondences are correct, only that every point on the target surface has a point-to-point correspondence on the source shape. If a part of a shape is not matched (e.g., a leg), the metric drops off quickly as the number of regions increases, since the regions covering that part contain no correspondence; assuming the rest of the shape is successfully matched, the curve should then flatten to a gradient of zero. For methods that report evenly spaced sparse correspondences, the coverage should remain high until the Voronoi cells become fine enough to fall in between the sparse correspondences.

Results
In this section we discuss the performance of correspondence methods with respect to the two measures described in Section 5.

Surface coverage
In Fig. 9 we measure the coverage achieved by each method that completed all test-sets. It is important to note that, since a subset of the shapes in the dataset are partial, a coverage score of 100% cannot be maintained as the number of Voronoi cells on the surface increases. The curves of all methods decrease monotonically because all methods report dense correspondences; the sparsity of correspondences is therefore not a factor in these results. Based on the curve characteristics discovered in the example of the coverage measure in Section 5, Vestner et al. [45] exhibit the closest performance to full and dense coverage, with Ren et al. [47] performing second best. This is understandable, as both methods promote bijectivity and therefore produce both dense and well-distributed correspondences. There is a significant performance gap between these two methods and the others. This may indicate that the other methods do not promote, or do not strongly promote, bijectivity.
Figs. 10 & 11 report the coverage achieved by each method on each respective test-set. Results for the methods of Melzi et al. [52] and Marin et al. [50] on test-set 2 are shown in Fig. 10c. Both methods perform very well, with a relatively high level of surface coverage. The results for test-sets 1-4 suggest that most methods failed to establish correspondences for one or more parts of each target shape.

Fig. 12 reports cumulative geodesic error curves for methods that have completed all test-sets. The area under the curve (AUC) is also reported for each respective method. Overall, most methods appear to perform similarly, with Ren et al. [47] and Bouaziz and Pauly [42] performing the best. However, when the results of the coverage measure are also taken into account, Ren et al. [47] produce more desirable results, with greater coverage of the target shape.
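For readers unfamiliar with cumulative error curves, the sketch below shows one common way to compute such a curve and its AUC from a list of per-correspondence geodesic errors. The threshold values, the trapezoidal integration and the normalisation by the threshold range are assumptions made for illustration; the exact settings used for Fig. 12 are not specified here. The function names are hypothetical.

```python
from bisect import bisect_right

def cumulative_error_curve(errors, thresholds):
    """For each threshold, the fraction of correspondences whose
    geodesic error is less than or equal to that threshold."""
    ordered = sorted(errors)
    return [bisect_right(ordered, t) / len(ordered) for t in thresholds]

def area_under_curve(thresholds, fractions):
    """Trapezoidal area under the curve, divided by the threshold
    range so that a method with zero error everywhere scores 1.0."""
    area = 0.0
    for k in range(1, len(thresholds)):
        width = thresholds[k] - thresholds[k - 1]
        area += 0.5 * (fractions[k] + fractions[k - 1]) * width
    return area / (thresholds[-1] - thresholds[0])
```

A higher AUC means that more correspondences fall below small error thresholds, but it says nothing about how those correspondences are spread over the surface, which is why it is read alongside the coverage measure.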
In Fig. 13 results for each test-set are presented. Results for Melzi et al. [52] and Marin et al. [50] on test-set 2 are shown in Fig. 13c. These methods achieve superior correspondence accuracy in comparison to the fully-automatic methods.
Test-set 0 contains only partial-to-full scans. Bouaziz and Pauly [42] perform particularly well; this may be due to the shapes having a similar initial orientation, which is important for N-ICP-based methods. Several methods achieve higher levels of accuracy on test-set 2, which may be due to this test-set containing little topological change. With the exception of Ren et al. [47], most methods perform poorly on test-set 4. This is likely due to the higher degrees of non-isometry exhibited. The performance of Ren et al. [47] may be due in part to the use of a region-level correspondence method [48] for initialisation, since that correspondence method works particularly well on homogeneous shapes.

Conclusion
In this paper, a new benchmark dataset of non-isometric, non-rigidly deforming shapes has been proposed to evaluate the performance of shape correspondence methods. To ensure greater accuracy, ground-truth correspondences were established by asking multiple specialists to annotate the animals. The performance of a variety of methods was evaluated using this dataset. Whilst traditional measures of correspondence accuracy are useful, they do not show the full picture of a correspondence method's performance. To address this, a new measure of correspondence coverage has been developed. The coverage measure helps to quantitatively indicate the sparsity and distribution of correspondences. We find that Ren et al. [47] achieve the greatest accuracy, as well as a high degree of surface coverage, making it the overall best method in this scenario.
Both Melzi et al. [52] and Marin et al. [50] present semi-automatic methods that achieve superior accuracy compared to the fully-automatic methods. Though not evaluated here, the data-driven shape correspondence method of Groueix et al. [19] makes use of an out-of-the-box blend skinning model [8] for training. It would be interesting to see how well this method performs on this dataset, which contains animals that have not been observed in the trained model.
As the accessibility of 3D scene capturing tools increases, there is a greater need for high-quality datasets that may be used for benchmarking and training purposes. Restrictive copyrights on existing works make the curation of such datasets challenging. Websites such as Sketchfab and AIM@SHAPE-VISIONAIR Shape Repository provide simple licenses that make it clear what the original creator permits and may enable further datasets to be produced.