EUCLIDEAN SKELETONS USING CLOSEST POINTS

. In this paper, we present an eﬃcient algorithm for computing the Euclidean skeleton of an object directly from a point cloud representation on an underlying grid. The key point of this algorithm is to identify those grid points that are (approximately) on the skeleton using the closest point information of a grid point and its neighbors. The three main ingredients of the algorithm are: (1) computing closest point information eﬃciently on a grid, (2) identifying possible skeletal points based on the number of closest points of a grid point and its neighbors with smaller distances, (3) applying a distance ordered homotopic thinning process to remove the non-skeletal points while preserving the end points or the edge points of the skeleton. Computational examples in 2D and 3D are presented.

1. Introduction. The skeleton is a compact representation of an object. It has been used in object representation, identification and recognition. The interest of the skeleton as a representation for an object rises from several nice properties: [23,31] (1) it is homotopic to the original shape, (2) it is invariant under Euclidean transformations, and (3) the object can be reconstructed based on the distance associated with each skeletal point on the skeleton.
Several definitions of the skeleton of an object have been proposed in previous work. For examples, 1. the skeleton consists of the points which have at least two closest points from the boundary of this object [1]. 2. the skeleton consists of the centers of all the interior maximal circles (2D) or maximal spheres (3D) [10]. 3. the skeleton consists of the points where different firefronts (from the boundary) intersect [8]. 4. the skeleton is the loci of the singularities of the distance transform [8].
A lot of work has been done to compute the skeleton based on different approaches. Some approaches are based on a discrete representation of the geometry, i.e., point clouds. The Voronoi diagram based methods [26,28,29,30] and the Delaunay triangulation based methods [15,25] are typically used in this situation. These methods usually preserve the homotopic property pretty well and locate the skeletal points accurately when the data set is dense enough. However, in practice, the results are not invariant under Euclidean transformations, and the computational complexity can be relatively high. Other approaches are based on a continuous representation of the geometry, i.e., curves or surfaces, and the skeleton is defined in terms of the singularities of the distance map one way or the other. For instance, methods based on the Blum's grassfire formulation [3,9,19,24] attempt to simulate the grassfire transform, peeling off layers from an object while keeping the potential skeletal points, but they fail to localize the skeletal points accurately. Methods based on the internal flux or the singularity [4,16,21,20] attempt to detect the local maxima or the corresponding discontinuous derivative of the distance map. The Hamilton-Jacobi skeleton [31,33] uses the limiting property of the average outward flux of the vector field or the momentum field corresponding to the underlying Hamiltonian system, and combines it with a flux-ordered homotopic thinning process by thresholding the flux. The Euclidean skeleton [23] uses two measures to characterize the singularities and combines them with a topological reconstruction. In these methods the skeleton is defined as the singularities of the distance map to the boundary or similar formulations. Typically, the distance map is computed on a grid and the singularities of the distance map have to be detected through the gradient of the computed distance map. Although the singularities of the distance map are well defined theoretically as the discontinuities of the gradient, it is quite subtle to numerically detect the singularities from the computed distance map and its gradient. In practice, this makes it difficult if impossible to develop numerical procedures for both accurate and robust detection of the singularities. Different formulations are proposed in the above referenced works and the references therein to address this difficulty. Algorithms based on the potential field, the repulsive force field or the gradient vector flow for extracting curve-skeletons were proposed in [11,12,17] and references therein.
In this work we develop an algorithm for computing the Euclidean skeleton directly from point clouds on a rectangular grid. Here is a brief summary of our motivations and some key ideas: • Point cloud representations have become more and more popular in real applications. Our method deals with discrete data sets directly to avoid an ill-posed reconstruction of a continuous representation, which can be expensive, difficult and can introduce artifacts. • The skeleton is computed on an underlying grid. The data structure and algorithm are very simple. • The detection of the skeletal points is based on the closest point information of a grid point and its neighbors with smaller distances. No gradient information of the distance map is used. An efficient algorithm based on the fast sweeping method [35,36] is available to compute the distance and the closest point information on a grid. • A distance ordered homotopic thinning process [27] is developed. The information of the closest points is used in removing the non-skeletal points while preserving the end or the edge points of the skeleton.
Another direct application of our method is to compute the skeletons for shapes obtained from image segmentation. A binary segmentation, i.e., a classification of all pixels into interior or exterior points, can be used. Our algorithm can take either interior boundary pixels or exterior boundary pixels as the discrete data points and construct the skeleton based on the pixels. Examples are shown in Section 4.
Here is the outline of this paper. In section 2 we present the fast sweeping algorithm for computing the distance and the closest point information on a rectangular grid. In section 3 we present our algorithm with some motivations and explanations. In section 4 we present computational examples.
Before continuing on, we recall some definitions and notations from digital topology [34,22,18  2. The closest point solver. In this section, we present an efficient algorithm [13,35,36] for computing the closest points and the distance map to a data set on a rectangular grid. Our algorithm for computing the skeleton from point clouds relies on the closest point information and the distance on the grid (see Section 3). Let Γ be a set of discrete points that represents the boundary of an object Ω. For a point p ∈ Ω, a point p ∈ Γ is closest to p if dist(p, p ) = min q∈Γ dist(p, q) p may not be unique if p is a skeletal point. Once such a closest point is found, the distance u(p) can be computed easily as Theoretically, the distance map can also be defined as the viscosity solution of the Eikonal equation: For the above nonlinear hyperbolic partial differential equation (PDE), information (e.g. boundary conditions and closest points) propagates through the characteristics that are defined through the characteristic system for the Eikonal equation (1) [14], and the distance value increases along the characteristics from the boundary. In a discrete setting, the fast sweeping algorithm utilizes this observation to compute the viscosity solution and the closest points. All propagation directions can be divided into a few groups which can be followed efficiently by a few alternating orderings. Here is the algorithm [13,35,36]: Algorithm (the closest point solver)

Initialization:
Put each data point into a grid cell. Assign one closest point and the corresponding distance to each vertex of all the grid cells containing at least one data point.

Sweeping:
Go over the whole grid and update the closest point and the distance at each point in the following way: the closest point information and the distance at a grid point is updated through the closest point information and the distance of its neighbors. Iterate this sweeping using different orderings until no further change is detected.
Here are more detailed explanations, referring to Figure 1. We give explanations in 2D, extension to 3D is straightforward. In the initialization step, for a grid cell  containing some data points, each vertex of this cell is assigned a closest point and the distance to this closest point. See Figure 1(a), we only needs to go through the list of all data points once in the following way: when going through the list, for a data point s 0 in a grid cell with vertices s 1 , s 2 , s 3 and s 4 , each of the four vertices may have already been assigned a closest point, say s 1 , s 2 , s 3 and s 4 respectively. We determine for each vertex, whether s i (i = 1, 2, 3, 4) should be replaced by s 0 simply by calculating the distance to s 0 and choosing the minimum one compared to old distance that has already been assigned at this vertex. Hence the complexity of this step is proportional to the number of the data points.
In the sweeping step, see Figure 1(b), for a center grid point p 0 and all its neighbors p 1 , p 2 , ... and p n , each of them may have already been assigned a closest point p 0 , p 1 , p 2 , ... and p n respectively. In order to update the information at p 0 , we calculate the values u i = dist(p 0 , p i ) for i = 0, 1, 2, ..., n, if u j = min i=0,1,2,...,n u i for some 0 ≤ j ≤ n, we update p 0 = p j and u(p 0 ) = u j .
The sweeping will converge in a finite number of iterations [36]. Typically, 4 iterations in 2D and 8 iterations in 3D are enough. Hence the complexity of this step is proportional to the total number of the grid points.
A key idea of the above algorithm is to propagate the closest point relation, which is a global information with respect to the data set, through neighbors efficiently from those grid points near the data set to the whole grid by alternating sweepings. However, the above closest point solver may not be exact for two reasons: • The data set is not well resolved by the underlying grid. For example, if there are more data points than the vertices in a grid cell, some data points will be missed during the initialization step. • A finite number of neighboring grid points aligned with a finite number of the directions cannot cover all closest point configurations [13]. The error analysis of the computed closest point and distance value has been given in [35,36].
With the information of the closest point and the distance value available for each grid point we will design criteria to select skeletal points based on this information in the following section.
3. Motivation and algorithm. The starting point of our algorithm is: for a given point, by counting the number of distinct closest points of its neighbors with smaller distances and itself, one can detect the convergence of different characteristics and hence the possible skeletal points. This observation is directly related to the definition of the skeleton. The reason why we only choose neighbors with smaller distances is because of the causality of the distance map observed through the characteristics, i.e., information propagates from points with smaller distances to those with larger distances along the characteristics.
However, since the shape is described by discrete points, we need to distinguish the spurious skeletal points, which are of equal distance to neighboring data points, from the real skeletal points. We use an example in 2D for the illustration purpose. By the definition, each skeletal point has at least two closest points from the boundary. For example, in Figure 2  are s and s . The characteristics emanating from s and s meet at s. In other words, any point on these two characteristics also has s or s as one of its closest points respectively. This information can be passed along the characteristics from the boundary to the interior. One natural criterion for a true skeletal point is that the separation distance between s and s should not be too small, e.g., larger than the distance between two neighboring data points. This partially describes a skeletal point. But the separation distance is not a dimensionless quantity, i.e., it is not scale invariant. For example, Figure 2(b) shows that it may not work well. The separation distances are very different at points s 0 and s 1 . In practice, only the separation distance is not enough to distinguish the skeletal and non-skeletal points. We also apply the ratio d/2 u to classify skeletal points, where d is the largest distance between any two closest points and u is the distance at s. This ratio is an approximation to the so-called bisector function in [23,6,5]. As in Figure 2(b), the ratio at s 0 is comparable to that at s 1 . The above argument also implies that the algorithm can detect a possible skeletal point that has more than two closest points as long as we can detect at least two of them that are well separated compared to the grid size. Hence our closest point solver, which may not be exact, should be a good approximation for this purpose.
After we apply the closest point solver, each point will be assigned a closest point and a distance value. Then we define the Neighbor-Closest Point (N CP ). 2 ) (k ≤ 5 in 2D, k ≤ 7 in 3D).
As discussed above, at a skeletal point s, #N CP (s) ≥ 2. Therefore the number, #N CP , can be used to detect possible skeletal points. All the points with #N CP ≥ 2 will be located on or near the Voronoi boundary of Γ. Therefore, by identifying all points with #N CP ≥ 2, we locate the real skeletal points as well as Voronoi boundaries of Γ in a discrete sense. In 2D, the set of all Voronoi vertices is a good approximation of the skeleton when the data set is dense [1,2]. Referring to Figure  3, if we remove all secondary branches (shown in black), the rest of the Voronoi boundary is a well-defined skeleton (principal branches shown in red). In 3D, not all Voronoi vertices contribute to the approximation of the skeleton, instead, only a subset of the Voronoi vertices gives an approximation of the skeleton [1,2].
In a discrete setting, simply locating all points with #N CP ≥ 2 results in a thick set containing spurious branches, holes and cavities. So in order to avoid getting holes or cavities by locating points with #N CP ≥ 2, we apply a distance ordered homotopic thinning process [27] and design rules to remove non-skeletal points and preserve skeletal points.  Figure 4 shows a few important observations after we apply the closest point solver for a data set on a rectangular grid: (a) shows the "Voronoi diagram" after locating all points with #N CP ≥ 2, which has holes (enlarged in (b)), and the #N CP at each grid point. We see that points on secondary branches as well as points on the skeleton may have #N CP ≥ 2 due to discrete nature of the data set. So #N CP ≥ 2 is not a sufficient criterion for skeletal points. (c) shows the separation distance d at each point. The separation distance d of a point on a secondary branch is much smaller than that of a point on a principal branch, which depends on the sampling of the data points. And (d) shows the ratio d/2 u at each grid point. In particular we have (1) each point in the "interior" of a Voronoi cell has #N CP = 1. (2) each Voronoi vertex or end point of the skeleton has #N CP ≥ 3 and relatively high ratio d/2 u , (3) each point on a principal branch has #N CP ≥ 2 and the ratio d/2 u is relatively high and pretty uniform, (4) most points on a secondary branch have #N CP ≤ 2 and the separation distance d (∼ the separation distance of neighboring data points) or the ratio d/2 u is small. More subtle points are those close to a Voronoi vertex (Figure 4(a)). Due to converging characteristics, a fixed resolution of the underlying grid and the property of the closest point solver on a discrete setting, some of these points may have #N CP ≥ 3 and relatively large separation distance d as well as large ratio d/2 u . Our strategy is to efficiently locate the Voronoi boundary, and remove secondary branches while retaining principal branches by preserving Voronoi vertices when they become end points. For this purpose we construct a distance ordered homotopic thinning process [27] which removes the non-skeletal points while preserving the end points of the skeleton and hence the skeleton. The thinning process consists of two steps. The first step is to remove simple points that are in the "interior" of Voronoi cells (#N CP = 1), and simple points on or near secondary branches with either #N CP ≤ 2, small d or small d/2 u . The remaining set consists of points on or near the skeleton, i.e. a little bit thicker set containing the skeleton. So the next step is a post-process to further remove non-skeletal points. Here is our 2D algorithm with explanations.
2D Algorithm step 1 All the points are ordered according to the distance. This set is denoted as S.
step 2 According to the ascendent ordering, go over all the points in S. For each point p in S, if p is simple: • repeat the process until no further points are removed. Then move to next step.  u and d. The set of Voronoi vertices is a good approximation of the skeleton if the data set is dense [1,2]. Theoretically, an end point has at least three closest points, and a Voronoi vertex has at least three closest points too. Therefore, when an end point has #N CP ≥ 3, relatively large d/2 u and d, it is a possible skeletal end point. The difference between step 2 and 3 is how to deal with simple but not end points. In step 3, if a point is a simple but not an end point, it is removed immediately. But in step 2, the removal of such a point is more cautious. It also depends on its #N CP , d/2 u and d. When a point has #N CP ≥ 3, relatively large d/2 u and d, it is very likely to be a skeletal point. In particular, if it is an end point, removal of it may cause damage of the whole skeleton in successive thinning process. For example, see Figure 5, p and q are two possible skeletal points, u(p) ≤ u(q), but #N CP (p) ≥ 3, #N CP (q) ≤ 2. The thinning process reaches p first since it is distance-ordered. p is a simple point but not an end point. It should be removed immediately if without checking its #N CP , d/2 u and d. After p is removed, q is removed because #N CP (q) ≤ 2, then the whole branch may be removed by #N CP ≤ 2. The purpose of step 2 is to remove q but retain p, thus the whole branch is preserved. The resulting set of step 2 may still have simple but not end points left, so step 3 serves as a post-process of step 2 to remove such points (e.g. Example 1, Figure 7 in section 4). Step 2 and 3 may be repeated. Remark 1. One important issue of our grid based skeleton construction for point clouds is the grid size, which affects the resolution, accuracy and complexity of the algorithm. To construct the skeleton for a uniformly distributed data set with best possible resolution and accuracy, the grid size should be totally determined by the sampling density of the data set, i.e, the grid size should be comparable to separation distance between neighboring points. However, an advantage of our approach is that coarser grid may be used for faster construction of a low resolution skeleton approximation. If the data is non-uniform the grid size should be adaptive to the local sampling density ideally, which is not done in this paper. For applications to image segmentation, a natural choice is the underlying image grid/pixel.

Remark 2. Our measures d/2
u and d are similar to those used in [23,6,5]. In general the thresholds for d/2 u and d depend on sampling density, feature size and noise level. For example, d should be comparable to the separation distance between neighboring data points or feature/perturbation size we would like to filter out from the data set. d/2 u is an approximation of the bisector angle. An ideal choice should be adaptive to local sampling density, feature size and noise level. If the data contains noise, these thresholds can be used also as low pass filter and should depend on noise level. For example, small features as well as noise will be smeared if we choose large threshold for d, i.e., small shocks due to small perturbations will be absorbed into large shocks a little bit further away. Computational examples are presented in section 4 to show the results with different thresholds.
Remark 3. In 2D, #N CP is very useful for the removal of most points on or near secondary branches of the Voronoi diagrams. In particular, when data points are non-uniform, the separation distance criteria d varies a lot and d/2 u is always large for points on secondary branches that are close to the boundary. The characterization of the end and the edge points in 3D is more complicated than that in 2D. In [27], criteria are designed to detect the end points of a curve and the edge points of a surface: a point is an end point of a curve if it has less than two 26-neighbors, and a point is an edge point of a surface if it is an end point of a curve on any of the 9 digital planes shown in Figure 6. For details please refer to [27] and references therein. The following is our 3D algorithm. The idea and strategy are similar to those in the 2D algorithm: remove the non-skeletal points but preserve the skeletal points by keeping the skeletal end and edge points during the thinning process.
3D Algorithm step 1 All the points are ordered according to the distance. This set is denoted as S. step 2 According to the ascendent ordering, go over all the points in S. • else if p is an edge point, • repeat the process until no further points are removed. Then move to next step. step 3 According to the ascendent ordering, go over all the points in S. For each point p in S, if p is simple: • if p is an end point or has 2 26-neighbors in S: • repeat the process until no further points are removed. The 3D algorithm is an extension of the 2D algorithm. In 3D, the classification of simple points including end and edge points is more complicated. Moreover, only a subset of the Voronoi vertices contributes to the skeleton [1,2]. Theoretically, an edge point of the skeleton has at least three closest points and an end point of the skeleton has at least five closest points. However, some Voronoi vertices that are not on the skeleton may have the same properties. So, the way to distinguish such points depends on d/2 u and d.  Remark 5. Since only a subset of the Voronoi vertices contributes to the approximation of the skeleton in 3D, #N CP doesn't serve as a strong measure as in 2D to distinguish skeletal and non-skeletal points anymore, especially for edge points. 4. Numerical examples. In this section, we use a few examples to demonstrate our algorithm 1 . Both clean data and noisy data are used. For clean data we give points {p i } on the real boundary of an object. For noisy data we use perturbed points, {q i } = p i + e i where e i is some random perturbation. We show both computed skeleton and reconstruction from the skeleton, i.e., the union of disks (2D) or balls (3D). Reconstruction from noisy data shows some denoising effect which is due to thresholding, e.g., the threshold for the separation distance. Programs are coded with VC++ 6.0, MS Windows XP, Intel(R)Core(TM)2CPU T5500 @ 1.66GHz 980MHz, 1.00GB of RAM.   Figure 8 shows the same example with perturbations on the clean data, that is, the noisy data points (x, y) = (x 0 , y 0 (1 + 0.05 * rand{−1, 1})) with (x 0 , y 0 ) the clean data points, where rand{−1, 1} represents random uniform distribution on [−1 1]. Different thresholds are chosen to show the comparisons. The closest point solver and the 2D skeleton algorithm take CPU time much less than one second. Example 2: Another example with both clean and noisy data points is presented. The object is represented by 200 points. The computation is performed on a 200 × 200 grid. Figure 9 shows the results. First we use clean data (left), then we add noise to this data set with above method. Two different sets of noisy data points are used (middle and right). We show the computed skeletons (top row) and the corresponding reconstructions (bottom row). Example 3: In this example, we use two shapes from image segmentation to illustrate our 2D algorithm. Figure 10 and 11 show the results. The image segmentations 2 are obtained by Chan-Vese active contour model. For our method we don't need an explicit representation of the contour. Instead we only use a binary  classification of pixels based on any segmentation method and use those exterior (or interior) boundary pixels as our discrete data points. The computation is performed on sub-images that consist of 500 × 500 pixels and 300 × 300 pixels respectively. For the two images, both the closest point solver and the 2D skeleton algorithm take CPU time much less one second.
Note that the skeleton for the horse in Figure 10 has feet connected due to shadows in the original image and the particular image segmentation. On the other hand, some thinner shadows are removed by the thresholds.  Example 1: Figure 12 shows an example with clean data (7510 points). Figure  15 shows the same example with noisy data (7510 points). Different thresholds are chosen for comparisons. CPU times for both the closest point solver and the 3D skeleton algorithm are presented. The number of skeletal points left after each step is recorded. Table 1 shows the corresponding choices of different thresholds, number of points after step 2 and 3, CPU times (seconds) for closest point solver and 3D skeleton solver. The computation is performed on a 100 × 100 × 100 grid. Example 2: In this example, we show another object represented by 6450 scanned data points. Different thresholds are chosen to compute the skeletons. Figure 13 shows the computed skeletons and corresponding reconstructions. The computation is performed on a 23 × 83 × 257 grid. Example 3: Figure 14 shows a computational example with both clean and noisy data sets (9070 points). The computation is performed on a 100 × 100 × 100 grid.  the object, skeletal points after step 2 and step 3, and reconstruction.

5.
Conclusion. An efficient algorithm is proposed for computing the Euclidean skeleton of an object represented by point clouds on an underlying rectangular grid. The algorithm is based on a distance ordered homotopic thinning process. Information of the closest points is used to determine possible skeletal points and design rules to remove non-skeletal points but preserve skeletal points, especially the    Table 1. Data of Figure 15: choices of different thresholds, points after step 2 and 3, CPU times (seconds) for closest point solver and 3D skeleton algorithm.   Table 1.