Abstract
This paper presents an efficient feature point descriptor for non-rigid shape analysis. The descriptor is based on the properties of the heat diffusion process on a shape. We use, for the first time, the Heat Kernel Signature at a particular time scale to define a scalar field on a manifold. Then, motivated by the successful use of local reference frames in rigid shape analysis, we construct a repeatable local polar coordinate system that is invariant under isometric deformations. Finally, a binary descriptor is derived by comparing the intensities of the neighboring points of each feature point. We show that the descriptor is highly discriminative and can be computed using simple ‘intensity comparisons’ on a shape. Furthermore, descriptor similarity can be evaluated using the Hamming distance, which is much cheaper to compute than the commonly used \(L_{2}\) norm. Our experiments demonstrate superior performance compared to existing techniques on the standard TOSCA benchmark.
This research is supported by China Scholarship Council (CSC No.201406070059) and Australian Research Council grants (DE120102960, DP150100294 and DP150104251).
1 Introduction
In 3D shape analysis, the extraction of feature descriptors, also known as local shape descriptors, is a fundamental step [18]. Early research on feature based methods mainly focused on rigid shape analysis [10, 11]. A large number of descriptors have been developed, such as the local surface patch [9], spin image [14] and rotational projection statistics [12]. The development of non-rigid descriptors is more challenging due to the many degrees of freedom introduced by local deformations. Several methods have been proposed, such as geodesic mapping [13] and conformal factors [17]. However, these methods are sensitive to topological and geometric noise, which are inevitable in these applications. In recent years, an intrinsic geometric framework known as diffusion geometry has become popular and has achieved the best performance [2, 5, 15, 18, 20, 22, 23, 27]. It is based on the spectral decomposition of the Laplace-Beltrami operator associated with a shape, and uses the eigenvalues and eigenvectors to construct the diffusion distance, which provides an intuitive interpretation of the shape properties in terms of spatial frequency [15]. This work falls within the diffusion geometry framework.
The Local Binary Descriptor (LBD) has attracted significant interest in the analysis of 2D images due to its computational simplicity and discriminative power [8, 28]. However, little effort has been made to extend the LBD framework to the field of 3D shape analysis [26]. The key to the development of an LBD for non-rigid shapes is the construction of a repeatable Local Reference Frame (LRF), because the LBD requires an intrinsic order for its computation. Several spatial structures have been proposed to support feature descriptors for non-rigid shape analysis. In [29], an LRF is constructed using the surface normal and two vectors which lie on the tangent plane (a common method used for rigid shape descriptors). This LRF is not invariant under non-rigid transformations, because the relative positions of the points change. In [7], ‘multiple circular geodesic pathways’ are defined on the local surface using a fixed number of increasing geodesic distances. Within each circle, points are sampled in a clockwise direction with respect to the surface normal. In [15], the local surface is charted by shooting geodesics outwards from the feature point to form a polar coordinate system, where the ‘angle’ is the geodesic shooting direction and the ‘radius’ is the geodesic distance. However, neither method resolves the orientation ambiguity that arises during the construction of the LRF. In [26], the local surface is modeled as a structure of ordered, concentric rings around the central facet based on the categorization of the facets on its contour. Since this LRF depends fully on the structure of the mesh, it is neither intrinsic nor robust.
In this paper, we develop a local descriptor for non-rigid shape analysis, called the Heat Diffusion based Local Binary Descriptor (HD-LBD). There are two main contributions in this paper. First, we construct a new repeatable Local Polar Coordinate System (LPCS) which is invariant under isometric transformations. Second, we develop a binary descriptor built on the LPCS for non-rigid shape analysis. Experiments were performed to demonstrate the effectiveness of the proposed method.
2 Background
Diffusion geometry is one of the most successful approaches for non-rigid shape analysis. Reuter et al. [20] exploited the Laplace-Beltrami spectra as an intrinsic shape descriptor called Shape-DNA. Rustamov [22] proposed the Global Point Signature (GPS) by associating each point with an \(l^{2}\) sequence formed by the eigenfunctions and the eigenvalues of the Laplacian. Sun et al. [23] developed the Heat Kernel Signature (HKS) based on the analysis of the heat diffusion process. Bronstein and Kokkinos [5] introduced the Scale Invariant Heat Kernel Signature (SI-HKS) using the Fourier transform. Kokkinos et al. [15] developed the Intrinsic Shape Context (ISC) by aggregating the HKS over the feature point’s local neighborhood to further improve its descriptiveness. In [2], the Wave Kernel Signature (WKS) was proposed based on a different physical model, which evaluates the probability of a quantum particle with a certain energy distribution being located at a point. Litman et al. [18] and Windheuser et al. [27] used machine learning techniques to learn a spectral descriptor for a specific task (e.g. human recognition). Our work builds on the idea of diffusion geometry. Unlike [5, 15, 23], we use the HKS at a particular scale to define the scalar field on a manifold.
A large number of LBDs exist in the field of 2D image analysis. Most of the comparison-based descriptors can be considered variants of the local binary pattern proposed in [19], where the intensities of some predefined pairs of neighboring pixels are compared to form a binary string for a feature point. Binary Robust Independent Elementary Features (BRIEF) [6], Binary Robust Invariant Scalable Keypoints (BRISK) [16], Oriented FAST and Rotated BRIEF (ORB) [21] and Fast Retina Keypoint (FREAK) [1] proposed various ways of sampling pixel pairs. Ordinal Spatial Intensity Distribution (OSID) [24] and the Local Intensity Order Pattern (LIOP) [25] incorporated spatial information to improve the LBD’s discriminative ability. The works in [8, 30] proposed using the full ranking of a set of pixels as a local descriptor, which is expected to encode the complete comparative information among pixels. The recent work in [26] proposed a framework to compute local binary-like patterns directly on a triangular mesh. However, that work is specifically developed for shapes with photometric and geometric information. To the best of our knowledge, we are the first to develop a local binary descriptor for non-rigid shape analysis.
3 Proposed Method
Suppose we are given a discrete representation of shape as a triangular mesh (V, E, T) with \(n_{V}\) vertices \(\{v_{1},\ldots ,v_{n_{V}}\}\), \(n_{E}\) edges \(\{(v_{i_{1}},v_{j_{1}}),\ldots ,(v_{i_{n_{E}}},v_{j_{n_{E}}})\}\) and \(n_{T}\) faces \(\{(v_{i_{1}},v_{j_{1}},v_{k_{1}}),\ldots ,(v_{i_{n_{T}}},v_{j_{n_{T}}},v_{k_{n_{T}}})\}\). Our goal is to derive a local shape signature that is invariant under isometric deformations.
An illustrative example of the proposed binary descriptor is given in Fig. 1. In essence, our Heat Diffusion based Local Binary Descriptor (HD-LBD) extends local binary descriptors from 2D images to 3D shapes. Therefore, the first step is the definition of scalar functions on manifolds [29] (see Sect. 3.1). Based on the definition of the scalar field, an intrinsic Local Polar Coordinate System (LPCS) is constructed around the feature point (see Sect. 3.2 for details). Finally, in Sect. 4, we aggregate the information of the neighboring surface to form the binary string.
3.1 Scalar Field Definition
We model shapes as Riemannian manifolds M (possibly with boundary) embedded in \(R^{3}\). Let g be the scalar field defined on M. The real valued function g represents the geometric or photometric information of the shapes. In this paper, we consider the heat diffusion property, explained below.
The heat diffusion process over M is governed by the heat equation,
\[\left( \triangle _{M} + \frac{\partial }{\partial t}\right) u(v,t) = 0 \tag{1}\]
where \(\triangle _{M}\) denotes the positive semi-definite Laplace-Beltrami operator of M, the Riemannian equivalent of the Laplacian. The solution u(v, t) describes the amount of heat on the manifold at point v at time t, given the initial condition u(v, 0). Since M is compact, \(u(v,t) = \int _{M}h_{t}(v,v')u(v',0)dv'\). Here \(h_{t}(v,v')\) is called the heat kernel, and can be thought of as the amount of heat transferred from v to v’ in time t given a unit heat source at v. According to the spectral decomposition theorem, the heat kernel can be expressed as
\[h_{t}(v,v') = \sum _{i\ge 0} e^{-\lambda _{i}t}\,\varPhi _{i}(v)\,\varPhi _{i}(v') \tag{2}\]
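where \(\lambda _i\) and \(\varPhi _i\) are the \(i^{th}\) eigenvalue and the corresponding eigenfunction of the Laplace-Beltrami operator. Restricting the heat kernel to the temporal domain, i.e. setting \(v'=v\), results in
\[h_{t}(v,v) = \sum _{i\ge 0} e^{-\lambda _{i}t}\,\varPhi _{i}^{2}(v) \tag{3}\]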
known as the heat kernel signature (HKS). This signature is not only concise and commensurable, but also informative and invariant to isometric deformations [23]. More importantly, it captures the geometric information of the local surface over a range of scales determined by the time parameter t, as shown in Fig. 2. In particular, for small values of t, it is related to the manifold curvature according to
\[h_{t}(v,v) = \frac{1}{4\pi t} + \frac{K(v)}{12\pi } + O(t) \tag{4}\]
where \(K(v) \) denotes the Gaussian curvature at point v.
Therefore, we adopt HKS over a small t to define the scalar field, which is supposed to reflect the intrinsic property of the local surface around the point v.
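As a minimal sketch of how this scalar field could be evaluated in practice, the truncated spectral sum for the heat kernel signature can be written as below. The function name `heat_kernel_signature` and the plain-list data layout are assumptions made for illustration; the eigenpairs are taken as given (the experiments in this paper obtain them with a finite element scheme), and the toy eigenpairs at the end belong to a two-vertex graph Laplacian, not to a real mesh.

```python
import math

def heat_kernel_signature(eigenvalues, eigenvectors, t):
    """Return h_t(v, v) for every vertex v from a truncated spectrum.

    eigenvalues:  list of k non-negative eigenvalues lambda_i
    eigenvectors: list of k lists, eigenvectors[i][v] is Phi_i at vertex v
    t:            diffusion time (the scale of the signature)
    """
    n_vertices = len(eigenvectors[0])
    hks = [0.0] * n_vertices
    for lam, phi in zip(eigenvalues, eigenvectors):
        weight = math.exp(-lam * t)  # high frequencies decay faster
        for v in range(n_vertices):
            hks[v] += weight * phi[v] * phi[v]
    return hks

# Toy input (assumption): eigenpairs of the graph Laplacian [[1, -1], [-1, 1]].
s = 1.0 / math.sqrt(2.0)
toy_hks = heat_kernel_signature([0.0, 2.0], [[s, s], [s, -s]], t=0.5)
```

With only the first few hundred eigenpairs available, the sum is an approximation that is most accurate for larger t, since the omitted terms decay as \(e^{-\lambda _{i}t}\).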
3.2 Intrinsic Local Reference Frame
Given a feature point v and a support radius r (defined using the geodesic metric), a local surface M’ is cropped from the mesh M. \(V_{N}=\{v_{i1},\ldots ,v_{ik}\}\) are the points lying on M’, and \(N_{1}(v)\) is the set of directly connected vertices to v, called 1-ring neighborhood. The construction of the Local Polar Coordinate System (LPCS) involves two steps: first to find the reference direction and then to chart the surface (see below for details).
Reference Direction. The first and key step in constructing the LPCS is to find its reference direction, whose accuracy directly determines the repeatability of the coordinate system. We adopt the intensity centroid method [16], which is used to describe the orientation of a key point in the image domain. This method assumes that a corner’s intensity is offset from its centre, and that the vector from the centre to the intensity centroid can be used as an orientation. Specifically, the moments of a patch P are defined as
\[m_{pq} = \sum _{(x,y)\in P} x^{p}y^{q}I(x,y) \tag{5}\]
where (x, y) are the Cartesian coordinates of a pixel, and I(x, y) is its intensity. With these moments, the centroid can be found at
\[C = \left( \frac{m_{10}}{m_{00}},\frac{m_{01}}{m_{00}}\right) \tag{6}\]
The orientation of the patch is assumed to be
\[\theta = \operatorname{atan2}(m_{01},m_{10}) \tag{7}\]
In our method, we use \(N_{1}(v)\) to compute the reference direction. In our case, the intensities of the pixels are replaced by the values of \(h_t(v)\) (the previously defined scalar function on the shape). The coordinates of \(N_{1}(v)\) are approximated as follows: first, we map v’s 1-ring triangles onto the plane, partitioning it into several segments while preserving the ratios between the angles at v; then a rectangular coordinate system centred at v is constructed. Thus, the coordinates of all vertices belonging to \(N_{1}(v)\) are obtained.
Finally, we derive the reference direction of the local polar coordinate system (the face \(T_{i}\) on which the reference direction lies and the deviation angle \(\theta _{i}\) from the triangle edge), denoted as R on the 1-ring triangles and R’ on the mapped plane.
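The reference direction step can be sketched as follows, under several stated assumptions: the 1-ring neighbours are given in ring order, the corner angles at v are rescaled so that they sum to \(2\pi \) (one way of keeping the angle ratios), the first neighbour is placed on the x-axis of the flattened plane, and the orientation is taken from the first-order moments \(m_{10}=\sum xI\) and \(m_{01}=\sum yI\). The function name and argument layout are hypothetical.

```python
import math

def reference_direction(edge_lengths, corner_angles, intensities):
    """Estimate the reference direction in the flattened 1-ring of v.

    edge_lengths[i]:  length of the edge from v to neighbour i (ring order)
    corner_angles[i]: mesh angle at v between edge i and edge i+1 (cyclic)
    intensities[i]:   scalar field value (e.g. HKS) at neighbour i
    Returns (theta, face_index, deviation), all measured in the plane.
    """
    scale = 2.0 * math.pi / sum(corner_angles)  # keep angle ratios
    polar = [0.0]                               # neighbour 0 on the x-axis
    for angle in corner_angles[:-1]:
        polar.append(polar[-1] + angle * scale)
    # First-order moments with the scalar field as "intensity".
    m10 = sum(r * math.cos(p) * w
              for r, p, w in zip(edge_lengths, polar, intensities))
    m01 = sum(r * math.sin(p) * w
              for r, p, w in zip(edge_lengths, polar, intensities))
    theta = math.atan2(m01, m10) % (2.0 * math.pi)
    # Find the flattened triangle (sector) that contains the direction.
    face_index = len(polar) - 1
    for i in range(len(polar) - 1):
        if polar[i] <= theta < polar[i + 1]:
            face_index = i
            break
    deviation = theta - polar[face_index]  # angle from the sector's first edge
    return theta, face_index, deviation
```

The choice of which neighbour lies on the x-axis is arbitrary: rotating the planar frame rotates the centroid vector by the same amount, so the face index and the deviation angle, which are the quantities mapped back to the mesh, do not depend on it.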
Surface Charting. A mesh can be viewed as a piece-wise planar approximation of the underlying smooth surface. Using the standard unfolding procedure in [3], the local surface made up of triangles can be transformed into an unevenly sampled image patch. Similar to [15], the construction of the local polar coordinate system consists of two steps: direction initialization and propagation. The initial directions are established by first mapping the 1-ring triangles onto the plane, partitioning the plane into several segments of equal angle starting from the reference direction, and finally mapping the directions back to the mesh. The directions could be ordered either clockwise or counter-clockwise. To resolve this ambiguity, we adopt a simple yet practical solution similar to [26]: the direction on the mapped plane nearest to the reference direction is chosen as the next one. In this way, all the directions (Fig. 1(c)) are ordered in a uniform way. Afterwards, the initial directions are propagated outwards from the 1-ring (using the standard unfolding procedure [3]) until they reach the boundary of the ‘image patch’ defined by the radius r. Thus, the LPCS (Fig. 1(d)) is constructed, where the ‘reference direction’ is R and its extension, the ‘angle’ is the angle between the geodesic shooting direction and R, and the ‘radius’ is the geodesic distance from v.
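The elementary operation behind unfolding is to place the third vertex of a neighbouring triangle in the plane once the shared edge has been placed. The sketch below shows only that single step, using edge lengths alone; it is a generic geometric construction rather than the exact procedure of [3], and the function name and the `side` parameter are assumptions for illustration.

```python
import math

def unfold_vertex(a, b, len_ac, len_bc, side=1.0):
    """Place vertex c of triangle (a, b, c) in the plane.

    a, b:    2D positions of the already unfolded shared edge
    len_ac:  mesh edge length |a - c|
    len_bc:  mesh edge length |b - c|
    side:    +1.0 puts c to the left of the direction a -> b, -1.0 to the right
    """
    ex, ey = b[0] - a[0], b[1] - a[1]
    d = math.hypot(ex, ey)
    ex, ey = ex / d, ey / d                      # unit vector along the edge
    # Projection of c onto the edge, from the law of cosines.
    x = (len_ac ** 2 - len_bc ** 2 + d ** 2) / (2.0 * d)
    # Height above the edge; clamp tiny negatives caused by rounding.
    y = math.sqrt(max(len_ac ** 2 - x ** 2, 0.0))
    nx, ny = -ey, ex                             # left-hand normal of the edge
    return (a[0] + x * ex + side * y * nx,
            a[1] + x * ey + side * y * ny)
```

Repeating this step across each edge that a propagated direction crosses, always placing the new vertex on the side opposite the triangle already unfolded, lays out a strip of triangles in which the direction is a straight line.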
4 Local Binary Descriptor
With the previously defined scalar function (Sect. 3.1) and the constructed local polar coordinate system (Sect. 3.2), the local surface M’ around vertex v can be regarded as an image patch, on which the local binary descriptor is defined. In the case of 2D image analysis, a number of ways are used to extract point pairs for the construction of a local binary descriptor, such as [1, 6, 16, 21]. In our method, we propose the bit vector (defined in Eq. 9) based on all pairwise intensity comparisons, which turned out to be highly discriminative (Sect. 5.3).
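To be specific, we define a test \(\tau \) of a point pair \(N_{i}(p_{i1},p_{i2})\) on the local surface M’ as
\[\tau (N_{i}) = \begin{cases} 1, & h_{t}(p_{i1}) < h_{t}(p_{i2}) \\ 0, & \text{otherwise} \end{cases} \tag{8}\]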
where \(h_{t}(p_{i})\) is the value of heat kernel signature with parameter t at \(p_{i}=(\rho _{i},\theta _{i})\).
The test in Eq. 8 considers only the information at a single point \(p_{i}\) in the neighborhood of v, and is therefore quite noise-sensitive. In order to increase the stability and repeatability of the descriptor, we include all the points falling into each bin of the local polar coordinate system, and use the average of their intensities as a unit for test (the same approach used in [28]).
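A minimal sketch of this binning step is given below, assuming each point of the local surface already carries its polar coordinates and scalar value, that rings are `rad` wide, and that rays split the full angle evenly starting at the reference direction. The function name, the bin ordering (ring-major) and the use of `None` for empty bins are illustrative choices; the text does not specify how empty bins are handled.

```python
import math

def bin_averages(points, n_rings, n_rays, rad):
    """Average the scalar field inside each bin of the polar grid.

    points:  iterable of (rho, theta, value) with rho the geodesic radius,
             theta the angle from the reference direction (radians)
    Returns a list of n_rings * n_rays averages, ring-major; empty bins
    are reported as None (an assumption of this sketch).
    """
    sums = [0.0] * (n_rings * n_rays)
    counts = [0] * (n_rings * n_rays)
    sector = 2.0 * math.pi / n_rays
    for rho, theta, value in points:
        ring = int(rho // rad)
        if ring >= n_rings:
            continue  # outside the support radius
        ray = int((theta % (2.0 * math.pi)) // sector) % n_rays
        index = ring * n_rays + ray
        sums[index] += value
        counts[index] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```

For example, with `n_rings=2`, `n_rays=4` and `rad=2`, two points at radii 0.5 and 0.7 with small angles fall in the same bin and are averaged, while a point at radius 10 is ignored.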
The choice of the set of location pairs \(N_{i}(p_{i1},p_{i2})\) uniquely defines a set of binary tests. In our method, we propose a bit string (binary) descriptor with dimension \(n_{d}\) equal to the cardinality of the set of pairs \(N_{i}(p_{i1},p_{i2})\) as
\[f_{n_{d}}(v) = \sum _{1\le i\le n_{d}} 2^{i-1}\,\tau (N_{i}) \tag{9}\]
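The construction of the bit string and its comparison with the Hamming distance can be sketched as follows. This assumes that one bit is produced for every unordered pair of bins and that a bit is set when the first value is smaller than the second (the convention used by BRIEF for images); both are assumptions of this sketch, as are the function names.

```python
from itertools import combinations

def binary_descriptor(bin_values):
    """Pack all pairwise comparisons of the bin values into an integer.

    Bit i is set when the first value of the i-th pair (in the order
    produced by itertools.combinations) is smaller than the second.
    """
    bits = 0
    for i, (first, second) in enumerate(combinations(bin_values, 2)):
        if first < second:
            bits |= 1 << i
    return bits

def hamming_distance(desc_a, desc_b):
    """Number of differing bits between two descriptors."""
    return bin(desc_a ^ desc_b).count("1")

# Toy values (assumption): three bins give three comparisons.
d1 = binary_descriptor([0.1, 0.5, 0.3])
d2 = binary_descriptor([0.5, 0.1, 0.3])
distance = hamming_distance(d1, d2)
```

Because only the order of the values matters, any strictly increasing transformation of the scalar field leaves the descriptor unchanged, and the distance reduces to an XOR followed by a bit count.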
5 Experiments
The experiments that we carried out have three main goals. First, we examined the repeatability of the proposed local reference frame. Second, we compared our proposed descriptor with other state-of-the-art techniques to show its effectiveness. Finally, we examined the effect of the parameters on the performance of the descriptor.
5.1 Dataset
The performance of our binary descriptor was evaluated on the TOSCA dataset [4]. We followed the experimental protocol in [18] using human shapes (12 female shapes in class ‘victoria’, and 2 different male figures containing 7 and 20 poses in classes ‘david’ and ‘michael’ respectively). In each class, an extrinsically symmetric ‘null’ shape undergoes near-isometric deformations. Objects within the same class have the same triangulation and an equal number of vertices numbered in a compatible way, which can be used as a per-vertex ground-truth correspondence. Each shape typically has about 50000 vertices. In order to reduce the computational load and storage complexity, all the shapes were downsampled to 10000 vertices, maintaining compatible triangulations and ground-truth correspondences. We used the finite element scheme in [20] to obtain the first 400 eigenvalues and eigenvectors of the Laplace-Beltrami operator on each shape. Then, the heat kernel signature was computed at a particular scale according to Fig. 2.
5.2 Performance of the Local Polar Coordinate System
To assess the uniqueness of the LPCS, we measured the repeatability of its key component, the reference direction (it determines the initial directions, and the coordinate system is obtained by propagating the initial directions across adjacent triangles using the standard unfolding procedure). The performance is evaluated in two respects: the percentage of reference directions lying in the same face and, for those that do, the angular error. We randomly sampled 1000 points on the ‘null’ shape, and extracted the corresponding points on the other shapes within each class. The reference directions were computed and compared for each point. Following Sect. 3.1, we chose small values of t for the computation. The results are summarised in Table 1 and Fig. 3, respectively. The best performance was achieved at t = 10, which we used to compute the scalar field in the rest of the experiments.
5.3 Performance of the Binary Descriptor
We used a quantitative criterion, the Cumulative Match Characteristic (CMC), to evaluate the performance of the descriptor. The CMC curve evaluates the probability of finding the correct match within the first k best matches. The hit rate at k is calculated by sorting the distances in ascending order and computing the fraction of points whose correct match is among the k nearest. We first extracted 500 points from the null shapes using furthest point sampling with the geodesic metric, and then found the corresponding points on the deformed shapes. We set the geodesic distance between rings to rad = 2 and used 5 rings and 8 rays, and compared with the method in [18] using the code made available by the authors. From Fig. 4 we can observe that our proposed descriptor greatly improves the performance, especially within the first few matches.
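The CMC computation described above can be sketched as follows, assuming a square distance matrix in which query i on the null shape has its ground-truth match at candidate i on the deformed shape. The function name and the tie handling (a tie does not count against the correct match) are assumptions of this sketch.

```python
def cmc_curve(distances, max_rank):
    """Cumulative Match Characteristic from a query-by-candidate matrix.

    distances[i][j]: descriptor distance between query i and candidate j;
                     the correct match of query i is candidate i.
    Returns a list whose entry k-1 is the hit rate at rank k.
    """
    hits = [0] * max_rank
    for i, row in enumerate(distances):
        # 0-based rank of the correct match among all candidates.
        rank = sum(1 for d in row if d < row[i])
        for k in range(rank, max_rank):
            hits[k] += 1
    return [h / len(distances) for h in hits]

# Toy matrix (assumption): the correct matches are ranked 1st, 2nd and 3rd.
toy_cmc = cmc_curve([[0, 5, 9], [4, 3, 1], [2, 7, 8]], max_rank=3)
```

With binary descriptors, each entry of the matrix would be a Hamming distance, so the whole matrix can be filled with integer operations only.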
5.4 Parameter Selection
Our descriptor has three free parameters: the number of rings, the number of rays and the geodesic distance between rings (rad), which together determine the area of each bin in the Local Polar Coordinate System. Here, we set rad = 2. We varied the remaining parameters to study their effect on the descriptor’s performance. The evaluations are shown in Fig. 5 with varying numbers of rings (nRings = 2, 3, 4, 5 and 6) and rays (nRays = 5, 6, 7, 8 and 9); nRays = 8 in the first experiment and nRings = 4 in the second. It can be observed that the performance improves as the number of rings increases, and then remains almost the same. On the other hand, with an increasing number of rays, the performance also improves to some extent, but drops afterwards. In general, with more rings and rays, more information can be captured by the descriptor. However, the performance will degrade if there are too many rays, due to the low mesh resolution and sensitivity to noise. A high number of rings will include redundant information of the local surface and will greatly increase the descriptor’s dimension.
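To make the growth in dimension concrete, the following worked arithmetic assumes that every unordered pair of bins contributes one bit, which is one reading of ‘all pairwise intensity comparisons’; the symbol B for the number of bins is introduced here only for illustration.

\[B = \text{nRings}\times \text{nRays},\qquad n_{d} = \binom{B}{2} = \frac{B(B-1)}{2}\]

\[\text{nRings}=4,\ \text{nRays}=8:\quad B=32,\ n_{d}=496\]

\[\text{nRings}=5,\ \text{nRays}=8:\quad B=40,\ n_{d}=780\]

\[\text{nRings}=6,\ \text{nRays}=8:\quad B=48,\ n_{d}=1128\]

Under this assumption the dimension grows quadratically with the number of bins, so each additional ring adds more bits than the previous one.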
6 Conclusion
We introduced a binary descriptor equipped with an intrinsic local polar coordinate system to the field of non-rigid shape analysis. The descriptor is based on the analysis of the heat diffusion process defined on manifolds. Since the binary descriptor requires an ordered support for its computation, we constructed a local reference frame designed to be intrinsic to the shape, robust and repeatable. Our experiments demonstrate its effectiveness.
References
1. Alahi, A., Ortiz, R., Vandergheynst, P.: FREAK: fast retina keypoint. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 510–517. IEEE (2012)
2. Aubry, M., Schlickewei, U., Cremers, D.: The wave kernel signature: a quantum mechanical approach to shape analysis. In: ICCV Workshops, pp. 1626–1633. IEEE (2011)
3. Bronstein, A.M., Bronstein, M.M., Kimmel, R.: Efficient computation of isometry-invariant distances between surfaces. SIAM J. Sci. Comput. 28(5), 1812–1836 (2006)
4. Bronstein, A.M., Bronstein, M.M., Kimmel, R.: Numerical Geometry of Non-rigid Shapes. Springer Science & Business Media, Berlin (2008)
5. Bronstein, M.M., Kokkinos, I.: Scale-invariant heat kernel signatures for non-rigid shape recognition. In: CVPR, pp. 1704–1711. IEEE (2010)
6. Calonder, M., Lepetit, V., Strecha, C., Fua, P.: BRIEF: binary robust independent elementary features. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 778–792. Springer, Heidelberg (2010)
7. Castellani, U., Cristani, M., Murino, V.: Statistical 3D shape analysis by local generative descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 33(12), 2555–2560 (2011)
8. Chan, C.H., Yan, F., Kittler, J., Mikolajczyk, K.: Full ranking as local descriptor for visual recognition: a comparison of distance metrics on \({\rm s}_{\rm n}\). Pattern Recogn. 48(4), 1328–1336 (2015)
9. Chen, H., Bhanu, B.: 3D free-form object recognition in range images using local surface patches. Pattern Recogn. Lett. 28(10), 1252–1262 (2007)
10. Guo, Y., Bennamoun, M., Sohel, F., Lu, M., Wan, J.: 3D object recognition in cluttered scenes with local surface features: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 36(11), 2270–2287 (2014)
11. Guo, Y., Bennamoun, M., Sohel, F., Lu, M., Wan, J., Kwok, N.M.: A comprehensive performance evaluation of 3D local feature descriptors. Int. J. Comput. Vis. 1–24 (2015)
12. Guo, Y., Sohel, F., Bennamoun, M., Lu, M., Wan, J.: Rotational projection statistics for 3D local surface description and object recognition. Int. J. Comput. Vis. 105(1), 63–86 (2013)
13. Hamza, A.B., Krim, H.: Geodesic object representation and recognition. In: Nyström, I., Sanniti di Baja, G., Svensson, S. (eds.) DGCI 2003. LNCS, vol. 2886, pp. 378–387. Springer, Heidelberg (2003)
14. Johnson, A.E., Hebert, M.: Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Trans. Pattern Anal. Mach. Intell. 21(5), 433–449 (1999)
15. Kokkinos, I., Bronstein, M.M., Litman, R., Bronstein, A.M.: Intrinsic shape context descriptors for deformable shapes. In: CVPR, pp. 159–166. IEEE (2012)
16. Leutenegger, S., Chli, M., Siegwart, R.Y.: BRISK: binary robust invariant scalable keypoints. In: ICCV, pp. 2548–2555. IEEE (2011)
17. Lipman, Y., Funkhouser, T.: Möbius voting for surface correspondence. In: ACM Transactions on Graphics (TOG), vol. 28, p. 72. ACM (2009)
18. Litman, R., Bronstein, A.M.: Learning spectral descriptors for deformable shape correspondence. IEEE Trans. Pattern Anal. Mach. Intell. 36(1), 171–180 (2014)
19. Ojala, T., Pietikäinen, M., Harwood, D.: A comparative study of texture measures with classification based on featured distributions. Pattern Recogn. 29(1), 51–59 (1996)
20. Reuter, M., Wolter, F.-E., Shenton, M., Niethammer, M.: Laplace-Beltrami eigenvalues and topological features of eigenfunctions for statistical shape analysis. Comput.-Aided Des. 41(10), 739–755 (2009)
21. Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: ORB: an efficient alternative to SIFT or SURF. In: ICCV, pp. 2564–2571. IEEE (2011)
22. Rustamov, R.M.: Laplace-Beltrami eigenfunctions for deformation invariant shape representation. In: Proceedings of the Fifth Eurographics Symposium on Geometry Processing, pp. 225–233. Eurographics Association (2007)
23. Sun, J., Ovsjanikov, M., Guibas, L.: A concise and provably informative multi-scale signature based on heat diffusion. In: Computer Graphics Forum, vol. 28, pp. 1383–1392. Wiley Online Library (2009)
24. Tang, F., Lim, S.H., Chang, N.L., Tao, H.: A novel feature descriptor invariant to complex brightness changes. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009, pp. 2631–2638. IEEE (2009)
25. Wang, Z., Fan, B., Wu, F.: Local intensity order pattern for feature description. In: 2011 IEEE International Conference on Computer Vision (ICCV), pp. 603–610. IEEE (2011)
26. Werghi, N., Berretti, S., Del Bimbo, A.: The mesh-LBP: a framework for extracting local binary patterns from discrete manifolds (2015)
27. Windheuser, T., Vestner, M., Rodola, E., Triebel, R., Cremers, D.: Optimal intrinsic descriptors for non-rigid shape analysis. In: BMVC (2014)
28. Yang, X., Cheng, K.-T.: Local difference binary for ultrafast and distinctive feature description. IEEE Trans. Pattern Anal. Mach. Intell. 36(1), 188–194 (2014)
29. Zaharescu, A., Boyer, E., Horaud, R.: Keypoints and local descriptors of scalar functions on 2D manifolds. Int. J. Comput. Vis. 100(1), 78–98 (2012)
30. Ziegler, A., Christiansen, E., Kriegman, D., Belongie, S.J.: Locally uniform comparison image descriptor. In: Advances in Neural Information Processing Systems, pp. 1–9 (2012)
© 2016 Springer International Publishing Switzerland
Cite this paper
Wang, X., Sohel, F., Bennamoun, M., Lei, H. (2016). Binary Descriptor Based on Heat Diffusion for Non-rigid Shape Analysis. In: Bräunl, T., McCane, B., Rivera, M., Yu, X. (eds) Image and Video Technology. PSIVT 2015. Lecture Notes in Computer Science(), vol 9431. Springer, Cham. https://doi.org/10.1007/978-3-319-29451-3_59
DOI: https://doi.org/10.1007/978-3-319-29451-3_59
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-29450-6
Online ISBN: 978-3-319-29451-3
eBook Packages: Computer Science, Computer Science (R0)