Segmentation of 3D meshes combining the artificial neural network classifier and the spectral clustering

3D mesh segmentation has become an essential step in many applications of 3D shape analysis. In this paper, we propose a new segmentation method based on a learning approach that combines an artificial neural network classifier with spectral clustering. First, an artificial neural network is trained on existing segmentations, taken from the ground-truth segmentations (produced by human operators) available in the benchmark proposed by Chen et al., to extract the candidate boundaries of a given 3D model based on a set of geometric criteria. Then, we use the resulting knowledge to construct a new connectivity of the mesh and apply spectral clustering to segment the 3D mesh into significant parts. Our approach was evaluated using several evaluation metrics. The experiments confirm that the proposed method yields good results and outperforms some of the competitive segmentation methods in the literature.


Introduction
3D mesh segmentation is a crucial preprocessing step in many fields of 3D object understanding and analysis. It has recently received significant attention as a suitable solution to many problems in computer vision. In general, segmentation subdivides an object into constituent parts that share common properties, which can be geometric or semantic. Mesh segmentation methods fall into two main categories. Geometric (surface-type) methods usually use geometric properties of the mesh surface to extract surface patches. Semantic (part-type) methods aim to obtain the meaningful, or semantic, regions of the mesh [1,2,3].
In the last decades, several 3D segmentation techniques have been developed [3]. Among them, spectral clustering methods [4,5] and learning approaches [6] are the most relevant and offer several beneficial features in practical applications: they make the formulation of the problem more flexible and the computation more efficient. Graph-cut-based segmentation methods cast the segmentation problem as a graph partitioning problem. These approaches compute an affinity matrix that encodes the local connectivity of the input mesh, and then rely on the eigenvalues and eigenvectors of this matrix, or of the Laplacian matrix, to solve the partitioning problem. Learning approaches, in contrast, aim to segment an object in a way similar to the human brain, based on a training step.
Machine learning methods proceed in two distinct steps. In the offline step, a classifier is trained on a set of already segmented objects. In the online step, an input mesh is segmented based on the knowledge produced in the offline step. An extended study of these methods can be found in [3]. Despite many years of research, 3D mesh segmentation is still a very challenging task. In this regard, we propose a new 3D mesh segmentation approach that combines an artificial neural network classifier with spectral clustering. The classifier learns, from a set of geometric criteria computed on ground-truth boundary faces, which criteria are the most relevant for selecting candidate boundary faces. These boundaries are then used to encode a new connectivity of the mesh, and spectral clustering is applied to segment the 3D mesh into significant parts.
The rest of this paper is structured as follows. Section II presents a brief state of the art of 3D mesh segmentation algorithms. Section III describes the proposed method. The experimental results are discussed in section VI. Finally, we present a conclusion and some perspectives.

Related works
The problem of 3D segmentation has been well studied in several fields. Many methods, based on a diversity of algorithms, have been proposed in the literature to address 3D mesh segmentation. A recent extensive review of the field can be found in [3]. In the remainder of this section, we briefly review some of the existing 3D mesh segmentation methods.
In [7], the authors proposed a mesh segmentation method based on improved region growing. The sharp edges are detected, and feature lines are extracted using the dihedral angle. Then the region growing process is applied, and a post-processing step based on geometric criteria merges the resulting regions. Zuckerberger et al. [8] also proposed a method for mesh segmentation based on region growing. After the dual graph of the mesh is constructed, the region growing process for computing a new segment starts by selecting a seed element randomly and continues to collect nodes (faces) that form a convex part until convexity is violated. In a final step, the small parts are merged into larger ones. In the same work, the authors also presented a watershed decomposition algorithm. First, all the local minima are found and labeled, and the plateaus are defined. Then, a watershed process loops through the plateaus and allows each one to descend until a labeled region is found. The unlabeled vertices similarly descend until they join labeled parts. A post-processing merging step handles over-segmentation. The work in [9] describes an approach based on Gaussian curvature and concaveness. The authors develop an extended multi-ring neighborhood and a fast marching watershed-based segmentation algorithm. In the last step, a region-merging scheme is applied based on region size and the boundary length of the adjacent patches. The method in [10] presents a clustering approach based on the K-means algorithm. First, the authors define the K seed faces by maximizing the pairwise distances between them, based on the dihedral angle and the geodesic distance. Once the cluster centers are determined, every face is attached to its nearest seed face, and the cluster centroids are recalculated by minimizing the sum of distances between the faces of a segment and its centroid. This process is repeated iteratively until convergence. The result of this approach is sensitive to the initial choice of the cluster centers.
To overcome the limitations of the previous methods, researchers have turned to spectral clustering algorithms for 3D segmentation. Spectral analysis is one of the most widely used methods for data analysis, and graph-clustering methods have consequently been applied to 3D segmentation. The main idea of these methods is to partition an object by computing the eigenvectors and eigenvalues of the Laplacian matrix of a graph that is built from the input mesh and expresses its shape properties. Liu and Zhang [11] were the first to use this technique to segment 3D meshes. Based on the curvature and geodesic distances, the affinity matrix is constructed to encode the concavity between faces. Then a spectral method is applied to the eigenvectors of this matrix to generate the segmentation of the mesh. Another spectral method, in [12], introduces a new definition of weak convexity, based on the computation of inner visibility between points on the surface of the shape. Using this definition, the authors define a spectral method that partitions a given mesh into weakly convex parts. In addition, Chahhou et al. proposed another approach [4], based on the minima rule [13], to encode the local connectivity between faces, and then applied the Normalized Cheeger Cuts to determine the best cut of the object. In [14], the authors introduce a fully unsupervised method for mesh segmentation driven by heterogeneous graphs, in which they develop a spectral technique where local geometry affinities are linked with surface patch affinities. A heterogeneous graph is then created by merging the weighted graph based on the adjacency of patches of an initial over-segmentation with the weighted dual mesh graph.
Recently, in [5], our research team proposed a new approach for 3D mesh segmentation that takes into account the concave and convex regions, using the dihedral angles and negative curvatures to generate the adjacency matrix and spectral clustering as the partitioning criterion.
Encouraged by the success of learning algorithms in different fields, learning methods have been introduced in the 3D segmentation area. These approaches learn from a number of already segmented objects, which can be found in the existing ground-truth datasets, and this knowledge is then used to segment an input mesh in the online step. The authors in [15] propose a learning-based 3D mesh segmentation and labeling method in which each face is labeled using a Conditional Random Field model. An objective function is learned from a collection of labeled training meshes using the JointBoost classifier, and the labeling is determined by optimizing the total energy using alpha-expansion graph cuts, which then yields the extracted parts. Benhabiles et al. [6] introduce a method based on a learning approach. The authors apply the AdaBoost classifier, trained with a set of already segmented meshes and a set of geometric properties of the mesh, to define candidate boundaries. Then a post-processing step closes the resulting boundaries and optimizes them using a snake movement algorithm. The work in [16] presents another method for machine-learning-based 3D segmentation and labeling. The difference from the previous approach is that the training step takes into account not only segmented, labeled meshes, but also unsegmented meshes. The learning step is based on Virtual Evidence Boosting, which involves belief propagation that takes neighborhoods and dihedral angles into consideration to diffuse labels, and LogitBoost to perform feature selection.
The proposed work is also related to learning-based 3D mesh segmentation. First, as an offline step, a function is learned using artificial neural networks and a set of characteristic criteria of already segmented 3D meshes taken from the existing ground-truth databases. This knowledge is then used to determine the candidate boundaries of the input mesh in the online step. Finally, the Normalized Cheeger Cuts is applied to get the best cuts from the resulting feature faces.

Background
In this section, we present an overview of the two techniques used, namely the artificial neural network classifier and the spectral clustering method.
The neural network classifier: Artificial neural networks (ANNs) are computer-based algorithms and a branch of artificial intelligence closely modeled on the human brain. They are built on the behavior of biological neurons and can be trained to execute tasks. ANNs are used in many application areas such as clustering, pattern recognition, classification, and many others [17]. They are composed of units called neurons, connected to each other to determine the behavior of the network. The learning process updates the network connection weights so that the network can efficiently achieve a specific clustering or classification task.
There are multiple types of neural networks. The most popular is the feedforward network, which comprises multilayer perceptron and Radial-Basis Function (RBF) networks [18]. This type of neural network uses a supervised training process, which means that the desired output is known and the weight coefficients are adjusted so that the calculated and the desired outputs are as close as possible. Another well-known type of neural network, employed for clustering, is the Kohonen network, also called the Self-Organizing Map (SOM) [19]. It uses unsupervised training, where the desired output is not available and learning relies on guidance obtained heuristically by the system from examining different sample data.
The spectral clustering method: Given a set of points and a similarity measure between pairs of points $p_i$ and $p_j$, the data points can be transformed into a weighted graph $G$, whose vertices $V$ are the points and whose edges carry the similarity between pairs of points. The goal of clustering is to partition $V$ into subsets such that points in the same cluster are similar and points in different clusters are dissimilar. Furthermore, the clusters should be balanced in terms of size.
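Let us consider the similarity matrix $W$ defined as:
$$W_{ij} = \begin{cases} w_{ij} & \text{if } v_i \text{ and } v_j \text{ are adjacent} \\ 0 & \text{otherwise} \end{cases} \tag{1}$$
where $w_{ij}$ represents the weight of the edge connecting two vertices $v_i$ and $v_j$.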
Graph Laplacians: The graph Laplacian is a matrix representation of a given graph, and spectral graph theory consists in studying this matrix, since it holds much useful information about the graph. The graph Laplacian $L$ is calculated by:
- Unnormalized graph Laplacian: $L = D - W$;
- Normalized graph Laplacian: $L = I - D^{-1} W$.
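As an illustration, the two Laplacians can be computed as in the following minimal Python sketch. NumPy, the function name `graph_laplacians` and the toy weights are assumptions introduced for this example only, and every vertex is assumed to have a non-zero degree.

```python
import numpy as np

def graph_laplacians(W):
    """Return (D, L_unnormalized, L_normalized) for a symmetric weight matrix W."""
    W = np.asarray(W, dtype=float)
    degrees = W.sum(axis=1)                       # d_i = sum_j w_ij
    D = np.diag(degrees)                          # degree matrix
    L_un = D - W                                  # L = D - W
    L_rw = np.eye(len(W)) - W / degrees[:, None]  # L = I - D^{-1} W
    return D, L_un, L_rw

# Hypothetical 4-vertex path graph 0-1-2-3 with hand-picked weights.
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
D, L_un, L_rw = graph_laplacians(W)
```

Each row of both Laplacians sums to zero, which is why the constant vector is always an eigenvector with eigenvalue 0.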
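where $D$ and $I$ are the degree matrix and the identity matrix, respectively. The goal is to cluster the mesh into two dissimilar regions $C_1$ and $C_2$ by minimizing the following function [20,21]:
$$\mathrm{cut}(C_1, C_2) = \sum_{i \in C_1,\, j \in C_2} w_{ij} \tag{2}$$
A graph partition is efficient when function (2) is minimized. However, this function only takes into account the connections between clusters and ignores the density distribution inside each cluster, which leads to unbalanced clusters. To overcome this problem and obtain more balanced clusters, Hagen and Kahng presented the ratio cut criterion [22], which introduces the cluster size to balance the clusters while minimizing the similarity between them, and Shi and Malik [23] proposed the normalized cut, which uses the volume of the clusters:
$$\mathrm{RCut}(C_1, C_2) = \frac{\mathrm{cut}(C_1, C_2)}{|C_1|} + \frac{\mathrm{cut}(C_1, C_2)}{|C_2|}$$
$$\mathrm{NCut}(C_1, C_2) = \frac{\mathrm{cut}(C_1, C_2)}{\mathrm{vol}(C_1)} + \frac{\mathrm{cut}(C_1, C_2)}{\mathrm{vol}(C_2)}$$
where $|C|$ represents the cardinality of the cluster $C$ and $\mathrm{vol}(C)$ is the volume of the set $C$, calculated as the sum of the weights of all edges attached to vertices in $C$. A slightly different balancing behavior is obtained with the ratio Cheeger cut $\mathrm{RCC}(C_1, C_2)$ and the normalized Cheeger cut $\mathrm{NCC}(C_1, C_2)$, expressed by:
$$\mathrm{RCC}(C_1, C_2) = \frac{\mathrm{cut}(C_1, C_2)}{\min\{|C_1|, |C_2|\}}$$
$$\mathrm{NCC}(C_1, C_2) = \frac{\mathrm{cut}(C_1, C_2)}{\min\{\mathrm{vol}(C_1), \mathrm{vol}(C_2)\}}$$
It is known from graph theory that the solution of the relaxed Laplacian graph-partitioning problem is given by the eigenvector corresponding to the second smallest eigenvalue of $L$. The eigenvalues can be found by solving the generalized eigenvalue problem $Lv = \lambda Dv$ [4,22].
As spectral clustering is connected with the Laplacian matrix, Hein et al. defined the standard graph Laplacian $\Delta_2$ as the operator that induces the following quadratic form:
$$\langle f, \Delta_2 f \rangle = \frac{1}{2} \sum_{i,j} w_{ij} (f_i - f_j)^2$$
where $f$ represents the eigenvector of the Laplacian matrix. Generalizing the Laplacian operator to $\Delta_p$ gives:
$$\langle f, \Delta_p f \rangle = \frac{1}{2} \sum_{i,j} w_{ij} |f_i - f_j|^p$$
$\Delta_p$ is the p-Laplacian. As with the graph Laplacian, there are a normalized and an unnormalized p-Laplacian, $\Delta_p^{(n)}$ and $\Delta_p^{(u)}$, defined as follows:
$$(\Delta_p^{(u)} f)_i = \sum_{j} w_{ij}\, \phi_p(f_i - f_j)$$
$$(\Delta_p^{(n)} f)_i = \frac{1}{d_i} \sum_{j} w_{ij}\, \phi_p(f_i - f_j)$$
where $d_i = \sum_j w_{ij}$ is the degree of vertex $v_i$ and $\phi_p$ is defined for each $x \in \mathbb{R}$ by:
$$\phi_p(x) = |x|^{p-1}\, \mathrm{sign}(x)$$
Our main interest in using the graph p-Laplacian is the generalized isoperimetric inequality of Amghibech [24], which relates the second eigenvalue of the graph p-Laplacian to the optimal Cheeger cut.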
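The criteria above can be checked numerically on a small graph. The following Python sketch is only an illustration for the standard case p = 2, not the p-Laplacian solver used for the Normalized Cheeger Cuts. It evaluates the cut, ratio cut, normalized cut and both Cheeger cuts of a bipartition, and obtains the bipartition from the sign of the second eigenvector of $Lv = \lambda Dv$. NumPy, all function names and the toy graph (two triangles joined by one weak edge) are assumptions made for the example.

```python
import numpy as np

def partition_scores(W, c1, c2):
    """Cut, ratio cut, normalized cut and both Cheeger cuts of a bipartition."""
    cut = W[np.ix_(c1, c2)].sum()            # total weight of edges between C1 and C2
    degrees = W.sum(axis=1)
    vol1, vol2 = degrees[c1].sum(), degrees[c2].sum()
    return {
        "cut": cut,
        "rcut": cut / len(c1) + cut / len(c2),
        "ncut": cut / vol1 + cut / vol2,
        "rcc": cut / min(len(c1), len(c2)),
        "ncc": cut / min(vol1, vol2),
    }

def spectral_bipartition(W):
    """Split vertices by the sign of the second eigenvector of L v = lambda D v."""
    degrees = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degrees)
    # D^{-1/2} (D - W) D^{-1/2} is symmetric and has the same eigenvalues
    # as the generalized problem L v = lambda D v.
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)          # eigenvalues in ascending order
    second = d_inv_sqrt * vecs[:, 1]         # back to the generalized eigenvector
    return np.flatnonzero(second >= 0), np.flatnonzero(second < 0)

# Hypothetical toy graph: triangles {0,1,2} and {3,4,5} linked by a weak edge 2-3.
W = np.zeros((6, 6))
for i, j, w in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 0.1)]:
    W[i, j] = W[j, i] = w

c1, c2 = spectral_bipartition(W)
scores = partition_scores(W, c1, c2)
```

On this graph the two triangles are recovered as the two clusters, and only the weak edge is cut.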

Our proposed approach
In this section, we detail the two main phases of our approach. First, we extract the candidate boundary faces, where the cut should occur, using the artificial neural network classifier, which we train on a set of segmented models using several geometric criteria in order to select those that detect the boundary faces. Second, we use the boundaries given by the trained artificial neural network to encode a new connectivity of the mesh and apply spectral clustering to segment the 3D mesh.
In our approach, we use feedforward networks, which have many advantages. First, they can give good results when trained on a relatively sparse set of data, which makes this kind of neural network faster, and they often provide the right output for inputs that are not in the training set. Second, the backpropagation training algorithm can often find a good set of weights, since it consists in propagating the output error from the output layer back through the hidden layers to the input layer, adjusting the connection weights until the error value is minimized.
We have evaluated different network architectures to find an optimum solution to our problem. As mentioned above, the networks used are all based on the multilayer feedforward backpropagation model and are made up of four layers: an input layer, two hidden layers, and an output layer. The number of input neurons simply matches the dimension of the input features, which correspond to the values of the chosen geometric criteria for each face. The number of neurons in the two hidden layers is chosen empirically after several tests, while the output takes the value 1 or 0 to represent the detected boundaries. Each layer is fully connected to the next, and each unit uses a sigmoid activation function.
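The architecture described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not our actual implementation: the hidden-layer sizes (10 and 5), the learning rate, the mean squared error loss, the batch gradient descent update, the class name `BoundaryMLP` and the synthetic training data are all hypothetical choices made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class BoundaryMLP:
    """Input -> hidden -> hidden -> output, fully connected, sigmoid units."""

    def __init__(self, n_inputs=7, hidden=(10, 5), seed=0):
        rng = np.random.default_rng(seed)
        sizes = [n_inputs, *hidden, 1]       # hidden sizes are an assumption
        self.weights = [rng.normal(0.0, 0.5, (a, b))
                        for a, b in zip(sizes[:-1], sizes[1:])]
        self.biases = [np.zeros(b) for b in sizes[1:]]

    def forward(self, X):
        """Return the activations of every layer, the input included."""
        activations = [X]
        for W, b in zip(self.weights, self.biases):
            activations.append(sigmoid(activations[-1] @ W + b))
        return activations

    def train_step(self, X, y, lr=0.5):
        """One backpropagation step on the mean squared error; returns the loss before the update."""
        acts = self.forward(X)
        out = acts[-1]
        err = out - y.reshape(-1, 1)
        loss = float(np.mean(err ** 2))
        # Error signal at the output layer (sigmoid derivative is a * (1 - a)).
        delta = (2.0 / len(X)) * err * out * (1.0 - out)
        for i in reversed(range(len(self.weights))):
            grad_W = acts[i].T @ delta
            grad_b = delta.sum(axis=0)
            if i > 0:
                # Propagate the error to the previous layer with the old weights.
                delta = (delta @ self.weights[i].T) * acts[i] * (1.0 - acts[i])
            self.weights[i] -= lr * grad_W
            self.biases[i] -= lr * grad_b
        return loss

    def predict(self, X):
        """1 = boundary face, 0 = non-boundary face."""
        return (self.forward(X)[-1].ravel() >= 0.5).astype(int)

# Synthetic stand-in for per-face feature vectors and boundary labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 7))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

net = BoundaryMLP()
losses = [net.train_step(X, y) for _ in range(300)]
labels = net.predict(X)
```

In practice the input rows would be the seven geometric criteria of each face, and the targets would come from the ground-truth boundaries.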
We have used the benchmark of Chen et al. to train and test our artificial neural network. First, we took a training set of 3D meshes from the corpus of Chen et al. to construct the input of our ANN. For each 3D mesh, we first extract the segmentation boundaries of its ground-truth segmentations, which were produced by human operators; figure 1 shows some examples of these reference segmentations. Then we create, for each face, a vector of geometric criteria that we associate with its output value. The output in our case is 0 or 1, where 1 denotes a boundary face and 0 a non-boundary face. After training, the ANN takes as input the vector of geometric criteria of each face of the 3D mesh and is able to detect its boundary faces. The input parameters are computed for each face of the 3D mesh. We choose seven geometric criteria: the shape diameter [25]; the dihedral angle between two adjacent faces, which we divide by two to get a scalar value for each face; and several kinds of curvature [26], namely the principal curvatures $k_1$ and $k_2$, which are the fundamental description of surface curvature, the shape index computed by $\frac{2}{\pi}\arctan\left(\frac{k_1 + k_2}{k_1 - k_2}\right)$, the curvedness calculated by $\sqrt{(k_1^2 + k_2^2)/2}$, and the Gaussian curvature given by $k_1 \times k_2$.
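The three curvature-derived criteria can be computed directly from the principal curvatures, as in the short Python sketch below. It assumes the standard Koenderink form of the shape index, $\frac{2}{\pi}\arctan\frac{k_1+k_2}{k_1-k_2}$ with $k_1 \ge k_2$. The function name and the use of `atan2` to handle the case $k_1 = k_2$ are choices made for this example.

```python
import math

def curvature_features(k1, k2):
    """Return (shape_index, curvedness, gaussian_curvature) from principal curvatures."""
    k1, k2 = max(k1, k2), min(k1, k2)        # enforce k1 >= k2
    # atan2 equals arctan of the ratio when k1 > k2 and stays defined when k1 == k2
    # (it returns 0 for a planar point, which is a convention of this sketch).
    shape_index = (2.0 / math.pi) * math.atan2(k1 + k2, k1 - k2)
    curvedness = math.sqrt((k1 ** 2 + k2 ** 2) / 2.0)
    gaussian = k1 * k2
    return shape_index, curvedness, gaussian

sphere_like = curvature_features(1.0, 1.0)    # both curvatures equal
saddle_like = curvature_features(1.0, -1.0)   # opposite curvatures
```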
We have trained our ANN on three datasets taken from the benchmark of Chen et al., which consists of 19 classes of 20 3D objects each. The first training dataset is built from 5 3D models chosen randomly from each category, the rest of the meshes being used for testing; the second dataset is built from 10 objects from each class, and the last from 15 objects. We found that the results of the first dataset are less relevant than those of the second. The second dataset, in turn, gives results comparable to those of the third, even though the third trains the ANN on a larger number of 3D models. We therefore kept the ANN trained on 10 3D objects from each class, which reduces the execution time while still giving significant results. Figure 2 shows some examples of the 3D mesh boundaries obtained by our artificial neural network.
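The three training configurations can be reproduced with a simple per-class random split, sketched below in Python. The model identifiers, the function name `split_benchmark` and the fixed seed are hypothetical; only the counts (19 classes, 20 models per class, 5, 10 or 15 training models per class) come from the text.

```python
import random

def split_benchmark(models_by_class, n_train, seed=0):
    """Pick n_train models at random from every class; the rest form the test set."""
    rng = random.Random(seed)
    train, test = [], []
    for name in sorted(models_by_class):
        models = list(models_by_class[name])
        chosen = set(rng.sample(range(len(models)), n_train))
        for idx, model in enumerate(models):
            (train if idx in chosen else test).append(model)
    return train, test

# Hypothetical identifiers standing in for the 19 x 20 benchmark models.
benchmark = {
    f"class_{c:02d}": [f"class_{c:02d}/model_{m:02d}" for m in range(20)]
    for c in range(19)
}
splits = {n: split_benchmark(benchmark, n) for n in (5, 10, 15)}
```

With 10 models per class, the training and test sets each contain 190 meshes.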

Fig. 2. An example of the boundaries detected by our artificial neural network classifier
The spectral clustering: Spectral clustering is one of the well-known segmentation techniques for 3D meshes, since it highlights global shape properties using the local connectivity. It relies on the eigenvalues and eigenvectors of the adjacency matrix constructed for the graph representing the input mesh. In our previous work, we proposed a new segmentation method using spectral clustering in which the affinity matrix is constructed by combining the minimal principal curvature and the dihedral angles to detect both the concave and convex properties of each edge. In this work, we use the boundaries given by our proposed neural network to construct the adjacency matrix for the spectral clustering.
Defining a new adjacency matrix: In our proposed method, we create the adjacency matrix by grouping faces instead of vertices. Since the affinity matrix denotes the likelihood that faces i and j can be grouped into the same segment, we construct the affinity matrix from the dual graph, using the results of the proposed ANN, which detects the boundary faces of a given 3D mesh. Using these boundaries to encode the affinity between faces leads to a meaningful segmentation.
Let us consider M(V, F, E), a mesh consisting of a set of vertices V, faces F, and edges E between faces.
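To make the face-based graph concrete, the Python sketch below builds the dual graph of a triangle mesh (two faces are linked when they share a mesh edge) and fills a face-to-face weight matrix. The weighting rule shown, a low weight when either face is flagged as a boundary face and a high weight otherwise, together with the values 0.1 and 1.0 and all function names, is a purely hypothetical placeholder for illustration and should not be read as the affinity definition of our method.

```python
from collections import defaultdict

def dual_graph_edges(faces):
    """Return sorted pairs (i, j), i < j, of triangles that share a mesh edge."""
    edge_to_faces = defaultdict(list)
    for f_idx, (a, b, c) in enumerate(faces):
        for u, v in ((a, b), (b, c), (c, a)):
            edge_to_faces[(min(u, v), max(u, v))].append(f_idx)
    pairs = set()
    for incident in edge_to_faces.values():
        for i in range(len(incident)):
            for j in range(i + 1, len(incident)):
                pairs.add((incident[i], incident[j]))
    return sorted(pairs)

def boundary_aware_affinity(faces, is_boundary, low=0.1, high=1.0):
    """Hypothetical rule: weak link if either face is a detected boundary face."""
    n = len(faces)
    W = [[0.0] * n for _ in range(n)]
    for i, j in dual_graph_edges(faces):
        w = low if (is_boundary[i] or is_boundary[j]) else high
        W[i][j] = W[j][i] = w
    return W

# Toy strip of four triangles; face 2 is flagged as a boundary face.
faces = [(0, 1, 2), (1, 3, 2), (2, 3, 4), (3, 5, 4)]
is_boundary = [0, 0, 1, 0]
dual_edges = dual_graph_edges(faces)
W = boundary_aware_affinity(faces, is_boundary)
```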
By considering the results of the ANN, we build our new affinity matrix as follows:

Experimental results
In this section, we evaluate the segmentation results of our proposed method through several experimental tests. For this task, we use the benchmark of Chen et al. [27], which comprises 19 classes of 3D meshes; each category contains 20 3D models with multiple ground-truth segmentations produced by human operators. As mentioned before, we have trained our neural network on this benchmark using 10 objects chosen randomly from each class. We have performed several tests to evaluate the performance of the proposed approach. First, we begin with a qualitative evaluation. Second, a quantitative evaluation is carried out to quantify the segmentation quality of our results. The final test is done on objects with different poses to verify the robustness of our method to pose variation.
We begin with a qualitative evaluation, in which we show some results obtained by our algorithm. Figure 3 shows some segmentation results for different 3D objects taken from different classes. As can be seen from the visual results, our approach succeeds in giving a meaningful segmentation for almost all categories of 3D meshes. The second test consists in applying different quantitative evaluation metrics. The evaluation measures used to assess the results of our proposed approach, along with those of other segmentation methods, are outlined in what follows. AEI [28]: This method is based on the entropy concept from information theory. It starts by calculating a baseline, which corresponds to the entropy of all the different ground-truth segmentations; then the automatic segmentation is added, and the entropy is recalculated. The increment from the baseline to the new value is used to evaluate the automatic segmentation.
Recently, our research team proposed six quantitative evaluation metrics: WDC [29], WKD [30], WSSD [31], Dj3D [32], WOI [33], and NWLD [34]. These evaluation measures are based, respectively, on Dice's coefficient, the Kulczynski similarity index, the Sokal-Sneath distance, the Jaro distance, the Ochiai index, and the Levenshtein distance. These metrics have two main advantages. First, they take into account the regularity of the 3D meshes by introducing the surface of each face into the calculation of the segmentation quality, which gives a relevant evaluation for both regular and irregular 3D meshes. Second, they compare an automatic segmentation with a set of ground-truth segmentations by mapping the segments of the automatic segmentation to all segments of the available references. Figure 4 shows the evaluation of our proposed method by all these assessment metrics, along with the other segmentation algorithms, namely Randomized Cuts (RC) [35], Normalized Cuts (NC) [35], Fitting Primitives (FP) [36] and K-Means (KM) [10], on the whole benchmark of Chen et al. [27].
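To give an intuition of the similarity indices underlying some of these metrics, the Python sketch below computes Dice's coefficient, the Kulczynski index and the Ochiai index between two segments, with each face weighted by its surface. It is only a simplified illustration of the base indices: the mapping between segments and the aggregation over several ground truths used by WDC, WKD and WOI are not reproduced, and the function name and toy data are assumptions.

```python
import math

def weighted_overlap_indices(seg_a, seg_b, face_area):
    """Area-weighted Dice, Kulczynski and Ochiai indices between two sets of face ids."""
    def area(face_ids):
        return sum(face_area[f] for f in face_ids)

    inter = area(seg_a & seg_b)
    a, b = area(seg_a), area(seg_b)
    return {
        "dice": 2.0 * inter / (a + b),
        "kulczynski": 0.5 * (inter / a + inter / b),
        "ochiai": inter / math.sqrt(a * b),
    }

# Toy example: four faces with unequal areas, two overlapping segments.
face_area = {0: 1.0, 1: 1.0, 2: 2.0, 3: 2.0}
indices = weighted_overlap_indices({0, 1, 2}, {2, 3}, face_area)
```

Weighting by area means that a large shared face contributes more than a small one, which matters on irregular meshes.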
From figure 4, we can observe that the metrics used indicate that our proposed approach outperforms all the others, since it obtains the best scores for almost all the metrics, followed by RC and NC, while the worst results are given by FP and KM.
The last test is applied to objects with different poses. Figure 5 shows the results obtained by our segmentation method for the hand and armadillo objects. We can see that the segmentation achieved by our algorithm is relatively good, which highlights the ability of our approach to segment objects under pose variation.

Conclusion
3D segmentation is considered one of the main steps in many applications in computer vision. Spectral clustering methods and learning approaches have proven to be well-suited solutions for 3D mesh segmentation. This paper has presented a new 3D mesh segmentation method based on a learning approach, which allows users to generate a significant segmentation of an input mesh. A training step was carried out using the artificial neural network and a set of characteristic properties of the mesh. Then, the extracted knowledge was used to define the feature faces, and the Normalized Cheeger Cuts was applied to get the best cuts of the mesh. The experimental results demonstrated the utility and the efficiency of the proposed approach. Further work includes using more training datasets and adding other characteristic properties of the meshes to improve and enrich the resulting knowledge.