AGSF: Adaptive Graph Formulation and Hand-Crafted Graph Spectral Features for Shape Representation

Addressing intra-class variation in high similarity shapes is a challenging task in shape representation due to highly common local and global shape characteristics. Therefore, this paper proposes a new set of hand-crafted features for shape recognition by exploiting spectral features of the underlying graph adaptive connectivity formed by the shape characteristics. To achieve this, the paper proposes a new method for formulating an adaptively connected graph on the nodes of the shape outline. The adaptively connected graph is analysed in terms of its spectral bases followed by extracting hand-crafted adaptive graph spectral features (AGSF) to represent both global and local characteristics of the shape. Experimental evaluation using five 2D shape datasets and four challenging 3D shape datasets shows improvements with respect to the existing hand-crafted feature methods up to 9.14% for 2D shapes and up to 14.02% for 3D shapes. Also for 2D datasets, the proposed AGSF has outperformed the deep learning methods by 17.3%.


I. INTRODUCTION
Object recognition in terms of shape analysis has recently received great attention in the field of computer vision [1] and in applications such as security [2], medical imaging [3], and human activity and pose understanding [4]. The detection of shape appearance, part structure, occlusion, articulation, and local details plays an important role in shape classification. Representation of these characteristics is particularly significant when it comes to distinguishing highly similar shapes. This is often the case in many existing shape datasets, which consist of similar and complex shapes, leading to ambiguity in shape recognition [5]. For example, although the various shapes in FIGURE 1 can be easily distinguished by human vision, they are challenging for shape classification algorithms because of the similarity in their global structures and the indistinguishable local variations among them. Thus, capturing small local details and prominent parts, as well as the global structure, in shape models is an important factor in distinguishing between different objects. Further, this becomes even more difficult for 3D shapes due to the complexity and different viewpoints of shapes. This challenge has motivated us to exploit the shape structure, in terms of protrusions and fine details present within the global shape, to propose a novel model for shape representation.
The associate editor coordinating the review of this manuscript and approving it for publication was Donato Impedovo.
Previous work on shape classification includes a wide range of methods such as graph matching [6], inner-distance [7], complex-network [8], short-cut [1], shape contexts [9], variable-dimensional local shape descriptors (VD-LSD) [10], point feature histogram (PFH) [11] and fast point feature histogram (FPFH) [12]. A comprehensive review of these methods can be found in [13] and [14]. These studies aimed to achieve the optimal representation of shapes by considering their outlines. However, the main limitations of these approaches are high computational complexity, restrictions on some shape sizes [6] and sensitivity to noise [7]. Psychophysical and neurophysiological studies have proposed a hypothesis for a structural representation of shapes in terms of object structures, parts and their positional relationships [15]. Further, studies on human vision have highlighted the importance of capturing the local details of the shape surface for the human visual perception of shapes [16]. More importantly, another study on human vision suggests that the visual cortex perceives and understands shapes by representing the shape boundary as a connected set of nodes [17], which inspired the method proposed in this paper.
Inspired by the human vision literature, in this paper we propose a novel approach for shape representation by considering the shape as a connected graph, whose node connectivity is formulated adaptively, and analysing the spectral properties of the resulting graph. The proposed concept of adaptive formulation of connectivity, firstly computes a threshold to build a graph from shape nodes to capture complex shape structures and details. As shown in the bottom row of FIGURE 1, our method forms a graph with adaptive connectivity of nodes with connections (shown in red) highly concentrated at protrusions. Early stages and results of our proposed method [18] and a case study on application of it in 3D shape recognition [19] were presented as recent conference publications. In the present paper, we present a methodology representing both 2D and 3D shapes, by incorporating a new method to determine the threshold to build an adaptively connected graph and proposing additional features for classification. We also show rigorous experimental evaluation of the method using several 2D and 3D datasets leading to analysis and discussion of the recognition accuracy and system complexity including ablation studies. By 3D shapes, we refer to the shapes perceived by point clouds of 3D objects. The main contributions of this paper are: 1) Proposal of a novel graph-based representation of 2D and 3D shapes. 2) A new method for graph formulation with adaptive connectivity to represent shapes capturing their local and global characteristics. 3) Proposal of a new set of graph spectral features based on the node distribution of the adaptively connected graph for shape representation. Herein, we call them Adaptive Graph Spectral Features (AGSF). The rest of the paper is structured as follows: Related works are reviewed in Section II. The proposed method is presented in detail in Section III followed by the performance evaluation in Section IV and concluding remarks in Section V.

II. RELATED WORK
Shape representation work in the literature can be categorized into five different groups: deep-learning methods, model-based methods, view-based methods, feature-based methods, and graph matching methods. Deep-learning-based methods [20]-[22] have shown excellent performance in terms of recognition accuracy. The input features of these works are based on the exploitation of metric space distances (PointNet++) [20], X-transformation (PointCNN) [21], and Kernel Point Convolution (KPConv) [22]. Deep learning methods often have long training times and high computational complexity, and require large amounts of training data. In contrast, our focus in the present work is to propose a methodology based on hand-crafted features, which does not require large training data.
Model-based methods include two types: shape-skeleton and shape-contour. The former constructs a tree model using object edges to form a shape descriptor, where the similarity measurement is based on tree matching approaches. For example, different methods are implemented by creating a shape descriptor prototype using short-cut [1], corresponding points [23], and skeleton pruning [24]. On the other hand, shape-contour models have relied on the boundaries of silhouette images, whose edges efficiently characterize the global structure of the object with a single closed curve provided there are no holes inside the object. An early study used Fourier descriptors to represent the shape [25], while the latest studies are based on convex details [26], circle view shape signature (CVS) with multi-views [27], and progressive shape-distribution-encoder [28]. In general, these approaches provide rich knowledge about shape structure in different strategies. However, the main issue in model-based methods is that the local details are omitted or completely neglected in their model. In other words, most model-based methods ignore the small protrusions or dense areas and focus on the global object structure instead. For each method, a specific dynamic program was used to identify the matching similarities between patterns. In some cases, even if the algorithm is not optimized, the matching program may increase the recognition score.
View-based methods measure the similarity between two objects based on different view angles. In particular, they analyze the rendered shapes and distinguish the models by their visual similarity. Typical view-based approaches include multi-view depth line [29] and symmetric branch [30]. The main concern in these approaches is computing the view similarity for samples that have different topology details, as can be seen in the SHREC2010 dataset.
Graphs and complex networks display useful topological features such as degree distribution, clustering coefficient, and hierarchical structure. Several methods have been proposed to explore the maximum probability of correspondence mapping between two patterns through the weight matrix based on the Eigen domain weighted graph matching [6], node degree [8], polynomial characterization [39], the center of clusters [40], spectral relaxation [41], higher order constraints [42] and Kronecker product of the graph adjacency matrices [43]. Recently several approaches have been proposed for large scale 3D point cloud representation. For example, the method in [44] presents a supervised 3D shape segmentation based on graph and deep learning. Similarly, the work in [45] proposes a deep learning approach-based on super point graph (SPG) for segmentation purposes. This method involves partitioning the 3D shape, down-sampling and embedding process followed by a graph convolution neural network to classify the shapes. However, both these methods aim to achieve an efficient cut of shapes using graphs, as opposed to modelling shapes using graphs as in our work.
In our recent work, we proposed graph spectral domain feature-based recognition for numbers [46]. Graph spectral feature-based methods that rely primarily on a fully connected graph formed by the shape were successful in recognizing global shape structures, but not in accurately representing local variations, as shown in FIGURE 1. To formulate the problem, we start by converting each shape into a set of nodes (for 2D by sampling the edges of silhouettes and for 3D by down-sampling the surface using Growing Neural Gas (GNG) [47]). The assumption is that the nodes at protrusions have a high connectivity with each other compared to the nodes at the global shape boundary, where fewer variations are present. The connectivity is defined as the number of nodes that one node is connected with. The existing graph node connectivity can be categorized into three types:
1) Full connectivity: each node is connected to all other nodes in the graph [46]. This type of connectivity provides an efficient characterization of the global outline of the shape.
2) Special connectivity: for specific applications, vertices have their own connectivity without the ability to change it. An example of this is a graph connecting major cities with a road network.
3) K-Nearest Neighbour: each node is linked to the nearest K nodes, and each node has K connections [48].
As graph spectral theory uses the edge distribution of each node to convert graphs into their spectral domain, an accurate spectral representation requires a connectivity for each node that accurately captures the shape characteristics. Therefore, the main problem we aim to solve is how to determine the adaptive connectivity for each node in a shape. In this work, we propose to formulate an adaptively connected graph by forming conditional connectivity, where nodes are connected if a certain condition is satisfied, as described in Section III-C.
Early stages and results of our proposed method [18] and a case study on use of it in 3D shape recognition [19] were reported as conference publications. In the current paper, we present a methodology representing both 2D and 3D shapes, by incorporating a new method to determine the threshold for conditional connectivity to build an adaptively connected graph and proposing additional spectral domain features for classification. We also show extended experimental evaluation of the method using several 2D and 3D datasets leading to analysis and discussion of the recognition accuracy and system complexity, including ablation studies.

III. THE PROPOSED METHOD
Our proposed method can be summarized in four steps, as shown in FIGURE 2. The AGSF framework begins by representing the shape as a collection of nodes or points. In the next step, these nodes are formulated into a graph with adaptive connectivity among nodes using the newly proposed adaptive graph formulation algorithm. This is followed by the spectral decomposition of the graph connectivity structure and feature extraction on the graph spectral bases. Finally, these features are classified to recognize the underlying shape.

A. SHAPE CONVERSION INTO NODES
2D shapes in datasets are usually given as a silhouette or as an outline image. In this approach, we convert silhouettes to outlines, so that every input 2D shape can be treated as a contour path. The resulting contour path, $P$, is usually a smooth curve with $N$ pixels. To reduce the complexity of the subsequent graph spectral decompositions, we choose $n$ nodes, where $n < N$, to form a new down-sampled shape contour, $\hat{P}$, as follows:
$$\hat{P}(k) = P\left(\left\{\frac{kN}{n}\right\}\right),$$
where $k = 0, 1, \ldots, n-1$ is the new node index and $\{\cdot\}$ is the rounding to the nearest integer operator. $\hat{P}$ is then used as the set of nodes of the 2D shape, from which the graph with adaptive connectivity is later generated.
3D shapes are usually given as a point cloud with a large number of points representing the surfaces of the 3D shape. In order to make them manageable in the graph spectral decompositions, we down-sample the 3D shapes using the Growing Neural Gas (GNG) algorithm [47] due to its excellent quality, flexibility, and rapid adaptation in representing the 3D surfaces of different objects. GNG is a simple unsupervised procedure to select the optimal pixels based on their distance, and it does not create any new pixels. The main characteristic of GNG is that the output neurons represent the topology of the shape with fewer nodes. Although the GNG algorithm outputs a connected graph, in this work we consider only the coordinates of the nodes, ignoring the connectivity generated by GNG.
GNG starts with two nodes, randomly selected from a set of existing nodes. Then, it generates a signal based on the probability density between these nodes. After that, it finds the nearest node to both initial nodes. The edges between these nodes are then updated based on the error function, which represents the difference in Euclidean distance. These steps are repeated until the n nodes are selected. The use of GNG links is ineffective because it creates additional and unnecessary links outside the shape surface. For example, it creates links between different fingers, which are outside the geometry of the hand. Therefore, we cannot rely on GNG links. At the end of the training process, the GNG should satisfactorily cover the shape regions, as can be seen in FIGURE 3. Since GNG selects nodes regularly, based on an unsupervised optimization process, in such a way that these nodes have a uniform distribution inside the shape, the noisy pixels are removed by the GNG process. Nodes are ordered from left to right for the 2D shapes, and from bottom to top and then left to right for the 3D shapes.
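The 2D contour down-sampling step described in this subsection can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function name `downsample_contour` is hypothetical, the contour is assumed to be an ordered list of pixel coordinates, and the index mapping k -> round(kN/n) is an assumed reading of the rounding operator mentioned in the text. The GNG step for 3D shapes is not covered.

```python
def downsample_contour(contour, n):
    """Pick n roughly evenly spaced nodes from an ordered contour of N points."""
    N = len(contour)
    if not 0 < n < N:
        raise ValueError("need 0 < n < N")
    # Assumed index mapping: k -> round(k * N / n), rounding halves up.
    # The modulo only guards against an index equal to N.
    return [contour[int(k * N / n + 0.5) % N] for k in range(n)]


# Toy contour of N = 10 pixels reduced to n = 5 nodes.
pixels = [(i, 0) for i in range(10)]
nodes = downsample_contour(pixels, 5)
```

The sketch keeps existing contour pixels only, so no new coordinates are created by the down-sampling.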

B. GRAPH SPECTRAL DECOMPOSITION PRELIMINARIES AND NOTATIONS
We start with the preliminaries of graph spectral decomposition and our notation. Let $G = \{V, E, A\}$ be an undirected graph comprising a set of nodes (vertices), $V$, connected by a set of edges, $E$, and represented by an adjacency matrix, $A \in \mathbb{R}^{n \times n}$, comprising the edge weights. We define the weight value, $A_{i,j}$, corresponding to an edge, $e_{i,j}$, connecting vertices $i$ and $j$, as follows:
$$A_{i,j} = \begin{cases} \dfrac{|e_{i,j}|}{\frac{1}{n}\sum_{(p,q) \in E} |e_{p,q}|}, & (i,j) \in E, \\ 0, & \text{otherwise,} \end{cases}$$
which is the Euclidean distance, $|e_{i,j}|$, between vertices $i$ and $j$, normalised by the total length of the edges in the graph averaged per node. We define the signal $r : V \to \mathbb{R}$, whose $i$th component represents the Euclidean distance from the center $(0,0,0)$ to vertex $i$ in $V$. We also define the node degree, $\Delta_i$, as the number of edges incident on node $V_i$. The combinatorial graph Laplacian matrix, $L$, is defined as
$$L = D - A,$$
where $D$ is the diagonal matrix of vertex degrees, whose diagonal components are computed as follows:
$$D_{i,i} = \sum_{j} A_{i,j}.$$
Since $L$ is a symmetric positive semidefinite matrix, from the spectral projection theorem there exists a real unitary matrix, $U$, that diagonalizes $L$, such that $U^{T} L U = \Lambda = \mathrm{diag}\{\lambda_\ell\}$ is a non-negative diagonal matrix [49], leading to an eigenvalue decomposition of $L$ as follows:
$$L = U \Lambda U^{T} = \sum_{\ell=0}^{n-1} \lambda_\ell u_\ell u_\ell^{T},$$
where $u_\ell$, the column vectors of $U$, are the set of orthonormal eigenvectors of $L$ with corresponding eigenvalues, $\lambda_\ell$ [50].
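As a minimal illustration of these preliminaries, the following NumPy sketch builds the combinatorial Laplacian from a symmetric adjacency matrix and computes its eigen-decomposition. The function name `laplacian_spectrum` is hypothetical, and the diagonal of D is assumed to hold the row sums of A; the toy adjacency matrix uses unit weights rather than the normalised distances described in the text.

```python
import numpy as np


def laplacian_spectrum(A):
    """Return (eigenvalues in ascending order, eigenvectors) of L = D - A."""
    A = np.asarray(A, dtype=float)
    D = np.diag(A.sum(axis=1))      # assumed: diagonal holds the row sums of A
    L = D - A                       # combinatorial graph Laplacian
    # eigh is meant for symmetric matrices and returns ascending eigenvalues
    # together with orthonormal eigenvectors (the columns of U).
    eigvals, U = np.linalg.eigh(L)
    return eigvals, U


# Toy example: a path graph on three nodes with unit edge weights.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
lam, U = laplacian_spectrum(A)
```

With the ascending order returned by `eigh`, the first eigenvalue plays the role of λ_0 and the last one the role of λ_{n−1}.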

C. THE PROPOSED ADAPTIVELY CONNECTED GRAPH FORMULATION
Now, we consider 2D and 3D shapes converted to a set of nodes (as described in Section III-A), which are the vertices, V, in the graph formulation. The rest of this subsection focuses on how to generate the representative graph, i.e., how to define E by forming adaptive connections between vertices. This is followed by computing A for spectral decomposition and spectral domain feature extraction in Section III-D.
As discussed in Section II, although a fully connected graph provides an efficient representation of the global outline of the shape [46], its major drawback is that local details are not captured as reliably as the global outline. In addition, K-Nearest Neighbour connectivity does not reflect the topological structure, especially when using a small value of K, where nodes are only connected to their neighbours. Since we aim to classify more complex shapes, an appropriate connectivity is required to represent complex shapes within the Euclidean space. Thus, in this work, we propose adaptive connectivity, where vertices are connected if a certain condition is satisfied. In other words, $V_i$ is connected to $V_j$ if and only if they satisfy the condition $|e_{i,j}| < t$, where $t$ is a threshold distance. With this connectivity type, there is no fixed number of connections at each vertex, and the number of connected elements depends on the condition and the topology of the graph vertices. FIGURE 4 shows an example of how vertices are connected using a given threshold of distance between nodes.
VOLUME 8, 2020
The adaptive connectivity means that each vertex is connected to the other vertices that lie within a certain dynamic distance. The dynamic threshold starts as a small distance value and then grows steadily until it reaches a level that satisfies certain conditions, as explained in the rest of this section. This means that not all nodes are connected to each other; instead, the nodes that are close to each other tend to have strong connections. As a result, local details and protrusions on the edge result in nodes with higher connectivity, which is captured by the adjacency matrix and then by the spectral bases, from which the spectral features are extracted.
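The conditional connectivity rule described above (connect two vertices only when their distance is below the threshold t) can be sketched in a few lines of Python. The helper names `adaptive_edges` and `node_degree` are hypothetical, and the toy coordinates are made up for illustration.

```python
import math


def adaptive_edges(nodes, t):
    """Connect nodes i and j whenever their Euclidean distance is below t."""
    n = len(nodes)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(nodes[i], nodes[j]) < t:
                edges.append((i, j))
    return edges


def node_degree(n, edges):
    """Number of edges incident on each of the n nodes."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg


# Toy outline: three nodes on a line plus a small cluster at one end.
nodes = [(0, 0), (1, 0), (2, 0), (2, 1), (2.5, 0.5)]
edges = adaptive_edges(nodes, t=1.2)
deg = node_degree(len(nodes), edges)   # [1, 2, 3, 2, 2]
```

In this toy case the nodes in the dense cluster obtain more connections than the isolated end of the line, which mirrors the behaviour the text attributes to protrusions.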
As an example, graph eigenvalues depend on the connectivity of the graph and thereby reveal important characteristics of the underlying shape. Our method for choosing the threshold, $t$, is formulated as follows. To find the optimum $t$, we consider a range of thresholds, $t_\delta$:
$$t_\delta = t_0 (1 + \delta), \tag{6}$$
where $t_0 \in \mathbb{R}^{+}$ is the initial threshold for the minimally connected graph and $\delta \in \mathbb{N}$ is an increment distance with unit steps in the range $0 \le \delta < (n-1)$. Here we have chosen $n$ as the upper limit of $\delta$. However, depending on the resolution of the shapes in the datasets, we do not need to compute for all $n$, as will be evident in the rest of this section. Since Eq. (6) reduces to $t_\delta = t_0 + t_0\delta$, it allows us to choose $t_0$ according to the sampling rates of the data in different datasets while keeping integer increments for $\delta$.
The number of connected vertices for a graph with conditional connectivity depends on the threshold distance, $t$, and the topology of the graph vertices. Two graph structures can be compared in this respect through their corresponding last (largest) eigenvalues, $\lambda_{n-1}$. The higher the $\lambda_{n-1}$, the higher the number of connections, i.e., the connected nodes. Further, it can be observed that increasing $t$ leads to increasing $\lambda_{n-1}$, as seen in the four examples in FIGURE 5 and their corresponding $\lambda_{n-1}$. Thus, we can summarise this as $\lambda_{n-1} \propto t$. For a given $t$ value, there may be instances where some nodes are not connected to any other node, or have only one edge, leaving gaps in the shape outline. In such cases, we can see clusters of connected vertices, as in examples A and B in FIGURE 5. The number of clusters of connected vertices ($\omega$) can be determined by counting the number of zero eigenvalues of the resulting graph structure for a given $t$ [50]. As an example, the graph structure A in FIGURE 5 shows a dog with $\omega = 39$ clusters. Such a graph leads to $\lambda_0 = \cdots = \lambda_{38} = 0$. Similarly, graph structure B has $\omega = 4$ and $\lambda_0 = \cdots = \lambda_3 = 0$. For both C and D, all nodes are connected as a single cluster with $\omega = 1$ and only one zero eigenvalue, $\lambda_0 = 0$. We can also see that more local details and protrusions are captured by edges as the threshold increases from A to D.
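Counting clusters through zero eigenvalues, as described above, can be sketched with NumPy as follows. The function name `count_clusters` and the numerical tolerance `tol` are assumptions introduced here, since eigenvalues computed in floating point are rarely exactly zero.

```python
import numpy as np


def count_clusters(A, tol=1e-9):
    """Count connected clusters as the number of (numerically) zero
    eigenvalues of the Laplacian L = D - A."""
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    eigvals = np.linalg.eigvalsh(L)      # symmetric input, ascending eigenvalues
    return int(np.sum(eigvals < tol))    # tol is an assumed numerical tolerance


# Toy graph with 5 nodes: edges (0,1) and (2,3), node 4 isolated.
A = np.zeros((5, 5))
A[0, 1] = A[1, 0] = 1.0
A[2, 3] = A[3, 2] = 1.0
omega = count_clusters(A)   # 3 clusters
```

A graph that forms a single cluster has exactly one zero eigenvalue, so `count_clusters` returning 1 corresponds to ω = 1 in the text.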
We define t 0 as the minimum threshold distance that all vertices are connected as a single cluster i.e., ω = 1. Initially, we start creating graphs over the shapes using a small distance to link the nodes. This usually creates unconnected graphs and ω > 1 because of the small t. Therefore, t is gradually increased until the nodes are connected as a single group with ω = 1. The minimum t to form such a minimally connected graph gives us t 0 .
The value of $t_0$ is associated with the total number of nodes ($n$) in the sub-sampled shape. The higher the $n$, the smaller the value of $t_0$. Hence, we can summarise this as
$$t_0 \propto \frac{1}{n}.$$
FIGURE 6 demonstrates the relationship between $n$ and $t_0$ using the same shape with $n = 40$ and $n = 400$ nodes, respectively. Note that we cannot define $t_0$ as the largest distance between any two nearest pair of nodes since the distance between the nodes in 3D shapes is not linearly related.
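The search for the minimum threshold at which all vertices form a single cluster can be sketched as below. This is a hedged illustration, not the authors' implementation: the names `is_single_cluster` and `find_t0`, and the parameters `start` and `step`, are hypothetical, and connectivity is checked with a depth-first search, which is swapped in here for the zero-eigenvalue count described in the text to keep the example dependency-free.

```python
import math


def is_single_cluster(nodes, t):
    """True when the graph with edges of length < t is connected.
    Checked with a depth-first search instead of counting zero eigenvalues."""
    n = len(nodes)
    seen = {0}
    stack = [0]
    while stack:
        i = stack.pop()
        for j in range(n):
            if j not in seen and math.dist(nodes[i], nodes[j]) < t:
                seen.add(j)
                stack.append(j)
    return len(seen) == n


def find_t0(nodes, start, step):
    """Grow t from `start` in increments of `step` until omega = 1.
    Both `start` and `step` are hypothetical tuning parameters."""
    t = start
    while not is_single_cluster(nodes, t):
        t += step
    return t


nodes = [(0, 0), (1, 0), (3, 0)]            # gaps of 1 and 2 between neighbours
t0 = find_t0(nodes, start=0.5, step=0.5)    # 2.5 for this toy input
```

The returned value is accurate only up to the chosen `step`, so a smaller step gives a tighter estimate at the cost of more connectivity checks.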

2) FINDING δ
For a given shape, the algorithm firstly finds $t_0$, the minimum threshold distance at which all vertices are connected as a single cluster, i.e., $\omega = 1$. We recall that the graph node degree ($\Delta$) is a vector comprising the number of connections (edges) of each vertex of the graph (shape outline). The node degree, $\Delta_\delta$, is computed for the graph connectivity formulated for each $t_\delta$, i.e., for each $\delta$. An example of graph edges and $\Delta_\delta$ vectors for various $\delta$ values for a shape in the Tools dataset is shown in FIGURE 7. The first row in FIGURE 7 shows the graph corresponding to $t_0$ and $\Delta_0$. The following rows show snapshots of the graph edges and the set of $\Delta_\delta$ vectors up to the given $\delta$. They show how the number of connections of each vertex grows with increasing $\delta$. We now need to analyse the shape of $\Delta_\delta$ to determine the most suitable graph connectivity for representing the shape. For $t_0$ and for fully connected graphs, $\Delta_\delta$ shows less variation. As $\delta$ is increased, each node grows new connections, resulting in $\Delta_\delta$ vectors with more variation, as in FIGURE 7. As connections grow, local details and protrusions of the shape, where nodes are more densely present, result in higher connectivity for the nodes at those locations. As $\delta$ increases, the $\Delta_\delta$ vectors show a growing number of connections, forming peaks that correspond to nodes at protrusions and local shape variations. To quantify this, we analyse the amount of disorder in $\Delta_\delta$. Firstly, the normalised node degree, $E_\delta$, is computed from $\Delta_\delta$. This is followed by computing, from $E_\delta$, the metric we call the weighted semi-log normalised energy, $S$. The metric $S$ is a measure of the degree of disorder of the normalised node degree vector, $E_\delta$. It is inspired by the computation of entropy in information theory. We do not call it entropy as it does not involve probabilities. The value of $S$ increases when the variations in the vector are high and decreases when the variations are low.
We choose the $\delta$ that corresponds to the maximum $S_\delta$ to determine the threshold $T$ for graph formulation as follows:
$$T = t_{\hat{\delta}}, \qquad \hat{\delta} = \arg\max_{\delta} S_\delta.$$
Examples of $S_\delta$ for various $\delta$ values and their corresponding maximum points are shown for 4 shapes in FIGURE 8. The node degree vectors corresponding to the maximum $S_\delta$ and the resulting graph connections are shown in FIGURE 9. Our experiments show that, regardless of the rotation angle of the view, different shapes in the same class result in similar $\delta$ values producing the maximum $S$ values. In this way, we find the adaptively connected graph for each shape. As a result, local details and protrusions on the shape result in nodes with higher connectivity, which is captured by the adjacency matrix and then by the spectral bases, as shown in Section III-D. Also note that if the shapes are rotated, the distances between the nodes remain unchanged. Therefore, the connections between the nodes are not affected. As a result, the $S_\delta$ values are not affected by rotation either.
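The selection of δ can be sketched as a simple search over the candidate thresholds t_0 + t_0·δ. The exact formula of the metric S is not reproduced here, so this sketch uses the variance of the node degree vector as a stand-in score; this is an assumption for illustration only and is NOT the paper's S (the text mentions variance as the criterion of the earlier work [19]). The names `degree_vector` and `select_threshold` are hypothetical, and any other scoring function can be passed through the `score` argument.

```python
import math
import statistics


def degree_vector(nodes, t):
    """Node degrees of the graph that links nodes closer than t."""
    n = len(nodes)
    return [sum(1 for j in range(n)
                if j != i and math.dist(nodes[i], nodes[j]) < t)
            for i in range(n)]


def select_threshold(nodes, t0, score=statistics.pvariance):
    """Scan delta = 0 .. n-2 and keep the delta with the largest score.
    `score` is a stand-in for the paper's S (assumed: population variance)."""
    n = len(nodes)
    best_delta, best_score = 0, float("-inf")
    for delta in range(n - 1):              # 0 <= delta < n - 1
        t = t0 * (1 + delta)                # t_delta = t0 + t0 * delta
        s = score(degree_vector(nodes, t))
        if s > best_score:
            best_delta, best_score = delta, s
    return best_delta, t0 * (1 + best_delta)


nodes = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
delta_hat, T = select_threshold(nodes, t0=1.5)   # (1, 3.0) for this toy input
```

In the toy example the fully connected graph has identical degrees and therefore a zero score, so an intermediate threshold is selected, which is the qualitative behaviour the text describes.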

D. PROPOSED ADAPTIVE GRAPH SPECTRAL FEATURE (AGSF) EXTRACTION
After each shape is converted to its adaptively connected graph representation, we propose to extract the following features from the spectral representation of the formulated graph structure. We select rotation-invariant features that can represent both the global shape outline and local details. We propose a feature vector, $F$, comprising the following 3 components: $F_1$, $F_2$ and $F_3$.
1) We capture the features of the global outline of the shape by considering the distance vector $r$, with $r_i$ representing the distance to node $i$ from the central point $(0,0,0)$. Although $r_i$ represents the global shape, in order to improve the discrimination among classes by considering the local variations, we modulate $r$ with the corresponding eigenvalues, $\lambda_i$, of the adaptively connected graph formulated in the previous section for the given shape sample, as follows:
$$F_1(i) = r_i \lambda_i, \qquad i = 0, 1, \ldots, n-1.$$
Here, $r$ represents the spatial property of the shape, and $\lambda$ corresponds to the spectral properties. By modulating, we aim to bring both spatial and spectral properties into the feature vector.
2) To compute $F_2$, we consider the node degree, $\Delta_{\hat{\delta}}$, of the adaptively connected graph for the shape sample, where $\hat{\delta} = \arg\max_{\delta} S$. Since it corresponds to the adaptive connectivity graph formulation, this vector captures local details adequately. The normalised vector, $E_{\hat{\delta}}$, is used as $F_2$.
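3) The final feature component, $F_3$, consists of a variety of other statistics of $\Delta_{\hat{\delta}}$, the node degree corresponding to the adaptively connected graph for $\delta = \hat{\delta}$, as follows:
$$F_3 = \left[\, \mathrm{mean}(\Delta_{\hat{\delta}}),\ \mathrm{var}(\Delta_{\hat{\delta}}),\ \|\Delta_{\hat{\delta}}\|_2,\ S_{\hat{\delta}} \,\right].$$
These statistics (mean, variance, $L_2$ norm, and semi-log normalised total energy) provide a set of features that is invariant to rotations. The overall length of the feature vector, $F$, is $2n + 4$, which is N.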
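The assembly of the three components into one vector of length 2n + 4 can be sketched as follows. Several details are assumptions made for illustration: the function name `agsf_vector` is hypothetical, the modulation in F1 is taken to be an element-wise product of r and the eigenvalues, the variance is the population variance, and the normalised degree vector and the S value are passed in as precomputed inputs because their formulas are not reproduced here.

```python
import math
import statistics


def agsf_vector(r, eigvals, deg, deg_norm, s_value):
    """Concatenate F1, F2 and F3 into one feature vector of length 2n + 4.

    r        : distance of each node from the shape centre
    eigvals  : Laplacian eigenvalues of the adaptively connected graph
    deg      : node degree vector of that graph
    deg_norm : normalised node degree vector (precomputed, formula not shown)
    s_value  : value of the metric S for that graph (precomputed)
    """
    f1 = [ri * li for ri, li in zip(r, eigvals)]    # assumed: element-wise product
    f2 = list(deg_norm)
    f3 = [statistics.mean(deg),
          statistics.pvariance(deg),                # assumed: population variance
          math.sqrt(sum(d * d for d in deg)),       # L2 norm
          s_value]
    return f1 + f2 + f3


# Toy inputs for n = 3 nodes (values are made up for illustration).
F = agsf_vector(r=[1.0, 2.0, 3.0], eigvals=[0.0, 1.0, 3.0],
                deg=[1, 2, 1], deg_norm=[0.25, 0.5, 0.25], s_value=0.7)
```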

E. MACHINE LEARNING
In this final step, machine learning is used to learn the feature vectors generated in Section III-D for classification and recognition of the corresponding shape class. We have evaluated several classifiers, including the Support Vector Machine with a cubic kernel function (CSVM), the K-Nearest Neighbour (KNN), Classification Tree (CT), Discriminant Analysis (DA) and Neural Network (NN). Based on several experiments conducted to select the optimal classifier, KNN with K = 1 shows the best performance compared to the other classifiers in terms of accuracy and processing time, as shown in Section IV.
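A nearest-neighbour classifier with K = 1 is simple enough to sketch directly. The function name `predict_1nn` and the toy feature vectors below are hypothetical, and the Euclidean distance is assumed as the metric; this is an illustration of the classifier type, not the MATLAB implementation used in the experiments.

```python
import math


def predict_1nn(train_features, train_labels, query):
    """Return the label of the training vector closest to `query`
    under the Euclidean distance (K = 1)."""
    best = min(range(len(train_features)),
               key=lambda i: math.dist(train_features[i], query))
    return train_labels[best]


# Toy feature vectors for two shape classes.
train_X = [[0.0, 0.1], [0.2, 0.0], [1.0, 1.1], [0.9, 1.0]]
train_y = ["fish", "fish", "hand", "hand"]
label = predict_1nn(train_X, train_y, [0.8, 0.9])
```

Since 1-NN has no training phase beyond storing the feature vectors, its cost is dominated by the distance computations at test time.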

IV. PERFORMANCE EVALUATION
This section evaluates the performance of the proposed graph spectral features extracted from adaptively connected graph formulation from shapes on recognition of various 2D and 3D shapes with various sizes, orientations, articulation and scales from various datasets. All algorithms were implemented using MATLAB R2018a on a PC with Intel 3.6 GHz processor and 16 GB RAM.

A. DATASETS
The experiments were based on the following five 2D shape datasets and four 3D shape datasets:
1) The ETU10 silhouette dataset is one of the most well-known 2D datasets [51]. This dataset has 10 classes with 72 shape samples in each class, i.e., 720 images in total. Each class contains instances of different views of the shape, leading to at least a 5-degree rotation difference for each instance. The ten classes, Bed, Bird, Fish, Guitar, Hammer, Horse, Sink, Teddy, Television and Toilet, are numbered 1 to 10, respectively.
2) The Tool dataset [63] is one of the most challenging 2D datasets, as it has a conceptual similarity within its shape classes. It consists of 35 articulated silhouette shapes, which are classified into four classes: 10 scissors, 15 pliers, 5 knives and 5 pincers, respectively.
3) The Kimia 99 dataset [64] consists of 9 classes with 11 shape samples in each class, leading to 99 shapes. The nine classes, numbered from 1 to 9, correspond to Fish, Hand, Human, Aeroplane, Ray, Rabbit, Misk, Spanner and Dog, respectively.
4) The Kimia 216 dataset [65] consists of 18 classes with 12 2D silhouette samples in each class, i.e., 216 images. The 18 classes, numbered from 1 to 18, correspond to Bird, Bone, Brick, Camel, Car, Children, Classic, Elephant, Face, Fork, Fountain, Glass, Hammer, Heart, Key, Mink, Ray and Turtle, respectively.
5) The MPEG-7 CE-Shape-1 PartB (MP7-shape) dataset [66] consists of 70 classes with 20 samples in each class, i.e., 1400 2D silhouettes in total.
6) SHREC2010 dataset [67]
TABLE 1 shows the accuracy rates of the proposed method using different classifiers with 2D and 3D datasets using n = 80 and n = 200, respectively. All accuracy rates shown in this paper for the first 7 datasets are the average accuracy rates obtained from the k-fold cross validation scheme, using the k values shown in TABLE 1. Since the ModelNet datasets provide separate training and testing samples, k-fold cross validation was not used for them.
As can be seen, NN and KNN (with the normal Euclidean distance) classifiers result in the best accuracy rates among all classifiers. It must be also noted that KNN is faster than NN.

C. COMPARISON OF RECOGNITION RATES WITH THE EXISTING METHODS
TABLE 1 also compares with the performance of our previous work [18], [19], the existing hand-crafted features based methods and deep learning based methods. Our proposed hand-crafted features (AGSF) outperform the existing handcrafted features based methods for 7 of the datasets. It also demonstrates that the use of the proposed energy function improves the performance compared to the variance in [19].
The confusion matrices showing the recognition accuracy for each class, for the datasets with less than 100% overall accuracy, are shown in FIGURE 10 and FIGURE 11 for the 2D and 3D datasets, respectively. Although there is a significant similarity among the classes in the ETU10 and Tool datasets, the proposed method manages to distinguish them well. For both the Tool and Kimia99 datasets, shape samples are classified correctly with 100% accuracy. Although the Tool dataset contains highly similar structures, the proposed method is found to be highly efficient in discriminating them, as is evident from the significant improvement in the accuracy rate. Similarly, despite the shape samples with different angles of view in the ETU10 dataset, the proposed features outperform the existing methods with minimum confusion in recognising different classes. Kimia216 and MP7-shape are the most challenging 2D datasets because they have a small number of samples in each class compared to the total number of classes. For MP7-shape, the proposed hand-crafted AGSF even outperforms deep learning [53]. For 3D datasets, the proposed hand-crafted features exceed the performance of the existing hand-crafted features based methods by 3.04% for the SHREC2010 dataset, by 5.53% for the 3D shape benchmark dataset and by 14.02% for the ModelNet10 dataset. As shown in FIGURE 11, the proposed method recognises all shape classes in the SHREC2010 dataset with a high accuracy rate of 94%. The main confusing shape class is octopus (number 5), which is matched with the spider class (number 9) because of their similar graph structures, which lead to very similar eigenvalues and connected node distributions. Although the 3D benchmark dataset is a very challenging dataset due to its various angles of view, the proposed method has outperformed the existing hand-crafted features based methods.
FIGURE 10. Confusion matrices of the 2D datasets, where the X and Y labels refer to the classes, which are described in Section IV-A.
ModelNet10 is a large dataset that is usually used to evaluate deep learning methods. However, the proposed AGSF has achieved an overall recognition accuracy rate of 87.11%, significantly outperforming the other hand-crafted features based methods. According to the confusion matrix, the most confused class is Night-stand (number 7), due to its structural similarity to Chair (number 3) and Desk (number 4). Also, a few errors appear in the Bathtub (number 1) class, which is confused with the Dresser (number 5) class. For ModelNet40, our method achieves an overall accuracy rate of 86.43%. The accuracy varies between classes depending on the number of training samples and the similarity among samples. As this dataset has mostly been used for evaluating deep learning based methods, no results for existing hand-crafted features based methods have been reported in the literature.
Although our proposal is on hand-crafted features, we have included deep learning based methods for comparison. Of all the 2D shape datasets used in our evaluation, deep learning results have been reported only for the MPEG7-shape dataset. Our proposed hand-crafted AGSF has outperformed the deep learning based method by 17.3% for this dataset. However, for 3D datasets, deep learning based methods have shown improvements of around 2% for smaller datasets and around 7% for large datasets.
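The per-class figures read from a confusion matrix can be reproduced with a few lines of code. The following is a minimal sketch, assuming that rows index the true class and columns the predicted class; the function names and the toy 3-class matrix are illustrative only and are not taken from the reported experiments.

```python
def per_class_accuracy(confusion):
    """Fraction of each true class (row) that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(confusion)]


def overall_accuracy(confusion):
    """Correctly classified samples (diagonal) over all samples."""
    correct = sum(row[i] for i, row in enumerate(confusion))
    total = sum(sum(row) for row in confusion)
    return correct / total


def most_confused_pair(confusion):
    """Return (true_class, predicted_class, count) of the largest off-diagonal entry."""
    best = None
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            if i != j and (best is None or count > best[2]):
                best = (i, j, count)
    return best


# Toy matrix with invented counts (assumption: rows = true, columns = predicted).
toy = [[9, 1, 0],
       [0, 10, 0],
       [3, 0, 7]]
print(per_class_accuracy(toy))
print(overall_accuracy(toy))
print(most_confused_pair(toy))
```

The largest off-diagonal entry identifies the pair of classes that is confused most often, which is how pairs such as octopus and spider are singled out in the text.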

D. COMPUTATIONAL COMPLEXITY
The computational complexities of the adaptive connectivity formation, graph spectral decomposition and feature extraction stages are $O(n^2)$, $O(n^2)$ and $O(2n^2)$, respectively. Thus, the overall complexity of the proposed method can be considered as $O(n^2)$, excluding the pre-processing and classification steps. We also show the execution times of our method in TABLE 3. It includes the average time taken for feature extraction, training and testing for 12 instances. In general, the average time taken to test a new sample is around 12 milliseconds, which reflects the real-time performance of our proposed method.

E. ABLATION STUDIES
The effects of the different parameters of our method, including the k-fold cross validation, $t$, $n$, the adaptive connectivity and the proposed features, are evaluated as follows. In reporting the results, the best k for each dataset has been used.
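The selection of the best k can be sketched as follows. This is a minimal illustration only: the text does not specify how the folds were formed, so a simple round-robin split is assumed, and `evaluate` is a hypothetical callback that trains on one index set, tests on the other and returns an accuracy in [0, 1].

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k round-robin folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    for fold in range(k):
        test = [i for i in range(n_samples) if i % k == fold]
        train = [i for i in range(n_samples) if i % k != fold]
        yield train, test


def best_k(n_samples, candidate_ks, evaluate):
    """Return (k with the highest mean fold accuracy, all mean accuracies).

    `evaluate(train, test)` is a hypothetical user-supplied callback.
    """
    scores = {}
    for k in candidate_ks:
        fold_scores = [evaluate(train, test)
                       for train, test in k_fold_indices(n_samples, k)]
        scores[k] = sum(fold_scores) / len(fold_scores)
    return max(scores, key=scores.get), scores
```

Each sample appears in exactly one test fold, so every candidate k is scored on the whole dataset.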

2) THE PROPOSED THRESHOLD VS. NUMBER OF NODES
In FIGURE 6, we showed that, for the same shape, the minimum threshold value ($t_0$) depends on the number of nodes ($n$). Therefore, we evaluate the proposed method over a wide range of $n$ and $t_\delta$ values. In this experiment, the accuracy rates for $30 \le n \le 200$ were obtained with the $t_\delta$ values quantised into $\tau$ levels. The quantised threshold level $\tau$ is defined as the threshold at which at least one node has a node degree of $5\tau\%$ of $n$, for $\tau = 1, \ldots, 10$. FIGURE 12 shows an example of the identification of the $\tau$ levels for $n = 200$ with their corresponding $t_\delta$ and $\delta$. We also explore the optimal value of $n$ that provides the most discriminative representation between samples when $\delta = 0$. TABLE 3 shows that a range of 35 to 65 nodes is sufficient to represent the shapes across all the datasets using only $t_0$.
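The quantised threshold levels defined above can be located directly from the pairwise node distances. The following is a minimal sketch under a stated assumption that is not confirmed by this section: two nodes are taken to be connected when their Euclidean distance does not exceed the threshold. The function name and its parameters are illustrative.

```python
import math


def quantised_threshold_levels(points, levels=10, percent_step=5):
    """For tau = 1..levels, return the smallest distance threshold at which
    at least one node reaches a degree of (percent_step * tau)% of n.

    Assumption: nodes i and j are connected when dist(i, j) <= threshold.
    """
    n = len(points)
    # Row i holds the sorted distances from node i to every other node, so
    # entry d - 1 is the threshold at which node i first reaches degree d.
    sorted_dists = [
        sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    thresholds = []
    for tau in range(1, levels + 1):
        # Required degree, rounded up with integer arithmetic.
        degree = (percent_step * tau * n + 99) // 100
        degree = max(1, min(degree, n - 1))
        thresholds.append(min(row[degree - 1] for row in sorted_dists))
    return thresholds


# Example: 40 nodes sampled uniformly on a unit circle.
circle = [(math.cos(2 * math.pi * k / 40), math.sin(2 * math.pi * k / 40))
          for k in range(40)]
print(quantised_threshold_levels(circle))
```

Because the required degree grows with the level, the returned thresholds are non-decreasing, which matches the ordering of the levels illustrated in FIGURE 12.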

3) FULLY CONNECTED GRAPH VS. ADAPTIVELY CONNECTED GRAPH
Next, we compare the performance of fully connected graphs with that of the proposed conditional connectivity. TABLE 4 shows that the adaptively connected graph achieves better performance than the fully connected graph.
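One way to see why the connectivity matters is to compare the Laplacian spectra of the two graph types on two different outlines. The sketch below rests on assumptions that are not taken from the paper: edges are unweighted, two nodes are joined when their Euclidean distance does not exceed a threshold, and the combinatorial Laplacian $L = D - A$ is used. The paper's own edge weighting and spectral bases may differ.

```python
import numpy as np


def laplacian_spectrum(points, threshold=np.inf):
    """Ascending eigenvalues of L = D - A for an unweighted threshold graph.

    Assumption: nodes i and j are joined when their Euclidean distance is
    <= threshold; threshold=inf gives the fully connected graph.
    """
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    adjacency = (dist <= threshold).astype(float)
    np.fill_diagonal(adjacency, 0.0)  # no self-loops
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    return np.linalg.eigvalsh(laplacian)


# Two toy outlines with 8 nodes each: a unit circle and a stretched ellipse.
angles = 2 * np.pi * np.arange(8) / 8
circle = np.column_stack([np.cos(angles), np.sin(angles)])
ellipse = np.column_stack([2 * np.cos(angles), np.sin(angles)])

full_circle = laplacian_spectrum(circle)
full_ellipse = laplacian_spectrum(ellipse)
adaptive_circle = laplacian_spectrum(circle, threshold=0.8)
adaptive_ellipse = laplacian_spectrum(ellipse, threshold=0.8)

print(np.allclose(full_circle, full_ellipse))          # same spectrum
print(np.allclose(adaptive_circle, adaptive_ellipse))  # different spectra
```

Under these assumptions, the fully connected unweighted graph has the spectrum $\{0, n, \ldots, n\}$ for every shape, so it carries no shape information, whereas the thresholded graph changes with the outline.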

4) ELEMENTS OF THE FEATURE VECTOR
We also show a full ablation study of the proposed three kinds of hand-crafted features. TABLE 5 illustrates the importance of these features for shape representation. In this table, we can see that $F_1$ and $F_2$ have a greater impact on performance compared to $F_3$.

F. DISCUSSION
TABLE 4 validates the contribution of the adaptively generated graph in shape representation. For the large and challenging 2D silhouette dataset, MPEG7, the proposed hand-crafted features have outperformed the deep learning based method by 17.3%. As discussed in Section IV-C, deep learning methods have shown some advantage over hand-crafted features, by around 2% and 7% for small and large datasets, respectively. It can be noted from Section IV-D and TABLE 3 that our approach has low complexity compared to the deep learning based approaches and is more suitable for smaller datasets in that case. Also note that the proposed hand-crafted features consider only shape features, while the deep learning based methods learn both shape and texture features.
A few aspects need to be considered when choosing $n$ in the initial shape sampling for 2D and in the further downsampling by GNG for 3D. The sampling should be dense enough to capture the whole shape. For example, some samples in the ModelNet10 dataset show close similarity with some shapes in other classes after downsampling. Although GNG is useful as a pre-processing tool, it has several parameters that have a direct effect on the quality of the representation, such as the required output $n$, the number of iterations and the average distance between nodes. At the GNG step, it is crucial to select them carefully so that a reasonable point cloud is obtained for applying our method.

V. CONCLUSIONS
This paper has proposed a new set of hand-crafted features (AGSF) for shape recognition by exploiting spectral features of the underlying graph adaptive connectivity formed by the shape characteristics. To achieve this, we have proposed a new method for formulating an adaptively connected graph to represent shapes with a unique graph structure. This is followed by proposing graph spectral features that capture both global and local characteristics of the shape to train a classifier. The effectiveness of the proposed AGSF is verified by experiments on five 2D shape datasets and four 3D shape datasets. The proposed AGSF has outperformed the existing hand-crafted feature methods by up to 9.14% for 2D shapes and by up to 14.02% for 3D shapes. Also, for 2D datasets, the proposed AGSF has outperformed the deep learning methods by 17.3%.