Three-dimensional connectivity index for texture recognition

This work proposes a new method of texture analysis for grey-level images based on the distribution of connectivity indexes in local neighbourhoods. The connectivity index acts as a measure of texture homogeneity, and its distribution is computed for various local neighbourhood sizes. The resulting descriptors provide an efficient multiscale representation of connectivity. The method was tested in the classification of the UIUC, Outex, and KTH-TIPS2b databases and outperformed several state-of-the-art approaches, including LBP, LBP+VAR, MR8, and multifractals, among others.


Introduction
The analysis of texture images has played a fundamental role in computer vision and pattern recognition during the last decades, and the importance of this area of study is illustrated by the number of applications appearing in diverse areas such as Engineering [12], Medicine [1] and Physics [6].
Over the years, several approaches have been proposed to find methods capable of representing complex digital images using a reduced number of numerical features (or descriptors) that are relevant and non-redundant. Ideally, those features could also be useful for the analysis of objects and scenes in similar types of images.
Since the work of Haralick [8], considerable effort has been dedicated to finding strategies that quantify how image pixels are distributed within local regions or neighbourhoods. Examples are the theory of textons [17], local patterns [14], local affine regions [11], invariants of scattering transforms [15], Fast Features Invariant to Rotation and Scale of Texture [16] and others [3]. Many of these efforts consider fixed-size neighbourhoods to extract basic parameters such as moments [8] and histograms [14,17]. Even though those approaches have demonstrated efficiency in both theory and practice, there might exist further methods that are more robust and efficient in the increasingly challenging problems encountered when analysing digital images.
This work proposes using texture descriptors that combine the local features of pixel neighbourhoods with a multiscale analysis. The local arrangement of pixels is quantified by means of a connectivity index, similar to that proposed in [5], which represents the number of pixels connected (in the Graph Theory sense) to the central pixel of a neighbourhood. The degree of pixel clustering is estimated in a representation of the image where pixels are points in a quasi-three-dimensional space in which x and y are the image coordinates and z is the pixel intensity. The local measure is multiscale since it is computed over neighbourhoods (local windows) of various sizes. Finally, the descriptors are provided by the cumulative distribution of the connectivity index, as computed in [5], over the entire image and for the different neighbourhood sizes (radii).
Our proposed method has some particularities that distinguish it from previous textural analyses in the literature. First, unlike the method in [11], which focuses on particular affine regions, our method considers all pixels to have the same importance. Second, image invariants are not explicitly handled as in [11,15,16], although the symmetry of the connectivity index together with the multiscale approach allows the quantification of such invariants and simplifies the analysis where such invariants are not relevant, for example, if there is no significant variation or when objects at different scales or rotations should actually be considered as different instances. A third difference from other methods such as those in [8,14,17] is that, despite the importance of pixel-level descriptions, our approach does not directly use the grey values of the pixels or basic relations between neighbouring pixels. Instead, a complex relation within the local neighbourhood is considered. This strategy takes into account how pixels are connected in the entire neighbourhood, and it is based on real-valued distances instead of binary values at a particular distance.
The following section introduces the local connectivity index and other details of the proposed method. Section three discusses how the method relates to the concept of connectivity. Section four describes the validation experiments (databases and other compared descriptors), while section five shows the results obtained. Finally, section six briefly summarises the conclusions of the work.

Local connectivity index
The concept of local connectivity in image textures was first presented in [5] as a measure of local regularity in the context of a multifractal analysis. In simple terms, it is a generalisation to the three-dimensional Euclidean space of the local connected fractal dimension of binary images previously reported in [10,18].
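In [10], a connectivity measurement is associated with each pixel $p$ of the object (foreground) by counting the number of foreground pixels connected to $p$ within a neighbourhood of radius $r$. Saying that pixel $p$ is "connected" to pixel $q \neq p$ means that there exists a set $P$ of pixels satisfying $p \in P$ and $q \in P$ in which every pair of pixels is joined by a chain of adjacent pixels:
\[ \forall (p_1, p_2) \in P:\ \exists\, q_0 = p_1, q_1, \ldots, q_n = p_2 \in P \ \text{such that}\ q_i \sim q_{i+1},\quad i = 0, \ldots, n-1, \]
where $p_1 \sim p_2$ means that $p_1$ and $p_2$ are adjacent pixels (4- and 8-adjacency are classically used in binary images).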
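The binary notion of connectivity described above can be sketched as a breadth-first search. This is only an illustration under stated assumptions, not the implementation of [10]: the function name `binary_connectivity`, the (row, column) convention for `p`, the circular neighbourhood test and the choice to count `p` itself are all assumptions.

```python
from collections import deque


def binary_connectivity(mask, p, r, eight=True):
    """Count foreground pixels connected to p within distance r of p.

    mask is a list of rows of 0/1 values and p is a (row, column) pair.
    Assumption: p itself is included in the count.
    """
    rows, cols = len(mask), len(mask[0])
    py, px = p
    if not mask[py][px]:
        return 0
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # 4-adjacency
    if eight:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # 8-adjacency
    seen = {p}
    queue = deque([p])
    while queue:
        y, x = queue.popleft()
        for dy, dx in steps:
            ny, nx = y + dy, x + dx
            inside = 0 <= ny < rows and 0 <= nx < cols
            if not inside or (ny, nx) in seen or not mask[ny][nx]:
                continue
            # Pixels farther than r from p are not part of the neighbourhood.
            if (ny - py) ** 2 + (nx - px) ** 2 > r * r:
                continue
            seen.add((ny, nx))
            queue.append((ny, nx))
    return len(seen)
```

The `eight` flag switches between the two classical adjacency rules, so the same pixel can receive a different count under 4- and 8-adjacency.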
The ability of the connected fractal dimension to differentiate between images of normal and abnormal retinal vessels in [10] suggests that the concept of connectivity might also be useful in other applications where the local regularity of the binary image needs to be assessed. Furthermore, the computation of the local descriptors for every position in the image also provides a complete and point-wise representation of the image data.
To extend the notion of connectivity to grey-level images, a strategy was proposed in [5] based on a pseudo-three-dimensional representation of grey-level images. Each pixel with coordinates (x, y) in the image I is mapped onto a point in the quasi-three-dimensional Euclidean space E^3 with coordinates (x, y, I(x, y)).
Even though a similar definition of adjacency can be used in three-dimensional spaces (26-adjacency, for instance), it is not directly applicable to sparse sets of points such as those in the grey-level mapping. Therefore, a new definition of adjacency was established in [5] based on Euclidean distances between points: two points $p$ with coordinates $(x_1, y_1, z_1)$ and $q$ with coordinates $(x_2, y_2, z_2)$, both in $E^3$, are considered adjacent if they are at a distance smaller than a pre-defined threshold $t$:
\[ D(p, q) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2} < t. \]
The next step is to define the connectivity around a reference pixel $p$ in the image. While in [5] a cube centred at $p$ was used, here we use a ball or sphere to preserve the geometrical symmetry in the local neighbourhood. In this way, a ball $B(p, r)$ centred at $p$ with coordinates $(x_p, y_p, z_p)$ and with radius $r$ is defined via
\[ B(p, r) = \{ (x, y, z) \in E^3 : (x - x_p)^2 + (y - y_p)^2 + (z - z_p)^2 \leq r^2 \}. \]
The connectivity index can then be informally defined as the number of points inside $B(p, r)$ and connected to $p$. The connected set therefore corresponds to the set of points within $B$, and including $p$, such that each point $q_a$ has at least one other point $q_b$ at a distance smaller than $t$. It is important to point out that pixels connected to $p$ but not belonging to $B(p, r)$ (at a distance from $p$ larger than $r$) are not counted in the composition of the connected set. The connectivity index $C(p, r, t)$ can be summarised as
\[ C(p, r, t) = \#\Big( \bigcup_{k \geq 0} S_k \Big), \qquad S_0 = \{p\}, \qquad S_{k+1} = S_k \cup \{ q_a = (x, y, I(x, y)) \in B(p, r) : \exists\, q_b \in S_k,\ D(q_a, q_b) < t \}, \]
where $D$ is the Euclidean distance, $\cup$ represents the classical notation for set union or, equivalently in this case, concatenation and $\#$ symbolises the cardinality (number of elements) of the set.
A flowchart in Fig. 1 shows the basic procedure to compute the connectivity of a particular pixel p_0 in the image. More details and pseudo-code are provided in [5] for the interested reader.
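A minimal sketch of the connectivity index C(p, r, t) for a grey-level image follows. It is one straightforward reading of the definition above (collect the image points inside the ball, then grow the connected set from p) and is not the pseudo-code of [5]. The function name `connectivity_index`, the (row, column) convention for `p` and the use of unscaled grey levels as the z coordinate are assumptions.

```python
from collections import deque


def connectivity_index(image, p, r, t):
    """Number of points of the ball B(p, r) connected to p (p included).

    image is a list of rows of grey levels and p is a (row, column) pair.
    Each pixel is treated as the point (x, y, I(x, y)).
    """
    rows, cols = len(image), len(image[0])
    py, px = p
    pz = image[py][px]
    reach = int(r)  # integer pixel offsets cannot exceed floor(r)
    # Image points that fall inside the ball B(p, r).
    ball = []
    for y in range(max(0, py - reach), min(rows, py + reach + 1)):
        for x in range(max(0, px - reach), min(cols, px + reach + 1)):
            z = image[y][x]
            if (x - px) ** 2 + (y - py) ** 2 + (z - pz) ** 2 <= r * r:
                ball.append((x, y, z))
    # Breadth-first search: two points are adjacent when closer than t.
    start = (px, py, pz)
    seen = {start}
    queue = deque([start])
    while queue:
        ax, ay, az = queue.popleft()
        for q in ball:
            if q in seen:
                continue
            bx, by, bz = q
            if (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2 < t * t:
                seen.add(q)
                queue.append(q)
    return len(seen)
```

Points outside the ball are never added to `ball`, so a chain that leaves B(p, r) and re-enters it does not contribute, which matches the remark that pixels connected to p but outside the ball are not counted. The search is quadratic in the number of points in the ball, which is acceptable for the small radii considered here.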

Proposed method
From its definition, the connectivity index presented earlier can be computed locally, relative to each pixel in the image. Thus, it corresponds to a transform of the original image where each pixel is converted into a non-negative integer value representing the local connectivity. As with numerous other transforms (Fourier, wavelets, etc.), these mappings constitute powerful tools to provide a new viewpoint on the image and therefore might be useful to describe various patterns that cannot be discriminated in the original space representation.
Fig. 2 illustrates the local connectivity for four images from the UIUC database [9], belonging to two different classes. Two major features can be noted in this representation. Firstly, the transform conveys information which is different from that represented by the grey levels while still preserving some prominent characteristics (for instance, the cracked edge in the left figure). Secondly, images from diverse classes lead to dissimilar patterns in the connectivity image version.
Another interesting property of the connectivity measure is represented in Fig. 2, where the connectivity indexes were computed for different radii r (3, 4 and 5 pixels) and shown as a colour map: apart from the increased magnitude of the indexes (which is expected, given that larger neighbourhoods contain more points), different patterns emerge in the distribution of the connectivity indexes at different radii. This suggests the multiscale nature of the connectivity, making it potentially useful to describe complex images. In particular, the ability to distinguish the classes presented in Fig. 2 is boosted by considering the image transform results at different radii.
The distinction of classes appears more evident in Fig. 3, resulting from different distributions across classes at different neighbourhood sizes. This example illustrates that, whereas the connectivity index shows its importance in a multifractal context in [5], it could also be a useful local descriptor on its own. While the performance of the connectivity is illustrated here on a small subset of images from the UIUC database, the Results section presents a more robust experiment over the whole database, confirming the power of our approach.
Based on these observations, we propose a method for computing descriptors based on the combination of local connectivity transform histograms for various radii $r$ and thresholds $t$. While radii between 1 and $r_{\max}$ pixels are considered, the possible values of $r$ are those that correspond to the underlying discrete Euclidean distances, expressed by the set $R$ in the following expression:
\[ R = \{ 1, \sqrt{2}, \ldots, r_{\max} \}. \]
Let us assume that $\mathcal{C}(r, t)$ is the increasingly ordered set of unique values of $C(p, r, t)$ for any $p \in I$. For a radius $r \in R$, the cumulative histogram $h^t_r$ is defined for each counter index $k \in \mathcal{C}(r, t)$ by
\[ h^t_r(k) = \sum_{p \in I} \delta\big( C(p, r, t) - k \big), \]
where $\delta(x)$ stands for the discrete (Kronecker) delta function (one at $x = 0$ and zero everywhere else).
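The two ingredients of this step, the set of admissible radii and the per-radius count of connectivity indexes, can be sketched as below. The helper names `radii` and `connectivity_histogram` are hypothetical. The `dims` parameter is an assumption: the text does not state here whether the discrete distances are taken on the two- or three-dimensional integer grid, so both are supported. An integer `r_max` is assumed.

```python
from collections import Counter
from itertools import product
from math import sqrt


def radii(r_max, dims=3):
    """Distinct Euclidean distances between 1 and r_max on an integer grid.

    Assumption: dims=3 uses offsets (i, j, k); dims=2 uses offsets (i, j).
    """
    squares = set()
    for offset in product(range(r_max + 1), repeat=dims):
        s = sum(c * c for c in offset)
        if 1 <= s <= r_max * r_max:
            squares.add(s)
    return [sqrt(s) for s in sorted(squares)]


def connectivity_histogram(indexes):
    """Pairs (k, count) for each distinct index k, ordered by increasing k.

    indexes holds C(p, r, t) for every pixel p at one radius and threshold.
    """
    counts = Counter(indexes)
    return [(k, counts[k]) for k in sorted(counts)]
```

For instance, `connectivity_histogram([3, 1, 3, 2, 3])` gives `[(1, 1), (2, 1), (3, 3)]`: only index values that actually occur in the image produce a bin, so every count is at least one.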
Next, the sets of features in $h^t_r$ for all radii are concatenated into a feature vector $H^t$:
\[ H^t = \bigcup_{r \in R} h^t_r. \]
Finally, the proposed descriptors $D^t$ for image $I$ are obtained from the cumulative sum of $H^t$ followed by a logarithm operation:
\[ D^t(k) = \log \sum_{i=1}^{k} H^t(i). \]
There are two reasons to use the cumulative sum instead of an $L_2$-histogram: first, it ensures more regular features because the sum provides a monotonically increasing function; second, its values increase according to the variation in the original data (histogram). These two characteristics are important in the distance-based classifiers used here because the dissimilarity metric is more sensitive to the accumulated variation. The logarithm can be justified in a similar way: the connectivity index scales exponentially as described in [5] and the logarithm attenuates the wide range of scales in the feature space.
Note that more than one value of $t$ could be used. After the descriptors are computed, the resulting features are concatenated and, to reduce the dimensionality of the large number of features, a Karhunen-Loève transform is applied to preserve only the most relevant and uncorrelated descriptors. Fig. 4 summarises the steps involved in the method in a flowchart, while Fig. 5 illustrates each step with a visual example.
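The concatenation, cumulative sum and logarithm can be sketched in a few lines. The function name `descriptors` is hypothetical, and the sketch assumes that every histogram count is strictly positive so that the logarithm is defined. The Karhunen-Loève transform that follows in the method is not shown.

```python
from itertools import accumulate
from math import log


def descriptors(histograms):
    """Log of the cumulative sum of the concatenated histogram counts.

    histograms is a list with one list of counts per radius.
    Assumption: all counts are positive, so log() never receives zero.
    """
    concatenated = [count for h in histograms for count in h]
    cumulative = accumulate(concatenated)  # running sum over the whole vector
    return [log(value) for value in cumulative]
```

With two radii whose counts are `[1, 2]` and `[3]`, the cumulative sums are 1, 3 and 6, so the descriptors are log 1, log 3 and log 6. The output is non-decreasing by construction, which is the regularity property mentioned above.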

Experiments
The proposed descriptors were assessed in the classification of three established databases of texture images, namely, UIUC [9], Outex [13], and KTH-TIPS2b [2]. UIUC is composed of 1000 grey-level texture images with dimension 256 × 256 equally divided into 40 groups, containing photographs taken under uncontrolled conditions and with relevant variation in scale and viewpoint. The version of the Outex database used here is the suite Outex_TC_00013 in [13] and contains 1360 colour images (here converted to greyscale) captured under controlled conditions of illumination and imaging geometry. The samples are divided into 68 groups (20 images with dimension 128 × 128 in each one). Finally, KTH-TIPS2b is a set of material images, collected under varied conditions of illumination, scale and pose, comprising 11 materials (classes), each one with 4 samples and each sample with 108 cropped images with dimension 200 × 200 pixels.
The classification of the databases according to the compared descriptors was carried out using a linear discriminant classifier (LDA) [4] after a principal component analysis (PCA) [4] to reduce the correlation among features in all compared approaches. Such an approach to classification prevents redundant information from being considered in the segmentation of the feature space. The simultaneous dimensionality reduction and decorrelation is achieved by a combination of PCA with the canonical analysis of LDA, which takes into account the distribution of features among the classes in the training set. These properties make the adopted classifier more appropriate for the proposed descriptors, which are naturally highly correlated as a consequence of the iterative process they follow (descriptors from neighbourhoods of larger radii contain part of the information expressed by those of smaller radii). A maximum of 100 PCA scores were used in the LDA classification. Considering that we were working on databases where the classes are known in advance, the classification was considered to be supervised. A division into training and testing sets following a 10-fold scheme was applied for cross-validation.
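The 10-fold division into training and testing sets can be sketched as follows. This covers only the splitting step, not the PCA or LDA stages, and the details are assumptions: the function name `k_fold_indices`, the random shuffling and the fixed seed are illustrative, since the text does not say how samples were assigned to folds.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Assumption: samples are shuffled once and dealt into k folds in turn.
    """
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        # Every fold except the i-th one is used for training.
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test
```

Each sample appears in exactly one test fold, so averaging the correctness rate over the ten folds uses every image once for testing.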
The parameter r_max was set to 5 in all tests based on the observation in [5] that larger radii severely increase the computational cost and number of descriptors without a significant gain in the classification performance. On the other hand, the values for the parameter t were empirically obtained from a training set of images. Fig. 6 illustrates the variation in the rate of images correctly classified in the UIUC database when t ranges between 2 and 10. In the end, t = 10 was used for UIUC, a combination of t = 2, 3, 4, 5, and 7 constituted the optimal parameters for Outex, and a combination of t = 5, 6, 8, and 9 was used to classify KTH-TIPS2b materials. As mentioned earlier, all the connectivity indexes were concatenated before the application of PCA to reduce dimensionality.

Results
Table 1 lists the rates of correct classification (i.e. the ratio of images correctly classified) for each database and each descriptor in comparison with the proposed method. The table suggests that classifying the Outex and KTH-TIPS2b databases was somewhat more challenging than classifying the UIUC images, as illustrated by the smaller correctness rates and higher cross-validation errors. This behaviour is expected given the nature of each database, namely the high number of classes in Outex and of samples in the KTH-TIPS2b data set. Such complexity is the rationale, for example, for including more than one value of t to generate the feature vector. In all cases, the connectivity descriptors outperformed all the other approaches, by at least 5% (UIUC), 2% (Outex), and 13% (KTH-TIPS2b) in relative percentages. In KTH-TIPS2b the classification performance was also better than that reported in [16] (76%) using a similar database and protocol.
Beyond the correctness rate, a more complete and precise metric to assess the performance of supervised classification is the confusion matrix. This expresses the correctness in each class as well as the number of samples misclassified, including which pairs of classes A/B introduced confusion (e.g. samples from class A but assigned to B by the classifier and vice versa). Fig. 7 shows the confusion maps for UIUC, Outex, and KTH-TIPS2b when classified by the best two methods in Table 1. The colour bars represent the number of images assigned to each particular class. Although visually the differences may not appear evident, the advantage of the connected descriptors is expressed in the colour map by a diagonal with higher values (i.e. more red points). Less confusion is also observed in the most critical regions of the map, for instance, in the last 10 classes (right end of the plot) of the Outex dataset or in the lack of yellow squares for the connected descriptors in KTH-TIPS2b. In practice, the information enclosed by this type of representation can be used to identify classes of materials with a high potential of being misclassified. For example, in KTH-TIPS2b, classes 9 and 8 are confused to a significant extent by MR8 but not by the proposed descriptors. The corresponding materials (white bread and linen) are examples of textures that are more appropriately handled by a multiscale approach such as that employed by the connected descriptors.
The results confirm the efficiency of the connectivity descriptors as a means of expressing the richness and complexity encompassed by the pixel values of a grey-level image in a reduced and more easily manageable set of numerical values. The classification performance can be justified by the ability of the connectivity index to quantify certain pixel relations within a local neighbourhood. Varying the size of the neighbourhoods makes the analysis multiscale, and this helps describe the heterogeneity of the texture. This is important information for both natural and artificial vision systems.
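The quantities discussed in this paragraph can be computed directly from the expected and predicted labels, as in the sketch below. The function names and the toy labels in the usage note are hypothetical and do not reproduce any result from Table 1 or Fig. 7. Classes are assumed to be numbered from 0.

```python
def confusion_matrix(expected, predicted, n_classes):
    """matrix[a][b] counts samples of class a assigned to class b."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for e, p in zip(expected, predicted):
        matrix[e][p] += 1
    return matrix


def correctness_rate(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def most_confused_pair(matrix):
    """Pair of classes (a, b) with the most mutual misclassifications."""
    n = len(matrix)
    pairs = [(matrix[a][b] + matrix[b][a], a, b)
             for a in range(n) for b in range(a + 1, n)]
    count, a, b = max(pairs)
    return (a, b), count
```

For the toy labels `expected = [0, 0, 1, 1, 2, 2]` and `predicted = [0, 1, 1, 0, 2, 1]`, half of the samples lie on the diagonal and classes 0 and 1 form the most confused pair, with two mutual errors. When several pairs tie, `max` returns the one with the largest class indexes.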

Conclusions
This work presented a new method, named connectivity descriptors, to extract features from grey-scale texture images. The approach is based on the computation of a connectivity index within a local image neighbourhood. This corresponds to the number of pixels more closely related to the central pixel in the neighbourhood. The performance of the method was compared with that of other methods in the literature on three well-established texture databases.
The connectivity descriptors provided the best classification rates among the compared methods in all database tests. Such promising results can be explained by the ability of connectivity indexes to quantify the relationships among pixels within a local region of the image. This, added to the multiscale nature of the method, makes the features sufficiently robust and precise to distinguish between images of different textures even in the complex scenarios of the datasets investigated.
Finally, given the generic nature of the UIUC, Outex, and KTH-TIPS2b databases, the results suggest that the connectivity descriptors might be suitable for a number of real-world applications in

Fig. 1. Flowchart used to compute the connectivity of a pixel in the image.

Fig. 2. Connectivity indexes in a colour map (normalised between 0 and 1) with different values of radius (from top to bottom, r = 3, r = 4 and r = 5) for four UIUC images, from two distinct classes.

Fig. 5. Steps involved in the proposed method. From left to right, the original image, the connectivity for different radii, connectivity histograms, concatenated histograms, cumulative sum and Karhunen-Loève transform.

Fig. 6. Percentage of UIUC images correctly classified when the value of t is varied between 2 and 10.

Fig. 7. Confusion matrices for the best two methods in Table 1 on UIUC, Outex, and KTH-TIPS2b. The scale bar represents the number of images pertaining to an expected class and assigned by the classifier to a predicted class.

Table 1
Best success rates for the compared descriptors.