Texture recognition by the methods of topological data analysis

Abstract High spatial resolution satellite images deviate from Gaussian statistics of counts, so texture recognition methods based on variances become ineffective. The aim of this paper is to study the possibilities of a completely different, topological approach to the classification of structures. Persistent Betti numbers serve as features for texture recognition. They are not associated with a metric and are obtained directly from the data in the form of a so-called persistence diagram (PD). Different structures built on the PD are used to obtain convenient numerical statistics. At present, three such objects are known: persistence landscapes, persistence images and rank functions. They have been introduced recently as attempts to vectorize the PD. Typically, each of the proposed structures was illustrated by its authors with simple examples. However, the practical application of these approaches to large data sets requires evaluating their efficiency, within the frame of a selected task, on the same standard database. In our case, the task is to recognize different textures in Remote Sensing Data (RSD). In this work we check the efficiency of the structure called persistence images. We calculate PDs for a base containing 800 high-resolution images representing 20 texture classes. We have found that the average efficiency of recognizing an individual image in its class is 84%, and in 11 classes it is not less than 90%. By comparison, persistence landscapes provide 68% average efficiency, and only 3 classes reach not less than 90%. These conclusions are of interest for new methods of texture recognition in RSD.


Introduction
Most of the information obtained in modern experiments has the form of high-resolution digital images. The primary example of HR-images is remote sensing data (RSD), used for monitoring and in GIS for various purposes. Images are taken from spacecraft with a resolution reaching several tens of centimeters in different spectral channels. For example, the WorldView-1 and WorldView-2 spacecraft have resolutions of 50 and 46 cm, respectively; GeoEye-1 has 41 cm resolution in panchromatic mode and IKONOS 1 m. The QuickBird spacecraft provides panchromatic images of terrestrial landscapes with a spatial resolution of 0.6 m; multispectral resolution reaches 1.65 m in the near-infrared region. These images give qualitatively new information about the structure of the observed terrestrial scenes. However, they have a number of features that reduce the effectiveness of classical methods of image recognition, or even make them almost impractical.
Thus, within traditional problems of texture analysis of terrestrial landscapes, hyperspectral images obtained on board a spacecraft are discrete sets of extremely high dimension. For example, the Hyperion instrument on NASA's EO-1 spacecraft has 220 channels covering the spectral range 0.4-2.5 μm, and the FTHSI instrument on the MightySat II spacecraft has 256 channels in the range 0.35-1.05 μm. The number of possible combinations of numerical values for a fragment of k × k pixels with 256 intensity levels is of order 256^(k·k), an astronomical number. Such data comes with the "curse of dimensionality" diagnosis, a metaphor for two unpleasant features. Firstly, high-dimensional feature spaces are "empty": the data concentrates on thin shells of "bubbles", so distances become nearly ultrametric.
In this case, conventional clustering methods based on nearest neighbors become almost inapplicable. Secondly, such spaces exhibit the measure concentration effect. Broadly speaking, this effect means that as the dimension increases, the measure concentrates around its average (or median) value [1].
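The concentration effect is easy to observe numerically. The following minimal sketch (our own illustration, with point counts and dimensions chosen only for the demonstration) measures the relative spread of pairwise distances between uniform random points as the dimension grows:

```python
import math
import random

def relative_spread(dim, n_points=100, seed=0):
    """Std/mean ratio of pairwise distances between uniform random
    points in the unit cube of the given dimension."""
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    dists = [math.dist(p, q) for i, p in enumerate(pts) for q in pts[i + 1:]]
    mean = sum(dists) / len(dists)
    var = sum((d - mean) ** 2 for d in dists) / len(dists)
    return math.sqrt(var) / mean

# As the dimension grows, pairwise distances concentrate near their mean,
# so the relative spread shrinks roughly like 1/sqrt(dim).
low_dim_spread = relative_spread(2)
high_dim_spread = relative_spread(200)
```

In dimension 200 the relative spread is several times smaller than in dimension 2, which is exactly why nearest-neighbor contrasts degrade.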
Furthermore, histograms of the brightness or contrast distribution of HR-images differ considerably from the normal distribution. They have "heavy tails" decaying by a power law, and higher statistical moments different from their Gaussian counterparts [2]. Therefore, second-order statistics, on which the least-squares method is based, become unstable. Yet well-known methods of textural analysis are based on quadratic functionals. A power law in the PDF also leads to power-law asymptotic decay of the Fourier spectrum. These circumstances indicate statistical scale invariance, or multifractality [3-5]. The non-Gaussian statistical properties of the edges, which bound the manifolds of maximal singularity, are important for a quantitative description of texture [6,7].
Segmentation based on the Mumford-Shah functional minimizes the sum of the Dirichlet energy, i.e. the square of the brightness gradient [8]. This functional is not effective when one pixel contains a mixture of patterns. That is why a soft option has been offered, in which each pixel contains all possible patterns from a set of signatures, but with some probability [9]. Attempts to overcome the computational difficulties that arose at the same time led to methods of Markov modeling of images, the Markov Random Field (MRF) methods [10,11]. MRF experiments have led to many interesting results in Markov Random Field modeling [12]. However, in these methods the original image segmentation problem has been lost, "killed" by analytical rigor.
Note that HR-images are associated not only with space monitoring. In medicine, non-invasive diagnostics of biological tissues is based on such images, obtained by magnetic resonance tomography or diffusion-weighted imaging with a spatial resolution of less than 1 mm. Methods of diffusion tensor imaging, or tractography, enable tracking the anatomical structure (paths) of the brain. For magnetic fields of 1.5-3 teslas, the spatial resolution reaches 1-2 mm. It is no wonder that nontrivial algorithms based on results from various areas of modern mathematics are used in this research [13-15]. Industrial X-ray tomography is another important source of HR-images [16].
Thus, there are several areas with problems to be solved, at least at the level of technical complexity. In a general sense, most of these tasks are connected with textural pattern recognition. The purpose of this article is to study the effectiveness of new, non-metric approaches to the recognition problem, based on topological data analysis [17]. Their main advantage is that descriptors can be calculated directly on the image patterns. The descriptors are topological invariants that depend only weakly on metric distortions.
The article is structured as follows. In the second section we briefly describe the general classification of well-known approaches, the centers where these methods are being developed, and their main disadvantages. In Section 3 we give a heuristic introduction to topological methods.
The fourth section describes the tested algorithm and its results. The Conclusion summarizes our analysis. As this article is not intended for professional mathematicians, we have included a glossary of the necessary basic concepts in the Appendix.

Morphological and multifractal models of HR-images
Different approaches based on modern mathematical methods have been proposed to construct models of HR-images. The most important of them are described in works of theoretical content [2,18-20]. Practical methods of modeling can be divided into three areas. The first is the geometry of random fields and mathematical morphology. The basic idea is to analyze the geometry of the field excursions above a prescribed level. Under some restrictions on the regularity of the field, the excursion sets belong to the convex ring B, i.e., they are convex sets or finite unions of such sets. The excursions form a nested sequence of subsets, i.e. a filtration, parameterized by the Morse height function. Excursion sets conveniently support valuations, i.e. functionals possessing the property ϕ(A) + ϕ(B) = ϕ(A ∪ B) + ϕ(A ∩ B) for any two sets A, B ∈ B. If we endow a valuation with continuity in the Hausdorff metric and invariance with respect to the group of rigid motions, we get the well-known morphological Minkowski functionals [21,22]. They are easily calculated and have a simple geometric meaning: the total area, the total perimeter and the Euler characteristic of the excursion set at each level. Morphological functionals are widely used in medicine and astrophysics [23,24]. The disadvantage of these descriptors for texture analysis is their integral character: the total areas and perimeters are multivalued functions of the sets of the ring, and the Euler characteristic does not allow one to distinguish, for example, two clusters from three clusters one of which has a "hole".
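For a binary excursion set on a pixel grid, the three Minkowski functionals can be computed by elementary counting. The sketch below (function and encoding are our own) treats each pixel as a closed unit square and obtains the Euler characteristic as vertices − edges + faces of the resulting cubical complex:

```python
def minkowski_functionals(pixels):
    """Area, perimeter and Euler characteristic of a binary excursion
    set given as a set of (row, col) pixel coordinates.  Each pixel is
    treated as a closed unit square of a cubical complex."""
    pixels = set(pixels)
    edge_count = {}   # how many pixels share each unit edge
    vertices = set()
    for (i, j) in pixels:
        cell_edges = [('h', i, j), ('h', i + 1, j),   # top, bottom
                      ('v', i, j), ('v', i, j + 1)]   # left, right
        for e in cell_edges:
            edge_count[e] = edge_count.get(e, 0) + 1
        for di in (0, 1):
            for dj in (0, 1):
                vertices.add((i + di, j + dj))
    area = len(pixels)
    # perimeter = unit edges belonging to exactly one pixel
    perimeter = sum(1 for e in edge_count if edge_count[e] == 1)
    euler = len(vertices) - len(edge_count) + len(pixels)
    return area, perimeter, euler
```

For instance, a 3×3 block with the center pixel removed (a discrete annulus) has area 8, perimeter 16 and Euler characteristic 0, reflecting its single hole.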
The second direction is multifractal image analysis. It was started by the work of mathematicians from the INRIA group [25], who proposed using the multifractal formalism as a tool for the analysis of textures. The correctness of its application to landscape images was later substantiated by Antonio Turiel's group in Barcelona, using the analogy between the correlation patterns of images and turbulent structures [26].
Later, practical algorithms based on wavelet analysis were developed for the multifractal study of images [27]. Applications of multifractals to geophysical problems based on HR satellite images can be found in [28,29]. The development of these methods and their application to RSD in Kazakhstan is described in a series of articles [4,5,30]. The approach is based on the above-mentioned scale invariance of HR-images. This allows the measure on small scales to be approximated by a power law whose exponent is the Holder exponent, which measures the regularity of the function. We select image fragments with a fixed value of the Holder exponent. The set of such strata forms the so-called multifractal decomposition of the texture, one component for each selected exponent. We then measure the box dimension of each component of the decomposition. The pairs (exponent, dimension of the stratum) form the multifractal spectrum [5]. The individual components of the decomposition singled out for texture analysis are called the Most Singular Manifolds (MSM) [6]. The main disadvantage of the multifractal approach is the amount of necessary computation and the variability of digital measures. It is possible to partly improve the situation by replacing conventional measures with Choquet capacities [4,5]. MSM can certainly be used for particular problems, especially in magnetic resonance tomography, but its use for handling large arrays of remote sensing data can hardly be justified.
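The box dimension mentioned above can be estimated by counting occupied grid boxes at two scales. A minimal two-scale sketch (scales and the test sets are our own illustration; real multifractal pipelines fit a regression over many scales):

```python
import math

def box_count(points, box_size):
    """Number of grid boxes of the given size containing at least one point."""
    return len({(x // box_size, y // box_size) for (x, y) in points})

def box_dimension(points, s1=1, s2=8):
    """Box-counting dimension estimated from the slope between two scales."""
    n1, n2 = box_count(points, s1), box_count(points, s2)
    return math.log(n1 / n2) / math.log(s2 / s1)

diagonal = [(i, i) for i in range(256)]                  # a 1-dimensional set
square = [(i, j) for i in range(64) for j in range(64)]  # a 2-dimensional set
```

On these examples the estimator recovers dimensions 1 and 2, respectively.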

Topological Data Analysis
The third direction is Topological Data Analysis (TDA), a new area of intelligent processing of experimental information [31,32]. TDA is a complex of methods for the diagnostics of high-dimensional data that are contaminated by noise and may have gaps. TDA algorithms use structures studied in algebraic topology [17]. The descriptors obtained with their help describe multiscale properties of the data that are independent of the choice of a metric and of a priori assumptions.
Heuristically, the idea is to build a "shape" for a "cloud of points" in a space of not too large dimension. The cloud can be, for example, the values of a photometric measure at each pixel of the image [33]. If the notion of neighborhood is introduced with the help of balls Br(x) of variable radius r, the shapes can be obtained as simplicial complexes. These sets are built from vertices, edges, faces and tetrahedra, obtained by contracting the intersections of the balls ∩Br(x) to elements of the complex. The result of this collapse is called the nerve of the cover. Thus, the intersection of two balls "collapses" to the edge joining their centers; a triangle is the collapse of three intersecting balls, etc. The nerve theorem states that for a triangulated space X, if a finite covering by balls is good (all nonempty intersections of the balls are contractible), then the nerve of the covering is homotopy equivalent to X. Varying the coverage scale r changes the level of detail of the nerve approximating X. This procedure is called a topological filtration: the cloud of points is aggregated into nested complexes, each containing the previous one. In the filtration process the main topological properties, connected components and "holes", i.e. empty regions bounded by cycles and surfaces, appear and die (see Fig. 1). Their numbers are topological invariants, the so-called Betti numbers, which roughly speaking count the k-dimensional holes [34]. Some formal definitions of the topological concepts used here can be found in the Appendix.

Figure 1: Filtration with the help of a Cech covering. At the top, the 5 connected components have decreased to three due to the creation of edges; the barcodes for the Betti numbers show three open line segments and two closed ones. At the bottom, one connected component is left, with one barcode corresponding to it; a "hole" has been born and a barcode has appeared for Betti 1.
The key feature of the described scheme is the idea of obtaining the evaluations of the invariants at all scales simultaneously [17]. The life span of a property is its persistence, measured by the range of scale over which it exists and coded by barcodes [31] or so-called persistence diagrams [35]. Nontrivial theorems on the robustness of persistence diagrams with respect to perturbations have been obtained, and they have made computational topology a universal non-metric means of analyzing various objects [36,37].
Let us briefly describe how the filtration can be implemented in practice. Assume that for a series of HR-images we have built simplicial complexes on their pixels. Such structures first appear around the highest readings (peaks) and the pixels in their nearest proximity. By increasing the scale, we connect pixels with lower and lower values of the measure. Moreover, separate complexes start merging due to appearing edges, which decreases the number of connected components, i.e. the zeroth Betti number β0. The result is one connected complex (see Fig. 1). However, during the filtration, empty areas surrounded by pixels of low values will almost surely appear. They exist as holes and die when the filtration level reaches their values. The number of these holes is the first Betti number β1. The described scheme is implemented by the algorithm thoroughly presented in [17,38].
The algorithm for computing the persistence diagram [17] consists of two sequential steps: construction of a filter of simplices (for two-dimensional images a simplex is a vertex, an edge, or a triangle) and computation of the Betti numbers on the created filtration. Let f(x, y) be the gray-value function of the image, taking values from 0 to 255. For the filter construction we need to determine the function value for each simplex. To do this, we associate each pixel (x, y) of the image with a vertex whose value F(v) equals the intensity f(x, y) of the corresponding pixel. The value of each remaining simplex is defined as the maximum of the values of its vertices. Now we describe the algorithm for the filter construction. First, we sort all vertices (pixels of the image) in increasing order of their function value F(v) (i.e., the intensity of the corresponding pixels) and create a sequence v1, v2, v3, . . . , vn. We further assume that if two vertices have the same value of F, then the vertex that is higher or further to the left on the image is located closer to the beginning of the sequence. Next, we iterate through all elements of the ordered sequence and add each of them to the filter. When attaching a new vertex to the filter, we also add all edges and all triangles that can be generated by the new vertex together with vertices already in the filter. The condition for creating an edge or a triangle is the presence of two neighboring vertices for the edge and three neighboring vertices for the triangle (in such a way that no two edges cross each other). The value of an added simplex is computed by the method described above. As a result we obtain the filter, a sequence of simplices s1, s2, s3, . . . , sn such that the simplices are sorted in increasing order of their value F(sj), and if the simplex si is a face of the simplex sj, then si precedes sj in the filter. Now we can easily compute the Betti numbers by processing the simplices of the filter one by one and keeping track of changes in the connectivity of the obtained set.
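The 0-dimensional part of this bookkeeping can be sketched with a union-find structure. The simplified version below (our own sketch, not the full algorithm: it uses 4-connectivity instead of edges-plus-triangles, tracks only β0, and applies the elder rule, so the younger of two merging components dies):

```python
def sublevel_persistence0(image):
    """(birth, death) pairs for connected components of the sublevel
    filtration of a grayscale image; 4-connectivity, elder rule.
    The component born first never dies: its death is infinity."""
    rows, cols = len(image), len(image[0])
    order = sorted((image[i][j], i, j) for i in range(rows) for j in range(cols))
    parent, birth = {}, {}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]   # path halving
            p = parent[p]
        return p

    pairs = []
    for value, i, j in order:
        p = (i, j)
        parent[p], birth[p] = p, value
        for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            q = (i + di, j + dj)
            if q not in parent:
                continue                    # neighbor not yet in the filtration
            rp, rq = find(p), find(q)
            if rp == rq:
                continue
            old, young = (rp, rq) if birth[rp] <= birth[rq] else (rq, rp)
            if birth[young] < value:        # drop zero-persistence pairs
                pairs.append((birth[young], value))
            parent[young] = old
    essentials = {find(p) for p in parent}
    pairs.extend((birth[r], float('inf')) for r in essentials)
    return sorted(pairs)
```

For the one-row image [[0, 3, 1, 3, 2]] this yields the pairs (0, ∞), (1, 3), (2, 3): the three local minima are born at their own values, and the two younger components die when the level 3 barriers are flooded.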
The output of the algorithm is the set of barcode lengths, or its parametrization as the persistence diagram (PD). Recall that the diagram represents persistence, i.e. the life span of a property measured on the height scale, as a point in the plane: the first coordinate is the height at which the corresponding Betti number is born, and the second is the height at which it dies.

Persistence Images
The practical application of the persistence diagram reduces to constructing on its basis descriptors that allow one to distinguish between two texture patterns formed by photometric measures. Unfortunately, the cloud of points of a persistence diagram, or a set of barcodes, is neither a vector space nor a manifold. There are several approaches to converting these objects into a more convenient structure. Historically the first, persistence landscapes, were proposed in [35]. They are based on the construction of a persistence module and piecewise linear functions, which allow comparison and averaging. We tested this approach on HR-images [38] and found that the landscapes have a rather low efficiency in the recognition of natural textures. Here we consider another approach to "vectorizing" the PD. It is attractive because it is adapted to problems of classification and machine learning. This section is therefore a continuation of our studies of the effectiveness of TDA for the analysis of digital images [38].
Typically, classifiers are constructed from the matrix of pairwise distances between descriptors. However, a distance matrix alone is not enough for a detailed study of object properties by machine learning methods. These techniques become available with a vector representation of persistence diagrams [39,40].
The idea of these approaches is to blur the persistence diagram with a Gaussian kernel [41]. The resulting persistence surface is reduced to a finite-dimensional vector by discretizing the relevant domain and then integrating. It is called a persistence image, in contrast to the generating PD. An additional filter is used to suppress the impact of points near the diagonal of the persistence diagram, where the noise is located.
Formally, following [39], we suppose that each point (bx, by) ∈ R² of the persistence diagram is the center of a Gaussian kernel. The value of the persistence image I(p) at each pixel p = (px, py) is determined by the expression

I(p) = Σ_{(bx,by)∈PD} (1 / (2π σx σy)) exp( −(px − bx)² / (2σx²) − (py − by)² / (2σy²) ),     (1)

where σx and σy are the kernel widths in the directions x and y, respectively, assumed statistically independent. This definition of a pixel of the persistence image counts with similar weight both the points near the diagonal, which mostly represent noise, and the remote points, which represent the strong signal. To suppress the noise in the sum, the contribution of each Gaussian to the integral is taken with a weight that depends on its distance to the diagonal, |b| = by − bx. The weighted image can be written as

I_f(p) = Σ_{(bx,by)∈PD} f(|b|) (1 / (2π σx σy)) exp( −(px − bx)² / (2σx²) − (py − by)² / (2σy²) ),     (2)

where the kernel parameters are defined as in expression (1), and f(|b|) is a nonnegative weighting function of the distance to the diagonal, selected to effectively suppress the diagonal noise. The selection of the persistence image parameters is discussed in detail in [39].
In our experiments we used the same function as in [39]: f(|b|) = exp(M|b|) − 1. Choosing the weight of the longest barcode equal to unity, we obtain a simple formula for the parameter: M = ln(2)/bmax, where bmax = max |b|, b ∈ B. When the maximum barcode lengths of two compared point clouds differ, the maximum length over both diagrams is taken as this parameter.
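A direct, unoptimized sketch of formulas (1)-(2) with this weight function might look as follows (the grid size, bounds and σ are illustrative, and for simplicity a single σ serves for both directions):

```python
import math

def persistence_image(diagram, grid=(20, 20), bounds=(0.0, 1.0, 0.0, 1.0),
                      sigma=0.05):
    """Weighted persistence image of a diagram given as (birth, death)
    pairs, sampled at the centers of a regular grid.  The weight
    f(|b|) = exp(M*|b|) - 1 with M = ln(2)/b_max gives weight 1 to the
    longest barcode, as in the text."""
    points = [(b, d) for (b, d) in diagram if d > b]
    b_max = max(d - b for (b, d) in points)
    M = math.log(2.0) / b_max
    xmin, xmax, ymin, ymax = bounds
    nx, ny = grid
    dx, dy = (xmax - xmin) / nx, (ymax - ymin) / ny
    image = [[0.0] * nx for _ in range(ny)]
    norm = 1.0 / (2.0 * math.pi * sigma * sigma)
    for b, d in points:
        weight = math.exp(M * (d - b)) - 1.0
        for iy in range(ny):
            py = ymin + (iy + 0.5) * dy
            for ix in range(nx):
                px = xmin + (ix + 0.5) * dx
                gauss = norm * math.exp(-((px - b) ** 2 + (py - d) ** 2)
                                        / (2.0 * sigma * sigma))
                image[iy][ix] += weight * gauss
    return image
```

Flattening the 20×20 array row by row gives the vector representation of length 400 used for classification. Note how the long barcode dominates: for the diagram {(0.1, 0.9), (0.4, 0.5)} the short barcode receives weight 2^(0.1/0.8) − 1 ≈ 0.09 against 1 for the long one.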
A base of HR (high-resolution) images of natural objects, available at http://www.cfar.umd.edu/~fer/website-texture/texture.htm, has been used for the numerical experiments. It contains 20 classes of textures, with 40 samples in each.
Examples of persistence images created for 4 textures are presented in Fig. 2. For the sake of simplicity we used blurring of the histogram instead of direct integration. This was done with a Gaussian kernel in Matlab using the function filter2. The filter is applied at high resolution; then a grid of lower resolution (e.g. 20×20) is imposed and the intensity is computed in each cell. The obtained persistence image has a vector representation as a column vector of length 400. We used a simple binary classifier to evaluate the effectiveness of persistence images as descriptors for texture recognition. The recognition results, compared with persistence landscapes [35], are listed in Table 1. With the exception of one class (No. 18), recognition by persistence images significantly exceeds the results of the landscapes.

Conclusion
The paper discusses topological methods for pattern recognition in high-resolution digital images. Their advantage is that the descriptors are computed from the images alone, without using any model. The cornerstone of the descriptor calculation is the topological filtration of the counts by sub(super)-level sets. The photometric measure of each pixel is quantized beforehand, and its level serves as the Morse function. When scanning the image by level, a pixel is considered to generate a new connected component on the support if it is not adjacent to a predecessor. On a square lattice, the eight pixels adjacent by edges and vertices are considered as the neighbors of a pixel. The outcome of the scanning is the merging of all clusters into one connected component. The level at which a connected component emerges is called the moment of its "birth", and the level of its merger with an older cluster the moment of its "death". The interval between them measures the persistence of the topological property. Clusters of pixels can also enclose voids, or "holes"; their births and deaths are identified by scanning the image by levels from top to bottom. Thus, using methods of algebraic topology, the number of connected components (Betti 0) and the number of holes, or non-bounding cycles (Betti 1), is calculated for each level. The persistence of these numbers is represented by the PD.
As a matter of principle, filtration allows the following descriptors:
• Histograms of the distribution of the Betti numbers over levels, and their statistical moments; the distance between them can be, for example, the Kullback-Leibler divergence.
• Landscapes: piecewise linear vector functions built on the diagrams.
• Persistence images: persistence diagrams blurred by a Gaussian kernel.
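The first of these descriptors can be compared, for instance, with a smoothed Kullback-Leibler divergence between normalized level histograms of Betti numbers (a minimal sketch; the epsilon smoothing against empty bins is our own choice):

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(P||Q) between two histograms,
    normalized to probability vectors, with a small epsilon guarding
    against zero bins."""
    sp, sq = sum(p), sum(q)
    p = [max(x / sp, eps) for x in p]
    q = [max(x / sq, eps) for x in q]
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

The divergence vanishes for identical histograms and grows as the Betti-number profiles of two textures diverge; note that it is not symmetric, so for a distance-like quantity one may use the symmetrized sum D(P||Q) + D(Q||P).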
Experiments with landscapes have shown that acceptable recognition quality can be achieved only when the texture class has a "characteristic" landscape profile for at least one of the Betti number components. In the case of high variability of landscapes within the same class, the recognition result can be very poor. As an alternative, the method of persistence images has been tested in this article. We have found that the efficiency of texture recognition by this method on the database can reach 90%.

A The glossary on algebraic topology
This section contains some definitions and concepts, given more for orientation than as rigorous statements. We mainly used the monographs [17,34]. A simplicial complex is a global object made of N points. It is built from oriented k-simplexes σk, which are convex hulls of (k + 1) geometrically independent points of a Euclidean space R^m, where k ≤ m. For example, a 0-simplex is a point, a 1-simplex is a line segment, a 2-simplex is a triangle and a 3-simplex a tetrahedron. The ordering of the vertices in a k-simplex σk = ⟨x0, x1, . . . , xk⟩ determines the orientation of the simplex. In a topological space, simplexes are homeomorphic to k-balls. A set V is called a simplicial complex if it is formed by a set of simplexes with the following properties:
• if σk ∈ V, then all its faces, i.e. the lower-dimensional simplexes σm ⊆ σk, m < k, also belong to V;
• a non-empty intersection of two simplexes in V is a face, i.e. a simplex in V.
Let Ck be the set of all k-simplexes. It is not difficult to give it the structure of a vector space over a field F; in practice, Zp is chosen for a small prime p. The k-simplexes then form the basis of a free Abelian group Ck, called the k-chain group. In general, a k-chain is a formal sum of a finite number of oriented k-simplexes, ck = Σi ai σk_i, where the coefficients ai ∈ Zp. Addition of two chains c¹k = Σi ai σk_i and c²k = Σi bi σk_i is defined componentwise:

c¹k + c²k = Σi (ai + bi) σk_i.     (A1)

The linear mapping ∂k : Ck → Ck−1, called the boundary operator, sends a k-simplex to its boundary, the alternating sum of its (k − 1)-dimensional faces:

∂k ⟨x0, x1, . . . , xk⟩ = Σ_{i=0}^{k} (−1)^i ⟨x0, . . . , x̂i, . . . , xk⟩,     (A2)

where x̂i denotes removal of the vertex xi. The action of the boundary operator on general k-chains is obtained by linear extension of its action on k-simplexes: ∂k(Σi ai σk_i) = Σi ai ∂k σk_i. So, the boundary of the 2-simplex ⟨a, b, c⟩ is ∂(⟨a, b, c⟩) = ⟨b, c⟩ − ⟨a, c⟩ + ⟨a, b⟩, and the boundary of this 1-chain is ∂(⟨b, c⟩ − ⟨a, c⟩ + ⟨a, b⟩) = (c − b) − (c − a) + (b − a) = 0. This illustrates the fundamental property of the boundary operator: ∂k ∂k+1 = 0. The operator ∂k acts on the sequence of embedded chain groups C0 ⊂ · · · ⊂ Ck−1 ⊂ Ck ⊂ Ck+1 ⊂ . . ., lowering the dimension, and is a homomorphism, i.e. it preserves the group operation. The subgroup (subspace) Zk ⊂ Ck formed by the kernel of the boundary operator, Zk = ker ∂k, consists of the k-cycles, i.e. the chains z for which ∂k z = 0. In particular, for k = 0 the boundary operator is zero, ∂0(c0) = 0, so all 0-chains are cycles: Z0 = C0. The second subgroup of Ck is the group of k-chains that are boundaries of (k + 1)-chains. This group of boundaries Bk is the image of the operator ∂k+1: Bk = im ∂k+1 = ∂k+1(Ck+1). Since ∂k Bk = ∂k ∂k+1 Ck+1 = 0, the boundaries form a subgroup of the group of cycles: Bk ⊂ Zk. Thus Bk ⊂ Zk ⊂ Ck, and consequently we can form the factor group Hk = Zk/Bk. It is called the k-dimensional homology group. The elements of Hk are classes of k-cycles, equivalent up to a boundary. Consider the 1-simplex ⟨a, b⟩ as an example.
According to (A2) its boundary is the difference of two vertexes: ∂⟨a, b⟩ = b − a. The vertexes, or zero-simplexes, are homologous to one another, b ∼ a, as they differ by a boundary: b = a + ∂⟨a, b⟩. In other words, two k-cycles w, z ∈ Zk are homologous to one another if their difference is a boundary: w − z ∈ Bk. The number of independent homology classes in each dimension is measured by the so-called Betti numbers, the ranks of the homology groups: βk = dim Hk.
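The identity ∂∂ = 0 is easy to check mechanically over Z2, where the alternating signs of the boundary formula disappear and a chain is just a set of simplexes added by symmetric difference (a sketch with our own naming):

```python
from itertools import combinations

def boundary(simplex):
    """Boundary of a simplex over Z2: the set of its (k-1)-faces."""
    return set(combinations(simplex, len(simplex) - 1))

def boundary_chain(chain):
    """Boundary of a Z2-chain (a set of simplexes): mod-2 coefficients
    turn the alternating sum into a symmetric difference."""
    result = set()
    for simplex in chain:
        result ^= boundary(simplex)
    return result

triangle = ('a', 'b', 'c')
edges = boundary_chain({triangle})   # the three edges of the triangle
cycle = boundary_chain(edges)        # the boundary of a boundary
```

Here `edges` comes out as the 1-cycle {⟨a,b⟩, ⟨a,c⟩, ⟨b,c⟩}, and its own boundary `cycle` is empty, as ∂∂ = 0 demands.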
The cloud of points obtained in an experiment is neither a space nor a manifold. Therefore, homology is calculated by building links between nearby points. There are various methods of building complexes on a point cloud. Most often the so-called Vietoris-Rips complex is used. It is based on the points and the N × N matrix of pairwise distances. For a given scale parameter ε, the simplicial complex Sε is built by including the points as 0-simplexes, together with the k-simplexes formed by sets of (k + 1) points whose pairwise distances do not exceed ε. However, the selection of ε is often not obvious. If we choose too small a scale, we get the homology Hk(Sε) of the discrete point cloud itself. On the contrary, too big a scale leads to a topological space equivalent to a point.
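A naive construction of the Vietoris-Rips complex up to dimension 2 can be written directly from this definition (names and the brute-force enumeration are ours; practical libraries use far smarter expansion schemes):

```python
from itertools import combinations
from math import dist

def vietoris_rips(points, eps, max_dim=2):
    """Vietoris-Rips complex at scale eps: a k-simplex is included when
    all pairwise distances among its k+1 vertices are <= eps."""
    n = len(points)
    close = [[dist(points[i], points[j]) <= eps for j in range(n)]
             for i in range(n)]
    simplexes = [(i,) for i in range(n)]            # the 0-simplexes
    for k in range(1, max_dim + 1):
        for verts in combinations(range(n), k + 1):
            if all(close[i][j] for i, j in combinations(verts, 2)):
                simplexes.append(verts)
    return simplexes
```

On the four corners of a unit square, ε = 1 yields only the four sides (8 simplexes in total), while ε = 1.5 also admits the two diagonals and all four triangles, illustrating how the complex coarsens with scale.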
The way out is to consider a set of nested simplicial complexes indexed by the parameter ε. Thus, the embedded sequence Sε1 ⊆ Sε2 ⊆ . . . corresponds to the sequence ε1 ≤ ε2 ≤ . . .. When ε changes, the topological characteristics of the simplicial complexes also change. This procedure is called a filtration. Persistent homology tracks the topology over a range of scale values.
There are two ways to present information about persistent homology: barcodes and PDs. These tools indicate at what value of ε a topological property appears (is born) and at what value it dies. A horizontal line segment in the barcode begins at the birth of the property and ends when the property dies. For any ε, the Betti number βk(ε) is simply the number of barcodes intersected by the vertical line passing through ε.
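Counting the barcodes crossed by the vertical line through ε is a one-liner (the half-open interval convention [birth, death) is our own choice):

```python
def betti_at(barcodes, eps):
    """Betti number at scale eps: the number of barcode intervals
    [birth, death) that contain eps."""
    return sum(1 for birth, death in barcodes if birth <= eps < death)
```

For the barcodes [(0, 2), (0, 5), (1, 3)], the Betti number is 2 at ε = 0, rises to 3 at ε = 1.5, and drops back to 1 at ε = 4, exactly as the bars open and close.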
The PD, on the other hand, presents the homological information by embedding it in the Cartesian plane. The horizontal and vertical axes encode the scales of birth and death, respectively. Since all components die after their birth, the embedding lies in the upper half-plane, above the diagonal ∆ = {(x, x) ∈ R² | x ∈ R}. Properties can be born and die at the same time; the diagram is therefore a multiset, with infinitely many copies on the diagonal. Points near the diagonal are considered noise. To illustrate the diagram, consider the graph {x, f(x)} of a continuous function and the set of sublevels At = f⁻¹((−∞, t]), i.e. the points at which the function lies below the given level (Fig. A.1). Thus, the level is the filtration parameter for the topology of the sublevels. For example, the minimum at the point (2.5, 3.5) adds a connected component to A3.5, which is later "killed" by the maximum at (5.5, 6.5). The point corresponding to this event is shown in the right part of the PD. It is almost obvious that points near the diagonal correspond to small fluctuations of the graph.
The stability theorem states that for two functions f1, f2 that are close in the supremum metric, the corresponding persistence diagrams D1, D2 are also close in the sense of the "bottleneck" distance W∞, one of the Wasserstein metrics:

W∞(D1, D2) ≤ ||f1 − f2||∞,

where W∞(D1, D2) = inf_γ sup_x ||x − γ(x)||∞ and the infimum is taken over all bijections γ : D1 → D2 between the diagrams (augmented by the diagonal). This theorem is the cornerstone of indicator diagnostics based on persistent homology.
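For very small diagrams the bottleneck distance can be computed by brute force over all matchings, allowing each point to be matched either to a point of the other diagram or to its own L∞-projection onto the diagonal (a sketch under these assumptions; practical implementations use geometric matching algorithms instead of enumerating permutations):

```python
from itertools import permutations

def linf(p, q):
    """L-infinity distance between two points of the plane."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def diag(p):
    """Closest diagonal point to p in the L-infinity metric."""
    m = (p[0] + p[1]) / 2
    return (m, m)

def bottleneck(d1, d2):
    """Brute-force bottleneck distance between two small persistence
    diagrams: minimize, over all matchings, the largest L-inf move."""
    a = list(d1) + [diag(q) for q in d2]   # d2 points may fall to the diagonal
    b = list(d2) + [diag(p) for p in d1]   # d1 points may fall to the diagonal
    best = float('inf')
    for perm in permutations(range(len(b))):
        cost = 0.0
        for i, j in enumerate(perm):
            p, q = a[i], b[j]
            if p == diag(p) and q == diag(q):
                continue                    # diagonal-to-diagonal is free
            cost = max(cost, linf(p, q))
        best = min(best, cost)
    return best
```

For example, the diagrams {(0, 2)} and {(0, 2.1)} are at bottleneck distance 0.1 (matching the two points beats sending both to the diagonal), while {(0, 2)} against the empty diagram costs 1.0, the L∞ distance of (0, 2) to the diagonal.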