Modeling the spatial structure of thermokarst lake fields in the cryolithozone of Northern Eurasia accounting for the lognormal law of their size distribution

Power and exponential lake size distributions are now routinely used for modeling the spatial structure of thermokarst lake fields. Based on a combination of satellite images of moderate and high spatial resolution, a synthesized lake size distribution histogram is constructed. According to the Pearson criterion, this histogram agrees with a lognormal distribution law with a probability of 0.99. This distribution law takes into account small lakes, which are considered intensive sources of methane emission into the atmosphere. Based on a geo-simulation approach, a model of the spatial structure of thermokarst lake fields is proposed that takes into account the lognormal lake size distribution. Algorithms for modeling the spatial structure of thermokarst lake fields are presented. An example of modeling a field of thermokarst lakes with a lognormal lake size distribution is given.


Introduction
Current global warming, most clearly manifested in the northern latitudes of the planet, accelerates the degradation of permafrost. Permafrost, being a storehouse of conserved carbon in the vast frozen peat bogs of Northern Eurasia, can become a source of even greater warming through the release of greenhouse gases, which would create major new challenges for the world community related to the disruption of human-nature interaction. Indeed, carbon is currently bound as organic matter in the permafrost layer in the northern territories of Eurasia and America. As the climate warms, rising temperatures will lead to the thawing of frozen rocks and to an additional release of methane as a product of the vital activity of microorganisms recycling thawed organic matter, which can make a further tangible contribution to climate warming.
The dominant role of small thermokarst lakes (with areas less than 0.01-0.05 ha) in the accumulation of methane was established in [1] for the permafrost zone of Western Siberia. However, because of their small size, the contribution of millions of such lakes to the global greenhouse effect has not yet been taken into account. A recently published paper [2] attempted to account for them in estimating the total volume of world methane reserves, but, for lack of experimental data, relied on a theoretical power law of the lake size distribution. This attempt raises great doubts, since the power law is not supported by experimental data [3]. The development of measures to prevent an increase in the average annual temperature by more than 2 degrees by 2050, in accordance with the decisions of the World Summit on Climate (Paris, 2015), calls for forecasts of the dynamics of the methane stock in the lakes of the northern territories for the coming decades. This requires the development of methods and tools for modeling the dynamics of thermokarst lake fields that would account for the contribution of millions of small lakes to the total amount of methane reserves in the vast territories of Northern Eurasia.
According to Moiseev and Svirezhev [4], simulation modeling is a research method that builds an approximate model of the spatial objects being studied. Simulation modeling is one of the most important types of mathematical modeling and may be used to construct a model of thermokarst lake fields with an accuracy that is sufficient for the current research. Law and Kelton [5] state that simulation modeling is used to construct models in two cases: first, when there is no analytical solution, or the solution is very complex and requires huge computing capacity; and second, when the amount of experimental data about the object being modeled is insufficient for statistical methods. In such cases, the mathematical model is developed by means of simulation.
All the above types of modeling are aimed at studying spatial objects using spatial data whose analysis is based on spatial analysis methods implemented with the help of modern geo-information systems (GIS analysis). In our opinion, the most suitable general term for the above-mentioned types of modeling (mathematical-cartographical, spatial, geo-information, geo-simulation, etc.) is 'geo-simulation modeling', which is defined as the creation of a model and its application to objects with spatial structure. Problems of creating a geo-simulation model of thermokarst lake fields are studied by Polishchuk V. and Polishchuk Yu. in [18,19]. At present, the power and exponential laws of the lake size distribution, based on data from Landsat space images, are used for modeling the spatial structure of thermokarst lake fields. These images have a spatial resolution of 30 m, which does not allow small lakes to be studied. The main goal of the paper is to use the lognormal law to take small lakes into account in modeling the spatial structure of thermokarst lake fields.

Geo-simulation model of spatial structure of thermokarst lake fields
The creation of a geo-simulation model of thermokarst lake fields requires knowledge of the basic properties of these fields, which can be obtained experimentally. Because of the inaccessibility of the northern territories of Eurasia, experimental studies of thermokarst were carried out by remote sensing. For the remote study, twenty-nine test sites (TSs) were chosen in different zones of the West-Siberian permafrost (sporadic, discontinuous, and continuous). The shape of the boundaries of thermokarst lakes was studied using satellite images. Investigations performed at test sites in sporadic, discontinuous, and continuous permafrost showed that the error in estimating the lake areas when the real lake boundaries are replaced by a circle is comparatively small (about 5% [19,20]). This may justify choosing a circle as the model of a lake in geo-simulation modeling of thermokarst lake fields. In addition, the formation of a geo-simulation model of thermokarst lake fields in the form of a population of random circles requires experimental knowledge of the distribution of the coordinates of lake centers and of the distribution of lake sizes (lake size distribution). To establish the regularities of the distribution of the random coordinates of lakes, satellite images were used. Analysis of histograms of the latitude and longitude values of the lake centers, given by Polishchuk V. and Polishchuk Yu. in [18], showed that the experimental distribution of the coordinates of lake centers corresponds to the law of uniform density according to the χ² criterion with a probability of 95% [19].
Hence, the following fundamental principles determining substantial properties of the model of spatial structure of thermokarst lake fields can be formulated:
1. Lake coastline shapes can be represented by a circle equation with the center coordinates $x_i$, $y_i$ and the area $s_i$ ($i$ is the lake serial number).
2. Spatial changes in the position of the centers of circles and their areas are statistically independent.
3. The coordinates of the circle centers are randomly distributed according to the law of uniform density.
A model of a thermokarst lake field is a population of random circles (Fig. 1) whose statistical characteristics correspond to the above principles (1-3). Figure 1 presents a geometrical interpretation of the model of thermokarst lake fields. Using a triplet of numbers $(x, y, s)$ representing the coordinates of the circle center and the value of its area, the coordinates $(X, Y)$ of the points determining the border of each circle are calculated as follows:
$$(X - x)^2 + (Y - y)^2 = \frac{s}{\pi}.$$
Major elements in the model description are the characteristics of lake shapes, the parameters of their random location on the surface, and the specific form of the size distribution law of lakes. The properties of the model fields of thermokarst lakes will depend substantially on the form of the distribution law of lake sizes (areas). The form of the size distribution law can be determined experimentally from satellite images.
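The circle representation of a lake can be illustrated with a short Python sketch. It assumes only that a lake is given by the triplet of center coordinates and area described above; the function name `circle_boundary` and the parameter `n_points` are hypothetical and not taken from the source.

```python
import math

def circle_boundary(x, y, s, n_points=64):
    """Return n_points (X, Y) pairs on the border of a circle
    with centre (x, y) and area s (illustrative helper)."""
    # A circle of area s has radius sqrt(s / pi).
    r = math.sqrt(s / math.pi)
    return [
        (x + r * math.cos(2.0 * math.pi * k / n_points),
         y + r * math.sin(2.0 * math.pi * k / n_points))
        for k in range(n_points)
    ]
```

For example, `circle_boundary(10.0, 20.0, math.pi, n_points=4)` returns four points lying at distance 1 from the center (10, 20).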
At present, lake inventory data are used in studying the regularities of the lake size distribution at both planetary [21] and regional [3,19] levels. These data are based on remote measurements of the areas and the number of lakes from medium-resolution Landsat satellite imagery (resolution: 30 m).

Modeling of thermokarst lake fields
In the general case, the joint probability density $f(x, y, s)$ of the random center coordinates and areas of the circles imitating lakes in a mathematical model of random thermokarst lake fields describes both their distribution laws and the statistical connections between the lake coordinates and their areas. According to [19], the coordinates and the areas are statistically independent, so the joint density factorizes into the density of the coordinates and the density of the areas:
$$f(x, y, s) = f_{xy}(x, y)\, f_s(s).$$
Hence, in addition to standard software generators of pseudo-random numbers, the software implementation of a simulation model of thermokarst lake fields includes a generator of pseudo-random number sequences distributed in accordance with the lake size distribution law determined from satellite images.
We consider a geo-simulation model of the spatial structure of a thermokarst lake field that is an aggregate of circles and reflects the state of the thermokarst lake field whose geometrical representation is shown in Fig. 1. According to the above, the coordinates x and y are distributed according to a uniform law. Based on Landsat satellite imagery, it is shown in [19] that lake areas are distributed according to an exponential law. Therefore, a one-parameter exponential law was chosen for describing the thermokarst lake size distribution in the following form:
$$f(s) = \lambda e^{-\lambda s}, \quad s \ge 0, \qquad (1)$$
where $\lambda$ is the distribution parameter.
The correspondence of the exponential law of lake area distribution given by Eq. (1) to experimental histograms was tested in [19]. It is shown that at all test sites in the studied territory this law corresponds to the experimental data according to the χ² criterion with an average probability of 90%. Consequently, the exponential law of the lake area distribution in the form (1) does not contradict the experimental data. An analysis of the experimental distribution of lake areas has shown that λ at all test sites varies in the range of 0.037-0.071 in continuous permafrost and in the range of 0.034-0.086 in discontinuous permafrost, with average values of 0.054 and 0.060, respectively. The estimation has shown that the error in determining the average values of lake areas by modeling with the use of experimental data is 17%. This may be regarded as a suitable result of modeling thermokarst lake fields for predicting their dynamics. However, this model does not take into account small lakes (with sizes less than 0.2-0.5 ha), which have an increased concentration of methane in the lake water.
These small lakes are not detected in Landsat images and, therefore, are not taken into account in the model within the exponential distribution of lakes in size, which makes it necessary to use satellite imagery of high and ultra-high resolution.
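The exponential-law model described above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in [19]: the function name, the rectangular domain given by `width` and `height`, and the `seed` argument are assumptions, and the units of the areas are whatever units the parameter λ was estimated in, which the text does not state.

```python
import random

def simulate_exponential_field(n_lakes, lam, width, height, seed=None):
    """Generate n_lakes model lakes as (x, y, s) triplets.

    x, y : centre coordinates, uniform over a width x height rectangle
           (the rectangle is an assumed domain shape).
    s    : lake area drawn from the exponential law f(s) = lam * exp(-lam * s).
    """
    rng = random.Random(seed)
    field = []
    for _ in range(n_lakes):
        x = rng.uniform(0.0, width)
        y = rng.uniform(0.0, height)
        s = rng.expovariate(lam)  # mean area equals 1 / lam
        field.append((x, y, s))
    return field
```

With λ = 0.054 (the average value reported for continuous permafrost), the mean simulated area tends to 1/λ ≈ 18.5 in the units of the original estimate.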
In this connection, the distribution of lakes in the cryolithozone of Western Siberia was investigated using high-resolution space images of Canopus-B (resolution: 2 m) in combination with Landsat-8 (30 m) images. The space images were processed using the standard tools of the geographic information system ArcGIS 10.3 in order to obtain data on the number and areas of lakes of significantly different sizes and to construct a histogram of the lake distribution over a very wide range of sizes. Such a histogram of the lake areas over an extremely wide range of sizes (from units of m² to hundreds of km²) was constructed from satellite imagery using the technique described in [23], which proposed a three-stage procedure for constructing a histogram based on the integration of satellite data on the areas and the number of lakes. As a result, a single synthesized histogram of the lake distribution over areas (Fig. 2A) was obtained in a wide range of sizes by synthesizing two initial histograms obtained separately from high-resolution (HR) and medium-resolution (MR) images. For the construction of this histogram, partial intervals with an irregular (logarithmic) step were chosen, namely 2-5 m², 5-10 m², 10-20 m², 20-50 m², etc. up to 200 km², which made it possible to present the data on the distribution of lakes over size intervals quite compactly in a very wide range of lake areas. The synthesized histogram, which takes into account both large and small lakes and is obtained by integrating data from satellite images of different resolution, can be approximated by a lognormal distribution law.
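The irregular (logarithmic) intervals listed above follow a 1-2-5 sequence per decade. The sketch below shows one way to generate such bin edges and count lakes per interval. It covers only the binning step, not the three-stage synthesis procedure of [23], and the function names are hypothetical. Areas are assumed to be expressed in m², so that 200 km² corresponds to 2·10⁸ m².

```python
from bisect import bisect_right

def log_bin_edges(lo=2.0, hi=2.0e8):
    """Bin edges 2, 5, 10, 20, 50, ... (1-2-5 steps per decade) between lo and hi."""
    edges = []
    decade = 1.0
    while decade <= hi:
        for m in (1, 2, 5):
            edge = m * decade
            if lo <= edge <= hi:
                edges.append(edge)
        decade *= 10.0
    return edges

def histogram(areas, edges):
    """Count areas in the half-open intervals [edges[k], edges[k + 1])."""
    counts = [0] * (len(edges) - 1)
    for a in areas:
        if edges[0] <= a < edges[-1]:
            counts[bisect_right(edges, a) - 1] += 1
    return counts
```

With the default limits, `log_bin_edges()` yields 25 edges, that is, 24 intervals from 2-5 m² up to 100-200 km².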
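According to [24], the probability density for the lognormal distribution law of the area of lakes (s) is determined by the equation
$$f(s) = \frac{1}{s\,\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln s - \mu)^2}{2\sigma^2}\right), \quad s > 0, \qquad (5)$$
where μ is the expectation and σ is the standard deviation of $\ln s$.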
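For reference, the standard lognormal density can be evaluated with a few lines of Python. The sketch assumes the usual parametrization, in which μ and σ are the expectation and standard deviation of the logarithm of the area; the function name is hypothetical.

```python
import math

def lognormal_pdf(s, mu, sigma):
    """Probability density of the lognormal law at area s.

    mu and sigma are the mean and standard deviation of ln(s)
    (standard parametrization, assumed here).
    """
    if s <= 0.0:
        return 0.0  # the density is zero for non-positive areas
    z = (math.log(s) - mu) / sigma
    return math.exp(-0.5 * z * z) / (s * sigma * math.sqrt(2.0 * math.pi))
```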
In our case, estimates of the mathematical expectation (M) and variance (D) for the lognormal distribution of the lake areas in the permafrost zone were obtained from experimental sample data in [23] as M = 6.88 and D = 3.42. The compliance of the empirical and theoretical distributions was tested with the Pearson test using the Excel software package. The test showed that the synthesized histogram of the lake area distribution obtained over a wide range of sizes corresponds to a lognormal law with a probability of 0.99. Fig. 2B presents the empirical distribution of the total area of lakes according to their sizes, which shows that lakes with sizes from 2 to 500 ha account for the bulk of the total lake area (about 80%).
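The source performed the Pearson test in Excel; the Python sketch below only illustrates how the Pearson χ² statistic could be computed for a binned histogram against a lognormal law. It assumes that the expected counts are renormalized to the range covered by the histogram and that two parameters (μ and σ) were estimated from the data; the function names are hypothetical.

```python
import math

def lognormal_cdf(s, mu, sigma):
    """Cumulative distribution function of the lognormal law."""
    if s <= 0.0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(s) - mu) / (sigma * math.sqrt(2.0))))

def pearson_chi2(observed, edges, mu, sigma):
    """Return (chi2, dof) for observed bin counts against a lognormal law.

    observed : counts per bin, len(observed) == len(edges) - 1
    edges    : increasing bin edges
    """
    n = sum(observed)
    # Probability mass of the fitted law inside the histogram range.
    total_p = lognormal_cdf(edges[-1], mu, sigma) - lognormal_cdf(edges[0], mu, sigma)
    chi2 = 0.0
    for k, obs in enumerate(observed):
        p = (lognormal_cdf(edges[k + 1], mu, sigma)
             - lognormal_cdf(edges[k], mu, sigma)) / total_p
        expected = n * p
        if expected <= 0.0:
            raise ValueError("bin %d has zero expected count" % k)
        chi2 += (obs - expected) ** 2 / expected
    # Degrees of freedom: bins - 1 - number of estimated parameters (2).
    dof = len(observed) - 1 - 2
    return chi2, dof
```

The statistic is then compared with the χ² critical value for `dof` degrees of freedom, taken from tables or a statistics library.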
Based on the above geometric interpretation of the model of the spatial structure of thermokarst lake fields, where the circles are characterized by three random variables, namely the two coordinates of the center and the area, we briefly describe the procedure for modeling fields of thermokarst lakes in accordance with the lognormal distribution. The values of the coordinates of the centers are calculated using a generator of pseudo-random numbers distributed according to the law of uniform density, as described in [19]. The pseudo-random value characterizing the area of the circle is calculated using a generator of pseudo-random numbers distributed according to the lognormal law in accordance with formula (5). The result of the procedure is a modeled field of lakes whose area values obey the lognormal distribution law.
For illustration, Fig. 3 shows a simulated image of a field of thermokarst lakes. Here, the lognormal distribution parameters M = 6.88 and D = 3.42, which characterize the permafrost zone in the territory of Western Siberia, were used. When modeling the fragment, the number of model lakes was set to 3000.
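The modeling procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' software: M and D are interpreted as the mean and variance of the logarithm of the lake area, the rectangular domain size (`width`, `height`) is arbitrary, and overlaps between circles are not handled, since the text does not describe how they are treated.

```python
import math
import random

def simulate_lognormal_field(n_lakes=3000, m=6.88, d=3.42,
                             width=10_000.0, height=10_000.0, seed=None):
    """Generate model lakes as (x, y, s) triplets.

    x, y : centre coordinates, uniform over an assumed width x height rectangle.
    s    : lake area drawn from the lognormal law, with m and d taken as the
           mean and variance of ln(s) (an assumption about the source's M and D).
    """
    rng = random.Random(seed)
    sigma = math.sqrt(d)  # standard deviation of ln(s)
    field = []
    for _ in range(n_lakes):
        x = rng.uniform(0.0, width)
        y = rng.uniform(0.0, height)
        s = rng.lognormvariate(m, sigma)
        field.append((x, y, s))
    return field
```

Calling `simulate_lognormal_field(seed=1)` returns 3000 triplets. Under the stated interpretation of M, the median simulated area tends to e^6.88, and the sample mean and variance of ln(s) should be close to 6.88 and 3.42.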

Conclusions
This paper described an approach to modeling the spatial structure of thermokarst lakes on the basis of a geo-simulation model that represents a set of random circles with a uniform distribution of the coordinates of their centers and a lognormal size distribution. An experimental substantiation of the lognormal distribution of the lake size was given on the basis of a study of the empirical distribution of thermokarst lake areas over a very wide range of sizes in the permafrost zone, using space images of different spatial resolution jointly. The results of testing the compliance of this law with an empirical histogram showed that the lognormal law fits the experimental data, according to the Pearson criterion, with a probability of 0.99. A procedure was briefly described for modeling a field of thermokarst lakes, where each model lake is characterized by a triplet of numbers: the coordinates of the center and the area. A fragment of