Modeling Exact Frequency-Energy Distribution for Quakes by a Probabilistic Cellular Automaton

We develop the notion of Random Domino Automaton, a simple probabilistic cellular automaton model for earthquake statistics, in order to provide a mechanistic basis for the interrelation of Gutenberg–Richter law and Omori law with the waiting time distribution for earthquakes. In this work, we provide a general algebraic solution to the inverse problem for the model and apply the proposed procedure to seismic data recorded in the Legnica-Głogów Copper District in Poland, which demonstrate the adequacy of the method. The solution of the inverse problem enables adjustment of the model to localization-dependent seismic properties manifested by deviations from Gutenberg–Richter law.


Introduction
Earthquakes are among the most devastating natural phenomena, but their studies are far from being completed, mainly because the mechanisms of earthquakes are still not fully understood [1]. They are extremely complex, and investigation of them involves many different approaches (see, for example, [2]). There is also additional complexity coming from sensitivity on local geological settings, which are always-to some extentunknown. In spite of dependence on local tectonics, there are known universal empirical laws for earthquakes-Gutenberg-Richter (G-R) law [3], and Omori law [4]. Earthquakes, being complex systems, exhibit scaling behavior [5], in particular the probability density function of interevent times can be rescaled into a single function [6][7][8]. This property was further investigated in the context of the epidemic-type aftershock sequence (ETAS) models [9,10]. Other universal behavior of earthquakes was investigated using a concept of natural time-see [11,12] and references therein.
This article is a contribution toward constructing a mechanistic model in the form of probabilistic cellular automata (PCAs) for generating interevent time distribution based on a locally deviated size-frequency distribution (see also [13]). In this paper, we focus on accurate modeling of the frequency-energy distribution, while time-related issues will be presented in the next article.
PCAs [14] are simple computational models, yet are capable of simulating complex phenomena. In particular, cellular automata proved to be useful modeling tools in seismology [15]. The action of CA and PCA is determined by its local transformation rules, which-in the case of natural phenomena-reflect the physical dependencies crucial for the phenomenon under investigation. Their relatively simple structure allows for insight into expected time series and general properties of earthquakes.
Here, we consider abstract representation of earthquake statistics by Random Domino Automaton (RDA) [16] with a local transformation rule based on a dissipative process of slow energy accumulation and abrupt releases understood as quakes. Contrary to the majority of PCA models, which are investigated mostly by extensive simulations (see, for example, [17] and also references therein), RDA possesses a unique mathematical structure, which allows for analytical derivation of several of its properties and constraints in the form of equations for stationary state.
The plan of this paper is as follows. In Section 2, we introduce the notion of Random Domino Automaton. Section 3 contains the derivation of equations describing the stationary state of the automaton. Some detailed calculations of the derivation are presented in Appendix A. Next, in Section 4, we formulate and solve the inverse problem for Random Domino Automaton. Section 5 contains application of the developed framework to exemplary open access data of episode LGCD [18] from LUMINEOS network located in the Legnica-Glogow Copper District, Poland (for details see [19,20]). Finally, we end with a discussion in Section 6.

The Random Domino Automaton
The RDA model [16] is a completely discrete dynamical system, i.e., both independent and dependent variables are discrete. Space consists of N cells along a line or circle, if periodic boundary conditions are assumed. This means that a cell has exactly two neighboring cells. The size of the automaton is N, where we assume N can be arbitrarily big.
Each cell may be empty or occupied by a single ball, which represents the absence or presence of energy, respectively. If a number i of consecutive cells are occupied, they belong to a cluster of size i. Clusters are separated by empty clusters-consecutive empty cells. The size of an empty cluster is equal to the number of cells contained in it. Empty cells may have zero, one or both neighboring cells occupied, and hence we distinguish three kinds of them. We will refer to these three kinds of cells as creation, enlarging and merging cells, respectively. The names originate from the dynamics of the automaton: what is the influence of the change in the status of the cell (from empty to occupied) on the total number of clusters?

Evolution Rules
Discrete time dynamics is defined as follows. In each time step one cell is chosen. We assume that every cell has the same chance of being selected.

•
If the chosen cell is empty, then it becomes occupied with some fixed probability, depending of the type of the cell. The value of probability is: c 0 for a creation cell, c 1 for an enlarging cell, and c 2 for a merging cell. The state of the chosen cell remains unchanged (i.e., empty) with probability (1 − c 0 ), or (1 − c 1 ), or (1 − c 2 ), respectively. • If the chosen cell is occupied, and it belongs to the cluster of size i, the whole cluster is removed (i.e., each cell in the cluster is emptied) with probability µ(i) = µ i depending on the size i of the cluster. The state of the chosen cell stays the same (i.e., occupied) with probability (1 − µ i ).
The removed cluster forms an avalanche of size equal to the size of the removed cluster. For µ = constant the RDA model is equivalent to the Drossel-Schwabl forest fire model [21].

Variables
The number of clusters of size i in the system is denoted by n i , and the number of empty clusters of size i by n 0 i . Both n i s and n 0 i s depend on time; however, we will refer to these variables as describing a stationary state, thus constant [22]. Note that these constant values are average values over time, and hence they are non-negative real numbers, not necessarily integers.
The density of the system, a ratio of the number of occupied cells to the size of the system, is thus given by Energy of a cluster of size i is equal to i statutory units, and hence ρN refers to the total energy contained in the system. The total number of clusters is The total number of empty clusters n 0 is defined in the same way. We assume periodic boundary conditions, i.e., assume that the last cell is adjacent to the first one, and also, that there are at least one cluster and at least one empty cluster present in the system. These assumptions make n and n 0 equal to each other, because each cluster is followed by an empty cluster. Without these assumptions, the actual values of n and n 0 can either be equal to each other or may differ by one. For the large size of the system N, and large values of n and n 0 , such a difference is negligible.
There are three kinds of empty cells, distinguished according to the number of occupied neighbors. An empty cell may be a neighbor for 0, 1 or 2 occupied cells. The total number of such empty cells in the system is denoted by x 0 , x 1 and x 2 , for these three kinds, respectively. Therefore, it follows An empty cell can change its state to occupied, according to specific evolution rules defined below. In such a case, there are three different situations, depending on the number of its occupied neighbors. Occupation of an empty cell with no occupied neighbor results in the creation of a new cluster (of size 1). Other results are: enlarging of an existing adjacent cluster and merging of two adjacent clusters. Hence, empty cells are named: creating, enlarging and merging, respectively.
The following expressions for x i , i = 0, 1, 2 follow directly from the definitions The first one comes from counting empty cells from interiors (inner cells, without those on ends) of empty clusters and the second from counting the remaining end cells. We point out the identity which reflects the fact that each cluster has two empty cells as neighbors. These cells are of enlarging or merging type, and each merging cell is a neighbor for two clusters. Obviously, the constraint expressed by the Equation (7) follows also from Equations (5) and (6).

Equations
The RDA is a Markov chain, and its space of states is irreducible, aperiodic and recurrent [22]. Thus, statistically stationary state is well defined, and it is possible to derive respective balance equations by using the "flow in = flow out" principle, and counting respective probabilities.
Below, we present the balance equations for ρ, n, x i and n i for the statistically stationary state of the automaton in mean field approximation for the special choice c 0 = c 1 = c 2 = c, while details of the derivation in the general case (arbitrary c 0 , c 1 and c 2 ) are presented in Appendix A.
The balance equation for density ρ reflects an equilibrium condition between all processes-creating, enlarging, merging and triggering avalanches, and hence it is The balance equation for the number of clusters n is All processes, except for enlarging, make a change to the number of clusters. These two equations are exact, contrary to the following equations, whose derivation uses the mean field approximation (see Appendix A). It follows that the balance equation for creating cells x 0 , enlarging cells x 1 and merging cells x 2 are of the form 2c Remark 1. Not all of the equations derived above are independent. Because of the relation (3), a combination of Equations (10)- (12) gives the Equation (8) for the density ρ. The relation (7), implies that a combination of Equations (11) and (12) must be consistent with the Equation (9) for the total number of clusters n.
Finally, the balance equations for n i 's are as follows: Remark 2. Equations (13)-(15) sum up to balance Equation (8) for ρ.

Formulation
The inverse problem for RDA [23] is to find the rebound parameters (probabilities µ i and c i ) that result in the given stationary distribution of avalanches w i . Here, w i is a frequency of appearance of avalanches of a given size i.
The probability of an avalanche of size i is proportional to the number of cells contained in clusters of size i (i.e., i · n i ) times respective rebound parameter µ i , hence and we assumed a normalization ∑ i w i = 1 .
A key observation for solving the inverse problem is that it is possible to express parameters of µ i as functions of n i and x i , using Equations (13)- (15). Moreover, the set of Equations (26)- (28), which are shown below, allows us to calculate w i from n i sequentially. In order to achieve this goal, we need one more step, namely defining "unit-less" variables.
Consideration of the stationary state makes the "speed" of time flow irrelevant, and thus the stationary state of RDA is influenced by ratios of probabilities rather then values of rebound parameters itself. Similarly, considering relative frequencies of avalanches, we may remove the dependence of numbers of avalanches and clusters in favor of ratios of respective variables in a way analogous to definition "intensive" (as opposite to "extensive") variables in thermodynamics and statistical physics. Thus, we definê Note 0 ≤x 1 ≤ 2, 0 ≤x 2 ≤ 1, andx 0 ,μ are non-negative. Using these "unit-less" variables, we define where we assume both sums are convergent. Notice that the average size of avalanche Obviously η ≥ 1.
In view of Remark 1 and the relation (7) which allows us to eliminate the variablex 1 , the set of balance equations is reduced to three independent equations. The balance Equations (8)-(12) (for the density ρ, for the total number of clusters n, and for the number of merging cells x 2 ) expressed in normalized variables are of the form The set can be solved explicitlŷ and Note that the limit η −→ ∞ givesx i = 2/3 for i = 1, 2, 3. Note also that the average size of empty clusters is We can also see that, from the equations of the distribution of clusters (13)-(15), by removing denominators and rearranging, one can obtain Summing up these equations for all i, one can obtain the identity 0 = 0.

The Procedure of the Solution to the Inverse Problem
With equations of the previous subsection, we can formulate the following procedure of solving the inverse problem.
• Take w i from data and normalize ∑ i w i = 1.

Exponential Tail
First, we apply the proposed above framework to an avalanche distribution, given by a geometric relation: We set q = 99/100, which implies the average size of avalanche η = 100, and having performed computations using rational values, one can obtain the following exact values: The distribution of clustersn i is displayed in Figure 1 on the left side. The starting two values equaln 1 = 10,201/30,100 andn 2 = 104,979,699/906,010,000. The sequence of parameters µ i obtained accordingly is calculated up to i = 10.000 and presented in Figure 1 on the right side. Starting with obtained values of rebound parameters µ i and calculating values of w i from respective equations, we reconstruct the values of avalanche distribution. Due to the usage of computations with rational numbers, which contain no approximation errors, the distribution fits to the geometric series (29) with value q = 99/100 exactly. This property, that geometric extinction can be calculated exactly, makes it useful for serving as a smooth cut-off of any real data, which always are given within some finite range of values.

Seismic Data from LUMINEOS Network
Next-generic-example deals with exemplary data [18] recorded by LUMINEOS Seismic Network in the Legnica-Głogów Copper District in Poland. The open-access data were downloaded from and are available on the EPISODES platform [19] within LGCD episode [20]. The data set contains 6095 items, with 32 different values of magnitude, ranging from M0.9 to M4.0, rounded to the nearest 0.1. The catalogue completeness threshold-i.e., the minimal value of magnitude above which 100% of the earthquakes are detected-is estimated to M1.7. This value is relatively high as for the local seismic network, and it reflects the influence of high noise levels generated by industrial activity nearby. We emphasize that the modeling data below the catalog completeness threshold do not reflect the physical properties of the system. However, we also model these data to show the scope of the model's capabilities.
The notion of Random Domino Automaton assumes that the sizes of avalanches and clusters correspond to energy, thus we convert magnitudes M to energies E using the formula [24] E = C · 10 1.5M , where C is a constant, and its value depends on units of energy. Assigning a value of energy to a single occupied cell is arbitrary, so we use this freedom and set C = 0.05. This choice gives for the lowest recorded magnitude value M = 0.9 the value of E = 1.1, for the catalog completeness threshold M = 1.7 it gives E = 17.7, and E = 5 · 10 4 for the highest magnitude value M = 4.0. We interpret a magnitude M as cumulative value for the interval (M − 0.05, M + 0.05), so the maximal value of energy is 59,425. On the other side, with the resulting low resolution for the smallest energies-which are also well bellow the catalog completeness threshold-the first two values corresponding to M0.9 and M1.0 we treat as a contributing to a single record. Next, we take these 31 values as a basis for piece-wise linear interpolation at [1,59,425], and we treat the obtained spline function as frequency distribution. It is also possible to choose another interpolation, however for our purpose to show possibility of reproducing variability of energy-frequency distribution, this is not of the primary importance. The piece-wise interpolation contains artificially created abrupt changes in the decrease rate, which is rather more difficult to reproduce, then only smooth changes.
Then, we fit inverse power distribution to the spline starting with value i = 42, which is equivalent to magnitude M1.95, and even slightly more than the provided catalog completeness threshold M1.7. We use the procedure of [25], to fit a truncated power-law, by logarithmic binning, maximizing the highest-likelihood estimator with data values parametrized along, to fit the distribution at the theoretical centers of weight of 18 intervals with the downhill simplex numerical method (see [25] for details). We obtain the following inverse-power distribution: Next, we add an exponential tail g(i) = γe −δ(i−a) starting from a = 59,426. We require f (a) = g(a) and f (a) = g (a) in order to have a smooth transition in the neighborhood of a. This condition allows us to calculate γ and δ in terms of α and λ, so g(i) = f (a) · e −λ i−a a . Thus, we obtain augmented avalanche distribution w i , consisting of the spline and the exponential tail. All the data and results of the subsequent steps described above are presented in Figure 2. The average size of the avalanche is < i w >= 215. Then, we follow the procedure described in Section 4.2. The value of I is set to 1/107, we assumeĉ = 1. The density is ρ ≈ 0.9891589, andx 0 = 0.6718844,x 1 = 0.674922, and x 2 = 0.662538. The obtained distributions ofn i andμ i , being a solution of the inverse problem, are presented on the left part of Figure 3. It is clear from the picture, that the "jump" of the distribution of avalanches-i.e., a large difference between the values for successive points i = 59,425 and i = 59,426-reflects in the respective "jump" of the distribution of rebound parameters µ i , and the variability of the distribution of clusters remains small.
Next, we calculate the distribution of avalanches w i starting from the obtained distribution of rebounds parameters µ i , i.e., we solve the direct problem. Comparison of the initial and the calculated distributions of avalanches w i is presented in Figure 3 on the right side. These two distributions overlap, confirming the ability of the Random Domino Automaton model to reproduce exact shape of relatively variable distribution of seismic data from LGCD episode.

Discussion and Conclusions
The article develops the notion of Random Domino Automaton [16] and presentsanticipated earlier in [23]-a solution of the inverse problem for this model. To that aim, we introduced normalized variables (17) and solved the problem algebraically. Then, we applied the procedure to exemplary data [18] recorded in the Legnica-Głogów Copper District in Poland, in order to demonstrate the efficiency of the approach.
The chosen exemplary data are quite challenging, because of the lowered number of records for small magnitudes due to a relatively high catalog completeness threshold and substantial variability for larger magnitudes, due to smaller number of entries. Nevertheless Random Domino Automaton proved to be able to reproduce these variable data very accurately.
We also propose a procedure to deal with a problem of data cut-off. Having recorded data for phenomena following fat-tail distributions, such as, for example, earthquakes, one might expect the appearance of events bigger then those already recorded. From the other side, it is clear that inverse-power distribution can not be valid up to infinity. Thus, we balance these two points by addition of a tail, which smoothly changes from inverse-power to exponential decay. The next step for extending the RDA model is to introduce aftershocks following the Omori law into the system in order to investigate the waiting time distribution (compare [9]). From the other side, the form of the distribution [8] has been connected with the timing of stress-redistribution events and it leads to so-called anomalous diffusion-see [26,27] and references therein. Extending the RDA model, we aim at providing a mechanistic model connecting the above-mentioned ideas. The solution of the inverse problem for RDA presented in this article is a necessary step for consideration of localization-specific deviations from Gutenberg-Richter law.
Finally, let us note that Random Domino Automaton, in a specific setting, can lead to Motzkin numbers recurrence [28], and give rise to similar construction for more widely known Catalan numbers recurrence [29]. Moreover, in spite of its relatively simple evolution rules, the RDA exhibit complex behavior for which a spontaneous migration between two SOC-like states was discovered [30].

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Derivation of the System Equations
Here, we present a complete derivation of the equations of one-dimensional Random Domino Automaton in a general case for arbitrary rebound parameters.
In order to make a description short, we use a symbol c i → c j to refer to a transition of a cell of one type i to another type j, and a symbol µ i → c 0 to refer to a transition of an occupied cell to a creating cell, since after an avalanche occupied cells become creating cells only. Occupation of an empty cell of type i is denoted by c i → µ j .

Appendix A.1. Balance Equation for ρ
In a single time step, the number of occupied cells may stay unchanged (when the new ball is reflected) or may increase by one (when a chosen empty cell becomes occupied) or decrease by i (when tan avalanche of size i is triggered). Probability rates for specific types of empty cells (i.e., creating, enlarging, merging) to be occupied are The probability of relaxation of a cluster of size i is µ i (in i /N). Clusters of arbitrary size i contribute, thus the expected value of the change in the number of occupied cells is Hence, the balance equation for density ρ is This equation includes no approximations, thus it is exact.

Appendix A.2. Balance Equation for n
The number of clusters may increase (by 1 in a single time step) only if a new cluster is created, i.e., for c 0 → µ i . The number of clusters may decrease (by 1) by the joining of two clusters (i.e., c 2 → µ i ), and by triggering an avalanche (of arbitrary size), which happens with the probability Hence, the equation is This equation is exact, too.
Appendix A.3. The Balance Equations for x i 's In the derivation below it is necessary to take into account the sizes of adjacent clusters and empty clusters, which in our framework are not traced. Thus, from this place, we assume there is no correlation between these sizes. This simplification is a version of the mean field approximation.
The equation for the number of creating cells x 0 . Four transitions contribute here: avalanche µ i → c 0 , a removal of the adjacent cluster c 1 → c 0 , occupation of the empty cell, c 0 → µ i and occupation of a neighboring empty cell c 0 → c 1 . The expected values (which are the same as probabilities in this case) are where c 01 is unknown value of probability of occupation of a neighbor of the empty cell of c 0 type. It is a weighted (somehow) average of c 0 and c 1 . Note, for c 1 → c 0 , a ratio n i /n gives the probability that the neighboring cluster is of size i, and iµ i /N is the probability, which it is removed from the system. Thus, the balance equation for the number of creating cells x 0 is (c 0 + 2c 01 )x 0 = ∑ i≥1 µ i n i i 2 + x 1 n ∑ i≥1 µ i n i i.
The equation for the number of enlarging cells x 1 . The following transitions contribute: c 0 → c 1 , c 2 → c 1 , c 1 → µ, c 1 → c 0 , and c 1 → c 2 , and because the following rates are where c 12 is an unknown value of probability of occupation of a neighbor of the empty cell of c 1 type. Again, it is a weighted average of c 0 and c 1 . The equation is The equation for the number of merging cells x 2 . The following transitions contribute: c 1 → c 2 , c 2 → µ, and c 2 → c 1 , and the equation is Remark A1. Not all of the equations derived above are independent. As one may expect, because of the relation (3), a combination of Equations (A3)-(A5) gives the Equation (A1) for the density ρ.
Remark A2. Because of the relation (7), a combination of Equations (A3)-(A5) must be consistent with the Equation (A2) for the total number of clusters n .
For the approximation that there are no correlations between empty cells, both c 01 and c 12 are assumed as c 01 = c 12 = 2c 0 x 0 + c 1 x 1 2x 0 + x 1 . (A7) It follows from the argument that any given empty cell of c 0 type is a neighbor for two empty cells (thus contributes twice), while any given empty cell of c 1 type is a neighbor for only one empty cell. For the Formula (A7), Equation (A6) is satisfied, and a combination of Equations (A4) and (A5) gives Equation (A2) for the total number of clusters n.
Appendix A.4. Equations for the Distribution of Clusters n i The creation of 1-cluster happens with the probability of transition c 0 → µ i . The creation of i-cluster by enlarging of (i − 1)-cluster for i ≥ 2 happens with the probability The merging of two clusters of size k ∈ 1, 2, . . . , (i − 2), and of size (i − k − 1), for i ≥ 3, separated by a merging cell happens with probability The removing of i-cluster happens with the probability µ i in i N .