Exact Percolation Probability on the Square Lattice

We present an algorithm to compute the exact probability $R_{n}(p)$ for a site percolation cluster to span an $n\times n$ square lattice at occupancy $p$. The algorithm has time and space complexity $O(\lambda^n)$ with $\lambda \approx 2.6$. It allows us to compute $R_{n}(p)$ up to $n=24$. We use the data to compute estimates for the percolation threshold $p_c$ that are several orders of magnitude more precise than estimates based on Monte-Carlo simulations.


Introduction
Most of the work in percolation is intimately connected with computer simulations. Over the years, various algorithms have been developed [1]. Think of the Hoshen-Kopelman algorithm to identify clusters [2], the cluster growth algorithms [3,4] or the cluster hull generating algorithms [5,6,7], to name but a few. Another example is the invasion percolation algorithm, which works well independently of the spatial dimension [8] and even on hyperbolic lattices with their "infinite" dimension [9].
The microcanonical union-find algorithm by Newman and Ziff [10,11] was a major improvement for Monte Carlo simulations. It is not only exceedingly efficient, but also simple and beautiful.
In this contribution we introduce an algorithm for the exact enumeration of percolating configurations in 2d lattices. We will discuss this algorithm for site percolation on the square lattice, but the method can easily be adapted to other 2d lattices.
Our motivation to devise such an algorithm and to run it is twofold. First, we think that it is always good to have exact solutions, even for small systems. And second, one can use the enumeration results to compute estimates for the percolation threshold $p_c$ that are more accurate than anything achievable with Monte Carlo simulations.
The latter is a recent trend in percolation. Exact estimators for small systems combined with careful extrapolations have led to estimates for $p_c$ with unprecedented accuracy for many different lattices, see [12] and references therein.
The situation for site percolation is depicted in Table 1. TM refers to transfer matrix methods, as a label for enumerative, exact solutions of small systems combined with careful extrapolations. MC stands for Monte Carlo results. As you can see, in 2008 TM took over MC. We will also use TM and add two more values to this table.

Table 1. Some published estimates of the percolation threshold for site percolation on the square lattice. TM refers to transfer matrix algorithms, MC to Monte Carlo methods. The earliest entries are 0.569(13) (MC, 78 × 78, 1963 [13]) and 0.590(10) (Series, 1964 [14]); the most recent entry is from 2015 [24]. See [18] and [21] for more extensive lists, or consult the always up-to-date Wikipedia entry on percolation thresholds, maintained by Bob Ziff.
The paper is organized in three sections plus conclusions and appendix. The first section defines the task and introduces the generating function. The second section comprises a detailed description of the algorithm, followed by a brief discussion of the results. In the third section we describe the computation of high precision values for the percolation threshold $p_c$. Sections two and three can be read independently. The appendix explains some of the combinatorics behind the algorithm.

The Number of Percolating Configurations
Our focus is on site percolation on the $n \times m$ square lattice, where percolation refers to spanning the $m$ direction. In particular we want to compute the number $A_{n,m}(k)$ of spanning configurations with $k$ sites occupied. These numbers are conveniently organized in terms of the generating function
$$F_{n,m}(z) = \sum_{k} A_{n,m}(k)\, z^k\,.$$
The spanning probability at occupancy $p$ is given by
$$R_{n,m}(p) = \sum_{k} A_{n,m}(k)\, p^k (1-p)^{nm-k}\,.$$
The numbers $A_{n,m}(k)$ can be computed by enumerating all $2^{nm}$ configurations. Enumerating only the hulls of the spanning clusters [6] reduces this complexity to $O(1.7^{nm})$, but even with this speedup, the maximum feasible system size is $8 \times 8$ [25].
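For very small lattices the coefficients $A_{n,m}(k)$ can be obtained by exactly this brute-force enumeration, which makes a useful independent check. The sketch below is our own illustration (not the paper's algorithm): it enumerates all site subsets and detects spanning with a flood fill.

```python
from itertools import combinations

def spanning_counts(n, m):
    """Count A_{n,m}(k): configurations of k occupied sites on an
    n-wide, m-deep square lattice that contain a cluster connecting
    row 0 to row m-1 (spanning the m direction)."""
    sites = [(r, c) for r in range(m) for c in range(n)]
    counts = {}
    for k in range(1, n * m + 1):
        for occ in combinations(sites, k):
            occ_set = set(occ)
            # flood fill from all occupied sites in the top row
            stack = [s for s in occ_set if s[0] == 0]
            seen = set(stack)
            while stack:
                r, c = stack.pop()
                for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if nb in occ_set and nb not in seen:
                        seen.add(nb)
                        stack.append(nb)
            if any(r == m - 1 for r, _ in seen):
                counts[k] = counts.get(k, 0) + 1
    return counts
```

For the $2 \times 2$ lattice this yields $F_{2,2}(z) = 2z^2 + 4z^3 + z^4$, and it is only feasible for lattices far below the $8 \times 8$ limit mentioned above.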
Obviously $A_{n,m}(k) = 0$ for $k < m$. For $k = m$, the only spanning clusters are straight lines of length $m$, hence $A_{n,m}(m) = n$.
For $k = m + 1$ the spanning cluster is either a straight line with one extra site which does not contribute to spanning, or it is a line with a kink. For $k = m + 2$ a spanning configuration is either a straight line with two free sites, a line with one kink and one free site, or a line with two small or one large kink. The formulae for $k = m + 3, m + 4, \ldots$ can be derived using similar considerations, but they quickly get complicated and are not very illuminating. And of course we are here to have the computer do this work.
In the high density regime we focus on the number of empty sites. Obviously we need at least $n$ empty sites to prevent spanning, and $n$ empty sites can block spanning only if they form a path that a king can take from the left column of an $n \times m$ chess board to the right column in $n - 1$ moves. Let $I_m(n)$ denote the number of such paths. This number of royal paths would be $m\,3^{n-1}$ if the king were allowed to step on rows above and below the board. If he cannot leave the board, the number of paths is given by a more intricate formula [26]. On a quadratic board, however, the number of paths can be expressed by the Motzkin numbers $M_k$ (A.3). Equations (3) and (4) can be used as sanity checks for the algorithm. Another sanity check is provided by a surprising parity property [27].
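The royal path count can be computed with a short dynamic program that propagates the number of ways column by column, clipping at the board edges; dropping the clipping reproduces the unrestricted count $m\,3^{n-1}$. This is our own illustration of the counting, with $I_m(n)$ written as `king_paths(n, m)`:

```python
def king_paths(n, m):
    """Number of paths a king can take from the left column to the
    right column of an n-column, m-row board in n - 1 moves: each
    move advances one column and shifts the row by -1, 0 or +1,
    never leaving the m rows of the board."""
    ways = [1] * m                      # one start per row of the first column
    for _ in range(n - 1):              # one king move per column
        ways = [sum(ways[max(0, r - 1):r + 2]) for r in range(m)]
    return sum(ways)
```

For example, on a 2 × 2 board there are 4 such paths, compared with $2 \cdot 3^1 = 6$ for the unbounded king.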

Algorithm
If you want to devise an efficient algorithm for a computational problem, you would be well advised to consult one of the approved design principles of the algorithms craft [28]. In our case the algorithmic paradigm of choice is dynamic programming: identify a collection of subproblems and tackle them one by one, smallest first, using the solutions of the small problems to help figure out larger ones, until the original problem is solved.

Dynamic Programming and Transfer Matrix
In the present case, a natural set of subproblems is the computation of $F_{n,m}(z)$ for a given occupation pattern in the $m$th row. For $n = 2$, for example, either the left site, the right site or both sites are occupied; an all empty row cannot occur in spanning configurations. The $m$th row can only represent a spanning configuration if the $(m-1)$th row carries a compatible pattern, and complementing the equations for the three width-two patterns, we can write this as a matrix-vector product. For general values of $n$ we can write
$$F^{\sigma}_{n,m}(z) = \sum_{\sigma'} T_{\sigma\sigma'}(z)\, F^{\sigma'}_{n,m-1}(z)\,, \qquad (8)$$
where $\sigma$ and $\sigma'$ denote possible configurations that can occur in a row of width $n$, and $T_{\sigma\sigma'}(z)$ vanishes unless $\sigma$ is compatible with $\sigma'$. Each $\sigma$ in row $m$ represents all compatible configurations in the $n \times m$ lattice "above" it. Appropriately we refer to $\sigma$ as the signature of all these configurations. The idea of the algorithm is to compute $F^{\sigma}_{n,1}(z)$ for all signatures $\sigma$ and then use (8) to iteratively compute $F^{\sigma}_{n,2}(z)$, $F^{\sigma}_{n,3}(z)$, etc., thereby reducing the two-dimensional problem of complexity $O(2^{nm})$ to a sequence of $m$ one-dimensional problems of complexity $O(\lambda^n)$ each.
This particular form of a dynamic programming algorithm is known as the transfer matrix method. When transfer matrices were introduced into statistical physics by Kramers and Wannier in 1941 [29], they were meant as a tool for analytical rather than numerical computations. Their most famous application was Onsager's solution of the two-dimensional Ising model in 1944 [30]. With the advent of computers, transfer matrices were quickly adopted for numerical computations, and already in 1980 they were considered "an old tool in statistical mechanics" [31]. Yet the idea to use transfer matrices to compute spanning probabilities in percolation emerged only recently, and not in statistical physics but in theoretical computer science. For their analysis of a two-player board game, Yang, Zhou and Li [23] developed a dynamic programming algorithm to compute numerically exact values of the spanning probability for a given value of the occupation probability $p$. We build on their ideas for the algorithm presented here.

Signatures
Signatures are the essential ingredients of the algorithm. How do we code them? And what is their number? Both questions can be answered by relating signatures to balanced strings of parentheses.
Figure 1. Coding table of cluster sites in a signature as parenthesis expressions (top) and example (bottom).
Occupied sites in a signature can belong to the same cluster, either because they are neighbors within the signature or because they are connected through a path in the lattice beyond the signature. In any case the set of signature sites that belong to the same cluster has a unique left to right order. This allows us to encode cluster sites using left and right parentheses (Figure 1). We represent the leftmost site of a cluster by two left parentheses ((, the rightmost site by two right parentheses )), and any intermediate cluster site by )(. An occupied site that is not connected to any other site within the signature is represented by ().
The resulting string is a balanced parentheses expression (with some white space from unoccupied sites), i.e., the total number of left parentheses equals the total number of right parentheses, and, counting from left to right, the number of right parentheses never exceeds the number of left parentheses. The latter is due to the topological constraints in two dimensions (two clusters cannot cross). Identifying all sites of a cluster in a signature means matching parentheses in a balanced string of parentheses, a task that can obviously be accomplished in time $O(n)$.
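The matching works with a stack, exactly as for ordinary parentheses. The sketch below is our own illustration (not the paper's implementation), using `None` for an empty site and the four symbols defined above:

```python
def match_clusters(signature):
    """Group the occupied sites of a signature into clusters by
    matching parentheses in O(n).  `signature` holds one symbol per
    column: '((', '))', ')(' (cluster sites), '()' (isolated site),
    or None (empty site).  Returns the clusters as lists of columns."""
    clusters, stack = [], []
    for col, sym in enumerate(signature):
        if sym is None:
            continue
        if sym == '()':                # isolated occupied site
            clusters.append([col])
        elif sym == '((':              # leftmost site: open a new cluster
            stack.append([col])
        elif sym == ')(':              # intermediate site belongs to the
            stack[-1].append(col)      # innermost open cluster (no crossings)
        elif sym == '))':              # rightmost site: close the cluster
            stack[-1].append(col)
            clusters.append(stack.pop())
    return clusters
```

Because clusters cannot cross, an intermediate site )( always belongs to the innermost open cluster, which is why a single stack suffices.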
Since we take into account only configurations that contain at least one spanning cluster, we need to identify clusters that are connected to the top row. We could follow [23] and use brackets [[, ]], ][ and [] to represent those clusters in a signature, thereby maintaining the concept of balanced strings of parentheses. It is more efficient, however, to use only a single symbol like || for sites that are connected to the top. We can think of all || sites as being connected to each other by an all occupied zeroth row. Identifying the left- and rightmost || sites in a signature can again be done in time $O(n)$.
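With the empty site this gives six symbols per column, so a signature can be packed into a base-6 ("senary") integer, which is what the implementation later uses to order signatures. A minimal sketch (the particular symbol order is our own assumption):

```python
# Hypothetical symbol order; the paper only states that signatures
# are ordered by their senary (base-6) value.
SYMBOLS = [None, '((', '))', ')(', '()', '||']
CODE = {s: i for i, s in enumerate(SYMBOLS)}

def senary_key(signature):
    """Pack a signature (one of six symbols per column) into a single
    integer, suitable as an ordered-container key."""
    key = 0
    for sym in signature:
        key = 6 * key + CODE[sym]
    return key

def unpack(key, n):
    """Inverse of senary_key for a width-n signature."""
    sig = []
    for _ in range(n):
        key, d = divmod(key, 6)
        sig.append(SYMBOLS[d])
    return sig[::-1]
```

A width-$n$ signature thus fits into $\lceil n \log_2 6 \rceil$ bits, a compact representation for the container keys discussed in the Implementation section.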
The total number of signatures for spanning configurations depends on the width $n$ and the depth $m$. Some signatures can occur only in lattices of a minimum depth. The most extreme cases in this respect are nested arcs, represented by signatures which can only occur if $m \geq n/2$. If $m$ is larger than this, however, the total number of signatures depends only on $n$. In the Appendix we prove a closed formula for this number $S_n$ and derive its asymptotic growth.

From one dimension to zero dimensions
The central idea of the transfer matrix method (8) is to transform a two-dimensional problem into a sequence of one-dimensional problems, as a result of which the complexity reduces from $O(2^{nm})$ to $O(m 2^n S_n)$. This idea can be applied again. By building the new row site by site (Figure 2), we can subdivide the one-dimensional problem into a sequence of $n$ zero-dimensional problems. This reduces the complexity from $O(m 2^n S_n)$ to $O(nm S_n)$. In this approach, the new row is partially filled up to column $c$. The corresponding signatures contain a kink at column $c$. Each signature is mapped to two signatures, depending on whether the added site at column $c + 1$ is empty or occupied (Figure 2).
The number of signatures $S_{n,c}$ depends on $n$ and $c$. We can think of a partially filled row as a full row of width $n + 1$, where an extra site has been inserted between positions $c$ and $c + 1$, this extra site being empty in all signatures. This gives us the bounds $S_n \leq S_{n,c} \leq S_{n+1}$, where the first inequality reflects the fact that the site at position $c$ has only two neighbors (left and up), and therefore fewer restrictions on the "legal" values it can take. The actual value of $S_{n,c}$ depends on $c$, see Table 2.

Condensation
The number of signatures governs the complexity of the algorithm. Let $S_n^{\max} \equiv \max_c S_{n,c}$. Figure 4 shows that $S_n^{\max} \approx 0.350 \cdot 2.85^n$. In particular, $S_n^{\max}$ exceeds the "Giga" threshold ($10^9$) for $n > 20$. This is significant because we need to store pairs $(\sigma, F^{\sigma})$ for all signatures. For $S_n^{\max}$ above $10^9$, this means hundreds of GBytes of RAM. In this regime, space (memory) rather than time becomes the bottleneck of our algorithm.
In addition, the symmetry of the problem allows us to identify each signature of a full row (c = n) with its mirror image.
Let $\tilde S_{n,c}$ denote the number of signatures that have been condensed by these rules, and let $\tilde S_n^{\max} \equiv \max_c \tilde S_{n,c}$. Figure 4 shows that $\tilde S_n^{\max} \approx 0.196 \cdot 2.61^n$. The number of condensed signatures exceeds the "Giga" threshold only for $n > 23$.
Figure 5. Main loop of the algorithm:

    L_old := first row configurations
    L_new := empty list
    for r = 1, . . ., m do
        for c = 1, . . ., n do
            while L_old not empty do
                take (σ, F) from L_old
                (σ0, σ1) := extend(σ, c)
                insert (σ0, F) and (σ1, z·F) into L_new
            swap L_old and L_new

Implementation
The elementary step of the transfer matrix algorithm is the addition of a new lattice site in a partially filled row (Figure 3). This is implemented in a subroutine extend(σ, c) that takes a signature σ and a column c as input and returns a pair (σ0, σ1), where σ0 (σ1) follows from σ by adding an empty (occupied) site in column c. The rules for these computations are depicted in Table 3. The signatures σ0 and σ1 are of course condensed before they are returned. The subroutine extend is then called from within a loop over the rows and columns of the lattice (Figure 5).
The algorithm maintains two lists L_old and L_new of configurations, where a configuration consists of a signature σ and the corresponding generating function $F^\sigma$. L_old is initialized with the $2^n - 1$ first row configurations, in which each occupied site is connected to the top row by definition. L_new is initially empty.
A configuration $(\sigma, F^\sigma)$ is removed from L_old and (σ0, σ1) = extend(σ, c) is computed. Then the configurations $(\sigma_0, F^\sigma)$ and $(\sigma_1, zF^\sigma)$ are inserted into L_new. This is repeated until L_old is empty. At this point, L_old and L_new swap their names, and the process can be iterated to add the next site to the lattice.
If the lattice has reached its final size nm, the total generating function is the sum over all F σ for (σ, F σ ) ∈ L old .
The lists maintained by the algorithm contain up to $\tilde S_n^{\max}$ configurations, so an efficient data structure is mandatory. We take advantage of the fact that the signatures can be ordered according to their senary value. This allows us to use an ordered associative container like set or map from the standard C++ library, which guarantee logarithmic complexity $O(\log \tilde S_{n,c})$ for search, insert and delete operations [32].

Equipped with a compact representation for the signatures and an efficient data structure for the list of configurations, we still face the challenge of storing the large coefficients $A_{n,m}(k)$ of the generating functions $F^\sigma_{n,m}$. These numbers quickly get too large to fit within standard fixed width integers with 32, 64 or 128 bits (Figure 6).
One could use multiple precision libraries to deal with this problem, but it is much more efficient to stick with fixed width integers and use modular arithmetic: with $b$-bit integers, we compute $U^{(i)}_{n,m}(k) \equiv A_{n,m}(k) \bmod p_i$ for a set of prime moduli $p_i < 2^b$ such that $\prod_i p_i > \max_k A_{n,m}(k)$. We can then use the Chinese Remainder Theorem [32,33] to recover $A_{n,m}(k)$ from the $U^{(i)}_{n,m}(k)$. This way we trade space (bit size of numbers) for time (one run for each prime modulus).
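The reconstruction step can be sketched with an incremental (Garner-style) Chinese Remainder computation; the 16-bit primes in the example below are our own choice for illustration:

```python
def crt(residues, moduli):
    """Recover the unique x with 0 <= x < prod(moduli) from the
    residues x mod p_i, for pairwise coprime moduli p_i."""
    x, M = 0, 1
    for r, p in zip(residues, moduli):
        # solve x + M*t ≡ r (mod p) for t, using the modular inverse of M
        t = ((r - x) * pow(M, -1, p)) % p
        x += M * t
        M *= p
    return x
```

Each pass extends the partial solution modulo the product of the primes processed so far, so the coefficients $A_{n,m}(k)$ are rebuilt once all runs are available.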

Performance
Figure 7 shows the actual memory consumption to compute $F_{n,n}$ with $b$-bit integers. Note that data structures like red-black trees require extra data per stored element. Unfortunately the product of all primes below $2^8$ is not large enough to compute the coefficients of $F_{n,n}$ via modular arithmetic and the Chinese Remainder Theorem if $n \geq 19$. Hence 8-bit integers are essentially useless.
Extrapolation of the data in Figure 7 shows that the computation of $F_{n,n}$ for $n \leq 21$ can be done with 16-bit integers on machines with 256 GB of RAM. Larger lattices require 384 GB ($n = 22$) or 1 TB ($n = 23$).
There is an alternative approach to determine $F_{n,m}$ with less memory. Instead of enumerating the coefficients $A_{n,m}(k)$, one can compute $F_{n,m}(z)$ for $nm$ integer arguments $z$ (again modulo a sufficient number of primes) and then use Lagrange interpolation to recover $F_{n,m}$. This approach requires only a single $b$-bit integer to be stored with each signature. The price we pay is that we now need $nm$ computations, each consisting of several runs for a sufficient number of prime moduli. This was the approach we used to compute $F_{22,22}$ on 30 to 60 machines with 256 GB each.

Yet another method suggests itself if one is mainly interested in evaluating the spanning probability $R_{n,m}(p)$ for a given value of $p$. Here we need to store only a single real number with each signature, and there is no need for multiple runs with different prime moduli. Quadruple precision floating point variables with 128 bits (__float128 in gcc) allow us to evaluate $R_{n,m}(p)$ with more than 30 decimals precision. We used this approach to compute $R_{n,n}(p)$ for $n = 23, 24$, see Table 4.
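The interpolation step can be illustrated over the rationals (in the actual computation it would be carried out modulo the primes $p_i$; this exact-arithmetic version is just our own sketch):

```python
from fractions import Fraction

def interpolate_coeffs(values):
    """Recover the coefficients of a degree-d polynomial from its
    values at z = 0, 1, ..., d by Lagrange interpolation over the
    rationals.  Returns [a_0, a_1, ..., a_d]."""
    d = len(values) - 1
    coeffs = [Fraction(0)] * (d + 1)
    for i, yi in enumerate(values):
        # build the Lagrange basis polynomial l_i(z) = prod_{j != i} (z - j)/(i - j)
        basis = [Fraction(1)]          # coefficient list, basis[k] ~ z^k
        denom = Fraction(1)
        for j in range(d + 1):
            if j == i:
                continue
            basis = [Fraction(0)] + basis          # multiply by z ...
            for k in range(len(basis) - 1):
                basis[k] -= j * basis[k + 1]       # ... and subtract j
            denom *= i - j
        for k in range(len(basis)):
            coeffs[k] += yi * basis[k] / denom
    return coeffs
```

Since $F_{n,m}$ has degree $nm$ with integer coefficients, $nm + 1$ evaluations at integer arguments determine it uniquely; this is what makes the one-number-per-signature runs sufficient.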
Figure 8 shows the time to evaluate $R_{n,n}(p)$ for a given value of $p$ and with 30 decimals accuracy, measured on a laptop. The time complexity corresponds to the expected scaling.

Results
We have computed the generating functions $F_{n,m}(z)$ for $n \leq 22$ and all $m \leq n$. The data is available upon request. Here we will briefly discuss some properties of the coefficients $A_{n,m}(k)$. For simplicity we will confine ourselves to the quadratic case $m = n$.
$A_{n,n}(k)$ has a maximum $A_{\max}$ for $k = k_{\max}(n)$, and our data reveals how $k_{\max}$ and $A_{\max}$ scale with $n$; it is tempting to conjecture the corresponding asymptotic laws (Figure 6). The magnitude of $A_{\max}$ determines how many prime moduli of fixed width we need to compute $A_{n,n}(k)$ via the Chinese Remainder Theorem.
The width of the maximum at $k_{\max}$ scales like $n^{-1}$, as can be seen in the "data collapse" when plotting $A_{n,n}(k)/A_{\max}$ versus $n(k - k_{\max})$ (Figure 9).

Critical Density
The generating functions $F_{n,m}(z)$ can be analyzed in various ways. One can, for example, investigate their properties for arguments outside the "physical" interval $z \geq 0$. This is how the parity property (6) was discovered and then proven for a large, quite general class of lattices [27].
In this paper we use the generating function to compute finite size estimators for the critical density p c that we can then extrapolate to the infinite lattice.

Estimators
The properties of various finite-size threshold estimators for two-dimensional percolation have been discussed by Ziff and Newman [34]. Here we will focus on the estimators $p_{med}(n)$ and $p_{cell}(n)$ defined therein. The values of these estimators up to $n_{\max} = 24$ are shown in Table 4.
Both estimators are supposed to converge with the same leading exponent $n^{-1-1/\nu}$, with $\nu = 4/3$. This scaling can clearly be seen in the data (Figure 10). Any convex combination of finite size estimators is another finite size estimator. We can choose $\lambda$ in $\lambda p_{med}(n) + (1 - \lambda) p_{cell}(n)$ such that the leading order terms of $p_{med}(n)$ and $p_{cell}(n)$ cancel.

Table 4. Finite size estimators for $p_c$ (14). All decimals are exact.
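Spelled out, with $c_{med}$ and $c_{cell}$ denoting the (unknown) leading amplitudes, the cancellation condition reads (our own sketch of the standard argument):

```latex
\begin{aligned}
p_{\mathrm{med}}(n)  &\simeq p_c + c_{\mathrm{med}}\,  n^{-1-1/\nu}, \qquad
p_{\mathrm{cell}}(n) \simeq p_c + c_{\mathrm{cell}}\, n^{-1-1/\nu}, \\[2pt]
\lambda\, p_{\mathrm{med}}(n) + (1-\lambda)\, p_{\mathrm{cell}}(n)
  &\simeq p_c + \bigl[\lambda\, c_{\mathrm{med}} + (1-\lambda)\, c_{\mathrm{cell}}\bigr]\, n^{-1-1/\nu},
\end{aligned}
\qquad\Longrightarrow\qquad
\lambda = \frac{c_{\mathrm{cell}}}{c_{\mathrm{cell}} - c_{\mathrm{med}}}.
```

Only the ratio $c_{med}/c_{cell}$ enters $\lambda$, which is why the scaling argument below suffices to fix it.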
The precise values of $c_{med}$ and $c_{cell}$ are not known, but their ratio follows from scaling arguments [34]. Hence we expect that the combined estimator converges faster than both $p_{med}(n)$ and $p_{cell}(n)$. This is in fact the case, as can be seen in Figure 10. Figure 10 also depicts the estimator $p_{pol}(n)$, which is the root of a graph polynomial defined for percolation on the $n \times n$ torus [35,36,37]. Empirically, it converges like $\sim n^{-4}$ in leading order. Jacobsen [24] computed this estimator for $n \leq 21$ and extrapolated the results to obtain $p_c = 0.592\,746\,050\,792\,10(2)$. (19)
This is the most precise value to date. We have used it as the reference $p_c$ in Figure 10 and we will continue to use it as reference in the rest of the paper.

Exponents
Extrapolation methods are commonly based on the assumption of a power law scaling with exponents $0 > \Delta_1 > \Delta_2 > \ldots$. These exponents can be determined by the following procedure. We start by considering the truncated form $p(n) = p_c + A_1 n^{\Delta_1}$. From each triple of consecutive data points $p(n)$, $p(n-1)$ and $p(n-2)$ we can compute the three unknowns $p_c$, $A_1$ and $\Delta_1$. This yields sequences $A_1(n)$ and $\Delta_1(n)$ that we can use to compute $\Delta_1 = \lim_{n\to\infty} \Delta_1(n)$ and $A_1 \equiv \lim_{n\to\infty} A_1(n)$. Figure 11 shows $\Delta_1(n)$ for $p(n) = p_{med}(n)$ plotted versus $n^{-1}$, along with a 4th-order polynomial fit. The result $\Delta_1 = -\frac{7}{4}$ is of course expected. It is stable under variations of the order of the fit and the number of low order data points excluded from the fit.
We then subtract the sequence $A_1 n^{\Delta_1}$ from $p(n)$ to eliminate the leading scaling term $n^{\Delta_1}$. Considering again a truncated form $p(n) - A_1 n^{\Delta_1} = A_2 n^{\Delta_2}$, we can get sequences $A_2(n)$ and $\Delta_2(n)$ from three consecutive terms of this sequence. And again we extrapolate these sequences by fitting a low order polynomial in $n^{-1}$, thereby obtaining $\Delta_2$ and $A_2$.
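The three-unknown solve from one triple can be reduced to a one-dimensional root search, because the ratio of successive differences depends on $\Delta_1$ alone. The paper does not specify the numerical method; the sketch below is our own, using bisection:

```python
def fit_triple(p, n):
    """From three consecutive estimates p(n), p(n-1), p(n-2) of the
    truncated form p(k) = pc + A * k**Delta, recover (pc, A, Delta).
    The difference ratio depends on Delta alone, so Delta is found
    by bisection; A and pc then follow directly."""
    r_obs = (p(n) - p(n - 1)) / (p(n - 1) - p(n - 2))

    def ratio(d):
        return (n**d - (n - 1)**d) / ((n - 1)**d - (n - 2)**d)

    lo, hi = -10.0, -1e-6              # search window for Delta < 0
    for _ in range(200):               # bisection on ratio(d) = r_obs
        mid = 0.5 * (lo + hi)
        if (ratio(mid) - r_obs) * (ratio(lo) - r_obs) <= 0:
            hi = mid
        else:
            lo = mid
    delta = 0.5 * (lo + hi)
    A = (p(n) - p(n - 1)) / (n**delta - (n - 1)**delta)
    pc = p(n) - A * n**delta
    return pc, A, delta
```

Applied to data that follows the truncated form exactly, the procedure recovers $p_c$, $A_1$ and $\Delta_1$ to machine precision; on real data the recovered values form the sequences $\Delta_1(n)$, $A_1(n)$ discussed above.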
Obviously, this can be iterated to determine the exponents $\Delta_k$ one by one, although the stability of the procedure deteriorates to some extent with each iteration. The results for $p_{cell}$ are similar. These results provide compelling evidence that $\Delta_k = -\frac{4k+3}{4}$. These are the exponents that we will use in our extrapolations.
Applying the same procedure, Jacobsen [24] determined the scaling exponents for $p_{pol}(n)$. For the combined estimator $\lambda p_{med}(n) + (1-\lambda) p_{cell}(n)$, we could not reliably compute exponents $\Delta_k$. It can already be sensed from the curvature in the logarithmic plot (Figure 10) that this estimator is not governed by a set of well separated exponents. This is the reason why we will not use it for extrapolation in this contribution, despite its fast convergence.
The best results are usually obtained with "balanced" rational approximants, $N = \lceil (n-1)/2 \rceil$ and $M = \lfloor (n-1)/2 \rfloor$. This is what we will use for our extrapolations.
The BST method discussed in [38] uses exponents $w_k = k\omega$, where the parameter $\omega$ is chosen to match the leading order of the finite size corrections. Here we choose $w_k = -\Delta_k = \frac{4k+3}{4}$. Having fixed the exponents and the degrees of numerator and denominator, we still have a free parameter: the number of data points to be included in the extrapolation. Since we are interested in the asymptotics $n \to \infty$ we will of course use the data of the large systems, but we may vary the size $n_0$ of the smallest system to be included. We expect that the extrapolation gets more accurate with increasing $n_0$, but only up to a certain point, after which the quality deteriorates because we have too few data points left. This intuition is confirmed by the data (Figure 12).
The accuracy of an extrapolation based on system sizes $n_0, \ldots, n_{\max}$ is gauged by the difference between extrapolations of the data points for $n_0 + 1, \ldots, n_{\max}$ and $n_0, \ldots, n_{\max} - 1$. These are the error bars shown in Figure 12.
It seems reasonable to focus on the regime of $n_0$ where the results form a plateau with small error bars, and to take the mean of these values as the final estimate of $p_c$. We use the values for $n_0 = 14, 15$ ($p_{med}$) and $n_0 = 15, 16$ ($p_{cell}$), see Table 5. The first value deviates from the reference value (19) by 2 error bars, the second by 2.5 error bars. The reference value (19) was obtained with a different extrapolation approach. To see how consistent our method is, we applied it also to the estimator $p_{pol}(n)$ from Jacobsen [24, Table 2]. Here we used the exponents (23) in the BST extrapolation.
Figure 13 shows that the error bars are smaller than the error bars on $p_{med}$ and $p_{cell}$ (note the scaling of the y-axis). The estimator $p_{pol}$ is also more robust under variation of $n_0$. Taking the average of the estimates for $n_0 = 1, \ldots, 10$ gives $p_c = 0.592\,746\,050\,792\,0(4)$ ($p_{pol}$).
This value agrees with Jacobsen's value (19) within the error bars. We believe that the superiority of the estimator $p_{pol}$ is due to the better separation of scales: $\Delta_{k+1} - \Delta_k = -1$ for $p_{med}$, $p_{cell}$ versus $\Delta_{k+1} - \Delta_k = -2$ for $p_{pol}$.

Conclusions
We have introduced a transfer matrix algorithm to compute the generating function $F_{n,m}(z)$ for the number of spanning configurations in an $n \times m$ square lattice. Adapting this algorithm to other 2d lattices and other boundary conditions (cylindrical, helical) is straightforward. The exponential number of signatures imposes a hard limit on system sizes. With the current algorithm, computing $F_{25,25}(z)$ would require a computer with 4 TB of memory and a couple of months of CPU time. Evaluating $R_{25,25}(p)$ for a single real argument $p$ requires 512 GB of memory and would take about one month.
We have also seen another example where the exact solution of small systems is sufficient to compute estimates for the critical density $p_c$ with an accuracy that no Monte Carlo method could ever achieve. Or, as Bob Ziff put it, "Monte-Carlo is dead" [40] when it comes to computing $p_c$ for 2d systems.
This approach can be advanced, not so much by solving larger systems, but by a better understanding of finite size estimators and their scaling.

Appendix

This is only an upper bound, because it does not take into account that h-clusters not connected to the top row cannot be connected to each other across a top row h-cluster. In the example shown in Figure A1, the number of signatures compatible with the top row h-cluster is not $C_6 = 132$ but $C_3 C_1 C_2 = 10$. We need to consider the compositions that $w$ top row h-clusters can impose upon the $r - w$ other h-clusters. The composition of an integer $n$ into $k$ parts is defined as an ordered arrangement of $k$ non-negative integers which sum to $n$ [45]. The compositions of 4 into 2 parts are (0, 4), (1, 3), (2, 2), (3, 1) and (4, 0). The resulting sum can be evaluated using Catalan's convolution formula [46] for $1 \leq k \leq n$.

Figure 2. Increasing the system size by adding a new site in the (c + 1)th column of the partially filled row. The state of the new site may affect the cluster structure of the whole lattice, but we only need to update the signature, i.e., the cluster structure as represented in the n sites colored gray.

Figure 7. Memory required to compute $F_{n,n}$ with $b$-bit integers.

Figure 8. Time to evaluate $R_{n,n}(p)$ with 30 decimals accuracy on a laptop.

Figure 9. "Data collapse" in the number of spanning configurations.

Figure 10. Finite size estimators for $p_c$ and their convergence.

Figure 13. Rational extrapolation of $p_{pol}$ vs. minimum system size $n_0$.

Figure A1. The dashed connection is forbidden: h-clusters cannot connect across h-clusters that are connected to the top row.

The sum over $w$ can be done using binomial coefficient identities [47, Table 174]. The result reads $C_{r+1,r-1}$ (A.7).

Figure 3. Signature of the cluster configuration of Figure 2 (top), and signatures after adding a new site that is either empty (middle) or occupied (bottom).

Table 2. Values of $S_{n,c}$.
Figure 4. Number of signatures $S_{n,c}$ (plain) and $\tilde S_{n,c}$ (condensed). The lines are $S_{n,c} \approx 0.350 \cdot 2.85^n$ and $\tilde S_{n,c} \approx 0.196 \cdot 2.61^n$.

Table 3. Extension rules of the transfer matrix algorithm. Each cell shows the resulting configuration if the added site is empty (top) or occupied (bottom). Some update rules are non-local: (a) a || may only be deleted if there is at least one other || elsewhere in the signature; (b) deleting (( or )) means shortening the corresponding cluster; (c) the whole cluster has to be connected to the top; (d) two clusters merge. Configurations marked in the table must never occur.

Figure 12. Rational extrapolation result vs. minimum system size $n_0$.

Table 5. Extrapolations for different minimum system sizes $n_0$. Starred values are averaged to yield the final estimates (25) for $p_c$.