Length partition of random multicurves on large genus hyperbolic surfaces

We study the length statistics of the components of a random multicurve on a surface of genus $g \geq 2$. For each fixed genus, the existence of such statistics follows from the work of M.~Mirzakhani, F.~Arana-Herrera and M.~Liu. We prove that as the genus $g$ tends to infinity the statistics converge in law to the Poisson--Dirichlet distribution of parameter $\theta=1/2$. In particular, as the genus tends to infinity the mean lengths of the three longest components converge respectively to $75.8\%$, $17.1\%$ and $4.9\%$ of the total length.

1. Introduction

1.1. Length statistics of random multicurves in large genus. Let $X$ be a closed Riemann surface of genus $g \geq 2$ endowed with its conformal hyperbolic metric of constant curvature $-1$. A simple closed curve on $X$ is a connected closed curve on $X$ that is not homotopic to a point and has no self-intersection. In the free homotopy class of a simple closed curve $\gamma$, there exists a unique geodesic representative with respect to $X$. We denote by $\ell_X(\gamma)$ the length of this geodesic representative.
A multicurve on $X$ is a multiset of disjoint simple closed curves on $X$. Given a multicurve $\gamma$, a component of $\gamma$ is a maximal family of freely homotopic curves in $\gamma$. The cardinality of a component is called its multiplicity, and the length of a component is the sum of the lengths of the simple closed curves belonging to it (equivalently, its multiplicity multiplied by the length of any simple closed curve in the component). A multicurve is called primitive if all its components have multiplicity one. We denote by $\ell^{\downarrow}_X(\gamma)$ the vector of the lengths of the components sorted in decreasing order, by $\mathbf{mult}(\gamma)$ the multiset of the multiplicities of the components of $\gamma$, and by $\operatorname{mult}(\gamma)$ the maximum of $\mathbf{mult}(\gamma)$. Neither $\mathbf{mult}(\gamma)$ nor $\operatorname{mult}(\gamma)$ depends on the hyperbolic structure $X$. We define $\ell_X(\gamma)$ as the sum of the entries of $\ell^{\downarrow}_X(\gamma)$ and the normalized length vector to be $\hat{\ell}^{\downarrow}_X(\gamma) := \ell^{\downarrow}_X(\gamma) / \ell_X(\gamma)$.
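As an illustration of these definitions, here is a minimal Python sketch that computes the sorted length vector, the total length and the normalized length vector of a multicurve. The input format (a list of pairs made of the length of one simple closed curve and the multiplicity of its component) and the function name are hypothetical choices made for this example only.

```python
from fractions import Fraction

def normalized_length_vector(components):
    """components: list of (simple_curve_length, multiplicity), one pair per component.

    Returns (normalized sorted length vector, total length).
    """
    # Length of a component = multiplicity * length of one of its simple closed curves.
    lengths = sorted((mult * length for length, mult in components), reverse=True)
    total = sum(lengths)
    return [x / total for x in lengths], total

# Hypothetical multicurve with three components (exact arithmetic via Fraction).
components = [(Fraction(3), 2), (Fraction(1), 1), (Fraction(5, 2), 1)]
vector, total = normalized_length_vector(components)
is_primitive = all(mult == 1 for _, mult in components)
print(vector, total, is_primitive)
```

The normalized vector always sums to 1, so after padding with zeros it can be viewed as a point of the infinite simplex introduced next.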
We denote by $\mathcal{ML}_X(\mathbb{Z})$ the set of homotopy classes of multicurves on $X$. This notation is explained by the fact that multicurves are the integral points of the space of measured laminations, usually denoted $\mathcal{ML}_X$.
In order to make sense of convergence, we need all normalized vectors to belong to the same space. For an integer $k \geq 1$ and a real number $r > 0$ let us define $\Delta^k_{\leq r} := \{(x_1, \ldots, x_k) \in [0, \infty)^k : x_1 + \cdots + x_k \leq r\}$. Let us also define $\Delta^\infty_{\leq r} := \{(x_1, x_2, \ldots) \in [0, \infty)^{\mathbb{N}} : x_1 + x_2 + \cdots \leq r\}$. For $k \leq k'$ we have an injection $\Delta^k_{\leq r} \to \Delta^{k'}_{\leq r}$ obtained by completing vectors with zeros. The infinite simplex $\Delta^\infty_{\leq r}$ is the inductive limit of these injections and we always identify $\Delta^k_{\leq r}$ with a subspace of $\Delta^\infty_{\leq r}$. In particular each vector $\hat{\ell}^{\downarrow}_X(\gamma)$ is naturally an element of $\Delta^\infty_{\leq 1}$ by completing its coordinates with infinitely many zeros. As our aim is to study convergence of random infinite vectors, let us mention that $\Delta^\infty_{\leq 1}$ is a closed subset of $[0, 1]^{\mathbb{N}}$ endowed with the product topology. This topology coincides with the topology of the inductive limit. When we consider convergence in distribution on $\Delta^\infty_{\leq 1}$ we mean convergence in the space of Borel probability measures on $\Delta^\infty_{\leq 1}$, which is a compact set.

The following result is a consequence of the works [A-H] and [Liu].
We actually prove a more precise version of the above statement, Theorem 12, in which the law of $L^{(g,m)\downarrow}$ is made explicit. Note that the limit depends only on the genus of $X$ and not on its hyperbolic metric.
The Poisson-Dirichlet distribution is a probability measure on $\Delta^\infty_{\leq 1}$. The simplest way to introduce it is via the stick-breaking process. Let $U_1, U_2, \ldots$ be i.i.d. random variables with law $\mathrm{Beta}(1, \theta)$ (i.e. they are supported on $]0, 1]$ with density $\theta(1 - x)^{\theta-1}$). Define the vector $V := (U_1, (1-U_1)U_2, (1-U_1)(1-U_2)U_3, \ldots)$. Informally, the components of $V$ are obtained by starting from a stick of length 1 identified with $[0, 1]$. At the first stage, $U_1$ determines where we break off the first piece and we are left with a stick of size $1 - U_1$. We then repeat the process ad infinitum. The law of $V$ is the Griffiths-Engen-McCloskey distribution of parameter $\theta$, which we denote $\mathrm{GEM}(\theta)$. The Poisson-Dirichlet distribution of parameter $\theta$, denoted $\mathrm{PD}(\theta)$, is the distribution of $V^{\downarrow}$, the vector $V$ whose entries are sorted in decreasing order. For more details, we refer the reader to Section 5.2. The distribution $\mathrm{PD}(1)$ is the limit distribution of the normalized sorted cycle lengths of uniform random permutations. The distribution $\mathrm{PD}(\theta)$ appears when considering the Ewens distribution with parameter $\theta$ on the symmetric group. See Section 1.2.2 below for a more detailed discussion on permutations.
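The stick-breaking description lends itself to simulation. The following Python sketch samples approximately from GEM(θ) and PD(θ) and estimates the mean of the largest entries by Monte Carlo. It is only an illustration: the function names, the truncation threshold `tol` and the cap `max_pieces` are assumptions of this example, and the stick is truncated once the remaining piece is negligible.

```python
import random

def sample_gem(theta, rng, tol=1e-12, max_pieces=10_000):
    """Truncated stick-breaking sample of GEM(theta) (tol and max_pieces are illustrative)."""
    pieces = []
    remaining = 1.0  # length of the stick that is still unbroken
    while remaining > tol and len(pieces) < max_pieces:
        u = rng.betavariate(1.0, theta)  # U_i ~ Beta(1, theta)
        pieces.append(remaining * u)     # piece broken off at this stage
        remaining *= 1.0 - u             # what is left of the stick
    return pieces

def sample_pd(theta, rng):
    """Approximate sample of PD(theta): a GEM(theta) sample sorted in decreasing order."""
    return sorted(sample_gem(theta, rng), reverse=True)

def mean_largest(theta, n_samples, k=3, seed=0):
    """Monte Carlo estimate of the means of the k largest entries of PD(theta)."""
    rng = random.Random(seed)
    totals = [0.0] * k
    for _ in range(n_samples):
        v = sample_pd(theta, rng)
        for i in range(min(k, len(v))):
            totals[i] += v[i]
    return [t / n_samples for t in totals]

print(mean_largest(0.5, 2000))
```

For θ = 1/2 the estimates should be close to the proportions 75.8%, 17.1% and 4.9% quoted in the abstract, up to Monte Carlo error.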
Our main result is the following.
The most interesting cases of this convergence are $m = 1$ (primitive multicurves) and $m = +\infty$ (all multicurves). Let us emphasize that $L^{(g,1)\downarrow}$ and $L^{(g,+\infty)\downarrow}$ converge to the same limit as $g \to \infty$.
All marginals of the Poisson-Dirichlet law can be computed, see for example [ABT03, Section 4.11]. The formulas can be turned into a computer program and values were tabulated in [Gri79, Gri88]. For $\theta = 1/2$ the expected values of the three largest entries are approximately $0.758$, $0.171$ and $0.049$.

1.2. Further remarks.
1.2.1. Square-tiled surfaces. In this section we give an alternative statement of Theorem 2 in terms of square-tiled surfaces. The correspondence between statistics of multicurves and statistics of square-tiled surfaces is developed in [DGZZ21] and [A-H20b], to which we refer the reader.
A square-tiled surface is a connected surface obtained by gluing finitely many unit squares $[0, 1] \times [0, 1]$ along their edges by translations $z \mapsto z + u$ or "half-translations" $z \mapsto -z + u$. Combinatorially, one can label the squares from 1 to $N$, and then a square-tiled surface is encoded by two involutions without fixed points $(\sigma, \tau)$ of $\{\pm 1, \pm 2, \ldots, \pm N\}$. More precisely, $\sigma$ encodes the horizontal gluings: $+i$ and $-i$ are respectively the right and left sides of the $i$-th square. The orbits of $\sigma$ with different signs are glued by translations and the ones with the same sign are glued by half-translations. Similarly, $\tau$ encodes the vertical gluings: $+i$ and $-i$ are respectively the top and bottom sides of the $i$-th square. The labelling is irrelevant in our definition and two pairs $(\sigma, \tau)$ and $(\sigma', \tau')$ encode the same square-tiled surface if there exists a permutation $\alpha$ of $\{\pm 1, \pm 2, \ldots, \pm N\}$ so that α(

A square-tiled surface comes with a conformal structure and a quadratic form coming from the conformal structure of the unit square and the quadratic form $dz^2$ (both are preserved by translations and half-translations). This quadratic form might have simple poles and we denote by $\mathcal{Q}_g(\mathbb{Z})$ the set of holomorphic square-tiled surfaces of genus $g$.
A square-tiled surface comes equipped with a filling pair of multicurves $(\gamma_h, \gamma_v)$ coming respectively from the gluings of the horizontal segments $[0, 1] \times \{1/2\}$ and the vertical segments $\{1/2\} \times [0, 1]$ of each square. Conversely, the dual graph of a filling pair of multicurves in a surface of genus $g$ defines a square-tiled surface in $\mathcal{Q}_g(\mathbb{Z})$. Our notation comes from the fact that holomorphic square-tiled surfaces can be seen as integral points in the moduli space of quadratic differentials $\mathcal{Q}_g$. A component of the multicurve $\gamma_h$ corresponds geometrically to a horizontal cylinder. For a square-tiled surface $M$ we denote by $A^{\downarrow}(M)$ the normalized vector of areas of these horizontal cylinders sorted in decreasing order and by $\operatorname{height}(M)$ the maximum of their heights. Here, as in the introduction, normalized means that we divide by the sum of the entries of the vector, which coincides with $\operatorname{area}(M)$. The following is a particular case of [DGZZ21, Theorem 1.29] using the explicit formulas for $L^{(g,m)}$ given in Theorem 12.
An important difference to notice between Theorem 1 and Theorem 3 is that in the former the (hyperbolic) metric X is fixed and we sum over the multicurves γ while in the latter we sum over the discrete set of holomorphic square-tiled surfaces M .
Using Theorem 3, our Theorem 2 admits the following reformulation.
Corollary 4. The vector of normalized areas of horizontal cylinders of a random square-tiled surface of genus g converges in distribution to PD(1/2) as g tends to ∞.
1.2.2. Permutations and multicurves. Given a permutation $\sigma$ in $S_n$ we denote by $K_n(\sigma)$ the number of orbits it has on $\{1, 2, \ldots, n\}$, or equivalently the number of cycles in its disjoint cycle decomposition. The Ewens measure with parameter $\theta$ on $S_n$ is the probability measure defined by $\mathbb{P}_{n,\theta}(\sigma) := \theta^{K_n(\sigma)} / Z_{n,\theta}$, where $Z_{n,\theta} := \theta(\theta+1)\cdots(\theta+n-1)$. Then under $\mathbb{P}_{n,\theta}$, as $n \to \infty$ we have that
• the random variable $K_n$ behaves as a Poisson distribution $\mathrm{Poi}(\theta \log(n))$ (e.g. by means of a local limit theorem),
• the normalized sorted vector of cycle lengths of $\sigma$ tends to $\mathrm{PD}(\theta)$,
• the number of cycles of length $k$ of $\sigma$ converges to $\mathrm{Poi}(\theta/k)$.
See for example [ABT03].
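The cycle structure of an Ewens permutation can be simulated without building the permutation itself. The sketch below uses the Chinese restaurant process, a classical sampling scheme that is not taken from the text and is named here as a substitute technique: element number i+1 opens a new cycle with probability θ/(θ+i) and otherwise joins the cycle of a uniformly chosen earlier element. The function names are hypothetical.

```python
import random

def ewens_cycle_lengths(n, theta, rng):
    """Cycle lengths of a permutation of n elements under the Ewens measure,
    sampled with the Chinese restaurant process (an assumption of this sketch)."""
    cycle_of = []  # cycle_of[i] = index of the cycle containing element i
    sizes = []     # sizes[c] = current length of cycle c
    for i in range(n):
        if rng.random() < theta / (theta + i):
            cycle_of.append(len(sizes))   # open a new cycle
            sizes.append(1)
        else:
            # A uniformly chosen earlier element selects a cycle
            # with probability proportional to its current length.
            c = cycle_of[rng.randrange(i)]
            cycle_of.append(c)
            sizes[c] += 1
    return sizes

def expected_cycle_count(n, theta):
    """Exact mean of K_n under the Ewens measure: sum of theta / (theta + i)."""
    return sum(theta / (theta + i) for i in range(n))

rng = random.Random(0)
sizes = ewens_cycle_lengths(1000, 0.5, rng)
normalized = sorted((s / 1000 for s in sizes), reverse=True)
print(len(sizes), normalized[:3])
```

The exact mean grows like θ log(n), in line with the Poisson behaviour of the number of cycles, and the normalized sorted cycle lengths give an empirical approximation of PD(θ).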
By analogy let us denote by $K^{(g,m)}$ the number of non-zero components of $L^{(g,m)}$. In [DGZZ], it is proven that $K^{(g,m)}$ behaves as a Poisson distribution with parameter $\log(g)/2$ (by means of a local limit theorem) independently of $m$. In other words, it behaves as the number of cycles $K_g(\sigma)$ of a random permutation $\sigma$ under $\mathbb{P}_{g,1/2}$.
Our Theorem 2 provides another connection between $L^{(g,m)}$ and $\mathbb{P}_{g,1/2}$. Namely, $L^{(g,m)\downarrow}$ is asymptotically close to the normalized sorted vector of the cycle lengths of $\sigma$ under $\mathbb{P}_{g,1/2}$.
Finally, let us mention that components of $L^{(g,m)}$ of order $o(1)$ are invisible in the convergence towards $\mathrm{PD}(1/2)$. It is a consequence of Theorem 2 that the macroscopic components, of the order of a constant, carry the total mass. Building on the intuition that in the large genus asymptotic regime random multicurves on a surface $X$ of genus $g$ behave like the cycles of a random permutation in the symmetric group $S_g$, one should expect a Poisson limit for components of order $g^{-1}$ and no component of order $g^{-1-\epsilon}$. In a work in progress, we confirm this intuition. However, because lengths are continuous parameters, the limit is a continuous Poisson process and not a discrete one supported on $\mathbb{N}$ as in the permutation case.

1.3. Proof overview and structure of the paper. The first step of the proof consists in writing an explicit expression for the random variable $L^{(g,m)\downarrow}$ that appears in Theorem 1, see Theorem 12 in Section 3. The formula follows from the work of M. Mirzakhani on pants decompositions [Mir08] and the results of F. Arana-Herrera [A-H] and M. Liu [Liu] on the length distribution for each fixed topological type of multicurves. The expression of $L^{(g,m)\downarrow}$ can be seen as a refinement of the formula for the Masur-Veech volume of the moduli space of quadratic differentials from [DGZZ21].
The formula for $L^{(g,m)\downarrow}$ involves a super-exponential number of terms in $g$ (one term for each topological type of multicurve on a surface of genus $g$). However, in the large genus limit only $O(\log(g))$ terms contribute. This allows us to consider a simpler random variable $\widetilde{L}^{(g,m,\kappa)\downarrow}$ which, asymptotically, coincides with $L^{(g,m)\downarrow}$; see Theorem 17 in Section 4. This reduction is very similar to the one used for the large genus asymptotics of Masur-Veech volumes in [Agg21] and [DGZZ].
The core of our proof consists in proving the convergence of moments of the simpler variable $\widetilde{L}^{(g,m,\kappa)\downarrow}$. We do not use $\widetilde{L}^{(g,m,\kappa)\downarrow}$ directly but rather its size-biased version $\widetilde{L}^{(g,m,\kappa)*}$. The definition of size bias and the link with the Poisson-Dirichlet distribution are explained in Section 5. In Section 6, we show that the moments of $\widetilde{L}^{(g,m,\kappa)*}$ converge to the moments of $\mathrm{GEM}(1/2)$, which is the size-biased version of the Poisson-Dirichlet process $\mathrm{PD}(1/2)$, see Theorem 23.
1.4. Acknowledgement. We warmly thank Anton Zorich, who encouraged us to join forces and combine knowledge from [Liu] and [DGZZ] to study the length statistics of random multicurves. The second author would like to thank Grégoire Sergeant-Perthuis and Maud Szusterman for helpful conversations about probability theory.
The work of the first named author is partially supported by the ANR-19-CE40-0003 grant.

2. Background material
In this section we introduce notation and state results from the literature that are used in our proof.
2.1. Multicurves and stable graphs. Recall from the introduction that a multicurve on a hyperbolic surface $X$ of genus $g$ is a finite multiset of free homotopy classes of disjoint simple closed curves. We denote by $\mathcal{ML}_X(\mathbb{Z})$ the set of multicurves on $X$. The homotopy classes that appear in a multicurve $\gamma$ are called components. There are at most $3g - 3$ of them. The multiplicity of a component is the number of times it is repeated in $\gamma$, and $\gamma$ is primitive if all the multiplicities are 1.
Let us also recall our notation:
• $\ell_X(\gamma)$: total length of $\gamma$,
• $\ell^{\downarrow}_X(\gamma)$: length vector of the components of $\gamma$,
• $\operatorname{mult}(\gamma)$: maximum multiplicity of a component of $\gamma$,
• $\mathbf{mult}(\gamma)$: multiset of multiplicities of the components of $\gamma$.

The mapping class group $\mathrm{Mod}(X)$ of $X$ acts on multicurves. We call the topological type of a multicurve its equivalence class under the $\mathrm{Mod}(X)$-action. For each fixed genus $g$, there are finitely many topological types of primitive multicurves and countably many topological types of multicurves. They are conveniently encoded by stable graphs and weighted stable graphs respectively, which we define next. Informally, given a multicurve $\gamma$ with components $\gamma_1, \ldots, \gamma_k$ and multiplicities $m_1, \ldots, m_k$, we build a dual graph $\Gamma$ as follows:
• we add a vertex for each connected component of the complement $X \setminus (\gamma_1 \cup \cdots \cup \gamma_k)$; the vertex $v$ carries an integer weight, the genus $g_v$ of the corresponding component,
• we add an edge for each component $\gamma_i$ of the multicurve between the two vertices corresponding to the connected components bounded by $\gamma_i$; this edge carries a weight $m_i$.

More formally, a stable graph $\Gamma$ is a 5-tuple $(V, H, \iota, \sigma, \{g_v\}_{v \in V})$ where is called an edge and we denote by $E(\Gamma)$ the set of edges, Given a stable graph $\Gamma$, its genus is An isomorphism between two stable graphs $\Gamma = (V, H, \iota, \sigma, g)$ and $\Gamma' = (V', H', \iota', \sigma', g')$ is a pair of bijections $\varphi : V \to V'$ and $\psi : H \to H'$ such that Note that $\psi$ determines $\varphi$ but it is convenient to record an automorphism as a pair $(\varphi, \psi)$. We denote by $\mathrm{Aut}(\Gamma)$ the set of automorphisms of $\Gamma$ and by $\mathcal{G}_g$ the finite set of isomorphism classes of stable graphs of genus $g$.
A weighted stable graph is a pair $(\Gamma, m)$ where $\Gamma$ is a stable graph and $m \in \mathbb{N}^{E(\Gamma)}$. An isomorphism between two weighted stable graphs $(\Gamma, m)$ and $(\Gamma', m')$ is an isomorphism $(\varphi, \psi)$ between $\Gamma$ and $\Gamma'$ such that for each edge $e$ of $\Gamma$ we have $m_e = m'_{\psi(e)}$ (where we use $\psi(e)$ to denote $\{\psi(h), \psi(h')\}$ for the edge $e = \{h, h'\} \subset H$). We denote by $\mathrm{Aut}(\Gamma, m)$ the set of automorphisms of the weighted graph $(\Gamma, m)$. There is a one-to-one correspondence between topological types of multicurves and weighted stable graphs. Primitive multicurves correspond to the case where all edges carry weight 1.
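To make the combinatorics concrete, here is a small Python sketch of a weighted stable graph stored as a list of vertex genera and a list of weighted edges, rather than as the half-edge 5-tuple used above. Two conventions are assumed in this example and should be checked against the formal definition: the genus of a connected graph is |E| − |V| + 1 plus the sum of the vertex genera, and stability means 2g_v − 2 + n_v > 0 at every vertex, where n_v is the valence (a loop counts twice). The class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class WeightedStableGraph:
    vertex_genus: list  # vertex_genus[v] is the genus g_v decorating vertex v
    edges: list         # triples (v, w, weight); v == w encodes a loop

    def valence(self, v):
        # A loop based at v contributes 2 to the valence.
        return sum((a == v) + (b == v) for a, b, _ in self.edges)

    def is_connected(self):
        seen, stack = {0}, [0]
        while stack:
            v = stack.pop()
            for a, b, _ in self.edges:
                for x, y in ((a, b), (b, a)):
                    if x == v and y not in seen:
                        seen.add(y)
                        stack.append(y)
        return len(seen) == len(self.vertex_genus)

    def genus(self):
        # Assumed convention: first Betti number plus the sum of vertex genera.
        return len(self.edges) - len(self.vertex_genus) + 1 + sum(self.vertex_genus)

    def is_stable(self):
        # Assumed stability condition: 2 g_v - 2 + n_v > 0 at every vertex.
        return self.is_connected() and all(
            2 * g - 2 + self.valence(v) > 0 for v, g in enumerate(self.vertex_genus)
        )

    def is_primitive(self):
        # Primitive multicurves correspond to all edge weights equal to 1.
        return all(w == 1 for _, _, w in self.edges)

def loop_graph(g, k, weights=None):
    """One vertex of genus g - k carrying k loops (dual to a non-separating multicurve)."""
    weights = weights or [1] * k
    return WeightedStableGraph([g - k], [(0, 0, w) for w in weights])

print(loop_graph(5, 3).genus(), loop_graph(5, 3).is_stable())
```

The one-vertex graphs built by `loop_graph` are the ones that matter in the large genus reduction, where only non-separating multicurves contribute asymptotically.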
2.2. ψ-classes and Kontsevich polynomial. The formula for the random variable $L^{(g,m)\downarrow}$ that appears in Theorem 1 involves intersection numbers of ψ-classes, which we introduce now. These rational numbers are famously related to the Witten conjecture [Wit91] proven by Kontsevich [Kon92].
Let $\overline{\mathcal{M}}_{g,n}$ denote the Deligne-Mumford compactification of the moduli space of smooth complex curves of genus $g$ with $n$ marked points. There exist $n$ so-called We use the following standard notation All these intersection numbers are positive rational numbers and can be computed by recursive equations from $\langle \tau_0^3 \rangle_{0,3} = 1$ and $\langle \tau_1 \rangle_{1,1} = \frac{1}{24}$, see for example [ItzZub92]. For our purpose, it is convenient to consider the Kontsevich polynomial $V_{g,n} \in \mathbb{Q}[x_1, \ldots, x_n]$ that gathers the intersection numbers into a symmetric polynomial in $n$ variables. More precisely, For later use we gather the list of small Kontsevich polynomials below

2.3. Random multicurves. M. Mirzakhani proved the polynomial growth of the number of multicurves on a hyperbolic surface with respect to their length. This result and some extensions of it are nicely presented in the book of V. Erlandsson and J. Souto [ES].
Let $X$ be a hyperbolic surface of genus $g$. We define

Theorem 5 ([Mir08, Theorem 1.1, 1.2 and 5.3]). Let $X$ be a hyperbolic surface. For any multicurve $\gamma \in \mathcal{ML}_X(\mathbb{Z})$ there exists a positive rational constant $c(\gamma)$ such that we have as $R \to \infty$, where $B(X)$ is the Thurston volume of the unit ball in the space of measured laminations $\mathcal{ML}_X$ with respect to the length function $\ell_X$, and

The above theorem allows us to make sense of the notion of a random multicurve. Namely, we endow the set of topological types of multicurves $\mathcal{ML}_X(\mathbb{Z}) / \mathrm{Mod}(X)$ with the probability measure which assigns $c(\gamma)/b_g$ to $[\gamma]$. We now provide the explicit expression for this probability. For $\Gamma \in \mathcal{G}_g$ a stable graph we define the polynomial $F_\Gamma$ in the variables $\{x_e\}_{e \in E(\Gamma)}$ by (2) where $x_v$ is the multiset of variables $x_e$ where $e$ is an edge adjacent to $v$ and $V_{g_v,n_v}$ are the Kontsevich polynomials defined in Section 2.2. In the case where $e$ is a loop based at $v$, the variable $x_e$ is repeated twice in $x_v$.
Remark 6. The polynomials $F_\Gamma$ first appeared in Mirzakhani's work [Mir08], see in particular Theorem 5.3. They were related to square-tiled surfaces and Masur-Veech volumes in [DGZZ21], though with a different normalization. Namely, the polynomial $P_\Gamma$ from [DGZZ21] is related to $F_\Gamma$ by

[Table: the list of topological types of primitive multicurves in genus 2, their associated stable graphs and their corresponding polynomials $F_\Gamma$. The labels on edges are used as variable indices in $F_\Gamma$.]
The normalization of F Γ is identical to the conventions used in [ABCDGLW] and simplifies the computations of the present article.
Following [DGZZ21], for a weighted stable graph $(\Gamma, m)$ and we denote by We derive the following directly from [DGZZ21]:

Theorem 7. Let $\gamma$ be a multicurve in genus $g$ and $(\Gamma, m)$ the dual weighted stable graph. Then

Remark 8. In Theorem 7 we fix a misconception in [DGZZ21] about automorphisms of multicurves (or equivalently of weighted stable graphs). Indeed, the way we defined automorphisms of stable graphs and weighted stable graphs in Section 2.1 ensures that the following formula is valid where the sums are taken over isomorphism classes of respectively stable graphs of genus $g$ and weighted stable graphs of genus $g$.
Proof. Up to the correction of Remark 8 this is exactly [DGZZ21, Theorem 1.22] (see Remark 6 for the difference between $P_\Gamma$ and $F_\Gamma$).
[Table: the list of topological types of primitive multicurves in genus 2 and the associated values $Y_m(\Gamma)$, which are proportional to $c(\gamma)$ (see Theorem 7).]
Then lim Note that b g = b g,+∞ .
Remark 10.We warn the reader that the constant denoted b g,m in this article has nothing to do with the analogue of b g in the context of surfaces of genus g with n boundaries which is denoted b g,n in [Mir] and [DGZZ21].
For $m \in \mathbb{N} \cup \{+\infty\}$ and a real number $\kappa > 1$ we also define As we have fewer terms in its definition, $\tilde{b}_{g,m,\kappa} \leq b_{g,m}$.
We will use the asymptotic results of [Agg21] and [DGZZ] in the following form.

3. Length vectors of random multicurves
The aim of this section is to state and prove a refinement of Theorem 1 that provides an explicit description of the random variable L (g,m) .For each weighted stable graph (Γ, m) we define a random variable U (Γ,m) .We then explain how L (g,m) is obtained from them.
Theorem 12. Let $L^{(g,m)}$ be the random variable on $\Delta^{3g-3}_{\leq 1}$ with density where $b_{g,m}$ is defined in (3) and $\mu_{\Gamma,m}$ is the measure on ∆ where $F_\Gamma$ is the polynomial defined in (2). Then where $s_X(R, m) := \#\{\gamma \in \mathcal{ML}_X(\mathbb{Z}) : \ell_X(\gamma) \leq R \text{ and } \operatorname{mult}(\gamma) \leq m\}$ is the number of multicurves on $X$ of length at most $R$ and multiplicity at most $m$, and $L^{(g,m)\downarrow}$ is the vector $L^{(g,m)}$ sorted in decreasing order.
The study of the length vector of multicurves of a given topological type was initiated by M. Mirzakhani in [Mir]. She studied the special case of a maximal multicurve, corresponding to a pants decomposition. The general case that we present now was proved independently in [A-H] and [Liu].
Theorem 13 ([A-H], [Liu]).Let X be a hyperbolic surface and γ a multicurve on X with k components.Let (Γ, m) be a weighted stable graph dual to γ.Let U (Γ,m) be the random variable on ∆ k =1 with density where E(Γ ) and V (Γ ) are the set of edges and the set of vertices of Γ , respectively.Then we have the convergence in distribution where U (Γ,m)↓ is the sorted version of U (Γ,m) and s X (R, γ) is defined in (1).
We endow $\Delta^k_{\leq r}$ with the restriction of the Lebesgue measure on $\mathbb{R}^k$, which we denote by $\lambda^k_{\leq r}$. We define the slice $\Delta^k_{=r} := \{(x_1, \ldots, x_k) \in [0, \infty)^k : x_1 + \cdots + x_k = r\}$, and similarly $\Delta^\infty_{=r}$. The set $\Delta^\infty_{\leq r}$ is closed for the product topology in $[0, r]^{\mathbb{N}}$ and hence compact. However $\Delta^\infty_{=r}$ is dense in $\Delta^\infty_{\leq r}$. For this reason, it is more convenient to work with measures on $\Delta^\infty_{\leq r}$ even though they are ultimately supported on $\Delta^\infty_{=r}$. On $\Delta^k_{=r}$, which is contained in a hyperplane of $\mathbb{R}^k$, we consider the Lebesgue measure induced by any choice of $k - 1$ coordinates among $x_1, \ldots, x_k$. The latter measure is well defined since the change of variables between different choices has determinant $\pm 1$. We start with an elementary integration lemma.
Here the factorial of a real number is to be understood by means of the analytic continuation given by the gamma function: $x! = \Gamma(x + 1)$.
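The following Python sketch illustrates the gamma-function convention on the classical Dirichlet integral over the simplex x_1 + … + x_k = 1, whose value is the product of the α_i! divided by (α_1 + … + α_k + k − 1)!. It is an assumption of this example that the integration lemma is of this classical type; the closed form is compared with a midpoint quadrature in the case k = 2, where it reduces to the beta integral. All function names are hypothetical.

```python
import math

def factorial_real(x):
    """Factorial extended to real arguments: x! = Gamma(x + 1)."""
    return math.gamma(x + 1.0)

def dirichlet_integral(alphas):
    """Classical closed form of the integral of x_1^a_1 ... x_k^a_k over the
    simplex x_1 + ... + x_k = 1 (Lebesgue measure in any k - 1 coordinates)."""
    k = len(alphas)
    numerator = math.prod(factorial_real(a) for a in alphas)
    return numerator / factorial_real(sum(alphas) + k - 1)

def beta_midpoint(a, b, n=200_000):
    """Midpoint-rule approximation of the integral of x^a (1 - x)^b over [0, 1]."""
    h = 1.0 / n
    return h * sum(((i + 0.5) * h) ** a * (1.0 - (i + 0.5) * h) ** b for i in range(n))

# Integer exponents, then real exponents handled through the gamma function.
print(dirichlet_integral([2, 3]), beta_midpoint(2, 3))
print(dirichlet_integral([0.5, 1.5]), beta_midpoint(0.5, 1.5))
```

The second comparison uses non-integer exponents, which is exactly the situation where the factorials have to be read through the gamma function.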
Remark 15.Using Lemma 14, let us check that (5) and (8) are indeed densities of probability measures.From the second equation in the statement of Lemma 14 it follows that the total mass of (6 Indeed, each monomial that appears in F Γ has k variables and total degree 6g−6−k. Hence the denominator coming from the formula of Lemma 14 compensates the (6g − 7)! term from (6).The numerator in the formula of Lemma 14 matches the definition of Y m .
Proof of Lemma 14.For x > 0 real and α, β ≥ 1 integral, we have the following scaling of the beta function (9) This implies that The two equations in the statement then follow by induction.
Proof of Theorem 12. We just have to gather the contributions of each multicurve coming from Theorem 13 of F. Arana-Herrera and M. Liu. By Theorem 5 of M. Mirzakhani, for any multicurve $\gamma \in \mathcal{ML}_X(\mathbb{Z})$, its asymptotic density in $\mathcal{ML}_X(\mathbb{Z})$ is $c(\gamma)/b_g$. Now Theorem 7 provides the values of $c(\gamma)$ and $b_{g,m}$ in terms of the stable graph polynomials $F_\Gamma$.

4. Reduction in the asymptotic regime
The random variable L (g,m) appearing in Theorem 12 is delicate to study because it involves a huge number of terms.Using Theorem 11 from [Agg21] and [DGZZ] we show that we can restrict to a sum involving only O(log(g)) terms associated to non-separating multicurves.
We denote by Γ g,k the stable graph of genus g with a vertex of genus g − k and k loops.To simplify the notation we fix a bijection between the edges of Γ g,k and {1, 2, . . ., k} so that F Γ g,k is a polynomial in Q[x 1 , . . ., x k ].Note that because the edges in Γ g,k are not distinguishable, the polynomial F Γ g,k is symmetric.
Using the same notation as in Theorem 12 we have the following result.

≤1
with density where bg,m,κ is defined in (4).Then for any function h ∈ L ∞ (∆ ∞ ≤1 ) we have Note that terms appearing in the sum (10) in Theorem 17 form a subset of the terms in the sum (5) in Theorem 12.
Proof. By Theorem 11, a random multicurve of high genus is asymptotically almost surely non-separating with fewer than $\kappa \log(6g-6)/2$ edges. As $h$ is bounded, we obtain the result.

5. Size-biased sampling and Poisson-Dirichlet distribution
5.1. Size-biased reordering. The components of a multicurve are not ordered in any natural way. In Theorem 1 we solve this issue by defining a symmetric random variable $L^{(g,m)}$ on $\Delta^{3g-3}_{\leq 1}$ and stating the convergence for $L^{(g,m)\downarrow}$, whose entries are sorted in decreasing order. In this section we introduce another natural way of ordering the entries: the size-biased ordering. Contrary to the symmetrization or the decreasing order, it is a random ordering. The size-biased ordering turns out to be convenient in the proof of Theorem 1.
The idea behind the size-biased reordering is to pick components with probability proportional to their values. One can define the random permutation $\sigma$ inductively as follows. If $x$ is the zero vector, then $x^* = x_\sigma$ where $\sigma$ is taken uniformly at random. Otherwise, we set $\sigma(1)$ according to $\mathbb{P}_x(\sigma(1) = i) = x_i / (x_1 + \cdots + x_k)$ and define a new vector $y = (x_1, \ldots, \widehat{x_{\sigma(1)}}, \ldots, x_k)$ in $\Delta^{k-1}_{\leq 1}$, which is the vector $x$ with the component $x_{\sigma(1)}$ removed. In order to keep track of the components we denote by $\varphi : \{1, 2, \ldots, k-1\} \to \{1, 2, \ldots, k\}$ the unique increasing injection whose image avoids $\sigma(1)$. In other words, $\varphi(i) = i$ if $i < \sigma(1)$ and $\varphi(i) = i + 1$ if $i \geq \sigma(1)$. Assuming by induction that $y$ has a size-biased reordering $\sigma_y$, we define the other values by $\sigma(i+1) := \varphi(\sigma_y(i))$ for $i \in \{1, 2, \ldots, k-1\}$. This defines inductively the size-biased reordering.
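A more direct definition can be given as follows. Given $r$ such that at least $r$ components of $x$ are positive, for $1 \leq i_1, \ldots, i_r \leq k$ distinct integers, we have (11) $\mathbb{P}_x(\sigma(1) = i_1, \ldots, \sigma(r) = i_r) = \frac{x_{i_1}}{s} \cdot \frac{x_{i_2}}{s - x_{i_1}} \cdots \frac{x_{i_r}}{s - x_{i_1} - \cdots - x_{i_{r-1}}}$ where $s = x_1 + \cdots + x_k$, and one can simplify the last terms in the numerator and denominator. Now let $X : \Omega \to \Delta^k_{\leq 1}$ be a random variable. In order to define its size-biased reordering $X^* : \Omega \to \Delta^k_{\leq 1}$, we consider for each $x \in \Delta^k_{\leq 1}$ independent random variables $\sigma_x$ distributed according to $\mathbb{P}_x$ as defined above, which are furthermore independent from $X$. We then define for each $\omega \in \Omega$, $X^*(\omega) := \sigma_{X(\omega)} \cdot X(\omega)$ where $\sigma \cdot x = (x_{\sigma(1)}, \ldots, x_{\sigma(k)})$.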
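The inductive definition translates directly into a sampling procedure. The Python sketch below draws the size-biased permutation of a finite non-negative vector by repeatedly picking one of the remaining entries with probability proportional to its value, and switches to a uniform choice once only zero entries remain. The function names are hypothetical and indices are 0-based, unlike in the text.

```python
import random

def size_biased_permutation(x, rng):
    """Return sigma (0-based indices) such that (x[sigma[0]], x[sigma[1]], ...) is
    a size-biased reordering of x."""
    remaining = list(range(len(x)))
    sigma = []
    while remaining:
        weights = [x[i] for i in remaining]
        if sum(weights) > 0:
            # Pick a remaining entry with probability proportional to its value.
            pick = rng.choices(range(len(remaining)), weights=weights)[0]
        else:
            # Only zero entries are left: order them uniformly at random.
            pick = rng.randrange(len(remaining))
        sigma.append(remaining.pop(pick))
    return sigma

def size_biased_reordering(x, rng):
    return [x[i] for i in size_biased_permutation(x, rng)]

rng = random.Random(0)
print(size_biased_reordering([0.5, 0.0, 0.3, 0.2], rng))
```

Removing the picked entry at each step plays the role of passing from x to the shorter vector y in the inductive definition, and zero entries can only appear after all positive ones.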
Lemma 18. Let $X$ be a random variable on $\Delta^k_{=1}$ with density $f_X : \Delta^k_{=1} \to \mathbb{R}$. Let $1 \leq r \leq k$. Then the $r$-th marginal of the size-biased reordering of $X$, that is to say the density of the vector $(X^*_1, \ldots, X^*_r)$, is

Proof. We first consider the case $r = k$. Since $X$ admits a density, almost surely all components are positive and distinct. Hence one can use (11) to write its density as In the above formula we used the fact that the sum of $X$ is $s = 1$ almost surely. Now, for $1 \leq r \leq k - 1$, the $r$-th marginal is obtained by integrating the free variables (12) We can decompose the integral (12) as a sum over these subsimplices.
Using the fact that g is symmetric, we can rewrite it by means of a change of variables on the standard simplex $\Delta^{k-r}$.
We finish this section by mentioning that the size-biased reordering extends to infinite vectors, that is, elements of $\Delta^\infty_{\leq 1}$.

5.2. Poisson-Dirichlet and GEM distributions.
Recall that the GEM(θ) distribution was defined in the introduction via the stick-breaking process.We also defined the PD(θ) as the sorted reordering of GEM(θ).The Poisson-Dirichlet distribution admits an intrinsic definition in terms of the Poisson process first introduced by Kingman [Kin75].We refer to [ABT03, Section 4.11] for this definition.Instead we concentrate on the simpler Griffiths-Engen-McCloskey distribution.
In the introduction we passed from GEM(θ) to PD(θ).The following result formalizes the equivalence between the two distributions.
We will use the above result in the following form.

Corollary 20 ([DJ89]). Let $X^{(n)}$ be a sequence of random variables on $\Delta^\infty_{=1}$. Let $\theta > 0$. Then the sorted sequence $X^{(n)\downarrow}$ converges in distribution to $\mathrm{PD}(\theta)$ if and only if the size-biased sequence $X^{(n)*}$ converges in distribution to $\mathrm{GEM}(\theta)$.
In order to prove convergence towards GEM we will need the explicit description of its marginals.
In order to simplify computations, we consider moments of the GEM distribution that cancel the denominator in the density (13). Namely, for a random variable $X = (X_1, X_2, \ldots)$ on $\Delta^\infty_{\leq 1}$ and $p = (p_1, \ldots, p_r)$ an $r$-tuple of non-negative integers we define These moments of $\mathrm{GEM}(\theta)$ are as follows.
Lemma 22.If X = (X 1 , X 2 , . . . ) ∼ GEM(θ) and (p 1 , . . ., p r ) is a non-negative integral vector, then the moment M p (X) defined in (14) has the following value Proof.By Proposition 21 we have The last term is an instance of Lemma 14 on the simplex ∆ r+1 =1 .Replacing the value obtained from the integration lemma gives the result.

6. Proof of the main theorem
The aim of this section is to prove the following result.

Theorem 23. For $g \geq 2$ integral, $m \in \mathbb{N} \cup \{+\infty\}$ and $\kappa > 1$ real, let $\widetilde{L}^{(g,m,\kappa)*}$ be the size-biased version of the random variable $\widetilde{L}^{(g,m,\kappa)}$ from Theorem 17. Then as $g$ tends to $\infty$, $\widetilde{L}^{(g,m,\kappa)*}$ converges in distribution to $\mathrm{GEM}(1/2)$.
Let us first show how to derive our main Theorem 2 from Theorem 23.
Proof of Theorem 2. By Theorem 17, the random variables $\widetilde{L}^{(g,m,\kappa)*}$ and $L^{(g,m)*}$ have the same limit distribution as $g \to +\infty$. Hence by Theorem 23, the random variable $L^{(g,m)*}$ converges in distribution towards $\mathrm{GEM}(1/2)$.
Finally, Corollary 20 shows that the convergence in distribution of $L^{(g,m)*}$ towards $\mathrm{GEM}(1/2)$ is equivalent to the convergence of $L^{(g,m)\downarrow}$ towards $\mathrm{PD}(1/2)$. This concludes the proof of Theorem 2.

6.1. Method of moments. Let us recall from Section 5.2, Equation (14), that we defined some specific moments $M_{(p_1,\ldots,p_r)}(X)$ for a random variable $X$ on $\Delta^\infty_{=1}$. In this section, we show that the convergence of a sequence of random variables $X^{(n)}$ is equivalent to the convergence of all the moments $M_p(X^{(n)})$. This strategy, called the method of moments, is a standard tool in probability, see for example [Bil95, Section 30] for the case of real variables.
Lemma 24.A sequence of random variables =1 if and only if for all p = (p 1 , . . ., p r ) vector of non-negative integers we have lim Proof.The infinite-dimensional cube [0, 1] N is compact with respect to the product topology by Tychonoff's theorem.The set ∆ ∞ ≤1 is a closed subset of [0, 1] N , and is therefore compact.The signed measures on ∆ ∞ ≤1 are identified with the dual of the real continuous function C(∆ ∞ ≤1 , R).In particular, we have the convergence of X (n) towards X (∞) in distribution if and only if for any continuous function , with r ≥ 0, p 1 , . . ., p r ≥ 0. We claim that the span of S (that is finite linear combinations of elements of S) is dense in C(∆ ∞ ≤1 , R).Indeed, S contains 1 and is stable under multiplication.Therefore, the algebra generated by S is equal to its span.Now, the set S is a separating subset of C(∆ ∞ ≤1 , R) and density follows from the Stone-Weierstrass theorem.
We will use the following asymptotic simplification of the moments.
Lemma 26.For each k = 1, . . ., 3g − 3, let U (g,m,k) be the random variable on where we use the notation F g,k for F Γ g,k .Then for any p = (p 1 , . . ., p r ) ∈ N k we have Note that Formula (15) is the density of a probability measure by Remark 15.It is more precisely the density of the asymptotic normalized vector of length of random multicurves restricted to multicurves of the type Γ g,k .
Proof. By definition of the stable graph polynomial we have Using the coefficients cg,k defined just above the statement of the lemma, we rewrite the polynomial $F_{g,k}$ as Hence the density of $U^{(g,m,k)}$ in (15) can be rewritten as Now, by Lemma 18, the $r$-th marginal of the size-biased version $U^{(g,m,k)*}$ of $U^{(g,m,k)}$ is In the above, we used the fact that the density of $U^{(g,m,k)}$ is a symmetric function.
Hence the sum over all permutations of $k$ elements only contributes a factor $k!$.
The value of the integral in the above sum follows from Lemma 14 and is equal to We end up with the following formula for the distribution of the $r$-th marginal of From the above formula and the definition of the moment $M_p$ in (14), the moment $M_p(U^{(g,m,k)*})$ equals Lemma 14 gives the value of the above integral Substituting the above value in our last expression for $M_p(U^{(g,m,k)*})$ gives the announced formula.
Proof of Theorem 25.Because the distribution of L(g,m,κ) is a weighted sum of distributions, we can perform the computation of the moments for each term in the sum and gather the result in the end.More precisely, we have where bg,m,κ was defined in (4).Now substituting the formula for M p (U (g,m,k) * ) from Lemma 26 and the asymptotic value of bg,m,κ from Theorem 11 in the sum (16), we have as g → ∞ the asymptotic equivalence where we have used that cg,k (j 1 , . . ., j k ) ∼ 1 uniformly in k ∈ [1, κ log(6g − 6)/2].
On the one hand, by [DGZZ,Equation (3.13)] (in the proof of Theorem 3.4) we have On the other hand Replacing ( 18) and ( 19) in ( 17) we obtain which is the announced formula.
6.2.Asymptotic expansion of a related sum.Let θ = (θ i ) i≥1 be a sequence of non-negative real numbers and let p = (p 1 , . . ., p r ) be a non-negative integral vector.This section is dedicated to the asymptotics in n of the numbers which should be reminiscent of the formula from Theorem 25.
Definition 27. Let $\theta = (\theta_j)_{j \geq 1}$ be non-negative real numbers and let $g_\theta(z)$ be the formal series $g_\theta(z) := \sum_{j \geq 1} \theta_j \frac{z^j}{j}$. We say that $\theta$ is admissible if
• $g_\theta(z)$ converges in the open disk $D(0, 1) \subset \mathbb{C}$ centered at 0 of radius 1,
• $g_\theta(z) + \log(1 - z)$ extends to a holomorphic function on $D(0, R)$ with $R > 1$.
The following is essentially [DGZZ, Lemma 3.8] that we reproduce for completeness.
Lemma 29.For m ∈ N ∪ {+∞}, let Then g m (z) is summable in D(0, 1) and g m (z)+log(1−z) extends to a holomorphic function on D(0, 4).In particular the sequence Proof.Since ζ m (2j) is bounded uniformly in j, the series converges in D(0, 1).Now, expanding the definition of the partial zeta function ζ m and changing the order of summation we have for z ∈ D(0, 1) and hence This completes the proof.
For a non-negative integer $p$ we define the differential operator $D_p$ on $\mathbb{C}[[z]]$ by $D_p(f) := z \frac{d^{p+1}}{dz^{p+1}}(z^p f)$. We start with some preliminary lemmas.
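To get a feeling for this operator, the Python sketch below applies it to a truncated power series given by its list of coefficients. It assumes the reading D_p(f) = z · d^{p+1}/dz^{p+1} (z^p f) of the definition; under this reading the monomial z^n is sent to (n+p)!/(n−1)! · z^n for n ≥ 1 and the constant term is killed. The closed form is checked against a direct computation that shifts, differentiates and shifts back. Function names are hypothetical.

```python
from math import factorial

def apply_D_closed_form(p, coeffs):
    """coeffs[n] is the coefficient of z^n; returns the coefficients of D_p(f)
    using z^n -> (n + p)! / (n - 1)! z^n for n >= 1 and 1 -> 0."""
    return [0 if n == 0 else c * (factorial(n + p) // factorial(n - 1))
            for n, c in enumerate(coeffs)]

def apply_D_direct(p, coeffs):
    """Same operator computed step by step on a non-empty coefficient list."""
    g = [0] * p + list(coeffs)                       # multiply by z^p
    for _ in range(p + 1):
        g = [n * c for n, c in enumerate(g)][1:]     # differentiate once
    return [0] + g                                   # multiply by z

series = [1, 2, 3, 4]  # 1 + 2 z + 3 z^2 + 4 z^3
print(apply_D_closed_form(2, series), apply_D_direct(2, series))
```

Because the operator acts diagonally on monomials, it only rescales Taylor coefficients, which is what makes coefficient extraction after applying it tractable.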
Lemma 31. Let $\theta = (\theta_i)_i$ and $g_\theta(z)$ be as in Theorem 28. Let $p = (p_1, \ldots, p_r)$ be a tuple of non-negative integers and let Then, for any $n \geq 0$ we have $[z^{2n}]\, G_{\theta,p}(z) = S_{\theta,p,n}$ where $[z^{2n}]$ is the coefficient extraction operator and $S_{\theta,p,n}$ is the sum in (20).

Proof. Let us first note that $\frac{1}{2} g_\theta(z^2) = \sum_i \theta_i \frac{z^{2i}}{2i}$. We aim to compute the expansion of $D_p\left(\frac{1}{2} g_\theta(z^2)\right)$. By linearity, it is enough to compute a single term and we have The lemma follows by expanding the exponential.
Lemma 32.For any p ≥ −1 we have The proof for − log(1 + z) is similar.
Proof of Theorem 28.By Lemma 31, the sum S θ,p,n is the coefficient in front of z 2n of G θ,p .By the conditions in the statement, g θ (z 2 ) = − log(1−z 2 )+β +r θ (z) where r θ (z) is holomorphic on D(0, √ R) and r θ (1) = 0. Using Lemma 32, we deduce that for any p ≥ 0 we have This completes the proof.
6.3. Truncation estimates. Recall that Theorem 25 provided an expression for the moment $M_p(\widetilde{L}^{(g,m,\kappa)*})$ which involves a sum that is a truncated version of $S_{\theta,p,n}$ from (20). In this section, we show that the difference between $S_{\theta,p,n}$ and its truncation is negligible compared to the asymptotics of Theorem 28.
Theorem 33. Let $\theta$ and $g_\theta(z)$ be as in Theorem 28. Then for any real $\kappa > 1$ we have as $n \to \infty$

Bounding the coefficients in a Taylor expansion is a standard tool in asymptotic analysis, such as the "Big-Oh transfer" [FS09, Theorem VI.3]. However, in our situation we need to bound the $n$-th Taylor coefficient of a function $f_n$ that depends on $n$. To do so, we track the dependencies on the function inside the transfer theorem.
Lemma 34 ([DGZZ, Lemma 4.4]).Let λ and x be positive real numbers.We have, Fix a real κ > 1, and non-negative integers p and q.For n ≥ 1 let Then, we have as n → ∞ The same estimate is valid for the integral along the other three segments λ , λ , and λ .For the two large demi-circles Σ + and Σ − , we have which decreases exponentially fast.
We conclude the proof by combining the above estimates.