A decorated tree approach to random permutations in substitution-closed classes

We establish a novel bijective encoding that represents permutations as forests of decorated (or enriched) trees. This allows us to prove local convergence of uniform random permutations from substitution-closed classes satisfying a criticality constraint. It also enables us to reprove and strengthen permuton limits for these classes in a new way, using a semi-local version of Aldous' skeleton decomposition for size-constrained Galton--Watson trees.


Uniform random permutations in classes: some background and overview of our results
We assume that the reader is familiar with the basic definitions of permutation patterns and permutation classes: patterns, occurrences and consecutive occurrences, classes, bases, etc. If needed, these definitions are given at the end of the introduction.
Permutation classes are classically studied from an enumerative point of view, i.e., one wants to compute the number of permutations of any fixed size in a given class or the generating function of the class (possibly refining according to some statistics). In recent years, there has also been an increasing interest in the behaviour of a large typical permutation taken in a given permutation class. We refer for example to [13,24,25,26,32,33,43,44] for results on random τ-avoiding permutations with τ of size 3. Other specific classes (or sets of permutations) have been studied: permutations avoiding a
• As a first application we give a new proof of the main scaling limit result of [7,9] by using an extension of Aldous' skeleton decomposition (see [6] for Aldous' original statement, and Theorem 4.2 for our extension). This new proof works under weaker conditions and makes transparent the connection to random trees which was suggested, but unclear, in [7] (see in particular Remark 1.11 or the beginning of Section 1.7 there). In particular, our proof yields a probabilistic interpretation of the conditions under which this scaling limit result holds (see Section 1.6).
• Our second main contribution is a novel quenched local limit for random permutations from substitution-closed classes. Here we use fringe subtree count asymptotics and the skeleton decomposition to describe a concentration phenomenon for consecutive patterns. This notion of convergence has recently been introduced by Borga in [13], where such limits were proven for random permutations avoiding patterns of length 3.
The rest of the introduction defines substitution-closed classes and provides details on our results and on the approach used in this paper.

Substitution of permutations and closed classes
To define the substitution operation, it is convenient to think of permutations as diagrams. That is, if n denotes the size of a permutation ν, we may identify ν with the set of points (i, ν(i)) (for i in [n]). The substitution θ[ν (1) , ..., ν (d) ], where θ, ν (1) , . . . , ν (d) are permutations and d is the size of θ, is then obtained as follows. For each i, we first replace the point (i, θ(i)) with the diagram of ν (i) . Then rescaling the rows and columns yields the diagram of a bigger permutation, which is by definition θ[ν (1) , ..., ν (d) ]. A permutation of size greater than 2 is called simple if it cannot be obtained as the substitution of smaller permutations. An example of substitution is given in Fig. 1.
We focus on substitution-closed classes, i.e., classes C such that θ, ν (1) , ..., ν (d) ∈ C implies θ[ν (1) , ..., ν (d) ] ∈ C. Alternatively, a class is substitution-closed if and only if its basis (i.e., the avoided patterns defining the class) consists only of simple permutations. In particular, there are uncountably many substitution-closed permutation classes. Due to their nice combinatorial structure (see Section 2), substitution-closed permutation classes provide a natural general framework in which to investigate the properties of uniform random elements.
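The substitution operation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `substitute` and the encoding of permutations as tuples in one-line notation are assumptions made for the example, not notation from the paper.

```python
def substitute(theta, blocks):
    """Return theta[blocks[0], ..., blocks[d-1]] in one-line notation.

    Permutations are tuples of the integers 1..n (hypothetical encoding).
    """
    d = len(theta)
    assert len(blocks) == d, "one block per point of theta"
    sizes = [len(b) for b in blocks]
    # Block i is shifted upwards by the total size of the blocks that sit
    # below it, i.e. those inflating a point of theta with a smaller value.
    offset = [sum(sizes[j] for j in range(d) if theta[j] < theta[i])
              for i in range(d)]
    result = []
    for i in range(d):  # blocks are read from left to right
        result.extend(value + offset[i] for value in blocks[i])
    return tuple(result)


# 132[21, 1, 12]: the three blocks are placed according to the diagram of 132.
print(substitute((1, 3, 2), [(2, 1), (1,), (1, 2)]))
```

Inflating every point by the permutation of size 1 leaves θ unchanged, which is a quick sanity check for the offsets.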
We note that a substitution-closed class C is entirely determined by the set S of simple permutations in it (see Proposition 2.11). We consider this set S as the data of our problem, and the goal is to obtain, under various conditions on S, convergence results for uniform random permutations in the class C. These conditions will typically be expressed in terms of the generating function of S, which we conveniently also denote by S. By the Marcus-Tardos theorem [41] (formerly the Stanley-Wilf conjecture), it always has a positive radius of convergence ρ S > 0 (except in the trivial case where C is the set S of all permutations, which we exclude from now on; permutation classes different from S are called proper).

Permuton convergence of substitution-closed classes
The notion of permutons was introduced in [29] to describe limits of permutation sequences. Formally, a permuton is a probability measure on the unit square [0, 1]^2 whose projection on each axis is the Lebesgue measure on [0, 1] (we say that the measure has uniform marginals). Permutations can be seen as permutons by considering the rescaled diagrams; we denote by µ ν the permuton associated with the permutation ν.
The weak topology on measures then gives a natural meaning to the convergence of a sequence of permutations to a given permuton. A nice feature is that the convergence in terms of permutons is equivalent to the convergence of pattern proportions. We refer to [7, Section 2] for details. Some specific permutons have been described as limits of permutation classes, as in [12, Chapter 6], [15] and [9,7,8]. Among these, the biased Brownian separable permuton µ (p) of parameter p is a random permuton, constructed from a Brownian excursion and independent signs associated with its local minima, see Maazoun [40]. It was proved in [7,9] that this is a universal limiting object for substitution-closed permutation classes, in the sense that uniform random permutations in many substitution-closed classes converge to µ (p) , for some p. In this article, we give a new proof of this theorem that is based on an extension of Aldous' skeleton decomposition [6] and the framework of random enriched trees and tree-like structures [48,50].
Theorem 1.1. Let ν n be the uniform n-sized permutation from a proper substitution-closed class of permutations C. Suppose that
$$S'(\rho_S) > \frac{2}{(1+\rho_S)^2} - 1 \qquad (1.1)$$
or

Local convergence: a concentration phenomenon for substitution-closed classes
In addition to scaling limits, our decorated tree approach also allows us to obtain local limit results for uniform random permutations in substitution-closed classes. For this, we use a local topology for permutations recently defined by Borga in [13]. This topology is the analogue of the celebrated Benjamini-Schramm convergence for graphs, in the sense that we look at the neighbourhood of a random element of the permutation. Pleasantly, convergence for this local topology is equivalent to the convergence of consecutive pattern proportions.
For convenience, we present our results in terms of consecutive patterns. For a permutation ν and a pattern π, we denote by c-occ(π, ν) the number of consecutive occurrences of π in ν; for instance, for π = 21 (resp. π = 321), this is the number of descents (resp. double descents) of the permutation.
For each n ∈ N, we consider a uniform random permutation ν n of size n in C. Then, for each pattern π ∈ C, there exists γ π,C in [0, 1] such that
$$\frac{1}{n}\,\text{c-occ}(\pi, \nu_n) \xrightarrow{P} \gamma_{\pi,C}. \qquad (1.4)$$
We note that the hypothesis made in this theorem is slightly weaker than that for scaling limits. The theorem shows the convergence of all random variables $\frac{1}{n}\,\text{c-occ}(\pi, \nu_n)$ to deterministic constants, revealing a "concentration" phenomenon in substitution-closed classes under hypothesis (1.3). The constants γ π,C can be constructed from local limits of conditioned Galton-Watson trees around a random leaf, see Section 6 and in particular Theorem 6.22. They depend both on the pattern π and on the class C.
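To make the statistic c-occ(π, ν) concrete, here is a minimal Python sketch that counts consecutive occurrences by sliding a window over ν. The names `std` and `c_occ` and the tuple encoding of permutations are assumptions of this example.

```python
def std(seq):
    """Standardize a sequence of distinct numbers into a permutation (tuple)."""
    rank = {value: r for r, value in enumerate(sorted(seq), start=1)}
    return tuple(rank[value] for value in seq)


def c_occ(pi, nu):
    """Number of consecutive occurrences of the pattern pi in nu."""
    k = len(pi)
    # One window of k adjacent entries per starting position.
    return sum(std(nu[i:i + k]) == tuple(pi) for i in range(len(nu) - k + 1))


nu = (1, 5, 3, 2, 4, 6, 7)
print(c_occ((2, 1), nu))     # descents of nu
print(c_occ((3, 2, 1), nu))  # double descents of nu
```

Dividing `c_occ(pi, nu)` by the size of ν gives the rescaled quantity whose convergence is discussed above.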

Proof methodology
Start with a permutation ν of size n ≥ 2. If it is neither simple nor monotone, it can be written as θ[ν (1) , ..., ν (d) ], for some smaller permutations θ, ν (1) , ..., ν (d) . We can iterate this decomposition on θ, ν (1) , ..., ν (d) : as long as they are neither simple nor monotone, we decompose them further through substitution. The result is a representation of ν as a tree with n leaves, whose internal vertices are decorated by monotone or simple
positively and negatively decorated vertices to be of a different type from other vertices in the tree.
Results on conditioned multitype Galton-Watson trees do exist in the literature: in particular, there are some scaling limit results under finite or infinite variance assumptions [11,18,42], and local limit results around the root [1,47] for such trees. Nevertheless these results do not cover our needs.
• For the scaling limit results on permutations, we need information on the type and outdegree of the closest common ancestors of randomly selected leaves (while tree scaling limit results only give information on the genealogy of such leaves).
• For the local limit results on permutations, we need some local limit results around a random leaf, and not around the root. For studying local convergence of random separable permutations we additionally require joint convergence with the parity of the height of the leaf.
We therefore do not use this encoding as multitype Galton-Watson trees, but rather provide a novel encoding of random permutations in substitution-closed classes as decorated monotype Galton-Watson forests. That is, random plane forests where each vertex is enriched with an independent local structure. This integrates the random permutations naturally into the framework of random tree-like structures [48].
To identify permutations with decorated forests, we first note that a generic permutation is the ⊕-sum of an ordered sequence of ⊕-indecomposable permutations, i.e., of permutations which cannot be obtained as a substitution 12[π (1) , π (2) ] (see Theorem 2.5 below). We then associate to each of these ⊕-indecomposable permutations its canonical tree. To those trees, we apply a packing procedure. This packing procedure merges each vertex decorated with a simple permutation with those of its children that have a positive decoration. As a consequence, we no longer need to distinguish between positive and negative decorations. The resulting tree, called the packed tree of the (⊕-indecomposable) permutation, is still a decorated tree with n leaves, but the decorations are now more complicated objects than permutations, being themselves trees of permutations (called S-gadgets below, see Section 2.3 for details). The advantage of this new representation is that there is no condition on the decoration of a vertex depending on that of its parent. As a result of this construction, any permutation is represented as an ordered sequence of decorated trees, i.e., an ordered decorated forest, without any constraint on the decorations (Theorem 2.19). We note that this representation is a bijection from the set of all permutations to ordered decorated forests, and could thus be of interest independently of its application to the study of random elements in substitution-closed classes done here.
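The first step of this encoding, splitting a permutation into its ⊕-indecomposable components, can be sketched as follows. The sketch relies on the elementary fact that ν can be cut after position k exactly when its first k entries are {1, ..., k}; the function names and the tuple encoding are assumptions of this example.

```python
def std(seq):
    """Standardize a sequence of distinct numbers into a permutation (tuple)."""
    rank = {value: r for r, value in enumerate(sorted(seq), start=1)}
    return tuple(rank[value] for value in seq)


def oplus_components(nu):
    """Return the ordered list of ⊕-indecomposable components of nu."""
    n = len(nu)
    # A cut after position k is possible iff the first k entries are 1..k,
    # i.e. iff their maximum equals k.
    cuts = [k for k in range(1, n) if max(nu[:k]) == k]
    bounds = [0] + cuts + [n]
    return [std(nu[a:b]) for a, b in zip(bounds, bounds[1:])]


print(oplus_components((2, 1, 5, 3, 4)))  # two components
print(oplus_components((2, 4, 1, 3)))     # already ⊕-indecomposable
```

Each component returned here would then receive its own canonical tree, and hence its own packed tree, giving the ordered forest described above.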
To study random permutations of size n taken uniformly at random in a substitution-closed class C, we use a result on convergent Gibbs partitions (see Stufler [49, Thm. 3.1]) to prove that the associated ordered decorated forest contains a giant tree of size n − O p (1) (Theorem 3.2). It is therefore enough to study a random decorated tree with n leaves. Such trees have the same distribution as a monotype Galton-Watson tree T ξ n with a specific offspring distribution ξ conditioned on having n leaves. We can therefore use results or techniques on monotype Galton-Watson trees, which are much more developed than in the multitype case.
• In particular, to find the scaling limit of our permutations, there are some results on the genealogy and the outdegree of common ancestors of randomly chosen vertices (this is implicit in the original paper of Aldous, see [6, Eq. (49)]). We will refer to this as Aldous' skeleton decomposition. In this article we will need an extension of this, considering also the local neighbourhood of the common ancestors (being therefore semi-local) and allowing us to condition on the number of leaves instead of the number of vertices. Theorem 4.2 provides a general result to this effect, allowing us to condition on the number of vertices with arity in any given set Ω ⊆ N 0 satisfying P(ξ ∈ Ω) > 0.
• The literature also contains results on the number of (extended) fringe subtrees of T ξ n (and related models) isomorphic to a given tree [3,28,31,48,51,52]. When ξ is critical, such results may be translated to local limit results for T ξ n , pointed at a random leaf (see Theorem 6.13). We shall however need and will prove a slightly stronger result when ξ is critical and additionally has finite variance, taking also into account the parity of the height of the pointed leaf (see Theorem 6.20).
The last step of the proofs (both in the scaling and local limit cases) is to translate the results on the packed trees to results on the permutation ν n itself. A difficulty here arises from the identification of positive and negative decorations in the packing construction. To invert this construction, and recover the correct signs on the decorated trees, we need to determine the distance to the closest ancestor decorated with a simple permutation. When S ≠ ∅, this ancestor is at a stochastically bounded distance, so that this inversion procedure is still local. However, when S = ∅, i.e., in the case of separable permutations, there is no such ancestor and we need to go all the way to the root to invert the packing construction. This creates an extra difficulty, which we overcome by using a local limit theorem for the length of "bones" in the skeleton decomposition.

Interpretation of the various assumptions on S
Our assumptions on S might seem artificial, but they are in fact very natural once the above representation of permutations as decorated conditioned Galton-Watson forests has been introduced. Namely:
• Eq. (1.3) is equivalent to the fact that the Galton-Watson tree model is critical;
• Eq. (1.1) asks in addition that the offspring distribution has small exponential moments;
• finally, Eq. (1.2) means that the offspring distribution has no exponential moments, but finite variance.
Such hypotheses are classical in the analysis of conditioned Galton-Watson trees, and give a probabilistic meaning to the conditions used in [7]. In terms of substitution-closed classes, the small exponential moment condition is satisfied for most classes in the literature, see the discussion in [
There is however at least one class not satisfying Eq. (1.3): the class Av(2413). The packed forest associated with a uniform random permutation ν n in this class has the distribution of a decorated conditioned Galton-Watson forest with a subcritical offspring distribution. It will therefore contain with high probability a unique vertex with macroscopic degree (see [31,34,38,52]). This vertex is decorated with a large simple permutation α n in the class and the scaling (resp. local) limit of ν n could be described if one knew that of α n . In the current state of the art, studying a uniform random simple α n in Av(2413) does not seem to be a simpler problem than the original one of studying ν n , hence this approach appears to be ineffective for Av(2413).

Outline of the paper
The paper is organized as follows. The rest of the introduction sets up some notation. Section 2 presents the combinatorial construction used in this paper, that is the canonical tree and packed forest associated with a permutation. Section 3 identifies the packed forest associated with a uniform random permutation in a substitution-closed class as a conditioned monotype Galton-Watson forest. We also discuss the existence of a giant tree in such a forest. In Section 4, we state and prove our improvement of Aldous' skeleton decomposition. The last two sections are devoted to the proofs of the main theorems: Section 5 for the scaling limit result (Theorem 1.1) and Section 6 for the local limit result (Theorem 1.2).

Permutation patterns and permutation classes: basic definitions and notation
We let N 0 = {0, 1, . . .} denote the collection of non-negative integers and N = {1, 2, . . .} the collection of strictly positive integers. For any n ∈ N, we denote the set of permutations of [n] := {1, 2, . . . , n} by S n . We write permutations of S n in one-line notation as ν = ν(1)ν(2) . . . ν(n). For a permutation ν ∈ S n the size n of ν is denoted by |ν|. We let S := ⋃ n∈N S n be the set of finite permutations. We write sequences of permutations in S as (ν n ) n∈N .
We will often view a permutation ν as its diagram, which is (as said earlier; see also the right part of Fig. 1) the set of points of the Cartesian plane at coordinates (j, ν(j)). If x 1 . . . x n is a sequence of distinct numbers, let std(x 1 . . . x n ) be the unique permutation π in S n that is in the same relative order as x 1 . . . x n , i.e., π(i) < π(j) if and only if x i < x j . Given a permutation ν ∈ S n and a subset of indices I ⊆ [n], let pat I (ν) be the permutation induced by (ν(i)) i∈I , namely, pat I (ν) := std((ν(i)) i∈I ). For example, if ν = 87532461 and I = {2, 4, 7} then pat {2,4,7} (87532461) = std(736) = 312.
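The two operations std and pat translate directly into code. The following Python sketch (function names and tuple encoding are assumptions of this example) reproduces the computation pat {2,4,7} (87532461) = 312 given above.

```python
def std(seq):
    """Unique permutation in the same relative order as the distinct numbers seq."""
    rank = {value: r for r, value in enumerate(sorted(seq), start=1)}
    return tuple(rank[value] for value in seq)


def pat(indices, nu):
    """Pattern induced by nu on a set of 1-based indices."""
    return std([nu[i - 1] for i in sorted(indices)])


nu = (8, 7, 5, 3, 2, 4, 6, 1)
print(std((7, 3, 6)))      # (3, 1, 2)
print(pat({2, 4, 7}, nu))  # (3, 1, 2)
```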
Given two permutations, ν ∈ S n for some n ∈ N and π ∈ S k for some k ≤ n, we say that ν contains π as a pattern (and we write π ≤ ν) if ν has a subsequence of entries order-isomorphic to π, that is, if there exists a subset I ⊆ [n] such that pat I (ν) = π. Denoting by i 1 , i 2 , . . . , i k the elements of I in increasing order, the subsequence ν(i 1 )ν(i 2 ) . . . ν(i k ) is called an occurrence of π in ν. In addition, we say that ν contains π as a consecutive pattern if ν has a subsequence of adjacent entries order-isomorphic to π, that is, if there exists an interval I ⊆ [n] such that pat I (ν) = π. Using the same notation as above, ν(i 1 )ν(i 2 ) . . . ν(i k ) is then called a consecutive occurrence of π in ν. Throughout the article, for any integers a, b ∈ Z (resp. n ∈ N), the interval [a, b] (resp. any interval I ⊆ [n]) is to be understood as an integer interval, i.e., an interval contained in Z. For real numbers a ≤ b, we use the same notation [a, b] to denote the interval [a, b] ⊆ R.
Example 1.3. The permutation ν = 1532467 contains 1423 as a pattern but not as a consecutive pattern, and it contains 321 as a consecutive pattern. Indeed pat {1,2,3,5} (ν) = 1423 but no interval of indices of ν induces the permutation 1423. Moreover, pat [2,4] (ν) = std(532) = 321.
We say that ν avoids π if ν does not contain π as a pattern. We point out that the definition of π-avoiding permutations refers to patterns and not to consecutive patterns. Given a set of patterns B ⊆ S, we say that ν avoids B if ν avoids π, for all π ∈ B. We denote by Av n (B) the set of B-avoiding permutations of size n and by Av(B) := ⋃ n∈N Av n (B) the set of B-avoiding permutations of arbitrary size.
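The definitions of pattern containment, consecutive containment and avoidance can be checked by brute force on small cases. The sketch below (hypothetical function names, exponential running time, intended only for illustration) verifies the claims of Example 1.3 and enumerates Av n (B) for the basis B = {2413, 3142}, whose elements are the separable permutations.

```python
from itertools import combinations, permutations


def std(seq):
    """Standardize a sequence of distinct numbers into a permutation (tuple)."""
    rank = {value: r for r, value in enumerate(sorted(seq), start=1)}
    return tuple(rank[value] for value in seq)


def contains(pi, nu):
    """True if nu contains pi as a (classical) pattern."""
    return any(std([nu[i] for i in idx]) == tuple(pi)
               for idx in combinations(range(len(nu)), len(pi)))


def contains_consecutive(pi, nu):
    """True if nu contains pi as a consecutive pattern."""
    k = len(pi)
    return any(std(nu[i:i + k]) == tuple(pi) for i in range(len(nu) - k + 1))


def av(n, basis):
    """All permutations of size n avoiding every pattern of the basis."""
    return [p for p in permutations(range(1, n + 1))
            if not any(contains(b, p) for b in basis)]


nu = (1, 5, 3, 2, 4, 6, 7)
print(contains((1, 4, 2, 3), nu), contains_consecutive((1, 4, 2, 3), nu))
print(contains_consecutive((3, 2, 1), nu))
print([len(av(n, [(2, 4, 1, 3), (3, 1, 4, 2)])) for n in range(1, 6)])
```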
A permutation class C is a set of permutations closed under pattern containment (i.e., if ν ∈ C and π ≤ ν then π ∈ C). We recall that every permutation class can be rewritten as a family of pattern-avoiding permutations, i.e., for every permutation class C there exists a set of patterns B ⊆ S such that C = Av(B). Note that if one permutation of B is contained in another then we may remove the larger one without changing the family. Thus we may take B to be an antichain, meaning that no element of B contains another. In the case that B is an antichain we call it the basis of this family. We note that the basis of a class may be finite or infinite.

Probabilistic notation
In order to avoid any confusion, we write random quantities using bold characters to distinguish them from deterministic quantities. Moreover, given a random variable X, we denote by L(X) its law. Unless otherwise stated, all limits are taken as n → ∞. Given a sequence of random variables (X n ) n∈N we write $X_n \xrightarrow{d} X$ to denote convergence in distribution and $X_n \xrightarrow{P} X$ to denote convergence in probability. We let O p (1) represent an unspecified random variable Y n of a stochastically bounded sequence (Y n ) n .
Besides, the indicator of an event A is denoted 1[A]. Finally, the expression with high probability means with probability tending to 1 (without precision on the speed of convergence).
Table 1 summarizes some notational conventions and frequently used terminology in this paper. In general, for a combinatorial class denoted by a curly letter, e.g. P, we use the same letter P(z) for its generating series, ρ P for the radius of convergence of P(z), a standard uppercase letter P for an object in the class, and a lowercase letter p n with index n ≥ 0 for the number of objects of size n. Bijections between classes will be denoted by two upper case letters, e.g. CT, which builds the canonical tree of a permutation.

Table 1.
Symbol                  Meaning
S n                     the set of permutations of size n
C                       a substitution-closed class of permutations
S                       the subset of simple permutations in C
T                       the class of canonical trees associated with C
P                       the class of packed trees associated with C
ν n                     the uniform random n-sized permutation from C
P n = (T n , λ Tn )     the uniform random packed tree with n leaves
T •                     the collection of (possibly infinite) pointed plane trees
T •,luf                 the collection of (possibly infinite) locally and upwards finite pointed plane trees
T •,luf D               the collection of (possibly infinite) locally and upwards finite decorated pointed plane trees
P •,luf                 the collection of (possibly infinite) locally and upwards finite pointed packed trees

A novel encoding of permutations as forests of decorated trees
In this section we show that any substitution-closed class of permutations may be bijectively encoded as a forest of trees decorated (or enriched) with local structures. This goal is achieved in Theorem 2.19, in Subsection 2.4.

Basics on combinatorial classes and decorated trees
In this paper, we only consider rooted (a.k.a. planted) plane trees; plane means that the children of a given vertex are endowed with a linear order. Throughout the paper, the outdegree d^+_T(v) (or d^+(v) when there is no ambiguity) of a vertex v in a tree T is the number of its children (which is sometimes called arity in other works). Note that it may be different from the graph-degree: the edge to the parent (if it exists) is not counted in the outdegree. We consider both finite and infinite trees. We say a tree is locally finite if all its vertices have finite degree. A vertex of T is called a leaf if it has outdegree zero. The collection of non-leaves (also called internal vertices) is denoted by V int (T). The fringe subtree of a tree T rooted at a vertex v is the subtree of T containing v and all its descendants. We will also speak of a branch attached to a vertex v for a fringe subtree rooted at a child of v.
Any plane tree may be encoded in a canonical way as a subtree of the Ulam-Harris tree U ∞ . The vertex set of U ∞ is given by the collection of all finite sequences of positive integers, and the offspring of a vertex (i 1 , . . . , i k ) is given by all sequences (i 1 , . . . , i k , j), j ≥ 1. The root of U ∞ is the unique sequence of length 0.
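As an illustration of this encoding, the following Python sketch maps a finite plane tree to its set of Ulam-Harris words. The representation of a plane tree as a nested list of children (a leaf being the empty list) is an assumption of this example.

```python
def ulam_harris(tree, prefix=()):
    """Set of Ulam-Harris words (tuples of positive integers) of a plane tree.

    A tree is given as the ordered list of the subtrees of its root,
    so that [] is a single vertex (hypothetical encoding).
    """
    words = {prefix}  # the root of this subtree
    for j, child in enumerate(tree, start=1):
        # The j-th child of the vertex `prefix` is the word prefix + (j,).
        words |= ulam_harris(child, prefix + (j,))
    return words


# Root with two children; the second child has two leaf children.
tree = [[], [[], []]]
print(sorted(ulam_harris(tree)))
```

The root corresponds to the empty word, and the outdegree of a vertex is the number of its one-letter extensions present in the set.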
Moreover, most trees considered here carry some additional structures on their vertices from a combinatorial class. Let D be a set and size : D → N 0 be a map from D to the set of non-negative integers, associating to each object in D its size. We say D is an (unlabelled) combinatorial class if for any n ∈ N 0 the number d n of n-sized objects in D is finite. This allows us to form the generating series
$$D(z) = \sum_{n \ge 0} d_n z^n.$$
Note that we use the same curvy letter D for the class and its generating series. This should hopefully not lead to confusion. Two combinatorial classes D 1 , D 2 are considered isomorphic if there is a size-preserving bijection between the two, or equivalently if they have the same generating series. Various standard operations are available for combinatorial classes. For example, whenever D has no objects of size 0, we can form the combinatorial class Seq(D), which is the collection of finite sequences of objects from D. The size of such a sequence is defined to be the sum of sizes of its components. We may also consider the subclass Seq ≥1 (D) ⊂ Seq(D) of non-empty sequences.
Definition 2.1. Let D be a combinatorial class. A D-decorated (or D-enriched) tree is a rooted locally finite plane tree T , equipped with a function dec : V int (T ) → D from the set of internal vertices of T to D such that the following holds: for each v in V int (T ), the outdegree of v is exactly size(dec(v)).
This is a (planar) variant of Labelle's enriched trees [39], which have been studied in [50,48] from a probabilistic viewpoint.
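The compatibility condition of Definition 2.1 between decorations and outdegrees can be illustrated by a small validity check for finite trees. In this sketch a vertex is a pair `(decoration, children)`, leaves carry the decoration `None`, and the size map is passed as a function; all of these choices are assumptions of the example.

```python
LEAF = (None, [])


def is_decorated_tree(vertex, size):
    """Check that each internal vertex v satisfies outdegree(v) == size(dec(v))."""
    decoration, children = vertex
    if not children:  # a leaf: no decoration is attached
        return decoration is None
    return (size(decoration) == len(children)
            and all(is_decorated_tree(child, size) for child in children))


# Decorations are permutations here, with size = number of entries.
good = ((2, 4, 1, 3), [LEAF, ((2, 1), [LEAF, LEAF]), LEAF, LEAF])
bad = ((1, 2), [LEAF, LEAF, LEAF])  # decoration of size 2, outdegree 3
print(is_decorated_tree(good, len), is_decorated_tree(bad, len))
```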
Examples of substitution (see Fig. 2 below) are conveniently presented by representing permutations by their diagrams: the diagram of ν = θ[ν (1) , . . . , ν (d) ] is obtained by inflating each point θ(i) of θ by a square containing the diagram of ν (i) . Note that each ν (i) then corresponds to a block of ν, a block being defined as an interval of [|ν|] which is mapped to an interval by ν.
Throughout this article, the increasing permutation 12 . . . k will be denoted by ⊕ k , or even ⊕ when its size k can be recovered from the context: this is the case in an inflation ⊕[ν (1) , . . . , ν (d) ] where the size of ⊕ is the number d of permutations inside the brackets. Similarly, we denote the decreasing permutation k . . . 21 by ⊖ k , or ⊖ when there is no ambiguity. Permutations can be decomposed in a canonical way using recursively the substitution operation. To explain this, we first need to define several notions of indecomposable objects.
A permutation of size n > 2 is simple if it contains no nontrivial block, i.e., if it does not map any nontrivial interval (i.e., a range in [n] containing at least two and at most n − 1 elements) onto an interval.
For example, 451326 is not simple as it maps the interval [3,5] onto the interval [1,3]. The smallest simple permutations are 2413 and 3142 (there is no simple permutation of size 3). We denote by S all the set of all simple permutations.
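The definition of simplicity can be tested directly by scanning all nontrivial intervals of positions. The following Python sketch (hypothetical function name, cubic-free but quadratic scan, tuple encoding assumed) confirms the examples above.

```python
from itertools import permutations


def is_simple(nu):
    """True if nu has size > 2 and maps no nontrivial interval onto an interval."""
    n = len(nu)
    if n <= 2:
        return False
    for i in range(n):
        lo = hi = nu[i]
        for j in range(i + 1, n):  # positions i..j, at least two of them
            lo, hi = min(lo, nu[j]), max(hi, nu[j])
            # The values form an interval iff their spread equals j - i.
            if j - i + 1 < n and hi - lo == j - i:
                return False
    return True


print(is_simple((4, 5, 1, 3, 2, 6)))  # False: positions 3..5 carry values 1..3
print([p for p in permutations(range(1, 5)) if is_simple(p)])
print(any(is_simple(p) for p in permutations(range(1, 4))))
```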

Remark 2.4.
Usually in the literature, the definition of a simple permutation requires n ≥ 2 instead of n > 2, so that 12 and 21 are considered to be simple. However, for decomposition trees, 12 and 21 do not play the same role as the other simple permutations, which is why we do not consider them to be simple.
Theorem 2.5 (Decomposition of permutations). Every permutation ν of size n ≥ 2 can be uniquely decomposed as either:
Remark 2.6. The above theorem is essentially Proposition 2 in [2], presented with a slightly different point of view. The decomposition according to Theorem 2.5 is obtained from the one of [2, Proposition 2] by merging maximal sequences of nested substitutions in 12 (resp. 21) into a substitution in ⊕ (resp. ⊖). For example, the second item above for d = 4 corresponds to 12[ν (1) , 12[ν (2) , 12[ν (3) , ν (4) ]]] with the notation of [2]. With this obvious rewriting, the statements of [2, Proposition 2] and of Theorem 2.5 are trivially equivalent.
This decomposition theorem can be applied recursively inside the permutations ν (i) appearing in the items above, until we reach permutations of size 1. Doing so, a permutation ν can be naturally encoded by a rooted labelled plane tree CT(ν) as follows.
(The notation CT(ν) stands for canonical tree; see Definition 2.7.)
• If ν = 1 is the unique permutation of size 1, then CT(ν) is reduced to a single leaf.
From the above theorem, the decomposition ν = β[ν (1) , . . . , ν (d) ] exists and is unique if |ν| ≥ 2. Moreover, ν (1) , . . . , ν (d) have size smaller than ν so that this recursive procedure always terminates and its result is unambiguously defined. The map CT is therefore well-defined. An example of this construction is shown in Fig. 3.

Figure 3: A permutation ν and its decomposition tree CT(ν). To help the reader understand the construction, we have coloured accordingly some blocks of ν and some subtrees of CT(ν).
Since the labels of the vertices record the permutation β in which we substitute, it is clear that CT is injective. Moreover, its inverse (once restricted to CT(S)) is immediate to describe, simply by performing the iterated substitutions recorded in the tree. We are just left with identifying the image set of CT. Recall that S all denotes the set of all simple permutations, and let M be the set of all monotone (increasing or decreasing) permutations of size at least 2. Denote Ŝ all := S all ∪ M.
Definition 2.7. A canonical tree is an Ŝ all -decorated tree such that we cannot find two adjacent vertices both decorated with increasing permutations (i.e., with ⊕) or both decorated with decreasing permutations (i.e., with ⊖).
Canonical trees are also known in the literature under several names: decomposition trees, substitution trees, etc. We choose the term canonical to be consistent with [7]. The following is an easy consequence of Theorem 2.5.
Proposition 2.8. The map CT defines a size-preserving bijection from the set of all permutations to the set of all canonical trees, the size of a tree being its number of leaves.
Remark 2.9. We note that the inverse map CT −1 , which builds a permutation from a canonical tree by performing nested substitutions, can obviously be extended to all Ŝ all -decorated trees, regardless of whether they contain ⊕ − ⊕ or ⊖ − ⊖ edges. However, CT −1 is no longer injective on this larger class of "non-canonical" decomposition trees.
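The map CT and its inverse can be sketched in Python as follows. The sketch uses the standard form of the substitution decomposition (cut at all ⊕-cuts if there are any, else at all ⊖-cuts, else at the maximal proper blocks), which is an assumption here since the items of Theorem 2.5 are not reproduced above; the tree encoding `(label, children)` with the string `"leaf"` for leaves and all function names are also hypothetical.

```python
LEAF = "leaf"


def std(seq):
    rank = {value: r for r, value in enumerate(sorted(seq), start=1)}
    return tuple(rank[value] for value in seq)


def substitute(theta, blocks):
    sizes = [len(b) for b in blocks]
    d = len(theta)
    offset = [sum(sizes[j] for j in range(d) if theta[j] < theta[i])
              for i in range(d)]
    return tuple(v + offset[i] for i in range(d) for v in blocks[i])


def split(nu, cuts):
    bounds = [0] + cuts + [len(nu)]
    return [std(nu[a:b]) for a, b in zip(bounds, bounds[1:])]


def canonical_tree(nu):
    nu, n = tuple(nu), len(nu)
    if n == 1:
        return LEAF
    plus = [k for k in range(1, n) if max(nu[:k]) == k]           # ⊕-cuts
    minus = [k for k in range(1, n) if min(nu[:k]) == n - k + 1]  # ⊖-cuts
    if plus:
        cuts, label = plus, tuple(range(1, len(plus) + 2))
    elif minus:
        cuts, label = minus, tuple(range(len(minus) + 1, 0, -1))
    else:
        # Greedily take the longest proper block starting at each position.
        cuts, i = [], 0
        while i < n:
            end = i + 1
            for j in range(i + 2, n + 1):
                w = nu[i:j]
                if j - i < n and max(w) - min(w) == j - i - 1:
                    end = j
            if end < n:
                cuts.append(end)
            i = end
        label = std([nu[a] for a in [0] + cuts])
    return (label, [canonical_tree(p) for p in split(nu, cuts)])


def to_permutation(tree):
    """Inverse direction: perform the nested substitutions recorded in the tree."""
    if tree == LEAF:
        return (1,)
    label, children = tree
    return substitute(label, [to_permutation(c) for c in children])


nu = (2, 3, 5, 1, 4)
print(canonical_tree(nu), to_permutation(canonical_tree(nu)) == nu)
# A non-canonical tree with a ⊕ − ⊕ edge encodes the same permutation as ⊕ 3.
non_canonical = ((1, 2), [((1, 2), [LEAF, LEAF]), LEAF])
print(to_permutation(non_canonical), canonical_tree((1, 2, 3)))
```

The last two lines illustrate Remark 2.9: nested substitution is defined on every decorated tree, but two different trees may then produce the same permutation.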
We will be interested in the restriction of CT to some permutation class. The following condition ensures that its image has a nice description.
Definition 2.10. A permutation class C is substitution-closed if for every θ, ν (1) , . . . , ν (d) in C it holds that θ[ν (1) , . . . , ν (d) ] ∈ C.
Proposition 2.11. Let C be a substitution-closed permutation class, and assume that 12, 21 ∈ C. Denote by S the set of simple permutations in C. The set of canonical trees encoding permutations of C is the set of canonical trees with decorations in Ŝ := S ∪ M.
Proof. First, if a canonical tree contains a vertex decorated by a simple permutation α ∉ S, then the corresponding permutation ν contains the pattern α ∉ C, and hence ν ∉ C. Second, by induction, all canonical trees with decorations in Ŝ = S ∪ M encode permutations of C, because C is substitution-closed. If necessary, details can be found in [2, Lemma 11].

Packed decomposition trees
From now until the end of the article we fix a substitution-closed class C such that 12, 21 ∈ C and we denote by S the set of simple permutations in C. The assumption that we are working in C rather than in the set of all permutations is however often tacit: for example, we simply refer to canonical trees instead of canonical trees with decorations in Ŝ = S ∪ M. We let T denote the collection of canonical trees with decorations in Ŝ, and T not⊕ ⊂ T the subset of canonical trees with a root that is not labelled ⊕.
In this section we introduce a new family of trees called "packed trees" and describe a bijection between the collection T not⊕ ⊂ T and packed trees. Packed trees are decorated trees, whose decorations are themselves trees. Let us define these decorations, that we call gadgets.
Definition 2.12. An S-gadget is an Ŝ-decorated tree of height at most 2 such that:
• The root is an internal vertex decorated by a simple permutation;
• The children of the root are either leaves or decorated by an increasing permutation.
The size of a gadget is its number of leaves.
We denote with G(S) the set of S-gadgets. An example of size 7 is shown on Fig. 4. Finally, let G(S) = G(S) ∪ {⊛ k , k ≥ 2}, where, for each integer k ≥ 2, the object ⊛ k has size k. To shorten notation, G(S) is sometimes denoted Q in the following.

Definition 2.13. An S-packed tree is a Ĝ(S)-decorated tree, its size being its number of leaves.
Remark 2.14. We will often refer to S-packed trees simply as packed trees since in our analysis, the substitution-closed class C and its set of simple permutations S will be fixed.
An example of a packed tree is shown on the right-hand side of Fig. 5. We now describe a bijection between canonical trees with a root that is not labelled with ⊕ and packed trees. Given a tree T ∈ T not⊕ , the corresponding packed tree PA(T ) is obtained by modifying T as follows.
• For each internal vertex v of T labelled by a simple permutation, we build an S-gadget G v whose internal vertices are v and the ⊕-children of v, the parent-child relation in G v and the left-to-right order between children are inherited from the ones in T , and we add leaves so that the outdegree of each internal vertex is the same in G v as in T . Then, in PA(T ), we merge v and the ⊕-children of v into a single vertex decorated by G v .
• The remaining vertices of T , decorated by ⊖ k or ⊕ k , are decorated by ⊛ k instead.
An example is given in Fig. 5. As a preparation for the inversion procedure, let us note the following: if a vertex ṽ in PA(T ) has a decoration ⊛ k and its parent is decorated by an S-gadget, then the corresponding vertex v in T has decoration ⊖ k . Indeed, a vertex decorated by ⊕ which is the child of a vertex v labelled by a simple permutation is included in G v , and canonical trees do not contain ⊕ − ⊕ edges.

Proposition 2.16. The map PA defines a size-preserving bijection from the set T not⊕ of canonical trees with a root that is not labelled ⊕ to the set of S-packed trees.
Proof. We only need to show that the previous construction is invertible. Given a packed tree P , the corresponding tree T of T not⊕ (such that P = PA(T )) is obtained by modifying P as follows.
• For each internal vertex ṽ of P decorated by an S-gadget G, we replace ṽ by G, merging the leaves of G with the children of ṽ, respecting their order. Namely, when doing this replacement, the root of the i-th subtree attached to ṽ (from left to right) is merged with the i-th leaf of G (also from left to right).
• We replace each decoration ⊛ k with either ⊖ k or ⊕ k , with the following rule. If ṽ is the root of P or the child of a vertex decorated by an S-gadget, it receives label ⊖ k . Otherwise, if ṽ is the child of a vertex also decorated by some ⊛, then we label ṽ in the only way that prevents the creation of ⊕ − ⊕ or ⊖ − ⊖ edges.
This shows that PA defines a bijection.
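The packing map and its inverse can be sketched as follows. The encoding is our own assumption, chosen only for illustration: a canonical tree is `None` (leaf) or `(label, children)` with `label` equal to `"+"`, `"-"` or a tuple (a simple permutation); a packed tree is `None` or `(decoration, children)`, where the decoration is `("star", k)` or `("gadget", simple, arities)`, and `arities[i]` is 1 if the i-th child of the gadget root is a leaf and k ≥ 2 if it is a ⊕-vertex with k children. The function names `pack`, `unpack` and `count_leaves` are hypothetical.

```python
def pack(tree):
    """Packing map PA on a canonical tree whose root is not labelled '+'."""
    if tree is None:
        return None
    label, children = tree
    if label in ("+", "-"):
        # vertices decorated by an increasing/decreasing permutation become stars
        return (("star", len(children)), [pack(c) for c in children])
    arities, packed_children = [], []
    for child in children:
        if child is not None and child[0] == "+":
            # a (+)-child is merged into the gadget; its children are re-attached
            arities.append(len(child[1]))
            packed_children.extend(pack(g) for g in child[1])
        else:
            arities.append(1)
            packed_children.append(pack(child))
    return (("gadget", label, tuple(arities)), packed_children)


def unpack(packed, star_label="-"):
    """Inverse map: a star at the root or just below a gadget receives '-'."""
    if packed is None:
        return None
    decoration, children = packed
    if decoration[0] == "star":
        flipped = "+" if star_label == "-" else "-"
        return (star_label, [unpack(c, flipped) for c in children])
    _, simple, arities = decoration
    subtrees = iter([unpack(c, "-") for c in children])
    rebuilt = []
    for k in arities:
        if k == 1:
            rebuilt.append(next(subtrees))
        else:
            rebuilt.append(("+", [next(subtrees) for _ in range(k)]))
    return (simple, rebuilt)


def count_leaves(tree):
    return 1 if tree is None else sum(count_leaves(c) for c in tree[1])


T = ((2, 4, 1, 3), [("+", [None, ("-", [None, None])]),
                    None,
                    ("-", [("+", [None, None]), None]),
                    None])
P = pack(T)
print(unpack(P) == T, count_leaves(P) == count_leaves(T))
```

The round trip on this small example is of course no proof, but it makes the labelling rule for ⊛ decorations concrete.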
Remark 2.17. If T (or P ) is a tree with n leaves, we can label its leaves with the numbers from 1 to n using a depth-first traversal of the tree from left to right. Then the i-th leaf of the canonical or packed tree associated to a permutation ν corresponds to the i-th element in the one-line notation of ν. We will use this identification between leaves and elements of the permutations later in the article.

Permutations are forests of decorated trees
Summing up the results obtained in the previous sections (in particular in Propositions 2.8, 2.11 and 2.16), we obtain a bijective encoding of ⊕-indecomposable permutations in C:
Lemma 2.18. The map PA ∘ CT is a size-preserving bijection from the set C not⊕ of all ⊕-indecomposable permutations in C to the set P of all S-packed trees.
By Theorem 2.5, any ⊕-decomposable permutation corresponds uniquely to a sequence of at least two ⊕-indecomposable permutations. Hence any permutation corresponds bijectively to a non-empty sequence of ⊕-indecomposable permutations. If we apply the bijection PA ∘ CT to each of them, we obtain a plane forest of packed trees, that is, an element of the collection Seq ≥1 (P) of non-empty ordered sequences of Ĝ(S)-decorated trees. We define the size of such a forest to be the total number of leaves. The function that maps a permutation of C to the corresponding forest of packed trees is denoted by DF (DF stands for decorated forest). Summing up:
Theorem 2.19. The map DF is a size-preserving bijection between the substitution-closed class of permutations C and the collection of forests of packed trees.
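The first step of this encoding, cutting a permutation into its ⊕-indecomposable components, is easy to make explicit. The sketch below is an illustration only (the function name `plus_components` is ours): a component ends at position i exactly when the first i entries are the values 1, ..., i, that is, when the running maximum equals i.

```python
def plus_components(perm):
    """Split a permutation (one-line notation) into its ⊕-indecomposable components."""
    components, start, running_max = [], 0, 0
    for i, value in enumerate(perm, start=1):
        running_max = max(running_max, value)
        if running_max == i:  # the prefix of length i is a permutation of 1..i
            components.append([x - start for x in perm[start:i]])
            start = i
    return components


print(plus_components([2, 1, 3, 5, 4]))
print(len(plus_components([13, 12, 5, 3, 4, 2, 6, 11, 9, 10, 1, 7, 8])))
```

Each component would then be mapped to a packed tree by PA ∘ CT, which yields the forest DF(ν).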

Reading patterns in trees
Let us consider a permutation ν of size n in C not⊕ and the associated canonical and packed trees: T = CT(ν) and P = PA(T ). Let I be a subset of [n]. Using Remark 2.17, I can be seen as a subset of the leaves of T (or P ). The purpose of this section is to explain how to read out the pattern π = pat I (ν) on the trees T or P .
Let us first note that a pattern π = π(1) . . . π(k) is entirely determined when we know, for each i 1 < i 2 , whether π(i 1 )π(i 2 ) forms an inversion (i.e., an occurrence of the pattern 21) or a non-inversion (occurrence of 12). Therefore, to read patterns on T (or P ), we should explain how to determine, for any two leaves ℓ 1 and ℓ 2 of I, whether the corresponding elements of ν form an inversion or not (in the sequel, we will simply say that ℓ 1 and ℓ 2 form an inversion, and not refer anymore to the corresponding elements of ν).
Looking at T , this is rather easy. We consider the closest common ancestor of ℓ 1 and ℓ 2 , call it v. By definition, ℓ 1 and ℓ 2 are descendants of different children of v, say the i 1 -th and i 2 -th. Then the following holds: ℓ 1 and ℓ 2 form an inversion in ν if and only if i 1 and i 2 form an inversion in the decoration β of v.
Let us now look at P . We consider the closest common ancestor u ∈ P of ℓ 1 and ℓ 2 and as before, we assume that ℓ 1 and ℓ 2 are descendants of the i 1 -th and i 2 -th children of u. Note that, in the packing bijection, the vertex u corresponds to v (the common ancestor of ℓ 1 and ℓ 2 in T ) potentially merged with other vertices.
Consider first the case that u is decorated by an S-gadget G. Then G contains the information of the decoration of all vertices merged into u, including v. Therefore, whether ℓ 1 and ℓ 2 form an inversion in ν can be determined by looking at the i 1 -th and i 2 -th leaves of the gadget G (see the example below).
If on the contrary u is not decorated by an S-gadget but by a ⊛, we need to determine whether v is decorated with ⊕ (implying that ℓ 1 and ℓ 2 form a non-inversion) or ⊖ (resp., an inversion).
Assume first that there is a closest ancestor u ′ of u that is decorated with an S-gadget. In this case, we claim that v is decorated by ⊖ if d(u, u ′ ) is odd, and it is decorated by ⊕ if d(u, u ′ ) is even. Indeed, decorations ⊕ and ⊖ alternate, and, by construction of the packing bijection, the vertex just above an S-gadget is decorated by a ⊖.
It remains to analyse the case where u is decorated by ⊛, as well as all vertices on the path from u to the root r of P . By construction, this implies that the root of T is decorated by ⊖. So, using again the alternation of ⊕ and ⊖ in T , the decoration of v ∈ T is ⊖ if d(u, r) is even, and ⊕ if d(u, r) is odd.
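The two parity rules for a common ancestor u decorated by ⊛ can be summarised in a small helper. This is a sketch with a hypothetical function name; `distance` stands for d(u, u ′ ) when a closest ancestor u ′ decorated by an S-gadget exists, and for d(u, r) otherwise.

```python
def star_sign(distance, has_gadget_ancestor):
    """Decoration ('+' or '-') carried in T by the vertex v corresponding to a star u of P."""
    if has_gadget_ancestor:
        # the star just below a gadget is '-', then signs alternate
        return "-" if distance % 2 == 1 else "+"
    # only stars up to the root: the root itself is '-', then signs alternate
    return "-" if distance % 2 == 0 else "+"


def form_inversion(distance, has_gadget_ancestor):
    """Two leaves whose closest common ancestor is this star form an inversion iff v is '-'."""
    return star_sign(distance, has_gadget_ancestor) == "-"


print(star_sign(1, True), star_sign(2, True), star_sign(0, False), star_sign(1, False))
```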
We note in particular that the pattern induced by a set I of leaves in P is determined by any fringe subtree containing all leaves of I and rooted at any vertex decorated with an S-gadget.
Example 2.20. Let ν = 13 12 5 3 4 2 6 11 9 10 1 7 8 be a permutation in C not⊕ with associated canonical and packed trees T = CT(ν) and P = PA(T ) shown in Fig. 6. We explain how to read out in P the pattern induced by the leaves ℓ 1 , ℓ 2 and ℓ 3 .
The closest common ancestor u ∈ P of ℓ 1 and ℓ 2 is decorated with a ⊛, which is at distance 1 from its closest ancestor decorated with an S-gadget. We can conclude that the leaves ℓ 1 and ℓ 2 induce an inversion (the closest common ancestor v of ℓ 1 and ℓ 2 in T carries a ⊖ decoration). Now consider ℓ 1 and ℓ 3 . Their closest common ancestor u ′ in P is decorated with an S-gadget. Note that ℓ 1 and ℓ 3 are descendants of the first and fifth children of this S-gadget; the corresponding leaves of the S-gadget have the vertex decorated by 2413 as common ancestor and are attached to the branches corresponding to 2 and 3. We deduce that ℓ 1 and ℓ 3 do not form an inversion in ν. Similarly, ℓ 2 and ℓ 3 do not form an inversion in ν either.
Putting all together, the pattern induced by ℓ 1 , ℓ 2 and ℓ 3 is 213. Let us check that it is indeed the case, by reading this pattern on the permutation. These three leaves correspond to the 4th, 6th and 12th elements of the permutation respectively, which have values 3, 2 and 7. The induced pattern is indeed 213.
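The direct check on the permutation can also be carried out mechanically. The helper below (name `pattern` is ours) extracts the pattern induced by a set of positions of a permutation given in one-line notation, and is applied to the permutation ν and the positions 4, 6 and 12 of Example 2.20.

```python
def pattern(perm, positions):
    """Pattern induced by the given (1-based) positions of perm."""
    values = [perm[p - 1] for p in sorted(positions)]
    # replace each value by its rank among the selected values
    ranks = {v: r for r, v in enumerate(sorted(values), start=1)}
    return [ranks[v] for v in values]


nu = [13, 12, 5, 3, 4, 2, 6, 11, 9, 10, 1, 7, 8]
print([nu[p - 1] for p in (4, 6, 12)])  # the selected values
print(pattern(nu, [4, 6, 12]))          # the induced pattern
```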

Random permutations and conditioned Galton-Watson trees
Throughout this section and the rest of the paper we assume that C is a proper substitution-closed class of permutations, that is we exclude the case where C is the class of all permutations. To avoid trivial cases, we furthermore assume that 12, 21 ∈ C.
Theorem 2.19 allows us to see a uniform random permutation ν n of size n in the substitution-closed permutation class C as a uniform random forest of packed trees with n leaves. In the present section we apply Gibbs partition methods [49] to show that a giant component with size n − O p (1) emerges, and the small fragments admit a limit distribution. This goal is achieved in Proposition 3.2. Since the size of the small fragments is stochastically bounded, this reduces the study of ν n to that of a uniform random packed tree with n leaves. The strength of this approach is that we do not need to make any additional assumptions on the class C.

Enumerative observations
Theorem 2.19 implies that the generating series C(z) of the class C and the generating series P(z) of packed trees satisfy
$$C(z) = \frac{P(z)}{1 - P(z)}.$$
From the definition of packed trees, we deduce the following equation for their generating series:
$$P(z) = z + Q(P(z)),$$
where Q(u) = Ĝ(S)(u) is defined as the generating function of Ĝ(S). Via basic algebraic manipulations, we rewrite this as
$$P(z) = z\,R(P(z)), \qquad R(u) = \frac{1}{1 - Q(u)/u}. \tag{3.3}$$
By definition, an S-gadget is described by a simple permutation of size, say, k, and k elements, which are either atoms (elements of size one) or increasing permutations of size at least two. Each such element has generating series z + z^2/(1 − z) = z/(1 − z), and the objects ⊛ k contribute z^2/(1 − z). Therefore
$$Q(z) = \frac{z^2}{1-z} + S\!\left(\frac{z}{1-z}\right),$$
and consequently,
$$R(u) = \frac{1}{1 - \frac{u}{1-u} - \frac{1}{u}\,S\!\left(\frac{u}{1-u}\right)}. \tag{3.4}$$
Since we assumed that C is proper, a celebrated result by Marcus and Tardos [41] states that the generating series C(z) has positive radius of convergence. Hence the same holds for S(z), and consequently, for Q(z) and R(z). A general result on solutions of implicit equations (such as (3.3)) [49,Lem. 3.3] implies that the n-th coefficient p n of P(z) satisfies the subexponentiality condition
$$\frac{p_n}{p_{n+1}} \to \rho_P, \qquad \frac{1}{p_n}\sum_{i+j=n} p_i\,p_j \to 2P(\rho_P) < \infty \tag{3.6}$$
as n → ∞, with 0 < ρ P < ∞ denoting the radius of convergence of P(z). This even implies P(ρ P ) < 1. It follows that the number c n of n-sized permutations in C satisfies
$$c_n \sim \frac{p_n}{(1 - P(\rho_P))^2}.$$
A classical bijection due to Ehrenborg and Méndez [21] consequently allows us to identify the class P with the class of R-enriched trees. The recursive equation P(z) = zR(P(z)) with R given in Eq. (3.4) is actually a consequence of this general bijection.

A giant ⊕-indecomposable component
Let ν be a permutation in the proper substitution-closed class of permutations C.
From Theorem 2.5, we know that either ν is ⊕-indecomposable, or ν = ⊕[ν (1) , . . . , ν (d) ] for a unique sequence of d ≥ 2 ⊕-indecomposable permutations ν (1) , . . . , ν (d) . In the first case, we set d = 1 and ν (1) = ν for convenience. Recall that Lemma 2.18 allows us to identify the classes C not⊕ and P. The subexponentiality condition (3.6) allows us to apply the Gibbs partition result [49,Thm. 3.1] to obtain the following result (only the first part will be useful in this paper, but we state the whole version for completeness):
Moreover, the other components converge jointly in distribution: denoting independent copies of a Boltzmann-distributed random object 𝛎 with distribution given by $\mathbb{P}(\boldsymbol{\nu} = \nu) = \rho_P^{|\nu|}/P(\rho_P)$.
Remark 3.3. We excluded the case of uniform unrestricted n-sized permutations. In this case, it is well-known that the permutation is with high probability ⊕-indecomposable.
This follows for example from [48,Cor. 6.19] in the tree literature or from [16,Thm 3.4] in the permutation literature.

From permutations to simply generated trees
Proposition 3.2 and Lemma 2.18 reduce the study of the proper substitution-closed class C to the study of the class P of packed trees. In this section, we explain how a random tree in P can be seen as a random simply generated tree with random decorations. This result may be seen as a special case of a sampling procedure [48,Sec. 6.4] for general enriched trees with a fixed number of leaves (so called enriched Schröder parenthesizations), but we present it in our specific setting to make the article more self-contained.
We can describe a packed tree P as a pair (T, λ T ) where T is a rooted plane tree and λ T is a map from the internal vertices of T to the set Q = G(S) which records the decorations of the vertices.
In order to sample a uniform packed tree with n leaves, we first simulate a random rooted plane tree T n and then a random decoration map λ Tn as follows.
Define the weight-sequence q = (q k ) k≥0 , where, for k ≥ 2, q k denotes the k-th coefficient of the generating series Q(z) = G(S)(z), while we set q 0 = 1 and q 1 = 0. We consider the simply generated tree T n (with n leaves) associated with weight-sequence q, i.e., by definition, T n is a random rooted plane tree such that for all rooted plane trees T with n leaves (we recall that V int (T ) denotes the set of internal vertices of T ). Here, Z n is the partition function given by where the sum runs over all rooted plane trees with n leaves. For a general introduction about simply generated trees see [31,Section 2.3].
Then, given a rooted plane tree T , let λ T be the random map such that for all internal vertices v of T ,
Lemma 3.4. The random packed tree P n = (T n , λ Tn ) is uniform among all the packed trees with n leaves.
Proof. Let P = (T, λ) be a packed tree with n leaves. Then (3.11) where in the second equality we use Eqs. (3.9) and (3.10).
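The computation behind this proof can be sketched as follows. We assume here the standard definition of a simply generated tree, namely that T n takes the value T with probability proportional to the product of the weights of its internal vertices, and that the decorations λ T (v) are chosen independently and uniformly among the elements of Q whose size equals the outdegree of v. The notation $d^+(v)$ for the outdegree of v is ours.

```latex
\begin{align*}
\mathbb{P}\bigl(P_n = (T,\lambda)\bigr)
  &= \mathbb{P}(T_n = T)\,
     \prod_{v \in V_{\mathrm{int}}(T)} \mathbb{P}\bigl(\lambda_T(v) = \lambda(v)\bigr) \\
  &= \frac{1}{Z_n} \prod_{v \in V_{\mathrm{int}}(T)} q_{d^+(v)}
     \cdot \prod_{v \in V_{\mathrm{int}}(T)} \frac{1}{q_{d^+(v)}}
   = \frac{1}{Z_n}.
\end{align*}
```

The right-hand side does not depend on (T, λ), which is exactly the uniformity claimed in Lemma 3.4; in particular Z n is the number of packed trees with n leaves.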

Random packed trees as conditioned Galton-Watson trees
Building on Lemma 3.4, in what follows we explain how to sample a uniform packed tree with n leaves as a randomly decorated Galton-Watson tree conditioned on having n leaves. Again, we refer to [48,Sec. 6.4] for a discussion in a more general context.
Let ρ q denote the radius of convergence of the generating series Q(z). As observed in Section 3.1, it holds that ρ q > 0. As we shall see, this implies that T n has the distribution of a Galton-Watson tree conditioned on having n leaves, whose offspring distribution ξ is defined below (for a similar discussion with a fixed number of vertices, see [31,Section 4]).
The offspring distribution ξ is given by with a, t 0 > 0 constants that are defined as follows. If lim z↗ρ q Q ′ (z) ≥ 1, let 0 < t 0 ≤ ρ q be the unique number with Q ′ (t 0 ) = 1. If the limit is less than 1, then set t 0 = ρ q . Finally Note that the tilting in Eq. (3.12) previously appeared in [45, Proposition 2] (see also the discussion above Corollary 1 in the same paper).
We note that ξ is always aperiodic since q k > 0 for k ≥ 2 (because of the ⊛ decorations). Moreover, we have (3.13) so that the Galton-Watson tree T ξ of offspring distribution ξ is either subcritical or critical. It is a simple exercise to check that T ξ , conditioned on having n leaves, has the same distribution as the simply generated tree T n defined by Eq. (3.9).
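To make the choice of t 0 concrete, here is a numerical sketch in the simplest case where S is empty, so that Q(z) = z^2/(1 − z) and ρ q = 1. It relies on an assumption of ours about the form of the tilting: P(ξ = k) = q k t 0 ^(k−1) for k ≥ 2, P(ξ = 1) = 0 and P(ξ = 0) = a, with a chosen so that the probabilities sum to 1. With this form the mean of ξ is Q ′ (t 0 ). The function names are hypothetical.

```python
import math


def q_prime_separable(t):
    # Q(z) = z^2 / (1 - z)  =>  Q'(z) = (2z - z^2) / (1 - z)^2
    return (2 * t - t * t) / (1 - t) ** 2


def solve_t0(q_prime, lo=0.0, hi=1.0, iterations=100):
    """Bisection for Q'(t0) = 1, assuming Q' is increasing on (lo, hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if q_prime(mid) < 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def offspring_law_separable(t0, k_max):
    """Assumed tilted law, truncated at k_max (q_k = 1 for k >= 2)."""
    law = {k: t0 ** (k - 1) for k in range(2, k_max + 1)}
    law[0] = 1 - t0 / (1 - t0)  # a = 1 - Q(t0) / t0
    law[1] = 0.0
    return law


t0 = solve_t0(q_prime_separable)
law = offspring_law_separable(t0, 200)
mean = sum(k * prob for k, prob in law.items())
print(t0, law[0], mean)
```

Here Q ′ (z) tends to infinity as z tends to 1, so we are in the critical case: t 0 solves 2t^2 − 4t + 1 = 0 in (0, 1), that is t 0 = 1 − 1/√2, and the mean of ξ is 1 up to truncation error.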
To end this section, we characterize when this Galton-Watson tree model is critical.
Below, we write S ′ (ρ S ) for lim z↗ρ S S ′ (z), noting that this limit may be infinite.
In this case, For the convenience of the reader, we note that the relation between t 0 and κ can be rewritten as κ = t 0 /(1 − t 0 ).
Proof. It holds that We perform the formal substitution z = y/(1 + y) (which implies z = ρ q ⇔ y = ρ S ). This

Semi-local convergence of the skeleton decomposition
The previous section establishes a connection between uniform random permutations and conditioned Galton-Watson trees. In this section, we provide a convergence result for skeletons induced by marked vertices in such trees. The application to permutations will be discussed in further sections.
Aldous [6,Eq. (49)] showed that the subtree spanned by a fixed number of random marked vertices in a large critical Galton-Watson tree admits a limit distribution. Here, we extend this skeleton decomposition so that it additionally describes the asymptotic local structure in o( √ n)-neighbourhoods around the marked vertices and their pairwise closest common ancestors. Note also that Aldous works with Galton-Watson trees conditioned on having n vertices, while we more generally consider Galton-Watson trees conditioned on having n vertices with out-degree in a given set Ω (see [46] or [37] for scaling limit results under such conditioning).

Extracting the skeleton with a local structure
Let k ≥ 1 denote a fixed integer and T a (rooted) plane tree. We choose an ordered sequence v = (v 1 , . . . , v k ) of vertices in T (possibly with repetitions) that we call marked vertices. The goal of this section is to associate to this data some object recording:
• the genealogy between the marked vertices;
• the local structure around the essential vertices of T , which we define as the root of T , the marked vertices v 1 , . . . , v k and their pairwise closest common ancestors;
• the distances in the original tree between these vertices.
The reader can look at Fig. 7 to see the different steps of the construction.
The first step is to consider the subtree R(T, v) consisting of the vertices v and all their ancestors. For each 1 ≤ j ≤ k the vertex v j in R(T, v) receives the label j. Note that the tree T may be constructed from the skeleton R(T, v) by attaching an ordered sequence of branches (rooted plane trees) at each corner of R(T, v). Here we have to consider the corner below the root-vertex twice, since branches at this corner may either be added to the left or to the right of R(T, v).
Figure 7: A tree T with two marked vertices v 1 and v 2 . In the left-most picture, the subtree R(T, v) is represented in bold, while branches attached to its corners are drawn with thinner lines. The middle picture represents R [1] (T, v): the essential vertices are in blue, and only one vertex of R(T, v) is at distance more than 1 from the closest essential vertex. The branches attached to that vertex do not belong to R [1] (T, v). The right-most picture represents s.R [1] (T, v). In particular, observe that the two middle edges of the path between the root and the branching vertex have been contracted into a single edge with label 2s.
The second step is to remove the vertices of T which lie outside of the skeleton R(T, v) and are "far" from the essential vertices. For convenience, we call distance of any branch B (grafted on R(T, v)) from a vertex w ∈ R(T, v) the distance in R(T, v) from w to the corner where B is attached. For any integer t ≥ 0, we let R [t] (T, v) denote the subtree of T that contains R(T, v) and all branches grafted on R(T, v) that have distance at most t from at least one essential vertex. In particular, R [t] (T, v) contains all vertices of T that lie at distance at most t from the essential vertices.
The final step of the construction is to shrink the paths of R [t] (T, v) consisting of the vertices whose attached branches have been removed in step 2. Indeed, we are interested in a scenario where the distance between any two essential vertices is much larger than 2t. Consider two essential vertices x ≠ y that are connected by a path not containing other essential vertices. Assume that x lies on the path from the root to y. If the distance between x and y is larger than 2t, then the path joining x and y consists of a starting segment of length t that starts at x, a middle segment of positive length, and an end segment of length t that ends at y. By construction, the branches attached to inner vertices of the middle segment of R(T, v) do not appear in R [t] (T, v). For any real number s > 0, we let s.R [t] (T, v) denote the result of contracting each middle segment to a single edge that receives a label given by the product of s and the number of deleted vertices in this segment.
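The first ingredients of this construction (the skeleton R(T, v), the essential vertices, and the inner vertices of the middle segments) can be computed as in the sketch below. The encoding of the tree by a dictionary of parent pointers, the toy tree and all function names are assumptions made for illustration; the plane ordering of children is ignored here since it plays no role in these three notions.

```python
def path_to_root(parent, v):
    """Vertices from v up to the root (inclusive), following parent pointers."""
    path = [v]
    while parent[v] is not None:
        v = parent[v]
        path.append(v)
    return path


def closest_common_ancestor(parent, u, v):
    ancestors_of_u = set(path_to_root(parent, u))
    return next(w for w in path_to_root(parent, v) if w in ancestors_of_u)


def skeleton(parent, marked):
    """Vertex set of R(T, v): the marked vertices and all their ancestors."""
    vertices = set()
    for v in marked:
        vertices.update(path_to_root(parent, v))
    return vertices


def essential_vertices(parent, marked):
    """Root, marked vertices and their pairwise closest common ancestors."""
    root = next(v for v, p in parent.items() if p is None)
    essential = {root, *marked}
    for i, u in enumerate(marked):
        for v in marked[i + 1:]:
            essential.add(closest_common_ancestor(parent, u, v))
    return essential


def distance(parent, u, v):
    w = closest_common_ancestor(parent, u, v)
    depth = lambda x: len(path_to_root(parent, x)) - 1
    return depth(u) + depth(v) - 2 * depth(w)


def remote_vertices(parent, marked, t):
    """Skeleton vertices at distance more than t from every essential vertex."""
    essential = essential_vertices(parent, marked)
    return {w for w in skeleton(parent, marked)
            if all(distance(parent, w, e) > t for e in essential)}


# Toy tree: a path 0-1-2-3-4-5, a fork at 5 towards 7 (via 6) and 8, and a branch 9 at 2.
parent = {0: None, 1: 0, 2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6, 8: 5, 9: 2}
marked = [7, 8]
print(essential_vertices(parent, marked), remote_vertices(parent, marked, 1))
```

In this toy example with t = 1, the branch at vertex 2 is discarded, and the middle segment between the root and the branching vertex 5 is contracted to a single edge with label 2s, since it has two inner vertices.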

The space of skeletons with a local structure
In the following, we will need to be more precise about the space in which s.R [t] (T, v) lives and the topology we consider on it. In the above construction, s.R [t] (T, v) is a tree with k distinguished vertices with outdegree in Ω, where at most 2k − 1 edges have a (length-)label. Moreover, the distances between successive essential vertices are at most 2t + 1 (we say that two essential vertices are successive if the path going from one to the other does not contain any other essential vertex). The set of trees (without edge-labels) with k distinguished vertices with outdegree in Ω such that the above distance condition holds is denoted T [t] k,Ω . Moreover, we say that G in T [t] k,Ω is generic if:
• there are 2k distinct essential vertices (the root, the k distinguished vertices and k − 1 closest common ancestors of pairs of distinguished vertices);
• the distances between successive essential vertices are exactly 2t + 1.
We note that the edges with (length-)label are middle edges of the paths of length 2t + 1 between essential vertices, and hence depend only on the shape of the tree. We can therefore encode these labels as a vector in R 2k−1 , that has entries equal to 0 whenever the corresponding essential vertices are at distance 2t or less. Finally, s.R [t] (T, v) can be seen as an element of T [t] k,Ω × (R + ) 2k−1 .
(A similar identification is done by Aldous throughout the article [6].) Using the discrete topology on T [t] k,Ω and the usual one on R 2k−1 , this gives a topology on T [t] k,Ω × (R + ) 2k−1 , and then it makes sense to speak of convergence in distribution in this space. We can also speak of density, taking as reference measure the product of the counting measure on T [t] k,Ω and the Lebesgue measure on (R + ) 2k−1 . Finally we denote by Sh and Lab the natural projections from T [t] k,Ω × (R + ) 2k−1 to T [t] k,Ω and (R + ) 2k−1 , respectively. In words, Sh erases the labels and outputs the shape of the tree, while Lab outputs the vector of labels.

The limit tree
Throughout Section 4 we let T denote a (non-degenerate) critical Galton-Watson tree having an aperiodic offspring distribution ξ. We also assume that ξ has finite variance σ 2 . We fix a subset Ω ⊆ N 0 satisfying P(ξ ∈ Ω) > 0.
Furthermore, for any fixed integer k ≥ 1 we say a proper k-tree is a (rooted) plane tree that has precisely k leaves, labelled from 1 to k, such that the root has outdegree 1 and all other internal vertices have outdegree 2. Note that each such tree has 2k − 1 edges and that there are $k!\,\mathrm{Cat}_{k-1} = 2^{k-1}\prod_{i=1}^{k-1}(2i-1)$ such trees. Indeed, up to the single edge attached to the root, these trees are complete binary trees with k leaves and a labelling of these leaves. In the following, we order the edges of a proper k-tree in some canonical order (e.g. depth first search order), so that we can speak of the i-th edge of the tree; the chosen order is not relevant though.
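The counting identity for proper k-trees, k! times the (k − 1)-st Catalan number on one side and 2^(k−1) times the product of the odd numbers 1, 3, ..., 2k − 3 on the other, can be checked numerically for small k. The function names in this short sketch are ours.

```python
from math import comb, factorial, prod


def catalan(n):
    """n-th Catalan number."""
    return comb(2 * n, n) // (n + 1)


def count_proper_k_trees(k):
    """k! labellings of the leaves times Cat_{k-1} binary shapes."""
    return factorial(k) * catalan(k - 1)


def product_formula(k):
    """2^(k-1) times the product of (2i - 1) for i = 1, ..., k - 1."""
    return 2 ** (k - 1) * prod(2 * i - 1 for i in range(1, k))


print([count_proper_k_trees(k) for k in range(1, 6)])
print(all(count_proper_k_trees(k) == product_formula(k) for k in range(1, 30)))
```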
For each integer t ≥ 1 we can now construct a random rooted plane tree T k,t Ω with k distinguished vertices labelled from 1 to k having outdegree in Ω, and 2k − 1 edges having length-labels. We will prove later that this tree is the limit of R [t] (T Ω n , v). A special case of this construction is illustrated in Fig. 8. The general procedure goes as follows:
Figure 8: panels "Step 1", "Step 2" and "Steps 3 and 4".
at random with density It is easy to check that this defines a probability distribution, using classical expressions for absolute moments of the Gaussian distribution. For each 1 ≤ i ≤ 2k − 1, we replace the i-th edge of the k-tree by a path of length 2t + 1 and assign label s i to the central edge of this path.

Step 3 (Thicken it)
Each internal vertex receives additional offspring, independently from the rest. Here vertices with outdegree 1 receive additional offspring according to an independent copy of ξ̂ − 1, while vertices with outdegree 2 receive additional offspring according to an independent copy of ξ * − 2. An ordering of the total offspring that respects the ordering of the pre-existing offspring is chosen uniformly at random.

Step 4 (Graft branches)
Each distinguished vertex (i.e., each leaf of the original k-tree) becomes the root of an independent copy of T conditioned on having root-degree in Ω. Other leaves of the tree resulting from step 3 become the roots of independent copies of Galton-Watson trees T , without conditioning.
T [t] k,Ω × (R + ) 2k−1 , the random tree T k,t Ω has density where, for a generic G in T [t] k,Ω and u in (R + ) 2k−1 , we have
Proof. By construction, it is clear that Sh(T k,t Ω ) and Lab(T k,t Ω ) are independent, that Sh(T k,t Ω ) is generic and that Lab(T k,t Ω ) has density h. We only need to prove that, for any generic G in T
For a non-branching internal vertex of outdegree d in R(G, u), the correct number of children is chosen with probability P(ξ̂ = d) and the correct ordering with probability d −1 . Again, multiplying these two, we get
• In step 4 of the construction, we need to choose copies of T or T conditioned to have root outdegree in Ω (the black and green triangles in Fig. 8) corresponding to that in G. The probability of this event is given as a product as follows. For each distinguished vertex v of outdegree d, we have a factor P(ξ = d)/P(ξ ∈ Ω) (the denominator comes from the conditioning that the outdegree of such a vertex is in Ω). For vertices in G \ R(G, u) of outdegree d, we have a factor P(ξ = d).
Summing up, since there are k − 1 branching vertices and k distinguished ones, we get that P Sh(T k,t and the factors 2 k−1 in the numerator and denominator cancel out.

Convergence
The following lemma extends Aldous' skeleton decomposition [6,Eq. (49)] by keeping track of o( √ n)-neighbourhoods near the essential vertices of the skeleton. The o( √ n)-threshold is sharp (for the applications in this paper, the convergence of t n -neighbourhoods for any sequence t n tending to infinity would suffice). We note that o( √ n)-neighbourhoods of the root have been previously considered in the literature, e.g. by Aldous [4,5] and Kersting [36]; see also [51,Theorem 5.2] for a result on the o( √ n)-neighbourhood of a uniform random vertex in the tree. Besides, Lemma 4.2 is also related to scaling limits obtained by Kortchemski [37] and Rizzolo [46], which imply convergence of R(T Ω n , v). We recall that we see trees of the form s.R [t] (T, u) and T k,t Ω as elements of the space T [t] k,Ω × (R + ) 2k−1 as explained in Section 4.2.
Lemma 4.2. Suppose that the offspring distribution ξ is critical, aperiodic, and has finite variance σ 2 . Let v be a vector of k ≥ 1 independently and uniformly selected vertices with outdegree in Ω of the conditioned tree T Ω n . Then for each constant positive integer t it holds that with c Ω = P(ξ ∈ Ω). Even stronger, for each sequence t n = o( √ n) of positive integers it holds that
Proof. We fix some sequence t n and let, for each n, (G, x) be an element in T [tn] k,Ω × (R + ) 2k−1 , with G generic and x taking integer coordinates.
We also fix constants b > a > 0 and a sequence s n with s n = o(n). The core of the proof consists in establishing that, as n → ∞, we have uniformly on pairs (G, x) such that √ n] and |G| Ω ≤ s n (recall that p G is defined in (4.5) and h(·) in (4.6)). Assume temporarily (4.10). Summing over all possible values of x (such that √ n], and making a go to 0, b go to +∞), we have uniformly on trees G with |G| Ω ≤ s n . Moreover, conditionally on the shape of this skeleton being G, Eq. (4.10) gives a local limit theorem for Lab(1.R [tn] (T Ω n , v)) with scaling factor c Ω σn −1/2 and limiting distribution with density h. This implies convergence in distribution of Lab(c Ω σn −1/2 .R [tn] (T Ω n , v)) to a random variable of density h. Comparing with Theorem 4.1, we see that Eq. (4.10) implies Eq. (4.8). Proving Eq. (4.9) needs an extra ingredient and we come back to it at the end of the proof.
To prove Eq. (4.10), we need some additional notation. First, we write $\ell = \sum_{i=1}^{2k-1} x_i$. Additionally, we let (X i , ξ̂ i ) i≥1 be independent copies of |T | Ω and ξ̂. Finally, we also set
The proof of Eq. (4.10) is split into two parts, of combinatorial and analytic nature respectively. The combinatorial part shows that . (4.13) We do this by decomposing combinatorially pairs (T ⋆ , v ⋆ ) (i.e. trees with distinguished vertices) such that 1.
The analytic part, based on a standard local limit lemma, then analyzes the numerator of the last factor and shows that (4.14) uniformly on integers ℓ in [a √ n, b √ n], and on trees G with |G| Ω ≤ s n .
Finally, an estimate for the denominator in (4.13) is given e.g. in [37, Thm. 8.1]: We leave it to the reader to check that, after many obvious cancellations, plugging the estimates (4.14) and (4.15) into (4.13) indeed gives (4.10).
The combinatorial part: proof of (4.13). We first consider the unconditioned Galton-Watson tree T . Good pairs (T ⋆ , v ⋆ ) can be constructed as follows.
i) We start from (G, x) and replace each edge with a length label x i by a path with x i internal vertices; in total, this operation creates ℓ new vertices, which we will refer to as the remote vertices.
ii) We choose the outdegrees (d i ) i≤ℓ in T ⋆ of the ℓ remote vertices of (T, u);
iii) For each remote vertex, we choose the distinguished offspring along which we have to proceed to get to the first descendant that is an essential vertex (d i possible choices).
iv) On each of the $m := \sum_{i \le \ell} (d_i - 1)$ other children of the remote vertices, we attach a fringe subtree; these subtrees are denoted (A j ) j≤m .
v) To ensure that |T ⋆ | Ω = n, the degrees (d i ) i≤ℓ and the subtrees (A j ) j≤m should be chosen such that $|G|_\Omega + \sum_{i=1}^{\ell} \mathbf{1}_{d_i \in \Omega} + \sum_{j=1}^{m} |A_j|_\Omega = n$.
Moreover, if (T ⋆ , v ⋆ ) corresponds in this construction to sequences (d i ) i≤ℓ and (A j ) j≤m , then we have (V G denoting the set of vertices of the tree G) (This probability is independent from the choices made in step iii)). The sum over good pairs (T ⋆ , v ⋆ ) in Eq. (4.16) can be rewritten as a sum over sequences of positive integers (d i ) i≤ℓ and sequences of trees (A j ) j≤m , with an extra factor $\prod_i d_i$ coming from the choices in item iii) above. We get The sum in the last line is the probability that the total number of vertices with outdegree in Ω in m independent copies of T is $n - |G|_\Omega - \sum_{i=1}^{\ell} \mathbf{1}_{d_i \in \Omega}$, i.e., with the notation Eq. (4.12), this is P( With the notation Eq. (4.12), the right-hand side can be simplified as Dividing by P(|T | Ω = n) gives Eq. (4.13), as wanted.
The analytic part: proof of (4.14). We are now looking for an asymptotic estimate for the probability P (S L = n − |G| Ω − Q). Since S m is a sum of m i.i.d. random variables with the same law as |T | Ω , this asymptotics depends on the tail of the distribution of |T | Ω . We recall from (4.15) that The law of large numbers tells us that Moreover, from standard deviation estimates, there is a sequence ε n → 0 with Moreover, the Azuma-Hoeffding inequality implies that for large enough M > 0 Thus we obtain Since by (4.11), we have
Let us check (4.20). By construction of T k,tn Ω (see Fig. 8), we have: where the summands in the right-hand side are independent and distributed as follows:
and, as above, S M denotes the sum of M independent random variables (X i ) i≤M of law |T | Ω , the X i being also independent of M;
• S ′ k is the sum of k i.i.d. random variables of law |T | Ω , conditioned on the root of T having outdegree in Ω;
• 4kt n is an upper bound for the number of vertices on the stretched skeleton of T k,tn Ω having outdegree in Ω.
Hence (4.20) holds if we select s n = o(n) such that t n / √ s n → 0, which is clearly possible since t n = o( √ n). This completes the proof.
The above proof essentially also gives a local version of Lemma 4.2, which we believe to be of independent interest, and state below as Theorem 4.3.

Lemma 4.3.
Let the offspring distribution $\xi$ be critical, aperiodic, and have finite variance. Let $v$ be a vector of $k \ge 1$ independently and uniformly selected vertices with outdegree in $\Omega$ of the conditioned tree $T^\Omega_n$. Besides, we fix sequences $t_n$ and $s_n$ of non-negative integers satisfying $t_n^2 = o(s_n)$ and $s_n = o(n)$. Then there exist sequences $a_n$ and $b_n$ tending to $0$ and $+\infty$ respectively such that:
i) k,Ω × $(\mathbb{R}_+)^{2k-1}$ with $G$ generic verifying the conditions $|G|_\Omega \le s_n$ and $a_n \sqrt{n} \le x_1 \le b_n \sqrt{n}$;
ii) and, if we write $(G, x) := 1.R^{[t_n]}(T^\Omega_n, v)$, then the following properties hold with high probability as $n$ becomes large: $G$ is generic, $|G|_\Omega \le s_n$ and $a_n \sqrt{n} \le x_1 \le b_n \sqrt{n}$.
Proof. The estimate i), with a fixed $a$ instead of $a_n$ and a fixed $b$ instead of $b_n$, has been proved in Eq. (4.10) above. The existence of sequences $a_n$ and $b_n$ such that i) holds is a direct consequence, using the following elementary analysis lemma.
Let F (A, n) be a bivariate function, nonincreasing in A. We assume that for any A > 0, we have lim n→∞ F (A, n) = 0. Then, there exists a sequence A n tending to 0 such that F (A n , n) tends to 0.
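One possible argument for this elementary lemma is the usual diagonal extraction, sketched below. The thresholds $N_j$ and the grid $A = 1/j$ are our own notation and do not appear in the source.

```latex
% Sketch of a diagonal argument; N_j and the grid A = 1/j are our notation.
For each integer $j \ge 1$, since $\lim_{n \to \infty} F(1/j, n) = 0$, pick $N_j$ such that
$|F(1/j, n)| \le 1/j$ for all $n \ge N_j$, and arrange $N_1 < N_2 < \cdots$.
Set $A_n := 1$ for $n < N_1$ and $A_n := 1/j$ for $N_j \le n < N_{j+1}$.
Then $A_n \to 0$, and for $N_j \le n < N_{j+1}$,
\[
  |F(A_n, n)| = |F(1/j, n)| \le 1/j ,
\]
so that $F(A_n, n) \to 0$ as $n \to \infty$.
```

If in addition $F \ge 0$ (as is the case for probabilities), the monotonicity in $A$ shows that any sequence $A'_n \ge A_n$ tending to $0$ also satisfies $F(A'_n, n) \le F(A_n, n) \to 0$.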
Finally ii) holds for any sequences a n and b n and any s n with t 2 n = o(s n ), as a consequence of Eqs. (4.9) and (4.20).
Finally, the following statement will be useful (with Ω = {0}, i.e. marking leaves) in the special case of separable permutations. It can either be proved as a corollary of Theorem 4.3, or similarly to Lemma 6.2 in [9].

Corollary 4.4.
Let the offspring distribution ξ be critical, aperiodic, and have a finite variance. Let v be a vector of k ≥ 1 independently and uniformly selected vertices with outdegree in Ω of the conditioned tree T Ω n . Then, for any fixed t, asymptotically as n → ∞, the parities of the heights of the essential vertices induced by v (except the root of T Ω n ) converge to Bernoulli random variables of parameter 1/2, independent among themselves, and from the tree Sh(1.R [t] (T Ω n , v)).

Background on permuton convergence
As mentioned in the introduction, a permuton $\mu$ is a Borel probability measure on the unit square $[0, 1]^2$ with uniform marginals, that is, $\mu([a, b] \times [0, 1]) = \mu([0, 1] \times [a, b]) = b - a$ for all $0 \le a \le b \le 1$. By definition, a random permutation $\nu_n$ converges weakly to a random permuton $\mu$ as $n \to \infty$ if the random probability measure $\mu_{\nu_n}$ converges weakly to $\mu$. There are different characterisations for this form of convergence [7, Thm. 2.5]. In particular, if $\nu_n$ has size $n$, then the following statements are equivalent:
i) There exists a permuton $\mu$ such that $\mu_{\nu_n} \xrightarrow{d} \mu$.
ii) For any integer $k \ge 1$, the pattern $\mathrm{pat}_{I_{n,k}}(\nu_n)$ induced by a uniform random $k$-element subset $I_{n,k} \subseteq [n]$ admits a distributional limit $\rho_k$.
In this case, the limit family $(\rho_k)_k$ is consistent in the sense that $\rho_k$ has size $k$ a.s. for all $k$ and $\mathrm{pat}_{I_{n,k}}(\rho_n) \stackrel{d}{=} \rho_k$ for all $1 \le k \le n$. The permuton $\mu$ may be constructed from the family $(\rho_k)_{k \ge 1}$, and is hence uniquely determined by it. In fact, there is a bijection between random permutons and consistent families [7, Prop. 2.9]. (Compare with a similar result for random trees [6, Thm. 18].)
The following permutons were introduced in [7,9] where they were proved to be the limit of some substitution-closed classes. (See also [40] for some properties of these permutons.)
i) The Brownian separable permuton corresponds to the case where $\rho_k$ is the image by $\mathrm{CT}^{-1}$ of a uniform binary plane tree with $k$ leaves with uniform independent decorations from $\{\oplus, \ominus\}$ on its internal vertices. (Recall from Theorem 2.9 that $\mathrm{CT}^{-1}$ can be applied to $\{\oplus, \ominus\}$-decorated trees, where neighbours may have the same sign.)
ii) Let $0 < p < 1$ be a constant. The biased Brownian separable permuton of parameter $p$ is constructed in the same way, but instead of assigning the $\oplus$ / $\ominus$ decorations via fair coin flips, we toss a biased coin that shows $\oplus$ with probability $p$.
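The finite-dimensional marginals $\rho_k$ described in i) and ii) can be simulated directly. The following is a minimal sketch, under the assumption that a $\oplus$-decorated vertex acts as a direct sum and a $\ominus$-decorated vertex as a skew sum of the permutations of its two subtrees. The function names and the nested-tuple encoding of trees are ours, not the paper's.

```python
import random
from math import comb


def catalan(n: int) -> int:
    """n-th Catalan number; there are catalan(k - 1) binary plane trees with k leaves."""
    return comb(2 * n, n) // (n + 1)


def uniform_binary_tree(k: int, rng: random.Random):
    """Uniform binary plane tree with k leaves: None is a leaf, (left, right) an internal vertex."""
    if k == 1:
        return None
    # The left subtree has j leaves with probability catalan(j-1) * catalan(k-j-1) / catalan(k-1).
    sizes = list(range(1, k))
    weights = [catalan(j - 1) * catalan(k - j - 1) for j in sizes]
    j = rng.choices(sizes, weights=weights)[0]
    return (uniform_binary_tree(j, rng), uniform_binary_tree(k - j, rng))


def separable_permutation(tree, p: float, rng: random.Random):
    """Decorate each internal vertex with ⊕ (probability p) or ⊖, and return the
    corresponding permutation in one-line notation (assumed sum / skew-sum convention)."""
    if tree is None:
        return [1]
    left = separable_permutation(tree[0], p, rng)
    right = separable_permutation(tree[1], p, rng)
    if rng.random() < p:
        # ⊕: direct sum, the right block sits above the left block
        return left + [x + len(left) for x in right]
    # ⊖: skew sum, the left block sits above the right block
    return [x + len(right) for x in left] + right


def sample_rho(k: int, p: float, rng: random.Random):
    """One sample of rho_k for the biased Brownian separable permuton of parameter p."""
    return separable_permutation(uniform_binary_tree(k, rng), p, rng)
```

With `p = 0.5` this samples from the marginals of the (unbiased) Brownian separable permuton; the degenerate values `p = 1` and `p = 0` give the increasing and decreasing permutations, which is a convenient sanity check.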
Putting together the pattern characterization of permuton convergence (recalled above), this description of ρ k , and the connection between patterns and subtrees explained in Section 2.5, we get a convenient sufficient condition for the convergence to a (biased) separable Brownian permuton.
To state it, we recall that, if ℓ is an ordered sequence of marked leaves in a tree T , then R(T, ℓ) denotes the subtree consisting of these marked leaves and all their ancestors. In addition, we denote by R ⋆ (T, ℓ) the tree obtained from R(T, ℓ) by successively removing all non-root vertices of outdegree 1, merging their two adjacent edges.
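As an illustration of the two operations just recalled, here is a small sketch computing $R(T, \ell)$ and $R^\star(T, \ell)$ for a finite tree. The encoding (a dictionary mapping each vertex to its ordered list of children) and the function name are hypothetical choices made for this example.

```python
def reduced_trees(children, root, marked):
    """Return (R, R_star) as child dictionaries.

    children: dict mapping a vertex to the ordered list of its children
              (leaves may be absent from the dict or map to []).
    marked:   sequence of marked leaves.
    """
    parent = {c: v for v, cs in children.items() for c in cs}

    # R(T, l): the marked leaves together with all their ancestors.
    keep = set()
    for leaf in marked:
        v = leaf
        while v not in keep:
            keep.add(v)
            if v == root:
                break
            v = parent[v]
    R = {v: [c for c in children.get(v, []) if c in keep] for v in keep}

    # R*(T, l): remove non-root vertices of outdegree 1, merging their two edges.
    def skip_unary(v):
        while v != root and len(R[v]) == 1:
            v = R[v][0]
        return v

    R_star = {}
    stack = [root]
    while stack:
        v = stack.pop()
        R_star[v] = [skip_unary(c) for c in R[v]]
        stack.extend(R_star[v])
    return R, R_star
```

For instance, with `children = {'r': ['a'], 'a': ['b', 'c', 'x'], 'b': ['l1'], 'c': ['l2', 'l3']}` and marked leaves `['l1', 'l2']`, the unary vertices `b` and `c` disappear and the result is a root of outdegree 1 above a binary vertex `a`.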
Lemma 5.1. Let p be a constant in [0, 1] and, for each n ≥ 1, ν n be a random permutation of size n. For each fixed k ≥ 1, we take a uniform random sequence ℓ = (ℓ 1 , . . . , ℓ k ) of k leaves in the canonical tree T n of ν n . We make the following assumptions.
• The tree R ⋆ (T n , ℓ) should converge (in distribution) to a proper k-tree.
• For each non-root internal vertex u of R ⋆ (T n , ℓ), we choose arbitrarily two leaves from ℓ, say ℓ iu and ℓ ju , whose common ancestor is u. We then assume that ℓ iu and ℓ ju form a non-inversion asymptotically with probability p, and that, when u runs over non-root internal vertices of R ⋆ (T n , ℓ), these events are asymptotically independent from each other and from the shape R ⋆ (T n , ℓ).
Then ν n converges to the biased separable Brownian permuton of parameter p.
The arbitrary choices made in the second item above are irrelevant. Indeed, when u has out-degree 2 in R ⋆ (T n , ℓ) (which is the case with high probability under the first assumption), the fact that ℓ iu and ℓ ju form an inversion or not does not depend on the choice of ℓ iu and ℓ ju (this is an easy consequence of the discussion from Section 2.5).

Permuton convergence of random permutations from substitution-closed classes
We now prove our first main theorem, Theorem 1.1. We start by stating this theorem more precisely.
Proof. By Proposition 3.2, it suffices to show that the uniform n-sized permutation ν n from C not⊕ satisfies We first consider the separable case S = ∅. Let T n be the canonical tree of ν n . Here a vertex of T n is decorated with ⊖ if and only if it has even height. Without its decorations, T n has the law of a critical Galton-Watson tree with finite variance conditioned on having n leaves (see [9,Sec. 2.2] or Section 3; for the separable case, packed trees and canonical trees only differ by their decorations).
Let k ≥ 1 be given and ℓ = (ℓ 1 , . . . , ℓ k ) be a uniform random sequence of leaves in T n . It follows from Lemma 4.2 that R ⋆ (T n , ℓ) is asymptotically a uniform random proper k-tree. Corollary 4.4 yields the additional information that the parities of the lengths of the 2k − 1 paths in T n corresponding to the edges of R ⋆ (T n , ℓ) converge jointly to 2k − 1 independent fair coin flips, independently of the shape R ⋆ (T n , ℓ). Hence in the limit as n → ∞ each non-root internal vertex of R ⋆ (T n , ℓ) receives a sign ⊕ or ⊖ with probability 1/2 (meaning that the corresponding leaves, in the sense of the second item of Theorem 5.1, form an inversion with probability 1/2); moreover, these events are asymptotically independent from each other and from the shape R ⋆ (T n , ℓ). As this holds for all k ≥ 1, thanks to Theorem 5.1, it follows that ν n converges in distribution to the Brownian separable permuton µ (1/2) .
Let us now consider the case $S \neq \emptyset$. In this case, it is more convenient to work with packed trees rather than canonical trees (note however that both trees have the same set of leaves). In particular, the random packed tree $P_n = (T_n, \lambda_{T_n})$ associated with the uniform permutation $\nu_n$ in $\mathcal{C}_{\mathrm{not}\oplus}$ is a Galton-Watson tree with a specific offspring distribution $\xi$ conditioned on having $n$ leaves, with independent random decorations on each vertex (see Section 3). As before, we fix $k \ge 1$ and consider a uniformly selected set of distinct leaves $\ell = (\ell_1, \dots, \ell_k)$ in $T_n$. By Theorem 4.2, we know that $R^\star(T_n, \ell)$ converges (in distribution) to a uniform proper $k$-tree (recall that $\xi$ is always aperiodic and that it has expectation 1 and finite variance by assumption, as needed to apply Theorem 4.2). In particular, the tree $R^\star(T_n, \ell)$ is a proper tree (with a root of outdegree 1 and other internal vertices of outdegree 2) with high probability, as $n \to \infty$. When this is the case, since the packing construction only merges internal vertices, $R^\star(T_n, \ell)$ coincides with $R^\star(\widetilde{T}_n, \ell)$, where $\widetilde{T}_n$ is the canonical tree associated with $\nu_n$. Therefore, although Theorem 5.1 is stated with the canonical tree $\widetilde{T}_n$, we can use it here with the packed tree $T_n$ instead.
Using the notation of Theorem 5.1, it remains to analyse whether ℓ iu and ℓ ju form an inversion or not (for non-root internal vertices u of R ⋆ (T n , ℓ)).
We recall (see Section 2.5) that if u is decorated with an S-gadget, then whether ℓ iu and ℓ ju form an inversion or not is determined by the decoration of u and by which branches attached to u contain ℓ iu and ℓ ju . This information is contained in Sh(s.R [0] (T n , ℓ)) for any s > 0.
On the other hand, if $u$ is decorated with $\circledast$, then in order to determine whether $\ell_{i_u}$ and $\ell_{j_u}$ form an inversion or not, we have to recover the parity of the distance of $u$ to its first ancestor decorated with an $S$-gadget (if it exists, otherwise to the root of $T_n$). Take $t_n$ tending to infinity, but with $t_n = o(\sqrt{n})$. By Lemma 4.2, $u$ has asymptotically $t_n$ ancestors with out-degrees $\hat\xi_1, \hat\xi_2, \dots, \hat\xi_{t_n}$ being independent copies of $\hat\xi$ defined in (4.2). Moreover the vertex $u$ and each of its ancestors receive a decoration that gets drawn independently and uniformly at random among all $\mathcal{G}(S)$-decorations with size equal to the out-degree of the vertex. In this setting, with high probability, one of the $t_n$ ancestors will receive an $S$-gadget as decoration. Therefore, with high probability, whether $\ell_{i_u}$ and $\ell_{j_u}$ form an inversion is determined by $\mathrm{Sh}(s.R^{[t_n]}(T_n, \ell))$ for any $s > 0$.
We say that two families (indexed by $\mathbb{N}$) of probability distributions are close when their total variation distance tends to 0 as $n$ tends to infinity. By Lemma 4.2, the distributions of the random trees $\mathrm{Sh}(s_n.R^{[t_n]}(T_n, \ell))$ and $\mathrm{Sh}(T^{k,t_n}_{\{0\}})$ are close, for a well-chosen sequence $s_n$. From the above discussion, this implies that the joint distributions of $R^\star(T_n, \ell)$ and $\big(\mathbb{1}[\ell_{i_u} \text{ and } \ell_{j_u} \text{ form an inversion in } (T_n, \ell)]\big)_u$ are close to the distributions of the same variables in the limiting tree $T^{k,t_n}_{\{0\}}$. When $n$ tends to infinity, these tend a.s. (with the obvious coupling between the $T^{k,t_n}_{\{0\}}$) to the same variables in $T^{k,t_*}_{\{0\}}$, where $t_*$ denotes the minimal radius such that each internal $\circledast$-decorated essential vertex (different from the root) has an ancestor decorated by an $S$-gadget.
In the limiting tree $T^{k,t_*}_{\{0\}}$, the neighbourhoods of the essential vertices $u$ are independent from each other and all have the same distribution (which does not depend on $k$, nor on the shape $R^\star(T^{k,t_*}_{\{0\}}, \ell)$, the latter being the proper $k$-tree taken at step 1 of the construction). Therefore the probability that $\ell_{i_u}$ and $\ell_{j_u}$ form a non-inversion tends to some parameter $p$ in $[0, 1]$, which depends only on the permutation class $\mathcal{C}$ we are working with. Moreover these events are asymptotically independent from each other and from the shape $R^\star(T_n, \ell)$. From Theorem 5.1, this implies that $\mu_{\nu_n} \xrightarrow{d} \mu^{(p)}$.
It remains to calculate an explicit expression for the limiting probability $p$. For this, we consider $k = 2$, i.e., $p$ is the probability that, in the limiting tree $T^{2,t_*}_{\{0\}}$, the two marked leaves $\ell_1$ and $\ell_2$ form a non-inversion.
For each integer $m \ge 1$, let $G_m$ be drawn uniformly at random among all $m$-sized $\mathcal{G}(S)$-objects, i.e., $P(G_m = G) = 1/q_m$ for all $G \in \mathcal{G}(S)$ of size $m$, where we recall that $Q(z) = \mathcal{G}(S)(z) = \sum_{k \ge 2} q_k z^k$ is the generating function in Eq. (3.5).
We also recall the following three distributions (see Eqs. where $t_0 = \frac{\kappa}{1+\kappa}$ is the parameter determined in Theorem 3.5 as the unique number such that $S'(\kappa) = 2/(1 + \kappa)^2 - 1$, $S(z)$ being the generating function for simple permutations in the considered substitution-closed class $\mathcal{C}$.
To determine whether ℓ 1 and ℓ 2 form an inversion or not, there are two cases to consider, depending on whether the decoration λ T k,t * {0} (u) =: λ(u) of the closest common ancestor u of ℓ 1 and ℓ 2 is an S-gadget or not.
We start with the case where it is not. Let $u'$ be the closest ancestor of $u$ that is decorated with an $S$-gadget. The limiting probability for $\ell_1$ and $\ell_2$ to form a non-inversion in this case is given by where we used that $u$ takes offspring according to $\xi^*$. Since the ancestors of $u$ (between $u$ and $u'$) take offspring according to $\hat\xi$, we have $P\big(d(u, u') \text{ is even and } > 0 \mid \lambda(u) = \circledast\big) = P(\mathrm{Geom}(\eta) \text{ is even and } > 0) =$ where in the last equality we used that $\sum_{k \ge 2} k(k - 1) t^{k-1}$
Now consider the case where the decoration $\lambda(u)$ is an $S$-gadget. That is, it consists of a root decorated with a simple permutation with several branches, each of which may be a leaf or a $\oplus$-decorated vertex to which at least 2 leaves are attached. By definition, the leaves $\ell_1$ and $\ell_2$ are descendants of different children of $u$, say the $i_1$-th and $i_2$-th. These $i_1$-th and $i_2$-th branches attached to $u$ identify two leaves (the $i_1$-th and $i_2$-th) of the $S$-gadget $\lambda(u)$ decorating $u$. If these two leaves belong to the same branch attached to the root of $\lambda(u)$, then $\ell_1$ and $\ell_2$ do not induce an inversion, as their closest common ancestor in $\lambda(u)$ is decorated by $\oplus$ (see also Section 2.5). Otherwise, when they belong to two different branches attached to the root of $\lambda(u)$ (say the $j_1$-th and $j_2$-th), it depends on the simple permutation $\alpha$ appearing in the root of $\lambda(u)$: $\ell_1$ and $\ell_2$ do not induce an inversion if and only if $\mathrm{pat}_{\{j_1, j_2\}}(\alpha) = 12$. Therefore, in the case where the decoration $\lambda(u)$ is an $S$-gadget, the limiting probability for $\ell_1$ and $\ell_2$ to form a non-inversion is given by Summing over the positions of the two uniformly chosen leaves, this probability is easily seen to be where the last equality is obtained via a computer algebra system.
Summing-up, and so $P\big(\lambda(u) \neq \circledast,\ i_1, i_2 \text{ are in the same branch of } \lambda(u)\big)$ where we used the formal power series identity to go from the third to the fourth line, and in the last equality we used Eq. (5.3) and $\kappa = \frac{t_0}{1 - t_0}$. It remains to compute the second term in Eq. (5.5). Using the obvious notation $S_{\le k} = \cup_{j \le k} S_j$, we start by determining
$$P\big(i_1, i_2 \text{ are not in the same branch of } \lambda(u),\ \mathrm{pat}_{\{j_1,j_2\}}(\alpha) = 12 \,\big|\, \lambda(u) \neq \circledast,\ d^+(u) = k\big) = \sum_{\alpha \in S_{\le k}} P\big(i_1, i_2 \text{ are not in the same branch of } \lambda(u),\ \mathrm{pat}_{\{j_1,j_2\}}(\alpha) = 12 \,\big|\, \lambda(u) \in G^k_\alpha\big) \cdot P\big(\lambda(u) \in G^k_\alpha \,\big|\, \lambda(u) \neq \circledast,\ d^+(u) = k\big), \tag{5.8}$$
where $G^k_\alpha$ denotes the set of $S$-gadgets of size $k$ with root-label $\alpha$. Trivially, recalling that $\binom{k-1}{|\alpha|-1}$ is the number of $S$-gadgets with $k$ leaves and root decorated by $\alpha$, we have Using again the formula (5.6), where occ(12, α) = |α| it follows that $P\big(\lambda(u) \neq \circledast,\ i_1, i_2 \text{ are not in the same branch of } \lambda(u),\ \mathrm{pat}_{\{j_1,j_2\}}(\alpha) = 12\big)$ where, to go from the third to the fourth line, we used that $\widetilde{\mathrm{occ}}(12, \alpha) = \mathrm{occ}(12, \alpha) / \binom{|\alpha|}{2}$ and the formal power series identity k≥a and in the last equality we used that $\kappa = \frac{t_0}{1 - t_0}$ (i.e. $t_0 = \frac{\kappa}{1 + \kappa}$) and the definition of $\mathrm{Occ}_{12}$.

Local convergence
In this section we investigate the local limits of uniform permutations in a fixed substitution-closed class C. We work under the following assumption.
We highlight that in this section we do not assume the finite variance hypothesis (as done in Section 5). See also Theorem 3.5 for an explicit characterization of this assumption.
Before stating our results, we recall in the following two sections the notions of local convergence for permutations and trees.

Local limits for permutations
In this section we recall the definition of local topology for permutations recently introduced by Borga in [13]. We start by defining finite and infinite rooted permutations. Then we introduce a local distance and the corresponding notion of convergence for deterministic sequences of rooted and unrooted permutations. Finally, we extend this notion of convergence (in two non-equivalent ways) to sequences of random unrooted permutations.
Definition 6.2. A finite rooted permutation is a pair $(\nu, i)$, where $\nu \in \mathcal{S}_n$ and $i \in [n]$ for some $n \in \mathbb{N}$.
We denote by $\mathcal{S}^n_\bullet$ the set of rooted permutations of size $n$ and by $\mathcal{S}_\bullet := \bigcup_{n \in \mathbb{N}} \mathcal{S}^n_\bullet$ the set of finite rooted permutations. We write sequences of finite rooted permutations in $\mathcal{S}_\bullet$ as $(\nu_n, i_n)_{n \in \mathbb{N}}$. To a rooted permutation $(\nu, i)$, we associate (as shown in the right-hand side of Fig. 9) the pair $(A_{\nu,i}, \preccurlyeq_{\nu,i})$, where $A_{\nu,i} := [-i + 1, |\nu| - i]$ is a finite interval containing 0 and $\preccurlyeq_{\nu,i}$ is a total order on $A_{\nu,i}$, defined for all $\ell, j \in A_{\nu,i}$ by
$$\ell \preccurlyeq_{\nu,i} j \quad \text{if and only if} \quad \nu_{\ell+i} \le \nu_{j+i}.$$
Informally, the elements of $A_{\nu,i}$ should be thought of as the column indices of the diagram of $\nu$, shifted so that the root is in column 0. The order $\preccurlyeq_{\nu,i}$ then corresponds to the vertical order on the dots in the corresponding columns.
Figure 9 (example shown: ν = 4 6 8 5 2 1 9 7 3, i = 4): Two rooted permutations and the associated total orders. The big red dot indicates the root of the permutation. The vertical grey strip and the relation between the two rooted permutations will be clarified later.
Clearly this map is a bijection from the space of finite rooted permutations S • to the space of total orders on finite integer intervals containing zero. Consequently and throughout the paper, we identify every rooted permutation (ν, i) with the total order (A ν,i , ν,i ).
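The identification between rooted permutations and total orders is easy to make concrete. The sketch below encodes a total order on $A_{\nu,i}$ as the list of its elements from smallest to largest; this encoding and the function names are our own choices for the example.

```python
def rooted_permutation_to_order(nu, i):
    """Total order associated with the rooted permutation (nu, i).

    nu is in one-line notation (nu[0] is the image of 1) and i is the 1-based root.
    Returns the elements of the interval [-i + 1, len(nu) - i], listed from the
    smallest to the largest for the order (column j carries the value nu_{j+i}).
    """
    n = len(nu)
    interval = range(-i + 1, n - i + 1)
    return sorted(interval, key=lambda j: nu[j + i - 1])


def order_to_rooted_permutation(chain):
    """Inverse map: recover (nu, i) from the increasing list of column indices."""
    i = 1 - min(chain)
    nu = [0] * len(chain)
    for rank, j in enumerate(chain, start=1):
        nu[j + i - 1] = rank
    return nu, i


# The rooted permutation of the figure: nu = 4 6 8 5 2 1 9 7 3 rooted at i = 4.
example_order = rooted_permutation_to_order([4, 6, 8, 5, 2, 1, 9, 7, 3], 4)
```

Here `example_order` lists the columns $-3, \dots, 5$ from the lowest dot to the highest one, and applying `order_to_rooted_permutation` recovers the original pair, in line with the bijection stated above.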
Thanks to the identification between rooted permutations and total orders, the following definition of infinite rooted permutation is natural.
Definition 6.3. We call infinite rooted permutation a pair $(A, \preccurlyeq)$ where $A$ is an infinite interval of integers containing 0 and $\preccurlyeq$ is a total order on $A$. We denote the set of infinite rooted permutations by $\mathcal{S}^\infty_\bullet$.
We highlight that infinite rooted permutations can be thought of as rooted at 0. We set $\tilde{\mathcal{S}}_\bullet := \mathcal{S}_\bullet \cup \mathcal{S}^\infty_\bullet$, which is the set of all (finite and infinite) rooted permutations.
We now introduce the following restriction function around the root defined, for every $h \in \mathbb{N}$, as follows
The above distance entails a notion of convergent sequences of rooted permutations.
For a sequence ν n of unrooted permutations, we consider the sequence of random rooted permutations (ν n , i n ), where i n is a uniform random index of ν n . We say that ν n converges in the Benjamini-Schramm sense if the sequence of random rooted permutations (ν n , i n ) converges in distribution for the above distance d p . This definition is inspired from Benjamini-Schramm convergence for graphs (see [10]). Benjamini-Schramm convergence can be extended in two different ways for sequences of random permutations (ν n ) n≥1 : the annealed and the quenched version of the Benjamini-Schramm convergence. These two different versions come from the fact that there are two sources of randomness, one for the choice of the random permutation ν n , and one for the random root i n . Intuitively, in the annealed version, the random permutation and the random root are taken simultaneously, while in the quenched version, the random permutation should be thought as frozen when we take the random root.
We now give the formal definitions. In both cases, $(\nu_n)_{n \in \mathbb{N}}$ denotes a sequence of random permutations in $\mathcal{S}$ and $i_n$ denotes a uniform index of $\nu_n$, i.e., a uniform integer in $[1, |\nu_n|]$.
Definition 6.4 (Annealed version of the Benjamini-Schramm convergence). We say that $(\nu_n)_{n \in \mathbb{N}}$ converges in the annealed Benjamini-Schramm sense to a random variable $\nu_\infty$ with values in $\tilde{\mathcal{S}}_\bullet$ if the sequence of random variables $(\nu_n, i_n)_{n \in \mathbb{N}}$ converges in distribution to $\nu_\infty$ with respect to the local distance $d_p$. In this case we write $\nu_n \xrightarrow{aBS} \nu_\infty$ instead of $(\nu_n, i_n) \xrightarrow{d} \nu_\infty$.
Definition 6.5 (Quenched version of the Benjamini-Schramm convergence). We say that $(\nu_n)_{n \in \mathbb{N}}$ converges in the quenched Benjamini-Schramm sense to a random measure $\mu_\infty$ on $\tilde{\mathcal{S}}_\bullet$ if the sequence of conditional laws $\big(\mathcal{L}((\nu_n, i_n) \mid \nu_n)\big)_{n \in \mathbb{N}}$ converges in distribution to $\mu_\infty$ with respect to the weak topology induced by the local distance $d_p$. In this case we write $\nu_n \xrightarrow{qBS} \mu_\infty$.
We highlight that, in the annealed version, the limiting object is a random variable with values in $\tilde{\mathcal{S}}_\bullet$, while for the quenched version, the limiting object $\mu_\infty$ is a random measure on $\tilde{\mathcal{S}}_\bullet$.
Theorem 6.6. For any $n \in \mathbb{N}$, let $\nu_n$ be a random permutation of size $n$. Then
i) The sequence $(\nu_n)_{n \in \mathbb{N}}$ converges in the annealed Benjamini-Schramm sense to some $\nu_\infty$ if and only if there exist non-negative real numbers $(\Delta_\pi)_{\pi \in \mathcal{S}}$ such that $\mathbb{E}[\text{c-occ}(\pi, \nu_n)] \to \Delta_\pi$ for all patterns $\pi \in \mathcal{S}$.
ii) The sequence $(\nu_n)_{n \in \mathbb{N}}$ converges in the quenched Benjamini-Schramm sense to some $\mu_\infty$ if and only if there exist non-negative real random variables $(\Lambda_\pi)_{\pi \in \mathcal{S}}$ such that $\big(\text{c-occ}(\pi, \nu_n)\big)_{\pi \in \mathcal{S}} \xrightarrow{d} (\Lambda_\pi)_{\pi \in \mathcal{S}}$ w.r.t. the product topology.
Since the variables c-occ(π, ν n ) take values in [0, 1], the quenched Benjamini-Schramm convergence implies the annealed one. The goal of the following sections is to prove that a sequence of uniform permutations in a substitution-closed class converges in the quenched Benjamini-Schramm sense using the packed trees representing permutations. To this end, we need to introduce a local topology for trees.
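For concreteness, the statistic appearing in this characterisation can be computed as follows. This is a minimal sketch which assumes that c-occ(π, ν) is the number of consecutive occurrences of π in ν divided by the size of ν (a normalisation consistent with values in [0, 1], but not spelled out in this section); the function names are ours.

```python
def pattern(values):
    """Standardisation of a sequence of distinct numbers, e.g. [8, 5, 9] -> [2, 1, 3]."""
    order = sorted(range(len(values)), key=lambda t: values[t])
    pat = [0] * len(values)
    for rank, t in enumerate(order, start=1):
        pat[t] = rank
    return pat


def consecutive_occurrences(pi, nu):
    """Number of windows of len(pi) adjacent entries of nu whose pattern is pi."""
    k, n = len(pi), len(nu)
    return sum(pattern(nu[s:s + k]) == list(pi) for s in range(n - k + 1))


def c_occ_proportion(pi, nu):
    """Consecutive occurrences divided by the size of nu (assumed normalisation)."""
    return consecutive_occurrences(pi, nu) / len(nu)
```

For ν = 4 6 8 5 2 1 9 7 3, the pattern 321 has three consecutive occurrences (8 5 2, 5 2 1 and 9 7 3), so the proportion is 3/9.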

Local limits for decorated trees
In this section we introduce a local topology for decorated trees with a distinguished leaf (called pointed trees in the sequel). This is a straightforward adaptation of that for trees with a distinguished vertex introduced by Stufler in [51].
Following the presentation in [48, Section 6.3.1], we start by defining an infinite pointed plane tree $U^\bullet_\infty$ (see Fig. 10 below). This infinite tree is meant to be a pointed analogue of the Ulam-Harris tree, so that pointed trees will be seen as subsets of it. To construct $U^\bullet_\infty$, we take a spine $(u_i)_{i \ge 0}$ that grows downwards, that is, such that $u_i$ is the parent of $u_{i-1}$ for all $i \ge 1$. Any vertex $u_i$, with $i \ge 1$, has an infinite number of children to the left and to the right of its distinguished offspring $u_{i-1}$. The former are ordered from right to left and denoted by $(v^i_{L,j})_{j \ge 1}$, the latter are ordered from left to right and denoted by $(v^i_{R,j})_{j \ge 1}$. Each of these vertices not belonging to the spine $(u_i)_{i \ge 0}$ is the root of a copy of the Ulam-Harris tree $U_\infty$. We always think of $U^\bullet_\infty$ as a tree with distinguished leaf $u_0$.
Definition 6.7. A (possibly infinite) pointed plane tree $T^\bullet$ is a subset of $U^\bullet_\infty$ such that
• Any maximal subset of $T^\bullet$ contained in one of the Ulam-Harris trees $U_\infty$ of $U^\bullet_\infty$ is a plane tree.
We denote by $\mathcal{T}^\bullet$ the space of (possibly infinite) pointed plane trees.
We say that a pointed tree T • in T • is locally and upwards finite if every vertex has finite degree and the intersection of T • with any one of the Ulam-Harris trees U ∞ of U • ∞ is finite. The set of locally and upwards finite pointed trees will be denoted by T •,luf .
Any finite plane tree $T$ together with a distinguished leaf $v_0$ may be interpreted in a canonical way as a pointed plane tree $T^\bullet$, such that $v_0$ is mapped to $u_0$. In particular, the backward spine $u_0, u_1, \cdots$ of the associated pointed plane tree $T^\bullet$ is finite and ends at the root of $T$. Next, we need to extend this notion to decorated trees. Let $\mathcal{D}$ be a combinatorial class. We define a $\mathcal{D}$-decorated locally and upwards finite pointed tree as a tree $T^\bullet$ in $\mathcal{T}^{\bullet,\mathrm{luf}}$, endowed with a decoration function $\mathrm{dec} : V_{int}(T^\bullet) \to \mathcal{D}$, such that, for each $v$ in $V_{int}(T^\bullet)$, the outdegree of $v$ is exactly $\mathrm{size}(\mathrm{dec}(v))$. We denote such a tree by the pair $(T^\bullet, \mathrm{dec})$ and the space of such trees by $\mathcal{T}^{\bullet,\mathrm{luf}}_{\mathcal{D}}$. As above, a decorated tree with a distinguished leaf can be identified with an element of this set.
). We note that, for any given h, the image set with the classical conventions that sup ∅ = 0, sup N = +∞ and 2 −∞ = 0.
Remark 6.8. The distance defined in Eq. (6.3) can be trivially restricted also to the space of non-decorated pointed trees. We point out that this distance is not equivalent to the distance considered in [48, Section 6.3.1] for the space of non-decorated pointed trees. For instance, if S n , 1 ≤ n ≤ ∞ is a star where the root has outdegree n and its children all have outdegree 0, then the sequence (S n ) n≥1 does not converge for our metric (and has no convergent subsequences). This implies that our space is not compact. On the contrary, the space of pointed trees endowed with the distance defined by Stufler in [48] is compact. We also note (without proof since we do not need this result) that in the subspace of locally finite pointed trees the two distances are topologically equivalent. A proof of this result would be an easy adaptation of [31, Lemma 6.2]. Proposition 6.9. The space (T •,luf D , d t ) is a Polish space.
Proof. The separability is trivial since $\biguplus_{h \ge 1} f^\bullet_h(\mathcal{T}^{\bullet,\mathrm{luf}}_{\mathcal{D}})$ is a countable dense set. The completeness follows from the fact that the space $\mathcal{T}^{\bullet,\mathrm{luf}}_{\mathcal{D}}$ is a closed subspace of a countable product of discrete sets (which is complete) via the map (T
We end this section defining two versions of the local convergence (similar to those previously defined for permutations) for random decorated trees with a uniform random distinguished leaf. In both definitions, $(T_n, \lambda_n)_{n \in \mathbb{N}}$ is a sequence of random (finite) $\mathcal{D}$-decorated trees and $\ell_n$ is a uniform random leaf of $(T_n, \lambda_n)$. Again, the quenched version is stronger than the annealed one.
Remark 6.12. It would also be natural, and closer to the usual notion of Benjamini-Schramm convergence in the literature, to distinguish a uniform random vertex $v_n$ rather than a uniform random leaf $\ell_n$ as above. The leaf version is however what we need here for our application to permutations.

Local convergence around a uniform leaf for random packed trees conditioned to the number of leaves
We begin this section by constructing the limiting random pointed packed tree $P^\bullet_\infty$. This tree will be the limit of the sequence of uniform packed trees $(T_n, \lambda_{T_n})$ considered in Theorem 3.4 pointed at a random leaf.
We recall that $\xi$ denotes the random variable defined in Eq. (3.12) and $T$ denotes the associated $\xi$-Galton-Watson tree. Additionally, we recall that the random variable $\hat\xi$ defined in Eq. (4.2) is the size-biased version of $\xi$.
We define the random tree $T^\bullet_\infty$ in the space $\mathcal{T}^{\bullet,\mathrm{luf}}$ as follows. Let $u_0$ be the distinguished leaf. For each $i \ge 1$, we let $u_i$ receive offspring according to an independent copy of $\hat\xi$. The vertex $u_{i-1}$ gets identified with an offspring of $u_i$ chosen uniformly at random. All other offspring vertices of $u_i$ become roots of independent copies of the Galton-Watson tree $T$.
Conditionally on $T^\bullet_\infty$, the random decoration $\lambda_\infty(v)$ of each internal vertex $v$ of $T^\bullet_\infty$ gets drawn uniformly at random among all $d^+_{T^\bullet_\infty}(v)$-sized decorations in $\mathcal{G}(S)$ independently of all the other decorations ($\mathcal{G}(S)$ was introduced after Theorem 2.12). This construction yields a random infinite locally and upwards finite pointed packed tree.
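The construction of the undecorated tree $T^\bullet_\infty$ can be simulated level by level along the spine. The sketch below is only illustrative: it takes the law of ξ as a dictionary `pmf` (hypothetical input format), uses the standard size-biasing formula P(ξ̂ = d) = d P(ξ = d) / E[ξ], encodes each Galton-Watson tree by its list of outdegrees in depth-first order, truncates very large trees (a purely practical deviation from the construction), and omits the decorations.

```python
import random


def size_biased(pmf):
    """Size-biased version of an offspring law given as {outdegree: probability}."""
    mean = sum(d * p for d, p in pmf.items())
    return {d: d * p / mean for d, p in pmf.items() if d > 0}


def draw(pmf, rng):
    """Draw one value from a finite law given as a dictionary."""
    values = list(pmf)
    return rng.choices(values, weights=[pmf[d] for d in values])[0]


def galton_watson(pmf, rng, max_size=10_000):
    """Outdegrees of a Galton-Watson tree in depth-first order.

    Returns None if the tree exceeds max_size vertices (practical truncation only).
    """
    degrees = []
    pending = 1  # number of vertices whose offspring has not been drawn yet
    while pending > 0:
        if len(degrees) >= max_size:
            return None
        d = draw(pmf, rng)
        degrees.append(d)
        pending += d - 1
    return degrees


def sample_spine(pmf, height, rng):
    """First `height` levels u_1, ..., u_height of the spine of the pointed tree.

    Each level is (d, pos, left, right): u_i has d children, u_{i-1} is the child of
    index pos (0-based, uniform), and left / right are the independent Galton-Watson
    trees hanging from the other children on each side of the spine.
    """
    biased = size_biased(pmf)
    levels = []
    for _ in range(height):
        d = draw(biased, rng)
        pos = rng.randrange(d)
        left = [galton_watson(pmf, rng) for _ in range(pos)]
        right = [galton_watson(pmf, rng) for _ in range(d - 1 - pos)]
        levels.append((d, pos, left, right))
    return levels
```

For example, with the critical binary law `{0: 0.5, 2: 0.5}`, every spine vertex has exactly two children and a single Galton-Watson tree hangs on a uniformly chosen side of the spine.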
We refer to the sequence of (decorated) vertices $(u_i)_{i \ge 0}$ as the infinite spine of $P^\bullet_\infty = (T^\bullet_\infty, \lambda_\infty)$. To simplify notation, we denote the space $\mathcal{T}^{\bullet,\mathrm{luf}}_{\mathcal{G}(S)}$ of (possibly infinite) locally and upwards finite pointed packed trees as $\mathcal{P}^{\bullet,\mathrm{luf}}$.
Proposition 6.13. Let $P_n = (T_n, \lambda_{T_n})$ be the random packed tree considered in Theorem 3.4 and $P^\bullet_\infty = (T^\bullet_\infty, \lambda_\infty)$ be the limiting random pointed packed tree constructed above. It holds that
$$\mathcal{L}\big((P_n, \ell_n) \mid P_n\big) \xrightarrow{P} \mathcal{L}(P^\bullet_\infty), \tag{6.4}$$
where $\ell_n$ is a uniform leaf of $P_n$ chosen independently of $P_n$. In particular, $P_n$ converges in the quenched Benjamini-Schramm sense to the deterministic measure $\mathcal{L}(P^\bullet_\infty)$ and in the annealed Benjamini-Schramm sense to the random tree $P^\bullet_\infty$.
Note that $\mathcal{L}(P^\bullet_\infty)$ is a measure on $\mathcal{P}^{\bullet,\mathrm{luf}}$. Since the limiting object in quenched Benjamini-Schramm convergence is in general a random measure on $\mathcal{P}^{\bullet,\mathrm{luf}}$, it should be interpreted as a constant random variable, equal to the measure $\mathcal{L}(P^\bullet_\infty)$.
Proof of Theorem 6.13. The sequence $\big(\mathcal{L}((P_n, \ell_n) \mid P_n)\big)_{n \in \mathbb{N}}$ is a sequence of random probability measures on the Polish space $(\mathcal{P}^{\bullet,\mathrm{luf}}, d_t)$. The set $\mathcal{B}$ of closed and open balls is a convergence-determining class for the space $(\mathcal{P}^{\bullet,\mathrm{luf}}, d_t)$, i.e., for every probability measure $\mu$ and every sequence of probability measures $(\mu_n)_{n \in \mathbb{N}}$ on $\mathcal{P}^{\bullet,\mathrm{luf}}$, the convergence $\mu_n(B) \to \mu(B)$ for all $B \in \mathcal{B}$ implies $\mu_n \to \mu$ w.r.t. the weak topology. This is a trivial consequence of the monotone class theorem and the fact that the intersection of two balls in $\mathcal{P}^{\bullet,\mathrm{luf}}$ is either empty or one of them.
Therefore, using [35, Theorem 4.11], the convergence in Eq. (6.4) is equivalent to the following convergence, for all $k \in \mathbb{N}$ and for all vectors of balls $(B_i)_{1 \le i \le k} \in \mathcal{B}^k$: Since the limiting vector in the above equation is deterministic, the above convergence in distribution is equivalent to the convergence in probability. Finally, standard properties of the convergence in probability imply that it is enough to show the component-wise rewrites as (The left-hand side is a function of $P_n$, and hence, a random variable; the right-hand side is a number.) Denoting by $L(P_n)$ the set of leaves of $P_n$, the left-hand side writes
For a vertex $v$ of $T_n$, we denote by $f(T_n, v)$ the fringe subtree rooted at $v$ and by $f(\lambda_{(T_n,v)})$ the map $\lambda|_{V_{int}(f(T_n, v))}$. Let also $T$ be the unpointed version of $T^\bullet$. Note that a leaf $\ell \in L(T_n)$ satisfies $f^\bullet_h(T_n, \ell) = T^\bullet$ if and only if its $h$-th ancestor $v$ satisfies $f(T_n, v) = T$. Additionally, to any $v$ with $f(T_n, v) = T$ corresponds exactly one leaf $\ell$ with $f^\bullet_h(T_n, \ell) = T^\bullet$ (which is determined by the pointing). Therefore we can rewrite the last term of the above equation as Noting that all fringe subtrees of $T_n$ that are equal to $T$ are necessarily disjoint and that, conditioning on $f(T_n, v) = T$, then $f(\lambda_{(T_n,v)}) = \lambda_T$ with probability $p$, independently from the rest (specifically p = u∈T q −1 ), we can conclude using Chernoff concentration bounds that , where the last equality follows from the construction of the map $\lambda_\infty$.

The continuity of the bijection between packed trees and ⊕-indecomposable permutations
In this section we consider a substitution-closed class C different from the class of separable permutations. The latter case will be considered separately in Section 6.5.
We recall that $\mathrm{DT} := \mathrm{PA} \circ \mathrm{CT}$ is the bijection presented in Theorem 2.18 between $\oplus$-indecomposable permutations of $\mathcal{C}$ and finite packed trees.
The goal of this section is to extend the bijection $\mathrm{DT}^{-1}$ as a function RP from the metric space of (possibly infinite) locally and upwards finite pointed packed trees $(\mathcal{P}^{\bullet,\mathrm{luf}}, d_t)$ to the metric space of (possibly infinite) rooted permutations $(\tilde{\mathcal{S}}_\bullet, d_p)$.
First, we need to deal with the introduction of a root in permutations (resp. a pointed leaf in trees) on finite objects. This is very simple, and we extend $\mathrm{DT}^{-1}$ as a function RP from finite pointed packed trees to finite rooted permutations as follows. We recall (see Theorem 2.17) that the $i$-th leaf $\ell$ of a packed tree $P = \mathrm{DT}(\nu)$ corresponds to the $i$-th element of the permutation $\nu$. Therefore the following definition is natural:
$$\mathrm{RP}(P, \ell) := (\mathrm{DT}^{-1}(P), i). \tag{6.9}$$
Given an infinite pointed packed tree $P^\bullet$ with infinitely many $S$-gadget decorations on its infinite spine, we consider the sequence of pointed subtrees $\big(f^\bullet_{s(h)}(P^\bullet)\big)_{h \in \mathbb{N}}$ consisting of all restrictions for $s(h) \in \mathbb{N}$ such that $f^\bullet_{s(h)}(P^\bullet)$ has root decorated with an $S$-gadget.
Lemma 6.14. Let $P^\bullet$ be an infinite pointed packed tree. Then the (deterministic) sequence of rooted permutations $\big(\mathrm{RP}(f^\bullet_{s(h)}(P^\bullet))\big)_{h \in \mathbb{N}}$ converges in the Benjamini-Schramm sense, as $h$ tends to $+\infty$.
Proof. In Section 2.5, we saw that the pattern associated to a set $I$ of leaves of a packed tree only depends on any fringe subtree containing all leaves in $I$ and rooted at a vertex decorated with an S-gadget. This implies that the family $(\mathrm{RP}(f^\bullet_{s(h)}(P^\bullet)))_{h \in \mathbb{N}}$ of elements in $\mathcal{S}^\bullet$ is consistent, i.e., for all $h \in \mathbb{N}$, there exists an integer $k(h)$ (the half-width of the restriction strip) such that $r_{k(h)}(\mathrm{RP}(f^\bullet_{s(h+1)}(P^\bullet))) = \mathrm{RP}(f^\bullet_{s(h)}(P^\bullet))$. By [13, Proposition 2.12], this implies the existence of a limit, which is what we wanted to prove.
Theorem 6.16 implies that, for $n \ge N'(k)$, we have $d_p(\mathrm{RP}(P^\bullet_n), \mathrm{RP}(P^\bullet)) \le 2^{-k}$. Since such an $N'(k)$ exists for all $k > 0$, we conclude that $\mathrm{RP}(P^\bullet_n) \to \mathrm{RP}(P^\bullet)$. Therefore the function RP is continuous on $C_{\mathrm{RP}}$, as claimed.
As a final preparatory result for the proof of Theorem 1.2 in the non-separable case, we show that the limit object $P^\bullet_\infty$ is in the continuity set of RP with probability 1.
Proposition 6.18. We have $\mathbb{P}(P^\bullet_\infty \in C_{\mathrm{RP}}) = 1$.
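Proof. We can rewrite $\mathbb{P}(P^\bullet_\infty \in C_{\mathrm{RP}})$ as
$$\mathbb{P}\big(\forall k > 0,\ \exists h(k) > 0 \text{ s.t. } f^\bullet_{h(k)}(P^\bullet_\infty) \text{ contains at least } k \text{ leaves before and } k \text{ leaves after } u_0, \text{ and has a root decorated with an S-gadget}\big).$$
Since the problem is symmetric, it is enough to show that for each fixed $k > 0$,
$$\mathbb{P}\big(\exists h(k) > 0 \text{ s.t. } f^\bullet_{h(k)}(P^\bullet_\infty) \text{ contains at least } k \text{ leaves before } u_0 \text{ and has a root decorated with an S-gadget}\big) = 1.$$
Note that
$$\mathbb{P}\big(\exists h(k) > 0 \text{ s.t. } f^\bullet_{h(k)}(P^\bullet_\infty) \text{ contains at least } k \text{ leaves before } u_0 \text{ and has a root decorated with an S-gadget}\big) \ge \mathbb{P}\big(P^\bullet_\infty \text{ has at least } k \text{ vertices } u_i \text{ in the infinite spine having at least one left child and an S-gadget as decoration}\big).$$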
Here and in what follows, a left child means a child to the left of the infinite spine.
By construction, in the infinite tree $P^\bullet_\infty$, the vertex $u_i$ has at least one left child when $u_{i-1}$ is not identified with its first offspring. Conditioned on $u_i$ having $d$ children (which happens with probability $\mathbb{P}(\xi = d)$), this occurs with probability $1 - 1/d$. Moreover, conditioning on $u_i$ having $d$ children, the probability that $u_i$ has an S-gadget as decoration is equal to $\frac{q_d - 1}{q_d}$, where we recall that $Q(z) = G(S)(z) = \sum_{k \ge 2} q_k z^k$ is the generating function in Eq. (3.5), and that $q_d > 1$ for some $d$ (since we are not treating the case of separable permutations here). Therefore, for all $i \ge 1$,
$$\mathbb{P}\big(u_i \text{ has at least one left child and is decorated by an S-gadget}\big) = \sum_{d \ge 2} \mathbb{P}(\xi = d) \Big(1 - \frac{1}{d}\Big) \frac{q_d - 1}{q_d} > 0.$$
By construction, all these events (for all $i \ge 1$) are independent. Since they happen with some positive probability independent of $i$, the second Borel-Cantelli lemma shows that a.s. infinitely many of them hold, and in particular at least $k$ of them. Consequently, $P^\bullet_\infty$ has a.s. at least $k$ vertices $u_i$ in its infinite spine that have at least one left child and are decorated by an S-gadget. This concludes the proof.

The separable permutations case
For the class of separable permutations, we cannot extend as before the map $\mathrm{DT}^{-1}$ as a function RP from the metric space of (possibly infinite) locally and upwards finite pointed packed trees $(\mathcal{P}^{\bullet,\mathrm{luf}}, d_t)$ to the metric space of (possibly infinite) rooted permutations $(\mathcal{S}^\bullet, d_p)$. Indeed, every packed tree obtained from a separable permutation contains only ⊛-decorations.
Instead, in this case, we have to consider two different functions $\mathrm{RP}_+$ and $\mathrm{RP}_-$ from the metric space of (possibly infinite) locally and upwards finite pointed rooted trees to the metric space of (possibly infinite) rooted permutations. We first define the maps for finite rooted trees pointed at a leaf (where all internal vertices are thought of as decorated by ⊛). Let $(T, \ell)$ be such a tree. We denote by $(T^\oplus, \ell)$ (resp. $(T^\ominus, \ell)$) the pointed canonical tree obtained from $(T, \ell)$ by labelling the parent of $\ell$ with ⊕ (resp. ⊖) and then labelling all the other internal vertices in the unique way that prevents the creation of ⊕-⊕ or ⊖-⊖ edges. Denoting by $i$ the label of the leaf $\ell$ (in the sense of  where the existence of the two limits is justified using similar arguments to the ones used in Theorem 6.14. We now set
$$C^*_{\mathrm{RP}} := \big\{ T^\bullet \in \mathcal{T}^{\bullet,\mathrm{luf}} : \forall k > 0,\ \exists h(k) > 0 \text{ s.t. } f^\bullet_{h(k)}(T^\bullet) \text{ contains at least } k \text{ leaves before and } k \text{ leaves after the distinguished leaf} \big\}.$$
We conclude this section with the following result dealing with the local limit of a uniform canonical tree $T_n$ associated with separable permutations, conditioned on having $n$ leaves, and where decorations have been removed. We note that $T_n$ is distributed as the random packed tree considered in Theorem 3.4 for the case of separable permutations (where decorations, which are all ⊛, have also been removed). Therefore all the properties for the offspring distribution $\xi$ are still valid. In particular, we remark that $\xi$ has finite variance in the case of separable permutations.
Proposition 6.20. Let $T_n$ be as above and $T^\bullet_\infty$ be the limiting random pointed tree constructed in Section 6.3. It holds that
$$\mathcal{L}\big( ((T_n, \ell_n), (-1)^{\mathrm{ht}(\ell_n)}) \,\big|\, T_n \big) \xrightarrow{P} \mathcal{L}\big( (T^\bullet_\infty, B_\pm) \big), \tag{6.14}$$
where $\ell_n$ is a uniform leaf of $T_n$ chosen independently of $T_n$, $\mathrm{ht}(\ell_n)$ denotes the height of the leaf $\ell_n$ and $B_\pm$ is a Bernoulli random variable on $\{1, -1\}$ independent of $T^\bullet_\infty$.
In particular, $T_n$ converges in the quenched Benjamini-Schramm sense to the deterministic measure $\mathcal{L}(T^\bullet_\infty)$ and in the annealed Benjamini-Schramm sense to the random tree $T^\bullet_\infty$. We highlight that, since we also want to keep track of the parity of the distance between the pointed leaf and the root of the tree, Theorem 6.20 does not follow from a simple adaptation of the proof of Theorem 6.13.
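The sign-labelling rule used to build $(T^\oplus, \ell)$ and $(T^\ominus, \ell)$ can be illustrated by a short sketch. Since no two adjacent internal vertices may carry the same sign, the sign of a vertex is determined by the parity of its depth, so fixing the sign of the parent of $\ell$ fixes the sign of the root. The code below is only an illustration under assumptions that are not in the text: trees are nested Python lists (any non-list value is a leaf), leaves are numbered from 1 in left-to-right order, ⊕ is read as a direct sum and ⊖ as a skew sum, and the function names are hypothetical.

```python
def leaf_depths(tree, depth=0):
    """Depths of the leaves in left-to-right order (a leaf is any non-list)."""
    if not isinstance(tree, list):
        return [depth]
    depths = []
    for child in tree:
        depths.extend(leaf_depths(child, depth + 1))
    return depths


def permutation(tree, sign):
    """One-line notation of the permutation encoded by `tree`.

    `sign` is the label of the root: +1 for a direct sum, -1 for a skew sum.
    Labels alternate along every edge, so children receive `-sign`.
    """
    if not isinstance(tree, list):
        return [1]
    blocks = [permutation(child, -sign) for child in tree]
    sizes = [len(block) for block in blocks]
    result = []
    for j, block in enumerate(blocks):
        # Direct sum: later blocks take larger values.
        # Skew sum: earlier blocks take larger values.
        shift = sum(sizes[:j]) if sign == 1 else sum(sizes[j + 1:])
        result.extend(value + shift for value in block)
    return result


def rooted_permutation(tree, i, parent_sign):
    """Rooted permutation of the tree pointed at its i-th leaf.

    `parent_sign` (+1 or -1) is the label given to the parent of the
    pointed leaf; all other labels follow by alternation.
    """
    height = leaf_depths(tree)[i - 1]
    if height == 0:
        raise ValueError("the pointed leaf has no parent")
    # The root lies height - 1 edges above the parent of the pointed leaf.
    root_sign = parent_sign if (height - 1) % 2 == 0 else -parent_sign
    return permutation(tree, root_sign), i
```

For the tree `[None, [None, None]]`, pointing at the first leaf with parent sign `+1` labels the root ⊕ and gives `([1, 3, 2], 1)`, whereas pointing at the second leaf with the same parent sign labels the root ⊖ and gives `([3, 1, 2], 2)`. This is why the parity of the height of the pointed leaf has to be tracked in the separable case.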
Proof. With arguments similar to the ones used in the first part of the proof of Theorem 6.13, in order to prove Eq. (6.14), it is enough to show that for a fixed leaf-pointed tree T • , and for any fixed h, In order to prove that N T • (n) , B ± = 1 , we use the second moment method. We start by studying the first moment, which is , (−1) ht(ℓn) = 1 .
Using the notation of Section 4.2 and Theorem 4.2, we can rewrite this probability as follows: 1,{0} such that at the h-th ancestor of the distinguished leaf of G is equal to f • h (T • ). Using Theorem 4.2 with Ω = {0}, k = 1, t = h and offspring distribution equal to the one for separable permutations, and the additional result (given by Theorem 4.4) that the parity of the height of ℓ n converges to a fair coin flip, we have By comparing the construction of T 1,h {0} in Section 4.3 and that of T • ∞ , we have Bringing everything together yields We now study the second moment. We have where ℓ n and g n are two uniform random leaves of T n , taken independently conditionally on T n . Again, using the notation of Section 4.2 and Theorem 4.2, we can rewrite this probability as follows: that the parities of the heights of ℓ n and g n converge to two independent fair coin flips, we have where B i ± , for i = 1, 2, are two independent copies of B ± . By construction, in T 2,h {0} the neighbourhoods of the two distinguished vertices (here leaves, since Ω = {0}) are taken independently so that Bringing everything together, Comparing Eqs. (6.17) and (6.18) and using the standard second moment method, we conclude that Indeed by Chebyshev's inequality, one has, for any fixed ε > 0, and the right-hand side tends to zero.
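For reference, the generic form of the second moment argument invoked in this proof can be written as follows. This is only a sketch: the symbols $X_n$ and $c$ are placeholders introduced here (standing for the conditional probability under study and its deterministic limit) and are not notation from the text.

```latex
% Assumption: (X_n) is a sequence of real random variables such that
% E[X_n] -> c and E[X_n^2] -> c^2 as n -> infinity.
% Requires amsmath and amssymb.
\[
  \mathbb{P}\big( |X_n - \mathbb{E}[X_n]| \ge \varepsilon \big)
  \le \frac{\operatorname{Var}(X_n)}{\varepsilon^2}
  = \frac{\mathbb{E}[X_n^2] - \mathbb{E}[X_n]^2}{\varepsilon^2}
  \xrightarrow[n \to \infty]{} \frac{c^2 - c^2}{\varepsilon^2} = 0 .
\]
```

Together with $\mathbb{E}[X_n] \to c$, this gives $X_n \to c$ in probability, which is the type of conclusion drawn from comparing the first and second moments.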

Local limit of uniform permutations in substitution-closed classes
We now prove a quenched Benjamini-Schramm convergence result for uniform random permutations in a proper substitution-closed class C. As we shall see at the end of the section, this implies our second main result (Theorem 1.2).
If the set S of simple permutations in C is non-empty and the criticality condition
As after Theorem 6.13, we want to emphasize the nature of the limiting objects above. The limit $\mathcal{L}(\mathrm{RP}(P^\bullet_\infty))$ (resp. the limit $\mathcal{L}(\mathrm{RP}_{B_\pm}(T^\bullet_\infty))$) is a measure on $\mathcal{S}^\bullet$. Since the limiting object for the quenched Benjamini-Schramm convergence is in general a random measure on $\mathcal{S}^\bullet$, it should be interpreted as a constant random variable, equal to the measure $\mathcal{L}(\mathrm{RP}(P^\bullet_\infty))$ (resp. $\mathcal{L}(\mathrm{RP}_{B_\pm}(T^\bullet_\infty))$).
Proof. We only need to prove the quenched convergence statements, the annealed versions being simple consequences of the quenched ones (see [13, Proposition 2.35]). Moreover, thanks to Theorem 3.2, it is sufficient to prove the statement for a uniform ⊕-indecomposable permutation $\nu_n$. We first consider the case when C is a proper substitution-closed class different from the class of separable permutations. Consider a uniform random leaf $\ell_n$ in $P_n$ and a uniform random element $i_n$ in $\nu_n$. We have the following equality in distribution (recall that RP denotes the extension of the function $(\mathrm{PA} \circ \mathrm{CT})^{-1}$ to rooted permutations):
$$(\nu_n, i_n) \stackrel{d}{=} \mathrm{RP}(P_n, \ell_n). \tag{6.22}$$
We analyse the right-hand side conditionally on $P_n$. By Theorem 6.13, we know that
$$\mathcal{L}\big( (P_n, \ell_n) \,\big|\, P_n \big) \xrightarrow{P} \mathcal{L}(P^\bullet_\infty).$$
Note that the result described in footnote 3 gives convergence in distribution; the limit being a deterministic measure, convergence in probability follows. Comparing with Eq. (6.22), we have that
$$\mathcal{L}\big( (\nu_n, i_n) \,\big|\, \nu_n \big) \xrightarrow{P} \mathcal{L}\big( \mathrm{RP}(P^\bullet_\infty) \big),$$
which is the quenched convergence in Eq. (6.21). It remains to prove the theorem for the class of separable permutations. In this case, we have the following equality in distribution (recall that $\mathrm{RP}_+$ and $\mathrm{RP}_-$ are the maps defined in Eq. (6.12)):
$$(\nu_n, i_n) \stackrel{d}{=} \mathrm{RP}_{\mathrm{sgn}(\ell_n)}(T_n, \ell_n), \tag{6.23}$$
where $T_n$ is a uniform undecorated canonical tree with $n$ leaves, $\ell_n$ is a uniform leaf of $T_n$ and $\mathrm{sgn}(\ell)$ is the sign $(-1)^{\mathrm{ht}(\ell)}$. We analyse the right-hand side conditionally on $T_n$. By Theorem 6.20, we know that
$$\mathcal{L}\big( ((T_n, \ell_n), (-1)^{\mathrm{ht}(\ell_n)}) \,\big|\, T_n \big) \xrightarrow{P} \mathcal{L}\big( (T^\bullet_\infty, B_\pm) \big).$$
It follows that
$$\mathcal{L}\big( \mathrm{RP}_{\mathrm{sgn}(\ell_n)}(T_n, \ell_n) \,\big|\, T_n \big) \xrightarrow{P} \mathcal{L}\big( \mathrm{RP}_{B_\pm}(T^\bullet_\infty) \big).$$
Comparing with Eq. (6.23), we have that
$$\mathcal{L}\big( (\nu_n, i_n) \,\big|\, \nu_n \big) \xrightarrow{P} \mathcal{L}\big( \mathrm{RP}_{B_\pm}(T^\bullet_\infty) \big),$$
which is exactly the quenched convergence statement in Eq. (6.19).
3 The specific result that we need is a generalization of the mapping theorem for random measures: Let $(\mu_n)_{n \in \mathbb{N}}$ be a sequence of random measures on a space $E$ that converges in distribution to a random measure $\mu$ on $E$. Let $F$ be a function from $E$ to a second space $H$ such that the set $D_F$ of discontinuity points of $F$ has measure $\mu(D_F) = 0$ a.s. Then the sequence of pushforward random measures $(\mu_n \circ F^{-1})_{n \in \mathbb{N}}$ converges in distribution to the pushforward random measure $\mu \circ F^{-1}$.
Proof of Theorem 1.2. Under the assumptions of Theorem 1.2, we just proved (Theorem 6.21) that a uniform permutation $\nu_n$ in C converges in the quenched Benjamini-Schramm sense to some deterministic measure $\mathcal{L}(\nu_\infty)$. As recalled in Theorem 6.6 above, the quenched Benjamini-Schramm convergence implies the (joint) convergence of the random variables $\text{c-occ}(\pi, \nu_n)$ to some random variables $\Lambda_\pi$. Additionally, since the quenched Benjamini-Schramm limit is a deterministic measure, the random variables $\Lambda_\pi$ are deterministic as well (see [13, Corollary 2.38]), i.e., they are numbers $\gamma_{\pi,C}$ in $[0, 1]$.
This concludes the proof.
Remark 6.22. Concretely, $\gamma_{\pi,C}$ is the probability that the restriction of the random order $\mathrm{RP}(P^\bullet_\infty)$ (or, in the case of separable permutations, $\mathrm{RP}_{B_\pm}(T^\bullet_\infty)$) to a fixed integer interval of size $|\pi|$ (e.g. $[0, |\pi| - 1]$) is equal to $\pi$ (after the identification between permutations and total orders on intervals given in Section 6.1). Computing this number involves a sum over countably many configurations of $P^\bullet_\infty$ and so it is not immediate, even for simple classes C and short patterns $\pi$.
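On finite permutations, the statistic whose limit is $\gamma_{\pi,C}$ is easy to evaluate, which may help to fix ideas. The following sketch counts consecutive occurrences of a pattern in a permutation given in one-line notation. The function names are hypothetical, and the normalization by the size of the permutation is an assumption made here for illustration.

```python
def standardize(values):
    """Replace distinct values by their ranks 1..k, keeping relative order."""
    ranks = {value: rank for rank, value in enumerate(sorted(values), start=1)}
    return [ranks[value] for value in values]


def consecutive_occurrences(pattern, perm):
    """Number of windows of adjacent entries of `perm` order-isomorphic to `pattern`."""
    k = len(pattern)
    target = list(pattern)
    return sum(
        1
        for start in range(len(perm) - k + 1)
        if standardize(perm[start:start + k]) == target
    )


def consecutive_occurrence_proportion(pattern, perm):
    """Consecutive occurrences divided by the size of `perm` (assumed normalization)."""
    return consecutive_occurrences(pattern, perm) / len(perm)
```

For instance, in the permutation `[1, 3, 2, 5, 4]` the pattern `[1, 3, 2]` has two consecutive occurrences (the windows `1 3 2` and `2 5 4`), so the proportion is `0.4`.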