Asymptotic behavior of some statistics in Ewens random permutations

The purpose of this article is to present a general method to find limiting laws for some renormalized statistics on random permutations. The model considered here is Ewens sampling model, which generalizes uniform random permutations. We describe the asymptotic behavior of a large family of statistics, including the number of occurrences of any given dashed pattern. Our approach is based on the method of moments and relies on the following intuition: two events involving the images of different integers are almost independent.


Background
Permutations are one of the most classical objects in enumerative combinatorics. Several statistics have been widely studied: total number of cycles, number of cycles of a given length, of descents, inversions, exceedances or more recently, of occurrences of a given (generalized) pattern... A classical question in enumerative combinatorics consists in computing the (multivariate) generating series of permutations with respect to some of these statistics.
A probabilistic point of view on the topic raises other questions. Let us consider, for each N, a probability measure µ_N on permutations of size N. The simplest model of random permutations is of course the uniform random permutation (for each N, µ_N is the uniform distribution on the symmetric group S_N). A generalization of this model has been introduced by W.J. Ewens in the context of population dynamics [16]. It is defined by
µ_N({σ}) = θ^{#(σ)} / (θ(θ + 1) ⋯ (θ + N − 1)),   (1.1)
where θ > 0 is a fixed real parameter and #(σ) stands for the number of cycles of the permutation σ. Of course, when θ = 1, we recover the uniform distribution. From now on, we will allow ourselves a small abuse of language and use the expression Ewens random permutation for a random permutation distributed with Ewens measure.
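For small N, the normalization in (1.1) can be checked by exhaustive enumeration: the weights θ^{#(σ)} sum to θ(θ + 1) ⋯ (θ + N − 1), so (1.1) is indeed a probability measure. A small illustrative sketch (function names are ours):

```python
from itertools import permutations
from math import isclose

def cycle_count(perm):
    """Number of cycles of a permutation given as a tuple with perm[i] = sigma(i)."""
    seen, count = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            count += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return count

def rising_factorial(theta, n):
    """theta * (theta + 1) * ... * (theta + n - 1), the denominator of (1.1)."""
    out = 1.0
    for k in range(n):
        out *= theta + k
    return out

# The weights theta^{#(sigma)} sum to the normalization constant of (1.1).
n, theta = 4, 2.5
total = sum(theta ** cycle_count(p) for p in permutations(range(n)))
assert isclose(total, rising_factorial(theta, n))
```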
Having chosen a sequence of probability distributions µ_N on S_N, any statistic on permutations can be interpreted as a sequence of random variables (X_N)_{N≥1}. The natural question is now: what is the asymptotic behavior (possibly after normalization) of (X_N)_{N≥1}?
The purpose of this article is to introduce a new general approach to this family of problems, based on the method of moments.
We then use it to determine the second-order fluctuations of a large family of statistics on permutations: occurrences of dashed patterns (Theorem 1.8).
Random permutations, either with uniform or Ewens distribution, are well-studied objects. Giving a complete list of references is impossible. In Section 1.5, we compare our results with the literature.

Motivating examples
Let us begin by describing a few examples of results, covered by our method.
Number of cycles of a given length p. Let Γ^{(N)}_p be the random variable given by the number of cycles of length p in an Ewens random permutation σ in S_N. The asymptotic distribution of Γ^{(N)}_p has been studied by V.L. Goncharov [17] and V.F. Kolchin [22] in the case of the uniform measure and by G.A. Watterson [30, Theorem 5] in the framework of a general Ewens distribution (see also [1, Theorem 5.1]).
Number of exceedances. Our second example involves the exceedances of σ (the integers i such that σ(i) > i), encoded in a random function F^{(N)}_σ, defined at the points i/N and extended by linearity between the points i/N and (i + 1)/N (for 1 ≤ i ≤ N − 1). In Sections 5.1 and 5.2, we explain why we are interested in this quantity: it is related to a statistical physics model, the symmetric simple exclusion process (SSEP), and to permutation tableaux, some combinatorial objects which have been intensively studied in the last few years.
Although we have no interpretation for this fact, let us note that the limit of F^{(N)}_σ(x) is the cumulative distribution function of a Beta variable with parameters 1 and 2.
With this formulation, Theorem 1.2 is new, but the first part is quite easy while the second is a consequence of [15, Appendix A] (see Section 5). We also refer to an article of A. Barbour and S. Janson [5], where the case of the uniform measure is addressed with another method.
Adjacencies. We consider here only uniform random permutations, that is, the case θ = 1. An adjacency of a permutation σ in S_N is an integer i such that σ(i + 1) = σ(i) ± 1. As above, we introduce the random variable B^{ad,N}_i, which takes value 1 if i is an adjacency and 0 otherwise. Then B^{ad,N}_i is distributed according to a Bernoulli law with parameter 2/N. An easy computation shows that these variables are not independent.
We are interested in the total number of adjacencies in σ, that is, the random variable on S_N defined by A^{(N)} = Σ_{i=1}^{N−1} B^{ad,N}_i.

Theorem 1.3 ([32]). A (N ) converges in distribution towards a Poisson variable with parameter 2.
This result first appeared in papers of J. Wolfowitz and I. Kaplansky [32,21] and was rediscovered recently in the context of genomics (see [33] and also [11,Theorem 10]).
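Theorem 1.3 is easy to probe empirically in the uniform case. The sketch below (our own naming and indexing) estimates E(A^{(N)}) by simulation; the exact value is 2(N − 1)/N, close to the Poisson parameter 2:

```python
import random

def count_adjacencies(sigma):
    """Number of positions i (0-indexed) with sigma(i+1) = sigma(i) +/- 1."""
    return sum(abs(sigma[i + 1] - sigma[i]) == 1 for i in range(len(sigma) - 1))

# Monte Carlo estimate of E[A^(N)] for uniform random permutations.
random.seed(0)
n, trials = 100, 20000
mean = sum(count_adjacencies(random.sample(range(n), n))
           for _ in range(trials)) / trials
# Exact mean is 2(N-1)/N, which tends to the Poisson parameter 2.
assert abs(mean - 2 * (n - 1) / n) < 0.1
```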
In these three examples, the underlying variables behave asymptotically as independent. The main lemma of this paper is a precise statement of this almost independence, that is, an upper bound on joint cumulants. This result allows us to give new proofs of the three results presented above in a uniform way. Besides, our proofs follow the intuition that events involving the images of different integers are almost independent.

The main lemma
From now on, N is a positive integer and σ a random Ewens permutation in S_N. We shall use the standard notation [N] = {1, …, N}. For random variables X_1, …, X_ℓ on the same probability space (with expectation denoted E), we define their joint cumulant
κ(X_1, …, X_ℓ) = [t_1 ⋯ t_ℓ] log E(exp(t_1 X_1 + ⋯ + t_ℓ X_ℓ)).
As usual, [t_1 ⋯ t_ℓ]F stands for the coefficient of t_1 ⋯ t_ℓ in the series expansion of F in positive powers of t_1, …, t_ℓ. Joint cumulants have been introduced by Leonov and Shiryaev [23]. For a summary of their most useful properties, see [20, Proposition 6.16].
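Equivalently, joint cumulants can be computed from joint moments by Möbius inversion over set partitions: κ(X_1, …, X_ℓ) = Σ_π (−1)^{#(π)−1}(#(π) − 1)! ∏_{C∈π} E(∏_{i∈C} X_i). This standard formula can be implemented directly on a finite probability space (an illustrative sketch, with our own helper names); note that mixed cumulants of independent variables vanish, a fact used repeatedly below:

```python
from itertools import product
from math import factorial

def set_partitions(items):
    """All set partitions of a list, as lists of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

def joint_cumulant(values, probs):
    """kappa(X_1, ..., X_l) for variables on a finite probability space:
    values[h][omega] is X_h(omega), probs[omega] the probability of omega."""
    l = len(values)
    def moment(block):
        return sum(p * prod_over(block, w) for w, p in enumerate(probs))
    def prod_over(block, w):
        out = 1.0
        for h in block:
            out *= values[h][w]
        return out
    total = 0.0
    for part in set_partitions(list(range(l))):
        term = (-1) ** (len(part) - 1) * factorial(len(part) - 1)
        for block in part:
            term *= moment(block)
        total += term
    return total

# kappa(X, X) is the variance: for a fair {0,1} coin it equals 1/4.
coin, probs = [0.0, 1.0], [0.5, 0.5]
assert abs(joint_cumulant([coin, coin], probs) - 0.25) < 1e-12
# Mixed cumulants of independent variables vanish.
vals_x = [x for x, y in product(coin, coin)]
vals_y = [y for x, y in product(coin, coin)]
p_ind = [0.25] * 4
assert abs(joint_cumulant([vals_x, vals_y], p_ind)) < 1e-12
assert abs(joint_cumulant([vals_x, vals_x, vals_y], p_ind)) < 1e-12
```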
Our main lemma is a bound on joint cumulants of products of the elementary variables B^{(N)}_{i,s}, where B^{(N)}_{i,s} denotes the indicator function of the event σ(i) = s. To state it, we introduce the following notations. Consider two lists of positive integers of the same length, i = (i_1, …, i_r) and s = (s_1, …, s_r), and define the graphs G_1(i, s) and G_2(i, s) as follows:
• the vertex set of G_1(i, s) is [r], and j and h are linked in G_1(i, s) if and only if i_j = i_h and s_j = s_h;
• the vertex set of G_2(i, s) is also [r], and j and h are linked in G_2(i, s) if and only if {i_j, s_j} ∩ {i_h, s_h} ≠ ∅.
The connected components of a graph G form a set partition of its vertex set that we denote CC(G). Besides, if Π is a set partition, we write #(Π) for its number of parts. In particular, #(CC(G)) is the number of connected components of G. Finally, if π_1 and π_2 are two set partitions of the same set, we denote by π_1 ∨ π_2 the finest partition which is coarser than π_1 and π_2 (here, fine and coarse refer to the refinement order; see Section 1.7).
Theorem 1.4 (main lemma). Fix a positive integer r. There exists a constant C_r, depending on r, such that for any set partition τ = (τ_1, …, τ_ℓ) of [r], any N ≥ 1 and lists i = (i_1, …, i_r) and s = (s_1, …, s_r) of integers in [N], one has:
|κ( ∏_{j∈τ_1} B^{(N)}_{i_j,s_j}, …, ∏_{j∈τ_ℓ} B^{(N)}_{i_j,s_j} )| ≤ C_r N^{1 − #(CC(G_1(i,s))) − #(CC(G_2(i,s)) ∨ τ)}.   (1.3)
Note that the integer #(CC(G_1(i, s))) is the number of different pairs (i_j, s_j). The second quantity involved in the theorem, #(CC(G_2(i, s)) ∨ τ), does not have a similar interpretation. However, it admits an equivalent description. Consider the graph G_2′, obtained from G_2(i, s) by merging vertices corresponding to elements in the same part of τ. Then #(CC(G_2(i, s)) ∨ τ) is the number of connected components of G_2′.
As an example, let us consider the distinct case, that is, we assume that the entries in the lists i and s are distinct. We shall use the standard notation for falling factorials: (x)_a = x(x − 1) ⋯ (x − a + 1). In this case, the expectation of a product of variables B^{(N)}_{i_j,s_j} is θ^c / (N + θ − 1)_a, where a is the number of factors and c the number of cycles of the associated partial permutation (the case θ = 1 is obvious, while the general case is explained in Lemma 2.1). Joint cumulants can be expressed in terms of joint moments (see [20, Proposition 6.16 (vi)]), so the left-hand side of (1.3) can be written as an explicit rational function in N of degree −r. According to our main lemma, the sum has degree at most −ℓ − r + 1, which means that many simplifications are happening (and they are not at all trivial to explain!).
This reflects the fact that the variables B^{(N)}_{i_j,s_j} behave asymptotically as independent (joint cumulants vanish when the set of variables can be split into two mutually independent subsets).

Remark 1.5.
It is worth noticing that our proof of the main lemma goes through a very general criterion for a family of sequences of random variables to have small cumulants: see Lemma 2.2. This may help to find a similar behaviour (that is random variables with small cumulants) in completely different contexts, see Section 1.6.

Applications
Recall that, if Y^{(1)}, …, Y^{(m)} are random variables such that the law of the m-tuple (Y^{(1)}, …, Y^{(m)}) is entirely determined by its joint moments, then the two following statements are equivalent (see [6, Theorem 30.2] for the analogous property in terms of moments).
• For any ℓ and any list i_1, …, i_ℓ in [m], one has lim_{n→∞} κ(X^{(i_1)}_n, …, X^{(i_ℓ)}_n) = κ(Y^{(i_1)}, …, Y^{(i_ℓ)}).
• The sequence of vectors (X^{(1)}_n, …, X^{(m)}_n) converges in distribution towards the vector (Y^{(1)}, …, Y^{(m)}).
Moreover, we get an extension of Theorem 1.3 to any value of the parameter θ.
To give more evidence that our approach is quite general, we study the number of occurrences of dashed patterns. This notion has been introduced in 2000 by E. Babson and E. Steingrímsson [3]. Definition 1.6. A dashed pattern of size p is the data of a permutation τ ∈ S_p and a subset X of [p − 1]. An occurrence of the dashed pattern (τ, X) in a permutation σ ∈ S_N is a list i_1 < ⋯ < i_p such that:
• for any x ∈ X, one has i_{x+1} = i_x + 1;
• the values σ(i_1), …, σ(i_p) are in the same relative order as τ(1), …, τ(p).
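The definition translates directly into a (naive, exponential-time) counting procedure. The sketch below uses our own names and 0-indexing, together with the standard relative-order convention for pattern occurrences:

```python
from itertools import combinations

def occurrences(sigma, tau, X):
    """Count occurrences of the dashed pattern (tau, X) in sigma.
    sigma and tau are 0-indexed lists; X is a set of 0-indexed positions x
    forcing i_{x+1} = i_x + 1; the values sigma[i_1], ..., sigma[i_p]
    must be in the same relative order as tau."""
    p, count = len(tau), 0
    order = sorted(range(p), key=lambda j: tau[j])
    for idx in combinations(range(len(sigma)), p):
        if any(idx[x + 1] != idx[x] + 1 for x in X):
            continue
        vals = [sigma[j] for j in idx]
        if sorted(range(p), key=lambda j: vals[j]) == order:
            count += 1
    return count

# Descents are occurrences of the dashed pattern (21, X = {0}).
sigma = [2, 0, 3, 1]
assert occurrences(sigma, [1, 0], {0}) == 2   # descents at positions 0 and 2
# Inversions: the same pattern with no adjacency constraint.
assert occurrences(sigma, [1, 0], set()) == 3
```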
The number of occurrences of the pattern (τ, X) in σ will be denoted O^{(N)}_{τ,X}(σ). Thanks to our main lemma, we describe the second-order asymptotics of the number of occurrences of any given dashed pattern in a random Ewens permutation. Theorem 1.8. Let (τ, X) be a dashed pattern of size p (see Definition 1.6) and σ_N a sequence of random Ewens permutations. We denote q = |X|. Then O^{(N)}_{τ,X}(σ_N)/N^{p−q}, that is, the renormalized number of occurrences of (τ, X), tends almost surely towards 1/(p!(p−q)!). Besides, one has the following central limit theorem:

Comparison with other methods
There is a huge literature on random permutations. While we will not make a comprehensive survey of the subject, we shall try to present the existing methods and results related to our paper.
Our Poisson convergence results have been obtained previously by the moment method in the articles [21] and [30]. Our cumulant approach is not really different from these proofs. Yet, we have chosen to present these examples for two reasons:
• first, they illustrate the fact that our approach can prove, in a uniform way, convergence towards different distributions;
• second, the combinatorics is simpler in the Poisson cases, so they serve as toy models to explain the general structure of the proofs.
Let us mention also the existence of a powerful method, called the Stein-Chen method, that proves Poisson convergence, together with precise bounds on total variation distances -see, e.g., [4,Chapter 4].
Let us now consider our normal approximation results. For uniform permutations, both are already known or could be obtained easily with methods existing in the literature.
• Theorem 1.2 has been proved by A. Barbour and S. Janson [5], who established a functional version of a combinatorial central limit theorem of Hoeffding [19]. This theorem deals with statistics of the form Σ_{i=1}^{N} a^{(N)}_{i,σ(i)}, where A^{(N)} = (a^{(N)}_{i,j}) is a sequence of deterministic N × N matrices.
• Theorem 1.8 has been proved for some particular patterns using dependency graphs and cumulants: see [9, Theorems 10 and 17] and [18,Section 6]. The case of a general pattern (under uniform distribution) can be handled with the same arguments.
These methods are very different from each other, and none of them can be used to prove both results in a uniform way. Note also that they only work in the uniform case. Yet, going from the uniform model to a general Ewens distribution should be doable using the Chinese restaurant process [1, Example 2.4] (with this coupling, an Ewens random permutation differs from a uniform random permutation in O(2|θ − 1| log(n)) values). To conclude, while less powerful in the Poisson case, our method has the advantage of providing a uniform proof for all these results and of extending directly to a general Ewens distribution.

Future work
In addition to the conjecture above, we mention three directions for further research on the topic.
It would be interesting to describe which permutation statistics can be (asymptotically) studied with our approach. This problem is discussed in Section 6.4.
Another direction consists in refining our convergence results (speed of convergence, large deviations, local limit laws) by following the same guideline.
Finally, it is natural to wonder if the method can be extended to other families of objects. The extension to colored permutations should be straightforward. A promising direction is the following: consider a graph G with vertex set [n] and take a random subset S of its vertices, uniformly among all subsets of size p. If p grows linearly with n, then the events "i lies in S" (for 1 ≤ i ≤ n) have small joint cumulants (this is easy to see with the material of Section 2).

Preliminaries: set partitions
The combinatorics of set partitions is central in the theory of cumulants and will be important in this article. We recall here some well-known facts about them.
A set partition of a set S is a (non-ordered) family of non-empty disjoint subsets of S (called parts of the partition), whose union is S.
Denote by P(S) the set of set partitions of a given set S. Then P(S) may be endowed with a natural partial order: the refinement order. We say that π is finer than π′, or π′ coarser than π (and denote π ≤ π′), if every part of π is included in a part of π′.
Endowed with this order, P(S) is a complete lattice, which means that each family F of set partitions admits a join (the finest set partition which is coarser than all set partitions in F, denoted with ∨) and a meet (the coarsest set partition which is finer than all set partitions in F, denoted with ∧). In particular, there is a maximal element {S} (the partition into only one part) and a minimal element {{x}, x ∈ S} (the partition into singletons).
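The join of two set partitions can be computed with a single union-find pass: merge the elements of every block of both partitions, then read off the classes. An illustrative sketch (names are ours):

```python
def join(pi1, pi2):
    """Finest set partition coarser than both pi1 and pi2 (their join).
    Partitions are given as lists of lists over a common ground set."""
    parent = {}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for part in pi1 + pi2:
        for x in part:
            parent.setdefault(x, x)
    for part in pi1 + pi2:
        for a, b in zip(part, part[1:]):
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[ra] = rb
    blocks = {}
    for x in parent:
        blocks.setdefault(find(x), set()).add(x)
    return sorted(sorted(b) for b in blocks.values())

# {{1,2},{3},{4}} v {{2,3},{1},{4}} = {{1,2,3},{4}}
assert join([[1, 2], [3], [4]], [[2, 3], [1], [4]]) == [[1, 2, 3], [4]]
```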

Outline of the paper
The paper is organized as follows. Section 2 contains the proof of the main lemma. Then, in Section 3, we give two easy lemmas on connected components of graphs, which appear in all our applications. The three last sections are devoted to the different applications: Section 4 for cycles, Section 5 for exceedances and finally, Section 6 for generalized patterns (including adjacencies and dashed patterns).

Proof of the main lemma
This section is devoted to the proof of Theorem 1.4. It is organized as follows.
First, we give a simple formula for the joint moments of the elementary variables (B_{i,s}).
Second, we establish a general criterion based on joint moments that implies that the corresponding variables have small joint cumulants. Third, this criterion is used to prove Theorem 1.4 in the case of distinct indices. Finally, the general case can be deduced from this particular one, as shown in the last part of this section.

Joint moments
The first step of the proof consists in computing the joint moments of the family of random variables (B_{i_1,s_1}, …, B_{i_r,s_r}), in the case where i = (i_1, …, i_r) and s = (s_1, …, s_r) are two lists of distinct indices (some entry of the list i can be equal to an entry of the list s).
We see these two lists as a partial permutation σ̄_{i,s}, which sends i_j to s_j. The notion of cycles of a permutation can be naturally extended to partial permutations: (i_{j_1}, …, i_{j_γ}) is a cycle of the partial permutation if s_{j_1} = i_{j_2}, s_{j_2} = i_{j_3} and so on until s_{j_γ} = i_{j_1}. Note that a partial permutation does not necessarily have cycles. The number of cycles of σ̄_{i,s} is denoted #(σ̄_{i,s}). Suppose that we have a random Ewens permutation σ of size N − 1. Write it as a product of cycles and apply the following random transformation.
• With probability θ/(N + θ − 1), add N as a fixed point. More precisely, σ′ is defined by σ′(N) = N and σ′(k) = σ(k) for k < N.
• For each j ≤ N − 1, with probability 1/(N + θ − 1), add N just before j in its cycle. More precisely, σ′ is defined by σ′(N) = j, σ′(σ^{−1}(j)) = N and σ′(k) = σ(k) for the other values of k.
Then σ′ is a random Ewens permutation of size N. Iterating this, one obtains a linear time and space algorithm to pick a random Ewens permutation.
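The iterated insertion procedure above can be sketched as follows (a Chinese-restaurant-type sampler; indexing choices and function names are ours). As a check, the expected number of cycles under the Ewens measure is Σ_{k=0}^{n−1} θ/(θ + k):

```python
import random

def ewens_permutation(n, theta, rng=random):
    """Sample sigma in S_n from the Ewens measure by iterated insertion:
    element k becomes a fixed point with probability theta/(k - 1 + theta),
    or is inserted just before each j < k with probability 1/(k - 1 + theta)."""
    sigma = {}
    for k in range(1, n + 1):
        u = rng.random() * (k - 1 + theta)
        if u < theta:
            sigma[k] = k                    # new fixed point (new cycle)
        else:
            j = int(u - theta) + 1          # uniform over {1, ..., k-1}
            # insert k just before j: the predecessor of j now maps to k
            pred = next(a for a, b in sigma.items() if b == j)
            sigma[pred] = k
            sigma[k] = j
    return [sigma[i] for i in range(1, n + 1)]

def cycles(perm):
    """Number of cycles of a 1-indexed permutation list."""
    seen, c = set(), 0
    for v in range(1, len(perm) + 1):
        if v not in seen:
            c += 1
            while v not in seen:
                seen.add(v)
                v = perm[v - 1]
    return c

random.seed(1)
n, theta, trials = 20, 2.0, 4000
expected = sum(theta / (theta + k) for k in range(n))
mean = sum(cycles(ewens_permutation(n, theta)) for _ in range(trials)) / trials
assert abs(mean - expected) < 0.15
```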
Let us come back now to the computation of joint moments. The following lemma may be known, but the author has not been able to find it in the literature.
Lemma 2.1. The joint moments of the family (B_{i,s}) are given by
E(B_{i_1,s_1} ⋯ B_{i_r,s_r}) = θ^{#(σ̄_{i,s})} / ((N + θ − 1)(N + θ − 2) ⋯ (N + θ − r)).
Proof. As the Ewens measure is constant on conjugacy classes of S_N, one can assume without loss of generality that i_1 = N − r + 1, i_2 = N − r + 2, …, i_r = N. Then permutations of S_N with σ(i_j) = s_j are obtained in the previous algorithm as follows:
• Choose any permutation in S_{N−r}.
• For 1 ≤ j ≤ r, add i_j in the place given by the following rule: if s_j < i_j, add i_j just before s_j in its cycle. Otherwise, look at σ̄_{i,s}(i_j), σ̄²_{i,s}(i_j) and so on until you find an element smaller than i_j, and place i_j before it. If there is no such element, then i_j is a minimum of a cycle of σ̄_{i,s}. In this case, put it in a new cycle.
It is easy to check, with the description of the construction of a permutation under the Ewens measure, that these choices of places happen with probability θ^{#(σ̄_{i,s})} / ((N + θ − 1)(N + θ − 2) ⋯ (N + θ − r)), which proves the lemma.
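The moment formula can be verified by exhaustive enumeration for small N. The reading of the formula used below (numerator θ^{#(σ̄_{i,s})}, denominator (N + θ − 1) ⋯ (N + θ − r)) is consistent with (1.1) when r = N; the code is an illustrative sketch with our own names:

```python
from itertools import permutations

def ewens_prob(perm, theta):
    """Ewens probability of a 0-indexed permutation tuple."""
    n = len(perm)
    seen, c = set(), 0
    for v in range(n):
        if v not in seen:
            c += 1
            while v not in seen:
                seen.add(v)
                v = perm[v]
    norm = 1.0
    for k in range(n):
        norm *= theta + k
    return theta ** c / norm

def moment(n, theta, i, s):
    """E(prod_j B_{i_j, s_j}) = P(sigma(i_j) = s_j for all j), 0-indexed."""
    return sum(ewens_prob(p, theta) for p in permutations(range(n))
               if all(p[a] == b for a, b in zip(i, s)))

def _cycle(succ, start):
    out, j = [start], succ[start]
    while j != start:
        out.append(j)
        j = succ[j]
    return out

def partial_cycles(i, s):
    """Number of cycles of the partial permutation i_j -> s_j
    (both lists must have distinct entries)."""
    succ = dict(zip(i, s))
    count = 0
    for start in succ:
        j = succ[start]
        while j in succ and j != start:
            j = succ[j]
        if j == start and start == min(_cycle(succ, start)):
            count += 1
    return count

def predicted(n, theta, i, s):
    """theta^{#cycles} / ((N+theta-1)(N+theta-2)...(N+theta-r))."""
    denom = 1.0
    for k in range(1, len(i) + 1):
        denom *= n + theta - k
    return theta ** partial_cycles(i, s) / denom

n, theta = 5, 1.5
# one closed cycle (0 -> 1 -> 0) plus the extra constraint 2 -> 3:
i, s = [0, 1, 2], [1, 0, 3]
assert abs(moment(n, theta, i, s) - predicted(n, theta, i, s)) < 1e-12
```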

A general criterion for small cumulants
Let A^{(N)}_1, …, A^{(N)}_ℓ be sequences of random variables. We introduce the following notation for joint moments and cumulants of subsets of these variables: for a subset Δ ⊆ [ℓ], set M^{(N)}_Δ = E(∏_{h∈Δ} A^{(N)}_h) and κ^{(N)}_Δ = κ((A^{(N)}_h)_{h∈Δ}). We also introduce the auxiliary quantities T^{(N)}_{A,Δ}, implicitly defined by the property: for any Δ ⊆ [ℓ], M^{(N)}_Δ = ∏_{∅≠Δ′⊆Δ} T^{(N)}_{A,Δ′}. Using Möbius inversion on the boolean lattice, we have explicitly: for any subset Δ ⊆ [ℓ],
T^{(N)}_{A,Δ} = ∏_{∅≠Δ′⊆Δ} (M^{(N)}_{Δ′})^{(−1)^{|Δ|−|Δ′|}}.
Lemma 2.2. The two following statements are equivalent:
I. for any Δ ⊆ [ℓ] with |Δ| ≥ 2, one has T^{(N)}_{A,Δ} = 1 + O(N^{−|Δ|+1});
II. for any Δ ⊆ [ℓ] with |Δ| ≥ 2, one has κ^{(N)}_Δ = M^{(N)}_Δ · O(N^{−|Δ|+1}).
Proof. Let us consider the implication I ⇒ II. We set t^{(N)}_Δ = T^{(N)}_{A,Δ} − 1 and assume that t^{(N)}_Δ = O(N^{−|Δ|+1}) for all Δ with |Δ| ≥ 2. We prove the case Δ = [ℓ] of II, but the same proof works for any Δ ⊆ [ℓ].
Recall the well-known relation between joint moments and cumulants [20, Proposition 6.16 (vi)]:
κ^{(N)}_{[ℓ]} = Σ_{π∈P([ℓ])} µ(π) ∏_{C∈π} M^{(N)}_C, where µ(π) = (−1)^{#(π)−1} (#(π) − 1)!.   (2.3)
But joint moments can be expressed in terms of the quantities t^{(N)}_Δ = T^{(N)}_{A,Δ} − 1:
M^{(N)}_C = ( ∏_{h∈C} T^{(N)}_{A,{h}} ) Σ t^{(N)}_{Δ_1} ⋯ t^{(N)}_{Δ_m},   (2.4)
where the sum runs over all finite lists of distinct (but not necessarily disjoint) subsets of C of size at least 2 (in particular, the length m of the list is not fixed). When we multiply this over all blocks C of a set partition π, we obtain the sum of t^{(N)}_{Δ_1} ⋯ t^{(N)}_{Δ_m} over all lists of distinct subsets of [ℓ] of size at least 2 such that each Δ_i is contained in a block of π. In other terms, for each i ∈ [m], π must be coarser than the partition Π(Δ_i), which, by definition, has Δ_i and singletons as blocks. Finally,
κ^{(N)}_{[ℓ]} = ( ∏_{h∈[ℓ]} T^{(N)}_{A,{h}} ) Σ_{Δ_1,…,Δ_m} t^{(N)}_{Δ_1} ⋯ t^{(N)}_{Δ_m} ( Σ_π µ(π) ).
The condition on π can be rewritten as π ≥ Π(Δ_1) ∨ ⋯ ∨ Π(Δ_m). Hence, by definition of the Möbius function, the sum in the parenthesis is equal to 0, unless Π(Δ_1) ∨ ⋯ ∨ Π(Δ_m) is the maximal partition {[ℓ]}. On the other hand, one has t^{(N)}_{Δ_1} ⋯ t^{(N)}_{Δ_m} = O(N^{−Σ_i(|Δ_i|−1)}), and the condition Π(Δ_1) ∨ ⋯ ∨ Π(Δ_m) = {[ℓ]} forces Σ_i (|Δ_i| − 1) ≥ ℓ − 1. Hence only summands of order of magnitude N^{−ℓ+1} or less survive, and one has
κ^{(N)}_{[ℓ]} = M^{(N)}_{[ℓ]} · O(N^{−ℓ+1}),   (2.5)
which is exactly what we wanted to prove.
Let us now consider the converse statement. We proceed by induction on ℓ: we assume that, for all ℓ′ smaller than a given ℓ ≥ 2, the theorem holds, and we consider some sequences of random variables A^{(N)}_1, …, A^{(N)}_ℓ fulfilling condition II. Note that an immediate induction shows that the joint moments fulfill the corresponding estimates. It remains to prove the estimate on T^{(N)}_{A,[ℓ]}. Thanks to the estimate above for joint moments, this can be reduced to the following: choose a family B of sequences of random variables such that T^{(N)}_{B,Δ} = T^{(N)}_{A,Δ} holds for every proper subset Δ ⊊ [ℓ], and such that Equation (2.5) is fulfilled when A is replaced by B (the reader may wonder whether such a family B exists; let us temporarily ignore this problem, which will be addressed in Remark 2.3). By definition, the family B of sequences of random variables fulfills condition I of the theorem and hence, using the first part of the proof, also has property II. As A and B have the same joint moments, except possibly for M^{(N)}_{[ℓ]}, the family B fulfills Equation (2.5) and, hence, so does family A. Remark 2.3. Let ℓ be a fixed integer and I a finite subset of (N_{>0})^ℓ. Then, for any list (m_i)_{i∈I} of numbers, one can find some complex-valued random variables X_1, …, X_ℓ with prescribed joint moments E(X_1^{i_1} ⋯ X_ℓ^{i_ℓ}) = m_i for i ∈ I. Indeed, these joint moments correspond to different power sums of auxiliary variables z_1, …, z_T. Thus we have to find a set {z_1, …, z_T} of complex numbers with specified power sums up to some degree d_j. This exists as soon as T ≥ d_j, because C is algebraically closed. In particular, the family B considered in the proof above exists.
However, we do not really need this family to exist. Indeed, during the whole proof, we only do manipulations on the sequences of moments and cumulants, using the relations between them (Equation (2.3)). We never consider the underlying random variables. Therefore, everything could be done even if the random variables did not exist, as is often done in umbral calculus [27].

Case of distinct indices
Recall that, in the statement of Theorem 1.4, we fix a set partition τ and two lists i and s, and we want to bound the quantity κ(∏_{j∈τ_1} B^{(N)}_{i_j,s_j}, …, ∏_{j∈τ_ℓ} B^{(N)}_{i_j,s_j}). We first consider the case where all entries in the sequences i and s are distinct. To be in the situation of Lemma 2.2, we set, for h ∈ [ℓ] and N ≥ 1:
A^{(N)}_h = (N + θ − 1)_{a_h} ∏_{j∈τ_h} B^{(N)}_{i_j,s_j}, where a_h = |τ_h|.
The normalization factor has been chosen so that E(A^{(N)}_h) remains bounded. Therefore, we have to prove that the quantity
Q_{a_1,…,a_ℓ} := ∏_{δ⊆[ℓ]} ((N + θ − 1)_{a_δ})^{(−1)^{ℓ−|δ|}}, where a_δ = Σ_{h∈δ} a_h,
is 1 + O(N^{−ℓ+1}). We proceed by induction over a_ℓ. If a_ℓ = 0, for any δ ⊆ [ℓ − 1], the factors corresponding to δ and δ ∪ {ℓ} cancel each other. Thus Q_{a_1,…,a_{ℓ−1},0} = 1 and the statement holds.
If a_ℓ > 0, the quantity Q_{a_1,…,a_ℓ} can be written in terms of Q_{a_1,…,a_ℓ−1} and of a product of the form handled by Lemma 2.4 below. Using the terminology of Lemma 2.2, this means that the family (A^{(N)}_h)_{h∈[ℓ]} fulfills condition I; by Lemma 2.2, its cumulants satisfy condition II, which is Theorem 1.4 in the case of distinct indices.
Here is the technical lemma that we left behind in the proof.

Lemma 2.4.
For any positive integers a_1, …, a_{ℓ−1}, one has
∏_{δ⊆[ℓ−1]} (X − Σ_{j∈δ} a_j)^{(−1)^{ℓ−1−|δ|}} = 1 + O(X^{−ℓ+1}),
when X is a positive number going to infinity.
Proof. Define R_ev (resp. R_odd) as the product of the factors (X − Σ_{j∈δ} a_j), where the product runs over subsets δ of [ℓ − 1] of even (resp. odd) size. Expanding the product, one gets a sum of terms of the form (−1)^m a_{j_1} ⋯ a_{j_m} X^{2^{ℓ−2}−m}; the index set of the second summation symbol is the set of lists of m distinct (but not necessarily disjoint) subsets of [ℓ − 1] of even size. Of course, a similar formula with subsets of odd size holds for R_odd. Let us fix an integer m < ℓ − 1 and a list j_1, …, j_m. Denote by j_0 the smallest integer in [ℓ − 1] different from j_1, …, j_m (as m < ℓ − 1, such an integer necessarily exists). Then one has a bijection δ ↦ δ ∇ {j_0}, where ∇ is the symmetric difference operator. This bijection implies that the summand (−1)^m a_{j_1} ⋯ a_{j_m} X^{2^{ℓ−2}−m} appears as many times in R_ev as in R_odd. Finally, all terms corresponding to values of m smaller than ℓ − 1 cancel in the difference R_ev − R_odd, and one has R_ev − R_odd = O(X^{2^{ℓ−2}−ℓ+1}). Remark 2.5. Thanks to a result of Leonov and Shiryaev that expresses cumulants of products of random variables as products of cumulants (see [23] or [28, Theorem 4.4]), it would have been enough to prove our result for a_1 = ⋯ = a_ℓ = 1. But, as our proof uses an induction on a_ℓ, we have not made this choice. Remark 2.6. We would like to point out that our result is closely related to a result of P. Śniady. Indeed, thanks to our multiplicative criterion to have small cumulants, the computation in this section is equivalent to Lemma 4.8 of the paper [28]. However, Śniady's proof relies on a nontrivial theory of cumulants of observables of Young diagrams. Therefore, it seems to us that it is worth giving an alternative argument.

General case
Let A^{(N)}_1, …, A^{(N)}_ℓ be some sequences of random variables. We introduce some truncated cumulants: if π_0, π_1, π_2, and so on, are set partitions of [ℓ], we set k^{(N)} to be the sum defining the cumulant in (2.3), restricted to the partitions π compatible with π_0, π_1, π_2, and so on. In the context of Lemma 2.2, it is also possible to bound these truncated cumulants. Proof. For the first statement, the proof is similar to that of I ⇒ II in Lemma 2.2.
One can write an analogue of equation (2.4). The same argument as above shows that only the terms corresponding to suitable lists survive, and the first item of the lemma follows from the hypothesis. For the second statement, we use the inclusion/exclusion principle; the second item then follows from the first.
Let us come back to the proof of Theorem 1.4. We fix two lists i and s of length r, as well as a set partition τ of [r]. We want to find a bound for the corresponding cumulant. We split the sum according to the values of the partitions π_1 = π ∧ CC(G_1(i, s)) and π_2 = π ∧ CC(G_2(i, s)). More precisely, we call the set of summation indices π with given values of π_1 and π_2 the slice determined by π_1 and π_2.
Let us fix some partitions π_1 and π_2. For each block C of π_1, we consider some sequence of random variables A^{(N)}_C. (2.6) By the same argument as in Section 2.3, this family has the quasi-factorization property and, hence, its cumulants and truncated cumulants are small (Lemma 2.2).
But, if π is in the slice determined by π_1 and π_2, one can check easily (see the description of joint moments in Section 2.1) that the corresponding product of moments involves a coefficient α_{π_1,π_2}, which depends only on π_1 and π_2 and is given by:
• 0 if π_2 contains in the same block two indices j and h such that i_j = i_h but s_j ≠ s_h, or s_j = s_h but i_j ≠ i_h;
• θ^γ otherwise, where γ is the number of cycles of the partial permutation σ̄_{i,s} whose indices are all contained in the same block of π_2.
As a consequence, the contribution of each slice can be controlled. The condition π ∧ CC(G_1(i, s)) = π_1 can be rewritten as follows: π ≥ π_1 and π ≱ π′ for any π′ with π_1 < π′ ≤ CC(G_1(i, s)). A similar rewriting can be performed for the condition π ∧ CC(G_2(i, s)) = π_2. Finally, the sum in equation (2.7) above is a truncated cumulant of the family (2.6) and is bounded from above by O(N^{−#(CC(G_2(i,s)) ∨ τ)+1}). This implies the bound of Theorem 1.4, because π_1 necessarily has at least as many parts as CC(G_1(i, s)).

Graph-theoretical lemmas
In this section, we present two quite easy lemmas on the number of connected components of graph quotients. These lemmas may already have appeared in the literature, though the author has not been able to find a reference. They will be useful in the next sections for applications of Theorem 1.4.

Notations
Let us consider a graph G with vertex set V and edge set E. By definition, if V′ is a subset of V, the graph G[V′] induced by G on V′ has vertex set V′ and edge set E[V′], where E[V′] is the subset of E consisting of edges having both their extremities in V′.
Let f be a surjective map from V to another set W. Then the quotient of G by f is the graph G/f with vertex set W, which has an edge between w and w′ if, in G, there is at least one edge between a vertex of f^{−1}(w) and a vertex of f^{−1}(w′).
Lemma 3.1. With the notation above,
#(CC(G)) ≤ #(CC(G/f)) + Σ_{w∈W} ( #(CC(G[f^{−1}(w)])) − 1 ).
Proof. For each edge (w, w′) in G/f, we choose arbitrarily an edge (v, v′) in G such that f(v) = w and f(v′) = w′ (by definition of G/f, such an edge exists but is not necessarily unique). Thereby, to each edge of G/f or of G[f^{−1}(w)] (for any w in W) corresponds canonically an edge in G.

Connected components of quotients
Take spanning forests F_{G/f} and (F_w)_{w∈W} of the graphs G/f and G[f^{−1}(w)] for w ∈ W. With the remark above, to each spanning forest corresponds a set of edges in G. Consider the union F of these sets. It is an acyclic set of edges of G. Indeed, if F contained a cycle, this cycle would have to be contained in one of the fibers f^{−1}(w), otherwise it would induce a cycle in F_{G/f}. But, in this case, all edges of the cycle would belong to F_w, which is impossible, since F_w is a forest.
Finally, F is an acyclic set of edges in G, and counting its edges yields the inequality of the lemma. Continuing the example. All fibers f^{−1}(i) (for i = 1, 2, 3, 4, 5) are of size 2. Three of them contain one edge (for i = 3, 4, 5) and hence are connected, while the other two have two connected components. Finally, the sum in the lemma is equal to 2, which is equal to the difference #(CC(G)) − #(CC(G/f)) = 4 − 2 = 2.
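Reading the lemma as the bound #(CC(G)) ≤ #(CC(G/f)) + Σ_w (#(CC(G[f^{−1}(w)])) − 1), a small example can be checked mechanically (an illustrative sketch, our own naming; here the inequality is strict):

```python
def components(n, edges):
    """Connected components of a graph on vertices 0..n-1."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
    comps = {}
    for v in range(n):
        comps.setdefault(find(v), set()).add(v)
    return list(comps.values())

def quotient_edges(edges, f):
    """Edge set of G/f (loops discarded)."""
    return {(min(f[a], f[b]), max(f[a], f[b])) for a, b in edges if f[a] != f[b]}

# G: 6 vertices, f maps v to v // 2 (three fibers of size 2).
edges = [(0, 2), (1, 3), (4, 5)]
f = [0, 0, 1, 1, 2, 2]
n, m = 6, 3
cc_G = len(components(n, edges))
cc_Q = len(components(m, list(quotient_edges(edges, f))))
fiber_terms = 0
for w in range(m):
    fiber = [v for v in range(n) if f[v] == w]
    relabel = {v: k for k, v in enumerate(fiber)}
    fiber_edges = [(relabel[a], relabel[b]) for a, b in edges
                   if a in relabel and b in relabel]
    fiber_terms += len(components(len(fiber), fiber_edges)) - 1
# #CC(G) <= #CC(G/f) + sum_w (#CC(G[f^{-1}(w)]) - 1)
assert cc_G <= cc_Q + fiber_terms
```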

Fibers of size 2
In this section, we further assume that V = W ⊔ W̄ is the disjoint union of two copies of W, and that f is the canonical map W ⊔ W̄ → W consisting in forgetting to which copy of W an element belongs. Throughout the paper, for simplicity of notation, we will use overlined letters for elements of the second copy W̄ of W.
In this context, in addition to the quotient G/f, one can consider another graph with vertex set W. By definition, G//f has an edge between w and w′ if, in G, there is both an edge between w and w′ and an edge between w̄ and w̄′. We call this graph the strong quotient of G.
Continuing the example. The graph G and the function f in the example above fit in the context described in this section. The strong quotient G//f is drawn on Figure 1 (bottom right picture).

Lemma 3.2.
Let G and f be as above. Then
#(CC(G)) ≤ #(CC(G/f)) + #(CC(G//f)).
Proof. Write G_1 = G//f and G_2 = G/f. By definition, an edge in G_1 between j and k corresponds to two edges in G. In contrast, an edge (j, k) in G_2 corresponds to at least one edge in G.
Consider a spanning forest F_1 in G_1. As the edge set of G_1 is contained in that of G_2, F_1 can be completed into a spanning forest F_2 of G_2. We consider the subset F of edges of G obtained as follows: for each edge of F_1, we take the two corresponding edges in G, and for each edge of F_2 \ F_1, we take the corresponding edge in G (if there are several corresponding edges, choose one arbitrarily).
We will prove by contradiction that F is acyclic. Suppose that F contains a cycle C. Each edge of C projects to an edge in F_2, and thus the projection of C is a list S = (e_1, …, e_h) of consecutive edges in F_2 (consecutive means that we can orient the edges so that, for each m ∈ [h], the end point of e_m is the starting point of e_{m+1}, with the convention e_{h+1} = e_1). This list is not necessarily a cycle because it can contain the same edge twice (either in the same direction or in different directions). Indeed, F contains some pairs of edges of the form {w, w′}, {w̄, w̄′}, which project to the same edge in G_2. But as edges from these pairs have no extremities in common, they cannot appear consecutively in the cycle C. Therefore, the same edge cannot appear twice in a row in the list S. This implies that the list S contains a cycle C_2 as a factor. We have reached a contradiction, as the edges in C_2 are edges of the forest F_2. Thus F is acyclic.

Toy example: number of cycles of a given length p
In this section, we are interested in the number Γ^{(N)}_p of cycles of length p in a random Ewens permutation of size N. The asymptotic behavior of Γ^{(N)}_p is easy to determine (see Theorem 1.1), as its generating series is explicit and quite simple. We will give another proof, which relies on Theorem 1.4.
Step 1: expand the cumulants of the considered statistic. In this step, one has to express the statistic we are interested in using the variables B^{(N)}_{i,s}: the product B^{(N)}_{i_1,i_2} B^{(N)}_{i_2,i_3} ⋯ B^{(N)}_{i_p,i_1} is the indicator function of the event "(i_1, …, i_p) is a cycle of σ". Therefore, one has an expansion (4.1) of the cumulants of Γ^{(N)}_p as a sum over families of lists.
Step 2: give an upper bound for the elementary cumulants. Now, we would like to apply our main lemma to every summand of equation (4.1). For a family of lists i, denote:
• M(i) the exponent appearing in the bound given by Theorem 1.4;
• t(i) the number of distinct entries.
Clearly, M(i) is always at least equal to t(i). In the case where τ has blocks of size p and where, in each block, the list s is obtained by a cyclic rotation of the list i, Theorem 1.4 gives the bound (4.2).
Step 3: give an upper bound for the number of lists.
As the number of summands in Equation (4.1) depends on N, we cannot use inequality (4.2) directly. We need a bound on the number of families of lists i with a given value of t(i). Lemma 4.1. The number of families of lists with entries in [N], of fixed total length L, taking exactly t distinct values, is bounded from above by C_L N^t for some constant C_L. Proof. If we specify which indices correspond to entries with the same values (that is, a set partition in t blocks of the set of indices), the number of corresponding lists is (N)_t = N(N − 1) ⋯ (N − t + 1), and hence is bounded from above by N^t. This implies the lemma, with C_L being equal to the number of set partitions of [L].
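The counting argument can be checked by brute force for small parameters: the number of length-L words over [N] with exactly t distinct letters is S(L, t) · N(N − 1) ⋯ (N − t + 1) ≤ B_L · N^t, where S and B denote Stirling and Bell numbers (a standard refinement, not stated in the text). A small sketch with our own names:

```python
from itertools import product

def count_words_with_t_values(N, L, t):
    """Brute-force count of words of length L over alphabet [N]
    with exactly t distinct letters."""
    return sum(1 for w in product(range(N), repeat=L) if len(set(w)) == t)

def bell(L):
    """Bell number B_L (number of set partitions of [L]), via the Bell triangle."""
    row = [1]
    for _ in range(L - 1):
        new = [row[-1]]
        for x in row:
            new.append(new[-1] + x)
        row = new
    return row[-1]

N, L = 5, 4
for t in range(1, L + 1):
    # Lemma 4.1: at most (number of set partitions of [L]) * N^t lists
    assert count_words_with_t_values(N, L, t) <= bell(L) * N ** t

# Spot checks of the exact Stirling-type counts:
assert count_words_with_t_values(5, 4, 4) == 5 * 4 * 3 * 2       # all distinct
assert count_words_with_t_values(5, 4, 2) == 140                 # S(4,2)*(5)_2 = 7*20
```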
By inequality (4.2) and Lemma 4.1, for each t ∈ [pℓ], the contribution of the families of lists (i^r_j) taking exactly t different values is bounded from above by a constant, and hence, for all ℓ ≥ 1, κ_ℓ(Γ^{(N)}_p) = O(1). To compute the component of order 1, let us make the following remark: by the argument above, the total contribution of families of lists is negligible, unless any two words (i^{r_1}_1, …, i^{r_1}_p) and (i^{r_2}_1, …, i^{r_2}_p) sharing an entry are equal up to a cyclic rotation. As i^r_1 is always the minimum of the i^r_j, the two words are in fact always equal in this case. In particular, G(i) is a disjoint union of cliques. If we further assume Q(i) = 1, i.e. G(i) is connected, then G(i) is the complete graph and we get that i^r_j does not depend on r. In the limit, the vector (Γ^{(N)}_1, …, Γ^{(N)}_p) tends in distribution towards a vector (P_1, …, P_p), where the P_i are independent Poisson-distributed random variables with respective parameters θ/i.

Number of exceedances
In this section, we look at our second motivating problem, the number of exceedances in random Ewens permutations. The first two subsections make the link between a model from statistical physics and this problem, justifying our work. The last two subsections are devoted to the proof of Theorem 1.2 and related results.

Symmetric simple exclusion process
The symmetric simple exclusion process (SSEP for short) is a model of statistical physics: we consider particles on a discrete line with N sites. No two particles can be in the same site at the same moment. The system evolves as follows:
• if its neighboring site is empty, a particle can jump to its left or its right with probability 1/(N + 1);
• if the left-most site is empty (resp. occupied), a particle can enter (resp. leave) from the left with probability α/(N + 1) (resp. γ/(N + 1));
• if the right-most site is empty (resp. occupied), a particle can enter (resp. leave) from the right with probability δ/(N + 1) (resp. β/(N + 1));
• with the remaining probability (we suppose α, β, γ, δ < 1 so that, in a given state, the sum of the probabilities of the events which may occur is smaller than 1), nothing happens.
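The dynamics above are easy to simulate. The following sketch is our own illustration (not from the paper): each step draws one of N + 1 possible "clocks" uniformly (the N − 1 internal bonds and the two boundaries), which reproduces the probabilities 1/(N + 1), α/(N + 1), etc., from the description above.

```python
import random

def ssep_step(state, alpha, beta, gamma, delta, rng=random):
    # One transition of the SSEP chain; state is a list of 0/1 of length N.
    N = len(state)
    state = state[:]
    k = rng.randrange(N + 1)      # one of N + 1 clocks, each with prob 1/(N+1)
    if 1 <= k <= N - 1:
        # Bulk bond (k-1, k): exchanging the two sites realizes a jump
        # when exactly one of them is occupied, and does nothing otherwise.
        state[k - 1], state[k] = state[k], state[k - 1]
    elif k == 0:                  # left boundary
        if state[0] == 0 and rng.random() < alpha:
            state[0] = 1          # a particle enters with prob alpha/(N+1)
        elif state[0] == 1 and rng.random() < gamma:
            state[0] = 0          # a particle leaves with prob gamma/(N+1)
    else:                         # right boundary (k == N)
        if state[-1] == 1 and rng.random() < beta:
            state[-1] = 0         # a particle leaves with prob beta/(N+1)
        elif state[-1] == 0 and rng.random() < delta:
            state[-1] = 1         # a particle enters with prob delta/(N+1)
    return state
```

Iterating ssep_step from any initial configuration and averaging over a long trajectory gives a numerical approximation of the steady state considered below.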
Mathematically, this defines an irreducible aperiodic Markov chain on the finite set {0; 1}^N of particle configurations; let µ_N denote its stationary distribution. This model is quite popular among physicists because, despite its simplicity, it exhibits interesting phenomena, like the existence of different phases. For a comprehensive introduction to the subject and a survey of results, see [14].
A good way to describe a state τ of the SSEP is its associated function F^{(N)}_τ. More precisely, we want to study asymptotically the properties of the random function F^{(N)}_τ, where τ is distributed with µ_N and N tends to infinity.

Link with permutation tableaux and Ewens measure
From now on, we restrict to the case α = 1 and γ = δ = 0. In this case, thanks to a result of S. Corteel and L. Williams [13], the measure µ_N is related to some combinatorial objects, called permutation tableaux.
The latter are fillings of Young diagrams (which can have empty rows, but no empty columns) with 0 and 1 respecting some rules, the details of which will not be important here. The Young diagram is called the shape of the permutation tableau. The size of a permutation tableau is its number of rows, plus its number of columns (and not the number of boxes!).
In addition to their link with statistical physics, permutation tableaux also appear in algebraic geometry: they index the cells of some canonical decomposition of the totally positive part of the Grassmannian [26, 31]. They have also been widely studied from a purely combinatorial point of view [29, 12, 2].
To a permutation tableau T of size N + 1, one can associate a word w^T in {0; 1}^N as follows: we label the steps of the border of the tableau, starting from the North-East corner and going to the South-West corner. The first step is always a South step. For the other steps, we set w^T_i = 1 if and only if the (i + 1)-th step is a South step. Clearly, the word w^T depends only on the shape of the tableau T. This procedure is illustrated in Figure 2.
With this definition, the border of a tableau T of size N + 1 is a broken line; denote by F^{(N)}_{w^T} the function associated to the word w^T, as defined in the previous section. Hence, F^{(N)}_{w^T} is a good way to encode the shape of the permutation tableau T.
S. Corteel and L. Williams also introduced a statistic on permutation tableaux, called the number of unrestricted rows and denoted u(T). If β is a positive real parameter, this statistic induces a measure µ^T_N(β) on permutation tableaux of size N, for which the probability to pick a tableau T is proportional to β^{−u(T)}. This measure is related to the SSEP by the following result (which is in fact a particular case of [13, Theorem 3.1], but we do not know how to deal with the extra parameters there).
Theorem 5.1. [13] The steady state of the SSEP µ_N is the push-forward, by the application T → w^T, of the probability measure µ^T_{N+1}(β).
• The number of unrestricted rows of a tableau T = Φ(σ) is the number of right-to-left minima of σ: recall that i is a right-to-left minimum of σ if σ(j) > i for any j > σ^{−1}(i).
We are interested in the number of cycles of permutations rather than in their number of right-to-left minima. The following bijection, which is a variant of the first fundamental transformation on permutations [24, § 10.2], sends one of these statistics to the other.
Take a permutation σ, written in its cycle notation so that:
• its cycles end with their minima;
• the minima of the cycles are in increasing order.
Erasing the parentheses then yields the one-line notation of a permutation, which we denote Ψ(σ). The application Ψ is a bijection from S_N to S_N. Besides, the minima of the cycles of σ are the right-to-left minima of Ψ(σ), while the ascents in Ψ(σ) correspond to the exceedances in σ (a similar statement is given in [24, Theorem 10.2.3]).
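The bijection Ψ is easy to implement. The sketch below is our own code (not from the paper): it constructs Ψ(σ) and lets one check by brute force on S_6 that Ψ is a bijection and that the number of cycles of σ equals the number of right-to-left minima of Ψ(σ).

```python
from itertools import permutations

def psi(sigma):
    # sigma: permutation of {1,...,N} as a list, sigma[i-1] = sigma(i).
    # Write each cycle so that it ends with its minimum, order the cycles
    # by increasing minima, then erase the parentheses.
    N = len(sigma)
    seen, word = [False] * N, []
    for m in range(1, N + 1):        # cycle minima appear in increasing order
        if not seen[m - 1]:
            j = sigma[m - 1]
            while j != m:            # traverse the cycle starting after m ...
                seen[j - 1] = True
                word.append(j)
                j = sigma[j - 1]
            seen[m - 1] = True
            word.append(m)           # ... so that it ends with its minimum m
    return word

def num_cycles(sigma):
    seen, c = [False] * len(sigma), 0
    for i in range(1, len(sigma) + 1):
        if not seen[i - 1]:
            c, j = c + 1, i
            while not seen[j - 1]:
                seen[j - 1] = True
                j = sigma[j - 1]
    return c

def num_rtl_minima(word):
    count, cur_min = 0, float("inf")
    for x in reversed(word):
        if x < cur_min:
            count, cur_min = count + 1, x
    return count
```

For example, the 3-cycle σ = 231 satisfies Ψ(σ) = 231, which has exactly one right-to-left minimum, matching the single cycle of σ.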
From now on, we assume β · θ = 1. The properties above imply that µ^T_N(β) is the push-forward of the Ewens measure with parameter θ by the application Φ ∘ Ψ. Combining this with Theorem 5.1, the steady state of the SSEP µ_N is the push-forward of the Ewens measure by the application σ → w^{Φ(Ψ(σ))}. But this application admits an easy direct description. Recall that, as explained above, we are interested in the random function F^{(N)}_τ, where τ is distributed according to the measure µ_N. The results above imply that this random function has the same distribution as F^{(N+1)}_σ, where σ is a random Ewens permutation of size N + 1 and F^{(N+1)}_σ is the function defined in Section 1.2.
This was our original motivation to study F^{(N+1)}_σ.

Bounds for cumulants
Let us fix some real numbers x_1, …, x_ℓ in [0; 1]. In this section, we will give some bounds on the joint cumulants of the random variables F^{(N)}_σ(x_1), …, F^{(N)}_σ(x_ℓ). Let us begin with the following bound (step 2 of the proof, according to the division done in Section 4). Given two lists i and s, consider the graph G(i, s) with vertex set [ℓ] ⊔ [ℓ̄] defined as follows:
• there is an edge between j and k (resp. j and k̄, j̄ and k̄) if and only if i_j = i_k (resp. i_j = s_k, s_j = s_k).
The inequality above is simply Lemma 3.2 applied to the graph G(i, s) (G_1(i, s) and G_2(i, s) are respectively its strong and usual quotients).
We can now prove the following bound (Proposition 5.3): there exists a constant C_ℓ such that, for any integer N ≥ 1 and real numbers x_1, …, x_ℓ in [0; 1], one has |κ(F^{(N)}_σ(x_1), …, F^{(N)}_σ(x_ℓ))| ≤ C_ℓ N^{1−ℓ}.

Proof. To simplify the notations, we suppose that N x_1, …, N x_ℓ are integers, so that the cumulant expands as a sum over pairs of lists (i, s), as in equation (5.1). We apply Lemma 4.1 to the list i_1, …, i_ℓ, s_1, …, s_ℓ and get that the number of pairs of lists (i, s) such that |{i_1, …, i_ℓ, s_1, …, s_ℓ}| is equal to a given number t is bounded from above by C_{2ℓ} N^t (step 3).
Combining this with Proposition 5.2, we get that the total contribution of the pairs of lists (i, s) with |{i_1, …, i_ℓ, s_1, …, s_ℓ}| = t to the right-hand side of (5.1) is smaller than C_{2ℓ} C^ℓ N^{1−ℓ}, which ends the proof of Proposition 5.3 (step 4).
Illustration of the proof. Set ℓ = 5 and consider the lists i = (5, 2, 2, 7, 7) and s = (8, 8, 2, 7, 7). The graph G(i, s) associated to this pair of lists is the graph G drawn in Figure 1. It follows immediately that G_1(i, s) = G//f has 4 connected components, while G_2(i, s) = G/f has 2. Theorem 1.4 therefore gives a bound on the corresponding elementary cumulant. The same bound is valid for all sequences i and s such that G(i, s) = G. There are fewer than N^4 such sequences: to construct such a sequence, one has to choose distinct values for the four connected components of G, so that they fulfill some inequalities. Finally, their total contribution to (5.1) is smaller than C^5 N^{−1}. A bound of this kind on the cumulants had already been obtained in the SSEP literature; in fact, their result is more general, because it corresponds to the SSEP with all parameters. This bound on cumulants can be obtained easily using our Proposition 5.2 and Lemma 4.1. A slight generalization of it (taking into account the case where some i's can be equal) implies directly Proposition 5.3. Therefore, our method does not give new results on the SSEP. Nevertheless, it was natural to try to understand the long-range correlation phenomenon directly in terms of random permutations, and that is what our approach does.

Convergence results
In this section, we explain how one can deduce, from the bounds on cumulants, some results on the convergence of the random function F^{(N)}_σ, in particular Theorem 1.2. In addition to the bounds above, we need equivalents for the first and second joint cumulants of the F^{(N)}_σ(x); an easy computation gives them, the second one being expressed as a double integral of min(t, u)(1 − max(t, u)) dt du.
We call K(x, y) the right-hand side of the second equation. We begin with a proof of Theorem 1.2, which describes the asymptotic behavior of F^{(N)}_σ(x) for a fixed x. Consider first the fourth central moment of F^{(N)}_σ(x). By Proposition 5.3, this quantity is bounded from above by O(N^{−2}) and, in particular, the series Σ_{N≥1} E[(F^{(N)}_σ(x) − E[F^{(N)}_σ(x)])^4] converges. The end of the proof is classical. First, we invert the summation and expectation symbols (all quantities are nonnegative). As its expectation is finite, the random variable Σ_{N≥1} (F^{(N)}_σ(x) − E[F^{(N)}_σ(x)])^4 is almost surely finite, which implies that F^{(N)}_σ(x) − E[F^{(N)}_σ(x)] tends to 0 almost surely. For the fluctuations, the joint cumulants of order r of the renormalized variables Z^{(N)}_σ(x) of Theorem 1.2 are controlled by Proposition 5.3; in particular, for r > 2, the left-hand side tends to 0. As the variables Z^{(N)}_σ(x_r) have cumulants converging to those of a Gaussian law, the vector (Z^{(N)}_σ(x_1), …, Z^{(N)}_σ(x_r)) tends towards a centered Gaussian vector. The covariance matrix is the limit of the covariances of the Z^{(N)}_σ(x_i), that is (K(x_i, x_j))_{1≤i,j≤r}. The previous theorem deals with pointwise convergence. It is also possible to get some results for the random functions (F^{(N)}_σ)_{N≥1}. In the following statement, we consider convergence in the functional space (C([0; 1]), || · ||_∞), that is, uniform convergence of continuous functions.
Moreover, the sequence of random functions (x → Z^{(N)}_σ(x))_{N≥1} converges in distribution towards the centered Gaussian process x → G(x) with covariance function Cov(G(x), G(y)) = K(x, y).

Proof. As, for any N ≥ 1 and any σ ∈ S_N, the function x → F^{(N)}_σ(x) is non-decreasing, the first statement follows easily from the convergence at any fixed x. The argument can be found, for example, in a paper of J.-F. Marckert [25, first page], but it is so short and simple that we copy it here. By monotonicity of F^{(N)}_σ and F, for any list (x_i)_{0≤i≤k} with 0 = x_0 < x_1 < ⋯ < x_k = 1, the supremum sup_{x∈[0;1]} |F^{(N)}_σ(x) − F(x)| is bounded by the maximum of the differences at the points x_i, plus the maximal increment of F between two consecutive points x_i, which may be chosen as small as wanted.
Consider the second statement. If the sequence of random functions x → Z^{(N)}_σ(x) has a limit, its finite-dimensional laws are necessarily the limits of those of Z^{(N)}_σ, that is, by Theorem 1.2, Gaussian vectors with covariance matrices given by (K(x_i, x_j))_{1≤i,j≤r}. As a probability measure on C([0; 1]) is entirely determined by its finite-dimensional laws [7, Example 1.2], one just has to prove that the sequence of random functions x → Z^{(N)}_σ(x) is tight. A simple adaptation of the proof of Proposition 5.3 gives the moment bound required by the tightness criterion. Indeed, in Lemma 4.1, if we ask that at least one entry of the list i lies between N s and N s′, then the number of lists is bounded from above by C_L N^t |s − s′|. Finally, the last inequality has been deduced from |s − s′| ≥ N^{−1}.
We can now apply Theorem 10.2 of Billingsley's book [7] with S_i = Z^{(N)}_σ(x_i), which ends the proof.

This section is devoted to the applications of our method to adjacencies (Subsection 6.2) and dashed patterns (Subsection 6.3). These two statistics belong in fact to the same general framework, and we discuss in Subsection 6.4 the possibility of unifying our results.
The proofs in this section are a little more technical than the previous ones; in particular, we need a new lemma for step 3, given in Subsection 6.1.

Preliminaries
Let L ≥ 1 be an integer and let D = (D_{j,k})_{1≤j<k≤L} be a family of finite sets of integers. To a list i = (i_1, …, i_L), associate the graph with vertex set [L] having an edge between j and k whenever i_j − i_k ∈ D_{j,k}.

Lemma 6.1. There exists a constant C_{L,D} such that, for any N ≥ 1 and t ≥ 1, the number of lists i with entries in [N], whose corresponding graph has exactly t connected components, is bounded from above by C_{L,D} N^t.
Proof. If we fix a graph G with vertex set [L] and t connected components, and if we also fix, for each edge e = {j, k} of the graph, the actual value of the difference δ_e(i) = i_j − i_k, then the corresponding number of lists i is smaller than N^t. Indeed, the sequence is determined by the choice of one value per connected component of G (with some constraints, so that no extra edges appear). But the numbers of graphs and of values on edges are finite (the sets D_{j,k} are finite) and depend only on L and on the family D.
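Lemma 6.1 can also be checked by exhaustive enumeration for small L and N. The following sketch is our own illustration, with the family D encoded as a dictionary mapping a pair of (0-indexed) positions to its finite set of allowed differences:

```python
from itertools import product

def components_count(L, edges):
    # Number of connected components of a graph on {0, ..., L-1},
    # computed with a small union-find structure.
    parent = list(range(L))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
    return len({find(x) for x in range(L)})

def lists_with_t_components(N, D, t):
    # Count lists i in [N]^L whose difference-constraint graph
    # (edge j~k iff i_j - i_k lies in D[(j, k)]) has exactly t components.
    L = max(max(pair) for pair in D) + 1
    total = 0
    for i in product(range(1, N + 1), repeat=L):
        edges = [(j, k) for (j, k), diffs in D.items() if i[j] - i[k] in diffs]
        if components_count(L, edges) == t:
            total += 1
    return total
```

For instance, with L = 2 and an edge exactly when the two entries are equal, the counts are N lists with one component and N² − N with two, both of order N^t as the lemma predicts.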

Adjacencies
In this section, we prove the following extension of Theorem 1.3.

Theorem 6.2. Let (σ_N)_{N≥1} be a sequence of random Ewens permutations, such that σ_N has size N. Then the number A^{(N)} of adjacencies in σ_N converges in distribution towards a Poisson variable with parameter 2.
Proof. As before, we write A^{(N)} in terms of the indicator variables B^{(N)}_{i,s} (with the natural convention at the boundary). Hence, for ℓ ≥ 1, its ℓ-th cumulant writes as a sum over pairs of lists, as in equation (6.1) (step 1). Given two lists i and s of positive integers, we consider the three following graphs:
• H_1 has vertex set [ℓ] and has an edge between j and k if |i_j − i_k| ≤ 2 and |s_j − s_k| ≤ 2;
• H_2 has vertex set [ℓ] and has an edge between j and k if {i_j, i_j ± 1, s_j, s_j ± 1} ∩ {i_k, i_k ± 1, s_k, s_k ± 1} ≠ ∅;
• H has vertex set [ℓ] ⊔ [ℓ̄] and has an edge between j and k (resp. j and k̄, j̄ and k̄) if |i_j − i_k| ≤ 2 (resp. |i_j − s_k| ≤ 2, |s_j − s_k| ≤ 2).

We will use Theorem 1.4 to give a bound for each summand of (6.1): the relevant number of connected components is at least equal to 2#(CC(H_1)) ≥ #(CC(H_1)) + 1. Besides, in this case, the graph G_2 introduced in Section 1.3 has the same vertex set as H_2 and fewer edges; hence it has more connected components. Therefore, Theorem 1.4 implies a bound in terms of #(CC(H_1)) and #(CC(H_2)) (step 2). But, using the terminology of Section 3.3, the graphs H_1 and H_2 are the strong and usual quotients of H. Therefore, by Lemma 3.2, one has:

#(CC(H)) ≤ #(CC(H_1)) + #(CC(H_2)). (6.2)
Besides, Lemma 6.1 implies that the number of lists i and s with entries in [N] such that H has exactly t connected components is bounded from above by C_{2ℓ,D} N^t, for a well-chosen family D (step 3). In particular, the constant C_{2ℓ,D} does not depend on N. Therefore, the total contribution of these lists to equation (6.1) is bounded from above by C_{2ℓ} N^{−t} · C_{2ℓ,D} N^t = C_{2ℓ} · C_{2ℓ,D}. Finally, the cumulants of A^{(N)} converge towards those of a Poisson variable with parameter 2, which implies the convergence of A^{(N)} in distribution.
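Theorem 6.2 is easy to test by simulation. The sketch below is our own code: it reuses the Chinese-restaurant sampler for Ewens permutations and counts adjacencies, here taken to be the positions i with σ(i + 1) = σ(i) ± 1; the empirical mean should then be close to the Poisson parameter 2.

```python
import random

def ewens_permutation(N, theta, rng=random):
    # Chinese-restaurant sampler for the Ewens(theta) measure (see Section 4).
    sigma = []
    for n in range(1, N + 1):
        if rng.random() < theta / (theta + n - 1):
            sigma.append(n)            # new cycle
        else:
            j = rng.randrange(n - 1)   # insert n after element j + 1
            sigma.append(sigma[j])
            sigma[j] = n
    return sigma

def adjacencies(sigma):
    # Positions i with sigma(i + 1) = sigma(i) + 1 or sigma(i) - 1.
    return sum(abs(sigma[i + 1] - sigma[i]) == 1
               for i in range(len(sigma) - 1))
```

For uniform permutations (θ = 1), each of the N − 1 positions is an adjacency with probability 2/N, so the exact mean is 2(N − 1)/N, consistent with the limiting parameter 2.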

Dashed patterns
In this section, we prove Theorem 1.8, which describes, for any given dashed pattern (τ, X), the asymptotic behavior of the sequence (O^{(N)}_{(τ,X)})_{N≥1}. As before, we expand the ℓ-th cumulant as a sum over pairs of matrices, as in equation (6.3). The first (resp. second) summation index is the set of matrices (i^r_j) (resp. (s^r_j)) with (j, r) ∈ [p] × [ℓ] such that:
• for all r, one has i^r_1 < ⋯ < i^r_p (resp. s^r_{τ^{−1}(1)} < ⋯ < s^r_{τ^{−1}(p)});
• for all r and all x ∈ X, one has i^r_{x+1} = i^r_x + 1 (resp. no extra condition on the s's).
Given such lists i and s, we consider the four following graphs:
• H_1 has vertex set [p] × [ℓ] and has an edge between (j, r) and (k, t) if |i^r_j − i^t_k| ≤ 1 and s^r_j = s^t_k;
• H_2 has vertex set [p] × [ℓ] and has an edge between (j, r) and (k, t) if {i^r_j, i^r_j + 1, s^r_j} ∩ {i^t_k, i^t_k + 1, s^t_k} ≠ ∅.

Conclusion: local statistics
Recently, several authors have further generalized the notion of dashed patterns into the notion of bivincular patterns [10, Section 2]. The idea is, roughly, that in an occurrence of a bivincular pattern one can require some values to be consecutive (and not only some places, as in dashed patterns). This new notion is very natural, as occurrences of bivincular patterns in the inverse of a permutation correspond to occurrences of bivincular patterns in the permutation itself (which is not true for dashed patterns).
It would be interesting to give a general theorem on the asymptotic behavior of the number of occurrences of a given bivincular pattern. This seems to be a hard problem, as many different behaviors can occur: • The number of adjacencies is the sum of the numbers of occurrences of two different bivincular patterns and converges towards a Poisson distribution.
• The dashed patterns are special cases of bivincular patterns. As we have seen in the previous section, their number of occurrences converges, after normalization, towards a Gaussian law (at least for patterns of size smaller than 9; the general case relies on Conjecture 1.9). Other bivincular patterns exhibit the same behavior, for example the one considered in [10].
• Other behaviors can occur: for example, it is easy to see that the number of occurrences of the pattern (123, {1}, {1}) (we use the notations of [10]) has an expectation of order N, but a probability of being 0 admitting a positive lower bound.
Unfortunately, we have not been able to give a general statement. Let us however emphasize the fact that our approach unifies the first two cases. More generally, our approach seems suitable to study what could be called local statistics. Fix an integer p ≥ 1 and a set S of constraints: a constraint is an equality or inequality (large or strict) whose members are of the form i_j + d or s_j + d, where j belongs to [p] and d is some integer. Then, for a permutation σ of S_N, we define O^{(N)}_S(σ) as the number of lists (i_1, …, i_p) of distinct places such that the constraints of S are all satisfied with s_j = σ(i_j). Our method is suitable for the asymptotic study of joint vectors of such local statistics. We have failed to find a general statement, but we are convinced that our approach can be adapted to many more examples than the ones studied in this article.
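To make the notion concrete, here is our own generic sketch (the encoding of the constraint set S as a Python predicate is an implementation choice, not the paper's formalism): it counts the lists of distinct places (i_1, …, i_p) whose places and values satisfy a given conjunction of constraints. Exceedances, inversions and the two halves of the adjacency statistic are all of this form.

```python
from itertools import permutations

def local_statistic(sigma, p, constraints):
    # sigma: permutation as a list, sigma[i-1] = sigma(i).
    # constraints: predicate taking the places i = (i_1, ..., i_p), all
    # distinct, and the values s = (sigma(i_1), ..., sigma(i_p)).
    N = len(sigma)
    return sum(
        1
        for i in permutations(range(1, N + 1), p)
        if constraints(i, tuple(sigma[j - 1] for j in i))
    )
```

For σ = 231, the constraint s_1 > i_1 (with p = 1) counts the 2 exceedances, while i_1 < i_2 and s_1 > s_2 (with p = 2) counts the 2 inversions.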
However, the method does not seem appropriate to global statistics, such as the total number of cycles of the permutation or the length of the longest cycle.