Large Deviations for Permutations Avoiding Monotone Patterns

For a given permutation $\tau$, let $P_N^{\tau}$ be the uniform probability distribution on the set of $N$-element permutations $\sigma$ that avoid the pattern $\tau$. For $\tau=\mu_k:=123\cdots k$, we consider $P_N^{\mu_k}(\sigma_I=J)$ where $I\sim \gamma N$ and $J\sim \delta N$ for $\gamma,\delta \in (0,1)$. If $\gamma+\delta\neq 1$ then we are in the large deviations regime with the probability decaying exponentially, and we calculate the limiting value of $P_N^{\mu_k}(\sigma_I=J)^{1/N}$. We also observe that for $\tau = \lambda_{k,\ell} := 12\ldots\ell k(k-1)\ldots(\ell+1)$ and $\gamma+\delta<1$, the limit of $P_N^{\tau}(\sigma_I=J)^{1/N}$ is the same as for $\tau=\mu_k$.


Introduction and Statement of Results
This paper concerns an aspect of the probabilistic properties of a class of pattern-avoiding permutations. As surveyed in the books of Bóna [4] and Kitaev [9], pattern avoidance has been of considerable interest in combinatorial theory, interacting with fields ranging from algebraic combinatorics to the theory of algorithms. In the next few paragraphs, we give a brief description of the context. For each positive integer $N$, let $S_N$ be the set of all permutations of $1, 2, \ldots, N$. We represent a permutation $\sigma \in S_N$ as a string of numbers using the one-line notation $\sigma = \sigma_1 \cdots \sigma_N$. We also view $\sigma$ as the function on $\{1, \ldots, N\}$ that maps $i$ to $\sigma(i) = \sigma_i$. The graph of the function $\sigma$ is the set of $N$ points $\{(i, \sigma_i) : i = 1, \ldots, N\}$ in $\mathbb{Z}^2$. Given $\tau \in S_k$ (with $k \le N$), we say that a permutation $\sigma \in S_N$ avoids the pattern $\tau$ (or "$\sigma$ is $\tau$-avoiding") if there is no $k$-element subsequence of $\sigma_1, \ldots, \sigma_N$ having the same relative order as $\tau$. (See Section 1.1 for a more formal definition.) Let $S_N(\tau)$ be the set of permutations in $S_N$ that avoid $\tau$. For example, the permutation 24153 is not in $S_5(312)$ because it contains the subsequence 413, which has the same relative order as 312. In contrast, the permutation 35421 has no such subsequence, and hence $35421 \in S_5(312)$.
We write $|A|$ to denote the number of elements in a set $A$. Knuth [10] proved that $|S_N(\tau)|$ is the same for all $\tau \in S_3$ and is equal to the $N$th Catalan number, that is $\binom{2N}{N}/(N+1)$, for every $N$. For $\tau \in S_k$ with $k \ge 4$, the values of $|S_N(\tau)|$ depend on the pattern $\tau$ and have been computed for only some cases. For example, Gessel [6] used generating functions to show that
$$|S_N(1234)| = 2\sum_{k=0}^{N} \binom{2k}{k}\binom{N}{k}^{2}\,\frac{3k^2+2k+1-N-2kN}{(k+1)^2(k+2)(N-k+1)}.$$
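The two enumeration facts above can be checked by brute force for small $N$. The following is a minimal sketch, not part of the original argument: the function names (`avoids`, `count_avoiders`, `catalan`, `gessel_1234`) are illustrative choices, and the exhaustive scan over all $N!$ permutations is only feasible for very small $N$.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb


def avoids(sigma, tau):
    """Return True if no subsequence of sigma has the same relative order as tau."""
    target = tuple(tau)
    for sub in combinations(sigma, len(target)):  # keeps left-to-right order
        ranks = sorted(sub)
        if tuple(ranks.index(x) + 1 for x in sub) == target:
            return False
    return True


def count_avoiders(n, tau):
    """Brute-force |S_n(tau)| by scanning all n! permutations (small n only)."""
    return sum(avoids(p, tau) for p in permutations(range(1, n + 1)))


def catalan(n):
    """The n-th Catalan number, binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def gessel_1234(n):
    """Evaluate the closed form for |S_n(1234)| with exact rational arithmetic."""
    total = Fraction(0)
    for k in range(n + 1):
        num = 3 * k * k + 2 * k + 1 - n - 2 * k * n
        den = (k + 1) ** 2 * (k + 2) * (n - k + 1)
        total += Fraction(comb(2 * k, k) * comb(n, k) ** 2 * num, den)
    return 2 * total
```

For instance, `count_avoiders(4, (1, 2, 3, 4))` and `gessel_1234(4)` can be compared directly; exact `Fraction` arithmetic is used because individual summands of the closed form are not integers.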
Recently, some researchers have taken a probabilistic viewpoint towards investigating pattern-avoiding permutations, especially for patterns in $S_3$. They have been concerned with the configurational properties of a typical $\tau$-avoiding permutation of length $N$; more precisely, of a permutation drawn uniformly at random from the set $S_N(\tau)$. Accordingly, we shall write $P_N^{\tau}$ to denote the uniform probability distribution over the set $S_N(\tau)$. The following result, proven independently by Miner and Pak [14] and by Atapour and Madras [2], motivates the present paper.
Theorem 1.1. [2,14] Fix numbers γ and δ in (0, 1) such that γ < 1 − δ. For each N, let $I_N$ and $J_N$ be integers in [1, N] such that Then where we define Since G(u, v; 1) < 4 whenever u ≠ v, we see that the probabilities $P_N^{123}(\sigma_{I_N} = J_N)$ and $P_N^{132}(\sigma_{I_N} = J_N)$ decay exponentially in N when γ < 1 − δ. Thus, a random 123-avoiding or 132-avoiding permutation is exponentially unlikely to contain any points εN below the diagonal {(i, N − i + 1) : 1 ≤ i ≤ N}; we refer to this as the "large deviations" regime. In the case that γ > 1 − δ, Equation (2) still holds (by symmetry about the diagonal), but for τ = 132 there is no exponential decay, i.e. the limit in Equation (3) is 1. In fact, [14]). Madras and Pehlivan [12] also examined joint probabilities under $P_N^{132}$, proving for example that the probability that the graph of σ has two specified points below the diagonal is of order $N^{-3}$ (under certain conditions on the points). Rizzolo, Hoffman, and Slivken [7] proved that for $\tau \in S_3$, the shape of a τ-avoiding random permutation can be described by Brownian excursion. Janson [8] studied the number of occurrences of another pattern π inside a random 132-avoiding permutation.
Although patterns of length 3 are amenable to precise probabilistic results, analogues for longer patterns seem to be much harder. One reason for this is that for τ ∈ S 3 , there are nice bijections from S N (τ ) to the set of Dyck paths of length 2N , and these bijections translate various configurational properties of τ -avoiding permutations into tractable properties of Dyck paths (e.g. [7], [12]). (At a more metaphysical level: when the Catalan numbers appear in a problem, nice things happen.) However, nice bijections are much harder to find for patterns of length 4. Although exact formulas for |S N (τ )| are known for some patterns τ of length 4, their proofs are much more complicated than for length 3 and do not seem to be useful for investigating properties of P τ N . In this paper our goal is to extend the large deviation result of Theorem 1.1 to the patterns µ k for k ≥ 4. In contrast to the proof for µ 3 , our derivation of the precise large deviations results does not require exact formulas for finite values of N .
We shall examine the cardinalities of sets of the form
$$F_N(I, J; \tau) := \{\sigma \in S_N(\tau) : \sigma_I = J\}.$$
Then in terms of the uniform distribution over $S_N(\tau)$, we have
$$P_N^{\tau}(\sigma_I = J) = \frac{|F_N(I, J; \tau)|}{|S_N(\tau)|}.$$
Monte Carlo simulations by Gökhan Yıldırım (as seen in Figure 1) suggest that, as N gets larger, the number of points well below the line x + y = 1 decreases. We shall typically consider the case J < N − I (i.e., points "below the diagonal"); when $\tau = \mu_k$, the case J > N − I follows from symmetry considerations. Since we know the asymptotics of the denominator $|S_N(\tau)|$ for our patterns of interest, and since our methods are essentially combinatorial, we shall henceforth discuss only the numerator, dealing directly with $|F_N(I, J; \tau)|$ and related combinatorial quantities.
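For very small $N$, the number of $\mu_k$-avoiding permutations with $\sigma_I = J$ can be tabulated exhaustively, which gives a concrete picture of how the mass sits near the diagonal. The sketch below is illustrative only: the names `avoids_increasing` and `position_counts` are hypothetical, and the enumeration is exponential in $N$, so it says nothing about the asymptotic regime studied in the paper.

```python
from itertools import combinations, permutations


def avoids_increasing(sigma, k):
    """True if sigma has no increasing subsequence of length k (i.e. avoids mu_k)."""
    return not any(
        all(a < b for a, b in zip(sub, sub[1:]))
        for sub in combinations(sigma, k)
    )


def position_counts(n, k):
    """Return (counts, total): counts[(I, J)] is the number of mu_k-avoiding
    permutations sigma of length n with sigma_I = J, and total is |S_n(mu_k)|."""
    counts = {(i, j): 0 for i in range(1, n + 1) for j in range(1, n + 1)}
    total = 0
    for sigma in permutations(range(1, n + 1)):
        if avoids_increasing(sigma, k):
            total += 1
            for i, j in enumerate(sigma, start=1):
                counts[(i, j)] += 1
    return counts, total
```

Dividing `counts[(I, J)]` by `total` gives the exact probability that $\sigma_I = J$ under the uniform distribution on $S_N(\mu_k)$. Reflecting the graph in the diagonal $\{(i, N-i+1)\}$ preserves $\mu_k$-avoidance, so the table satisfies `counts[(i, j)] == counts[(n + 1 - j, n + 1 - i)]`.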
where we define
Remark 1.3. When $J_N \approx N - I_N$ (i.e., when we are close to the diagonal), then we are in the (limiting) case γ = 1 − δ. This is not a "large deviation," since $G(u, u; (k-2)^2) = L(\mu_k)$; indeed, and it follows that is examined by Fineman, Slivken, Rizzolo, and Hoffman (in preparation).
Remark 1.4. The numerator and denominator inside the parentheses in Equation (9) are both 0 when we set c = 1. Therefore we define g(x, y; 1) by taking the limit of g(x, y; c) as $c \to 1^{+}$. We then obtain which in turn implies that G(u, v; 1) is given by Equation (4). Thus our Theorem 1.2 formally recovers Theorem 1.1.
Remark 1.5. Assume that γ, δ, $I_N$ and $J_N$ are as in Theorem 1.1 except that γ > 1 − δ. Then Equation (6) still holds (by symmetry), while The term $(k-2)^2$ appears in Equations (6) and (7) because it is the value of $L(\mu_{k-1})$. This is highlighted and generalized in Theorem 1.8 below.
Definition 1.6. Let N and A be positive integers, and let τ be a fixed permutation. Define
$$S_N^{*A}(\tau) := \{\sigma \in S_N(\tau) : \sigma_i > N - i - A \text{ for every } i \in [1, N]\}.$$
Thus, the graph of a permutation in $S_N^{*A}(\tau)$ has no point that is more than A units below the diagonal. Then Theorem 1.2 of [2] implies that for every ε > 0, $|S_N^{*\epsilon N}(123)|/|S_N(123)|$ and $|S_N^{*\epsilon N}(132)|/|S_N(132)|$ converge to 1 exponentially rapidly as N → ∞.
For example, $1 \oplus 3124 = 14235$. Observe that $1 \oplus \mu_{k-1} = \mu_k$ and $1 \oplus \lambda_{k-1,\ell-1} = \lambda_{k,\ell}$.
Most of the present paper will focus on the proof of the following theorem.
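The operation in the example above (written here as ⊕) is assumed to be the usual direct sum of permutations: the second permutation is shifted up by the length of the first and appended. Under that assumption, which matches the stated example, a minimal sketch is as follows; the name `direct_sum` is an illustrative choice.

```python
def direct_sum(alpha, beta):
    """Direct sum of two permutations in one-line notation (assumed meaning of
    the operation in the text): alpha followed by beta shifted up by len(alpha)."""
    shift = len(alpha)
    return list(alpha) + [b + shift for b in beta]
```

With this definition, prepending `[1]` to the increasing pattern of length $k-1$ yields the increasing pattern of length $k$, in line with the observation in the text.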
Theorem 1.8. Let $\hat\tau$ be a pattern of length 3 or more, and assume that Let $\tau = 1 \oplus \hat\tau$. Let γ, δ, $I_N$ and $J_N$ be as specified in the statement of Theorem 1.2. Then
Remark 1.9. (a) Theorem 1.2 of [2] implies that Equation (10) holds for $\mu_3$ and $\lambda_{3,1}$. (b) [2] implies that if Equation (10) holds, then $\hat\tau_1$ must equal 1. The converse of this statement has neither been proved nor disproved; however, simulations in [2] and [11] suggest that (10) is false for τ = 1324.
As we shall see in Section 4, Theorem 1.2 follows from Theorem 1.8 by induction on k, with Remark 1.9(a) leading to the base case k = 4. The idea behind the proof of Theorem 1.8 consists of three main steps. An important role is played by the set F * N (I, J; τ ) of permutations in F N (I, J; τ ) for which (I, J) is a left-to-right minimum (i.e., σ i > J for all i < I). The first step is to derive an explicit upper bound to show that |F * N (I, J; τ )| 1/N is less than or equal to G(γ, 1−δ; L(τ )) in the limit. The second step is to use monotonicity of G to show that we can replace F * by F in the preceding assertion. The third step uses the dominant terms from the upper bound of the first step to construct a lower bound on |F N (I, J; τ )| 1/N that is arbitrarily close to the upper bound. Section 2 carries out the first two steps, while Section 3 performs the third step. Section 4 ties the pieces together to complete the proofs of the two theorems. Section 1.1 presents some basic definitions and a useful lemma.
We close this section with a physical analogy to help visualize our results about µ k . It is easy to verify that an N -element permutation σ is in S N (µ k ) if and only if σ can be partitioned into k − 1 decreasing subsequences. It is not hard to see that these decreasing subsequences are all likely to stay close to the decreasing diagonal of [1, N ] 2 . Think of the subsequences as k−1 elastic strings, each with one end tied to the point (1, N ) and the other end tied to (N, 1), and each string tight. Requiring σ I to equal J is like forcing one of the strings to pass through the point (I, J). With this constraint, the rest of the string deforms into two line segments, one from (1, N ) to (I, J) and the other from (I, J) to (N, 1). Tension in the string dictates how the mass of the string is balanced among the two segments, and the mass is evenly distributed within each segment. This physical picture parallels our lower bound construction in Section 3.
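The partition into decreasing subsequences mentioned above can be produced greedily. The sketch below uses patience sorting, a standard technique that is not taken from the paper: each entry is placed on the leftmost pile whose top is larger, and the number of piles equals the length of a longest increasing subsequence. The names `decreasing_piles` and `avoids_mu` are illustrative.

```python
from bisect import bisect_right


def decreasing_piles(sigma):
    """Greedily split sigma into decreasing subsequences (patience sorting).

    The number of piles produced equals the length of a longest increasing
    subsequence of sigma, so it is the minimum possible number of pieces.
    """
    piles = []  # piles[j] is a decreasing subsequence of sigma
    tops = []   # tops[j] == piles[j][-1]; this list stays in increasing order
    for x in sigma:
        j = bisect_right(tops, x)  # leftmost pile whose top exceeds x
        if j == len(piles):
            piles.append([x])
            tops.append(x)
        else:
            piles[j].append(x)
            tops[j] = x
    return piles


def avoids_mu(sigma, k):
    """True if sigma avoids 12...k, i.e. splits into at most k-1 decreasing pieces."""
    return len(decreasing_piles(sigma)) <= k - 1
```

In the string picture of the text, each pile plays the role of one elastic string; for a permutation in $S_N(\mu_k)$ the greedy procedure never needs more than $k-1$ of them.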

Some Formalities and Preliminaries
For a string ω of length k whose entries are all distinct numbers, let Patt(ω) be the permutation in $S_k$ that has the same relative order as ω. E.g., Patt(91734) = 51423. More precisely, $\mathrm{Patt}(\omega_1 \omega_2 \cdots \omega_k)$ is the unique permutation π in $S_k$ with the property that for all $i, j \in \{1, \ldots, k\}$, $\omega_i < \omega_j$ if and only if $\pi_i < \pi_j$.
Assume $\tau \in S_k$ and $\sigma \in S_N$. We say that σ contains the pattern τ if there exist indices $1 \le I_1 < I_2 < \cdots < I_k \le N$ such that $\mathrm{Patt}(\sigma_{I_1} \sigma_{I_2} \cdots \sigma_{I_k}) = \tau$. We say that σ avoids the pattern τ if σ does not contain τ. We write $S_N(\tau)$ for the set of all permutations in $S_N$ that avoid τ.
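The two definitions above translate directly into code. The following is a minimal sketch with illustrative function names (`patt`, `contains`); the containment test simply tries every $k$-element subsequence and is meant for small examples, not for efficiency.

```python
from itertools import combinations


def patt(omega):
    """Return the permutation with the same relative order as the distinct entries of omega."""
    ranks = {value: r for r, value in enumerate(sorted(omega), start=1)}
    return [ranks[value] for value in omega]


def contains(sigma, tau):
    """True if some subsequence of sigma (taken left to right) has pattern tau."""
    tau = list(tau)
    return any(patt(sub) == tau for sub in combinations(sigma, len(tau)))
```

For example, `patt([9, 1, 7, 3, 4])` reproduces the value of Patt(91734) given in the text, and `contains([2, 4, 1, 5, 3], [3, 1, 2])` detects the occurrence 413 noted in the Introduction.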
We shall also use the following well-known results.
Lemma 1.11. (i) Let s and t be integers satisfying 0 ≤ s ≤ t. Then In this lemma, we interpret $0^0$ to be 1.
Proof: Part (ii) follows from Stirling's formula, and part (i) is proven, for example, in Lemma 2.1(b) of [2].

The Upper Bound
We begin with some definitions. For a given permutation σ, define
$$M(\sigma) := \{(t, \sigma_t) : \sigma_s > \sigma_t \text{ for all } s < t\}.$$
That is, M is the set of points of the graph of σ corresponding to left-to-right minima. Next, let σ \ M be the string consisting of those $\sigma_t$ such that $(t, \sigma_t) \notin M(\sigma)$. Figure 3 shows an example. More generally, if A is a subset of $\mathbb{Z}^2$, let σ \ A denote the string consisting of those $\sigma_t$ such that $(t, \sigma_t) \notin A$. Recall from Section 1 that
$$F_N^*(I, J; \tau) = \{\sigma \in F_N(I, J; \tau) : \sigma_i > J \text{ for all } i < I\}.$$
We shall now perform the first step in the proof of our main theorem.
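A short sketch of these two operations may help fix the notation. It assumes that a left-to-right minimum is an entry smaller than everything to its left, and it reads σ \ A as the entries of σ whose graph points are not in A (an interpretation taken from the notation). The function names are illustrative.

```python
def left_to_right_minima(sigma):
    """Return the points (t, sigma_t), with t starting at 1, that are left-to-right minima."""
    points = []
    running_min = float("inf")
    for t, value in enumerate(sigma, start=1):
        if value < running_min:  # smaller than every earlier entry
            points.append((t, value))
            running_min = value
    return points


def remove_points(sigma, points):
    """Return the entries of sigma, in order, whose graph points are not in `points`."""
    excluded = set(points)
    return [value for t, value in enumerate(sigma, start=1)
            if (t, value) not in excluded]
```

For σ = 35421 the left-to-right minima are the points (1, 3), (4, 2) and (5, 1), and removing them leaves the string 54.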
In the last step, the bound $|S_{N-l-m-1}(\tau)| \le L(\tau)^{N-l-m-1}$ is proven in Theorem 1 in [1]. We now wish to bound H(a, b; c) for a ≤ b and c > 1. By Lemma 1.11(i), we have We now pause to state and prove a lemma, which will also be useful later.
and the maximum value of f is where g was defined in Equation (9).
Proof of Lemma 2.3: By calculus, it is easy to see that log f is a strictly concave function of y on [0, a ∧ b], and is maximized at the (unique) point Thus Equation (17) becomes (using (20)).
Solving the quadratic equation (20) for the positive root gives  Proposition 2.2 now follows directly from the above (with c = L(τ )) and Equation (14).
Our next task is to replace F * N by F N in the statement of Proposition 2.2. We shall do this by proving a monotonicity property of G (Lemma 2.5) and then using a compactness argument.
We begin by showing that G decreases as we move away from the diagonal. We emphasize that in this lemma, "increasing" and "decreasing" are used in their strict sense.
Lemma 2.5. Fix c > 1. The function G(u, v; c) defined in Equation (8) is increasing in u and decreasing in v for 0 < u < v < 1. By symmetry, it is also increasing in v and decreasing in u for 0 < v < u < 1. In particular, G is maximized when u = v, where we have Proof: Recall that Equation (25) was proved in Remark 1.3. Since c is fixed, we shall suppress it in the following notation. Let By routine calculus and some algebraic manipulation, we obtain .
Using this and Equation (26), we can show that From this and Equation (27), we also obtain for every u and v in (0, 1). Therefore G(u, v; c) is strictly concave in u for fixed v (and, by symmetry, it is strictly concave in v for fixed u).
Since $h(u, u) = 2c - 2\sqrt{c}$ for every u, it follows that the partial derivative in Equation (28) is zero whenever u = v. By symmetry, the same is true for the partial derivative with respect to v. Combining this with the concavity result of the previous paragraph completes the proof of the lemma.

The Lower Bound
To get the lower bound on |F N (I, J; τ )|, we shall perform an explicit construction of some permutations in F * N (I, J; τ ) (this is done in the proof of Proposition 3.3 below). The construction is motivated by examining the dominant terms in our proof of the upper bound, and showing that they are approximately achieved.
The main result of this section is the following.
The proof of Proposition 3.1 relies on Proposition 3.3 and Lemma 3.4. We shall first state these two auxiliary results, then prove Proposition 3.1, and conclude the section by proving the two auxiliary results.
The construction of Proposition 3.3 uses a positive parameter A, which will afterwards be of the order εN for fixed small ε. We start with a definition.
with 0 < θ < α ∧ β and ε > 0. Then and (for f defined by Equation (17) Lemma 2.3 assures us that $y_1^* < \gamma \wedge (1 - \delta - 2\epsilon)$ and $y_2^* < (1 - \gamma - 2\epsilon) \wedge \delta$, and therefore Equation (32) holds for all sufficiently large N (where I is interpreted to be $I_N$, etc.). Using these sequences in Proposition 3.3 and invoking Lemma 3.4 and Equations (19) and (10), we see that the Nth root of the right-hand side of Equation (33) converges to Thus Equation (37) is a lower bound for $\liminf_{N \to \infty} |F_N^*(I_N, J_N; \tau)|^{1/N}$ for all sufficiently small positive ε. Now let ε decrease to 0. By the continuity of g, the expression of Equation (37) converges to G(γ, 1 − δ; c). This proves the proposition.
Proof of Proposition 3.3: Fix N , I, J, A, w 1 and w 2 as specified. We shall prove the proposition by constructing an injection from D into F * N (I, J; τ ), where We claim that For (x, y) = (I, J), this follows from our assumption J < N − I − 2A. For (x, y) in B 1 + (0, J), we have (x, y − J) ∈ B 1 and hence , which verifies the claim in this case. A similar argument works if (x, y) ∈ B 2 + (I, 0). Therefore the claim (38) is true. Given Ψ and a permutation φ ∈ S * A N −w 1 −w 2 −1 (τ ), we shall define a permutation σ ∈ S N such that Ψ is contained in the graph of σ (i.e., y = σ x whenever (x, y) ∈ Ψ) and Patt(σ \ Ψ) = φ. Let w = w 1 + w 2 + 1, and write the elements of Ψ as ( i.e., where m satisfies x(m) < i + m < x(m + 1).
It only remains to prove the Key Claim. For $j \in \Psi_x$, say $j = x(\ell)$, we have $\sigma_j = y(\ell)$, and the assertion of the Key Claim follows from Equation (38). Now suppose $j \notin \Psi_x$. Then for some $i \in [1, N - w]$ we have $j = \Gamma_x(i)$ and $\sigma_j = \Gamma_y(\phi_i)$. Since $\phi \in S_{N-w}^{*A}(\tau)$, we know that $\phi_i > (N - w) - i - A$. Following the notation in the definitions of $\Gamma_x$ and $\Gamma_y$, let $m = \Gamma_x(i) - i$ and $n = \Gamma_y(\phi_i) - \phi_i$. Then $x(m) < i + m < x(m+1)$ and $y(w-n+1) < \phi_i + n < y(w-n)$. Also, we have Thus, to show $\sigma_j > N - j - A$, as required for proving the Key Claim, we need to show that m ≥ w − n.
Assume that m ≥ w − n is false, i.e. that m + 1 ≤ w − n. Since $y(\ell) \ge y(\ell + 1) + 1$ for every ℓ, we see that Using this inequality and those of the preceding paragraph, we obtain which is a contradiction. Therefore m ≥ w − n. This proves the Key Claim, and hence the proposition.
This proves Property I. Now, Property I implies that $|\mathrm{Dec}^{*B}(w; M_1, M_2)| \ge |\mathrm{Seq}^{*A}(w; M_1)|\,|\mathrm{Seq}^{*A}(w; M_2)|$. Recalling Equation (39), we see that Equation (34) will follow if we can prove We shall prove Property II by converting it into a probabilistic statement. Let p ∈ (0, 1). Let $G_1, G_2, \ldots$ be a sequence of independent random variables having the geometric distribution with parameter p; that is, $\Pr(G_i = \ell) = p(1-p)^{\ell-1}$ for $\ell = 1, 2, \ldots$. Next, let $T_i = G_1 + G_2 + \cdots + G_i$ for each i. These random variables have negative binomial distributions Moreover, for any x ∈ Seq(w; N) (writing x(0) = 0 and x(w + 1) = N + 1), Equation (41) says that the conditional distribution of $(T_1, \ldots, T_w)$ given that $T_{w+1} = N + 1$ is precisely the uniform distribution on Seq(w; N). This assertion is true for any p. Let us now fix p = (w + 1)/N; we shall soon see why this is a convenient choice. By Equation (41) It is straightforward to derive the asymptotic behaviour of $\Pr(T_{w+1} = N + 1)$ using Stirling's formula $m! \sim \sqrt{2\pi m}\,(m/e)^m$ and p = (w + 1)/N, with w = w(N) ∼ θN, as follows.
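The uniformity statement can be verified exactly for small parameters. The sketch below assumes that Seq(w; N) is the set of strictly increasing w-tuples of integers in [1, N] (this reading is inferred from the conventions x(0) = 0 and x(w + 1) = N + 1, not quoted from a definition). The function names are illustrative, and p should be passed as a `Fraction` so that the comparison is exact.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def geometric_pmf(p, gap):
    """Pr(G = gap) for a geometric variable on {1, 2, ...} with parameter p."""
    return p * (1 - p) ** (gap - 1)


def joint_probability(x, n, p):
    """Pr(T_1 = x(1), ..., T_w = x(w), T_{w+1} = n + 1) for an increasing tuple x in [1, n].

    The event fixes every increment G_i = x(i) - x(i-1), with x(0) = 0 and
    x(w+1) = n + 1, so the probability is a product of geometric terms.
    """
    path = (0,) + tuple(x) + (n + 1,)
    prob = Fraction(1)
    for a, b in zip(path, path[1:]):
        prob *= geometric_pmf(p, b - a)
    return prob


def joint_probabilities(w, n, p):
    """Map each increasing w-tuple in [1, n] (assumed Seq(w; n)) to its joint probability."""
    return {x: joint_probability(x, n, p)
            for x in combinations(range(1, n + 1), w)}
```

Every value in the returned dictionary equals $p^{w+1}(1-p)^{N-w}$, independent of x, which is why conditioning on $T_{w+1} = N + 1$ gives the uniform distribution for any p. Summing over the $\binom{N}{w}$ tuples recovers the negative binomial probability $\Pr(T_{w+1} = N + 1) = \binom{N}{w} p^{w+1} (1-p)^{N-w}$.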
For the numerator of the right-hand side of Equation (42), we use Kolmogorov's Inequality [5], along with the property that the random variables $G_i$ have mean $1/p$ and variance $(1-p)/p^2$: Pr max =1,...,w Applying Equations (43) and (44) to Equation (42) proves Property II. This completes the proof of Lemma 3.4.

Conclusion
Recalling Remark 1.9(b), we see that Theorem 1.8 follows immediately from Propositions 2.4 and 3.1. We now show that Equation (6) of Theorem 1.2 follows from Theorem 1.8 by induction. Remark 1.9(a) tells us that we can apply Theorem 1.8 when τ is $1 \oplus \mu_3$, which shows that Equation (6) holds for k = 4. Now assume that Equation (6) is true for a given k ≥ 4. Lemma 2.5 and Remark 1.3 prove that $G(\gamma, 1 - \delta; (k-2)^2) < (k-1)^2$ whenever γ < 1 − δ. This means that Equation (6) implies Equation (10) when $\hat\tau$ is $\mu_k$, using and a compactness argument as in the proof of Proposition 2.4. Hence Equation (11) holds when τ is $\mu_{k+1}$, in which case $L(\hat\tau)$ equals $(k-1)^2$. This says that Equation (6) holds with k replaced by k + 1. This completes the induction, showing that Equation (6) holds for every k ≥ 4.
Finally we shall prove Equation (7) for k ≥ 4 and 1 ≤ ℓ ≤ k − 2. The proof of Proposition 2.3 in [3] shows that there is a bijection from $S_N(1 \cdots \ell(\ell+1) \cdots (k-1)k)$ to $S_N(1 \cdots \ell\, k(k-1) \cdots (\ell+1))$ that preserves all the left-to-right minima of each permutation. (To see this, observe that when A = J in the proof of [3], each right-to-left minimum and everything below it and to its right are all coloured blue, and hence are unchanged by the bijection α.) It follows that Equations (45) and (46) together imply Equation (7). This completes the proof of Theorem 1.2.