On moment sequences and mixed Poisson distributions

In this article we survey properties of mixed Poisson distributions and probabilistic aspects of the Stirling transform: given a non-negative random variable $X$ with moment sequence $(\mu_s)_{s\in\mathbb{N}}$ we determine a discrete random variable $Y$, whose moment sequence is given by the Stirling transform of the sequence $(\mu_s)_{s\in\mathbb{N}}$, and identify the distribution as a mixed Poisson distribution. We discuss properties of this family of distributions and present a simple limit theorem based on expansions of factorial moments instead of power moments. Moreover, we present several examples of mixed Poisson distributions in the analysis of random discrete structures, unifying and extending earlier results. We also add several entirely new results: we analyse triangular urn models, where the initial configuration or the dimension of the urn is not fixed, but may depend on the discrete time $n$. We discuss the branching structure of plane recursive trees and its relation to table sizes in the Chinese restaurant process. Furthermore, we discuss root isolation procedures in Cayley-trees, a parameter in parking functions, zero contacts in lattice paths consisting of bridges, and a parameter related to cyclic points and trees in graphs of random mappings, all leading to mixed Poisson-Rayleigh distributions. Finally, we indicate how mixed Poisson distributions naturally arise in the critical composition scheme of Analytic Combinatorics.


INTRODUCTION
In combinatorics the Stirling transform of a given sequence $(a_s)_{s\in\mathbb{N}}$, see [7,68], is the sequence $(b_s)_{s\in\mathbb{N}}$, with elements given by
$$b_s = \sum_{k=1}^{s} {s \brace k}\, a_k, \quad s \ge 1. \tag{1}$$
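The inverse Stirling transform of the sequence $(b_s)_{s\in\mathbb{N}}$ is obtained as follows:
$$a_s = \sum_{k=1}^{s} (-1)^{s-k} {s \brack k}\, b_k, \quad s \ge 1. \tag{2}$$
Here ${s \brace k}$ denote the Stirling numbers of the second kind, counting the number of ways to partition a set of $s$ objects into $k$ non-empty subsets, see [66] or [25], and ${n \brack m}$ denotes the unsigned Stirling numbers of the first kind, counting the number of permutations of $n$ elements with $m$ cycles [25]. These numbers appear as coefficients in the expansions
$$x^s = \sum_{k=0}^{s} {s \brace k}\, x^{\underline{k}}, \qquad x^{\underline{s}} = \sum_{k=0}^{s} (-1)^{s-k} {s \brack k}\, x^{k}, \tag{3}$$
relating ordinary powers $x^s$ to the so-called falling factorials $x^{\underline{s}} = x(x-1)\cdots(x-(s-1))$, $s \in \mathbb{N}_0$. On the level of exponential generating functions $A(z) = \sum_{s\ge 0} a_s z^s/s!$ and $B(z) = \sum_{s\ge 0} b_s z^s/s!$, the Stirling transform and the relations (1) and (2) turn into
$$B(z) = A(e^z - 1), \qquad A(z) = B(\log(1+z)). \tag{4}$$
This definition is readily generalized: the generalized Stirling transform with parameter $\rho > 0$ of a sequence $(a_s)_{s\in\mathbb{N}}$ is the sequence $(b_s)_{s\in\mathbb{N}}$ with
$$b_s = \sum_{k=1}^{s} \rho^k {s \brace k}\, a_k, \quad\text{such that}\quad a_s = \frac{1}{\rho^s}\sum_{k=1}^{s} (-1)^{s-k} {s \brack k}\, b_k, \quad s \ge 1. \tag{5}$$
On the level of exponential generating functions: $B(z) = A\bigl(\rho(e^z - 1)\bigr)$ and $A(z) = B\bigl(\log(1 + \tfrac{z}{\rho})\bigr)$.

The aim of this work is to discuss several probabilistic aspects of the generalized Stirling transform with parameter $\rho > 0$ in connection with moment sequences and mixed Poisson distributions, pointing out applications in the analysis of random discrete structures. Given a non-negative random variable $X$ with power moments $E(X^s) = \mu_s \in \mathbb{R}^+$, $s \ge 1$, we study the properties of another random variable $Y$, given by its sequence of factorial moments $E(Y^{\underline{s}}) = E\bigl(Y(Y-1)\cdots(Y-(s-1))\bigr)$, which are determined by the moments of $X$,
$$E(Y^{\underline{s}}) = \rho^s E(X^s) = \rho^s \mu_s, \quad s \ge 1, \tag{6}$$
where $\rho > 0$ denotes an auxiliary scale parameter. Moreover, we discuss relations between the moment generating functions $\psi(z) = E(e^{zX})$ and $\varphi(z) = E(e^{zY})$ of $X$ and $Y$, respectively.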
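The generalized Stirling transform and its inverse are easy to experiment with. The following is a minimal sketch in exact rational arithmetic; the function names are illustrative and not taken from the paper, and sequences are passed as lists `[a_1, ..., a_n]`.

```python
from fractions import Fraction


def stirling2(n):
    """Table S[s][k] of Stirling numbers of the second kind, 0 <= k <= s <= n."""
    S = [[0] * (n + 1) for _ in range(n + 1)]
    S[0][0] = 1
    for s in range(1, n + 1):
        for k in range(1, s + 1):
            S[s][k] = k * S[s - 1][k] + S[s - 1][k - 1]
    return S


def stirling1_unsigned(n):
    """Table c[s][k] of unsigned Stirling numbers of the first kind."""
    c = [[0] * (n + 1) for _ in range(n + 1)]
    c[0][0] = 1
    for s in range(1, n + 1):
        for k in range(1, s + 1):
            c[s][k] = (s - 1) * c[s - 1][k] + c[s - 1][k - 1]
    return c


def stirling_transform(a, rho=1):
    """b_s = sum_{k=1}^{s} rho^k S(s,k) a_k for a = [a_1, ..., a_n]."""
    n, rho = len(a), Fraction(rho)
    S = stirling2(n)
    return [sum(rho ** k * S[s][k] * a[k - 1] for k in range(1, s + 1))
            for s in range(1, n + 1)]


def inverse_stirling_transform(b, rho=1):
    """a_s = rho^(-s) sum_{k=1}^{s} (-1)^(s-k) c(s,k) b_k for b = [b_1, ..., b_n]."""
    n, rho = len(b), Fraction(rho)
    c = stirling1_unsigned(n)
    return [sum((-1) ** (s - k) * c[s][k] * b[k - 1] for k in range(1, s + 1)) / rho ** s
            for s in range(1, n + 1)]
```

With $\rho = 1$ and the constant sequence $a_s = 1$ (the moments of the degenerate variable $X = 1$), the transform yields the Bell numbers, which are the power moments of a Poisson distribution with parameter 1.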
1.1. Motivation. Our main motivation to study random variables with a given sequence of factorial moments (6) stems from the analysis of combinatorial structures. In many cases, amongst others the analysis of inversions in labelled tree families [56], stopping times in urn models [47,45], node degrees in increasing trees [44], block sizes in $k$-Stirling permutations [45], descendants in increasing trees [42], and ancestors and descendants in evolving $k$-tree models [57], pairs of random variables $X$ and $Y$ arise as limiting distributions for certain parameters of interest associated with the combinatorial structures. The random variable $X$ can usually be determined via its (power) moment sequence $(\mu_s)_{s\in\mathbb{N}}$, and the random variable $Y$ in terms of the sequence of factorial moments satisfying relation (6). An open problem was to understand in more detail the nature of the random variable $Y$. In [56,47] a few results in this direction were obtained. The goal of this work is twofold: first, to survey the properties of mixed Poisson distributions, and second, to discuss their appearances in combinatorics and the analysis of random discrete structures, complementing existing results; we will also add a few entirely new results. It will turn out that the identification of the distribution of $Y$ can be directly solved using mixed Poisson distributions, which are widely used in applied probability theory, see for example [53,70,52,37,15]. In the analysis of random discrete structures mixed Poisson distributions have been used mainly in the context of Poisson approximation, see e.g. [26]. In this work we point out the appearance of mixed Poisson distributions as genuine limiting distributions, and also present closely related phase transitions. In particular, we discuss natural occurrences of mixed Poisson distributions in urn models of a non-standard nature: either the size of the urn or the initial conditions are allowed to depend on the discrete time.
1.2. Notation and Terminology. We denote with $\mathbb{R}^+$ the non-negative real numbers. Here and throughout this work we use the notation $x^{\underline{s}} = x(x-1)\cdots(x-(s-1))$ for the falling factorials, and $x^{\overline{s}} = x(x+1)\cdots(x+s-1)$ for the rising factorials.$^{1}$ Moreover, we denote with ${s \brace k}$ the Stirling numbers of the second kind. We use the notation $U \stackrel{\mathcal{L}}{=} V$ for the equality in distribution of random variables $U$ and $V$, and $U_n \xrightarrow{\mathcal{L}} V$ denotes the convergence in distribution of a sequence of random variables $U_n$ to $V$. The indicator variable of the event $A$ is denoted by $\mathbb{1}_A$. Throughout this work the term "convergence of all moments" of a sequence of random variables refers exclusively to the convergence of all non-negative integer moments. Furthermore, we denote with $E_v$ the evaluation operator of the variable $v$ to the value $v = 1$, and with $D_v$ the differential operator with respect to $v$.
1.3. Plan of the paper. In the next section we state the definition of mixed Poisson distributions and discuss their properties. In Section 3 we collect several examples from the literature, unifying and extending earlier results. Furthermore, in Section 4 we present a novel approach to balanced triangular urn models and their relation to mixed Poisson distributions. Section 5 is devoted to new results concerning mixed Poisson distributions with Rayleigh mixing distribution; in particular, we discuss node isolation in Cayley trees, directed lattice paths and zero contacts, and also cyclic points in random mappings. Finally, in Section 6 we discuss multivariate mixed Poisson distributions.

MOMENT SEQUENCES AND MIXED POISSON DISTRIBUTIONS
2.1. Discrete distributions and factorial moments. In order to obtain a random variable $Y$ with a prescribed sequence of factorial moments, given according to Equation (6) by $E(Y^{\underline{s}}) = \rho^s \mu_s$, a first ansatz would be the following. Let $Y$ denote a discrete random variable supported on the non-negative integers, and $p(v)$ its probability generating function,
$$p(v) = E(v^Y) = \sum_{\ell \ge 0} P\{Y = \ell\}\, v^{\ell}.$$
The factorial moments of $Y$ can be obtained from the probability generating function by repeated differentiation,
$$E(Y^{\underline{s}}) = E_v D_v^s\, p(v), \quad s \ge 0. \tag{7}$$

$^{1}$ The notation $x^{\underline{s}}$ and $x^{\overline{s}}$ was introduced and popularized by Knuth; alternative notations for the falling factorials include the Pochhammer symbol $(x)_s$, which is unfortunately sometimes also used for the rising factorials.
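Consequently, we can describe the probability mass function of the random variable $Y$ as follows: expanding $p(v)$ at $v = 1$ gives
$$p(v) = \sum_{s \ge 0} E(Y^{\underline{s}})\, \frac{(v-1)^s}{s!} = \sum_{s \ge 0} \rho^s \mu_s\, \frac{(v-1)^s}{s!}, \quad \mu_0 = 1.$$
This implies that
$$P\{Y = \ell\} = [v^{\ell}]\, p(v) = \sum_{s \ge \ell} (-1)^{s-\ell} \binom{s}{\ell} \frac{\rho^s \mu_s}{s!}, \quad \ell \ge 0. \tag{8}$$
Up to now the calculations have been purely symbolic; no convergence issues have been addressed. In order to put the calculations above on solid ground, and to identify the distribution, we discuss mixed Poisson distributions and their properties in the next subsection.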

2.2. Properties of mixed Poisson distributions.
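Definition 1. Let $X$ denote a non-negative random variable with cumulative distribution function $\Lambda(\cdot)$. Then the discrete random variable $Y$ with probability mass function given by
$$P\{Y = \ell\} = \frac{1}{\ell!} \int_{\mathbb{R}^+} x^{\ell} e^{-x}\, d\Lambda(x), \quad \ell \ge 0,$$
has a mixed Poisson distribution with mixing distribution $X$, in symbols $Y \stackrel{\mathcal{L}}{=} \mathrm{MPo}(X)$.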
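The boundary case $X \stackrel{\mathcal{L}}{=} 0$ leads to a degenerate distribution with all mass concentrated at zero. A more compact notation for the probability mass function of $Y$ is sometimes used instead of the one given above: $P\{Y = \ell\} = \frac{1}{\ell!} E(X^{\ell} e^{-X})$. One often encounters a slightly different definition, which includes a scale parameter $\rho \ge 0$:
$$P\{Y = \ell\} = \frac{\rho^{\ell}}{\ell!} \int_{\mathbb{R}^+} x^{\ell} e^{-\rho x}\, d\Lambda(x), \quad \ell \ge 0,$$
in symbols $Y \stackrel{\mathcal{L}}{=} \mathrm{MPo}(\rho X)$. In particular, the ordinary Poisson distribution $\mathrm{Po}(\rho)$ with parameter $\rho$, $P\{Y = \ell\} = \frac{\rho^{\ell}}{\ell!} e^{-\rho}$, arises as a mixed Poisson distribution with degenerate mixing distribution $X \stackrel{\mathcal{L}}{=} 1$.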
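Example 3. A Rayleigh distributed r.v. $X \stackrel{\mathcal{L}}{=} \mathrm{Rayleigh}(\sigma)$ with parameter $\sigma$ has the probability density function
$$f(x) = \frac{x}{\sigma^2}\, e^{-x^2/(2\sigma^2)}, \quad x \ge 0,$$
and is fully characterized by its (power) moment sequence:
$$E(X^s) = \sigma^s\, 2^{s/2}\, \Gamma\Bigl(\frac{s}{2} + 1\Bigr), \quad s \ge 0.$$
A discrete random variable $Y$ with probability mass function
$$P\{Y = \ell\} = \frac{\rho^{\ell}}{\ell!} \int_0^{\infty} x^{\ell+1} e^{-\rho x - x^2/2}\, dx, \quad \ell \ge 0,$$
arises as a mixed Poisson distribution $Y \stackrel{\mathcal{L}}{=} \mathrm{MPo}(\rho X)$ with mixing distribution $X \stackrel{\mathcal{L}}{=} \mathrm{Rayleigh}(1)$ and scale parameter $\rho$. We call $Y$ a Poisson-Rayleigh distribution with parameter $\rho$. Note that for $\rho < 1$ we can expand $e^{-\rho x}$ and obtain a series representation of $P\{Y = \ell\}$. Another representation, valid for all $\rho > 0$, can be stated in terms of the incomplete gamma function $\Gamma(s, x) = \int_x^{\infty} t^{s-1} e^{-t}\, dt$.

Example 4. The Neyman Type A distribution is a discrete probability distribution often used in biology and ecology [53,52]. It is a mixed Poisson distribution with mixing distribution $X \stackrel{\mathcal{L}}{=} \mathrm{Po}(\lambda)$ given by an (ordinary) Poisson distribution with parameter $\lambda$, scaled by $\rho$:
$$P\{Y = \ell\} = \frac{\rho^{\ell}}{\ell!} \sum_{j \ge 0} j^{\ell} e^{-\rho j}\, \frac{\lambda^j e^{-\lambda}}{j!}, \quad \ell \ge 0.$$

For a very comprehensive list of examples of mixed Poisson distributions we refer the reader to the article of Willmot [70]. Since by (3) the factorial moments $E(Y^{\underline{s}})$ are related to the ordinary moments in terms of the Stirling numbers of the second kind, the moment sequence of $Y$ is the (scaled) Stirling transform of the moment sequence of $X$. Next we collect similar basic properties of mixed Poisson distributions.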
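As a numerical illustration of Example 3, the following sketch evaluates the Poisson-Rayleigh probabilities directly from the integral $P\{Y=\ell\} = \frac{\rho^\ell}{\ell!}\int_0^\infty x^{\ell+1} e^{-\rho x - x^2/2}\,dx$ by Simpson's rule. The function name, the cutoff `upper`, and the step count are assumptions chosen for illustration; they are adequate only for moderate values of $\ell$ and $\rho$.

```python
import math


def poisson_rayleigh_pmf(l, rho, upper=40.0, steps=8000):
    """Approximate P{Y = l} for Y = MPo(rho * X), X ~ Rayleigh(1).

    Uses composite Simpson's rule on [0, upper]; `steps` must be even.
    """
    def integrand(x):
        return x ** (l + 1) * math.exp(-rho * x - 0.5 * x * x)

    h = upper / steps
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return rho ** l / math.factorial(l) * total * h / 3
```

A quick sanity check is that the probabilities sum to one and that the mean equals $\rho\, E(X) = \rho \sqrt{\pi/2}$, in agreement with the first factorial moment $\rho \mu_1$.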
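Proposition 1. Let $X$ denote a non-negative random variable with finite moments $\mu_s = E(X^s)$, and let $Y \stackrel{\mathcal{L}}{=} \mathrm{MPo}(\rho X)$.
(a) The factorial moments of $Y$ are given by $E(Y^{\underline{s}}) = \rho^s \mu_s$, $s \ge 1$.
(b) The power moments of $Y$ and $X$ are related by the generalized Stirling transform with parameter $\rho$, and its inverse, respectively:
$$E(Y^s) = \sum_{k=1}^{s} \rho^k {s \brace k}\, \mu_k, \qquad \mu_s = \frac{1}{\rho^s} \sum_{k=1}^{s} (-1)^{s-k} {s \brack k}\, E(Y^k), \quad s \ge 1.$$
Similarly, the cumulants of $Y$ and $X$ are related by the generalized Stirling transform with parameter $\rho$, and its inverse, respectively.
(c) The moment generating functions $\varphi(z) = E(e^{zY})$ and $\psi(z) = E(e^{zX})$ are related by the (generalized) Stirling transform of functions and its inverse, respectively:
$$\varphi(z) = \psi\bigl(\rho(e^z - 1)\bigr), \qquad \psi(z) = \varphi\bigl(\log(1 + \tfrac{z}{\rho})\bigr).$$

Proof. (b) By converting $Y^s$ into falling factorials via (3), the sequence of ordinary power moments $(E(Y^s))_{s\in\mathbb{N}}$ of a mixed Poisson distributed random variable $Y$ is given by the Stirling transform of the moments of the mixing distribution in the following way:
$$E(Y^s) = \sum_{k=0}^{s} {s \brace k}\, E(Y^{\underline{k}}) = \sum_{k=0}^{s} {s \brace k}\, \rho^k \mu_k. \tag{10}$$
The result concerning the moment generating function in (c) can be shown similarly to (4) by directly computing $E(e^{zY})$, interchanging integration and summation:
$$\varphi(z) = \sum_{\ell \ge 0} e^{z\ell}\, \frac{\rho^{\ell}}{\ell!} \int_{\mathbb{R}^+} x^{\ell} e^{-\rho x}\, d\Lambda(x) = \int_{\mathbb{R}^+} e^{\rho x (e^z - 1)}\, d\Lambda(x).$$
By definition, the latter expression is exactly $\psi\bigl(\rho(e^z - 1)\bigr)$, where $\psi(z) = E(e^{zX})$ denotes the moment generating function of the mixing distribution $X$. If the cdf of $X$ is not known, we can compute the moment generating function $\varphi(z)$ of $Y$ utilizing only the moment sequences:
$$\varphi(z) = \sum_{s \ge 0} E(Y^s)\, \frac{z^s}{s!} = \sum_{s \ge 0} \frac{z^s}{s!} \sum_{k=0}^{s} {s \brace k}\, \rho^k \mu_k.$$
Using the bivariate generating function identity of the Stirling numbers of the second kind (see Wilf [69]),
$$\sum_{n \ge 0} \sum_{k \ge 0} {n \brace k} \frac{z^n}{n!}\, u^k = e^{u(e^z - 1)},$$
we obtain further
$$\varphi(z) = \sum_{k \ge 0} \rho^k \mu_k\, \frac{(e^z - 1)^k}{k!}.$$
The latter expression is exactly the Stirling transform of $\psi(z) = \sum_{j \ge 0} \mu_j \frac{z^j}{j!}$, in other words the moment generating function of $X$ evaluated at $\rho(e^z - 1)$. The relation for the cumulants now follows readily from (c), since the cumulant generating functions $k_X(z)$ and $k_Y(z)$ of $X$ and $Y$ are given by $k_X(z) = \log(\psi(z))$ and $k_Y(z) = \log(\varphi(z))$. For a proof of part (d) we refer the reader to Johnson, Kotz and Kemp [36].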
In the applied probability literature, see [37,70], given $Y \stackrel{\mathcal{L}}{=} \mathrm{MPo}(\rho X)$ it is usually assumed that the cumulative distribution function of the mixing distribution $X$ is known. However, in many cases in the analysis of random discrete structures the mixing distribution $X$ is solely determined by the sequence of moments $E(X^s) = \mu_s \in \mathbb{R}^+$, $s \ge 1$. Hence, it is beneficial to express the probability mass function of a mixed Poisson distributed random variable solely in terms of the moments of $X$, justifying (8). Note that for specific mixed Poisson distributions different, simpler formulas may exist (compare with Corollary 2).

Proposition 2. Let $X$ denote a random variable with moment sequence $(\mu_s)_{s\in\mathbb{N}}$ such that $\psi(z) = E(e^{zX})$ exists in a neighbourhood of zero, including the value $z = -\rho$. A random variable $Y$ with factorial moments given by $E(Y^{\underline{s}}) = \rho^s \mu_s$ has a mixed Poisson distribution $Y \stackrel{\mathcal{L}}{=} \mathrm{MPo}(\rho X)$ with mixing distribution $X$ and scale parameter $\rho > 0$, and the sequence of power moments of $Y$ is the Stirling transform of the moment sequence $(\mu_s)_{s\in\mathbb{N}}$. The probability mass function of $Y$ is given by
$$P\{Y = \ell\} = \sum_{s \ge \ell} (-1)^{s-\ell} \binom{s}{\ell} \frac{\rho^s \mu_s}{s!}, \quad \ell \ge 0.$$

Proof. By our assumption on the existence of $\psi(z)$ in a neighbourhood of zero, it follows that $\varphi(z)$ is also analytic around $z = 0$, and the random variable $Y$ is uniquely determined by its (factorial) moments. Consequently, $Y$ has a mixed Poisson distribution. Moreover, the probability mass function of $Y$ is obtained by
$$P\{Y = \ell\} = [v^{\ell}]\, p(v) = [v^{\ell}] \sum_{s \ge 0} \rho^s \mu_s\, \frac{(v-1)^s}{s!} = \sum_{s \ge \ell} (-1)^{s-\ell} \binom{s}{\ell} \frac{\rho^s \mu_s}{s!}.$$
Alternatively, the formula for the probability mass function can formally be obtained directly from the definition:
$$P\{Y = \ell\} = \frac{\rho^{\ell}}{\ell!} \int_{\mathbb{R}^+} x^{\ell} e^{-\rho x}\, d\Lambda(x) = \frac{\rho^{\ell}}{\ell!} \int_{\mathbb{R}^+} x^{\ell} \sum_{j \ge 0} \frac{(-\rho x)^j}{j!}\, d\Lambda(x).$$
Interchanging summation and integration leads to the stated result.
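The moment-only formula for the probability mass function, $P\{Y=\ell\} = \sum_{s\ge \ell} (-1)^{s-\ell}\binom{s}{\ell}\rho^s\mu_s/s!$, can be checked numerically. The sketch below truncates the alternating series after a fixed number of terms; the function name and the truncation length `terms` are illustrative assumptions, and plain floating point is only reliable when the series converges quickly (moderate $\rho$ and $\ell$).

```python
import math


def mixed_poisson_pmf_from_moments(l, rho, mu, terms=80):
    """Truncated series for P{Y = l}, Y = MPo(rho * X), from the moments mu(s) = E(X^s)."""
    return sum(
        (-1) ** (s - l) * math.comb(s, l) * rho ** s * mu(s) / math.factorial(s)
        for s in range(l, l + terms)
    )


# Degenerate mixing distribution X = 1: mu_s = 1, so Y should be Po(rho).
poisson_check = mixed_poisson_pmf_from_moments(2, 1.5, lambda s: 1.0)

# Rayleigh(1) mixing distribution: mu_s = 2^(s/2) * Gamma(1 + s/2).
rayleigh_moment = lambda s: 2 ** (s / 2) * math.gamma(1 + s / 2)
rayleigh_probs = [mixed_poisson_pmf_from_moments(l, 0.5, rayleigh_moment) for l in range(40)]
```

For the degenerate case the series collapses to $\frac{\rho^\ell}{\ell!}e^{-\rho}$, which recovers the ordinary Poisson distribution.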
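2.3. The method of moments and basic limit laws. The method of moments is a classical way of deriving limit laws (see for example Hwang and Neininger [28] and the references therein). Given a sequence of random variables $(X_n)_{n\in\mathbb{N}}$ one first derives asymptotic expansions of the power moments; assume that the moments satisfy the asymptotic expansion
$$E(X_n^s) = \lambda_n^s \cdot \mu_s \cdot (1 + o(1)), \quad s \ge 1, \tag{13}$$
with $\lambda_n$ denoting non-negative scale parameters. Then, one considers the scaled random variables $X_n/\lambda_n$, and tries to prove convergence in distribution of $X_n/\lambda_n$ by using the Fréchet-Shohat moment convergence theorem [48]: if the power moments of $X_n/\lambda_n$ converge to the moments $(\mu_s)_{s\in\mathbb{N}}$, and the moment sequence $(\mu_s)_{s\in\mathbb{N}}$ determines a unique non-degenerate distribution, then the random variable $X_n/\lambda_n$ converges in distribution to $X$. A well-known sufficient criterion for the uniqueness of the distribution of $X$ is Carleman's condition: the distribution of $X$ is uniquely determined if
$$\sum_{s \ge 1} \mu_{2s}^{-1/(2s)} = \infty. \tag{14}$$
Note that (14) is satisfied whenever $E(e^{zX})$ exists in a neighbourhood of zero. We obtain the following result concerning mixed Poisson distributions.

Lemma 1. Let $Y \stackrel{\mathcal{L}}{=} \mathrm{MPo}(\rho X)$. The moments of $Y$ satisfy Carleman's condition if and only if the moments of $X$ satisfy Carleman's condition. Moreover, $\varphi(z) = E(e^{zY})$ exists in a neighbourhood of zero if and only if $\psi(z) = E(e^{zX})$ does.

Proof. Note first that the second part follows directly from Proposition 1 part (c). Assume now that the moments of $Y$ satisfy Carleman's condition. We observe that the moments $(\mu_s)_{s\in\mathbb{N}}$ of $X$ are bounded by the scaled power moments of $Y$,
$$\mu_s = \frac{E(Y^{\underline{s}})}{\rho^s} \le \frac{E(Y^s)}{\rho^s}.$$
Consequently, the distribution of $X$ is also uniquely determined by its moment sequence:
$$\sum_{s \ge 1} \mu_{2s}^{-1/(2s)} \ge \rho \sum_{s \ge 1} E(Y^{2s})^{-1/(2s)} = \infty.$$
Conversely, assume that the moments of $X$ satisfy Carleman's condition (14). The $s$-th power moment of $Y$ can be estimated using the $s$-th factorial moment of $Y$ in the following way: since $y^s \le (2s)^s$ for $y < 2s$ and $y^s \le 2^s y^{\underline{s}}$ for integers $y \ge 2s$, we have
$$E(Y^s) \le (2s)^s + 2^s E(Y^{\underline{s}}).$$
This implies that
$$E(Y^s) \le (2s)^s + (2\rho)^s \mu_s \le 2 \max\{(2s)^s, (2\rho)^s \mu_s\}.$$
Consequently,
$$E(Y^{2s})^{-1/(2s)} \ge 2^{-1/(2s)} \min\Bigl\{\frac{1}{4s},\ \frac{1}{2\rho}\, \mu_{2s}^{-1/(2s)}\Bigr\}.$$
By Hölder's inequality, the moments of $X \ge 0$ satisfy for $0 < r < s$ the inequality $\mu_r^{1/r} \le \mu_s^{1/s}$. Hence, for integer $0 < r < s$ we have $\mu_{2s}^{-1/(2s)} \le \mu_{2r}^{-1/(2r)}$, so that the sequence $m_s := \mu_{2s}^{-1/(2s)}$ is non-increasing. It remains to prove that
$$\sum_{s \ge 1} \min\Bigl\{\frac{1}{s},\ m_s\Bigr\} = \infty, \tag{15}$$
which immediately implies the required result; note that we omitted the additional constant factors for the sake of simplicity. If $m_s$ is bounded away from zero this is immediately true. Hence, we assume in the following that $(m_s)_{s\in\mathbb{N}}$ is a null sequence.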
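Let $\mathbb{N} = I_1 \cup I_2$, with $I_1 \cap I_2 = \emptyset$, such that for all $s \in I_1$ we have $\frac{1}{s} \le m_s$, and for $s \in I_2$ we have $\frac{1}{s} > m_s$. We obtain
$$\sum_{s \ge 1} \min\Bigl\{\frac{1}{s},\ m_s\Bigr\} = \sum_{s \in I_1} \frac{1}{s} + \sum_{s \in I_2} m_s.$$
By our initial assumption $\sum_{s \ge 1} m_s = \infty$, equation (15) is directly satisfied if either $I_1$ or $I_2$ is finite. Hence, we assume that both sets are infinite. Assume further that $\sum_{s \in I_1} \frac{1}{s}$ is finite. We can write the set $I_1$ as the disjoint union of infinitely many finite length intervals,
$$I_1 = \bigcup_{\ell \ge 1} [a_\ell, b_\ell],$$
with $[a_\ell, b_\ell] := \{a_\ell, a_\ell + 1, \dots, b_\ell\}$ and $a_\ell, b_\ell \in \mathbb{N}$ for all $\ell \in \mathbb{N}$. If all but finitely many intervals are of length one, such that $a_\ell = b_\ell$, the values $s$ with $\min\{\frac{1}{s}, m_s\} = \frac{1}{s}$ are essentially isolated. In this case we note that for such an isolated value $s \in I_1$ we have $s - 1 \in I_2$, and use for $s \ge 2$ the inequality
$$m_s \le m_{s-1} < \frac{1}{s-1} \le \frac{2}{s}.$$
This implies that $\sum_{s \in I_1} m_s$ is finite too, such that $\sum_{s \in I_2} m_s$ is infinite. Finally, we assume that infinitely many intervals are of length greater than or equal to two. By our earlier assumption $\sum_{s \in I_1} \frac{1}{s}$ is finite and satisfies
$$\sum_{s \in I_1} \frac{1}{s} = \sum_{\ell \ge 1} \sum_{s = a_\ell}^{b_\ell} \frac{1}{s} \ge \sum_{\ell \ge 1} \ln\frac{b_\ell}{a_\ell}.$$
Furthermore, $\ln\frac{b_\ell}{a_\ell} < 1$ for all sufficiently large $\ell$, such that $b_\ell < e\, a_\ell$. This implies that for $k \in [a_\ell, b_\ell]$ and sufficiently large $\ell$
$$m_k \le m_{a_\ell - 1} < \frac{1}{a_\ell - 1} \le \frac{2}{a_\ell} < \frac{2e}{b_\ell} \le \frac{2e}{k}.$$
Hence, $m_k \le \frac{2e}{k}$. Combining this with our previous argument for the essentially isolated values we deduce that $\sum_{s \in I_1} m_s$ is finite too, such that $\sum_{s \in I_2} m_s = \infty$.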
In the analysis of random discrete structures one usually encounters discrete distributions, which are supported on (a subset of) the non-negative integers. It is convenient to use factorial moments instead of power moments, since they can be directly obtained from the probability generating function by repeated differentiation, see (7). Mixed Poisson distributions and a related phase transition naturally occur if the factorial moments, instead of the power moments, satisfy asymptotic expansions similar to (13).
Lemma 2 (Factorial moments and limit laws of mixed Poisson type). Let $(X_n)_{n\in\mathbb{N}}$ denote a sequence of random variables, whose factorial moments are asymptotically of mixed Poisson type, satisfying for $n$ tending to infinity the asymptotic expansion
$$E(X_n^{\underline{s}}) = \lambda_n^s \cdot \mu_s \cdot (1 + o(1)), \quad s \ge 1,$$
with $\mu_s \ge 0$ and $\lambda_n > 0$. Assume that the moment sequence $(\mu_s)_{s\in\mathbb{N}}$ determines a unique distribution $X$, satisfying Carleman's condition. Then, the following limit distribution results hold:
(i) if $\lambda_n \to \infty$ for $n \to \infty$, the random variable $X_n/\lambda_n$ converges in distribution, with convergence of all moments, to $X$.
(ii) if $\lambda_n \to \rho \in (0, \infty)$ for $n \to \infty$, the random variable $X_n$ converges in distribution, with convergence of all moments, to a mixed Poisson distributed random variable $Y \stackrel{\mathcal{L}}{=} \mathrm{MPo}(\rho X)$.
Moreover, the random variable $Y \stackrel{\mathcal{L}}{=} \mathrm{MPo}(\rho X)$ converges for $\rho \to \infty$, after scaling, to its mixing distribution $X$: $Y/\rho \xrightarrow{\mathcal{L}} X$, with convergence of all moments.
Remark 1. It may be possible to unify cases (i) and (ii) for arbitrary sequences $\lambda_n$ by a suitable result for the distance between the random variables $X_n$ and $Y_n \stackrel{\mathcal{L}}{=} \mathrm{MPo}(\lambda_n X)$.

Remark 2.
The results above complement the standard case when the distribution of $X$ degenerates, $X \stackrel{\mathcal{L}}{=} 1$. The random variables $X_n$ are then asymptotically Poisson distributed with parameter $\lambda_n$. Thus, the distribution of $X_n/\lambda_n$ degenerates for $\lambda_n \to \infty$, since we expect a central limit theorem for $(X_n - \lambda_n)/\sqrt{\lambda_n}$. It might also be necessary for non-degenerate $X$ to consider centered random variables similar to $X_n^{*} = X_n - \lambda_n$, and their (factorial) moments, instead of $X_n$.

Remark 3. The result above can be strengthened to also include the degenerate case $\lambda_n \to 0$, such that $X_n \xrightarrow{\mathcal{L}} 0$. It suffices to prove that $E(X_n) \to 0$ and E(X ...

Remark 4 (Moment generating functions and limit laws of mixed Poisson type). Let $\psi(z) = E(e^{zX})$ denote the moment generating function of $X$. If the moment generating function $\varphi(z) = E(e^{zX_n})$ satisfies for $n \to \infty$ the asymptotic expansion
$$\varphi(z) = \psi\bigl(\lambda_n (e^z - 1)\bigr) \cdot (1 + o(1)),$$
then the conclusion of the lemma above, convergence in distribution, still holds, but a priori without moment convergence. On the other hand, if the moments $(\mu_s)_{s\in\mathbb{N}}$ do not determine a unique distribution, one still obtains by the lemma above convergence of integer moments, but one cannot deduce convergence in distribution.
Remark 5. In the analysis of random discrete structures the random variables $X_n$ often depend on an additional parameter describing or measuring a certain local aspect of the combinatorial structure, such that $X_n = X_{n,j}$. Moreover, the expansion of the factorial moments often depends on this parameter in a crucial way. A quite common situation (see [42,44,57,47] and also [32,16]) is the following dichotomy for the asymptotic expansion of the factorial moments:
$$E(X_{n,j}^{\underline{s}}) = \begin{cases} \lambda_{n,0}^s \cdot \mu_{s,j} \cdot (1 + o(1)), & j \text{ fixed}, \\ \lambda_{n,1}^s \cdot \mu_s \cdot (1 + o(1)), & j \to \infty, \end{cases}$$
where $\lambda_{n,0}$ is independent of $j$, but $\lambda_{n,1} = \lambda_{n,1}(j)$ also depends on the growth of this additional parameter $j$ compared to $n$. Consequently, one encounters one additional family of limit laws when $j$ is fixed, determined by the moment sequence $(\mu_{s,j})_{s\in\mathbb{N}}$. Note that in all presented examples the following additional property holds for $s \ge 1$: $\Lambda_j^s\, \mu_{s,j} \to \mu_s$ for $j \to \infty$, where $\Lambda_j$ denotes an additional scale parameter; compare with Remarks 6, 7, and 13.
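Proof. By (3) the power moments of $X_n$ satisfy the following asymptotic expansion:
$$E(X_n^s) = \sum_{k=0}^{s} {s \brace k}\, E(X_n^{\underline{k}}) = \sum_{k=0}^{s} {s \brace k}\, \lambda_n^k\, \mu_k\, (1 + o(1)).$$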
If $\lambda_n \to \infty$ for $n \to \infty$, the term $k = s$ dominates and we obtain further the expansion
$$E(X_n^s) = \lambda_n^s\, \mu_s\, (1 + o(1)).$$
Consequently, the moments of $X_n/\lambda_n$ converge to the moments $\mu_s$ of the mixing distribution. By the Fréchet-Shohat moment convergence theorem and the moments of $X$ satisfying Carleman's condition, this proves convergence in distribution. Furthermore, for $\lambda_n \to \rho$ for $n \to \infty$, we directly obtain
$$E(X_n^s) \to \sum_{k=0}^{s} {s \brace k}\, \rho^k \mu_k.$$
Consequently, the moments of $X_n$ converge to the moments of a mixed Poisson distributed random variable $Y \stackrel{\mathcal{L}}{=} \mathrm{MPo}(\rho X)$, which is uniquely determined by its moment sequence, according to Lemma 1 and our assumption on the moments of $X$. Finally, an identical argument proves that a mixed Poisson distributed random variable $Y \stackrel{\mathcal{L}}{=} \mathrm{MPo}(\rho X)$ converges, after scaling by $\rho$, to its mixing distribution for $\rho \to \infty$.
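The last statement of the proof can be made concrete: the scaled power moments $E((Y/\rho)^s) = \sum_{k} {s \brace k} \rho^{k-s}\mu_k$ approach $\mu_s$ as $\rho$ grows. The following sketch computes them exactly; the helper names are illustrative, and the choice of an exponential mixing distribution ($\mu_s = s!$) in the usage line is an assumption made only for the example.

```python
from fractions import Fraction
from math import factorial


def stirling2_row(s):
    """Row [S(s,0), ..., S(s,s)] of Stirling numbers of the second kind."""
    row = [1]
    for n in range(1, s + 1):
        new = [0] * (n + 1)
        for k in range(1, n + 1):
            new[k] = (k * row[k] if k < n else 0) + row[k - 1]
        row = new
    return row


def scaled_power_moment(s, rho, mu):
    """E((Y / rho)^s) for Y = MPo(rho * X), where mu(k) = E(X^k) and mu(0) = 1."""
    S = stirling2_row(s)
    rho = Fraction(rho)
    return sum(S[k] * rho ** (k - s) * mu(k) for k in range(s + 1))


# Exponential mixing distribution, mu_s = s!: the third scaled moment is 6 + 6/rho + 1/rho^2.
third_moment_rho_10 = scaled_power_moment(3, 10, factorial)
```

The lower-order terms carry negative powers of $\rho$, so the gap to $\mu_s$ closes at rate $1/\rho$.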

EXAMPLES AND APPLICATIONS
We present several appearances of mixed Poisson distributions in the analysis of random discrete structures, in particular various families of random trees, $k$-Stirling permutations, and urn models. We discuss several families of random trees where a mixed Poisson law arises as the limit law of a discrete random variable $X_{n,j}$. The parameter $n \in \mathbb{N}$ usually measures the size of the investigated trees, and $j$ denotes an additional parameter measuring or marking a certain aspect of the combinatorial structure, e.g. a node with a certain label $j$ of interest, often satisfying a natural constraint of the type $1 \le j \le n$ [42,44,45,56]. In the limit $n \to \infty$, with $j = j(n)$, phase transitions were observed according to the relative growth of $j$ with respect to $n$, i.e. $j = 1, 2, \dots$ being a constant independent of $n$, $j \to \infty$ but with $j = o(n)$, or $j \sim \rho \cdot n$, for fixed $\rho$. As mentioned in the introduction, we will unify and simplify earlier arguments, starting from explicit formulas for the factorial moments from the various works. These explicit formulas directly lead to mixed Poisson laws, using Lemmas 1 and 2, and Stirling's formula for the Gamma function
$$\Gamma(z) = \Bigl(\frac{z}{e}\Bigr)^{z} \sqrt{\frac{2\pi}{z}}\, \Bigl(1 + \mathcal{O}\Bigl(\frac{1}{z}\Bigr)\Bigr), \quad z \to \infty. \tag{16}$$
Besides, whenever possible we interpret the random variables in terms of urn models.
3.1. Block sizes in $k$-Stirling permutations. Stirling permutations were defined by Gessel and Stanley [24]. A Stirling permutation is a permutation of the multiset $\{1, 1, 2, 2, \dots, n, n\}$ such that, for each $i$, $1 \le i \le n$, the elements occurring between the two occurrences of $i$ are larger than $i$. E.g., 1122, 1221 and 2211 are Stirling permutations, whereas the permutations 1212 and 2112 of $\{1, 1, 2, 2\}$ are not. The name of these combinatorial objects is due to relations with the Stirling numbers, see [24] for details, and [39] for bijections with certain tree families. A straightforward generalization of Stirling permutations is to consider permutations of a more general multiset $\{1^k, 2^k, \dots, n^k\}$, with $k \in \mathbb{N}$ (we use in this context the notation $j^{\ell} := j, \dots, j$ with $\ell$ repetitions, for $\ell \ge 1$), such that for each $i$, $1 \le i \le n$, the elements occurring between two occurrences of $i$ are at least $i$. Such permutations, called $k$-Stirling permutations, have already been considered by Brenti [8,9], Park [59,60,61] and Janson [33,35]. These $k$-Stirling permutations can be generated in a sequential manner: we start with $1^k = 1 \cdots 1$ and insert the string $(n+1)^k$ at any position (anywhere, including first or last) in a given $k$-Stirling permutation of $\{1^k, 2^k, \dots, n^k\}$, $n \ge 1$. In the case $k = 3$, we have for example one permutation of order 1: 111; four permutations of order 2: 111222, 112221, 122211, 222111; etc.
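The sequential construction described above translates directly into a small generator. The sketch below is only meant for small $n$ and $k$, since the number of $k$-Stirling permutations of order $n$ grows like $\prod_{i=1}^{n-1}(ki+1)$; the function names are illustrative.

```python
def k_stirling_permutations(n, k):
    """All k-Stirling permutations of order n, built by inserting the block (label)^k."""
    perms = [(1,) * k]
    for label in range(2, n + 1):
        block = (label,) * k
        # Insert the new block at each of the len(p) + 1 possible positions.
        perms = [p[:i] + block + p[i:] for p in perms for i in range(len(p) + 1)]
    return perms


def is_k_stirling(p):
    """Check that all elements between the first and last occurrence of v are >= v."""
    for v in set(p):
        first = p.index(v)
        last = len(p) - 1 - p[::-1].index(v)
        if any(x < v for x in p[first:last + 1]):
            return False
    return True
```

For $k = 3$ and $n = 2$ this reproduces the four permutations 111222, 112221, 122211, 222111 listed above.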
A block in a $k$-Stirling permutation $\sigma = \sigma_1 \cdots \sigma_s$ is a substring $\sigma_a \cdots \sigma_b$, with $\sigma_a = \sigma_b$, that is maximal, i.e., which is not contained in any larger such substring. There is obviously at most one block for every $j \in \{1, 2, \dots, n\}$, extending from the first occurrence of $j$ to the last one; we say that $j$ forms a block if this substring is indeed a block, i.e., when it is not contained in a string $j' \cdots j'$, for some $j' < j$. It can be shown easily by induction that any $k$-Stirling permutation has a unique decomposition as a sequence of its blocks. For example, the 3-Stirling permutation $\sigma = 112233321445554777666$ has block decomposition
$$\sigma = [112233321]\,[445554]\,[777]\,[666].$$
The number $X_{n,\ell}$ of blocks of size $k \cdot \ell$ in a random $k$-Stirling permutation of order $n$ was studied in [45]. There, a simple exact expression for the factorial moments was derived. Depending on the growth of $\ell = \ell(n)$ as $n \to \infty$, two random variables $X$ and $Y$ arose as limiting distributions of $X_{n,\ell}$. The random variable $X$, with moment sequence (17), could be characterized using observations by Janson et al. [35], and Janson [32]. It has a density function $f(x)$, see (18). However, the characterization of the random variable $Y$ was incomplete; only the (factorial) moments were known. Using Lemma 2, we can fill this gap, extending the results of [45].
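The block decomposition can be computed in a single left-to-right pass: each block starts at the current position and ends at the last occurrence of its first symbol. The sketch below assumes its input is a valid $k$-Stirling permutation (it does not check this) and accepts any sliceable sequence, for instance a string of single-digit labels; the function name is illustrative.

```python
def block_decomposition(sigma):
    """Split a k-Stirling permutation into its sequence of blocks."""
    last = {value: index for index, value in enumerate(sigma)}  # last occurrence of each label
    blocks, start = [], 0
    while start < len(sigma):
        end = last[sigma[start]]
        blocks.append(sigma[start:end + 1])
        start = end + 1
    return blocks


blocks = block_decomposition("112233321445554777666")
sizes = [len(b) // 3 for b in blocks]  # block sizes in units of k = 3
```

For the example permutation the blocks have sizes $3k$, $2k$, $k$ and $k$ with $k = 3$.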
Corollary 1. The factorial moments of the random variable $X_{n,\ell}$, counting the number of blocks of size $k \cdot \ell$ in a random $k$-Stirling permutation of order $n$, are for $n \to \infty$ asymptotically of mixed Poisson type, with mixing distribution $X$, determined by its moments and density given by (17) and (18), and scale parameter $\lambda_{n,\ell}$:
(i) for $\ell = \ell(n)$ such that $\lambda_{n,\ell} \to \infty$ the random variable $X_{n,\ell}/\lambda_{n,\ell}$ converges in distribution, with convergence of all moments, to $X$.
(ii) for $\ell = \ell(n)$ such that $\lambda_{n,\ell} \to \rho \in (0, \infty)$ the random variable $X_{n,\ell}$ converges in distribution, with convergence of all moments, to a mixed Poisson distributed random variable $Y \stackrel{\mathcal{L}}{=} \mathrm{MPo}(\rho X)$, whose probability mass function can be stated explicitly.
Moreover, for $\rho \to \infty$, the random variable $Y/\rho$ converges in distribution to $X$, with convergence of all moments.
The result above can also be interpreted in terms of a suitable urn model. First we recall the definition of Pólya-Eggenberger urn models. We start with an urn containing $n$ white balls and $m$ black balls. The evolution of the urn occurs in discrete time steps. At every step a ball is drawn at random from the urn. The colour of the ball is inspected and then the ball is returned to the urn. According to the observed colour of the ball, balls are added or removed by the following rules. If a white ball has been drawn, we put into the urn $\alpha$ white balls and $\beta$ black balls, but if a black ball has been drawn, we put into the urn $\gamma$ white balls and $\delta$ black balls. The values $\alpha, \beta, \gamma, \delta \in \mathbb{Z}$ are fixed integer values and the urn model is specified by the $2 \times 2$ ball replacement matrix $\left(\begin{smallmatrix} \alpha & \beta \\ \gamma & \delta \end{smallmatrix}\right)$. This definition readily extends to higher dimensions, leading to $r \times r$ ball replacement matrices. Note that one may also consider $\alpha, \beta, \gamma, \delta \in \mathbb{R}$, defining the urn process as a Markov process; see Remark 1.11 of Janson [32]. One usually assumes that the urns are tenable: the process of drawing and adding or removing balls can be continued ad infinitum, never having to remove balls which are not present in the urn. Starting with $W_0 = w_0$ white balls and $B_0 = b_0$ black balls, one is then interested in the composition $(W_n, B_n)$ of the urn after $n$ draws. For a few recent results we refer the reader to [5,16,17,29,32,63].
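The evolution of a two-colour Pólya-Eggenberger urn is straightforward to simulate. The following is a minimal sketch with integer entries; the function name and the error raised for a non-tenable run are illustrative choices, not part of the model's definition.

```python
import random


def simulate_urn(matrix, white, black, draws, rng=None):
    """Run `draws` steps of a two-colour urn with replacement matrix ((a, b), (c, d)).

    Returns the final composition (W_n, B_n).
    """
    rng = rng or random.Random()
    (a, b), (c, d) = matrix
    for _ in range(draws):
        # A white ball is drawn with probability white / (white + black).
        if rng.random() * (white + black) < white:
            white, black = white + a, black + b
        else:
            white, black = white + c, black + d
        if white < 0 or black < 0:
            raise ValueError("urn is not tenable along this run")
    return white, black


# Classical Polya urn: one extra ball of the drawn colour is added at each step.
w, b = simulate_urn(((1, 0), (0, 1)), 1, 1, 100, random.Random(1))
```

For a balanced urn, where both rows of the matrix have the same sum, the total number of balls after $n$ draws is deterministic, which gives a simple consistency check for any simulation.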
Urn I. Consider a balanced urn with balls of $\ell + 2$ colours and let the random vector $(U_{n,0}, \dots, U_{n,\ell+1})$ count the number of balls of each colour at time $n$, with an $(\ell+2) \times (\ell+2)$ ball replacement matrix $M$. The initial configuration of the urn (it is convenient to start here at time 1) is given by $(U_{1,0}, \dots, U_{1,\ell+1}) = (2, k-1, 0, \dots, 0)$. It can be shown that the random variables $U_{n,i}$, with $1 \le i \le \ell$, described by the urn model are directly related to the random variables $X_{n,i}$, $1 \le i \le \ell$, which count the number of blocks of size $k i$ in a random $k$-Stirling permutation of order $n$. By Theorem 1 and the results of [45], this implies that the random variables $U_{n,i}$ occurring in the urn model undergo a phase transition according to the growth of $\ell$ with respect to $n$, from continuous to discrete, where the moments of the occurring random variables $X$ and $Y$ are related by the Stirling transform.

3.2. Diminishing Pólya-Eggenberger urn models. A classical example of a non-tenable urn model is the sampling without replacement urn with ball replacement matrix given by $\left(\begin{smallmatrix} -1 & 0 \\ 0 & -1 \end{smallmatrix}\right)$. The process of drawing and replacing balls ends after $n + m$ steps, starting with $n$ white and $m$ black balls. Here, one is interested in the number of white balls remaining after all black balls have been drawn. Several urn models of a similar non-tenable nature have recently received some attention under the name diminishing urn models, see [47] and the references therein.
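For the plain sampling without replacement urn the distribution of the number of remaining white balls can be written down by a direct counting argument, which is not taken from the text: the draw order is a uniformly random arrangement of $n$ white and $m$ black balls, and exactly $j$ white balls remain if the last $j$ positions are white and the position before them is black. This gives $P\{X = j\} = \binom{n+m-1-j}{m-1} / \binom{n+m}{m}$. The sketch below, with an illustrative function name, evaluates it exactly for $m \ge 1$.

```python
from fractions import Fraction
from math import comb


def remaining_white_pmf(n, m):
    """P{X = j}, j = 0..n, for the number X of white balls left once all m black balls are drawn."""
    total = comb(n + m, m)
    return [Fraction(comb(n + m - 1 - j, m - 1), total) for j in range(n + 1)]


pmf = remaining_white_pmf(5, 3)
mean = sum(j * p for j, p in enumerate(pmf))
```

In this example the mean equals $n/(m+1) = 5/4$.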
Urn II. Consider a possibly unbalanced generalized sampling without replacement urn model with ball replacement matrix $\left(\begin{smallmatrix} -\alpha & 0 \\ 0 & -\delta \end{smallmatrix}\right)$, $\alpha, \delta \in \mathbb{N}$. The initial configuration of the urn consists of $\alpha \cdot n$ white balls and $\delta \cdot m$ black balls. The random variable $X_{\delta m, \alpha n}$ counts the number of white balls when all black balls have been drawn.
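In [47] an explicit formula for the factorial moments of the random variable $\hat{X}_{\delta m, \alpha n} = X_{\delta m, \alpha n}/\alpha$ was obtained. Moreover, a random variable $Y$ arises in the limit, whose factorial moments are given by
$$E(Y^{\underline{s}}) = \rho^s\, \Gamma\Bigl(1 + \frac{\alpha s}{\delta}\Bigr), \quad s \ge 1. \tag{20}$$
Using a special case of Theorem 2 it was shown that $Y$ has a discrete distribution. However, the result of [47] contains a small gap: the moment sequence $\bigl(\Gamma(1 + \frac{\alpha s}{\delta})\bigr)_{s\in\mathbb{N}}$ only determines a unique distribution for $\alpha/\delta \le 2$, see [27]. Hence, only in this case do the (factorial) moments of $Y$ determine a unique distribution. Since a Weibull distributed random variable $X \stackrel{\mathcal{L}}{=} W_{\delta/\alpha, 1}$, with shape parameter $\frac{\delta}{\alpha}$, scale parameter 1, and density
$$f(t) = \frac{\delta}{\alpha}\, t^{\frac{\delta}{\alpha} - 1} e^{-t^{\delta/\alpha}}, \quad t \ge 0,$$
has moments $E(X^s) = \Gamma(1 + \frac{\alpha s}{\delta})$, we obtain the following characterization of $Y$, extending the result of [47].

Corollary 2. The factorial moments of $\hat{X}_{\delta m, \alpha n}$ are asymptotically of mixed Poisson type, with Weibull mixing distribution $X$ and scale parameter $\lambda_{m,n}$. Assume that $\alpha/\delta \le 2$:
(i) for $\lambda_{m,n} \to \infty$ the random variable $\hat{X}_{\delta m, \alpha n}/\lambda_{m,n}$ converges in distribution, with convergence of all moments, to $X$.
(ii) for $\lambda_{m,n} \to \rho \in (0, \infty)$ the random variable $\hat{X}_{\delta m, \alpha n}$ converges in distribution, with convergence of all moments, to a mixed Poisson distributed random variable $Y \stackrel{\mathcal{L}}{=} \mathrm{MPo}(\rho X)$. Its probability mass function admits an integral-free series representation; for $\alpha/\delta = 1$ it is a geometric distribution.

Remark 6. As shown in [47], for fixed $m$ the random variable $\hat{X}_{\delta m, \alpha n}/n$ converges to a power of a beta-distributed random variable $B$, with convergence of all moments. Note that the results above can be extended to all $\alpha, \delta \in \mathbb{N}$; however, for $\alpha/\delta > 2$ the method of moments cannot be used anymore. Instead, one has to directly analyze the probability generating function $h_{n,m}(v)$, which can be derived using stochastic processes [46].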
Proof. According to the definition of a mixed Poisson distributed random variable $Y \stackrel{\mathcal{L}}{=} \mathrm{MPo}(\rho X)$, it has factorial moments given by (20). In order to derive the integral-free series representation we proceed as follows. In the first case $\alpha/\delta < 1$ we can directly use Theorem 2, since the moment generating function of the mixing Weibull distribution $X$ exists at $-\rho$. In the remaining cases $\alpha/\delta \ge 1$ we use the definition and the density function of the Weibull distribution to get first
$$P\{Y = \ell\} = \frac{\rho^{\ell}}{\ell!} \int_0^{\infty} x^{\ell} e^{-\rho x}\, \frac{\delta}{\alpha}\, x^{\frac{\delta}{\alpha} - 1} e^{-x^{\delta/\alpha}}\, dx.$$
The case $\alpha/\delta = 1$ readily leads to the stated geometric distribution after using the obvious simplification $e^{-\rho x} e^{-x} = e^{-(1+\rho)x}$. The Gamma-function type integrals are readily evaluated and the stated result follows.
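The geometric case can be confirmed numerically. The sketch below rewrites the mixing variable as $X = T^{a}$ with $T$ exponentially distributed with parameter 1 and $a = \alpha/\delta$, so that $E(X^s) = \Gamma(1 + a s)$ and $P\{Y=\ell\} = \frac{\rho^\ell}{\ell!}\int_0^\infty t^{a\ell} e^{-\rho t^{a} - t}\,dt$; this substitution, the function name, and the integration cutoff are assumptions made for the illustration, and the integral is again approximated by Simpson's rule.

```python
import math


def weibull_mixed_poisson_pmf(l, rho, a, steps=20000):
    """Approximate P{Y = l} for Y = MPo(rho * X), X = T**a with T ~ Exp(1).

    Here a plays the role of alpha/delta; `steps` must be even.
    """
    upper = a * l + 60.0  # the integrand peaks near t = a * l and decays like exp(-t)
    h = upper / steps

    def integrand(t):
        return t ** (a * l) * math.exp(-rho * t ** a - t)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return rho ** l / math.factorial(l) * total * h / 3
```

For $a = 1$ the values should agree with the geometric probabilities $\rho^\ell/(1+\rho)^{\ell+1}$.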
3.3. Descendants in increasing trees. Increasing trees are labelled trees where the nodes of a tree of size $n$ are labelled by distinct integers of the set $\{1, \dots, n\}$ in such a way that each sequence of labels along any branch starting at the root is increasing. They have been introduced by Bergeron et al. [6], and can be combinatorially described as follows: given a so-called degree-weight sequence $(\varphi_k)_{k \ge 0}$, the corresponding degree-weight generating function $\varphi(t)$ is defined by $\varphi(t) := \sum_{k \ge 0} \varphi_k t^k$. The simple family of increasing trees $\mathcal{T}$ associated with a degree-weight generating function $\varphi(t)$ can be described by the formal recursive equation
$$\mathcal{T} = ① \times \bigl(\varphi_0 \cdot \{\epsilon\}\ \dot\cup\ \varphi_1 \cdot \mathcal{T}\ \dot\cup\ \varphi_2 \cdot \mathcal{T} * \mathcal{T}\ \dot\cup\ \varphi_3 \cdot \mathcal{T} * \mathcal{T} * \mathcal{T}\ \dot\cup\ \cdots \bigr) = ① \times \varphi(\mathcal{T}), \tag{21}$$
where ① denotes the node labelled by 1, $\times$ the Cartesian product, $\dot\cup$ the disjoint union, $*$ the partition product for labelled objects, and $\varphi(\mathcal{T})$ the substituted structure (see e.g. the books [23], [19]). Note that the elements of $\mathcal{T}$ are increasing plane trees, and that a tree with (out-)degrees $d_1, \dots, d_n$ is given weight $\prod_{i=1}^{n} \varphi_{d_i}$. A tree of order $n$ is chosen randomly with probabilities proportional to the weights, leading to the random trees of size $n$ from $\mathcal{T}$.
Let $T_n$ be the total weight of all such trees of order $n$. It follows from (21) that the exponential generating function $T(z) := \sum_{n\ge 1} T_n \frac{z^n}{n!}$ of the total weights satisfies the autonomous first order differential equation
$$T'(z) = \varphi(T(z)), \qquad T(0) = 0. \tag{22}$$
We consider tree families having degree-weights of one of the three following forms, as studied in [58]: where we used the abbreviations RECT for recursive trees, GPORT for generalized plane recursive trees, and $d$-INCT for $d$-ary increasing trees.
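The total weights can be computed mechanically from the degree-weights. The sketch below assumes that the differential equation for $T(z)$ has the standard form $T'(z) = \varphi(T(z))$, $T(0) = 0$, known from Bergeron et al. [6], and extracts $T_1,\dots,T_N$ by comparing power series coefficients. The function name `total_weights` and the three sample weight sequences (recursive trees with $\varphi(t) = e^t$, plane-oriented recursive trees with $\varphi(t) = 1/(1-t)$, binary increasing trees with $\varphi(t) = (1+t)^2$) are illustrative choices with all constants normalized.

```python
from fractions import Fraction
from math import factorial

def total_weights(phi, N):
    """Return [T_1, ..., T_N] for T'(z) = phi(T(z)), T(0) = 0.

    phi: list of degree-weights phi_0, ..., phi_{N-1}.
    """
    t = [Fraction(0)] * (N + 1)              # t[m] = T_m / m!
    for m in range(N):
        # coefficient of z^m in phi(T(z)); only t[1..m] is needed, since T(0) = 0
        power = [Fraction(1)] + [Fraction(0)] * m    # truncated series of T(z)^0
        coeff = Fraction(0)
        for k in range(m + 1):
            coeff += Fraction(phi[k]) * power[m]
            # multiply by T(z) and truncate at degree m
            power = [sum(power[i] * t[d - i] for i in range(d + 1))
                     for d in range(m + 1)]
        t[m + 1] = coeff / (m + 1)           # from (m+1) t_{m+1} = [z^m] phi(T(z))
    return [t[m] * factorial(m) for m in range(1, N + 1)]

if __name__ == "__main__":
    N = 6
    recursive = [Fraction(1, factorial(k)) for k in range(N)]
    plane_oriented = [1] * N
    binary = [1, 2, 1] + [0] * (N - 3)
    for name, phi in [("RECT", recursive), ("PORT", plane_oriented), ("binary", binary)]:
        print(name, [int(x) for x in total_weights(phi, N)])
```

For these three normalized families the computed values are $(n-1)!$, $(2n-3)!!$ and $n!$, respectively.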
Consequently, by solving (22), we obtain the exponential generating function $T(z)$ and the total weights $T_n$, Note that changing $\varphi_k$ to $ab^k\varphi_k$ for some positive constants $a$ and $b$ will affect the weights of all trees of a given order $n$ by the same factor $a^n b^{n-1}$, which does not affect the distribution of a random tree from the family. Hence, when considering random trees from these three classes, $\varphi_0$ is irrelevant and $c_1$ and $c_2$ are relevant only through the ratio $c_1/c_2$. (We may thus, if we like, normalize $\varphi_0 = 1$ and either $c_1$ or $|c_2|$, but not both.) It is convenient to set $c_1 = 1$ for (random) recursive trees, to use the parameter $\alpha := -1 - \frac{c_1}{c_2} > 0$ for (random) generalized plane recursive trees, and $d := \frac{c_1}{c_2} + 1 \in \{2, 3, \dots\}$ for (random) $d$-ary increasing trees. As shown by Panholzer and Prodinger [58], random trees in the three classes of families given in (23) can be grown as an evolution process in the following way.
The process, evolving in discrete time, starts with the root labelled by 1. At step i + 1 the node with label i + 1 is attached to any previous node v (with out-degree d(v)) of the already grown tree of order i with probabilities p(v) given by Moreover, it has been shown [58] that there are only the three classes of simple families that can be grown in this way (for suitable p(v)).
Let $D_{n,j}$ denote the random variable counting the number of descendants (the size of the subtree rooted at node $j$) of a specific node $j$, with $1 \le j \le n$, in a tree of size $n$. In [42] this random variable was studied for the three aforementioned tree families using a generating functions approach. In the following we collect, and somewhat simplify, the earlier results. One obtains a simple exact formula for the factorial moments of $\hat{D}_{n,j} = D_{n,j} - 1$ directly from the results of [42]: with $c_1, c_2$ as given in (23). Hence, for $n \to \infty$ and $j = j(n) \to \infty$, the factorial moments of $\hat{D}_{n,j}$ are of mixed Poisson type by Stirling's formula for the Gamma function (16), and Lemma 2 can be applied.
Corollary 3. The random variable $\hat{D}_{n,j}$, counting the number of descendants minus one of node $j$ in a random increasing tree of size $n$, has for $n \to \infty$ and $j = j(n) \to \infty$ factorial moments of mixed Poisson type with a Gamma mixing distribution $X \overset{\mathcal{L}}{=} \gamma(1, 1 + \frac{c_2}{c_1})$ and scale parameter $\lambda_{n,j} = \frac{n-j}{j}$:
$$E\big(\hat{D}_{n,j}^{\underline{s}}\big) = \lambda_{n,j}^{s}\, E(X^s)\,(1+o(1)).$$
(i) for $\lambda_{n,j} \to \infty$ the random variable $\hat{D}_{n,j}/\lambda_{n,j}$ converges in distribution, with convergence of all moments, to $X$.
(ii) for $\lambda_{n,j} \to \rho \in (0,\infty)$ the random variable $\hat{D}_{n,j}$ converges in distribution, with convergence of all moments, to a mixed Poisson distributed random variable $Y \overset{\mathcal{L}}{=} \mathrm{MPo}(\rho X)$, which has a negative binomial distribution.
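For random recursive trees the corollary can be checked by exact computation. The sketch below assumes the uniform attachment rule (node $i+1$ attaches to each of the $i$ existing nodes with probability $1/i$), which is the standard growth rule for recursive trees, and computes the law of $D_{n,j}$ by dynamic programming. The closed form $E(\hat{D}_{n,j}^{\underline{s}}) = (n-j)^{\underline{s}}\, s!/\big(j(j+1)\cdots(j+s-1)\big)$ used for comparison is the beta-binomial factorial moment of the corresponding Pólya urn; it is stated here as an assumption of this example, not quoted from [42]. All function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def descendants_distribution(n, j):
    """Exact law of D_{n,j} in a random recursive tree (uniform attachment assumed)."""
    dist = {1: Fraction(1)}                 # at time j the subtree of node j is {j}
    for size in range(j, n):                # the tree grows from `size` to `size + 1` nodes
        new = {}
        for d, p in dist.items():
            join = Fraction(d, size)        # the new node lands in the subtree of j
            new[d + 1] = new.get(d + 1, 0) + p * join
            new[d] = new.get(d, 0) + p * (1 - join)
        dist = new
    return dist

def factorial_moment(dist, s, shift=1):
    """E((D - shift)^{falling s}) for a finite law given as {value: probability}."""
    total = Fraction(0)
    for d, p in dist.items():
        term = Fraction(1)
        for i in range(s):
            term *= (d - shift - i)
        total += p * term
    return total

def closed_form(n, j, s):
    """(n-j)^{falling s} * s! / (j (j+1) ... (j+s-1)), the assumed beta-binomial formula."""
    num, den = Fraction(factorial(s)), Fraction(1)
    for i in range(s):
        num *= (n - j - i)
        den *= (j + i)
    return num / den

if __name__ == "__main__":
    n, j = 400, 100
    lam = Fraction(n - j, j)
    dist = descendants_distribution(n, j)
    for s in (1, 2, 3):
        exact = factorial_moment(dist, s)
        # ratio to lambda^s * s! approaches 1 as n and j grow (exponential mixing law)
        print(s, float(exact), float(exact / (lam ** s * factorial(s))))
```

The first factorial moment equals $\lambda_{n,j} = (n-j)/j$ exactly, and the higher ones approach $\lambda_{n,j}^s\, s!$, in line with an exponential mixing distribution for recursive trees.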
Remark 7. Note that for fixed $j$ the random variable $\hat{D}_{n,j}/n$ is asymptotically beta-distributed (see [42]). One readily recovers the mixing distribution $X$ from the beta-distributed limit $Z$ by taking the limit $j \to \infty$, using a well known result for beta-distributed random variables: with convergence of all moments.
Remark 8. Panholzer and Seitz [57] studied labelled families of evolving $k$-tree models, generalizing simple families of increasing trees. An identical phase change and factorial moments of mixed Poisson type with a Gamma mixing distribution can be observed when studying the number of descendants of a specific node in labelled families of evolving $k$-tree models.
The parameter descendants of node j can be modelled using urn models: we encounter classical Pólya urns with non-standard initial values, depending on the number of draws. Note that Mahmoud and Smythe [49] used a similar approach to study the descendants of node j in recursive trees, for fixed j compared to n.
Urn III (Descendants in Increasing trees: Pólya urn). Consider a Pólya urn with ball replacement matrix and initial conditions The number $D_{n,j}$ of descendants of node $j$ in an increasing tree of size $n$ has the same distribution as the (shifted and scaled) number of white balls $W_{n-j}$ in the Pólya urn after $n-j$ draws This implies that the number of white balls in the standard Pólya urn model exhibits a phase transition according to the growth of the initial number of black balls present in the urn compared to the discrete time.
3.4. Node-degrees in plane-oriented increasing trees. Let $X_{n,j}$ denote the random variable counting the outdegree of node $j$ in a generalized plane-oriented recursive tree of size $n$.
It has been shown in [44] using a generating functions approach that the factorial moments of the random variable $X_{n,j}$ are given by for $j \ge 2$ with $c_1, c_2$ as given in (23) such that $\alpha = -1 - \frac{c_1}{c_2}$. Lemma 2 and an application of Stirling's formula for the Gamma function (16) lead to the following result.
Corollary 4. The random variable $X_{n,j}$, counting the out-degree of node $j$ in a random generalized plane-oriented increasing tree of size $n$, $1 \le j \le n$, has for $n \to \infty$ and $j = j(n) \to \infty$ falling factorial moments of mixed Poisson type with a Gamma mixing distribution $X \overset{\mathcal{L}}{=} \gamma(1, \alpha)$ and scale parameter $\lambda_{n,j} = \big(\frac{n}{j}\big)^{1/(\alpha+1)} - 1$:
$$E\big(X_{n,j}^{\underline{s}}\big) = \lambda_{n,j}^{s}\, E(X^s)\,(1+o(1)).$$
(i) for $\lambda_{n,j} \to \infty$ the random variable $X_{n,j}/\lambda_{n,j}$ converges in distribution, with convergence of all moments, to $X$.
(ii) for $\lambda_{n,j} \to \rho \in (0,\infty)$ the random variable $X_{n,j}$ converges in distribution, with convergence of all moments, to a mixed Poisson distributed random variable $Y \overset{\mathcal{L}}{=} \mathrm{MPo}(\rho X)$, which has a negative binomial distribution.
Remark 9. The limit law for fixed $j$ has been determined in [44]: the random variable $X_{n,j}/n^{1/(\alpha+1)}$ converges in distribution to a random variable $Z$ characterized by its moments with a given density. In the next section we will discuss this model in more detail.
Urn IV. Consider a balanced triangular urn with ball replacement matrix The out-degree $X_{n,j}$ of node $j$ in a generalized plane-oriented increasing tree of size $n$ has the same distribution as the shifted number of white balls $W_{n-j}$ in the Pólya urn after $n-j$ draws This implies that the number of white balls in the standard Pólya urn model exhibits several phase transitions according to the growth of the initial number of black balls present in the urn with respect to the total number of draws; this will be discussed in detail in a more general setting in Section 4.

3.5. Branching structures in plane-oriented recursive trees. Let $X_{n,j,k}$ denote the random variable which counts the number of size-$k$ branches (= subtrees) attached to the node labelled $j$ in a random increasing tree of size $n$. The random variables $X_{n,j,k}$ are thus related to the random variable $X_{n,j}$ counting the outdegree of the node labelled $j$ by $X_{n,j} = \sum_{k=1}^{n-j} X_{n,j,k}$. This parameter was studied by Su et al. [67] for the case of the root node $j = 1$ and for the instance of random recursive trees: they derived the distribution of $X_{n,1,k}$ and a limit law for it. Further, they stated results for joint distributions. The analysis was extended in [43] to increasing tree families generated by a natural growth process (see Subsection 3.3). In particular, for generalized plane-oriented recursive trees with parameter $\alpha$ the following result was obtained for the factorial moments of $X_{n,j,k}$: In [43] only the case of fixed $k$ was considered. We can easily use Lemma 2 and Stirling's formula for the Gamma function (16) to obtain the following result.
Corollary 5. The random variable $X_{n,j,k}$, counting the number of size-$k$ branches attached to node $j$ in a random generalized plane-oriented increasing tree of size $n$, has for fixed $j$, $n \to \infty$ and $1 \le k \le n-j$ falling factorial moments of mixed Poisson type with mixing distribution $Z$ supported on $[0,\infty)$, and scale parameter (1)).
(i) for $\lambda_{n,j,k} \to \infty$ the random variable $X_{n,j,k}/\lambda_{n,j,k}$ converges in distribution, with convergence of all moments, to $Z$.
(ii) for $\lambda_{n,j,k} \to \rho \in (0,\infty)$ the random variable $X_{n,j,k}$ converges in distribution, with convergence of all moments, to a mixed Poisson distributed random variable $Y \overset{\mathcal{L}}{=} \mathrm{MPo}(\rho Z)$.
Remark 10. The results above can be generalized to growing j = j(n), leading to results similar to our earlier findings for the ordinary outdegree X n,j . The random variable L j is exactly the limit law of X n,j for fixed j as discussed in Remark 9. Thus, the density functions of f j (x) of X j are explicitly known, see [44].
We can interpret our findings in terms of an urn model reminiscent of the urn model for block sizes in $k$-Stirling permutations.
Urn V. Consider a balanced urn with balls of $k+3$ colors and let the random vector $(U_{n,1}, \dots, U_{n,k+3})$ count the number of balls of each color at time $n$, with $(k+3) \times (k+3)$ ball replacement matrix $M$ given by The initial configuration of the urn (it is convenient to start here at time 0) is given by $(U_{0,1}, \dots, U_{0,k+3}) = ((j-1)(\alpha+1), \alpha, 0, \dots, 0)$. The random variables $U_{n,i}$, with $3 \le i \le k+2$, described by the urn model are related to the random variables $X_{n,j,i}$, $1 \le i \le k$, which count the number of size-$i$ branches attached to the node labelled $j$ in a random increasing tree of size $n$, as follows: Moreover, $U_{n-j,2}$ is related to the outdegree $X_{n,j}$ by $U_{n-j,2} = X_{n,j} + \alpha$.
This implies that the random variables $U_{n,i}$ occurring in the urn model undergo a phase transition according to the growth of $k$ with respect to $n$, from continuous to discrete.
Table sizes in the Chinese restaurant process. The Chinese restaurant process with parameters $a$ and $\theta$ is a discrete-time stochastic process, whose value at any positive-integer time $n$ is one of the $B_n$ partitions of the set $[n] = \{1, 2, 3, \dots, n\}$ (see Pitman [62]). The parameters $a$ and $\theta$ satisfy $0 < a < 1$ and $\theta > -a$. Here $B_n$ denotes the Bell number counting the number of partitions of an $n$-element set: $B_0 = B_1 = 1$, $B_2 = 2$, $B_3 = 5$, etc. One imagines a Chinese restaurant with an infinite number of tables, each with an infinite number of seats. In the beginning the first customer takes a seat at the first table. At each discrete time step a new customer arrives and either joins one of the existing tables or takes a seat at the next empty table in line. Each table corresponds to a block of a random partition. In the beginning, at time $n = 1$, the trivial partition $\{\{1\}\}$ is obtained with probability 1. Given a partition $T$ of $[n]$ with $|T| = k$ parts $t_i$, $1 \le i \le k$, $1 \le k \le n$, of sizes $|t_i|$, at time $n+1$ the element $n+1$ is either added to one of the existing parts $t_i \in T$ with probability $\frac{|t_i| - a}{n + \theta}$,

or added to the partition $T$ as a new singleton block with probability $\frac{\theta + ka}{n + \theta}$. This model thus assigns a probability to any particular partition $T$ of $[n]$. We are interested in the distribution of the random variable $C_{n,j}$ counting the number of parts of size $j$ in a partition of $[n]$ generated by the Chinese restaurant process.
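The seating rule can be turned into a small exact computation. The sketch below assumes the standard two-parameter rule of Pitman [62]: a customer arriving when $m$ customers are seated at $k$ tables joins a table of size $c$ with probability $(c - a)/(m + \theta)$ and opens a new table with probability $(\theta + ka)/(m + \theta)$. It enumerates all set partitions of $[n]$ together with their probabilities and computes $E(C_{n,j})$. The names `crp_partitions` and `expected_parts` are illustrative.

```python
from fractions import Fraction

def crp_partitions(n, a, theta):
    """Yield (block sizes in order of creation, probability) for every partition of [n]."""
    def grow(sizes, m, prob):
        if m == n:
            yield tuple(sizes), prob
            return
        if m == 0:                           # the first customer opens the first table
            yield from grow([1], 1, prob)
            return
        k = len(sizes)
        for i, c in enumerate(sizes):        # join the i-th existing table
            p_join = (c - a) / (m + theta)
            yield from grow(sizes[:i] + [c + 1] + sizes[i + 1:], m + 1, prob * p_join)
        p_new = (theta + k * a) / (m + theta)  # open a new table
        yield from grow(sizes + [1], m + 1, prob * p_new)
    yield from grow([], 0, Fraction(1))

def expected_parts(n, j, a, theta):
    """E(C_{n,j}): expected number of blocks of size j after n customers."""
    return sum(p * sizes.count(j) for sizes, p in crp_partitions(n, a, theta))

if __name__ == "__main__":
    a, theta = Fraction(1, 2), Fraction(1)
    parts = list(crp_partitions(4, a, theta))
    print(len(parts), sum(p for _, p in parts))       # Bell number B_4 and total mass
    print([expected_parts(4, j, a, theta) for j in range(1, 5)])
```

The enumeration is exponential in $n$ and only meant for small cases; each path of the recursion corresponds to exactly one set partition, so the number of leaves is the Bell number $B_n$.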
We will not directly study the Chinese restaurant process. In order to analyze the number of tables of a certain size we study instead a variant of the growth rule for generalized plane-oriented recursive trees. Combinatorially, we consider a family $\mathcal{T}_{\alpha,\beta}$ of generalized plane-oriented recursive trees where the degree-weight generating function $\vartheta(t) = \frac{1}{(1-t)^{\beta}}$, $\beta > 0$, associated to the root of the tree, is different from that of the non-root nodes in the tree, $\varphi(t) = \frac{1}{(1-t)^{\alpha}}$, $\alpha > 0$, and the corresponding family $\mathcal{T}$ The weights of the trees $T$ in $\mathcal{T}_{\alpha,\beta}$ are defined as where $d(v)$ denotes the out-degree of node $v$. The generating functions $T_{\alpha,\beta}(z) = \sum_{n\ge 1} T_{\alpha,\beta;n} \frac{z^n}{n!}$ and $T(z) = \sum_{n\ge 1} T_n \frac{z^n}{n!}$ of trees in $\mathcal{T}_{\alpha,\beta}$ and $\mathcal{T}$ thus satisfy the differential equations The growth process generating randomly a tree in $\mathcal{T}_{\alpha,\beta}$ can be described in the following way: the process, evolving in discrete time, starts with the root labelled by zero. At step $n+1$, with $n \ge 0$, the node with label $n+1$ is attached to any previous node $v$ with out-degree $d(v)$ of the already grown tree with probabilities $p(v)$, given for non-root nodes $v$ by and for the root by This growth process is similar to the Chinese restaurant process considered before. Indeed, if we remove the root labelled zero, the remaining branches contain the nodes with labels given by $[n] = \{1, \dots, n\}$.
Remark 11. In the relation above $\theta$ cannot be negative since $\beta$ is assumed to be positive. The correspondence above can be extended to the full range $\beta > -1$ using a different degree-weight generating function $\vartheta(t)$ for the root node. Assume that $-1 < \beta \le 0$. Then we cannot directly use $\vartheta(t) = (1-t)^{-\beta} = 1 + \beta t + \dots$ due to the negative or zero weight. Since the root connectivity is similar to the choice $\beta \to 1 + \beta$ for an outdegree of the root larger than one, we use a shifted connectivity of the root node: for $-1 < \beta < 0$, and
Proof of Proposition 3. Assume that a size $n$ tree $T$ of the family $\mathcal{T}_{\alpha,\beta}$ with labels $\{0, 1, \dots, n\}$ has $k$ branches $t_i$, $1 \le i \le k$, of sizes $|t_i|$. By the considerations of [58], at time $n+1$ the element $n+1$ is either attached to one of the existing non-root nodes $v$ with probability or to the root of the tree with probability .
Consequently, element $n+1$ is attached to one of the branches $t_i \in T$ with probability Thus, setting $a = \frac{1}{1+\alpha}$ and $\theta = \frac{\beta}{1+\alpha}$ proves the stated result.
Theorem 1. The random variable $C_{n,j}$ counting the number of parts of size $j$ in a partition of $\{1, \dots, n\}$ generated by the Chinese restaurant process is distributed as $X_{n+1,j}$, the number of branches of size $j$ of the root in a generalized plane-oriented recursive tree of size $n+1$ from $\mathcal{T}_{\alpha,\beta}$, with $a = \frac{1}{1+\alpha}$ and $\theta = \frac{\beta}{1+\alpha}$: Assume that $\beta > 0$. Then $X_{n+1,j}$ has falling factorial moments of mixed Poisson type with mixing distribution $Z$ supported on $[0,\infty)$, and scale parameter λ n,j = (1)).
(i) for $\lambda_{n,j} \to \infty$ the random variable $X_{n+1,j}/\lambda_{n,j}$ converges in distribution, with convergence of all moments, to $Z$.
(ii) for $\lambda_{n,j} \to \rho \in (0,\infty)$ the random variable $X_{n+1,j}$ converges in distribution, with convergence of all moments, to a mixed Poisson distributed random variable $Y \overset{\mathcal{L}}{=} \mathrm{MPo}(\rho Z)$.

Remark 12. A similar result holds true for $-1 < \beta \le 0$. The analysis is identical; one only has to use the adapted degree-weight generating functions stated in Remark 11.
Proof of Theorem 1. We can study $X_{n+1,j}$, the number of branches of size $j$ attached to the root of a tree of size $n+1$, using the variable $v$ as a marker and the generating function We have Solving the differential equation for $T(z)$ leads to We can access the $s$-th moment of $X_{n+1,j}$ as follows: Since $\frac{T_{\alpha,\beta;n}}{n!} = (\alpha+1)^n \binom{\frac{\beta}{\alpha+1}+n-1}{n}$, we obtain the explicit result Application of Stirling's formula for the Gamma function then leads to the stated result.

TRIANGULAR URN MODELS
In the study of node degree in generalized plane-oriented recursive trees we encountered a triangular urn model, leading to factorial moments of mixed Poisson type. Here we study a more general triangular urn.
The initial configuration of the urn consists of w 0 white balls and b 0 black balls, and the random variable W n counts the number of white balls after n draws.
This urn model has been studied by Puyhaubert [65,16] who derived the probability mass function of W n , and a limit law for n → ∞. The results of [65,16] were extended by Janson [32] to unbalanced triangular urn models. Here, using a simple closed formula for the rising factorial moments of W n , we point out several phase transitions, involving amongst others moments of mixed Poisson type, for non-standard initial values b 0 = b 0 (n), which may depend on the discrete time n. Due to the balanced nature of the urn the total number T n of balls after n draws is a deterministic quantity: Our starting point is the analysis of the normalized number of white balls X n = W n /α, such that X 0 = w 0 /α. Let F n denote the σ-field generated by the first n steps. Moreover denote by ∆ n = X n − X n−1 ∈ {0, 1} the increment at step n.
Since the probability that a new white ball is generated at step $n$ is proportional to the number $W_{n-1} = X_{n-1} \cdot \alpha$ of existing white balls (at step $n-1$), we obtain further Hence, let Consequently, X n is a positive martingale. By taking the unconditional expectation, this implies that the expected value of $X_n$ is given by More generally, we similarly have for any positive integer $s$ Hence, this implies that the $s$-th binomial moment is given by .
Theorem 2. The $s$-th rising factorial moment of the random variable $X_n = W_n/\alpha$, where $W_n$ counts the number of white balls in a balanced triangular urn with ball replacement matrix given by $\begin{pmatrix} \alpha & \beta \\ 0 & \gamma \end{pmatrix}$, $\alpha, \beta, \gamma \in \mathbb{N}$, $\gamma = \alpha + \beta$, is given by the exact formula where $w_0, b_0$ denote the initial number of white and black balls, respectively. The factorial moments of $\hat{X}_n = X_n - \frac{w_0}{\alpha}$ are for $\min\{n, b_0\} \to \infty$ asymptotically of mixed Poisson type with a gamma mixing distribution $X \overset{\mathcal{L}}{=} \gamma(\frac{w_0}{\alpha}, 1)$, and scale parameter λ n,b 0 = ( (1)).
(i) for $\lambda_{n,b_0} \to \infty$ the random variable $\hat{X}_n/\lambda_{n,b_0}$ converges in distribution, with convergence of all moments, to $X$.
(ii) for $\lambda_{n,b_0} \to \rho \in (0,\infty)$ the random variable $\hat{X}_n$ converges in distribution, with convergence of all moments, to $Y \overset{\mathcal{L}}{=} \mathrm{MPo}(\rho X)$.
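The martingale argument can be illustrated by an exact computation for small urns. The sketch below assumes the urn dynamics described above (drawing a white ball adds $\alpha$ white and $\beta$ black balls, drawing a black ball adds $\gamma = \alpha + \beta$ black balls) and compares the rising factorial moments of $X_n = W_n/\alpha$, obtained from the exact law of $W_n$, with the product $\big(\tfrac{w_0}{\alpha}\big)^{\overline{s}} \prod_{k=0}^{n-1} \frac{T_k + s\alpha}{T_k}$, where $T_k = w_0 + b_0 + k\gamma$. This product is what the one-step conditional expectation yields under these dynamics; it is offered as a reading of the recursion and is not claimed to be the typeset form of the exact formula in the theorem. Function names are illustrative.

```python
from fractions import Fraction

def white_ball_law(n, w0, b0, alpha, beta):
    """Exact law of W_n as {number of white balls: probability}."""
    gamma = alpha + beta
    dist = {0: Fraction(1)}                  # key: number of white draws so far
    for step in range(n):
        total = w0 + b0 + step * gamma       # deterministic total T_step (balanced urn)
        new = {}
        for i, p in dist.items():
            pw = Fraction(w0 + alpha * i, total)   # probability of drawing white
            new[i + 1] = new.get(i + 1, 0) + p * pw
            new[i] = new.get(i, 0) + p * (1 - pw)
        dist = new
    return {w0 + alpha * i: p for i, p in dist.items()}

def rising_moment(law, alpha, s):
    """E(X_n (X_n + 1) ... (X_n + s - 1)) for X_n = W_n / alpha."""
    total = Fraction(0)
    for w, p in law.items():
        x, term = Fraction(w, alpha), Fraction(1)
        for i in range(s):
            term *= x + i
        total += p * term
    return total

def product_formula(n, w0, b0, alpha, beta, s):
    """(w0/alpha)^{rising s} * prod_{k<n} (T_k + s*alpha) / T_k (assumed closed form)."""
    gamma = alpha + beta
    val, x0 = Fraction(1), Fraction(w0, alpha)
    for i in range(s):
        val *= x0 + i
    for k in range(n):
        t = w0 + b0 + k * gamma
        val *= Fraction(t + s * alpha, t)
    return val

if __name__ == "__main__":
    law = white_ball_law(8, 2, 3, 2, 1)
    for s in (1, 2, 3):
        print(s, rising_moment(law, 2, s), product_formula(8, 2, 3, 2, 1, s))
```

Working with rising factorials is what makes the recursion close: if a white ball is drawn, $X^{\overline{s}}$ is multiplied by $(X+s)/X$, and the factor $X$ cancels against the drawing probability.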
Remark 13. It is well known from the works of Puyhaubert [65,16] and Janson [32], that for fixed b 0 the random variable X n /n α γ tends to a random variable Z with moments for more details about the nature of this random variable we refer the reader to [16,32]. This result can easily be re-obtained using the explicit expression for the rising factorial moments of X n and the method of moments. We obtain the gamma mixing distribution X with convergence of all moments.
Proof. Let $Y$ denote a random variable with rising factorial moments $E(Y^{\overline{s}}) = E(Y(Y+1)\cdots(Y+s-1))$ satisfying an expansion of mixed Poisson type, $E(Y^{\overline{s}}) = \rho^s \cdot \mu_s$, for $s \ge 1$, with $\mu_s \ge 0$. We obtain the (falling) factorial moments using the binomial theorem for rising factorials (see [25]): Moreover, we can obtain the rising factorial moments of the shifted random variable $\hat{X}_n = X_n - \frac{w_0}{\alpha}$ by using again the binomial theorem This implies that we can express the factorial moments of $\hat{X}_n$ in terms of the rising factorial moments of $X_n$ by combining the two identities above in the following way.
Next we use the asymptotic expansion of the rising factorial moments of X n , Interchanging summations, and collecting powers of (1)).
Next, using the hypergeometric form of the Vandermonde convolution (see [25], p. 212), we obtain for the inner sum We get further where δ s,i denote the Kronecker-delta function. This proves the stated result.

MIXED POISSON RAYLEIGH LAWS
In the analysis of various combinatorial objects such as lattice paths, trees and mappings, the Rayleigh distribution occurs frequently. In this section we give several examples where, during the study of such objects, a mixed Poisson distribution with Rayleigh mixing distribution occurs in a natural way. Apart from the first example, the occurrence and proof of the mixed Poisson distribution is novel, to the best of our knowledge.
5.1. The number of inversions in labelled tree families. Consider a rooted labelled tree $T$, where the nodes of $T$ are labelled with distinct integers (usually of the set $\{1, 2, \dots, |T|\}$, with $|T|$ the size, i.e., the number of vertices, of $T$). An inversion in $T$ is a pair $(i, j)$ of vertices (we may always identify a vertex with its label), such that $i > j$ and $i$ lies on the unique path from the root node of $T$ to $j$ (thus $i$ is an ancestor of $j$ or, equivalently, $j$ is a descendant of $i$). Given a tree family, we introduce the r.v. $I_{n,j}$, which counts for a random tree of size $n$ the number of inversions induced by the node labelled $j$, $1 \le j \le n$, i.e., it counts the number of inversions of the kind $(i, j)$, with $i > j$ an ancestor of $j$. See Figure 5 for an illustration of the quantity considered.
Panholzer and Seitz [56] studied the r.v. I n,j for random trees of so-called labelled simply generated tree families (see, e.g., [19]; note that in the probabilistic literature such tree models are more commonly called Galton-Watson trees), which contain many important tree families as, e.g., ordered, unordered, binary and cyclic labelled trees as special instances.
Formally, a class $\mathcal{T}$ of labelled simply generated trees is defined in the following way: one chooses a sequence $(\varphi_\ell)_{\ell \ge 0}$ (the so-called degree-weight sequence) of nonnegative real numbers with $\varphi_0 > 0$. Using this sequence, the weight $w(T)$ of each labelled ordered tree (i.e., each labelled rooted tree, in which the children of each node are ordered from left to right) is defined by $w(T) := \prod_{v \in T} \varphi_{d(v)}$, where by $v \in T$ we mean that $v$ is a vertex of $T$ and $d(v)$ denotes the number of children of $v$ (i.e., the out-degree of $v$). The family $\mathcal{T}$ associated to the degree-weight sequence $(\varphi_\ell)_{\ell \ge 0}$ then consists of all trees $T$ (or all trees $T$ with $w(T) \ne 0$) together with their weights. We let $T_n := \sum_{|T| = n} w(T)$ denote the total weight of all trees of size $n$ in $\mathcal{T}$; for many important simply generated tree families $(T_n)_{n \ge 1}$ is a sequence of natural numbers, and then the total weight $T_n$ can be interpreted simply as the number of trees of size $n$ in $\mathcal{T}$.
When analysing parameters in a simply generated tree family $\mathcal{T}$ it is common to assume the random tree model for weighted trees, i.e., when speaking about a random tree of size $n$ one assumes that each tree $T$ in $\mathcal{T}$ of size $n$ is chosen with a probability proportional to its weight $w(T)$, i.e., is chosen with probability $\frac{w(T)}{T_n}$. Under mild conditions on the degree-weight sequence $(\varphi_\ell)_{\ell \ge 0}$ of a family $\mathcal{T}$ of labelled simply generated trees and assuming the random tree model, in [56] the following asymptotic formula for the factorial moments of $I_{n,j}$ has been obtained: where the constant $\kappa$ depends on the particular tree family, i.e., on the degree-weight sequence, and is given in [56]. Consequently, an application of Lemma 2 and taking into account Example 3 yields the following result, which adds to the results of [56] the characterization of the limiting distribution as a mixed Poisson distribution.
Corollary 6. The random variable $I_{n,j}$, which counts the number of inversions induced by node $j$ in a random labelled simply generated tree of size $n$ has for
(i) for $\lambda_{n,j} \to \infty$ the random variable $I_{n,j}/\lambda_{n,j}$ converges in distribution, with convergence of all moments, to $X$. n, whereas $E(I_{n,j}) \to 0$, for $n - j \ll \sqrt{n}$.

5.2. Record-subtrees in Cayley-trees. Given a rooted labelled tree $T$, a min-record (or simply record, for short) is a node $x \in T$ which has the smallest label amongst all nodes on the (unique) path from the root-node of $T$ to $x$. Let us assume that $\{r_1, \dots, r_k\}$ is the set of records of $T$; then this set naturally induces a decomposition of the tree $T$ into what is called here record-subtrees $\{S_1, \dots, S_k\}$: $S_i$ is the largest subtree rooted at the record $r_i$ not containing any of the remaining records $r_1, \dots, r_{i-1}, r_{i+1}, \dots, r_k$. In other words, a record-subtree $S$ is a maximal subtree (i.e., it is not properly contained in another such subtree) of $T$ with the property that the root-node of $S$ has the smallest label amongst all nodes of $S$. See Figure 6 for an illustration of these quantities. In the following we will study the occurrence of record-subtrees of a given size for one of the most natural random tree models, namely random rooted labelled unordered trees, often called random Cayley-trees. A Cayley-tree is a rooted tree $T$, where the nodes of $T$ are labelled with distinct integers of $\{1, 2, \dots, |T|\}$ and where the children of any node $x \in T$ are not equipped with any left-to-right ordering (i.e., we may think that each node in $T$ has a possibly empty set of children). Combinatorially, the family $\mathcal{T}$ of Cayley-trees can be described formally via the SET construction as $\mathcal{T} = \bigcirc \ast \mathrm{SET}(\mathcal{T})$.
Note that Cayley-trees are a particular family of labelled simply generated trees as described in Subsection 5.1, where the degree-weight sequence $(\varphi_\ell)_{\ell \ge 0}$ is given by $\varphi_\ell = \frac{1}{\ell!}$. It is well-known that there are exactly $T_n = n^{n-1}$ different Cayley-trees of size $n$ (see, e.g., [19,66]) and in the random tree model, which we will always assume here, each of these trees occurs with the same probability when considering a size-$n$ tree.
The r.v. $R_n$ counting the number of records in a random size-$n$ Galton-Watson tree (i.e., a simply generated tree) has been studied by Janson [31], showing (after a suitable scaling by $\frac{1}{\sqrt{n}}$) a Rayleigh limiting distribution result; in particular, for Cayley-trees it holds $\frac{R_n}{\sqrt{n}} \xrightarrow{\mathcal{L}} \mathrm{Rayleigh}(1)$. Here we introduce the r.v. $R_{n,j}$, which counts the number of record-subtrees of size $j$ in a random Cayley-tree of size $n$. Of course, the random variables $R_{n,j}$, $1 \le j \le n$, are a refinement of $R_n$ and are related by the identity $R_n = \sum_{j=1}^{n} R_{n,j}$.
As has been pointed out already in [31], records in trees are closely related to a certain node removal procedure for trees. Starting with a tree $T$ one chooses a node $x \in T$ at random and cuts off the subtree $T'$ of $T$ rooted at $x$, and iterates this cutting procedure with the remaining subtree $T''$ until only the empty subtree remains. The r.v. $C^{[v]}_n$ counting the number of (vertex) cuts required to cut down the whole tree by this cutting procedure when starting with a random Cayley-tree of size $n$ is then distributed as $R_n$, i.e., $R_n \overset{\mathcal{L}}{=} C^{[v]}_n$. We can extend this relation by considering the r.v. $C^{[v]}_{n,j}$ counting the number of subtrees of size $j$ which are cut off during the (vertex) cutting procedure when starting with a random Cayley-tree of size $n$, where it holds $R_{n,j} \overset{\mathcal{L}}{=} C^{[v]}_{n,j}$. This can be seen easily by means of coupling arguments given in [31]: consider the node-removal procedure where, starting with a tree $T$, in each step the node with smallest label amongst all nodes in the remaining tree is selected and together with all its descendants detached from the tree. It holds then that node $x$ is a min-record in the tree $T$ if and only if node $x$ is selected as a vertex cut during this node-removal procedure, and in this case the record-subtree (and thus its size) rooted at $x$ corresponds to the subtree (with its respective size) which is removed in this cut.
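The correspondence between min-records and the smallest-label removal procedure can be checked on a concrete tree. The sketch below represents a rooted labelled tree by a hypothetical `parent` dictionary (the root maps to `None`), computes the record-subtree sizes directly from the definition, and compares them with the sizes of the subtrees detached when the node with smallest remaining label is repeatedly cut. The example tree and all function names are illustrative.

```python
def min_records(parent):
    """Nodes whose label is smaller than the labels of all their ancestors."""
    records = set()
    for x in parent:
        y, is_record = parent[x], True
        while y is not None:
            if y < x:
                is_record = False
                break
            y = parent[y]
        if is_record:
            records.add(x)
    return records

def record_subtree_sizes(parent):
    """Assign every node to its closest record ancestor (possibly itself)."""
    recs = min_records(parent)
    sizes = {r: 0 for r in recs}
    for x in parent:
        y = x
        while y not in recs:          # terminates: the root is always a record
            y = parent[y]
        sizes[y] += 1
    return sizes

def removal_by_smallest_label(parent):
    """Repeatedly detach the remaining node of smallest label with its remaining subtree."""
    children = {x: [] for x in parent}
    for x, p in parent.items():
        if p is not None:
            children[p].append(x)
    remaining, cuts = set(parent), {}
    while remaining:
        x = min(remaining)
        stack, removed = [x], []
        while stack:
            y = stack.pop()
            removed.append(y)
            stack.extend(c for c in children[y] if c in remaining)
        remaining.difference_update(removed)
        cuts[x] = len(removed)        # size of the subtree cut off at x
    return cuts

if __name__ == "__main__":
    # root 4 with children 2 and 6; 2 has children 1 and 5; 6 has child 3; 3 has child 7
    tree = {4: None, 2: 4, 6: 4, 1: 2, 5: 2, 3: 6, 7: 3}
    print(record_subtree_sizes(tree))
    print(removal_by_smallest_label(tree))
```

In this example the records are 1, 2, 3 and 4, and both computations return the same sizes for them, which is the deterministic statement behind the distributional identity.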
We will show that $R_{n,j}$ and thus also $C^{[v]}_{n,j}$ have factorial moments of mixed Poisson type, yielding the following theorem.
Theorem 3. The random variable $R_{n,j}$ counting the number of record-subtrees of size $j$ in a random Cayley-tree of size $n$ has, for $n \to \infty$ and arbitrary $1 \le j = j(n) \le n$, asymptotically factorial moments of mixed Poisson type with a Rayleigh mixing distribution $X$ and scale parameter λ n,j =
(i) for $\lambda_{n,j} \to \infty$ the random variable $R_{n,j}/\lambda_{n,j}$ converges in distribution, with convergence of all moments, to $X$.
Proof. We consider the description of the problem via the node-cutting procedure. This immediately yields the following stochastic recurrence for the r.v. $R_{n,j} \overset{\mathcal{L}}{=} C^{[v]}_{n,j}$: with $R_{n,j} = 0$, for $0 \le n < j$, and where the r.v. $K_n$ measures the size of the subtree remaining after selecting a random node and removing the subtree rooted at it from a randomly selected size-$n$ Cayley-tree.
In the following we will compute the splitting probabilities $p_{n,k} := P\{K_n = k\}$, with $0 \le k \le n-1$, and by doing this we also show that the recurrence (29) is indeed valid, i.e., that the subtree $T''$ (let us assume of size $k \ge 1$) remaining after removing the subtree containing the selected node $x$ of a random size-$n$ Cayley-tree $T$ is (after an order-preserving relabelling of the nodes) again a random Cayley-tree of size $k$ (the so-called random preservation property holds). This can be done by simple combinatorial reasoning. Consider a pair $(T, x)$ of a size-$n$ Cayley-tree $T$ and a node $x \in T$. When detaching the subtree rooted at $x$ from $T$, we obtain a pair $(T', T'')$ of subtrees with $T'$ containing $x$ and $T''$ the possibly empty remaining subtree. Of course, $T''$ is the empty subtree exactly if $x$ is the root node of $T$ and thus there are exactly $T_n$ pairs $(T, x)$ yielding $|T''| = k = 0$. Let us now assume that $1 \le |T''| = k \le n-1$. After an order-preserving relabelling of $T''$ and $T'$ with labels $\{1, \dots, k\}$ and $\{1, \dots, n-k\}$, respectively, both subtrees are Cayley-trees of size $k$ and $n-k$, respectively. Consider now a particular pair $(\hat{T}', \hat{T}'')$ of Cayley-trees of sizes $|\hat{T}''| = k$ and $|\hat{T}'| = n-k$, respectively, and let us count the number of pairs $(T, x)$, with $T$ a size-$n$ Cayley-tree and $x \in T$, yielding the pair $(\hat{T}', \hat{T}'')$ of subtrees after cutting. By constructing such pairs $(T, x)$, one obtains that there are exactly $k\binom{n}{k}$ possibilities ($k$ possible ways of attaching $\hat{T}'$ to a node in $\hat{T}''$ and $\binom{n}{k}$ possibilities of distributing the labels $\{1, \dots, n\}$ order-preservingly to the subtrees), independent of the chosen pair of trees; thus the random preservation property holds.
Moreover, one obtains that there are exactly $k\binom{n}{k} T_k T_{n-k}$ pairs $(T, x)$ splitting after a cut into a pair $(T', T'')$ of subtrees with respective sizes $n-k$ and $k$, for $1 \le k \le n-1$. Of course, in total there are $nT_n$ pairs $(T, x)$ and thus we get the following result for the splitting probabilities $p_{n,k}$:
$$p_{n,0} = \frac{1}{n}, \qquad p_{n,k} = \frac{k\binom{n}{k} T_k T_{n-k}}{n T_n} = \frac{k\, \tilde{T}_k \tilde{T}_{n-k}}{n\, \tilde{T}_n}, \quad 1 \le k \le n-1,$$
where we use throughout this section the abbreviation $\tilde{T}_n := \frac{T_n}{n!} = \frac{n^{n-1}}{n!}$. In order to treat the stochastic recurrence (29) and to compute the asymptotic behaviour of the (factorial) moments of $R_{n,j}$ we find it appropriate to introduce the generating function Starting with (29), straightforward computations (which are omitted here) yield then the following differential equation where the so-called tree function (the exponential generating function of the number of size-$n$ Cayley-trees) appears: $T(z) := \sum_{n \ge 1} \tilde{T}_n z^n$. Simple manipulations and using the well-known functional equation of the tree function (which is thus closely related to the Lambert-W function), $T(z) = z e^{T(z)}$, give then the following explicit formula for the derivative of $F(z, v)$ w.r.t. $z$: To get the (factorial) moments of $R_{n,j}$ we use the substitution $w := v - 1$ and introduce $\tilde{F}(z, w) := F(z, w+1)$. Extracting coefficients $[w^s]$, $s \ge 1$, from It is not difficult to extract coefficients from (33) and to state an explicit formula for $[z^n w^s]\tilde{F}(z, w)$ and the factorial moments of $R_{n,j}$; however, for asymptotic considerations it is easier to use well-known analytic properties of the tree function $T(z)$ and deduce from them the desired asymptotic growth behaviour of $E(R_{n,j}^{\underline{s}})$. Namely, we use standard applications of so-called singularity analysis, see [19], to transfer the local behaviour of a generating function in a complex neighbourhood of the dominant singularity (i.e., the singularity of smallest modulus; we are here only concerned with functions with a unique dominant singularity) to the asymptotic behaviour of its coefficients.
It holds (see, e.g., [19]) that the tree function $T(z)$ has a unique dominant singularity (a branch point) at $z = e^{-1}$, where the function evaluates to $T(e^{-1}) = 1$ and where it admits the following local expansion: Thus the function $(1 - T(z))^{-s-1}$ also has a unique dominant singularity at $z = e^{-1}$ with the following local bound: Singularity analysis yields then Therefore, (33) yields This together with Stirling's formula for the Gamma function (16) shows the following bound for the $s$-th moments of $R_{n,j}$, which holds uniformly for all $1 \le j \le n$: To get the mixed Poisson behaviour for $j = o(n)$ we use the refined expansion locally around $z = e^{-1}$, which can be obtained from (34). Singularity analysis gives then the expansion Thus we get for $j = o(n)$ the stated behaviour of the $s$-th factorial moments of $R_{n,j}$: where we used in the final step the duplication formula of the factorials: The mixed Poisson limit law with Rayleigh mixing distribution as stated in Theorem 3 follows then from (35) and (37) by applying Lemma 2.
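The combinatorial counts used at the start of the proof above can be verified numerically. The sketch below assumes the counts as stated there: $T_n = n^{n-1}$ Cayley-trees of size $n$, $T_n$ pairs $(T, x)$ leaving an empty remainder, and $k\binom{n}{k} T_k T_{n-k}$ pairs leaving a remainder of size $k$, out of $nT_n$ pairs in total. It checks that the resulting splitting probabilities sum to one, which amounts to an identity of Abel type. The function names are illustrative.

```python
from fractions import Fraction
from math import comb

def cayley(n):
    """Number of Cayley-trees (rooted labelled unordered trees) of size n."""
    return n ** (n - 1)

def splitting_probabilities(n):
    """[p_{n,0}, ..., p_{n,n-1}] for the size k of the subtree remaining after one cut."""
    total = n * cayley(n)                              # all pairs (T, x)
    probs = [Fraction(cayley(n), total)]               # k = 0: x is the root
    for k in range(1, n):
        pairs = k * comb(n, k) * cayley(k) * cayley(n - k)
        probs.append(Fraction(pairs, total))
    return probs

if __name__ == "__main__":
    for n in (2, 3, 5, 10):
        p = splitting_probabilities(n)
        print(n, sum(p), [float(q) for q in p[:3]])
```

Since the check only uses the counts, it is independent of the generating function computations that follow in the proof.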

5.3. Edge-cutting in Cayley-trees. The following prominent edge-cutting procedure for trees is closely related to the node-removal procedure considered in Subsection 5.2. Starting with a tree $T$ one chooses an edge $e \in T$ and removes it from $T$. After that $T$ decomposes into two subtrees $T'$ and $T''$, where we assume that $T'$ contains the original root of $T$. We discard the subtree $T''$ and continue the edge-cutting procedure with $T'$ until we have isolated the root-node of the original tree $T$. This cutting-down procedure has been introduced in [50], where the number of random cuts $C_n$ to isolate the root-node of a random Cayley-tree of size $n$, where in each cutting-step an edge from the remaining tree is chosen uniformly at random, has been studied, yielding asymptotic formulae for the first two moments of $C_n$. The Rayleigh limiting distribution of $C_n$ for Cayley-trees and other families of simply generated trees has been obtained in [54,55] and in a more general setting by Janson [31]; in particular, for Cayley-trees one obtains $\frac{C_n}{\sqrt{n}} \xrightarrow{\mathcal{L}} \mathrm{Rayleigh}(1)$. Moreover, in [31] it was shown in general that for Galton-Watson tree families (thus containing Cayley-trees) the random variables $C_n$ and $C^{[v]}_n$ (as introduced in Subsection 5.2) for the edge-cutting procedure and the node-removal procedure, respectively, have the same limiting distribution behaviour. A number of works have analyzed the edge-cutting procedure and related processes using the connection between Cayley-trees and the so-called Continuum Random Tree; see in particular the work of Addario-Berry, Broutin and Holmgren [1], and the recent works of Bertoin [3,4].
In this section we consider a refinement of the r.v. $C_n$ for Cayley-trees, namely we study the behaviour of the r.v. $C_{n,j}$ counting the number of subtrees of size $j$ cut off during the random edge-cutting procedure when starting with a random size-$n$ Cayley-tree until the root-node is isolated. Of course, it holds $C_n = \sum_{j=1}^{n-1} C_{n,j}$. Before continuing we want to remark that an alternative description of the problem can be given via edge-records in edge-labelled trees: given a size-$n$ tree $T$ we first distribute the labels $\{1, \dots, |T| - 1\}$ randomly to the edges of $T$. An edge-record in $T$ is then an edge $e = (x, y)$, where $y$ is a child of $x$, with smallest label amongst all edges on the path from the root-node of $T$ to $y$. Analogous to Subsection 5.2 (and stated already in [31]) one gets that the r.v. $R^{[e]}_n$ counting the number of edge-records in a random size-$n$ Cayley-tree is distributed as $C_n$, i.e., $R^{[e]}_n \stackrel{\mathcal{L}}{=} C_n$. Moreover, the edge-records $e_1, \dots, e_k$ of an edge-labelled tree $T$ naturally decompose $T$ into the root-node and $k$ edge-record subtrees $S_1, \dots, S_k$, obtained from $T$ by removing the root-node of $T$ and all edges $e_1, \dots, e_k$. Again we can introduce the r.v. $R^{[e]}_{n,j}$, which counts the number of edge-record subtrees of size $j$ in a random edge-labelled Cayley-tree of size $n$. It is then immediate to see that $R^{[e]}_{n,j} \stackrel{\mathcal{L}}{=} C_{n,j}$.
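The edge-cutting procedure can be simulated directly. The following Python sketch is only an illustration under stated assumptions: the helper names `random_cayley_tree` and `edge_cutting` are hypothetical, a uniform random Cayley-tree is generated by decoding a uniform Prüfer sequence and rooting the tree at a uniformly chosen node, and each remaining edge is identified with its child endpoint.

```python
import heapq
import random
from collections import Counter


def random_cayley_tree(n, rng):
    """Uniform random rooted labelled tree on {1, ..., n}.

    Returns a dict child -> parent (the root has no entry). The unrooted
    tree is decoded from a uniform Pruefer sequence and then rooted at a
    uniformly chosen node.
    """
    if n == 1:
        return {}
    prufer = [rng.randint(1, n) for _ in range(n - 2)]
    degree = [1] * (n + 1)
    for x in prufer:
        degree[x] += 1
    adj = {v: [] for v in range(1, n + 1)}
    leaves = [v for v in range(1, n + 1) if degree[v] == 1]
    heapq.heapify(leaves)
    for x in prufer:
        leaf = heapq.heappop(leaves)      # smallest current leaf
        adj[leaf].append(x)
        adj[x].append(leaf)
        degree[x] -= 1
        if degree[x] == 1:
            heapq.heappush(leaves, x)
    u, v = heapq.heappop(leaves), heapq.heappop(leaves)
    adj[u].append(v)
    adj[v].append(u)
    root = rng.randint(1, n)
    parent, stack, seen = {}, [root], {root}
    while stack:                          # orient all edges away from the root
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                parent[y] = x
                stack.append(y)
    return parent


def edge_cutting(parent, rng):
    """Run the edge-cutting procedure; return the sizes of the cut-off subtrees."""
    children = {}
    for child, par in parent.items():
        children.setdefault(par, set()).add(child)
    alive = set(parent)                   # non-root nodes <-> remaining edges
    sizes = []
    while alive:
        c = rng.choice(sorted(alive))     # uniform edge (parent[c], c)
        stack, cut_off = [c], []
        while stack:                      # collect the subtree hanging below the edge
            x = stack.pop()
            cut_off.append(x)
            stack.extend(children.get(x, ()))
        children[parent[c]].discard(c)
        alive.difference_update(cut_off)
        sizes.append(len(cut_off))
    return sizes


rng = random.Random(2024)
sizes = edge_cutting(random_cayley_tree(50, rng), rng)
print(len(sizes), Counter(sizes))         # one sample of C_n and of the counts C_{n,j}
```

The length of the returned list is one sample of $C_n$, and the counter collects the corresponding values of $C_{n,j}$; the sizes always add up to $n - 1$, since every non-root node is cut off exactly once.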
In Figure 7 we illustrate the edge-cutting procedure for a particular tree. In the following theorem we state that $C_{n,j}$ (and thus also $R^{[e]}_{n,j}$) has factorial moments of mixed Poisson type with a Rayleigh mixing distribution. The method of proof is analogous to the one presented in Subsection 5.2, but since the formulae occurring are less explicit, the proof steps are more technical and somewhat lengthy.
Theorem 4. The random variable $C_{n,j}$, counting the number of subtrees of size $j$ that are cut off during the edge-cutting procedure starting with a random Cayley-tree of size $n$, has, for $n \to \infty$ and arbitrary $1 \le j = j(n) \le n - 1$, asymptotically factorial moments of mixed Poisson type with a Rayleigh mixing distribution $X$ and scale parameter $\lambda_{n,j} = \frac{\sqrt{n}\, j^{j-1}}{j!\, e^{j}}$:
(i) for $\lambda_{n,j} \to \infty$ the random variable $\frac{C_{n,j}}{\lambda_{n,j}}$ converges in distribution, with convergence of all moments, to $X$.
(ii) for $\lambda_{n,j} \to \rho \in [0, \infty)$ the random variable $C_{n,j}$ converges in distribution, with convergence of all moments, to a mixed Poisson distributed random variable $Y \stackrel{\mathcal{L}}{=} \text{MPo}(\rho X)$.
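To get a feeling for the two regimes of the theorem, one may evaluate the scale parameter numerically. The sketch below is illustrative only: the function names are hypothetical, and it assumes that $X$ is Rayleigh distributed with parameter 1, so that $E(X^s) = 2^{s/2}\,\Gamma(\frac{s}{2} + 1)$, in agreement with the moment formula stated in Theorem 7. By Stirling's formula, $\lambda_{n,j} \sim \sqrt{n} / (\sqrt{2\pi}\, j^{3/2})$ for $j \to \infty$, so $\lambda_{n,j}$ stays bounded away from $0$ and $\infty$ exactly when $j$ is of order $n^{1/3}$.

```python
import math


def scale_parameter(n, j):
    """lambda_{n,j} = sqrt(n) * j^(j-1) / (j! * e^j), evaluated via logarithms."""
    log_lam = 0.5 * math.log(n) + (j - 1) * math.log(j) - math.lgamma(j + 1) - j
    return math.exp(log_lam)


def stirling_scale_parameter(n, j):
    """Stirling approximation sqrt(n) / (sqrt(2*pi) * j^(3/2)) of lambda_{n,j}."""
    return math.sqrt(n) / (math.sqrt(2 * math.pi) * j ** 1.5)


def rayleigh_moment(s):
    """s-th power moment of X ~ Rayleigh(1) (assumed parameter): 2^(s/2) * Gamma(s/2 + 1)."""
    return 2 ** (s / 2) * math.gamma(s / 2 + 1)


def mixed_poisson_factorial_moment(rho, s):
    """s-th factorial moment of Y = MPo(rho * X), i.e. rho^s * E(X^s)."""
    return rho ** s * rayleigh_moment(s)


n = 10 ** 6
for j in (1, 10, 100, 1000):
    print(j, scale_parameter(n, j), stirling_scale_parameter(n, j))
```

For $n = 10^6$ the critical range is thus around $j \approx 100$: much smaller $j$ fall into case (i), much larger $j$ into case (ii) with $\rho = 0$.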
Remark 16. According to Theorems 3 and 4 the r.v. $C_{n,j}$ and $C^{[v]}_{n,j}$ (and thus also $R^{[e]}_{n,j}$ and $R_{n,j}$), for the edge- and vertex-versions of the cutting procedures as considered in Subsections 5.2-5.3, have the same limiting distribution behaviour. Janson [31] was able to bound the difference between the random variables $R_n$ and $R^{[e]}_n$ (i.e., between the number of node- and edge-records) in a suitable metric and thus to show directly the same limiting behaviour of these r.v. It would be interesting to extend his proof technique to the refined quantities studied here.
Proof. Decomposing a tree according to the first cut of the edge-cutting procedure immediately yields the following stochastic recurrence for the r.v. C n,j : with C n,j = 0, for 1 ≤ n ≤ j, and where the r.v. K [e] n measures the size of the subtree containing the root of the original tree after cutting a random edge from a randomly selected size-n Cayley tree. It is well-known [50] (and can be shown completely analogous to the computations in the proof of Theorem 3) that the random preservation property of Cayley-trees also holds for the edge-cutting procedure (thus implying correctness of (39)) and that the splitting probabilities r q=1 α j, q (z), for s ≥ 1.
The following bounds on the growth of the coefficients of the functions appearing, which all can be obtained in a straightforward way by applying standard techniques as singularity analysis or approximating sums by integrals (we omit here some of the details), will play a key rôle in the asymptotic evaluation of the moments. First, it holds, for arbitrary but fixed and uniformly for 1 ≤ j ≤ n and n → ∞: This implies The sum occurring in the latter bound (42) can itself be bounded as follows: the summand α j,s (z) gives the main contribution and implies the following bound, which holds uniformly for 1 ≤ j ≤ n (with s arbitrary but fixed and n → ∞): Now we are in a position to derive the stated bound on the s-th integer moments of C n,j . First, by using (41) and (43) we get for s ≥ 1: (n − k) thus showing the bound (which holds uniformly for 1 ≤ j ≤ n) To give the refined asymptotic expansion of E C s n,j yielding factorial moments of mixed Poisson type one has to spot and evaluate the main contribution of the coefficients of (41) in more detail and to bound the remaining contributions. In order to do this we split Completely analogous to the previous computations one shows for s ≥ 2 (of course, Q j,1 (z) = 0): and furthermore (uniformly for 1 ≤ j ≤ n) the following bound on the contribution of the remainder: Now we consider the term in (45) yielding the main contribution, where we assume from now on that j = o(n). We get We split the summation interval of (48) at k = n 2 and consider the contributions separately. Additionally we only require the already computed asymptotic bounds (36) and (44). The first part yields The main contribution comes from T s j e n e js n Starting with (48) and combining the contributions (49) and (50) with δ s,i the Kronecker-delta function. 
Together with Stirling's formula for the Gamma function we thus obtain from (46), (47) and (51) Finally, applying the duplication formula of the factorials (38), we get the stated result for the asymptotic expansion of the factorial moments: The mixed Poisson limit law with Rayleigh mixing distribution as stated in Theorem 4 follows then from (45) and (52) by applying Lemma 2.

5.4. Parking functions and growth of the initial cluster. Parking functions are objects introduced by Konheim and Weiss [40], which are of interest in combinatorics (e.g., due to connections to various other combinatorial structures as forests, acyclic functions or hyperplane arrangements), see, e.g., [66], and computer science (e.g., due to close relations to hashing variants), see, e.g., [38]. A vivid description of parking functions is as follows: consider a one-way street with $n$ parking spaces numbered from 1 to $n$ and a sequence of $n$ drivers (we will here exclusively deal with the case that the number of parking spaces is equal to the number of drivers) with preferred parking spaces $s_1, s_2, \dots, s_n$. The drivers arrive sequentially and driver $k$ tries to park at his preferred parking space $s_k$. If it is free he parks; otherwise he moves further in the allowed direction (thus examining parking spaces $s_k + 1, s_k + 2, \dots$) until he finds a free parking space, where he parks; if there is no such parking space he leaves the street without parking. A parking function is then a sequence $(s_1, \dots, s_n) \in \{1, \dots, n\}^n$ such that all drivers are able to park. It has been shown already in [40] that there are exactly $P_n = (n + 1)^{n-1}$ parking functions of size $n$. Figure 8 gives an example of a parking function.
FIGURE 8. A parking function (. . . , 1, 8, 1, 3, 4, 3, 1) with the respective parking positions when carrying out the parking procedure.
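The parking procedure just described is easy to make concrete. The following Python sketch (the function names `park` and `count_parking_functions` are chosen here for illustration) carries out the procedure for a given preference sequence and checks the count $P_n = (n + 1)^{n-1}$ by brute force for small $n$.

```python
from itertools import product


def park(prefs):
    """Carry out the parking procedure for the preference sequence prefs.

    Returns the list of parking positions (driver k parks at positions[k-1]),
    or None if some driver leaves the street without parking.
    """
    n = len(prefs)
    occupied = [False] * (n + 2)          # spaces are numbered 1, ..., n
    positions = []
    for s in prefs:
        p = s
        while p <= n and occupied[p]:     # move on to the next space
            p += 1
        if p > n:
            return None                   # no free space left: not a parking function
        occupied[p] = True
        positions.append(p)
    return positions


def count_parking_functions(n):
    """Number of sequences in {1, ..., n}^n for which all drivers can park."""
    return sum(park(p) is not None for p in product(range(1, n + 1), repeat=n))


for n in range(1, 6):
    print(n, count_parking_functions(n), (n + 1) ** (n - 1))
```

The exhaustive count runs over all $n^n$ sequences and is therefore only feasible for small $n$.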
We note that there are many alternative ways of defining parking functions $(s_1, \dots, s_n) \in \{1, \dots, n\}^n$, e.g., via the characterization $|\{j : s_j \le k\}| \ge k$, for all $1 \le k \le n$; however, in what follows the description given above is more intuitive and seems to be advantageous. Namely, we may start with a parking function $(s_1, \dots, s_n)$ and consider the filling of the parking spaces during the parking procedure, where at the beginning (step 0) we have an empty street, and where in step $k$ the $k$-th driver arrives and successfully parks, until (after step $n$) eventually all parking spaces are occupied. When carrying out the parking procedure, at each time step we may define the initial cluster as the maximal sequence of consecutive occupied parking spaces starting with parking space 1; if parking space 1 is empty we say that the initial cluster is empty. The size of the initial cluster is then simply the number of consecutive occupied parking spaces containing parking space 1. In this section we are interested in the growth of the initial cluster during the parking procedure starting with a random parking function: let the r.v. $X_n$ denote the number of increments of the initial cluster and the refinement $X_{n,j}$ measure the number of increments of amount $j$ of the initial cluster during the parking procedure of a random parking function of size $n$; of course $X_n = \sum_{j=1}^{n-1} X_{n,j}$. It turns out that the r.v. $X_n$ and $X_{n,j}$ are closely related to $C_n$ and $C_{n,j}$, respectively, studied in Subsection 5.3 during the analysis of the edge-cutting procedure of Cayley-trees. Figure 9 illustrates the parking procedure and the growth of the initial cluster.
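The growth of the initial cluster can be traced step by step. The sketch below is a minimal illustration with hypothetical function names; it uses the characterization $|\{j : s_j \le k\}| \ge k$ to test for parking functions and records every positive increment of the initial cluster.

```python
from collections import Counter


def is_parking_function(prefs):
    """Characterization |{j : s_j <= k}| >= k for all 1 <= k <= n."""
    n = len(prefs)
    return all(sum(s <= k for s in prefs) >= k for k in range(1, n + 1))


def initial_cluster_increments(prefs):
    """Amounts of the successive increments of the initial cluster.

    The length of the returned list is the number of increments; the number
    of entries equal to j is the number of increments of amount j.
    """
    if not is_parking_function(prefs):
        raise ValueError("not a parking function")
    n = len(prefs)
    occupied = [False] * (n + 2)          # spaces 1, ..., n; index n + 1 stays empty
    size, increments = 0, []
    for s in prefs:
        p = s
        while occupied[p]:                # driver moves on to the first free space
            p += 1
        occupied[p] = True
        new_size = size
        while occupied[new_size + 1]:     # extend the run of occupied spaces from space 1
            new_size += 1
        if new_size > size:
            increments.append(new_size - size)
        size = new_size
    return increments


incs = initial_cluster_increments((3, 1, 1))
print(incs, Counter(incs))
```

For the preference sequence $(3, 1, 1)$ the first driver parks at space 3 without touching the initial cluster, the second driver creates an initial cluster of size 1, and the third driver closes the gap at space 2.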
It is well-known [66] that the number $P_n$ of parking functions of size $n$ coincides with the number of rooted labelled forests, i.e., forests of rooted labelled unordered trees, of size $n$. There are various known bijections between these objects; however, it seems that the following bijective relation between the growth of the initial cluster during the parking procedure and records in forests of rooted labelled trees has not been observed earlier. We mention that a similar bijection has been used by Chassaing and Louchard in [11], where they related the growth of clusters in parking functions with the additive coalescent model for particle coagulation. Analogous to the definition in Subsection 5.2, a max-record in a forest $F$ of labelled trees is a node $x \in F$ which has the largest label amongst all nodes on the unique path from the root-node of the tree component containing $x$ to $x$.
Proposition 4. There is a bijection which maps parking functions of size $n$ to forests of rooted labelled unordered trees of size $n$, such that the number of increments of the initial cluster during the parking procedure of a parking function corresponds to the number of max-records in the forest. Moreover, the number of increments of amount $j$ of the initial cluster during the parking procedure corresponds to the number of max-record-subtrees of size $j$ in the forest.
Proof. Given a parking function $(s_1, \dots, s_n) \in \{1, \dots, n\}^n$ of size $n$ we describe the mapping, i.e., the construction of the corresponding rooted labelled forest $F$ of size $n$, in an iterative way, which reflects the parking procedure of the $n$ drivers. In order to describe the construction we assume that after step $k$ the first $k$ drivers are parked; then the "parking street" consists of a set of clusters of parking spaces (a cluster is here a maximal sequence of consecutive occupied parking spaces) separated by empty parking spaces. In the construction the $k$-th driver of the parking function will correspond to the node labelled $k$ in the forest and after step $k$ we will obtain a rooted labelled forest $F^{(k)}$ of size $k$ (with nodes labelled by $\{1, \dots, k\}$). Moreover, the forest $F^{(k)}$ has the property that each cluster of parking spaces occurring in the parking procedure after step $k$ corresponds to a subset of rooted labelled trees in $F^{(k)}$. The construction of the forest $F := F^{(n)}$ is as follows:
Step 0: We start with the empty forest $F^{(0)} = \emptyset$.
Step k: According to the parking of driver k we distinguish between two cases.
• Driver $k$ parks at his preferred parking space $s := s_k$: let us consider the parking procedure after Step $(k - 1)$ and the cluster of parking spaces starting with parking space $s + 1$ (i.e., the cluster of parking spaces right to the parking space of driver $k$); if parking space $s + 1$ is empty the cluster is $\emptyset$. According to the construction this cluster corresponds to a subset $G$ of trees of the forest $F^{(k-1)}$. Let $T$ be the tree rooted at the new node labelled $k$ with $G$ its subtrees. Then the forest $F^{(k)}$ is defined by $F^{(k)} := (F^{(k-1)} \setminus G) \cup \{T\}$.
• Driver $k$ cannot park at his preferred parking space $s_k$, since it is occupied by the $\ell$-th driver ($\ell < k$), but parks at the first empty space $s > s_k$: let us consider the parking procedure after Step $(k - 1)$ and the cluster of parking spaces starting with parking space $s + 1$ (i.e., the cluster of parking spaces right to the parking space of driver $k$); if parking space $s + 1$ is empty the cluster is $\emptyset$. According to the construction this cluster corresponds to a subset $G$ of trees of the forest $F^{(k-1)}$. Furthermore, let $T'$ be the tree of the forest $F^{(k-1)}$ containing label $\ell$ (by construction $T' \notin G$). Then, construct the rooted tree $T''$ by letting $G$ be the subtrees of the (new) node labelled $k$ and attaching node $k$ to node $\ell \in T'$. Then the forest $F^{(k)}$ is defined by $F^{(k)} := \big(F^{(k-1)} \setminus (G \cup \{T'\})\big) \cup \{T''\}$.
FIGURE 10. Constructing a forest of labelled rooted trees from a parking function during the parking procedure illustrating the cases, where the preferred parking space of a driver is free (first picture) or occupied (second picture).
From the construction it follows that node $k$ is a max-record in the forest $F$ iff the first $k - 1$ drivers occupy all parking spaces left to the parking space $s$, where the $k$-th driver has parked, which is equivalent to the event that Step $k$ was an increment of the initial cluster. Moreover, in this case the subtree rooted at node $k$ in $F^{(k)}$ corresponds to a record-subtree in the forest $F$, whose size corresponds then to the amount of increment of the initial cluster. It is not difficult to see that this construction is indeed a bijection from the set of parking functions of size $n$ to the set of rooted labelled forests of size $n$, but we omit here the inverse mapping (which could also be formulated in an iterative way).
FIGURE 11. A parking function (. . . , 1, 8, 1, 3, 4, 3, 1) and the forest of labelled rooted trees obtained by the mapping described in the proof of Proposition 4 as well as the edge-labelled rooted tree described in the proof of Theorem 5. The increments of the initial cluster of the parking function correspond to the max-records in the respective forest as well as to the cuts in the respective edge-labelled rooted tree (visualized as dotted lines).
In Figure 11 we give a parking function and the corresponding forest of labelled rooted trees under this bijection.
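The iterative construction from the proof of Proposition 4 can be sketched in Python as follows. This is one possible reading of the construction, not the authors' implementation: forests are stored as parent arrays (with `parent[k] == 0` marking a root), the trees belonging to a cluster are found via the roots of the drivers parked in it, and all function names are hypothetical. A max-record-subtree is taken here to be the subtree of a max-record with the subtrees of all further max-records below it removed.

```python
from itertools import product


def forest_from_parking_function(prefs):
    """Return (parent, increments), or None if prefs is not a parking function.

    parent[k] is the parent of node k (0 marks a root); increments[k] is the
    growth of the initial cluster caused by driver k (0 if it did not grow).
    """
    n = len(prefs)
    driver_at = [0] * (n + 2)             # driver parked at each space (0 = empty)
    parent = [0] * (n + 1)
    increments = [0] * (n + 1)

    def root(x):
        while parent[x]:
            x = parent[x]
        return x

    def initial_cluster():
        size = 0
        while size < n and driver_at[size + 1]:
            size += 1
        return size

    for k, pref in enumerate(prefs, start=1):
        s = pref
        while s <= n and driver_at[s]:
            s += 1
        if s > n:
            return None
        before = initial_cluster()
        right_roots, p = set(), s + 1     # G: trees of the cluster starting at s + 1
        while p <= n and driver_at[p]:
            right_roots.add(root(driver_at[p]))
            p += 1
        for r in right_roots:             # the trees in G become subtrees of node k
            parent[r] = k
        if s != pref:                     # preferred space occupied by driver l:
            parent[k] = driver_at[pref]   # attach node k to node l
        driver_at[s] = k
        increments[k] = initial_cluster() - before
    return parent, increments


def max_records(parent):
    """Nodes carrying the largest label on the path from their root."""
    records = []
    for x in range(1, len(parent)):
        y = parent[x]
        while y and y < x:
            y = parent[y]
        if y == 0:
            records.append(x)
    return records


def record_subtree_sizes(parent):
    """Size of the max-record-subtree of each max-record."""
    records = set(max_records(parent))
    sizes = dict.fromkeys(records, 0)
    for x in range(1, len(parent)):
        while x not in records:           # climb to the nearest max-record
            x = parent[x]
        sizes[x] += 1
    return sizes


parent, increments = forest_from_parking_function((3, 1, 1))
print(parent[1:], increments[1:], record_subtree_sizes(parent))
```

For small $n$ one can run the mapping over all parking functions and compare the max-records of the resulting forest with the steps at which the initial cluster grew; such an exhaustive check is of course no substitute for the proof.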
Proposition 4 thus yields a coupling between records in forests of rooted labelled unordered trees and increments in parking functions. This coupling can be extended easily to one between increments in parking functions and edge-cuts to isolate the root-node in Cayley-trees.
Theorem 5. The random variable $X_n$ counting the number of increments of the initial cluster in a random parking function of size $n$ is equally distributed as the random variable $C_n$ counting the number of edge-cuts to isolate the root-node in a random Cayley-tree of size $n$, i.e., $X_n \stackrel{\mathcal{L}}{=} C_n$. Moreover, the number of increments of amount $j$ in a random parking function of size $n$ is equally distributed as the number of subtrees of size $j$ cut off during the edge-cutting procedure when starting with a random size-$n$ Cayley-tree, i.e., $X_{n,j} \stackrel{\mathcal{L}}{=} C_{n,j}$, $1 \le j \le n - 1$. After suitable normalization $X_n$ is asymptotically, for $n \to \infty$, Rayleigh distributed with parameter 1: $\frac{X_n}{\sqrt{n}} \xrightarrow{\mathcal{L}} \text{Rayleigh}(1)$. $X_{n,j}$ has, for $n \to \infty$ and arbitrary $1 \le j = j(n) \le n - 1$, asymptotically factorial moments of mixed Poisson type with a Rayleigh mixing distribution $X$ and scale parameter $\lambda_{n,j} = \frac{\sqrt{n}\, j^{j-1}}{j!\, e^{j}}$:
(i) for $\lambda_{n,j} \to \infty$ the random variable $\frac{X_{n,j}}{\lambda_{n,j}}$ converges in distribution, with convergence of all moments, to $X$.
(ii) for $\lambda_{n,j} \to \rho \in [0, \infty)$ the random variable $X_{n,j}$ converges in distribution, with convergence of all moments, to a mixed Poisson distributed random variable $Y \stackrel{\mathcal{L}}{=} \text{MPo}(\rho X)$.
Proof. Starting with a labelled forest $F$ of size $n$ we get a rooted labelled tree $T$ of size $n + 1$ by attaching the trees in $F$ as subtrees of the root-node labelled 0. Next we label the edges of $T$ by labels $\{1, \dots, n\}$, where each edge $e = (x, y)$, with $x$ the parent of $y$, gets the label of the child $y$. When applying the edge-cutting procedure to the edge-labelled tree $T$ in such a way that at each step the edge with the largest label is chosen and cut off, each max-record of the original forest $F$ corresponds to a cut in $T$, and furthermore a record-subtree of size $j$ in $F$ corresponds in $T$ to a cut-off of a branch of size $j$. Together with Proposition 4 this yields $X_n \stackrel{\mathcal{L}}{=} C_n$ and $X_{n,j} \stackrel{\mathcal{L}}{=} C_{n,j}$. The limiting distribution results for $X_n$ and $X_{n,j}$ thus follow from the corresponding results for $C_n$ and $C_{n,j}$ given in [31,55] and Theorem 4, respectively.

5.5. Zero contacts in bridges. We consider directed lattice paths from left to right starting at $(0, 0)$ and ending at $(2n, 0)$. At each horizontal unit step we can either go one unit up (step $(1, 1)$) or down (step $(1, -1)$). Such lattice paths are called bridges of length $2n$ starting and ending at height zero, and the steps are stemming from so-called Dyck paths. Of course, such lattice paths are in bijection with lattice paths on a square grid, starting at $(0, 0)$ and ending at $(n, n)$, with allowed steps $(1, 0)$ (right) and $(0, 1)$ (up). Clearly, there are $B_n = \binom{2n}{n}$ such lattice paths and thus bridges of length $2n$.
Flajolet and Sedgewick [19, Example IX.40, page 707] considered the random variable $X_n$ counting the number of visits of the $x$-axis in a random bridge of size $2n$, i.e., the number of $k$, $1 \le k \le n$, with $(2k, 0)$ contained in the bridge, by selecting one of the $B_n$ bridges of length $2n$ uniformly at random. By using a combinatorial decomposition of bridges (the so-called arch decomposition), they have shown that $X_n$, suitably normalized, follows asymptotically a Rayleigh distribution. We consider here a refinement of the r.v. $X_n$ by introducing the r.v. $X_{n,j}$ counting the number of $j$-visits of the $x$-axis, where a $j$-visit is simply a visit after an excursion of length $2j$, i.e., a return to the $x$-axis after exactly $2j$, $j \ge 1$, steps. Of course, $X_n = \sum_{j=1}^{n} X_{n,j}$. Figure 12 illustrates these quantities. In order to examine the limiting behaviour of $X_{n,j}$ we start with a combinatorial description of the problem using the before mentioned arch decomposition. Let $\mathcal{B}$ be the combinatorial family of bridges of length $\ge 0$ and $\mathcal{D}$ be the family of positive Dyck path excursions of length $\ge 2$, i.e., Dyck paths of positive length starting and ending on the $x$-axis, where all points in between are above the $x$-axis. Analogously, let $\overline{\mathcal{D}}$ be the family of negative Dyck path excursions of length $\ge 2$, i.e., Dyck paths of positive length starting and ending on the $x$-axis, where all points in between are below the $x$-axis. Then, $\mathcal{B}$ consists of a sequence of positive and negative Dyck path excursions, i.e., it can be described combinatorially by the SEQ construction:
$$\mathcal{B} = \text{SEQ}(\mathcal{D} + \overline{\mathcal{D}}). \qquad (53)$$
Furthermore, the family $\mathcal{D}$ can be described formally as
$$\mathcal{D} = \mathcal{Z} \times \text{SEQ}(\mathcal{D}). \qquad (54)$$
Of course, by extracting coefficients we reobtain $B_n = \binom{2n}{n}$, whereas the sequence $D_n = \frac{1}{n}\binom{2(n-1)}{n-1}$ is enumerated by the (shifted) Catalan numbers. But more interestingly, the above combinatorial description (53)-(54) can be extended easily to enumerate a suitably introduced bivariate generating function of the distribution of $X_{n,j}$: $B_j(z, v) := \sum_{n \ge 0} \sum_{m \ge 0} B_n\, P\{X_{n,j} = m\}\, z^n v^m$.
We get then for $j \ge 1$ the solution In order to obtain the factorial moments of $X_{n,j}$ we set $w := v - 1$ and introduce the function $\tilde{B}_j(z, w) := B_j(z, w + 1)$, thus yielding Extracting coefficients by using standard singularity analysis easily gives E(X Thus, by an application of Lemma 2 we get the following characterization of the limit law of $X_{n,j}$.
Theorem 6. The random variable $X_{n,j}$ counting the number of $j$-visits of the $x$-axis in a random bridge of length $2n$ has, for $n \to \infty$ and arbitrary $1 \le j = j(n) \le n$ with $j = o(n)$, asymptotically factorial moments of mixed Poisson type with a Rayleigh mixing distribution $X$ and scale parameter λ n,j =
(i) for $\lambda_{n,j} \to \infty$ the random variable $\frac{X_{n,j}}{\lambda_{n,j}}$ converges in distribution, with convergence of all moments, to $X$.
(ii) for $\lambda_{n,j} \to \rho \in (0, \infty)$ the random variable $X_{n,j}$ converges in distribution, with convergence of all moments, to $Y \stackrel{\mathcal{L}}{=} \text{MPo}(\rho X)$.
Moreover, the random variable $Y \stackrel{\mathcal{L}}{=} \text{MPo}(\rho X)$ converges for $\rho \to \infty$, after scaling, to its mixing distribution $X$: $\frac{Y}{\rho} \xrightarrow{\mathcal{L}} X$, with convergence of all moments.
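For small $n$ the quantities $B_n$, $D_j$, $X_n$ and $X_{n,j}$ can be checked by exhaustive enumeration. The following Python sketch is only meant as an illustration; the function names are hypothetical, and bridges are encoded as sequences of steps $+1$ and $-1$.

```python
from collections import Counter
from itertools import product
from math import comb


def bridges(n):
    """All bridges of length 2n as tuples of steps +1 (up) and -1 (down)."""
    for steps in product((1, -1), repeat=2 * n):
        if sum(steps) == 0:
            yield steps


def excursion_half_lengths(bridge):
    """Half-lengths j of the excursions between consecutive visits of the x-axis.

    The number of entries is the number of visits of the x-axis; the number of
    entries equal to j is the number of j-visits.
    """
    lengths, height, last_zero = [], 0, 0
    for i, step in enumerate(bridge, start=1):
        height += step
        if height == 0:                   # visit of the x-axis after i steps
            lengths.append((i - last_zero) // 2)
            last_zero = i
    return lengths


def total_j_visits(n):
    """Counter j -> total number of j-visits, summed over all bridges of length 2n."""
    totals = Counter()
    for b in bridges(n):
        totals.update(excursion_half_lengths(b))
    return totals


n = 5
print(sum(1 for _ in bridges(n)), comb(2 * n, n))
print(sorted(total_j_visits(n).items()))
```

Dividing the totals returned by `total_j_visits(n)` by $B_n$ gives the exact expectations $E(X_{n,j})$ for the chosen $n$.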
Of course, the result above can be readily adapted to obtain joint distributions for the number of $j$-visits and the total number of visits of the $x$-axis as considered by Flajolet and Sedgewick [19]; see also Subsection 7.1.
5.6. Cyclic points and trees in graphs of random mappings. We call a function $f : [n] \to [n]$ from the finite set $[n] := \{1, 2, \dots, n\}$ into itself an $n$-mapping (or an $n$-mapping function); let us denote by $\mathcal{F}_n$ the set of $n$-mappings. When selecting one of the $n^n$ $n$-mappings at random (i.e., if we assume that each of the $n^n$ $n$-mappings can occur equally likely) one speaks about a random $n$-mapping. There exists a vast literature (see, e.g., [13,14,18] and references therein) devoted to revealing the typical behaviour of important quantities (e.g., the number of components, the number of cyclic nodes, etc.) of random $n$-mappings and the corresponding mapping graphs, respectively.
The mapping graph, i.e., the functional digraph, of an n-mapping f ∈ F n is the directed graph G f = (V, E) with set of vertices V = [n] and set of directed edges E = {(i, f (i)), i ∈ [n]}. The structure of the mapping graph G f of an arbitrary mapping function f is well known [13,18]: the weakly connected components of G f are cycles of rooted labelled trees, i.e., Cayley-trees, which means that each connected component consists of rooted labelled trees (with edges oriented towards the root nodes) whose root nodes are connected by directed edges such that they are forming a cycle.
This description allows us to interpret a mapping $f$ as a set of cycles of labelled trees. Hence, in order to describe the family $\mathcal{F} = \bigcup_{n \ge 0} \mathcal{F}_n$ of all mappings, we can apply the combinatorial constructions SET and CYCLE to the family of rooted labelled trees $\mathcal{T}$, as introduced and discussed in Subsection 5.2. This yields the following combinatorial description of the family of mappings $\mathcal{F}$: $\mathcal{F} = \text{SET}(\text{CYCLE}(\mathcal{T}))$.
Hence, the exponential generating function $F(z) := \sum_{n \ge 0} n^n \frac{z^n}{n!} = \sum_{n \ge 0} \tilde{F}_n z^n$, with $\tilde{F}_n := \frac{n^n}{n!}$, of the number of $n$-mappings satisfies
$$F(z) = \frac{1}{1 - T(z)}, \qquad (56)$$
where the tree function $T(z)$ is defined in (31). Equation (56) suggests the alternative combinatorial description of mappings as sequences of rooted labelled trees: $\mathcal{F} = \text{SEQ}(\mathcal{T})$. This can be justified easily by using an analogue of the canonical cycle representation of permutations: for each cycle of trees order the cycle by starting with the tree having the largest root-label amongst all these trees (let us call it the cycle leader) and then rank the different cycles in descending order of the cycle leaders.
The random variable $X_n$ counting the number of cyclic points in a random $n$-mapping $f$ (i.e., elements $k \in [n]$ such that there exists an $\ell > 0$ with $k = f^{\ell}(k)$), which of course coincides with the number of rooted labelled trees in the decomposition of the mapping graph $G_f$ given above, has been analyzed by Drmota and Soria [13]. They have shown a Rayleigh limit law: $\frac{X_n}{\sqrt{n}} \xrightarrow{\mathcal{L}} \text{Rayleigh}(1)$. We are considering here a refinement of $X_n$, namely, we introduce the random variables $X_{n,j}$ counting the number of trees of size $j$ occurring in the decomposition of the mapping graph $G_f$ of a random $n$-mapping $f \in \mathcal{F}_n$; of course, it holds $X_n = \sum_{j=1}^{n} X_{n,j}$. The quantities considered are visualized in Figure 13. When introducing a suitable generating function of the distribution of $X_{n,j}$ via $F_j(z, v) := \sum_{n \ge 0} \sum_{m \ge 0} \tilde{F}_n\, P\{X_{n,j} = m\}\, z^n v^m$, the combinatorial decomposition of the family of mappings $\mathcal{F}$ as given above immediately yields an explicit formula for $F_j(z, v)$, $j \ge 1$: Introducing $w := v - 1$ and $\tilde{F}_j(z, w) := F_j(z, w + 1)$ gives from which the factorial moments of $X_{n,j}$ can be obtained easily by extracting coefficients. This can be done completely analogously to the computations in Subsection 5.2, leading for $j = o(n)$ to the following result: which together with Lemma 2 shows the theorem stated below.
Theorem 7. The random variable $X_{n,j}$ counting the number of trees of size $j$ occurring in the decomposition of the mapping graph of a random $n$-mapping has, for $n \to \infty$ and arbitrary $1 \le j = j(n) \le n$ with $j = o(n)$, asymptotically factorial moments of mixed Poisson type with a Rayleigh mixing distribution $X$ and scale parameter $\lambda_{n,j} = \frac{j^{j-1} \sqrt{n}}{j!\, e^{j}}$: $E(X_{n,j}^{\underline{s}}) = (\lambda_{n,j})^s\, 2^{\frac{s}{2}}\, \Gamma\big(\frac{s}{2} + 1\big) \cdot \big(1 + o(1)\big)$.
(i) for $\lambda_{n,j} \to \infty$ the random variable $\frac{X_{n,j}}{\lambda_{n,j}}$ converges in distribution, with convergence of all moments, to $X$.
(ii) for $\lambda_{n,j} \to \rho \in (0, \infty)$ the random variable $X_{n,j}$ converges in distribution, with convergence of all moments, to $Y \stackrel{\mathcal{L}}{=} \text{MPo}(\rho X)$.
Moreover, the random variable $Y \stackrel{\mathcal{L}}{=} \text{MPo}(\rho X)$ converges for $\rho \to \infty$, after scaling, to its mixing distribution $X$: $\frac{Y}{\rho} \xrightarrow{\mathcal{L}} X$, with convergence of all moments.
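The decomposition of a mapping graph into cyclic points and the trees attached to them can be computed directly for small examples. In the sketch below (hypothetical function names; a mapping $f$ is encoded as a tuple with `f[i - 1]` the image of $i$) the cyclic points are obtained as the image of the $n$-fold iterate of $f$, and every node is assigned to the first cyclic point reached by iterating $f$. As a plausibility check, the exact mean number of cyclic points is compared with the classical formula $\sum_{k=1}^{n} \frac{n!}{(n-k)!\, n^k}$, which is not taken from the text above.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import factorial


def tree_sizes(f):
    """Map each cyclic point of f to the size of the tree rooted at it."""
    n = len(f)
    cyclic = set()
    for x in range(1, n + 1):
        y = x
        for _ in range(n):                # after n iterations we sit on a cycle
            y = f[y - 1]
        cyclic.add(y)
    sizes = Counter()
    for x in range(1, n + 1):
        while x not in cyclic:            # walk towards the root of the tree
            x = f[x - 1]
        sizes[x] += 1
    return dict(sizes)


def mean_cyclic_points(n):
    """Exact expectation of the number of cyclic points over all n^n mappings."""
    total = sum(len(tree_sizes(f)) for f in product(range(1, n + 1), repeat=n))
    return Fraction(total, n ** n)


def mean_cyclic_points_formula(n):
    """Classical closed form: sum over k of n! / ((n - k)! * n^k)."""
    return sum(Fraction(factorial(n), factorial(n - k) * n ** k) for k in range(1, n + 1))


sizes = tree_sizes((2, 1, 1, 3, 5))
print(sizes, Counter(sizes.values()))     # trees per cyclic point, and counts per size j
print(mean_cyclic_points(4), mean_cyclic_points_formula(4))
```

The number of keys of the returned dictionary is the number of cyclic points of the mapping, and the number of values equal to $j$ is the number of trees of size $j$.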

MULTIVARIATE MIXED POISSON DISTRIBUTIONS
Definition 1 readily extends to multivariate distributions; compare with [15].
where $\mu_{s_1, \dots, s_m} = E(X_1^{s_1} \cdots X_m^{s_m})$ for $s_1, \dots, s_m \ge 0$; this can readily be seen by a direct computation.
of generating functions of the form $G(H(z))$. Combinatorially, this amounts to substitution between structures of the form $\mathcal{F} = \mathcal{G} \circ \mathcal{H}$, We measure the size of the so-called core $X_n$, and additionally take into account the contribution of parts of size $j$ to the core: Here the variable $u$ marks as usual the total size of the so-called core $X_n$, $P\{X_n = m\} = \frac{[z^n u^m] F(z, u, 1)}{[z^n] G(H(z))}$, and the new variable $v$ marks the contribution of parts of size $j$ measured by the random variable $X_{n,j}$ to the core such that $X_n = \sum_{j=1}^{n} X_{n,j}$. Using the semi-large power theorem of [19] (Theorem IX.16, page 709) one can study this $j$-part core $X_{n,j}$. More generally, it is desirable to study the joint distributions $(X_n; X_{n,j_1}, \dots, X_{n,j_k})$ via We will report on our findings on this refined analysis of compositions elsewhere [41].
7.2. Open problems. It should be possible to use mixed Poisson approximation and to bound distances, e.g., the total variation distance, between the random variables of interest and the corresponding mixed Poisson distributions.