On the topological support of species sampling priors

Abstract: In Bayesian nonparametric statistics, it is crucial that the support of the prior is very large. Here, we consider species sampling priors. Such priors are widely used within mixture models, and it has been shown in the literature that a large support for the mixing prior is essential to ensure the consistency of the posterior. In this paper, simple conditions are given that are necessary and sufficient for the support of a species sampling prior to be full. In particular, for proper species sampling priors, the condition is that the maximum size of the atoms of the corresponding process is small with positive probability. We apply this result to show that the main classes of species sampling priors known in the literature have full support under mild conditions. Moreover, we find priors with a very simple construction still having full support.


Introduction
The main motive of Bayesian nonparametrics is to avoid restrictive parametric assumptions about the distribution generating the data. This is done by constructing random probability measures whose distributions, to be used as priors, have a large support (see, for instance, Ghosal [10]). This paper considers the support (with respect to the weak topology) of the distribution of species sampling processes, which are a relevant class of discrete random probability measures. Species sampling models, which have been developed by Pitman [31] and Hansen and Pitman [14], have been extensively studied in the Bayesian nonparametric statistical literature. See for instance Ishwaran and James [17], Ongaro [26], Navarrete, Quintana and Mueller [24], Jang, Lee and Lee [19], Hjort et al. [15] and citations provided therein.
A large support is thus an essential requirement for any Bayesian nonparametric model to be appropriate. Moreover, the support of species sampling processes plays a relevant role for the consistency of mixed models. Indeed, these processes are often used as mixing distributions in Bayesian mixture models for density estimation. Ghosal, Ghosh and Ramamoorthi [11] find sufficient conditions for the weak consistency of posteriors for normal mixture models. Their only requirement for the prior of the mixing distribution is that its support contains the true mixing distribution. More generally, a large support is usually the only basic assumption about the mixing prior in the literature on the consistency of Bayesian mixture models. Such an assumption is made, for instance, by Datta [4], who considers a general mixture model and proves the consistency of the predictive distribution and of the estimator of the mixing distribution under mild conditions. This is also the case for Wu and Ghosal [39], who obtain sufficient conditions under which a prior obtained by mixing a wide range of kernels ensures weak consistency of the posterior. The same assumption is made by De Blasi, James and Lau [5], who consider mixed multinomial logit models, and by Bhattacharya and Dunson [1], who study consistency for more general mixture models.
In spite of the amount of literature regarding species sampling priors, their support has not generally received much attention. To the best of our knowledge, this is the case for the best-known priors in this class, apart from the Dirichlet prior. Furthermore, one might wonder whether it is possible to consider other discrete random probability measures with a much simpler structure, still having a prior with large support.
The best-known species sampling prior is the Dirichlet prior introduced and studied by Ferguson [8]. Ferguson [8] states that the topological support (with respect to the topology of weak convergence) of a Dirichlet prior is the set of all probability measures whose topological supports are contained in that of the parameter measure of the Dirichlet prior, which is the marginal distribution of one observation. Later on, it will be shown that the support of a prior with the same marginal cannot be larger. For this reason, if a prior has the same support as the Dirichlet prior with the same marginal, then we say that the prior has full support. Indeed, the support of the Dirichlet prior contains essentially all probability measures on the relevant sample space, as will be clear later on.
Ferguson's [8] statement about the support of the Dirichlet prior on the real line has been extended to broader classes of prior distributions. Majumdar [23] generalizes it by replacing the real line with an arbitrary separable metric space. Ongaro and Cattaneo [27] find sufficient conditions for species sampling priors to have full support. The present paper provides two much weaker and simple conditions that are necessary and sufficient for this result. In particular, for proper species sampling priors, one such condition is that the maximum size of the atoms of the corresponding process is small with positive probability. We shall also show that if this condition is not satisfied, then the support is remarkably poor, not including, for instance, continuous probability measures.
By verifying this condition, we shall prove that the support is full under mild conditions for the most relevant species sampling priors, such as the stick-breaking priors and the Poisson-Kingman priors, which in turn include the normalized random measures with independent increments, the Pitman-Yor process priors and the Gibbs-type priors with infinitely many species. In particular, for the Poisson-Kingman model, this condition cannot be verified directly, as there is no closed-form expression for the distribution of the maximum atom size. A proof is given based on specific properties of non-homogeneous Poisson processes.
By means of our results, we shall find priors with full support that have a very simple construction and are expected to be easier to implement than the more complex Bayesian nonparametric models. This is the direction pointed out by Fuentes-García, Mena and Walker [9], who consider the species sampling prior with geometric weights. Namely, they specify the random sizes of the atoms as $P_j = W(1-W)^{j-1}$ ($j \ge 1$). They use this prior for density estimation, showing that its inferential performance is at least as good as that of the usual nonparametric models, while the estimation algorithm is more efficient. We prove that the only condition for this prior to have full support is that zero belongs to the support of the random variable $W$. Another simple way to construct a species sampling prior with full support consists of taking $P_j = 1/K$, for $j = 1, \dots, K$, where $K$ is an unbounded positive random variable taking values in the natural numbers.
The outline of the paper is the following. Section 2 presents the general results, Section 3 shows that the support is full for different priors, including the most important nonparametric priors among the species sampling models. The Appendix contains the proof of Theorem 2 and Proposition 6.

General results
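Let $\mathbb{X}$ be a separable metrizable space and $\mathcal{X}$ its Borel σ-field. Let $\mathcal{P}$ be the set of all probability measures on $(\mathbb{X}, \mathcal{X})$. Recall that $\mathcal{P}$, equipped with the topology of weak convergence, is metrizable as a separable metric space [see 28], and let $\mathcal{B}(\mathcal{P})$ be its Borel σ-field. Moreover, for every $\mu$ in $\mathcal{P}$, an open base of neighborhoods of $\mu$ for the topology of weak convergence on $\mathcal{P}$ is the class of sets
$$U_{\varepsilon_1,\dots,\varepsilon_k}(\mu; A_1,\dots,A_k) = \{\nu \in \mathcal{P} : |\nu(A_i) - \mu(A_i)| < \varepsilon_i,\ 1 \le i \le k\},$$
where $k$ is a positive integer, $(\varepsilon_1,\dots,\varepsilon_k)$ belongs to $(0,\infty)^k$, and $(A_1,\dots,A_k)$ is a $k$-tuple of $\mu$-continuity sets in $\mathcal{X}$. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space on which all the random variables considered throughout the present paper are defined. A random probability measure is a measurable map from $(\Omega, \mathcal{F}, \mathbb{P})$ into $(\mathcal{P}, \mathcal{B}(\mathcal{P}))$.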
Recall that the topological support $S_\mu$ of a finite Borel measure $\mu$ on a second countable topological space is defined as the intersection of all closed sets $F$ such that $\mu(F^c) = 0$. As usual, the support of a random variable is defined to be the support of its probability distribution. It is known that $s$ belongs to $S_\mu$ if and only if $\mu$ is positive on every neighborhood of $s$. So, a probability measure $\mu$ in $\mathcal{P}$ belongs to the support of a random probability measure $P$ if and only if $\mathbb{P}(P \in U_{\varepsilon_1,\dots,\varepsilon_k}(\mu; A_1,\dots,A_k)) > 0$ for every $(\varepsilon_1,\dots,\varepsilon_k)$ in $(0,\infty)^k$, every $k$-tuple $(A_1,\dots,A_k)$ of $\mu$-continuity subsets of $\mathbb{X}$ and every integer $k \ge 1$.
Majumdar [23] proves that the topological support of the Dirichlet prior with parameter $\alpha$ is the class of all probability measures $\mu$ such that $S_\mu \subset S_\alpha$.
Species sampling processes are discrete random probability measures that generalize the Dirichlet process. A species sampling process is a random discrete probability measure of the following form:
$$P = \sum_{j \ge 1} P_j\,\delta_{Z_j} + \Big(1 - \sum_{j \ge 1} P_j\Big)\alpha, \qquad (1)$$
where $\alpha$ is a probability measure on $\mathbb{X}$ which is diffuse (that is, $\alpha\{x\} = 0$ for every $x$ in $\mathbb{X}$), $(P_j)_{j\ge1}$ and $(Z_j)_{j\ge1}$ are two independent sequences of random variables such that $P_j \ge 0$ for every $j \ge 1$ and $\sum_{j=1}^{\infty} P_j \le 1$, $\mathbb{P}$-a.s., and $(Z_j)_{j\ge1}$ is an i.i.d. sequence whose common distribution is $\alpha$. If $\sum_{j=1}^{\infty} P_j = 1$, $\mathbb{P}$-a.s., then (1) becomes
$$P = \sum_{j \ge 1} P_j\,\delta_{Z_j}. \qquad (2)$$
To be consistent with Pitman's terminology, here a species sampling process that admits a representation of the form (2) is called proper. The Dirichlet process is a proper species sampling process, as shown by Ferguson [8] and Sethuraman [36]. The support of a Dirichlet prior is essentially as big as possible, as shown in the next proposition.
Proposition 1. Let $Q$ be a random probability measure on $(\mathbb{X}, \mathcal{X})$ such that $\alpha(A) = E(Q(A))$ for every $A$ in $\mathcal{X}$. If $\mu$ is a probability measure on $(\mathbb{X}, \mathcal{X})$ belonging to the support (in the weak topology) of the distribution of $Q$, then $S_\mu \subset S_\alpha$.
Proof. We need to show that the set $C := \{\mu \in \mathcal{P} : S_\mu \subset S_\alpha\}$ is closed and that $Q$ belongs to $C$ almost surely. To prove the second claim, consider that $Q(S_\alpha^c) \ge 0$ a.s. and $0 = \alpha(S_\alpha^c) = E(Q(S_\alpha^c))$, where $E$ denotes the expectation w.r.t. $\mathbb{P}$. Hence, $Q(S_\alpha^c) = 0$ a.s. and therefore $S_Q \subset S_\alpha$ a.s. To prove closedness, we assume that $(Q_n)_{n\ge1}$ is a sequence of elements of $C$ such that $Q_n$ weakly converges to some $Q_\infty$ in $\mathcal{P}$, and we show that $Q_\infty$ belongs to $C$. To this aim, note that $Q_n(S_\alpha)$ is equal to one, since $Q_n$ belongs to $C$, for every $n \ge 1$. Moreover, since $S_\alpha$ is a closed set, we can apply the Portmanteau theorem to obtain that
$$Q_\infty(S_\alpha) \ge \limsup_{n\to\infty} Q_n(S_\alpha) = 1.$$
Hence, $Q_\infty(S_\alpha)$ is equal to one, i.e. $S_{Q_\infty} \subset S_\alpha$, and the proof is complete.
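As an illustration of representation (1), the following minimal Python sketch evaluates the mass that one realization of a species sampling process assigns to a set. The frequencies, the choice of α as the uniform distribution on (0, 1), the set A and the helper name `species_sampling_mass` are illustrative assumptions and are not taken from the paper.

```python
import random

def species_sampling_mass(weights, atoms, alpha_mass, in_set):
    """Mass given to a set A by P = sum_j P_j delta_{Z_j} + (1 - sum_j P_j) alpha.

    weights    -- frequencies P_j (nonnegative, summing to at most one)
    atoms      -- locations Z_j, one per frequency
    alpha_mass -- alpha(A), the mass the diffuse measure gives to A
    in_set     -- indicator function of A
    """
    discrete = sum(p for p, z in zip(weights, atoms) if in_set(z))
    leftover = 1.0 - sum(weights)          # mass carried by the diffuse part
    return discrete + leftover * alpha_mass

rng = random.Random(0)
weights = [0.5, 0.2, 0.1]                  # illustrative frequencies, total 0.8
atoms = [rng.random() for _ in weights]    # Z_j i.i.d. from alpha = Uniform(0, 1)

def in_A(z):
    return z < 0.3                         # A = [0, 0.3), so alpha(A) = 0.3

mass_A = species_sampling_mass(weights, atoms, 0.3, in_A)
mass_complement = species_sampling_mass(weights, atoms, 0.7, lambda z: not in_A(z))
```

When all frequencies vanish, the same function returns α(A), which is the purely diffuse case of (1).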

Proposition 1 leads to the following definition:
Definition. Let $\Pi$ be a prior distribution on $\mathcal{P}$, let $P$ be a random probability measure with distribution $\Pi$ and let $\alpha$ be the measure on $(\mathbb{X}, \mathcal{X})$ such that $\alpha(A) = E(P(A))$ for every $A$ in $\mathcal{X}$. $\Pi$ (or equivalently $P$) is said to have full support if and only if the support (in the weak topology) of $\Pi$ is the set of all probability measures $\mu$ on $(\mathbb{X}, \mathcal{X})$ such that $S_\mu \subset S_\alpha$.
To clarify the sense of the previous definition, let $X$ be an observation from $P$, i.e. $P$ is the conditional distribution of $X$ given $P$. Then, $\mathbb{X}$ is the sample space for $X$ and, in the most natural case, $\mathbb{X}$ is also the support of the marginal distribution $\alpha$ of $X$. In this case, the support of $P$ is full if it coincides with the whole class of probability measures on $\mathbb{X}$.
Our aim is to find necessary and sufficient conditions for a species sampling prior to have full support. Ongaro and Cattaneo [27] have essentially shown that a sufficient condition for a proper species sampling prior to have full support is that, for every $n$, the support of the law of $(P_1, \dots, P_n)$ is full, i.e. is equal to the $n$-dimensional simplex $\{(p_1, \dots, p_n) : \sum_{j=1}^{n} p_j \le 1,\ p_j \ge 0,\ 1 \le j \le n\}$. The next theorem identifies two much weaker conditions that are necessary and sufficient for the distribution of a species sampling process to have full support.
Theorem 2. Consider a prior distribution Π on P that is the distribution of the process (1).
The following facts are equivalent: (3)

P. G. Bissiri and A. Ongaro
Clearly, condition iii. improves the sufficient condition given by Ongaro and Cattaneo [27]. Moreover, it is a convenient alternative when it is difficult to deal with the distribution of $\max_{j\ge1} P_j$. Theorem 2, which is proved in the Appendix, yields the following corollary:
Corollary 3. If $\sum_{j\ge1} P_j = 1$, $\mathbb{P}$-a.s., then the corresponding (proper) species sampling prior has full support if and only if
$$\mathbb{P}\Big(\max_{j\ge1} P_j < \varepsilon\Big) > 0 \qquad (4)$$
for every $\varepsilon > 0$.
In other words, for a proper species sampling prior to have full support, it is necessary and sufficient that zero belongs to the support of the maximum atom size. It will now be shown that if this condition is not satisfied, then the support of the prior, besides not being full, is substantially smaller. Indeed, in this case, neither continuous distributions nor discrete distributions whose atoms are all too small belong to the support of the prior, as shown in the following proposition:

Proposition 4. Given a proper species sampling prior, if (4) is not satisfied for some $\varepsilon > 0$ and $\mu$ is a probability measure on $\mathbb{X}$ such that $\mu(\{x\}) < \varepsilon$ for every $x \in \mathbb{X}$, then $\mu$ does not belong to the support of the prior.
Proof. To begin with, recall that, for every $\mu$ in the space $\mathcal{P}$ of probability measures on $(\mathbb{X}, \mathcal{X})$, a base of neighborhoods of $\mu$ generating the topology of weak convergence is made of the sets of the form
$$\{\nu \in \mathcal{P} : \nu(F_i) < \mu(F_i) + \varepsilon,\ 1 \le i \le m\},$$
where $\varepsilon > 0$ and $\{F_1, \dots, F_m\}$ is an $m$-tuple of measurable closed subsets of $\mathbb{X}$ [see 2, p. 236]. Therefore, to prove the statement we need to find a set of this form with zero prior probability under the assumption that $\mu(\{x\}) < \varepsilon$ for every $x \in \mathbb{X}$.
To this aim, let $B$ be the union of a countable dense subset of $\mathbb{X}$, which exists by separability of $\mathbb{X}$, and the set of the atoms of $\mu$. So, $B$ is countable and we denote by $(x_j)_{j\ge1}$ an arrangement of the elements of $B$. Now, let $d$ be a metric on $\mathbb{X}$ which induces the Borel sigma-field $\mathcal{X}$ and setB Clearly, the sets in the sequence $(B_j)_{j\ge1}$ cover the space $\mathbb{X}$. In fact, $B$ contains a dense subset of $\mathbb{X}$, say $D$, and therefore for every $r > 0$ and every So, there is a neighborhood of $\mu$ with zero $\mathbb{P}$-probability and the proof is complete.
We now state the following proposition, which shows a connection between proper and not proper species sampling priors related to their supports.
Proposition 5. Let $P$ be a species sampling process defined by (1) and let
$$Q = \sum_{j\ge1} \frac{P_j}{\sum_{l\ge1} P_l}\,\delta_{Z_j}.$$
If $\mathbb{P}(\sum_{j\ge1} P_j > w) > 0$ for every $w < 1$ and the conditional distribution of $Q$ given $\sum_{l\ge1} P_l$ has full support almost surely, then the distribution of $P$ has full support, too.
We do not report the proof of this proposition, which is an immediate consequence of Lemma 10 in the Appendix. Proposition 5 suggests a simple way to construct a not-proper species sampling prior with full support from a proper species sampling prior with full support. In fact, if Q is a proper species sampling process, its prior has full support and its marginal is α, then one can take W Q + (1 − W )α, where W is a (0, 1)-valued random variable with 1 in its support and Q and W are independent.
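The construction W Q + (1 − W)α can be sketched in a few lines of Python. The proper process with four equal frequencies, the Beta(5, 1) law chosen for W and the helper name `mix_with_base` are hypothetical choices made only for illustration.

```python
import random

def mix_with_base(proper_weights, w):
    """Frequencies of W*Q + (1 - W)*alpha when Q has frequencies proper_weights."""
    if not 0.0 < w < 1.0:
        raise ValueError("w must lie in (0, 1)")
    return [w * q for q in proper_weights]

rng = random.Random(1)
k = 4
proper_weights = [1.0 / k] * k               # a proper process: frequencies sum to one
w = rng.betavariate(5.0, 1.0)                # assumed law for W; 1 lies in its support
improper_weights = mix_with_base(proper_weights, w)
diffuse_mass = 1.0 - sum(improper_weights)   # mass assigned to alpha, i.e. 1 - W
```

The resulting frequencies sum to W < 1, so the process is of the form (1) but not of the form (2).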

Illustrations
We now apply Theorem 2 and Corollary 3 to show that several species sampling priors have full support. We shall consider the two most general and studied priors among the species sampling models, namely the Poisson-Kingman and the stick-breaking model, but also two priors with a very simple construction, one based on geometric frequencies and the other one on a finite number of species.

Poisson-Kingman models
The class of Poisson-Kingman priors was introduced and studied by Pitman [32], and it includes as a special case another well-known class, namely the homogeneous normalized random measures with independent increments [see 34]. Moreover, Poisson-Kingman models are related to the Gibbs-type random probability measures with infinitely many species [see 13,22,7]. Normalized random measures with independent increments and Gibbs-type priors are two of the most studied and most widely applied classes of priors.
To properly define the class of Poisson-Kingman priors, let us introduce some notation. Let $\rho$ be a measure on $\mathbb{R}^+$ such that
$$\int_0^\infty \min(x, 1)\,\rho(dx) < \infty \qquad (5)$$
and
$$\rho(0, \infty) = \infty. \qquad (6)$$
Moreover, let $\alpha$ be a probability measure on $\mathbb{X}$, let $(T_j)_{j\ge1}$ be the decreasing arrangement of the points of a Poisson process $\Pi$ with mean measure $\rho$, and let $(Z_j)_{j\ge1}$ be a sequence of independent and identically distributed random variables with common distribution $\alpha$. Set $S = \sum_{j\ge1} T_j$ and assume that the probability distribution of $S$ is absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}^+$. Denote by $Q_{\rho,s}$ the regular conditional distribution of the sequence $(T_j/S)_{j\ge1}$ given $S = s$, as constructed by Pitman [32]. Let $(P_j)_{j\ge1}$ be a sequence of nonnegative random variables with distribution
$$\int_{\mathbb{R}^+} Q_{\rho,s}\,\mu(ds), \qquad (7)$$
where $\mu$ is a probability measure on $\mathbb{R}^+$. Then (7) is termed a Poisson-Kingman distribution with Lévy intensity $\rho$ and mixing distribution $\mu$, and is denoted by $\mathrm{PK}(\rho, \mu)$. Moreover, the random probability measure $\sum_{j=1}^{\infty} P_j\,\delta_{Z_j}$ is termed a Poisson-Kingman random probability measure with Lévy intensity $\rho$ and mixing distribution $\mu$.
Condition (5) is required to ensure that $S$ is finite almost surely [see 20, p. 28], and (6) to ensure that $S$ is positive almost surely. To show this last statement, one can just apply Campbell's theorem to obtain that
$$E(e^{-uS}) = \exp\Big\{-\int_0^\infty (1 - e^{-ux})\,\rho(dx)\Big\}$$
for every $u > 0$, and then apply the monotone convergence theorem, letting $u$ diverge to infinity, to obtain that $\mathbb{P}(S = 0) = e^{-\rho(0,\infty)}$.
An important class of priors related to the Poisson-Kingman priors is that of Gibbs-type priors with infinitely many species. In fact, as proved by Gnedin and Pitman [13], a Gibbs-type prior with infinitely many positive frequencies is either a mixture of Dirichlet priors or a Poisson-Kingman prior with stable Lévy density $\rho_\sigma(x) = \sigma x^{-\sigma-1}/\Gamma(1-\sigma)$, for some $0 < \sigma < 1$, and arbitrary mixing distribution $\mu$. Another relevant prior belonging to the Poisson-Kingman class is the Pitman-Yor process prior, also known as the two-parameter Poisson-Dirichlet prior, introduced by Perman, Pitman and Yor [29] and further studied by Pitman [30], Pitman and Yor [33] and Ishwaran and James [17]. In fact, a Pitman-Yor process prior with parameters $0 < \sigma < 1$ and $\theta > -\sigma$ is a $\mathrm{PK}(\rho_\sigma, \mu_{\sigma,\theta})$, and therefore a Gibbs-type prior, where $\mu_{\sigma,\theta}(dt) = \frac{\sigma\Gamma(\theta)}{\Gamma(\theta/\sigma)}\,t^{-\theta} f_\sigma(t)\,dt$ and $f_\sigma$ is a density of a $\sigma$-stable random variable. The Pitman-Yor process prior is widely used in applications. See, for instance, Teh and Jordan [38] and citations provided therein.
We are now ready to state the following proposition:
Proposition 6. If $\mu$ is absolutely continuous with respect to the Lebesgue measure, then the $\mathrm{PK}(\rho, \mu)$ prior has full support.
By Proposition 6, the whole class of NRMII's and Pitman-Yor process priors have full support. In fact, if µ coincides with the probability distribution of S, then the corresponding Poisson-Kingman random probability measure is a normalized random measure with independent increments. To prove Proposition 6, it is convenient to split the half line of positive real numbers into two intervals and to consider separately the restriction of Π to each interval. By an asymptotic argument, it is possible to deal with the infinitely many jumps around zero, letting the interval containing zero decrease to the singleton {0}. Then, one can deal with the interval which does not contain zero converting the Poisson process Π into a Bernoulli process by conditioning. The details of the proof of Proposition 6 are deferred to the Appendix.

Stick-breaking priors
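The class of stick-breaking priors is a relevant one in Bayesian nonparametrics [see 16,17,18,6]. A stick-breaking prior is such that
$$P_1 = V_1, \qquad P_j = V_j \prod_{i=1}^{j-1} (1 - V_i) \quad (j \ge 2),$$
where $(V_j)_{j\ge1}$ is a sequence of random variables taking values in $[0, 1]$. Hence,
$$\sum_{j=1}^{m} P_j = 1 - \prod_{j=1}^{m} (1 - V_j)$$
for every integer $m \ge 1$, and therefore $\sum_{j=1}^{m} P_j$ is between zero and one. Stick-breaking priors have full support under minimal assumptions, as shown in the next proposition.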
Proposition 7. If $(V_j)_{j\ge1}$ is a sequence of independent random variables and for every $\varepsilon > 0$ there is $\delta > 0$ such that $\mathbb{P}(\delta < V_j < \varepsilon)$ is positive for every integer $j \ge 1$, then the corresponding stick-breaking prior has full support.
Corollary 8. If $(V_j)_{j\ge1}$ is a sequence of independent random variables and, for some $\varepsilon > 0$, the support of each $V_j$ includes the interval $(0, \varepsilon)$, then the corresponding stick-breaking prior has full support.
Generally, for each $j \ge 1$, $V_j$ has a Beta distribution with parameters $a_j$ and $b_j$, where $(a_j)_{j\ge1}$ and $(b_j)_{j\ge1}$ are two sequences of positive numbers [see for instance 16]. Clearly, in this case the support of each $V_j$ is the whole unit interval and Corollary 8 can be applied.
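The stick-breaking construction with Beta-distributed $V_j$ can be sketched as follows. The parameters $a_j = 1$, $b_j = \theta = 3$ (the Dirichlet process case), the truncation at 50 sticks and the function name `stick_breaking_weights` are assumptions made for illustration; the code also checks numerically that the partial sum of the weights equals one minus the product of the $(1 - V_j)$.

```python
import math
import random

def stick_breaking_weights(v):
    """Return P_1 = V_1 and P_j = V_j * prod_{i<j} (1 - V_i) for a finite list v."""
    weights, remaining = [], 1.0
    for vj in v:
        weights.append(vj * remaining)   # break off a fraction vj of what is left
        remaining *= 1.0 - vj            # length of the stick still unbroken
    return weights

rng = random.Random(2)
theta = 3.0                                            # hypothetical parameter value
v = [rng.betavariate(1.0, theta) for _ in range(50)]   # V_j ~ Beta(1, theta), independent
p = stick_breaking_weights(v)

partial_sum = sum(p)
identity_rhs = 1.0 - math.prod(1.0 - vj for vj in v)   # 1 - prod_j (1 - V_j)
largest_atom = max(p)
```

Since each Beta variable has the whole unit interval as its support, small values of `largest_atom` occur with positive probability, which is the condition required by Corollary 3 in the proper case.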
The sequence $(V_j)_{j\ge1}$ is often taken i.i.d., which yields, by the way, a proper species sampling model. This is the most common case, which is considered in the following corollary:
Corollary 9. Let $(V_j)_{j\ge1}$ be a sequence of independent and identically distributed random variables. Then, $\mathbb{P}(0 < V_1 < \varepsilon)$ is positive for every $\varepsilon > 0$ if and only if the corresponding stick-breaking prior has full support.
Proof. To prove the "only if" part, consider that $\lim_{c\to0} \mathbb{P}(c < V_1 < \varepsilon) = \mathbb{P}(0 < V_1 < \varepsilon) > 0$. This implies that $\mathbb{P}(\delta < V_1 < \varepsilon)$ is positive for some $\delta > 0$, and then it is sufficient to apply Proposition 7. To prove the "if" part, we can just show that $\mathbb{P}(V_1 \in \{0\} \cup (c, \infty)) = 1$ for some $c > 0$ implies that the support of the corresponding stick-breaking prior is not full. To this aim, denote by $\varphi$ the map from $[0, 1]^\infty$ into itself such that $\varphi((v_j)_{j\ge1})$ is the sequence obtained from $(v_j)_{j\ge1}$ by removing the zeroes, and let $(W_j)_{j\ge1} = \varphi((V_j)_{j\ge1})$. Clearly, substituting $(V_j)_{j\ge1}$ with $(W_j)_{j\ge1}$, one obtains the same stick-breaking prior. Moreover, $\mathbb{P}(W_1 > c) = 1$. Since $W_1 = P_j$ for some $j \ge 1$ a.s., we have that $P_j > c$ for some $j \ge 1$ a.s. By Corollary 3, this implies that the corresponding stick-breaking prior does not have full support.

Geometric frequencies
Let us now introduce an example of a proper species sampling prior with a very simple structure that still has full support. Let
$$P_j = W(1 - W)^{j-1} \qquad (10)$$
for $j \ge 1$, where $W$ is a random variable such that $0 < W < 1$ with probability one and zero belongs to its support. This prior has been proposed and used by Fuentes-García, Mena and Walker [9], letting $W$ have a Beta distribution. To apply Corollary 3, it is sufficient to note that $(P_j)_{j\ge1}$ is a decreasing sequence almost surely, so that $\max_{j\ge1} P_j = P_1 = W$, and that $\mathbb{P}(0 < W < \varepsilon) > 0$ for every $\varepsilon > 0$. For this prior, all the weights $P_j$, $j \ge 1$, are obtained from a real-valued random variable $W$ and the support of $(P_1, \dots, P_n)$ is not full, for $n \ge 1$, i.e. the condition for full support given by Ongaro and Cattaneo [27] is not satisfied. As shown by Fuentes-García, Mena and Walker [9], this model is easier to implement than standard models and performs well, at least in the density estimation context. In the following subsection, other examples are given of priors with a simple structure and full support, whose inferential behaviour is worth exploring.
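The geometric weights are easy to compute directly. The sketch below, with a hypothetical fixed value of W standing in for a random draw and a truncation at 2000 terms, checks that the weights are decreasing, that the largest atom equals W, and that the weights sum to one up to the truncation error.

```python
def geometric_weights(w, n):
    """First n geometric weights P_j = w * (1 - w)**(j - 1), j = 1, ..., n."""
    return [w * (1.0 - w) ** (j - 1) for j in range(1, n + 1)]

w = 0.05                        # illustrative value of W; any value in (0, 1) works
p = geometric_weights(w, 2000)  # truncation level chosen only for the illustration

largest_atom = max(p)           # attained at j = 1, since the sequence is decreasing
total = sum(p)                  # equals 1 - (1 - w)**2000, numerically close to one
is_decreasing = all(p[i] >= p[i + 1] for i in range(len(p) - 1))
```

Because the maximum atom size is W itself, the condition of Corollary 3 reduces to a statement about the law of W near zero.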

Finite number of species
Let us consider a finite number of species, in other words a proper species sampling process (2) such that the number of $P_j$'s which are positive is finite. Define the random variable $K = \max\{j : P_j > 0\}$, with the convention that $K = \infty$ if $P_j$ is infinitely often positive. When $K$ is finite, (2) can be equivalently and more conveniently expressed as
$$P = \sum_{j=1}^{K} P_{j,K}\,\delta_{Z_j}, \qquad (11)$$
where $(P_{1,k}, \dots, P_{k,k})$ is a random vector whose distribution coincides with the conditional distribution of $(P_1, \dots, P_K)$ given $K = k$, for every positive integer $k$. So, $(P_{1,k}, \dots, P_{k-1,k})$ takes values in the $(k-1)$-dimensional simplex, for every $k \ge 1$. This formulation is the one used, for instance, by Ongaro [26]. It is reasonable to assume that $K$ is finite almost surely, i.e. that the number of species is finite. This model is used within Bayesian nonparametric mixture models by Richardson and Green [35]. See also Stephens [37] and Nobile and Fearnside [25]. Moreover, a model of this kind has been studied by Gnedin [12]. Priors which arise from processes of the form (11) are considered by Bissiri, Ongaro and Walker [3]. This family of priors includes the Gibbs-type priors with finitely many species, which are obtained if $(P_{1,k}, \dots, P_{k-1,k})$ has a symmetric finite-dimensional Dirichlet distribution. This is proved by Gnedin and Pitman [13]. It is not hard to verify that this prior has full support if and only if, for every $\varepsilon > 0$, there is a positive integer $k$ such that $\mathbb{P}(K = k) > 0$ and $\mathbb{P}(P_{j,k} < \varepsilon,\ 1 \le j \le k \mid K = k) > 0$. In fact,
$$\mathbb{P}\Big(\max_{j\ge1} P_j < \varepsilon\Big) = \sum_{k\ge1} \mathbb{P}(K = k)\,\mathbb{P}(P_{j,k} < \varepsilon,\ 1 \le j \le k \mid K = k)$$
is positive if and only if at least one term of the sum is positive, and one can just apply Corollary 3. Therefore, since the support of the Dirichlet distribution is the whole simplex, Gibbs-type priors have full support provided that $K$ is unbounded. A remarkably simple example where the prior has full support is obtained by taking $P_{j,K} = 1/K$, where $K$ is a random variable such that $\mathbb{P}(K > k) > 0$ for every positive integer $k$.
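For the equal-weights example, the maximum atom size is 1/K, so the probability in Corollary 3 is simply the probability that K exceeds 1/ε. The following sketch contrasts a bounded K with a geometric K; both probability tables, the truncation of the geometric table at 1000 and the function name `prob_max_atom_below` are hypothetical choices for illustration.

```python
def prob_max_atom_below(eps, pmf_k):
    """P(max_j P_j < eps) when P_j = 1/K for j = 1..K and K has pmf pmf_k (a dict)."""
    return sum(prob for k, prob in pmf_k.items() if 1.0 / k < eps)

# Bounded number of species: K never exceeds 3, so 1/K is at least 1/3.
bounded_k = {1: 0.2, 2: 0.3, 3: 0.5}

# Geometric number of species, P(K = k) = (1 - r) * r**(k - 1); the table is
# truncated at 1000 only to keep the illustration finite.
r = 0.5
geometric_k = {k: (1.0 - r) * r ** (k - 1) for k in range(1, 1001)}

p_bounded = prob_max_atom_below(0.25, bounded_k)      # no k <= 3 gives 1/k < 0.25
p_geometric = prob_max_atom_below(0.25, geometric_k)  # sums P(K = k) over k >= 5
p_geometric_small = prob_max_atom_below(0.01, geometric_k)
```

With the bounded table the probability vanishes as soon as ε is at most 1/3, whereas with the geometric table it stays positive for the values of ε tried here, in line with the requirement that K be unbounded.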

Discussion and conclusions
A large support is clearly an essential requirement for any Bayesian nonparametric model to be appropriate. In particular, it is an important requirement for species sampling priors that are used as mixing distributions in Bayesian mixture models for density estimation, in order to ensure consistency for such models. For instance, if the support (in the weak topology) of the mixing distribution is full and the true density to be estimated satisfies some mild conditions, then consistency is satisfied for a wide range of mixing location-scale kernels (see Wu and Ghosal [39]).
This paper provides necessary and sufficient conditions for a species sampling prior to have full support (Theorem 2). In particular, for proper species sampling priors, it turns out to be a very simple condition (Corollary 3). These results can be applied to show that the most relevant classes of species sampling priors have full support, but they are also useful to construct species sampling priors with a much simpler structure still having full support. This is the case of the prior based on geometric weights (10), which has full support as shown in Section 3.3 and was used by Fuentes-García, Mena and Walker [9] within a Bayesian mixed model for density estimation. Namely, the prior for the unknown density is specified as the distribution of the following random density function:
$$f(x) = \int h(x, z)\,P(dz) = \sum_{j\ge1} P_j\,h(x, Z_j),$$
where $P$ is a proper species sampling process with geometric weights (10), the distribution of $W$ is Beta, and $h(\cdot, z)$ is a density function for each $z$.
The simple structure of this prior for $P$ sets it apart from the priors mostly studied for Bayesian nonparametric mixture modeling, which are generalizations of the Dirichlet process model and in most cases yield complex models that are hard to implement and to apply in real situations. See Lijoi, Mena and Prünster [21] for an example of how complex these generalizations can be. Fuentes-García, Mena and Walker [9] propose a relatively easy Gibbs sampler algorithm for the model based on the prior with geometric weights, which results in a simpler alternative to those typically used for Bayesian nonparametric mixtures, but with similar inferential performance. To this aim, they consider the following alternative representation for $f$:
$$f(x) = \sum_{m\ge1} q_W(m)\,\frac{1}{m} \sum_{j=1}^{m} h(x, Z_j),$$
where $q_W(m) = mW^2(1 - W)^{m-1}$ for $m \ge 1$. This representation suggests writing the model in hierarchical form:
• $W$ is Beta distributed;
• the $Z_j$'s are i.i.d. with a given density function and are independent of $W$;
• $K_1, \dots, K_n$ are conditionally i.i.d. given $W$ with common probability mass function $q_W$ and are independent of the $Z_j$'s;
• the observations $X_1, \dots, X_n$ are conditionally i.i.d. given $K_1, \dots, K_n$, $W$ and the $Z_j$'s, and the conditional density of $X_i$ is $\frac{1}{K_i} \sum_{j=1}^{K_i} h(\cdot, Z_j)$ for $i = 1, \dots, n$.
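The hierarchical form can be simulated directly, and the link between $q_W$ and the geometric weights can be checked numerically, since summing $q_W(m)/m$ over $m \ge j$ should give $W(1 - W)^{j-1}$. In the sketch below the fixed value of W, the Gaussian kernel for h, the Normal(0, 3²) density for the $Z_j$'s, the truncation level and all function names are assumptions made for illustration; this is not the Gibbs sampler of Fuentes-García, Mena and Walker [9], only a forward simulation of the hierarchy.

```python
import random

def q_w(m, w):
    """Probability mass q_W(m) = m * W^2 * (1 - W)^(m - 1), m = 1, 2, ..."""
    return m * w ** 2 * (1.0 - w) ** (m - 1)

def weight_from_q(j, w, m_max=5000):
    """Truncated sum over m >= j of q_W(m) / m; should match W * (1 - W)^(j - 1)."""
    return sum(q_w(m, w) / m for m in range(j, m_max + 1))

def sample_k(w, rng):
    """Draw K from q_W: one plus the failures observed before the second success."""
    k = 1
    for _ in range(2):                  # two independent geometric waiting times
        while rng.random() >= w:        # each failure has probability 1 - w
            k += 1
    return k

def sample_observation(w, atoms, rng, bandwidth=0.5):
    """Draw X given W and the atoms: K ~ q_W, then j uniform on 1..K, then X ~ h(., Z_j)."""
    k = sample_k(w, rng)
    while len(atoms) < k:                        # generate Z_1, Z_2, ... lazily
        atoms.append(rng.gauss(0.0, 3.0))        # assumed density of the Z_j's
    j = rng.randint(1, k)
    return rng.gauss(atoms[j - 1], bandwidth)    # assumed Gaussian kernel h

rng = random.Random(3)
w = 0.3          # fixed value standing in for a Beta draw
atoms = []       # shared across observations, as in the hierarchy
sample = [sample_observation(w, atoms, rng) for _ in range(5)]
```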
Fuentes-García, Mena and Walker [9] derive the full conditionals of the random variables defined in each level of the hierarchy to construct a Gibbs sampler for the model. In Section 3.4, another prior is considered with an even simpler structure, namely with finitely many species and equal weights $P_{j,K} = 1/K$, for $j = 1, \dots, K$. Indeed, this is probably the simplest Bayesian nonparametric model. When it is used within mixed models, the random variables $K_1, \dots, K_n$ in the above representation are replaced by a single one, that is $K$. In this case, the random density takes the form:
$$f(x) = \frac{1}{K} \sum_{j=1}^{K} h(x, Z_j).$$
This model resembles the form of the basic classical kernel density estimator.
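A realization of the equal-weights random density can be sketched as follows. The Gaussian kernel for h, the bandwidth, the uniform law for the atoms and the geometric-type law for K are all assumptions made for illustration. The last lines check by a Riemann sum that the realization integrates to one, as any average of densities must.

```python
import math
import random

def gaussian_kernel(x, z, bandwidth):
    """Assumed kernel h(x, z): Normal density in x with mean z and sd bandwidth."""
    u = (x - z) / bandwidth
    return math.exp(-0.5 * u * u) / (bandwidth * math.sqrt(2.0 * math.pi))

def equal_weight_density(x, atoms, bandwidth):
    """f(x) = (1/K) * sum_{j=1}^{K} h(x, Z_j), with K = len(atoms)."""
    return sum(gaussian_kernel(x, z, bandwidth) for z in atoms) / len(atoms)

rng = random.Random(4)
k = 1 + int(rng.expovariate(0.2))                    # assumed unbounded law for K
atoms = [rng.uniform(-3.0, 3.0) for _ in range(k)]   # assumed law for the Z_j's
bandwidth = 0.5

# Riemann sum of one realization of f over a grid covering its effective support.
step = 0.01
grid = [-10.0 + step * i for i in range(2001)]
total_mass = step * sum(equal_weight_density(x, atoms, bandwidth) for x in grid)
```

Unlike a kernel density estimator, the centres here are random atoms drawn from the prior rather than the observed data points.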
In spite of its simplicity, for an appropriate choice of the class of densities $h$, it approximates a large variety of densities, ensuring posterior consistency within mixed models. To our knowledge, this model has not been studied for Bayesian mixture modeling and it deserves further investigation.
In general, we believe that for many practical purposes, such as density estimation, a sophisticated model for the weights does not always provide a real gain in terms of inferential performance, while it entails a heavy computational cost.

Acknowledgment
The authors are grateful to Stephen G. Walker, who inspired this paper. This work was partially supported by ESF and Regione Lombardia (by the grant "Dote Ricercatori").

Appendix A: Proof of Theorem 2
In order to prove Theorem 2, the three following lemmas are useful.
To this aim, we shall show by induction that for some k-tuple (n 1 , . . . , n k ) of positive integers and every 1 ≤ j ≤ k − 1.
Lemma 12. Let $P$ be a proper species sampling process with frequencies $(P_j)_{j\ge1}$ such that, for every $\varepsilon > 0$, for some $C$ in the sigma-algebra generated by $(P_j)_{j\ge1}$. Then for every measurable partition $\{A_1, \dots, A_k\}$ of $\mathbb{X}$ with $\alpha(A_j) > 0$ ($1 \le j \le k$), every $k$-tuple $(p_1, \dots, p_k)$ of positive real numbers that sum up to one and every $\varepsilon > 0$, for every $1 \le j \le k$ and some positive integer $n$, then, summing up each term w.r.t. $j = 1, \dots, k$, the first inequality in (22) yields $\sum_{l>n} P_l \le k\varepsilon/(k+1)$, and therefore, since $P(A_j) = \sum_{l\ge1:\,Y_l\in A_j} P_l$ (for $1 \le j \le k$),
$$p_j - \frac{\varepsilon}{k+1} \le \sum_{1\le l\le n:\,Y_l\in A_j} P_l \le P(A_j) \le \sum_{1\le l\le n:\,Y_l\in A_j} P_l + \sum_{l>n} P_l \le p_j + \varepsilon. \qquad (23)$$
So, (22) implies (23) and therefore, denoting $\mathbb{P}_C(A) = \mathbb{P}(A \cap C)$, for every $A \in \mathcal{F}$, where $m_1 = 0$ and $m_j = \sum_{l=1}^{j-1} n_l$ for $j = 2, \dots, k$. To complete the proof, it is sufficient to show that the last term in (24) is positive. To this aim, note that there is a set in the union in (24) with positive $\mathbb{P}_C$-probability. In fact, by Lemma 11, and therefore, for some $k$-tuple $(n_1, \dots, n_k)$,
Proof of Theorem 2. The condition ii. is equivalent to the following condition:
ii'. For every $\varepsilon > 0$ and every $w < 1$, $\mathbb{P}(\max_{j\ge1} P_j < \varepsilon,\ \sum_{l\ge1} P_l > w) > 0$.
Let us prove that i. implies ii'. Denote by $P$ the species sampling process (1) and let $\{A_1, \dots, A_k\}$ be a partition of $\mathbb{X}$ such that $0 < \alpha(A_j) < 1$, $A_j$ is $\alpha$-continuous and define for every $A \in \mathcal{X}$ and every integer $k \ge 1$. Hence, for every $k \ge 1$, $\alpha_k$ is a probability measure such that $S_{\alpha_k} = S_\alpha$ and belongs to the support of $P$ by ii'. At this stage fix $\varepsilon > 0$, an integer $k > 1/\varepsilon$, and where $\alpha^* = \max_{j=1,\dots,k} \alpha(A_j)$. Hence, one can write: which by (25) yields:
Let us prove that ii'. implies i. To this aim, let $S_\mu \subset S_\alpha$. By Lemma 10, we can complete the proof showing that is positive, for every $(\varepsilon_1, \dots, \varepsilon_k, w) \in (0, 1)^{k+1}$, for every $k$-tuple $(A_1, \dots, A_k)$ of $\mu$-continuity subsets of $\mathbb{X}$ and every integer $k \ge 1$. Let $\{B_1, \dots, B_{2^k}\}$ be the partition generated by $A_1, \dots, A_k$. Since $S_\mu \subset S_\alpha$ and $A_j$ is $\mu$-continuous ($1 \le j \le k$), if $\alpha(B_l) = 0$, then $\mu(B_l) = 0$ ($1 \le l \le 2^k$). To show this, note that $B_1, \dots, B_{2^k}$ are $\mu$-continuous, since $\mu$-continuity is closed under intersection (as $\partial(A \cap B) \subset \partial A \cup \partial B$ for every two sets $A$ and $B$), and therefore $\mu(\partial B_l) = 0$, for $1 \le l \le 2^k$. Moreover, for every $1 \le l \le 2^k$, the interior of $B_l$ (say $B_l^\circ$) is an open set with zero $\alpha$-measure, which implies that $(B_l^\circ)^c \supset S_\alpha \supset S_\mu$, and therefore $\mu(B_l^\circ) = 0$. So $\alpha(B_l) = 0$ implies that $\mu(B_l) = 0$ ($1 \le l \le 2^k$), and therefore (26) is greater than or equal to which is positive by assumption i., in virtue of Lemma 12 with $C = \{\sum_{l\ge1} P_l > w\}$.
Let us prove that ii'. implies iii. By definition, the series $\sum_{j\ge1} P_j$ is less than or equal to one and therefore it converges, almost surely. Hence, Applying ii'. with $w = 1 - \varepsilon/2$ and then (27), one obtains that: This implies that for some $m \ge 1$, Since $\sum_{l=1}^{m} P_l = \sum_{l\ge1} P_l - \sum_{l>m} P_l$ and $\sum_{l\ge1} P_l \le 1$, almost surely, (28) implies that $\mathbb{P}(P_j \le \varepsilon,\ j \ge 1,\ \sum_{l=1}^{m} P_l > 1 - \varepsilon) > 0$, which in turn implies iii.
To ensure (30), it will be sufficient to prove that $\mathbb{P}(T_1 < \varepsilon a,\ a < S < b) > 0$ for every pair $(a, b)$ where $0 < a < b$.
Recall that by (29), $\rho(0, x)$ is infinite for every $x > 0$. So, in particular, it is not zero. Hence, for every $x > 0$, there is a point of the support of $\rho$ inside the interval $(0, x)$. On the basis of this fact, we can fix a point $x$ in the support of $\rho$ such that $0 < x < \min\{a\varepsilon, qb - a\}$. Let $k > 0$ be an integer such that $a < kx < qb$. Moreover, fix $c > 0$ smaller than $x$ and set $0 < \delta < \min\{b/(2k) - x,\ x - a/k,\ x - c,\ \varepsilon a - x\}$. In this way, $(k(x - \delta), k(x + \delta)) \subset (a, qb)$ and $(x - \delta, x + \delta) \subset (c, \varepsilon a)$.