Random distributions via Sequential Quantile Array

We propose a method to generate random distributions with known quantile distribution or, more generally, with known distribution for some form of generalized quantile. The method takes inspiration from the random Sequential Barycenter Array (SBA) distributions proposed by Hill and Monticino (1998), which generate a Random Probability Measure (RPM) with known expected value. We define the Sequential Quantile Array (SQA) and show how to generate a random SQA from which we can derive RPMs. The distribution of the generated SQA-RPM can have full support, and the RPMs can be discrete, continuous, or even differentiable. We also address the efficient implementation of the procedure, ensuring that the approximation of the SQA-RPM by a finite number of steps stays close to the SQA-RPM obtained theoretically by the procedure. Finally, we compare SQA-RPMs with similar approaches, such as the Polya tree.


Introduction
Random probability measures (RPMs) find applications in several different fields, such as statistics, mathematical finance, and stochastic processes.
In nonparametric Bayesian statistics, for example, the construction of random probability distributions makes it possible to draw a prior at random from the space of probability measures. In this context, the Dirichlet process (DP) model represents a rather common choice, despite being only able to generate discrete distributions with probability one (see [12] and [13] for reviews on the topic). Several generalizations of the Dirichlet process have been proposed, among which we can mention the wider class of stick-breaking priors ([19]). Methods that generate continuous prior distributions have also been studied: for example, the Polya tree models (which include the DP as a particular case) and Bernstein processes (see [26] for a more detailed review of RPMs used in Bayesian data analysis). A method to produce RPMs with given mean is proposed by [17], while [4] studied a method to construct RPMs with given mean and variance.
RPMs also occur in some stochastic processes describing mass in space ([25] and references therein) and in random dynamical systems ([9]). In particular, random dynamical systems are defined through a random measurable map ϕ : T × Ω × X → T × Ω × X that, for all (t, ω) ∈ T × Ω, maps Borel measures on X into a random Borel measure on X. RPMs can be used to generate scenarios on which to take decisions under uncertainty or ambiguity. For example, [25] suggests that RPMs may be used to solve a random optimal stopping problem.
Inspired by the Sequential Barycenter Array (SBA) distributions of [17], in this paper we propose an alternative to the SBA procedure, which we call the Sequential Quantile Array (SQA), that allows one to construct random distribution functions whose τ-quantile follows a given distribution (for some τ ∈ (0, 1)). In finance, for example, this method can be used to generate random probability measures whose associated Value at Risk (VaR), a widely used risk measure, follows a specified distribution. Moreover, by exploiting their link with quantiles, it is possible to generate RPMs with a specified distribution for alternative risk measures based on the general notion of M-quantiles defined by [5]. The expectiles belong to the family of M-quantiles and can be seen as a generalization of the expectation, since the expectile of level τ = 1/2 coincides with the mean. In this sense, our method can be seen as a way to broaden the principle behind the work of [17]. We note that [6] simulate RPMs with a specified value of a certain risk measure (they focus in particular on the VaR and the expected shortfall (ES)). However, except for a common interest in controlling quantiles of RPMs, our work differs from [6] in all respects. In fact, we construct a new procedure for the generation of RPMs that allows the chosen quantile to have a given distribution λ_1, which need not be concentrated on a constant as in [6]. Further, we study the properties of the random distributions generated through our procedure, by establishing conditions for continuity, full support and differentiability of the RPMs. Conversely, their approach to the problem is purely algorithmic and is based on an ad-hoc transformation of a realized RPM obtained from the implementation of the procedures of [11], [15] or [17].
The paper is organized as follows: in Section 2, we present the notion of Sequential Quantile Array and show that it shares most of the properties of the SBA. Section 2.2 focuses in particular on the case where the base measure generating the SQA sequence is discrete. In Section 3 we describe how to construct SQA-RPMs and study their properties, namely conditions for continuity, differentiability and full support. Moreover, we show that the SQA-RPM behaves differently from the SBA-RPM, because it can produce families of distributions that include both discrete and continuous distributions with positive probability. In Section 3 we also present a stopping rule and a truncation algorithm for computational issues related to the efficient implementation of the procedure. In this regard, we prove that the approximated probability measure obtained by truncation lies in a γ-neighborhood, relative to strong topologies, of the true SQA-RPM with probability arbitrarily close to one. Section 4 discusses how the SQA-RPM can be used to obtain a distribution with prescribed generalized forms of quantiles, and we describe how to generalize the procedure to two quantiles and to generate random distributions with unbounded support. Section 5 presents some simulations of RPMs, and we also compare the SQA-RPM to other methods proposed in the literature, such as Polya trees and quantile pyramids. The latter, proposed by [18], is based on the similar idea of generating dyadic quantiles. Finally, Section 6 concludes with a few possible applications.

Sequential Quantile Array (SQA)
Let τ ∈ (0, 1) and let G be an arbitrary distribution function. We denote by q_τ(G) the quantile of G at level τ, namely q_τ(G) = G^{−1}(τ), where G^{−1} is replaced by the generalized inverse function, that is q_τ(G) = inf{y : G(y) ≥ τ}, if G is not invertible everywhere in its support.
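As a small illustrative sketch (the grid and sample values below are our own, not from the paper), the generalized inverse can be computed on a sorted grid by finding the first index at which the CDF reaches τ:

```python
import bisect

def generalized_inverse(xs, Gs, tau):
    """q_tau(G) = inf{y : G(y) >= tau}, for a step CDF given on a grid:
    xs are the sorted jump points, Gs the nondecreasing values G(xs[i])."""
    i = bisect.bisect_left(Gs, tau)  # first index with Gs[i] >= tau
    return xs[i]

# Step CDF of a fair die: G(y) = y/6 at y = 1, ..., 6
xs = [1, 2, 3, 4, 5, 6]
Gs = [k / 6 for k in range(1, 7)]
assert generalized_inverse(xs, Gs, 0.5) == 3    # G(3) = 0.5 already reaches tau
assert generalized_inverse(xs, Gs, 0.51) == 4
```

The example also shows why the infimum is needed: at τ = 0.5 the CDF of the die is flat after y = 3, and the generalized inverse picks the smallest such point.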

Definition 1.
We call quantile of G in (a, c] of level τ the quantity B_τ^G((a, c]) = inf{y : G(y) ≥ τ(G(c) − G(a)) + G(a)} = q_τ(G(· | (a, c])). In what follows, we shall write G(· | (a, c]) = G(· | X ∈ (a, c]) to shorten the notation.
The above lemma is analogous to Lemma 2.2 in Hill and Monticino (1998) [17] and is useful to study the properties of the sequential quantile arrays defined below.

Definition 2. The Sequential Quantile Array (SQA) of level τ of the distribution function G is the triangular array Q(G, τ) := {q_{n,k}(G, τ)} = {q_{n,k}}, n ≥ 1, k ≤ 2^n, defined by induction: q_{1,1} = q_τ(G) and, for n > 1, q_{n,2k} = q_{n−1,k} and q_{n,2k−1} = B_τ^G((q_{n−1,k−1}, q_{n−1,k}]), with q_{n,0} = −∞ and q_{n,2^n} = ∞ for all n.
Let us define the intervals I n,k = (q n,k−1 , q n,k ].

SQA of discrete distributions
In this section, we consider the case of a discrete distribution G, with jumps {α_j} at points {x_j}, j ∈ J. We prove that, although the SQA is able to reconstruct the support of G, it is not in general able to recover the masses α_j, and thus there is no one-to-one correspondence between G and its SQA of level τ. The definition of B_τ^G((a, c]) in the case of a discrete G is the same as (2.1), where the identity always holds if G(a) = G(c), but not in general.
Proof. Points (iii) and (vi) are straightforward, while (vii) is a direct consequence of (i).
and consequently also G(x) = G(a), so the equality holds. Let then b > a. From the definition of b and from the monotonicity of G, necessarily a < b < x, and thus B_τ((a, x]) would not be the smallest point in (a, x] satisfying G(B_τ((a, x])) ≥ τ(G(x) − G(a)) + G(a), contradicting the definition of B_τ((a, x]).
(v) We prove that for every (a, c] with (a, c] ∩ X ≠ ∅, there exist n, k such that q_{n,k} ∈ [a, c] ∩ X, if and only if τ ≤ min_j α_j/(α_j + α_{j+1}). Using (i) then immediately yields (v). If either a or c belongs to X the claim is proved, so let us assume that neither of them is in X. Since we assume that (a, c] To prove the 'only if' part, it is enough to show that, if τ > α_{j*}/(α_{j*} + α_{j*+1}) for some j* ∈ J, then there is an x_j ∉ Q. Let in fact q_{n,k} = x_{j*+1} and q_{n,}

Part (iii) of Lemma 2.3 prevents the SQA sequence from completely determining an arbitrary discrete distribution G. The following theorem is a generalization of Theorem 2.2 and includes the case of discrete distributions G.

Theorem 2.3. A sequence Q = {q_{n,k}} is an SQA for some distribution function G if and only if: (i) q_{n,2k} = q_{n−1,k}; (ii) q_{n,k−1} ≤ q_{n,k}, for all n ≥ 1, k = 1, . . . , 2^n; (iii) q_{n,2k−1} = q_{n,2k} if and only if q_{n−1,k−1} = q_{n−1,k}.
The simple counterexample in Example 1 is sufficient to show that the definition of G is not necessarily unique for a given discrete set Q. In fact, if Q is generated by an arbitrary discrete distribution G_0, we might have G ≠ G_0. The construction of G given by (2.4) and (2.5) is, however, the unique G with support in Q that guarantees the following recursive property: for every n, k such that q_{n,2k−1} ≠ q_{n−1,k}, G(q_{n,2k−1}) = τ G(q_{n−1,k}) + (1 − τ)G(q_{n−1,k−1}).
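The recursive property above determines all the G-values on the dyadic quantile grid. The following is a minimal sketch of that recursion (the explicit formulas (2.4)-(2.5) are not reproduced in this excerpt; the code uses only the displayed property, together with G(q_{1,1}) = τ and the endpoint values 0 and 1):

```python
def cdf_on_array(n_levels, tau):
    """G-values on the dyadic SQA grid, built level by level via
    G(q_{n,2k-1}) = tau*G(q_{n-1,k}) + (1-tau)*G(q_{n-1,k-1})."""
    F = [0.0, 1.0]              # level 0: only the two endpoints
    for _ in range(n_levels):
        nxt = [F[0]]
        for k in range(1, len(F)):
            nxt.append(tau * F[k] + (1 - tau) * F[k - 1])  # new odd-index value
            nxt.append(F[k])                               # even index: copied
        F = nxt
    return F

F = cdf_on_array(2, 0.3)
# level 2 values: [0, tau^2, tau, 2*tau - tau^2, 1] = [0, 0.09, 0.3, 0.51, 1]
assert abs(F[1] - 0.09) < 1e-12 and abs(F[3] - 0.51) < 1e-12
```

Note the cumulative values are nondecreasing by construction, since each new value is a convex combination of its two neighbors.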
Thus, X_n has a limiting distribution X.

SQA random probability measures
In this section, we define random SQA distributions and derive some of their properties. First we introduce a random SQA, to be used to generate an RPM. To construct a random SQA distribution, we select two distributions, λ_1 and λ. Under this definition, we show in Section 3.1 some properties, including full support under the weak topology. Then, in Section 3.2, we generalize the random SQA by letting λ change with n (denoted λ_n, n ≥ 1) and, under specific hypotheses, we show that in this case the SQA-RPM has full support under the Kullback-Leibler topology and is differentiable.
The procedure follows closely the ideas in [17] and consists of the following steps: (i) choose distributions λ 1 and λ; (ii) generate the first quantile of level τ according to λ 1 and proceed with the other terms of the SQA sequence by extracting from λ; (iii) use Theorem 2.1 to obtain the distribution G.
Without loss of generality, we consider the generation of random distributions with support in [0, 1]. The extension to distributions with unbounded support will be considered in Section 4.1. Let λ 1 and λ be two probability measures with support [0, 1] and [0, 1) respectively.
Let us introduce some notation. We endow the space of triangular arrays with the product topology and let A be the subset of arrays satisfying conditions (i)-(iii) in Theorem 2.3. Let P_{λ_1,λ} be the probability distribution of Q on A and let T be the mapping described in Theorem 2.3, which transforms Q into a probability distribution on [0, 1]. Then T is Borel-measurable with respect to the weak topology.

Theorem 3.1. The distribution of the τ-quantile of the random probability measure generated from the SQA sequence of level τ is λ_1.

Proof. The proof follows from the fact that, by construction, conditional on a given Q = q, G = T(q) (where G is the limit of the sequence of probabilities G_n described in Theorem 2.3) satisfies G(q_{1,1}) = τ, and thus G^{−1}(τ) = q_{1,1}.
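Steps (i)-(iii) of the procedure can be sketched as follows. Equation (3.1) is not reproduced in this excerpt, so the splitting rule below, q_{n,2k−1} = q_{n−1,k−1} + X·(q_{n−1,k} − q_{n−1,k−1}) with X ~ λ on [0, 1), is an assumption made for illustration (it is consistent with the later requirement that λ_n have mean τ_n); the CDF values follow the recursion of Theorem 2.3:

```python
import random

def sqa_rpm(n_levels, tau, sample_lambda1, sample_lambda, seed=0):
    """Sketch of the SQA-RPM generator (splitting rule assumed, see lead-in).
    Returns the level-n quantile grid q and the CDF values F, where F obeys
    G(q_{n,2k-1}) = tau*G(q_{n-1,k}) + (1-tau)*G(q_{n-1,k-1})."""
    rng = random.Random(seed)
    q = [0.0, sample_lambda1(rng), 1.0]   # q_{1,0}, q_{1,1} ~ lambda_1, q_{1,2}
    F = [0.0, tau, 1.0]                    # G(q_{1,1}) = tau by construction
    for _ in range(1, n_levels):
        q2, F2 = [q[0]], [F[0]]
        for k in range(1, len(q)):
            x = sample_lambda(rng)         # X_{n,2k-1} ~ lambda, support [0, 1)
            q2.append(q[k - 1] + x * (q[k] - q[k - 1]))
            F2.append(tau * F[k] + (1 - tau) * F[k - 1])
            q2.append(q[k])
            F2.append(F[k])
        q, F = q2, F2
    return q, F

# lambda_1 and lambda both uniform: illustrative choices, not from the paper
q, F = sqa_rpm(6, 0.05, lambda r: r.random(), lambda r: r.random())
assert q == sorted(q) and F == sorted(F)
assert abs(F[len(F) // 2] - 0.05) < 1e-12   # the tau-quantile of G is q_{1,1}
```

The final assertion is Theorem 3.1 in miniature: the middle grid point is q_{1,1}, and its CDF value stays fixed at τ across all refinements, so the τ-quantile of the generated measure is exactly the λ_1-draw.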

Properties of SQA random distributions: λ fixed
In this section we study some properties of the SQA random probability measures described in the previous subsection.

Theorem 3.2. B_{λ_1,λ}-almost all SQA-RPM distributions are continuous if and only if λ({0}) = 0.
Now suppose that λ({0}) = η > 0. Then each X_{n,k} = 0 (n > 1) with probability η. Let ω ∈ Ω be such that X_{2,2k−1}(ω) = 0 for some k = 1, 2 (without loss of generality, let k = 1), while X_{1,1} > 0. Then, from (3.1), we have q_{2,1}(ω) = q_{1,1}(ω) = q_{2,2}(ω) and, consequently, also q_{3,2} = q_{3,3} = q_{3,4} = q_{1,1}. In general, q_{n,2^{n−2}} = · · · = q_{n,2^{n−1}} = q_{1,1}, and thus the distribution G has at least one point, q_{1,1}, with probability mass greater than or equal to τ(1 − τ). Following the same argument for arbitrary n, k, with n < ∞, it follows that, if X_{n,2k−1}(ω) = 0, then the sequence Q generated by (3.1) has a tie at q_{n,2k−1} (this is an immediate consequence of the fact that, when there are no ties, the analogous identity holds; see Section 3.3 for arbitrary values of n and k) and the distribution G = lim G_n has (at least) a jump at that point.

To prove the converse implication, we show that two independent random variables with the same distribution G coincide with probability zero. This immediately implies the continuity of G, because that probability is zero only if G is continuous (see [24]). We prove this identity by considering the integral ∫ G_n dG_n, where {G_n} is the sequence of probability distribution functions of the sequence of random variables {X_n} defined in the proof of Theorem 2.3 (we omit ω for convenience). Moreover, since λ({0}) = λ_1({0, 1}) = 0, for all n, k we have X_{n,2k−1} ∈ (0, 1), where X_{n,2k−1} is as in (3.1), and thus, for all n ≥ 1,

A. Fabretti and S. Leorato
Proof. Let us define the mean sum of jumps generated by the SQA-RPMs as J(m), where Δ(G) denotes the sum of the jumps of a distribution function G. It is easy to see that, if m = 0 or m = 1, then J(m) = 1. We prove that, unless λ({0}) = 1, J(m) < 1 for all m ∈ (0, 1). For every n, k, let us define the random variable Z_{n,k} = min{X_{n,4k−3}, X_{n,4k−1}}. Let p = λ({0}) and let R = p + p(1 − p) = Pr{Z_{n,k} = 0}. Then, for all n ≥ 2, we define the events E_{n,j} = {exactly j new ties occur at stage n of the extraction of the quantiles} and the random variables L_{n,j} = {total length of the j jumps at step n}.
In particular, given these definitions, we can decompose the mean jumps J(m), for all m, as follows. Note that L_{n,j} is a random variable that depends only on which of the 2^{n−1} possible variables Z_{n,k} are equal to 0. Since all the {X_{n,2k−1}}, k ≤ 2^{n−1}, are i.i.d., the Z_{n,k} are also i.i.d. and therefore exchangeable. This implies that, in computing J(m), we can replace L_{n,j} by its average value.
We know (Lemma 3.1) that the mean value of L_{n,j}, n ≥ 2, equals j/2^{n−2}. Then we can write a bound for the term Pr(E_{h,j}) and, noting that 1 − R = (1 − p)^2, obtain, for any m ∈ (0, 1), an inequality implying that J(m) ≤ Σ_{n≥0} R(1 − R)^n = 1, where the identity occurs only if R = 1, that is, when p = 1.
Note that Theorem 3.3 differs from the analogous result in [17] (Theorem 3.6) because, in our case, by choosing a distribution λ such that λ({0}) > 0 we can obtain both continuous and discrete random probability measures with positive B_{λ_1,λ}-probability.

Proof. Let n = 2. Then L_{2,1} = 1, because in this case there can be only one tie, at q_{2,1} = q_{2,3} = m, and the total length of the jumps is τ + (1 − τ) = 1. If n = 3 and there is only one tie, it can be at q_{3,1} = q_{3,2} = q_{3,3} (if X_{3,1} = 0 or X_{3,3} = 0) or at q_{3,5} = q_{3,6} = q_{3,7} (if X_{3,5} = 0 or X_{3,7} = 0): the sum of the jumps before and after the tie is τ^2 + τ(1 − τ) = τ in the first case and τ(1 − τ) + (1 − τ)^2 = 1 − τ in the second. Because all the X_{n,j} are identically distributed, these two values have the same weight, and the average value of L_{3,1} is therefore EL_{3,1} = 1/2. If at stage 3 there are two new ties, then X_{3,1} = X_{3,3} = X_{3,5} = X_{3,7} = 0; in this case the total jump is L_{3,2} = 1. In general, it is straightforward that L_{n,2^{n−1}} = 1 for all n and, using a binary tree representation of L_{n,1} (see Figure 1), it is easy to see that there are 2^{n−2} possible elementary values for L_{n,1}. The lengths of L_{n,2} are obtained by summing the jumps corresponding to the two ties that have occurred. Therefore, since the weighted sum of the possible values of L_{n,1} is one, EL_{n,2} is obtained by considering the sum of all possible values L_{n,1} + L′_{n,1}, where, given L_{n,1}, L′_{n,1} is constrained to be one of the 2^{n−2} − 1 values left (some of which might coincide numerically with L_{n,1}). Since the sum of all possible values of L′_{n,1} satisfying the constraint is 1 − L_{n,1}, we get EL_{n,2} = 2/2^{n−2}. Now we assume that EL_{n,j} = j/2^{n−2} and prove by induction that EL_{n,j+1} = (j + 1)/2^{n−2}.
By generalizing the argument used for L_{n,2}, we can write (j + 1) Σ L_{n,j+1} = Σ_{L_{n,j}} Σ_{L′_{n,1} ∉ L_{n,j}} (L_{n,j} + L′_{n,1}), where the second sum is over all possible values of L′_{n,1} that are not included in L_{n,j}. Thus, after some computation, we obtain EL_{n,j+1} = (j + 1)/2^{n−2}.
., m}. We define the following subset of E. Let G_σ and q^σ_{n,k} denote the distribution function of σ and the SQA of G_σ, respectively, and let G_{σ_0} and q^{σ_0}_{n,k} be the analogous objects for σ_0. Since {q^{σ_0}_{n,k}} is dense in the support of σ_0, let q^{σ_0}_{n_{ij},k_{ij}} and q^{σ_0}_{n_{ij},l_{ij}} be such that (q^{σ_0}_{n_{ij},k_{ij}}, q^{σ_0}_{n_{ij},l_{ij}}] ⊂ O_{ij}. Note that, from the definition of q^{σ_0}_{n,k} and q^σ_{n,k}, we have G_{σ_0}(q^{σ_0}_{n,k}) = G_σ(q^σ_{n,k}) for all n, k.
Let N = max_{ij} n_{ij} and consider the set D_δ. The set (q^{σ_0}_{n_{ij},k_{ij}} − δ, q^{σ_0}_{n_{ij},l_{ij}} + δ] is larger than (q^{σ_0}_{n_{ij},k_{ij}}, q^{σ_0}_{n_{ij},l_{ij}}]. We can find a δ small enough that the left-hand side of (3.4) is bounded. For this δ, we have that, if σ ∈ D_δ, then a condition equivalent to σ ∈ C also holds. Given that λ_1 and λ have full support on [0, 1), B_{λ_1,λ}(D_δ) > 0 and, since we showed that D_δ ⊂ C, it follows that B_{λ_1,λ}(C) > 0, which ends the proof.

Properties of SQA-RPM when λ n changes with n
In this section we consider the more general definition of SQA-RPM obtained when each X n,2k−1 in equation (3.1) is generated independently from a distribution λ n that is allowed to change with n. We point out that all the results of the previous subsection are still valid, with little adjustment, if the SQA-RPM is generated by the sequence λ n , n ≥ 1. For example, if λ 1 ({0, 1}) = 0 and all measures λ n assign mass 0 to {0}, then according to Theorem 3.2, almost all SQA-RPMs are continuous.
We further show that when the measures λ n are allowed to change with n the SQA-RPM can have stronger properties, such as differentiability and full support in stronger topologies.
We assume that λ_1({0, 1}) = 0 and that the SQA sequence is defined through a sequence of generating measures {λ_n, n ≥ 1}, such that X_{n,2k−1} ~ λ_n for all k and n, with λ_n({0}) = 0. The probability measure on the space of SQA-RPMs is now denoted by B_{λ_n,n≥1} and, repeating the arguments of Theorem 3.2, it can be seen that its support contains only continuous distribution functions and that, because of Lemma 2.2(ii), the set {q_{n,k}} is dense in (0, 1). Under this condition, the main equations in (3.1) simplify accordingly. Let us denote by 1{A} the indicator function: 1{A} = 1 if A is true and 1{A} = 0 otherwise. For all u ∈ [0, 1] and n > 1, we have the following representation of G_n,
where z_{n,k} is the number of zeros in the n-cypher binary representation of k − 1.
Hence we have G(u) = lim n→∞ G n (u).
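Since the interval masses G(I_{n,k}) = τ^{z_{n,k}}(1 − τ)^{n−z_{n,k}} (made explicit in Section 3.3) depend only on the binary expansion of k − 1, the representation can be checked numerically; a small sketch:

```python
def zeros_in_binary(n, k):
    """z_{n,k}: number of zeros in the n-cypher binary representation of k - 1."""
    return format(k - 1, f'0{n}b').count('0')

def interval_mass(n, k, tau):
    """Mass of I_{n,k} under G: tau^{z_{n,k}} * (1 - tau)^{n - z_{n,k}}."""
    z = zeros_in_binary(n, k)
    return tau ** z * (1 - tau) ** (n - z)

# The 2^n interval masses at any level sum to one, since each binary digit
# contributes an independent factor tau or (1 - tau): (tau + (1 - tau))^n = 1
tau, n = 0.3, 5
total = sum(interval_mass(n, k, tau) for k in range(1, 2 ** n + 1))
assert abs(total - 1.0) < 1e-12
```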
Every G_n depends in a complex way on the distributions λ_n, through each q_{n,k}. Indeed, it is possible to see that each q_{n,k} can be written as a sum of products of the X_{m,2j−1}. Thus, each summand in q_{n,k} depends on products of independent variables X_{m,2j−1}, but the summands are not independent, because they can have terms in common.

We have that the RPM G is differentiable if the limit lim_n g_n(u) exists for all u. Now, let us define g̃_n(u) through the recursion in which Z_n = X_{n,2j_{n,k}−1}/τ_n and W_n = (1 − X_{n,2j_{n,k}−1})/(1 − τ_n). Clearly, Z_n and W_n are independent of g̃_{n−1} and both have mean 1. Further, if all τ_n = τ, then g̃_n = g_n for all n. We define the process s̃_n(u) = 1/g̃_n(u) and show that it is a martingale. We can apply the martingale convergence theorem if E s̃_1(u) < ∞; this is an immediate consequence of s̃_1(u) = Z_1 1{u ∈ (0, X_{1,1}]} + W_1 1{u ∈ (X_{1,1}, 1]}, which implies E(s̃_1(u)) ≤ 2 for all u. Thus, the limit s̃(u) = lim_n s̃_n(u) exists for all u.
In order to be able to define g̃(u) = lim_n 1/s̃_n(u), we need s̃(u) > 0 for all u ∈ (0, 1). Writing Z̃_n = Z_n − 1 and W̃_n = W_n − 1, it is then enough to show that Σ_m log(1 + |Z̃_m|) < ∞ and Σ_m log(1 + |W̃_m|) < ∞, and we can limit ourselves to the term in Z̃, because the argument for W̃ is the same. We invoke the two-series theorem (see [33], IV.2.2): since Σ_m log(1 + |Z̃_m|) ≤ Σ_m |Z̃_m|, the series converges if Σ_n E|Z̃_n| < ∞ and Σ_n Var(Z̃_n) < ∞. Both conditions are satisfied if Σ_{n≥1} σ_n^2/τ_n^2 ≤ t^{−2} Σ_{n≥1} σ_n^2 < ∞. Finally, to prove that lim_n log(g_n(u)) exists, we need to show the convergence of the series in lim_n log(g_n(u)) = lim_n log(g̃_n(u)) + Σ_{m≤n} log(τ/τ_m) (and of its analogue in (1 − τ)/(1 − τ_m)). Provided that each τ_n is bounded below by a positive constant t > 0, the series in the above equation converges if Σ_n |1 − τ/τ_n| < ∞. Since τ_n > t > 0 for all n, it suffices that |τ_n − τ| = O(1/n^{1+δ}) for some δ > 0. Finally, the summability of the sequence of variances also permits us to apply the Corollary in [21], implying that G(x) = ∫_0^x g(u) du. From the above theorem, the only differentiable distribution generated by an SQA sequence when λ_n = λ is constant in n is the uniform distribution, obtained for λ = δ_τ.
For arbitrary measures {λ_n, n ≥ 1}, the characterization of G and of its density cannot easily be used to derive a closed-form expression for G and, through it, for the expected random measure Ḡ. However, as [18] pointed out, with the advent of simulation-based inference, the need for clear-cut conjugacy and analytically tractable posteriors is no longer critical. Theorem 3.4 proves the large support of the SQA-RPM in the weak topology; in applications, however, it is often desirable that the random probability measures have full support in a stronger sense.
As in [22] and [18], we consider the extension of Theorem 3.4 to the Kullback-Leibler topology, i.e. the topology induced by relative entropy neighborhoods. We recall the definition of relative entropy (also called KL-divergence): given two probability distributions G_0, G with densities g_0, g, KL(G_0, G) = ∫ g_0(x) log(g_0(x)/g(x)) dx, and KL(G_0, G) = +∞ if G_0 is not absolutely continuous w.r.t. (the probability measure associated to) G.
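A numerical sketch of this definition (the densities, grid and tolerance below are illustrative choices of ours, not from the paper):

```python
import math

def kl_divergence(g0, g, grid, h):
    """KL(G0, G) = integral of g0 log(g0/g), via a midpoint Riemann sum
    on a uniform grid of step h."""
    return sum(g0(x) * math.log(g0(x) / g(x)) * h for x in grid if g0(x) > 0)

# KL between a Beta(2,2) density and the uniform density on (0, 1);
# the exact value is log(6) - 5/3, since here KL reduces to the
# (negative differential) entropy of g0
h = 1e-4
grid = [(i + 0.5) * h for i in range(10000)]   # midpoints of [0, 1]
g0 = lambda x: 6 * x * (1 - x)                 # Beta(2, 2)
g = lambda x: 1.0                              # Uniform(0, 1)
kl = kl_divergence(g0, g, grid, h)
assert abs(kl - (math.log(6) - 5 / 3)) < 1e-3
```

The exact value follows from ∫ x(1 − x) log x dx = −5/36 on (0, 1) and the symmetry of the Beta(2, 2) density.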
In this topology, neighborhoods of an arbitrary probability distribution function G_0 have the form {G : KL(G_0, G) < ε}.

Theorem 3.6. Let all the assumptions of Theorem 3.5 be satisfied and assume that each λ_n, n ≥ 2, is such that Σ_n σ_n < ∞. Let the probability distribution G_0 have a density g_0 and finite entropy, namely ∫ g_0(x) log(g_0(x)) dx < ∞.
Then, for all ε > 0, B_{λ_n,n≥1}(G : KL(G_0, G) < ε) > 0.

Proof. We follow the same reasoning as in [22]. We first write KL(G_0, G) = ∫ g_0 log g_0 − ∫ g_0 log g. The first integral is the entropy of g_0 and is finite in absolute value by assumption. The second term can be written in terms of the indices k_n(x) of the sequence I_{n,k_n(x)} = (q_{n,k_n(x)−1}, q_{n,k_n(x)}], which satisfy x ∈ I_{n,k_n(x)} for all n, where ν is the Lebesgue measure. In the resulting equality we use the fact that ν(I_{n,2k} | I_{n−1,k}) = (q_{n,2k} − q_{n,2k−1})/(q_{n−1,k} − q_{n−1,k−1}). We then take expectations: if the expectation converges, the second term of (3.6) is bounded. We apply the conditional Jensen inequality to the concave function log(x) and the inequality log(1 + x) ≤ x. The identity E(X) = Pr(X ∈ A)E(X | A) + Pr(X ∈ Ā)E(X | Ā) implies E(X) ≥ Pr(X ∈ A)E(X | A) for any X ≥ 0; this yields the bound under the assumptions of the theorem.
Since it converges in expected value, the term ∫ log g(x) g_0(x) dx is bounded with positive probability. Finally, similarly to [22], because of the assumption that all λ_n have full support in (0, 1), ∫ g_0(x) log g(x) dx can be made to differ by at most a constant δ from ∫ g_0(x) log g_0(x) dx with positive probability.

Stopping rule
In building a random probability measure, the problem of repeating (3.1) infinitely many times arises. This is a common feature of most procedures based on dyadic expansions or similar ideas (including Polya trees, quantile pyramids and SBA-RPMs, among others). However, the SQA presents the additional drawback that, for small τ, the random distribution generated after a relatively small number of steps is extremely dense in the left tail and still sparse in the right; the lower τ, the higher the number of steps necessary to obtain a random distribution that is dense everywhere. This clearly causes the algorithm to become very slow for small values of τ, because a large number n of levels is necessary to cover the whole support of the distribution reasonably densely and, for each n, q_{n,2k−1} must be computed (from X_{n,2k−1}) for k ≤ 2^{n−1}. However, similarly to [22], it is not necessary to compute every level n, but only up to a certain level h, and not every q_{n,k} for all n = 1, ..., h.
The SQA procedure can be implemented efficiently by fixing a threshold ε > 0 and simulating only those values of {q_{n,2k−1}} such that G((q_{n,k−1}, q_{n,k}]) > ε. This reduces drastically the number of iterations. As an example, Fig. 2 plots, for n = 1, ..., 12, the intervals for which q_{n,k} must be computed according to the truncation procedure with ε = 10^{−3} and τ = 0.01. In this case the procedure stops after 134 levels, but in the figure we report only the first 12. In practice, the truncation procedure is easy to implement. Note that, given n, the probability of each interval (q_{n,k−1}, q_{n,k}] is G((q_{n,k−1}, q_{n,k}]) = τ^{z_{n,k}}(1 − τ)^{n−z_{n,k}}, where we recall that z_{n,k} is the number of zeros in the binary representation of k − 1 (adding zeros to the left if necessary in order to have exactly n cyphers). So, for example, the interval (q_{6,k−1}, q_{6,k}] for k = 13 has probability τ^4(1 − τ)^2, because the 6-cypher binary representation of 12 is 001100. Let us assume, without loss of generality, that τ < 1/2 and fix a threshold ε. Then, setting n_ε = min_n{τ^n < ε}, we can limit the simulation to the X_{n,k} corresponding to the indices K_{n,n_ε} = {k ≤ 2^n : z_{n,k} < n_ε} ⊇ {k ≤ 2^n : G((q_{n,k−1}, q_{n,k}]) > τ^{n_ε}}.
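The index-selection rule can be sketched directly from the binary representation (the cap of 200 iterations when searching for n_ε is an arbitrary safeguard of ours):

```python
def zeros_in_binary(n, k):
    """z_{n,k}: zeros in the n-cypher binary representation of k - 1."""
    return format(k - 1, f'0{n}b').count('0')

def active_indices(n, tau, eps):
    """K_{n,n_eps} = {k <= 2^n : z_{n,k} < n_eps}: the only indices whose
    quantiles still need to be simulated under the truncation rule."""
    n_eps = next(m for m in range(1, 200) if tau ** m < eps)
    return [k for k in range(1, 2 ** n + 1) if zeros_in_binary(n, k) < n_eps]

# Worked example from the text: at level n = 6, k = 13, the 6-cypher binary
# representation of 12 is 001100, so z_{6,13} = 4 and G(I_{6,13}) = tau^4 (1-tau)^2
assert format(12, '06b') == '001100'
assert zeros_in_binary(6, 13) == 4

# With tau = 0.01 and eps = 1e-3 we get n_eps = 2, and only 7 of the 64
# level-6 indices survive (those with at most one zero in the binary string)
assert len(active_indices(6, 0.01, 1e-3)) == 7
```

The count 7 = C(6, 0) + C(6, 1) makes the drastic reduction concrete: the number of simulated quantiles per level grows polynomially in n rather than as 2^n.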
Let moreover N_ε = min_n{(1 − τ)^n < ε}. Because of the truncation mechanism, whenever G_n(I_{n,k}) < ε, no further quantiles inside I_{n,k} are generated. Then the numbers n_ε and N_ε are, respectively, the number of iterations before the first truncation and the last iteration before the procedure stops. Once we have all the q_{n,k_j}, n ≤ N_ε, k_j ∈ K_{n,n_ε}, the truncated measure is defined by G_ε(q_{n,k_j}) = G_n(q_{n,k_j}) and is extended to (0, 1) through linear interpolation.
Summing up, this procedure is both a stopping rule, because it provides the largest level N_ε to compute, and a truncation procedure, because for each level n it provides the intervals that still need to be partitioned.
When defining a stopping rule, it is important to ascertain that the approximated random measure obtained by truncation, which we denote by G_ε to stress its dependence on the threshold ε, can be made arbitrarily close to the random probability measure G generated by the procedure. Because of the definition of the stopping rule, it is easy to see that, under very mild assumptions, G_ε can be arbitrarily close to G with respect to the Kolmogorov-Smirnov distance. If the generating measures are allowed to vary with the level n, then it is also possible to prove that ε can be chosen so that G lies in an arbitrarily small KL-neighborhood of G_ε.
Proof. (i) The Kolmogorov-Smirnov distance is given by (3.7), with n > N_ε. For the last term of (3.7), we use the fact that q_{n,k(x)} is the largest quantile (of level n) smaller than x; then, because G_n(q_{n,k}) = G(q_{n,k}), this term is controlled, and the first term of (3.7) is bounded analogously.

(ii) To prove the second part, note that, for n < n_ε, G_ε(I_{n,k}) = G_n(I_{n,k}) for all k; then G_ε(I_{n,2k−1} | I_{n−1,k}) = τ and G_ε(I_{n,2k} | I_{n−1,k}) = 1 − τ. For n > N_ε + 1, the partition {I_{n,k}}, k ≤ 2^n, is finer than the partition {I_{N_ε,k_j}}_j. For n_ε + 1 < n ≤ N_ε, some of the intervals I_{n,k} coincide with I_{N_ε,k_j} for some j ≤ M, while for the others we can find j such that I_{n,k} ⊆ I_{N_ε,k_j}. In the first case we have, by construction, G_ε(I_{n,2k−1} | I_{n−1,k}) = τ, while in the second case, since G_ε(x) is defined by linear interpolation, we have G_ε(I_{n,2k−1} | I_{n−1,k}) = X_{n,2k−1}. This yields the desired upper bound.
We now prove that the series converges in mean (and therefore in probability). Note that in the following we exploit the fact that, although I_{n,k} is random, G(I_{n,k}) is a nonrandom function of τ, n and k (using the concavity of log(x) and Jensen's inequality). Now we use the sharp Kantorovich inequality (see eq. (4) in [8]), with M = 1 − l_n and m = l_n; the resulting inequality holds for all l_n < 0.5. The same inequality holds for E[(1 − τ)/(1 − X_{n,2k−1})]. We can then write an upper bound for the sum above, which completes the proof.

Generalized quantiles

The quantile of level τ can be defined through the function ρ_1(u) = τ − 1{u ≤ 0}; this corresponds to solving an optimization problem. If one replaces the function ρ_1 by ρ_2(u) = (τ − 1{u ≤ 0})|u|, a different functional, the expectile (Newey and Powell (1987) [27]), is defined. [5] proposed a natural generalization of quantiles and expectiles, replacing the constant and modulus functions by the modulus of a more general odd function ψ in (4.2). The function ψ is related to the choice of the function in an M-estimation procedure. In particular, if τ = .5, the corresponding quantity equals the M-estimator of the location parameter. For this reason, [5] proposed the name M-quantiles for the solutions to (4.2). [20] proved that, if ψ is odd, monotone nondecreasing, continuous and piecewise differentiable, the M-quantile corresponds to the quantile of level τ of a distribution function G. Thus, in particular, one obtains the standard quantiles by taking ψ equal to the odd function ψ(u) = sign(u), in which case it is easy to see that G(y) = F(y).
Expectiles have been attracting growing interest in the recent financial literature (as well as in all fields where managing extreme events is crucial, such as actuarial science), in particular since the introduction of the concept of elicitability ([14]) and the proof by [35] that expectiles give the only risk measure that is both coherent and elicitable. (A risk measure is elicitable if there exists a scoring function such that the risk under a given distribution is obtained by minimizing the expected value of the score under that distribution; all M-quantiles are elicitable. A risk measure is coherent if it is simultaneously (i) translation invariant, (ii) monotonic, (iii) positively homogeneous and (iv) subadditive; see [1] for more details. While the ES is coherent but not elicitable, the VaR is elicitable but not coherent.) See [2] and references therein for their use as a risk measure in comparison with VaR and ES. However, expectiles are known to be less robust to extreme events than the VaR. [10] consider
The SQA procedure can be used to generate RPMs while controlling the distribution of a given M-quantile. We illustrate the procedure by focusing on the case of expectiles, namely ψ(x) = |x|. For τ = 1/2, this includes the case of random distributions whose mean is distributed according to λ_1.
The idea behind this procedure is straightforward. Exploiting the fact that M-quantiles of a distribution F are ordinary quantiles of the distribution G in (4.3), we first generate a random distribution G from an SQA sequence of level τ, as described in the previous section. Then we invert the transformation (4.3) to obtain the distribution F := F_G.
In particular, the inverse transformation of (4.4) is easily found once we notice that the median of the G distribution and the mean of F_G coincide. The link between quantiles of G and expectiles of F_G guarantees that, if the τ-level quantile of G has distribution λ_1, B_{λ_n,n≥1}-a.s., then the τ-expectile of F_G follows the same distribution. We underline that equation (4.5) does not depend on the level τ. If, in particular, we choose τ = 1/2, this procedure yields a distribution F_G whose mean has distribution λ_1. Besides the already mentioned SBA of [17], the problem of generating RPMs whose mean follows a given distribution has been studied with a focus on Dirichlet means, motivated by applications in nonparametric statistics and in combinatorics (see [23], [30] and [31]). Our procedure thus also provides an alternative approach to this problem.
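As a hedged illustration of the expectile-mean link invoked above, the τ-expectile of a sample can be computed from its first-order condition by an iteratively reweighted mean; this is a generic computation (with illustrative sample values), not the paper's inversion (4.5):

```python
def expectile(xs, tau, tol=1e-12):
    """tau-expectile of a sample: the solution e of the first-order condition
    sum_i w_i (x_i - e) = 0, with w_i = tau if x_i > e else (1 - tau),
    found by iterating the weighted mean."""
    e = sum(xs) / len(xs)                      # start from the sample mean
    for _ in range(10000):
        w = [tau if x > e else 1 - tau for x in xs]
        e_new = sum(wi * xi for wi, xi in zip(w, xs)) / sum(w)
        if abs(e_new - e) < tol:
            break
        e = e_new
    return e

xs = [0.1, 0.2, 0.4, 0.7, 1.5, 3.0]
assert abs(expectile(xs, 0.5) - sum(xs) / len(xs)) < 1e-9  # tau = 1/2: the mean
assert expectile(xs, 0.1) < expectile(xs, 0.9)             # monotone in tau
```

At τ = 1/2 all weights are equal, so the iteration returns the mean immediately, which is the special case used above to generate RPMs with a prescribed mean distribution.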

Simulations
In the following we report some examples of random probabilities obtained via our SQA construction. The truncation procedure has been used to reduce the computational time, with ε = 10^{-3}.
Figure 3 reports five examples of random distributions obtained with q_{1,1} = 0.1 and τ = 0.05; all five are derived from the same λ distribution, a Beta(3, 30), plotted in the top left corner. In contrast, Figure 4 reports five examples of random distributions obtained with the same q_{1,1} = 0.1 and τ = 0.05, but with five different λs. In this case the λ distributions, plotted in the top left corner, are Beta(3, 3a²) with a = 1, ..., 5. The more symmetric the distribution, the higher the probability assigned to one: indeed a symmetric λ tends to select the midpoint of each interval, leaving probability τ on the left and 1 − τ on the right. Differently from [6], the SQA procedure allows us to build random distributions with a quantile following a given distribution λ_1; Figure 5 reports five examples. In this case λ is a Beta(3, 15), the same for all five examples, while the given quantile is drawn from a λ_1 equal to a Uniform(0, 0.2).
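The recursion underlying these simulations can be sketched as follows. This is a minimal illustration under our reading of the text, not the paper's implementation (the function names and the piecewise-constant CDF output are ours): each interval is split at a point drawn from λ rescaled to the interval, and the left part receives fraction τ of the interval's mass.

```python
import numpy as np

def sqa_rpm(n_levels, tau, lam_sampler, q11, rng):
    """Hedged sketch of an SQA-type construction on [0, 1]:
    starting from the point q11 (the tau-quantile), each interval
    is split at a point drawn from lambda rescaled to that interval;
    the left part receives fraction tau of the interval's mass."""
    # Each entry: (left end, right end, probability mass of the interval).
    intervals = [(0.0, q11, tau), (q11, 1.0, 1 - tau)]
    for _ in range(n_levels):
        new = []
        for a, b, p in intervals:
            u = lam_sampler(rng)           # draw in (0, 1) from lambda
            c = a + u * (b - a)            # rescale to the interval
            new.append((a, c, p * tau))
            new.append((c, b, p * (1 - tau)))
        intervals = new
    # Piecewise-constant CDF evaluated at the interval endpoints.
    xs = np.array([b for _, b, _ in intervals])
    cdf = np.cumsum([p for _, _, p in intervals])
    return xs, cdf

rng = np.random.default_rng(1)
xs, cdf = sqa_rpm(8, 0.05, lambda r: r.beta(3, 30), 0.1, rng)
# By construction, the CDF at q11 = 0.1 equals tau = 0.05.
i = int(np.searchsorted(xs, 0.1))
print(cdf[-1], cdf[i])
```

With the Beta(3, 30) choice of the examples above, λ concentrates near zero, so splits fall close to the left end of each interval.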
Differently from [17], SQA-RPMs can be continuous even if λ assigns some probability to 0. Figure 6 shows on the left hand side five SQA-RPMs and on the right hand side five SBA-RPMs. Both SQA and SBA are generated with a λ that assigns probability 0.1 to 0 and is Uniform(0, 1) elsewhere; the quantile and the mean are both fixed to 0.4 and, for an easier comparison, τ is 0.5. This figure offers a graphical intuition about the difference in continuity of the two constructions, as formalized in our Theorem 3.3 and in Theorem 3.6 of [17]. Indeed, by Theorem 3.6 of [17], for any distribution μ (the equivalent of our λ) assigning positive mass to zero they obtain discrete distributions, while our construction can also yield continuous distributions.
Figure 7 (lhs) shows five simulations of SQA with λ_n changing with n. The distributions λ_n are chosen to satisfy the conditions of Theorems 3.5, 3.6 and 3.7, namely λ_n is a Beta(a_n, b_n) on [l_n, 1 − l_n] with l_n = 0.8 τ_n/√n, a_n = τ_n n², b_n = ((1 − l_n − τ_n)/(τ_n − l_n)) a_n and τ_n = τ + 1/n². The quantile is q_{1,1} = 0.2 and τ = 0.2. Finally, Figure 8 reports five simulations of RPMs on R, indeed they are the trans-

Comparison with similar approaches
Polya trees and mixtures of Polya trees are widely used to construct random probability distributions. [24] showed that Polya trees can give probability 1 to the set of continuous distribution functions; they constitute a conjugate family with an easy update to obtain the posterior distribution, and they have full support. However, a well known drawback is the dependence of the model on the partition specified to construct the tree: the partition can affect the posterior distribution. [28] addressed this issue by "jittering" the partition. Similarly, [18] proposed quantile pyramids to build a random partition with fixed mass, instead of having random mass on a fixed partition. Our approach builds a random partition and then assigns the distribution back; in this sense it is more akin to quantile pyramids. In terms of Polya trees, an SQA is like a realization from a Polya tree distribution with partition given by our SQA realization and degenerate Beta weights. In terms of quantile pyramids, SQA can instead be seen as a generalization that uses τ-quantiles instead of dyadic quantiles. With both methods we share the property of producing continuous distributions with probability one whenever the partition has no degenerate intervals, as well as the property of large support on the space of probability measures.
Polya trees and mixtures of Polya trees, which are less sensitive to the partition choice, can also be used to generate RPMs with a given median or quantiles; this finds application in Bayesian quantile regression (see [34], [16]). Relative to those methods, the SQA procedure permits an even easier simulation of prior distributions, with any distributional constraint on one or more quantiles. Like quantile pyramids, however, the SQA distribution is less analytically tractable than the Polya tree and requires a computational effort to derive the posterior distribution. In Figure 7 (rhs) five simulations of Polya tree RPMs are represented jointly with five simulations of SQA with λ_n changing with n. In both cases, the distribution generating the partition in SQA and the weights of the partition in the PT are chosen to satisfy conditions that ensure continuity and differentiability. Polya tree RPMs are simulated according to [34] with Beta distributions with parameters a_n = b_n = n². A comparison of the graphs produced by the two methods (also considering different choices of the parameters) shows that SQA-RPMs can be a valid substitute for Polya tree RPMs.
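The Polya tree simulation with a_n = b_n = n² can be sketched as follows; this is a minimal illustration (the dyadic partition and the function name `polya_tree_cdf` are assumptions of ours, not the exact setup of [34]):

```python
import numpy as np

def polya_tree_cdf(depth, rng):
    """Sample a random CDF from a Polya tree on the dyadic partition
    of [0, 1], with level-n splitting weights Beta(n^2, n^2); growing
    parameters a_n = b_n = n^2 yield continuous realizations."""
    masses = np.array([1.0])
    for n in range(1, depth + 1):
        # Each interval splits in two; the left child receives a Beta share.
        v = rng.beta(n ** 2, n ** 2, size=masses.size)
        masses = np.column_stack([masses * v, masses * (1 - v)]).ravel()
    xs = np.linspace(0, 1, masses.size + 1)[1:]
    return xs, np.cumsum(masses)

rng = np.random.default_rng(2)
xs, cdf = polya_tree_cdf(10, rng)
print(cdf[-1])  # total mass 1
```

The key contrast with the SQA construction is visible in the code: here the partition is fixed (dyadic) and the masses are random, whereas SQA randomizes the partition and fixes the mass split at τ.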

Concluding remarks
In this paper we studied a procedure for the generation of RPMs based on a sequence of conditional quantiles. This procedure offers an alternative to the popular approaches based on Polya trees, and can be seen as a generalization of the more recent quantile pyramids ([18]). The SQA procedure is able to produce families of both continuous and discrete random distributions, also with full support under the appropriate conditions. Like the Polya tree and other procedures for the generation of random probability measures, the SQA procedure can be used in Bayesian nonparametric statistics, and it represents a suitable choice especially when it is necessary to control for one or more quantiles (see for instance the approach in [29]). A thorough analysis of the application of SQA-RPMs to Bayesian nonparametric statistics will be the object of future research.
Further, this procedure can be used in other types of applications, such as those addressed by [17]. We mention here some examples for which a procedure fixing one (or more) quantile distribution is convenient. As a particular case, we can consider the problem of finding the KL projection of a probability distribution onto a set of probability distributions defined by nonlinear constraints. Given two distributions F_0, F with Radon-Nikodym derivative dF/dF_0, the KL-divergence is KL(F, F_0) = ∫ log(dF/dF_0) dF. For any subset Ω ⊂ P([0, 1]), a KL projection of F_0 onto Ω is a distribution F* ∈ Ω satisfying KL(F*, F_0) ≤ KL(F, F_0) for every F ∈ Ω. Suppose that Ω is the set of distribution functions with a constraint on a quantile (i.e. F^{-1}(τ) = c). This is a nonlinear constraint, therefore the minimum KL-divergence between F_0 and Ω can be approximated by taking the minimum over a finite set of simulated SQA distributions {F_i, i = 1, ..., N}: KL(F*, F_0) ≈ min_i KL(F_i, F_0).
Example 6. The SQA procedure can be used to generate scenarios in which the distributions must satisfy some condition on a risk measure, if this is defined by quantiles (VaR, for example) or expectiles.
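The approximation of the KL projection can be sketched numerically. The candidate generator below is a deliberately crude stand-in for simulated SQA draws (random histograms forced to satisfy the quantile constraint F^{-1}(τ) = c); all names and the choice of F_0 are illustrative:

```python
import numpy as np

def kl_divergence(f, f0):
    """Discretized KL(F, F0) for two probability vectors on a common grid."""
    mask = f > 0
    return float(np.sum(f[mask] * np.log(f[mask] / f0[mask])))

# Reference distribution F0: a discretized Beta(2, 2) on (0, 1).
grid = np.linspace(0.001, 0.999, 999)
f0 = grid * (1 - grid)
f0 /= f0.sum()

# Hypothetical candidates F_i standing in for simulated SQA draws that
# satisfy the quantile constraint F^{-1}(tau) = c: random histograms
# renormalized so that exactly mass tau lies left of c.
rng = np.random.default_rng(3)
tau, c = 0.25, 0.3
left = grid <= c
candidates = []
for _ in range(50):
    f = rng.random(grid.size)
    f[left] *= tau / f[left].sum()
    f[~left] *= (1 - tau) / f[~left].sum()
    candidates.append(f)

# Approximate KL projection: minimum divergence over the candidate set.
kls = [kl_divergence(f, f0) for f in candidates]
best = int(np.argmin(kls))
print(min(kls))
```

Replacing the crude candidates with genuine SQA simulations constrained at the τ-quantile would give the approximation described in the text.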
SQA could also be applied to generate RPMs for which First-degree Stochastic Dominance (FSD) holds. Indeed, according to the quantile formulation of FSD, F dominates G if q_τ(F) ≥ q_τ(G) for all τ ∈ [0, 1]; hence, fixing one of the two distributions, a specific construction using SQA and involving a sequence of quantiles can randomly generate the other distribution so that FSD holds.
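The quantile formulation of FSD lends itself to a direct numerical check on samples. This is a sketch (the name `fsd_dominates` is ours, and the comparison is made on a finite grid of τ levels rather than all of [0, 1]):

```python
import numpy as np

def fsd_dominates(sample_f, sample_g, taus=None):
    """Check first-degree stochastic dominance of F over G via the
    quantile formulation: q_tau(F) >= q_tau(G) for all tau on a grid."""
    if taus is None:
        taus = np.linspace(0.01, 0.99, 99)
    qf = np.quantile(sample_f, taus)
    qg = np.quantile(sample_g, taus)
    return bool(np.all(qf >= qg))

rng = np.random.default_rng(4)
g = rng.normal(0.0, 1.0, 100_000)
f = g + 0.5  # F is G shifted to the right, so F dominates G
print(fsd_dominates(f, g), fsd_dominates(g, f))  # True False
```

In the construction suggested in the text, one would fix the sample (or quantile function) of G and let SQA generate F so that every generated conditional quantile stays above the corresponding quantile of G.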