Exchangeable Bernoulli random variables and Bayes’ postulate

Abstract: We discuss the implications of Bayes' postulate in the setting of exchangeable Bernoulli random variables. Bayes' postulate, here, stipulates a uniform distribution on the total number of successes in any number of trials. For an infinite sequence of exchangeable Bernoulli variables the conditions of Bayes' postulate are equivalent to a uniform (prior) distribution on the underlying mixing variable which necessarily exists by De Finetti's representation theorem. We show that in the presence of exchangeability, the conditions of Bayes' postulate are implied by a considerably weaker assumption which only specifies the probability of n successes in n trials, for every n. The equivalence of Bayes' postulate and the weak assumption holds for both finite and infinite sequences. We also explore characterizations of the joint distribution of finitely many exchangeable Bernoulli variables in terms of probability statements similar to the weak assumption. Finally, we consider extensions of Bayes' postulate in the framework of exchangeable multinomial trials.


Introduction
Bayes' postulate, ascribed to the Reverend Thomas Bayes, was published posthumously by Richard Price, exactly 250 years ago, in the Philosophical Transactions of the Royal Society of London (1763), under the name 'An Essay towards solving a Problem in the Doctrine of Chances'. Among other things, it deals with the problem of eliciting posterior probabilities for the chance of a success in a single Bernoulli trial in the wake of observed evidence based on a finite number of identical trials, and prescribes equi-distribution of ignorance when assigning probabilities to events about which no prior stance can be taken. An excellent discussion of this material is available in the work of Stigler (1986), Chapter 3, but see also Molina (1931).
We formulate Bayes' postulate following the discussion on Bayes' scholium in (Stigler, 1986, Ch. 3). Consider a game between two rivals to be played n times where we 'absolutely know nothing' of the probability that player 1 will win any single match. We define random variables X 1 , X 2 , . . . , X n where X i equals 1 if the i'th match is won by player 1, and 0 otherwise. Let S n = X 1 + X 2 + · · · + X n denote the total number of matches won by player 1. It is not assumed that the outcomes of these matches are independent. Ruling out temporal trends, the random variables may however be assumed to be identically distributed. A natural assumption is to take the joint distribution of the X i 's to be exchangeable: i.e. if π is any permutation of the integers 1 through n, the joint distribution of (X π(1) , X π(2) , . . . , X π(n) ) is the same as the joint distribution of (X 1 , X 2 , . . . , X n ).
As explained by Stigler, what Bayes postulates in this scenario of absolutely knowing nothing is that the distribution of S_n should be taken to follow a discrete uniform distribution: i.e. P(S_n = m) = 1/(n+1) for m = 0, 1, ..., n. A standard situation in which the outcomes are exchangeable is what is today the classical Bayesian situation, where the unknown chance Θ that player 1 will win any match is considered to be a random variable, and the outcomes X_1, X_2, ..., X_n are considered to be i.i.d. Bernoulli(θ), conditional on a realized value θ of Θ; indeed, this is essentially the situation considered in Bayes' Billiard Table Problem (Stigler, 1986, pp. 124-125). Stigler (1982) notes that several commentators on Bayes' postulate have unfortunately misinterpreted it as specifying a uniform (prior) distribution on Θ as a way of summarizing perfect ignorance, and have argued that this notion of perfect ignorance about Θ is not invariant to reparametrizations, since monotone (nonlinear) transformations of a uniform random variable are not necessarily uniform. According to Schrieber (1987), the source of this (flawed) argument appears to be Fisher's address (interestingly enough!) to the Royal Society in the 1920s. In Schrieber's opinion, it is Fisher's criticism of Bayes' idea that eventually led to 'the emergence of different schools and is the main root of subjective Bayesianism!' But Bayes' postulate simply assigns a uniform distribution to the outcome variable S_n and does not correspond to a unique specification of the distribution of Θ; see the appendix, and also Edwards (1978). Observe that Bayes' postulate, correctly interpreted, is indeed invariant to monotone transformations owing to the discreteness of the random variable S_n.
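To make the scenario concrete, here is a minimal simulation sketch (ours, purely for illustration) of the classical Bayesian setup just described: Θ is drawn from Uniform(0,1) and, conditional on Θ, the trials are i.i.d. Bernoulli(Θ); the empirical distribution of S_n then comes out approximately uniform on {0, 1, ..., n}, as Bayes' postulate stipulates.

```python
import random
from collections import Counter

def simulate_s_n(n, reps=200_000, seed=0):
    """Simulate S_n under a Uniform(0,1) prior on the chance Theta
    that player 1 wins a single match (the billiard-table setup)."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(reps):
        theta = rng.random()                              # Theta ~ Uniform(0,1)
        s = sum(rng.random() < theta for _ in range(n))   # S_n | Theta ~ Binomial(n, Theta)
        counts[s] += 1
    return {m: counts[m] / reps for m in range(n + 1)}

print(simulate_s_n(5))   # each of the six masses is close to 1/6
```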
Variants and extensions of Bayes' postulate have been proposed by a number of authors. Schrieber (1987) wrote about an extended version of Bayes' postulate and its potential effect on statistical methods: the extended postulate requires the statement of two prior uniform distributions and provides a unique parameter representation (leaving no freedom for nonlinear parameter transformations that arise in Fisher's argument) and unique posterior statements which are useful for both small and large sample sizes. Coolen (1998a,b) also revisited the postulate, formulated a revised version based on Hill's assumption A(n) (Hill, 1968), and used it for statistical inferences about future trials given past trials via the notion of imprecise probabilities. Hill's assumption A(n) asserts that given past observations O_1, ..., O_n of a continuously valued quantity, a future observation O_{n+1} is equally likely to fall in any of the open intervals between successive order statistics of the past observations: i.e., with I_1, ..., I_{n+1} denoting the n+1 open intervals into which the ordered values O_{(1)} < O_{(2)} < ... < O_{(n)} partition the line,

$$P\big(O_{n+1} \in I_j \mid O_1, \dots, O_n\big) = \frac{1}{n+1}, \qquad j = 1, \dots, n+1.$$

Hill (1968) goes on to show that if ties have probability zero, for no n can we find a countably additive exchangeable distribution on the space of observations such that the conditional probabilities conform to A(n) for almost all O_1, ..., O_n; however, it is possible to exploit A(n) to make valid statistical inferences if one is willing to work with finitely additive distributions, as pursued in a later paper, Hill (1993). Very recently, an interesting extension of Bayes' postulate to trinomial trials was achieved by Diniz and Polpo (2012), who also discussed the philosophical implications of Bayes' postulate and its misinterpretation; but see also Good (1979). Their work appears in connection with the material in Section 5.
In this paper, we revisit the implications of Bayes' postulate for a sequence of exchangeable Bernoulli trials. For an infinite sequence of Bernoulli trials, the conditions of Bayes' postulate, which stipulates a discrete uniform distribution on S_n for all n, imply a uniform prior distribution on the probability of success, a fact independently (and differently) established by Murray (1930) and de Finetti (1930). We are able to show that under the exchangeability assumption, the conditions of Bayes' postulate are implied by a considerably weaker assumption on the distribution of S_n which only specifies its distribution at a single point. We show that this is true not only for an infinite sequence of exchangeable Bernoulli variables but also for a finite exchangeable sequence, in which situation a prior need not exist. In the process, we also develop a characterization of the distribution of a vector of exchangeable Bernoulli random variables in terms of a vector of monotonically decreasing probabilities. This characterization is closely related to prior results of de Finetti (1964) and also Wood (1992). Finally, we provide a generalization of Bayes' postulate to multinomial trials.

Bayes' postulate and the uniform prior
Consider the classical Bayesian situation as considered in the last paragraph of the previous section. In this situation, we have:

$$P(S_n = m) = \int_0^1 \binom{n}{m}\, \theta^m (1-\theta)^{n-m}\, dG(\theta), \qquad 0 \le m \le n,$$

where G is the (prior) distribution on Θ. If G is the Uniform(0,1) distribution, it is easy to check using properties of beta integrals that P(S_n = m) = 1/(n+1) for all 0 ≤ m ≤ n, so that the conditions of Bayes' postulate are satisfied. However, the converse is not true. Indeed, take any integer N > 0 and assume that

$$P(S_n = m) = \frac{1}{n+1}, \qquad 0 \le m \le n,\ 1 \le n \le N.$$

Proposition. For any finite N, the above set of conditions is not sufficient to guarantee that G is the Uniform(0,1) distribution.
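For completeness, the beta-integral computation just alluded to is one line:

$$P(S_n = m) = \binom{n}{m} \int_0^1 \theta^m (1-\theta)^{n-m}\, d\theta = \binom{n}{m}\, B(m+1,\, n-m+1) = \frac{n!}{m!\,(n-m)!} \cdot \frac{m!\,(n-m)!}{(n+1)!} = \frac{1}{n+1}.$$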
This proposition is established in the appendix, where for every fixed N we show that there exists a non-uniform G_N that satisfies all of the above conditions. Note that Bayes' postulate therefore does not imply uniformity of the underlying Θ, a point raised in the previous section. However, if N is allowed to go to infinity, the conditions of Bayes' postulate do imply that Θ has a Uniform(0,1) distribution.

Theorem 2.1. Let X_1, X_2, ... be an infinite sequence of exchangeable Bernoulli random variables such that P(S_n = m) = 1/(n+1) for all 0 ≤ m ≤ n and every n ≥ 1. Then the mixing variable Θ in De Finetti's representation has a Uniform(0,1) distribution.
The Weak Assumption. For an exchangeable Bernoulli sequence X 1 , X 2 , . . . , X n , Bayes' postulate would stipulate a uniform distribution on S n . Suppose we now make the considerably weaker assumption that P (S n = n) = 1/(n + 1). Thus, we only specify the distribution of S n at a single point, as opposed to the entire distribution.
The following then holds.
Theorem 2.2. Consider an infinite sequence of exchangeable Bernoulli random variables X_1, X_2, .... Then there exists a random variable Θ such that the X_i's are conditionally i.i.d. Bernoulli(θ) given Θ = θ; and, provided that the weak assumption is satisfied for every n, Θ has a Uniform(0,1) distribution.
Remarks. We avoid a formal proof of the theorem but outline the main ideas involved. The first part of the theorem, about the existence of Θ, is De Finetti's representation theorem; see, for example, Hewitt and Savage (1955) and Aldous (1985). Letting S denote the symmetric sigma-field, Θ can be taken to be E(X_1 | S), which is also the almost sure limit of the sample means S_n/n. We elaborate on this a bit further. Letting ε_1, ε_2, ..., ε_n be a sequence of 0's and 1's, the crux of the argument lies in the fact that, almost surely,

$$P(X_1 = \epsilon_1, \dots, X_n = \epsilon_n \mid \mathcal{S}) = \Theta^{s}\, (1-\Theta)^{n-s}, \qquad s = \sum_{i=1}^n \epsilon_i.$$

Using these facts, one shows that, conditionally on Θ = E(X_1 | S), the X_i's are i.i.d. Bernoulli(Θ).
That the weak assumption (for every n) implies a uniform prior on Θ follows easily, since only the conditions P(S_n = n) = 1/(n+1) are used in the proof of Theorem 2.1. Thus, the weak assumption implies Bayes' postulate for an infinite exchangeable sequence. This implication also holds true for a finite exchangeable sequence, as shown next.
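To spell out the moment argument: conditional on Θ, S_n is Binomial(n, Θ), so the weak assumption determines every moment of Θ:

$$P(S_n = n) = E\big[P(S_n = n \mid \Theta)\big] = E[\Theta^n] = \frac{1}{n+1} = \int_0^1 \theta^n\, d\theta, \qquad n = 1, 2, \dots,$$

and since a distribution supported on [0, 1] is uniquely determined by its moments, Θ must be Uniform(0, 1).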

Bayes' postulate, the weak assumption and finitely many exchangeable Bernoulli variables
Our goal in this section is to explore the connection between Bayes' postulate and the weak assumption when we have a finite sequence of exchangeable Bernoulli random variables X_1, X_2, ..., X_N. With a finite sequence, it may no longer be assumed that there is a 'mixing' random variable Θ assuming values in [0, 1], conditional on which the X_i's are independent Bernoulli(Θ) random variables. In other words, exchangeability can no longer be viewed in the classical Bayesian set-up. Indeed, for every n, it is possible to construct an exchangeable sequence of n Bernoulli random variables such that no random Θ assuming values in [0, 1] can act as a mixing variable. For example, consider simple random sampling without replacement from a population of n individuals where np (assumed an integer, 0 < p < 1) are of Type A and the remaining are of Type B. Let the size of the sample be n as well (we exhaust the population) and let X_1, X_2, ..., X_n be the random sample obtained, with X_i = 1 if the i'th draw yields an individual of Type A and 0 otherwise. It is not difficult to show that the Bernoulli random variables X_1, X_2, ..., X_n are exchangeable, i.e. any permutation of the X_i's has the same distribution as the original sequence. However, a mixing variable Θ cannot exist in this situation. For if it did, given Θ, the X_i's would be conditionally independent and identically distributed Bernoulli(Θ) and it would follow that Cov(X_i, X_j) = Var(Θ) ≥ 0 for all i ≠ j. It is easy to check that in the given situation Cov(X_i, X_j) < 0 for i ≠ j; see also the example in the introduction of Diaconis and Freedman (1980). However, as the following theorem shows, the weak assumption does imply Bayes' postulate in this setting as well: the equivalence of the weak assumption and Bayes' postulate is purely a consequence of exchangeability and not of a De Finetti type representation. For finite forms of De Finetti's theorem, see Diaconis (1977); for approximations to the distribution of a finite exchangeable Bernoulli sequence by a mixture of i.i.d. random variables under appropriate conditions, see Diaconis and Freedman (1980); and for finite exchangeability in the context of binomial mixtures, see Wood (1992).
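A minimal sketch in exact arithmetic (the population sizes are illustrative) confirming that exhaustive sampling without replacement yields negative covariances, ruling out a mixing variable:

```python
from fractions import Fraction

def cov_without_replacement(n, K):
    """Exact Cov(X_i, X_j), i != j, when all n individuals are drawn
    without replacement from a population with K of Type A."""
    p = Fraction(K, n)                    # P(X_i = 1)
    p11 = p * Fraction(K - 1, n - 1)      # P(X_i = 1, X_j = 1)
    return p11 - p * p                    # simplifies to p(p-1)/(n-1) < 0

print(cov_without_replacement(10, 4))     # -2/75: negative, so no mixing Theta exists
```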
Theorem 3.1. Let X_1, X_2, ..., X_N be a sequence of exchangeable Bernoulli random variables. Assume that P(S_n = n) = 1/(n+1) for n = 1, 2, ..., N, where $S_n = \sum_{i=1}^n X_i$ is the number of successes in n trials. Then S_n has a discrete uniform distribution for each n ≤ N.
Proof. Let $p^{(n)}_k = P(S_n = k)$ denote the probability of k successes in n trials for generic n and k ≤ n. By exchangeability,

$$P(X_1 = \cdots = X_k = 1,\ X_{k+1} = \cdots = X_n = 0) = \frac{p^{(n)}_k}{\binom{n}{k}}.$$

Now the left side of the above display is simply

$$P(X_1 = \cdots = X_k = 1,\ X_{k+1} = \cdots = X_{n+1} = 0) + P(X_1 = \cdots = X_k = 1,\ X_{k+1} = \cdots = X_n = 0,\ X_{n+1} = 1),$$

which, by exchangeability, equals

$$\frac{p^{(n+1)}_k}{\binom{n+1}{k}} + P(X_1 = \cdots = X_k = 1,\ X_{n+1} = 1,\ X_{k+1} = \cdots = X_n = 0),$$

which, by exchangeability again, is

$$\frac{p^{(n+1)}_k}{\binom{n+1}{k}} + \frac{p^{(n+1)}_{k+1}}{\binom{n+1}{k+1}}.$$

It follows that for each k with 0 ≤ k ≤ n,

$$\frac{p^{(n)}_k}{\binom{n}{k}} = \frac{p^{(n+1)}_k}{\binom{n+1}{k}} + \frac{p^{(n+1)}_{k+1}}{\binom{n+1}{k+1}}. \tag{3.1}$$

The proof now follows by induction. For n = 1, it is trivially seen that S_1 ≡ X_1 assumes the values 0 and 1 with equal probability 1/2, since P(S_1 = 1) = 1/2 by the weak assumption. So we assume that the result holds for n and establish it for n+1. By the weak assumption, $p^{(n+1)}_{n+1}$ = 1/(n+2), and the induction hypothesis gives $p^{(n)}_k$ = 1/(n+1) for 0 ≤ k ≤ n, so that solving (3.1) downward in k = n, n-1, ..., 0 determines $p^{(n+1)}_n, p^{(n+1)}_{n-1}, \dots, p^{(n+1)}_0$ uniquely. Since the binomial identity

$$\frac{1}{\binom{n+1}{k}} + \frac{1}{\binom{n+1}{k+1}} = \frac{n+2}{(n+1)\binom{n}{k}}$$

shows that $p^{(n+1)}_k = 1/(n+2)$ for all 0 ≤ k ≤ n+1 satisfies (3.1), this unique solution must be the uniform one. Hence S_{n+1} has the discrete uniform distribution on {0, 1, ..., n+1}, completing the induction.

Remark. The weak assumption could also have been formulated as: P(S_n = 0) = 1/(n+1) for 1 ≤ n ≤ N. This would imply that $P(\bar S_n = n) = 1/(n+1)$ for all n, where $\bar S_n = Y_1 + Y_2 + \cdots + Y_n$ with $Y_i = 1 - X_i$. Since the X_i's are exchangeable Bernoullis, so are the Y_i's. Theorem 3.1 then leads to the conclusion that each $\bar S_n$ is uniformly distributed on {0, 1, ..., n}. Hence, each $S_n = n - \bar S_n$ also has a discrete uniform distribution.
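The induction can be replayed mechanically. A minimal sketch (ours) that recovers the pmf of each S_n from the recursion (3.1), starting only from the weak assumption P(S_n = n) = 1/(n+1):

```python
from fractions import Fraction
from math import comb

def pmf_from_weak_assumption(N):
    """Recover the pmf of S_n for n = 1..N from recursion (3.1),
    using only the weak assumption P(S_n = n) = 1/(n+1)."""
    p = [Fraction(1, 2), Fraction(1, 2)]          # pmf of S_1
    for n in range(1, N):
        nxt = [Fraction(0)] * (n + 2)
        nxt[n + 1] = Fraction(1, n + 2)           # the weak assumption
        for k in range(n, -1, -1):                # solve (3.1) downward in k
            nxt[k] = comb(n + 1, k) * (p[k] / comb(n, k)
                                       - nxt[k + 1] / comb(n + 1, k + 1))
        p = nxt
    return p

print(pmf_from_weak_assumption(6))   # all seven entries equal 1/7: S_6 is uniform
```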

A characterization of exchangeable distributions on {0, 1}^N
The considerations of the previous section raise a natural question: for which sequences 1 ≥ p_1 ≥ p_2 ≥ · · · ≥ p_N ≥ 0 do there exist bona fide probability distributions for a sequence X_1, X_2, ..., X_N of exchangeable Bernoulli random variables such that P(S_n = n) = p_n for all n = 1, 2, ..., N? We have already seen that there is at least one sequence that works: namely, p_n = 1/(n+1) for 1 ≤ n ≤ N.
We show below that such sequences are well-characterized and that, in fact, there is a one-to-one correspondence between such sequences and the class of all exchangeable probability distributions on {0, 1}^N. So, consider the set $\{(q_0, q_1, \dots, q_N) : q_i \ge 0,\ \sum_{i=0}^N q_i = 1\}$. Each vector in this set corresponds to a unique exchangeable distribution on {0, 1}^N which is completely determined by the requirement that P(S_N = n) = q_n, for 0 ≤ n ≤ N, the probability mass being assigned to a generic sequence $(\epsilon_1, \epsilon_2, \dots, \epsilon_N) \in \{0,1\}^N$ by this distribution being $q_n / \binom{N}{n}$, where $n = \sum_i \epsilon_i$. We now compute P(S_n = n) for this distribution. This is simply the sum, over 0 ≤ k ≤ N − n, of the probability that X_1 = X_2 = · · · = X_n = 1, k of the remaining X_i's are 1 and the remaining N − (n+k) are all 0. For a fixed k, there are $\binom{N-n}{k}$ such points in {0, 1}^N and the probability of each such sequence is simply $q_{n+k}/\binom{N}{n+k}$. It follows that

$$P(S_n = n) = \sum_{k=0}^{N-n} C_{n,k}\, q_{n+k},$$

where

$$C_{n,k} = \frac{\binom{N-n}{k}}{\binom{N}{n+k}} = \frac{(k+1)(k+2)\cdots(k+n)}{(N-n+1)(N-n+2)\cdots N}.$$
Letting q = (q_1, q_2, ..., q_N) and p = (p_1, p_2, ..., p_N) (where p_n denotes P(S_n = n)), we see that the two vectors are connected by the equation

$$p = A_N\, q,$$

where A_N is an N × N upper triangular matrix with the n'th row given by $(0_{1\times(n-1)},\ C_{n,0},\ C_{n,1},\ C_{n,2},\ \dots,\ C_{n,N-n})$. Since A_N is (clearly) non-singular, it follows that the family of exchangeable probability distributions on {0, 1}^N can equally well be characterized by (p_1, p_2, ..., p_N). The equation in the above display therefore provides a complete characterization of all sequences 1 ≥ p_1 ≥ · · · ≥ p_N ≥ 0 that correspond to (and determine) valid exchangeable distributions on {0, 1}^N.
Remark 4.2. Given a sequence $1 \ge \tilde p_1 \ge \cdots \ge \tilde p_N \ge 0$, it can be easily ascertained whether this corresponds to a bona fide exchangeable distribution on {0, 1}^N by computing $\tilde q = A_N^{-1} \tilde p$ (where $\tilde p = (\tilde p_1, \dots, \tilde p_N)^T$) and checking that the entries of $\tilde q$ are non-negative and sum up to at most 1.
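A minimal sketch (ours; N = 5 is illustrative) implementing the correspondence and the check of Remark 4.2 in exact arithmetic; here C_{n,k} is coded in the equivalent form $\binom{N-n}{k}/\binom{N}{n+k}$:

```python
from fractions import Fraction
from math import comb

def A_matrix(N):
    """The N x N upper-triangular matrix A_N with rows
    (0,...,0, C_{n,0}, ..., C_{n,N-n}), C_{n,k} = binom(N-n,k)/binom(N,n+k)."""
    A = [[Fraction(0)] * N for _ in range(N)]
    for n in range(1, N + 1):
        for k in range(N - n + 1):
            A[n - 1][n + k - 1] = Fraction(comb(N - n, k), comb(N, n + k))
    return A

def q_from_p(p):
    """Solve p = A_N q by back-substitution (A_N is upper triangular)."""
    N = len(p)
    A = A_matrix(N)
    q = [Fraction(0)] * N
    for n in range(N - 1, -1, -1):
        q[n] = (p[n] - sum(A[n][j] * q[j] for j in range(n + 1, N))) / A[n][n]
    return q

N = 5
p = [Fraction(1, n + 1) for n in range(1, N + 1)]   # the weak assumption
q = q_from_p(p)
print(q, 1 - sum(q))   # every q_n (and q_0) equals 1/(N+1): S_N is uniform
```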
The next result provides an explicit characterization of the distribution of S_n for each n in terms of the vector p.

Proposition 4.1. For 1 ≤ n ≤ N and 0 ≤ k ≤ n, let $p^{(n)}_k = P(S_n = k)$, so that the pmf of S_n is $\{p^{(n)}_k\}_{k=0}^{n}$. In terms of the vector p, $1 \equiv p_0 \ge p_1 \ge \cdots \ge p_N \ge 0$, where, for each m > 0, $p_m = p^{(m)}_m$, we may obtain the pmf for S_n as follows:

$$p^{(n)}_k = \sum_{t=0}^{n-k} (-1)^t\, \frac{n!}{k!\, t!\, (n-k-t)!}\; p_{k+t}. \tag{4.1}$$

See also de Finetti (1964), p. 122, Eq. (5) for an alternative formulation in terms of finite difference operators.

Proof.
$$\begin{aligned} p^{(n)}_k &= \sum_{\epsilon:\ \sum_i \epsilon_i = k} P(X_1 = \epsilon_1, \dots, X_n = \epsilon_n) \\ &= \binom{n}{k}\, P(X_1 = \cdots = X_k = 1,\ X_{k+1} = \cdots = X_n = 0) \\ &= \binom{n}{k} \sum_{C \subseteq \{k+1,\dots,n\}} (-1)^{|C|}\, P(X_1 = \cdots = X_k = 1,\ X_i = 1,\, i \in C) \\ &= \binom{n}{k} \sum_{C \subseteq \{k+1,\dots,n\}} (-1)^{|C|}\, p_{k+|C|} \\ &= \binom{n}{k} \sum_{t=0}^{n-k} (-1)^t \binom{n-k}{t}\, p_{k+t} \\ &= \sum_{t=0}^{n-k} (-1)^t \binom{n}{k,\,t}\, p_{k+t}. \end{aligned}$$

Here the second equation follows by exchangeability; the third via Möbius inversion (see, e.g., Proposition 1 in Drton and Richardson (2008)); the fourth via exchangeability; the fifth by counting subsets of a given size; the conclusion follows by the definition of a trinomial coefficient: $\binom{n}{k,\,t} := n!/(k!\, t!\, (n-k-t)!)$. Note that in the third line, when k = 0, the term in the sum corresponding to C = ∅ is $(-1)^{|C|} P(X_i = 1, i \in C) = 1$, since the event is vacuously true.
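A quick numerical confirmation of (4.1) (a sketch, not part of the proof): with p_m = 1/(m+1), the alternating sum returns 1/(n+1) for every k, in agreement with Theorem 3.1.

```python
from fractions import Fraction
from math import factorial

def pmf_via_4_1(n, k, p):
    """Evaluate (4.1): P(S_n = k) = sum_t (-1)^t n!/(k! t! (n-k-t)!) p_{k+t},
    where p[m] = P(X_1 = ... = X_m = 1) and p[0] = 1."""
    total = Fraction(0)
    for t in range(n - k + 1):
        coef = Fraction(factorial(n),
                        factorial(k) * factorial(t) * factorial(n - k - t))
        total += (-1) ** t * coef * p[k + t]
    return total

p = [Fraction(1, m + 1) for m in range(11)]        # Bayes' postulate: p_m = 1/(m+1)
print([pmf_via_4_1(6, k, p) for k in range(7)])    # every entry equals 1/7
```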
Remark 4.3. Notice that taking n = N in (4.1) gives

$$p^{(N)}_k = \binom{N}{k}\, p_k + \sum_{t=1}^{N-k} (-1)^t \binom{N}{k,\,t}\, p_{k+t},$$

which is a linear function of p_k, ..., p_N. Further note that for k ≥ 0 we have $\sum_{j=k}^{N} p^{(N)}_j \le 1$, with equality when k = 0. Hence, given valid values for p_N, ..., p_{k+1}, we have the following upper bound:

$$p_k \;\le\; \binom{N}{k}^{-1}\left(1 - \sum_{j=k+1}^{N} p^{(N)}_j - \sum_{t=1}^{N-k} (-1)^t \binom{N}{k,\,t}\, p_{k+t}\right).$$

Note that this bound is achievable, since given valid values of p_{k+1}, ..., p_N, we can first determine $\{p^{(N)}_j\}_{j=k+1}^{N}$ uniquely, using Proposition 4.1, and then set $p^{(N)}_k = 1 - \sum_{j=k+1}^N p^{(N)}_j$ and $p^{(N)}_j = 0$ for j < k. Conversely, since $p^{(N)}_k \ge 0$ we have the following lower bound on p_k:

$$p_k \;\ge\; -\binom{N}{k}^{-1} \sum_{t=1}^{N-k} (-1)^t \binom{N}{k,\,t}\, p_{k+t}.$$

Again, for k > 0 the bound is achievable, as it corresponds to setting $p^{(N)}_k = 0$. Lastly, simple algebra shows that the difference between the upper and lower bounds is:

$$\binom{N}{k}^{-1}\left(1 - \sum_{j=k+1}^{N} p^{(N)}_j\right).$$

These bounds provide a means of specifying a joint distribution over the set of permissible vectors p: first, one specifies the marginal distribution of p_N, and conditional on a value of p_N, one specifies the conditional distribution of p_{N−1} so that it is concentrated on the interval given by the lower and upper bounds (which obviously depend on p_N), and so on.

Remark 4.5. The correspondence that we have established between p and q also bears similarity to the mapping between distributions on the number of successes out of n and distributions over exchangeable sequences considered by Wood (1992).

Bayes' postulate and multinomial trials
Good (1979) observed that Bayes' billiard ball argument could be easily extended to a multinomial context. More recently, Diniz and Polpo (2012) considered Bayes' postulate in the context of trinomial trials. They show, using the strategy in de Finetti (1930), that the assumption of equiprobability a priori for the possible outcomes of trinomial trials, in the spirit of Bayes' postulate, implies that the parameter vector must have a Dirichlet(1, 1, 1) prior, which can be viewed as a uniform distribution on the 2-dimensional simplex. Their proof relies on using a recursive equation for multinomial probabilities under exchangeability to obtain a differential equation for the probability generating function, the solution to which is then deployed to obtain a limiting characteristic function. In this section, we provide a proof of a general version of their result in the multinomial setting, using a somewhat different argument that relies on moments.

To set up the problem, let $\bar X_1, \bar X_2, \dots$ be an infinite sequence of exchangeable multinomial vectors with $\bar X_1 \sim \mathrm{Mult}(1, p_1, p_2, \dots, p_k, p_{k+1})$, where $p_{k+1} = 1 - \sum_{i=1}^k p_i$. Each $\bar X_i$ is then a vector containing 1's and 0's, of length k+1, with a solitary 1 at the l'th position if outcome l was realized among the k+1 exhaustive and mutually exclusive outcomes 1 through k+1. Let $\bar S_n \equiv (S_{n,1}, S_{n,2}, \dots, S_{n,k},\, n - \sum_{j=1}^k S_{n,j})$ be defined as $\sum_{j=1}^n \bar X_j$. Then $\bar S_n$ is simply the status of the multinomial trials at stage n. The natural extension of Bayes' postulate to this situation proceeds thus: under the assumption that we absolutely know nothing of the underlying mechanism, we should postulate that, for each n, all outcomes of $\bar S_n$ are equally likely, i.e. for each n, $\bar S_n$ has a discrete uniform distribution on the set $\{(x_1, x_2, \dots, x_k,\, n - \sum_{j=1}^k x_j) : x_j \ge 0,\ \sum_{j=1}^k x_j \le n\}$. Since the cardinality of this set is $\binom{n+k}{k}$, $P(\bar S_n = (x_1, \dots, x_k,\, n - \sum_j x_j)) = 1/\binom{n+k}{k}$ for a generic vector in the above set. Invoking the general De Finetti representation (Hewitt and Savage, 1955), there exists a random vector $\bar\Theta = (\Theta_1, \Theta_2, \dots, \Theta_k,\, 1 - \sum_{j=1}^k \Theta_j)$ such that the $\bar X_i$ are i.i.d. conditional on $\bar\Theta$; further, $\bar X_1 \mid \bar\Theta \sim \mathrm{Mult}(1, \bar\Theta)$ and $\bar S_n / n \to_{a.s.} \bar\Theta$. Letting G denote the distribution of $(\Theta_1, \Theta_2, \dots, \Theta_k)$ on the k-dimensional simplex $S_k \equiv \{(\theta_1, \dots, \theta_k) : \sum_{i=1}^k \theta_i \le 1,\ \theta_i \ge 0\}$, we can write:

$$P\Big(\bar S_n = \big(x_1, \dots, x_k,\, n - \textstyle\sum_j x_j\big)\Big) = \int_{S_k} \binom{n}{x_1,\dots,x_k}\, \theta_1^{x_1} \cdots \theta_k^{x_k}\, \Big(1 - \sum_{i=1}^k \theta_i\Big)^{n - \sum_j x_j}\, dG(\theta).$$

In the above display $\binom{n}{x_1,\dots,x_k} := n!/\big(x_1! \cdots x_k!\, (n - \sum_j x_j)!\big)$ is the usual multinomial coefficient. The above display implies that for all $\{(y_1, \dots, y_k) : y_i \ge 0,\ \sum y_i = n\}$,

$$\int_{S_k} \theta_1^{y_1} \cdots \theta_k^{y_k}\, dG(\theta) = \frac{1}{\binom{n+k}{k}\, \binom{n}{y_1,\dots,y_k}} = \frac{y_1!\, y_2! \cdots y_k!\; k!}{(n+k)!}.$$

It follows that all possible moments of (Θ_1, ..., Θ_k) are determined by the postulate, and provided that these correspond to the moments of a valid distribution G_0 on S_k, the G in question must be G_0, since distributions supported on bounded domains of Euclidean space are completely determined by their moments. Note that we defined the k-dimensional simplex, above, as a subset of R^k, rather than R^{k+1} as is usually done; of course, there is a 1-1 correspondence between the two sets. We show that choosing G_0 to be the Dirichlet(1, 1, ..., 1) distribution on S_k works. Recall that the Dirichlet(β_1, β_2, ..., β_{k+1}) distribution on S_k is given by the following density function:

$$g(\theta_1, \dots, \theta_k) = \frac{\Gamma\big(\sum_{i=1}^{k+1} \beta_i\big)}{\prod_{i=1}^{k+1} \Gamma(\beta_i)}\; \theta_1^{\beta_1 - 1} \cdots \theta_k^{\beta_k - 1}\, \Big(1 - \sum_{i=1}^k \theta_i\Big)^{\beta_{k+1} - 1}.$$

Now we take G_0 to be the Dirichlet with all β_i's equal to 1 and compute a generic moment of (Θ_1, ..., Θ_k) under G_0:

$$E_{G_0}\big[\Theta_1^{y_1} \cdots \Theta_k^{y_k}\big] = k! \int_{S_k} \theta_1^{y_1} \cdots \theta_k^{y_k}\, d\theta = \frac{y_1! \cdots y_k!\; k!}{(n+k)!}, \qquad n = \sum_{i=1}^k y_i,$$

which matches the moments displayed above. We conclude that the prior distribution implied by Bayes' postulate in the multinomial setting is precisely the uniform distribution over S_k (note that S_k has volume 1/k!).
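The conclusion is easy to check by simulation in the trinomial case k = 2 (a rough Monte Carlo sketch; numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 3, 200_000

freq = {}
for _ in range(reps):
    theta = rng.dirichlet((1.0, 1.0, 1.0))   # (Theta_1, Theta_2, Theta_3) ~ Dirichlet(1,1,1)
    x = rng.multinomial(n, theta)            # S_n | Theta ~ Mult(n, Theta)
    key = (int(x[0]), int(x[1]))             # third count is n - x1 - x2
    freq[key] = freq.get(key, 0) + 1 / reps

# binom(n+2, 2) = 10 possible outcomes, each with postulated mass 1/10
print(len(freq), min(freq.values()), max(freq.values()))
```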

Multinomial random variables and insights into the 'weak assumption'
Recall that for a finite exchangeable sequence of Bernoullis, say X_1, X_2, ..., X_N, the weak assumption specified the distribution of S_n at the point n for each 1 ≤ n ≤ N. Our assumption was that P(S_n = n) = 1/(n+1) and we showed that this forced the distribution of each S_n to be uniform. The argument relied on the recursion formula (3.1), which can be viewed as a system of n+1 linear equations in the n+2 unknowns $\{p^{(n+1)}_k\}_{k=0}^{n+1}$ (subject to $\sum_k p^{(n+1)}_k = 1$). It follows that the distribution of S_{n+1} cannot be uniquely recovered from that of S_n (via these equations) unless one of the $p^{(n+1)}_k$ is pre-specified. And this, by specifying $\{p^{(n)}_n\}_{n=1}^{N}$, is precisely what the weak assumption does. Specifying $p^{(1)}_1$ pins down the distribution of S_1 ≡ X_1 as Bernoulli(1/2), and now, using the pre-specified values of $p^{(2)}_2, p^{(3)}_3, \dots$, the distribution of each S_n can be recovered uniquely. Note that alternative specifications would also have worked, as the remark prior to Section 4 shows. In fact, all that is needed for unique recovery is a specification of the vector $\{p^{(n)}_{k_n}\}_{n=1}^{N}$, where (k_1, k_2, ..., k_N) is an arbitrary set of integers with 0 ≤ k_n ≤ n; and, provided $p^{(n)}_{k_n}$ is set to 1/(n+1) for each n, the distributions of the S_n's will be seen to be discrete uniforms as before. The specifications we discussed in Section 3, where we set all k_n's to be n (or 0), are aesthetically somewhat more pleasing as they correspond to probability statements about all n trials resulting in successes (failures).
It is natural to ask whether there are analogous weak assumptions in the general multinomial setting. We discuss this briefly in the trinomial setting as considered in Diniz and Polpo (2012). The analogue of (3.1) in the trinomial context is equation (14) of their paper:

$$\frac{\omega^{(n-1)}_{x_1,x_2}}{\binom{n-1}{x_1,\,x_2}} \;=\; \frac{\omega^{(n)}_{x_1+1,x_2}}{\binom{n}{x_1+1,\,x_2}} + \frac{\omega^{(n)}_{x_1,x_2+1}}{\binom{n}{x_1,\,x_2+1}} + \frac{\omega^{(n)}_{x_1,x_2}}{\binom{n}{x_1,\,x_2}}, \tag{5.1}$$

where $\omega^{(n)}_{x_1,x_2} = P(\bar S_n = (x_1, x_2, n - x_1 - x_2))$, using our notation from the general multinomial setting with k = 2, and $\binom{n}{a,\,b} := n!/(a!\, b!\, (n-a-b)!)$. The distribution of $\bar S_{n-1}$ is determined by the vector $\{\omega^{(n-1)}_{x_1,x_2}\}$; to recover the distribution of $\bar S_n$ from these equations, at least n+1 of the $\omega^{(n)}_{x_1,x_2}$ would need to be specified.³ Thus, a 'weak assumption' in this setting would need to specify the distribution of $\bar S_1$ at 2 points, the distribution of $\bar S_2$ at 3, and so on till $\bar S_N$. A 'weak assumption' that, for each n ≤ N, sets the value of each of some n+1 pre-specified $\omega^{(n)}_{x_1,x_2}$'s to be $1/\binom{n+2}{2} = 2/((n+1)(n+2))$ would force the distribution of each $\bar S_n$, 1 ≤ n ≤ N, to be the discrete uniform, as can be readily verified via an inspection of equation (5.1): it is satisfied if we set $\omega^{(n-1)}_{x_1,x_2} = 1/\binom{n+1}{2} = 2/(n(n+1))$ and $\omega^{(n)}_{x_1,x_2} = 2/((n+1)(n+2))$. We note, as before, that with finitely many exchangeable trinomials, it is not possible to talk about a prior distribution on the trinomial parameter and any analysis of the distributions of the $\bar S_n$'s has to start from (5.1), which follows purely from the exchangeability hypothesis.
³ Note that as the variables have to add up to 1, there are only $\binom{n+2}{2} - 1$ 'free' variables; on the other hand, there are only $\binom{n+1}{2} - 1$ linearly independent equations, since the $\omega^{(n-1)}_{x_1,x_2}$ have to add up to 1. So, the difference between the number of 'free' variables and the number of independent equations is n + 1.
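The verification that the uniform values satisfy (5.1), asserted above, can be carried out mechanically (a sketch in exact arithmetic):

```python
from fractions import Fraction
from math import factorial

def trinom(n, a, b):
    """Trinomial coefficient n!/(a! b! (n-a-b)!)."""
    return Fraction(factorial(n), factorial(a) * factorial(b) * factorial(n - a - b))

def check_5_1(n, x1, x2):
    """Check (5.1) with the uniform values omega^{(n-1)} = 2/(n(n+1))
    and omega^{(n)} = 2/((n+1)(n+2))."""
    lhs = Fraction(2, n * (n + 1)) / trinom(n - 1, x1, x2)
    rhs = Fraction(2, (n + 1) * (n + 2)) * (1 / trinom(n, x1 + 1, x2)
                                            + 1 / trinom(n, x1, x2 + 1)
                                            + 1 / trinom(n, x1, x2))
    return lhs == rhs

print(all(check_5_1(n, x1, x2)
          for n in range(2, 8) for x1 in range(n) for x2 in range(n - x1)))  # True
```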
Following the discussion for the trinomial situation, it is not difficult to see that, for a k+1 compartment multinomial, the analogue of (5.1) is

$$\frac{\omega^{(n-1)}_{x_1,\dots,x_k}}{\binom{n-1}{x_1,\dots,x_k}} \;=\; \sum_{j=1}^{k} \frac{\omega^{(n)}_{(x_1,\dots,x_k)+e_j}}{\binom{n}{(x_1,\dots,x_k)+e_j}} + \frac{\omega^{(n)}_{x_1,\dots,x_k}}{\binom{n}{x_1,\dots,x_k}}, \tag{5.3}$$

where e_j denotes the j'th coordinate vector (the last term corresponds to the extra trial falling in the (k+1)'st compartment, which leaves the first k counts unchanged), and that a unique recovery of the distribution of $\{\bar S_n\}_{n=1}^{N}$ using (5.3) requires, for each n ≤ N, the pre-specification of $\binom{n+k-1}{k-1}$ of the $\omega^{(n)}$'s; setting each of these to $1/\binom{n+k}{k}$ again forces each $\bar S_n$ to be discrete uniform. Remark. The proof in Section 4 of Diniz and Polpo (2012), of the fact that Bayes' postulate implies the uniform (Dirichlet) prior in the trinomial situation, was based upon their equation (14), which is what (5.3) reduces to for k = 2. It would seem that their method, involving difference-differential equations, should be extendible to the multinomial case, where the relevant differential equations would be derived from (5.3).

Concluding remarks
As we have seen, the equal assignment of probabilities to all possible outcomes in an exchangeable multinomial experiment (with infinitely many trials), i.e. postulating that each S_n is uniformly distributed, forces a uniform distribution on the prior parameter. However, it is important to note the fundamental distinction between imposing uniformity on the outcome variable of an experiment and imposing uniformity on a prior, a point that has often been glossed over. We have noted that with finitely many exchangeable Bernoullis (the simplest possible multinomial variables), many different prior distributions can produce uniformity of the outcome distribution. It would be interesting to investigate the connections between outcome distributions in a general infinitely exchangeable experiment and the prior distribution that would uniquely exist in this case. More specifically, can we characterize all infinitely exchangeable experiments for which the assumption of equiprobability of outcomes forces a uniform distribution on the prior? As Diniz and Polpo (2012) note in their conclusion, for certain experiments, uniformity of the outcome distribution might actually translate to a non-uniform prior, in which case the uniform prior should not be regarded as a non-informative prior. Such considerations would go a long way towards extending our understanding of Bayes' postulate and its implications in general.

Appendix
We establish a slightly stronger version of the Proposition stated in Section 2.
Proof. For a positive integer P, let g denote a density on [0, 1] that assumes constant values L_1, L_2, ..., L_P on the intervals [0, 1/P], (1/P, 2/P], ..., ((P−1)/P, 1], respectively, and such that S_n is uniformly distributed for each 1 ≤ n ≤ N under g. The constraints

$$P(S_n = m) = \int_0^1 \binom{n}{m}\, \theta^m (1-\theta)^{n-m}\, g(\theta)\, d\theta = \frac{1}{n+1}, \qquad 0 \le m \le n,\ 1 \le n \le N,$$

and the fact that g integrates to 1 give rise to a system of linear equations in L_1, L_2, ..., L_P, namely

$$\sum_{i=1}^P I_{m,n,i}\, L_i = \frac{1}{n+1}, \quad 0 \le m \le n,\ 1 \le n \le N, \qquad \text{and} \qquad \sum_{i=1}^P \frac{L_i}{P} = 1,$$

where $I_{m,n,i} = \int_{(i-1)/P}^{i/P} \binom{n}{m}\, \theta^m (1-\theta)^{n-m}\, d\theta$. This can be written in matrix notation as M l = v for a (c(N)+1) × P matrix M whose last row is the vector (1/P, 1/P, ..., 1/P) and whose rows above are given by (I_{m,n,1}, I_{m,n,2}, ..., I_{m,n,P}), stacked in order of increasing n, and increasing m within n; v is a (c(N)+1)-dimensional vector of the form (1/2, 1/2, 1/3, 1/3, 1/3, 1/4, ..., 1/(N+1), 1/(N+1), 1)^T; and l = (L_1, L_2, ..., L_P)^T. Here $c(N) = \sum_{n=1}^N (n+1)$. Now, P can be chosen to be larger than c(N)+1, in which case the columns of M are linearly dependent and there must exist a non-zero vector x such that $M x = 0_{(c(N)+1)\times 1}$. Since x is orthogonal to the last row of M, whose entries are all equal, the components of x cannot all be equal. We know that the constant vector $l_0 = (1, 1, \dots, 1)^T$, which corresponds to the Uniform(0,1) density, satisfies $M l_0 = v$. For ε > 0 sufficiently small, the vector $l_0 + \epsilon x$ has all entries non-negative, is non-constant, and satisfies $M(l_0 + \epsilon x) = v$; the corresponding piecewise constant density g_N is therefore non-uniform, yet renders S_n uniformly distributed for every n ≤ N.
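The construction can be carried out numerically. A sketch (N and P illustrative; numpy assumed) that builds M, extracts a null vector x, and perturbs the uniform density:

```python
import numpy as np
from math import comb

N, P = 3, 12                 # P > c(N) + 1 = 10 ensures a null vector exists

def I(m, n, i):
    """Exact integral of binom(n,m) theta^m (1-theta)^(n-m) over ((i-1)/P, i/P]."""
    def F(t):                # antiderivative, via binomial expansion of (1-theta)^(n-m)
        return sum(comb(n, m) * comb(n - m, j) * (-1) ** j * t ** (m + j + 1) / (m + j + 1)
                   for j in range(n - m + 1))
    return F(i / P) - F((i - 1) / P)

rows = [[I(m, n, i) for i in range(1, P + 1)]
        for n in range(1, N + 1) for m in range(n + 1)]
rows.append([1 / P] * P)                        # g must integrate to 1
M = np.array(rows)
v = np.array([1 / (n + 1) for n in range(1, N + 1) for _ in range(n + 1)] + [1.0])

l0 = np.ones(P)                                 # the uniform density solves M l = v
x = np.linalg.svd(M)[2][-1]                     # a null vector of M (rank <= 10 < 12)
l = l0 + 0.5 * x / np.abs(x).max()              # still non-negative, no longer constant
print(np.allclose(M @ l, v), l.min() > 0, l.max() - l.min() > 0)
```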