Projective Limit Random Probabilities on Polish Spaces

A pivotal problem in Bayesian nonparametrics is the construction of prior distributions on the space M(V) of probability measures on a given domain V. In principle, such distributions on the infinite-dimensional space M(V) can be constructed from their finite-dimensional marginals---the most prominent example being the construction of the Dirichlet process from finite-dimensional Dirichlet distributions. This approach is both intuitive and applicable to the construction of arbitrary distributions on M(V), but also hamstrung by a number of technical difficulties. We show how these difficulties can be resolved if the domain V is a Polish topological space, and give a representation theorem directly applicable to the construction of any probability distribution on M(V) whose first moment measure is well-defined. The proof draws on a projective limit theorem of Bochner, and on properties of set functions on Polish spaces to establish countable additivity of the resulting random probabilities.


Introduction
A variety of ways exists to construct the Dirichlet process. For this particular case of a random probability measure, the spectrum of construction approaches ranges from the projective limit construction from finite-dimensional Dirichlet distributions proposed by Ferguson [8] to the stick-breaking construction of Sethuraman [25]; see e.g. the survey by Walker et al. [27] for an overview. Most of these constructions are bespoke representations more or less specific to the Dirichlet. An exception is the projective limit representation, which can represent any probability distribution on the space of probability measures. However, several authors [e.g. 12,13] have noted technical problems arising for this construction. The key role of the Dirichlet process, and the proven utility of its representation by stick-breaking or by Poisson processes, may account for the slightly surprising fact that these problems have not yet been addressed in the literature.
The purpose of this paper is to provide a projective limit result directly applicable to the construction of any probability distribution on M(V ). We do so by first modifying and then proving a construction idea put forth by Ferguson [8]. Intuitively speaking, our main result (Theorem 1.1) allows us to construct distributions on M(V ) by substituting the Dirichlet distributions used in the derivation of the Dirichlet process by other families of distributions, and by verifying that these families satisfy the two necessary and sufficient conditions of the theorem. Stick-breaking, urn schemes [3] and other specialized representations of the Dirichlet process all rely on the latter's particular discreteness and spatial decorrelation properties. Our approach may facilitate the derivation of models for which no such representations can be expected to exist, for example, of smooth random measures. For Bayesian nonparametrics, the result provides what currently seems to be the only available tool to construct an arbitrary prior distribution on the set M(V ). It also makes Bayesian methods based on random measures more readily comparable to other types of nonparametric priors constructed in a similar fashion, notably to Gaussian processes [2,28].
The technical difficulties arising for the construction proposed in [8] can be summarized as three separate problems, which Appendix A reviews in detail. In short:
(i) Product spaces. The product space setting of the standard Kolmogorov extension theorem is not well-adapted to the problem of constructing random probability measures.
(ii) Measurability problems. A straightforward formalization of the construction in terms of an extension or projective limit theorem results in a space whose dimensions are labeled by the Borel sets of V, and which is hence of uncountable dimension. As a consequence, the constructed measure cannot resolve most events of interest. In particular, singletons, and hence the event that the random measure assumes a specific measure as its value, are not measurable [13, Sec. 2.3.2].
(iii) σ-additivity. The constructed measure is supported on finitely additive probabilities (charges), rather than σ-additive probabilities (measures); see Ghosal [12, Sec. 2.2]. Further conditions are necessary to obtain a measure on probability measures.
To make the projective limit construction feasible, we have to impose some topological requirements on the domain V of the random measure. Specifically, we require that V is a Polish space, i.e. a topological space which is complete, separable and metrizable [17]. This setting is sufficiently general to accommodate any application in Bayesian nonparametrics: Bayesian methods do not require the generality of arbitrary measurable spaces, since no useful notion of conditional probability can be defined without a modicum of topological structure. Polish spaces are in many regards the natural habitat of Bayesian statistics, whether parametric or nonparametric, since they guarantee both the existence of regular conditional probabilities and the validity of de Finetti's theorem [16, Theorem 11.10]. The restriction to Polish spaces is hence unlikely to incur any loss of generality. We address problem (i) by means of a generalization of Kolmogorov's extension theorem, due to Bochner [4]; problem (ii) by means of the fact that the Borel σ-algebra of a Polish space V is generated by a countable subsystem of sets, which allows us to substitute the uncountable-dimensional projective limit space by a countable-dimensional surrogate; and problem (iii) using a result of Harris [14] on σ-additivity of set functions on Polish spaces.
The remainder of the article is structured as follows: The main result is stated in Sec. 1.1, which is meant to provide all information required to apply the theorem, without going into the details of the proof. Related work is summarized in Sec. 1.3. A brief overview of projective limit constructions is given in Sec. 2, to the extent relevant to the proof. Secs. 3 and 4 contain the actual proof of Theorem 1.1: The projective limit construction of random set functions is described in Sec. 3. A necessary and sufficient condition for these random set functions to be σ-additive is given in Sec. 4. Appendix A reviews problems (i)-(iii) above in more detail.

Main result
To state our main theorem, we must introduce some notation, and specify the relevant notion of a marginal distribution in the present context. Let $M(V)$ be the set of Borel probability measures over a Polish topological space $(V, T_V)$; recall that the space is Polish if $T_V$ is a metrizable topology under which $V$ is complete and separable [1,17]. Throughout, the underlying model of randomness is an abstract probability space $(\Omega, \mathcal{A}, \mathbb{P})$. A random variable $X : \Omega \to M(V)$, with the image measure $P := X(\mathbb{P})$ as its distribution, is called a random probability measure on $V$. Our main result, Theorem 1.1, is a general representation result for the distribution $P$ of such a random measure. To define measures on the space $M(V)$, we endow it with the weak* topology $T_{w^*}$ (which in the context of probability is often called the topology of weak convergence) and with the corresponding Borel σ-algebra $B_{w^*} := \sigma(T_{w^*})$. Since $V$ is Polish, the topological space $(M(V), T_{w^*})$ is Polish as well [17, Theorem 17.23].
Let $I = (A_1, \dots, A_n)$ be a measurable partition of $V$, i.e. a partition of $V$ into a finite number of measurable, disjoint sets. Denote the set of all such partitions by $H(B_V)$. Any probability measure $x \in M(V)$ can be evaluated on a partition $I$ to produce a vector $x_I := (x(A_1), \dots, x(A_n))$, and we write $\phi_I : x \mapsto x_I$ for the evaluation functional so defined. Clearly, $x_I$ represents a probability measure on the finite σ-algebra $\sigma(I)$ generated by the partition. Let $\triangle_I$ be the set of all measures $x_I = \phi_I(x)$ obtained in this manner, where $x$ runs through all measures in $M(V)$. This set, $\triangle_I = \phi_I M(V)$, is precisely the unit simplex in the $n$-dimensional Euclidean space $\mathbb{R}^I$,
$$\triangle_I = \Big\{ x_I \in \mathbb{R}^I \,:\, x_I(A_i) \ge 0 \text{ for all } i, \ \sum_{i=1}^{n} x_I(A_i) = 1 \Big\}. \tag{1.1}$$
Let $J = (B_1, \dots, B_m)$ and $I = (A_1, \dots, A_n)$ be partitions such that $I$ is a coarsening of $J$, that is, for each $A_i \in I$, there is a set $J_i \subset \{1, \dots, m\}$ of indices such that $A_i = \cup_{j \in J_i} B_j$. The sets $J_i$ form a partition of the index set $\{1, \dots, m\}$. If $I$ is a coarsening of $J$, we write $I \preceq J$. Let $I \preceq J$. By additivity, $x(A_i) = \sum_{j \in J_i} x(B_j)$ for every $x \in M(V)$. In other words, $\phi_I x$ is completely determined by $\phi_J x$, and invariant under any changes to $x$ which do not affect $\phi_J x$. Therefore, the implicit definition $f_{JI}(\phi_J(x)) := \phi_I(x)$ determines a well-defined mapping $f_{JI} : \triangle_J \to \triangle_I$. With notation for $J$ and $I$ as above, $f_{JI}$ can equivalently be defined as
$$(f_{JI} x_J)(A_i) := \sum_{j \in J_i} x_J(B_j), \qquad i = 1, \dots, n. \tag{1.2}$$
Figure 1 illustrates the mapping $f_{JI}$ and the simplices $\triangle_J$ and $\triangle_I$. The image $f_{JI} x_J \in \triangle_I$ constitutes a probability distribution on the events in $I$. The following intuition is often helpful: The space $M(V)$ is convex, with the Dirac measures on $V$ as its extreme points, and we can roughly think of $M(V)$ as the infinite-dimensional analogue of the simplices $\triangle_I$. Similarly, we can regard the evaluation maps $\phi_I : M(V) \to \triangle_I$ as analogues of the maps $f_{JI} : \triangle_J \to \triangle_I$.
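The relation $f_{JI}(\phi_J(x)) = \phi_I(x)$ can be checked on a toy example. The following Python sketch assumes a hypothetical finite domain $V = \{0, \dots, 5\}$ and a discrete measure `x`, neither of which appears in the text; the function names `evaluate` and `coarsen` are illustrative stand-ins for $\phi_I$ and $f_{JI}$.

```python
from fractions import Fraction

def evaluate(x, partition):
    """phi_I: evaluate a discrete probability measure x (point -> mass) on a partition."""
    return [sum(mass for point, mass in x.items() if point in block)
            for block in partition]

def coarsen(x_J, index_sets):
    """f_JI: sum the coordinates of x_J over each index set J_i, as in (1.2)."""
    return [sum(x_J[j] for j in J_i) for J_i in index_sets]

# Hypothetical toy domain V = {0, ..., 5} carrying a discrete probability measure x.
x = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(3, 10),
     3: Fraction(1, 10), 4: Fraction(2, 10), 5: Fraction(1, 10)}

J = [{0}, {1, 2}, {3}, {4, 5}]   # finer partition (B_1, ..., B_4)
I = [{0, 1, 2}, {3, 4, 5}]       # coarsening (A_1, A_2)
index_sets = [[0, 1], [2, 3]]    # A_1 = B_1 u B_2, A_2 = B_3 u B_4 (0-based indices)

x_J = evaluate(x, J)             # point in the simplex over J
x_I = evaluate(x, I)             # point in the simplex over I

# Coarsening the evaluation on J reproduces the evaluation on I.
print(coarsen(x_J, index_sets) == x_I)
```

Exact rational arithmetic is used here only so that the identity holds without floating-point tolerance.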
Even though both M(V ) and the spaces △ I are Polish, however, we have to keep in mind that the weak * topology on M(V ) is, in many regards, quite different from the topology which △ I inherits from Euclidean space. For further properties of the space M(V ), we refer to the excellent exposition given by Aliprantis and Border [1,Chapter 15]. Suppose that P is a probability measure on M(V ). Denote by φ I P the image measure of P under φ I , i.e. the measure on △ I defined by (φ I P )(A I ) := P (φ −1 I A I ) for all A I ∈ B(△ I ). We refer to φ I P as the marginal of P on △ I . Similarly, if P J is a measure on △ J , then for any I J, the image measure f JI P J is called the marginal of P J on △ I . The following theorem, our main result, states that a measure P on M(V ) can be constructed from a suitable family of marginals P I on the simplices △ I . The notation E Q [ . ] refers to expectation with respect to the law Q.
Theorem 1.1. Let $V$ be a Polish space, let $\langle P_I \rangle_{H(B_V)}$ be a family of probability measures on the simplices $\triangle_I$, and let $G_0 \in M(V)$. The following statements are equivalent:
(1) The family $\langle P_I \rangle_{H(B_V)}$ is projective,
$$f_{JI} P_J = P_I \qquad \text{whenever } I \preceq J, \tag{1.3}$$
and satisfies
$$E_{P_I}[X_I] = \phi_I G_0 \qquad \text{for all } I \in H(B_V). \tag{1.4}$$
(2) There exists a unique probability measure $P$ on $(M(V), B_{w^*})$ satisfying $\phi_I P = P_I$ for all $I \in H(B_V)$ and $E_P[X] = G_0$.
The two conditions of Theorem 1.1 serve two separate purposes: Condition (1.3) guarantees that the family $\langle P_I \rangle_{H(B_V)}$ defines a unique probability measure $P_{H(B_V)}$. The support of this measure is not actually $M(V)$, but a larger set: specifically, the set $C(Q)$ of finitely additive probability measures (charges) defined on a certain subsystem $Q \subset B_V$, which we will make precise in Sec. 3. The set $C(Q)$ contains the set $M(Q)$ of σ-additive probability measures on $Q$ as a measurable subset, and $M(Q)$ is in turn isomorphic to $M(V)$, by Carathéodory's extension theorem [16, Theorem 2.5]. To obtain the distribution of a random measure, we need to ensure that $P_{H(B_V)}$ concentrates on the subset $M(Q) \cong M(V)$, or in other words, that draws from $P_{H(B_V)}$ are σ-additive almost surely. Condition (1.4) is sufficient, and in fact necessary, for $P_{H(B_V)}$ to concentrate on $M(V)$, and therefore for a random variable $X_{H(B_V)}$ with distribution $P_{H(B_V)}$ to constitute a random measure. If (1.4) is satisfied, the measure constructed on $C(Q)$ can be restricted to a measure on $M(V)$, resulting in the measure $P$ described by Theorem 1.1. Sec. 3 provides more details.
The technical restriction that V be Polish is a mild one for all practical purposes, a fact best illustrated by some concrete examples of Polish spaces: The real line is Polish, and so are R n and C n ; any finite space; all separable Banach spaces (since Banach spaces are complete metric spaces), in particular L 2 and any other separable Hilbert space; the space M(V ) of probability measures over a Polish domain V , in the weak * topology [1,Chapter 15]; the spaces C([0, 1], R) and C(R + , R) of continuous functions, in the topology of compact convergence [2, §38]; and the Skorohod space D(R + , R) of càdlàg functions [24,Chapter VI]. Any countable product of Polish spaces is Polish, in particular R N , C N , and the Hilbert cube [0, 1] N . A subset of a given Polish space is Polish in the relative topology if and only if it is a G δ set [17,Theorem 3.11]. A borderline example are the spaces C(T, E) of continuous functions with Polish range E. This space is Polish if T = R ≥0 or if T is compact and Polish, but not e.g. for T = R [10, §454]. In Bayesian nonparametrics, this distinction may be relevant in the context of the "dependent Dirichlet process" model of MacEachern [21], which involves Dirichlet processes on spaces of continuous functions. For more background on Polish spaces, see [1,10,17].

Examples
Theorem 1.1 yields straightforward constructions for several models studied in the literature, and we consider three specific examples to illustrate the result. First, by choosing the finite-dimensional marginals P I in Theorem 1.1 as a suitable family of Dirichlet distributions, we obtain a construction of the Dirichlet process in the spirit of Ferguson [8].
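Corollary (Dirichlet process). Let $\alpha > 0$ and $G_0 \in M(V)$. For each partition $I \in H(B_V)$, define $P_I$ as the Dirichlet distribution on $\triangle_I \subset \mathbb{R}^I$, with concentration $\alpha$ and expectation $\phi_I G_0 \in \triangle_I$. Then there is a uniquely determined probability measure $P$ on $M(V)$ with expectation $G_0$ and the distributions $P_I$ as its marginals, that is, $\phi_I P = P_I$ for all $I \in H(B_V)$.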
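The two conditions of Theorem 1.1 can be probed numerically for Dirichlet marginals. The sketch below is a Monte Carlo illustration rather than a proof: it assumes a hypothetical four-set partition $J$ with $\phi_J G_0 = (0.1, 0.5, 0.1, 0.3)$ and concentration $\alpha = 2$, draws from $P_J$, coarsens the draws to a two-set partition $I$, and compares the first coordinate with the mean and variance it should have under $P_I$, a Dirichlet distribution with concentration $\alpha$ and expectation $\phi_I G_0$. The Dirichlet sampler uses the standard normalized-gamma representation.

```python
import random

def sample_dirichlet(params, rng):
    """Draw a Dirichlet vector by normalizing independent gamma variables."""
    gammas = [rng.gammavariate(a, 1.0) for a in params]
    total = sum(gammas)
    return [g / total for g in gammas]

def coarsen(x_J, index_sets):
    """f_JI: sum the coordinates of x_J over each index set J_i."""
    return [sum(x_J[j] for j in J_i) for J_i in index_sets]

rng = random.Random(0)            # fixed seed, for reproducibility only
alpha = 2.0                       # hypothetical concentration
g0_J = [0.1, 0.5, 0.1, 0.3]       # hypothetical phi_J G_0 on a 4-set partition J
index_sets = [[0, 1], [2, 3]]     # I merges B_1 with B_2 and B_3 with B_4
g0_I = coarsen(g0_J, index_sets)  # phi_I G_0

n = 20000
draws = [coarsen(sample_dirichlet([alpha * g for g in g0_J], rng), index_sets)[0]
         for _ in range(n)]
mean = sum(draws) / n
var = sum((d - mean) ** 2 for d in draws) / n

# Under a Dirichlet P_I with concentration alpha and expectation g0_I, the first
# coordinate is beta distributed with the mean and variance below.
target_mean = g0_I[0]
target_var = g0_I[0] * (1.0 - g0_I[0]) / (alpha + 1.0)
print(mean, target_mean, var, target_var)
```

Agreement of the first two moments is of course weaker than equality of distributions; the exact statement is the aggregation property of the Dirichlet distribution.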
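A similar construction yields the normalized inverse Gaussian process of Lijoi et al. [20]. The inverse Gaussian distribution on $\mathbb{R}_{\ge 0}$ is given by the density
$$p_{IG}(z \mid \alpha, \gamma) = \frac{\alpha}{\sqrt{2\pi}} \, z^{-3/2} \exp\Big( -\frac{1}{2} \Big( \frac{\alpha^2}{z} + \gamma^2 z \Big) + \gamma \alpha \Big)$$
with respect to Lebesgue measure. Lijoi et al. [20] define a normalized inverse Gaussian distribution $NIG(\alpha_1, \dots, \alpha_n)$ on the simplex $\triangle_n \subset \mathbb{R}^n$ as the distribution of the vector $w = \big( \frac{z_1}{\sum_i z_i}, \dots, \frac{z_n}{\sum_i z_i} \big)$, where the $z_i$ are independent and $z_i$ is distributed according to $p_{IG}(z_i \mid \alpha_i, \gamma = 1)$. The density of $w$ can be derived explicitly [20, Equation (4)]. Applicability of Theorem 1.1 is a direct consequence of the results of Lijoi et al. [20]: For a family $P_I$ of normalized inverse Gaussian marginals with expectations $\phi_I G_0$, there is a uniquely determined probability measure $P$ on $M(V)$ with expectation $G_0$ and $\phi_I P = P_I$ for all $I \in H(B_V)$.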
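A draw from a normalized inverse Gaussian distribution can be simulated directly from its definition. The following sketch is an illustration under stated assumptions: the parameter values are hypothetical, and the inverse Gaussian sampler is the transformation method of Michael, Schucany and Haas, which is swapped in here and not taken from [20]. It uses the fact that $p_{IG}(z \mid \alpha, \gamma = 1)$ is the inverse Gaussian density with mean $\alpha$ and shape parameter $\alpha^2$.

```python
import math
import random

def sample_inverse_gaussian(alpha, rng):
    """Draw z with density p_IG(z | alpha, gamma=1): mean alpha, shape alpha**2."""
    mu, lam = alpha, alpha * alpha
    y = rng.gauss(0.0, 1.0) ** 2
    # Smaller root of the quadratic associated with the chi-square transformation.
    x = (mu + mu * mu * y / (2.0 * lam)
         - (mu / (2.0 * lam)) * math.sqrt(4.0 * mu * lam * y + (mu * y) ** 2))
    # Choose between the two roots x and mu**2 / x with the appropriate probability.
    if rng.random() <= mu / (mu + x):
        return x
    return mu * mu / x

def sample_nig_weights(alphas, rng):
    """Normalize independent inverse Gaussian variables to a point in the simplex."""
    z = [sample_inverse_gaussian(a, rng) for a in alphas]
    total = sum(z)
    return [v / total for v in z]

rng = random.Random(1)        # fixed seed, for reproducibility only
alphas = [0.5, 1.0, 1.5]      # hypothetical parameters alpha_1, ..., alpha_n
w = sample_nig_weights(alphas, rng)

# Sanity check of the sampler: the mean of p_IG(. | alpha, 1) is alpha.
n = 20000
mean_z = sum(sample_inverse_gaussian(2.0, rng) for _ in range(n)) / n
print(w, mean_z)
```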
Although both the Dirichlet process and the normalized inverse Gaussian process are discrete almost surely, Theorem 1.1 is applicable to the construction of continuous random measures. The Pólya tree random measures introduced by Ferguson [9] provide a convenient example. They can be obtained as projective limits as follows: Choose $V = \mathbb{R}$ and let $G_0 \in M(\mathbb{R})$ be a probability measure with cumulative distribution function $g_0$. For each $n$, let $I_n$ be the partition of $\mathbb{R}$ into the $2^n$ intervals $A_{n,k}$, $k = 1, \dots, 2^n$, whose boundaries are the quantiles $g_0^{-1}(k/2^n)$ of $G_0$. All sets in $I_n$ have identical probability $1/2^n$ under $G_0$. Since each partition $I_n$ is obtained from $I_{n-1}$ by splitting each set in $I_{n-1}$ at a single point, the sequence $(I_n)$ satisfies $I_1 \preceq I_2 \preceq \dots$. It can be represented as a binary tree whose $n$th level corresponds to $I_n$, each node representing one constituent set. There are two natural ways of indexing sets in the partitions: One is to write $A_{n,k}$ for the $k$th set in $I_n$, i.e. $n$ indexes tree levels and $k$ enumerates sets within each level. The other is to index sets as $A_{m_1, \dots, m_n}$ by a binary sequence encoding the unique path from the root node $\mathbb{R}$ to the set in question, where $m_i = 1$ indicates passing to a right child node. Let $[m]_2$ denote the binary representation of an arbitrary positive integer $m$; it relates the two index conventions, and it is useful to use both interchangeably. With each node $A_{m_1 \cdots m_n}$, we associate a pair $(Y_{m_1 \cdots m_n 0}, Y_{m_1 \cdots m_n 1}) \sim \mathrm{Beta}(\alpha_{m_1 \cdots m_n 0}, \alpha_{m_1 \cdots m_n 1})$ of beta random variables. To apply Theorem 1.1, define probability measures $P_{I_n}$ on the simplices $\triangle_{I_n}$ as follows: Suppose a particle slides down the tree, moving along each edge with the associated probability $Y_{m_1 \cdots m_n}$. The probability of reaching the set $A_{n,k}$ is a random variable $X_{n,k}$, defined recursively in terms of the beta variables as $X_{m_1 \cdots m_n m_{n+1}} := X_{m_1 \cdots m_n} Y_{m_1 \cdots m_n m_{n+1}}$. Choose $P_{I_n}$ as the distribution of $X_{I_n} = (X_{n,1}, \dots, X_{n,2^n})$.
Applicability of Theorem 1.1 follows from two results of Ferguson [9]: (a) The partitions $I_n$ generate the Borel sets $B(\mathbb{R})$ and (b) each random measure $X_{I_n} \in \triangle_{I_n}$ has expectation $E[X_{I_n}] = (G_0(A_{n,1}), \dots, G_0(A_{n,2^n}))$. Property (a) implies that the sequence $P_{I_n}$ induces a complete family $P_I$ of probability measures on all simplices $\triangle_I$, $I \in H(B(\mathbb{R}))$. By construction, $P_I$ satisfies (1.3). According to (b), (1.4) holds. Theorem 1.1 and the well-known continuity properties of Pólya trees [19, Theorem 3] yield:
Corollary 1.5 (Pólya tree). Let $P_I$ be a family of measures defined as above. There is a unique probability measure $P$ on $M(\mathbb{R})$ satisfying $\phi_I P = P_I$. The distribution $P$ is a Pólya tree in the sense of Ferguson [9], with parameters $G_0$ and $(\alpha_{[n]_2})_{n \in \mathbb{N}}$. The random probability measure $X$ on $\mathbb{R}$ with distribution $P$ has expected measure $E_P[X] = G_0$. If $\alpha_{n,k} = c n^2$ for some $c > 0$, then $X$ is absolutely continuous with respect to Lebesgue measure on $\mathbb{R}$ almost surely.
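The recursive definition of the marginals $P_{I_n}$ translates into a short simulation. The sketch below is an illustration under two assumptions that should be read as such: the two beta variables attached to a node are taken to be complementary, $Y_{m_1 \cdots m_n 1} = 1 - Y_{m_1 \cdots m_n 0}$, as is standard for Pólya trees, and the parameters are chosen as $\alpha_{n,k} = c n^2$, so that they depend only on the tree level. The function name and the values of `depth` and `c` are hypothetical.

```python
import random

def sample_polya_tree_level(depth, c, rng):
    """Draw X_{I_n} = (X_{n,1}, ..., X_{n,2^n}) for n = depth, with alpha_{n,k} = c * n**2."""
    probs = [1.0]                      # level 0: the root carries all mass
    for n in range(1, depth + 1):
        a = c * n * n                  # common beta parameter on level n
        next_probs = []
        for p in probs:
            y_left = rng.betavariate(a, a)
            # Particle moves left with probability y_left, right with 1 - y_left.
            next_probs.extend([p * y_left, p * (1.0 - y_left)])
        probs = next_probs
    return probs

rng = random.Random(2)                 # fixed seed, for reproducibility only
depth, c, n = 3, 1.0, 5000             # hypothetical settings
samples = [sample_polya_tree_level(depth, c, rng) for _ in range(n)]

# Property (b): each coordinate has expectation G_0(A_{n,k}) = 1 / 2**n.
means = [sum(s[k] for s in samples) / n for k in range(2 ** depth)]
print(means)
```

Coarsening a level-$n$ draw by summing sibling pairs returns the corresponding level-$(n-1)$ vector, which is the projectivity condition (1.3) along the sequence $(I_n)$.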

Related work
Theorem 1.1 was effectively conjectured by Ferguson [8]. Although he only considered the special case of the Dirichlet process, and despite the technical difficulties already mentioned, he recognized both the usefulness of indexing spaces by measurable partitions (a key ingredient of the construction in Sec. 3), and the connection between σ-additivity of random draws from the Dirichlet process and σ-additivity of its parameter measure [cf. 8, Proposition 2]. Authors who have recognized problems to the effect that such a construction is not feasible on an arbitrary measurable space V include Ghosh and Ramamoorthi [13] and Ghosal [12]; both references also provide excellent surveys of the different construction approaches available for the Dirichlet process. Ghosal [12] additionally points out, in the context of problem (ii), that a countable generator may be substituted for B V , provided the underlying space is separable and metrizable.
To resolve the σ-additivity problem (iii), we appeal to a result of Harris [14], which reduces the conditions for σ-additivity of random set functions to their behavior on a countable number of sequences. This result is well-known in the theory of point processes and random measures [7,15]. Although Sethuraman was aware of Harris' work and referenced it in his well-known article [25], it has to our knowledge never been followed up on in the nonparametric Bayesian literature.
For the specific problem of defining the Dirichlet process, it is possible to forego the projective limit construction altogether and invoke approaches specifically tailored to the properties of the Dirichlet [12,13,27]. On the real line, both the Dirichlet process and the closely related Poisson-Dirichlet distribution of Kingman [18] arise in a variety of contexts throughout mathematics, each of which can be regarded as a possible means of definition [e.g. 23,26]. On arbitrary Polish spaces, the Dirichlet process can be derived implicitly as de Finetti mixing measure of an urn scheme [3], or as special case of a Pólya tree [9].
Sethuraman's stick-breaking scheme [25] is remarkable not only for its simplicity. In contrast to all other constructions listed above, it does not require V to be Polish, but is applicable on an arbitrary measurable space with measurable singletons. The stick-breaking and projective limit representations of the Dirichlet process trade off two different types of generality: Stick-breaking imposes less restrictions on the choice of V , but is not applicable to represent other types of distributions on M(V ). The projective limit approach requires more structure on V , but can represent any probability measure on M(V ). The trade-off is reminiscent of similar phenomena encountered throughout stochastic process theory. For example, probability measures on infinite-dimensional product spaces can be constructed by means of Kolmogorov's extension theorem. If the measure to be constructed is factorial over the product, the component spaces of the product may be chosen as arbitrary measurable spaces [2, Theorem 9.2]. To model stochastic dependence across different subspaces, however, a minimum of topological structure is indispensable, and Kolmogorov's theorem hence requires the component spaces to be Polish [16,Theorem 6.16]. The Dirichlet process, as a purely atomic random measure whose different atoms are stochastically dependent only through the global normalization constraint, can be regarded as the closest analogue of a factorial measure on the space M(V ). In analogy to a factorial measure, it can be constructed on very general spaces, whereas the projective limit approach, which can represent arbitrary correlation structure, requires stronger topological properties.

Background: Projective limits
A projective limit is constructed from a family of mathematical structures, indexed by the elements of an index set $D$ [5,22]. For our purposes, the structures in question will be topological measurable spaces $(X_I, B_I)$, with $I \in D$. The projective limit defined by this family is again a measurable space, denoted $(X_D, B_D)$. This projective limit space is the smallest space containing all spaces $(X_I, B_I)$ as its substructures, in a sense to be made precise shortly. To obtain a meaningful notion of a limit, the index set $D$ need not be totally ordered, but it must be possible to form infinite sequences of suitably chosen elements. The set is therefore required to be directed: There is a partial order relation $\preceq$ on $D$ and, whenever $I, J \in D$, there exists $K \in D$ such that $I \preceq K$ and $J \preceq K$. A simple example of a directed set is the set $D := F(L)$ of all finite subsets of an infinite set $L$, where $D$ is partially ordered by inclusion.
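Directedness is easy to verify constructively. For the finite subsets of $L$, an upper bound of $I$ and $J$ is the union $I \cup J$. The Python sketch below illustrates the analogous operation for finite partitions ordered by coarsening, the index set used in Sec. 1.1: the common refinement of two partitions is an upper bound of both. The toy domain and the function names are hypothetical.

```python
def common_refinement(I, J):
    """Upper bound K of I and J: all non-empty intersections of their blocks."""
    return [a & b for a in I for b in J if a & b]

def is_coarsening(I, K):
    """True if I is a coarsening of K, i.e. every block of K lies inside a block of I."""
    return all(any(b <= a for a in I) for b in K)

# Two partitions of the hypothetical toy domain {0, ..., 5}; neither coarsens the other.
I = [frozenset({0, 1, 2}), frozenset({3, 4, 5})]
J = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})]

K = common_refinement(I, J)
print(sorted(sorted(block) for block in K))
print(is_coarsening(I, K), is_coarsening(J, K), is_coarsening(I, J))
```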
The component spaces $X_I$ used to define the projective limit need to "fit in" with each other in a suitable manner. This idea is formalized by defining a family of mappings $f_{JI}$ between the spaces which are regular with respect to the structure posited on the point sets $X_I$. For measurable spaces, the adequate notion of regularity is measurability. Since we assume each σ-algebra $B_I$ to be generated by an underlying topology $T_I$, we slightly strengthen this requirement to continuity.
Definition 2.1 (Projective limit set). Let $D$ be a directed set and $(X_I, T_I)$, with $I \in D$, a family of topological spaces. For any pair $I \preceq J \in D$, let $f_{JI} : X_J \to X_I$ be a function such that (1) $f_{JI}$ is $T_J$-$T_I$-continuous, (2) $f_{II} = \mathrm{Id}_{X_I}$, and (3) $f_{KI} = f_{JI} \circ f_{KJ}$ whenever $I \preceq J \preceq K$. The functions $f_{JI}$ are called generalized projections. The family $\{X_I, T_I, f_{JI} \mid I \preceq J \in D\}$, which we denote $\langle X_I, T_I, f_{JI} \rangle_D$, is called a projective system of topological spaces. Define a set $X_D$ as follows: For each collection $\{x_I \in X_I \mid I \in D\}$ of points satisfying
$$x_I = f_{JI} x_J \qquad \text{whenever } I \preceq J, \tag{2.1}$$
identify the set $\{x_I \in X_I \mid I \in D\}$ with a point $x_D$, and let $X_D$ be the collection of all such points. The set $X_D$ is called the projective limit set of $\langle X_I, f_{JI} \rangle_D$.
Denote the Borel σ-algebras on the topological spaces $X_I$ by $B_I := \sigma(T_I)$. For each $I \in D$, the map defined as $f_I : x_D \mapsto x_I$ is a well-defined function $f_I : X_D \to X_I$. These functions are called canonical mappings. They define a topology $T_D$ and a σ-algebra $B_D$ on the projective limit space $X_D$, as the smallest topology (resp. σ-algebra) which makes all canonical mappings $f_I$ continuous (resp. measurable). In particular, $B_D = \sigma\big( \bigcup_{I \in D} f_I^{-1} B_I \big)$. In analogy to the set $X_D$, the topological space $(X_D, T_D)$ is called the projective limit of $\langle X_I, T_I, f_{JI} \rangle_D$, and the measurable space $(X_D, B_D)$ the projective limit of $\langle X_I, B_I, f_{JI} \rangle_D$. A measure $P_D$ on the projective limit $(X_D, B_D)$ can be constructed by defining a measure $P_I$ on each space $(X_I, B_I)$. By simultaneously applying the projective limit to the projective system $\langle X_I, B_I, f_{JI} \rangle_D$ and to the measures $P_I$, the family $\langle P_I \rangle_D$ is assembled into the measure $P_D$. The only requirement is that the measures $P_I$ satisfy a condition analogous to the one imposed on points by (2.1). More precisely, $P_I$ has to coincide with the image measure of $P_J$ under $f_{JI}$,
$$P_I = f_{JI} P_J \qquad \text{whenever } I \preceq J.$$
We refer to the measures in the family $\langle P_I \rangle_D$ as the marginals of the stochastic process $P_D$. Since the marginals completely determine $P_D$, some authors refer to $\langle P_I \rangle_D$ as the weak distribution of the process, or as a promeasure [6].
Theorem 2.2 (Bochner). Let $D$ be a countable directed set and $\langle X_I, B_I, f_{JI} \rangle_D$ a projective system of Polish measurable spaces with projective limit $(X_D, B_D)$. Let $\langle P_I \rangle_D$ be a family of probability measures on the spaces $(X_I, B_I)$ satisfying $P_I = f_{JI} P_J$ whenever $I \preceq J$. Then there exists a unique probability measure $P_D$ on $(X_D, B_D)$ with $f_I P_D = P_I$ for all $I \in D$.
Theorem 2.2 was introduced by Bochner [4, Theorem 5.1.1], for a possibly uncountable index set $D$. The uncountable case requires an additional condition known as sequential maximality, which ensures the projective limit space is non-empty. For our purposes, however, countability of the index set is essential: Measurability problems (problem (ii) in Sec. 1) arise whenever $D$ is uncountable, and are not resolved by sequential maximality.
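The consistency requirement on the marginals can be made concrete with finitely supported measures, for which image measures are computed by summing masses. The following sketch is a toy illustration only: the three-point measure `P_K` and the specific coarsening maps are hypothetical. It defines $P_J$ and $P_I$ as image measures of $P_K$ and confirms that they satisfy $P_I = f_{JI} P_J$, as they must since $f_{KI} = f_{JI} \circ f_{KJ}$.

```python
from collections import defaultdict
from fractions import Fraction as F

def pushforward(measure, f):
    """Image measure of a finitely supported measure (point -> mass) under f."""
    image = defaultdict(F)
    for point, mass in measure.items():
        image[f(point)] += mass
    return dict(image)

def f_KJ(x):
    """Coarsen a 4-set partition K to J by merging the last two sets."""
    return (x[0], x[1], x[2] + x[3])

def f_JI(x):
    """Coarsen the 3-set partition J to I by merging the first two sets."""
    return (x[0] + x[1], x[2])

def f_KI(x):
    """Composition f_JI after f_KJ, as required of generalized projections."""
    return f_JI(f_KJ(x))

# Hypothetical measure on the simplex over K, supported on three points.
P_K = {
    (F(1, 4), F(1, 4), F(1, 4), F(1, 4)): F(1, 2),
    (F(1, 2), F(0), F(1, 4), F(1, 4)): F(1, 4),
    (F(0), F(1, 4), F(1, 2), F(1, 4)): F(1, 4),
}
P_J = pushforward(P_K, f_KJ)
P_I = pushforward(P_K, f_KI)

print(pushforward(P_J, f_JI) == P_I)
```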
The most common example of a projective limit theorem in probability theory is Kolmogorov's extension theorem [16, Theorem 6.16], which can be regarded as the special case of Bochner's theorem obtained for product spaces: Let $D$ be the set of all finite subsets of an infinite set $L$, partially ordered by inclusion. Choose any Polish measurable space $(X_0, B_0)$, and set $X_I := \prod_{i \in I} X_0$. The resulting projective limit space is the infinite product $X_D = \prod_{i \in L} X_0$, and $B_D$ coincides with the Borel σ-algebra generated by the product topology. For product spaces, the sequential maximality condition mentioned above holds automatically, so $L$ may be either countable or uncountable. Once again, though, the measurability problem (ii) arises unless $L$ is countable. The product space form of the theorem is typically used in the construction of Gaussian process distributions on random functions [2]. For random measures, a more adequate projective system is constructed in the following section.

Projective limits of probability simplices
This section constitutes the first part of the proof of Theorem 1.1: The construction of a projective limit space X D from simplices △ I , and the analysis of its properties. The space X D turns out to consist of set functions which are not necessarily σ-additive, and the remaining part of the proof in Sec. 4 will be the derivation of a criterion for σ-additivity.
The distinction between finitely additive and σ-additive set functions will be crucial to the ensuing discussion. We consider two types of set systems Q on the space V : Algebras, which contain both ∅ and V , and are closed under complements and finite unions, and σ-algebras, which are algebras and additionally closed under countable unions. A non-negative set function µ on either an algebra or σ-algebra Q is called a charge if it satisfies µ(∅) = 0 and is finitely additive. If a charge is normalized, i.e. if µ(V ) = 1, it is called a probability charge. A charge is a measure if and only if it is σ-additive. If Q is an algebra, and not closed under countable unions, the definition of σ-additivity only requires µ to be additive along those countable sequences of sets A n ∈ Q whose union is in Q.

Definition of the projective system
For the choice of components in a projective system, it can be helpful to regard the elements $x_D$ of the projective limit space $X_D$ as mappings, from a domain defined by the index set $D$ to a range defined by the spaces $X_I$. The simplest example is once again the product space $X_D = X_0^L$ in Kolmogorov's theorem, for which each $x_D \in X_D$ can be interpreted as a function $x_D : L \to X_0$. Probability measures on $(V, B_V)$ are in particular set functions $B_V \to [0, 1]$, so it is natural to construct $D$ from the sets in $B_V$. It is not necessary to include all measurable sets: If $Q$ is an algebra that generates $B_V$, any probability measure on $Q$ has, by Carathéodory's theorem [16, Theorem 2.5], a unique extension to a probability measure on $B_V$. In other words, the space $M(Q)$ of probability measures on $Q$ is isomorphic to $M(V)$, and $Q$ can be substituted for $B_V$ in the projective limit construction.
Desiderata for the projective limit are: (1) The projective limit space X D should contain all measures on Q (and hence on B V ). (2) Q should be countable, to address the measurability problem (ii) in Sec. 1. (3) The marginal spaces X I should consist of the finite-dimensional analogues of measures on Q, and hence of measures on finite subsets of events in Q. (4) The definition of the system should facilitate a proof of σ-additivity. In this section, we will recapitulate the projective limit specified in Sec. 1.1 and show it indeed satisfies (1)-(3); that (4) is satisfied as well will be shown in Sec. 4.

Structure of the projective limit space
Let $(X_D, B_D)$ be the projective limit of $\langle \triangle_I, B_I, f_{JI} \rangle_D$. We observe immediately that $X_D$ contains $M(Q)$: If $x$ is a probability measure on $Q(U)$, let $x_I := f_I x$ for each partition $I \in D$. The collection $\{x_I \mid I \in D\}$ satisfies (2.1), and hence constitutes a point in $X_D$. The following result provides more details about the constructed measurable space $(X_D, B_D)$, which turns out to be the space $C(Q)$ of all probability charges defined on $Q(U)$. By $B_{w^*}$, we again denote the Borel σ-algebra on $M(V)$ generated by the weak* topology.
Proposition. Let $\psi : M(V) \to M(Q)$ denote the restriction map $x \mapsto x|_{Q(U)}$. (i) $X_D = C(Q)$. (ii) $M(Q)$ is a measurable subset of $C(Q)$, that is, $M(Q) \in B_D$. (iii) $\psi$ is a Borel isomorphism of $(M(V), B_{w^*})$ and $(M(Q), B_D \cap M(Q))$.
Part (ii) implies that a projective limit measure $P_D$ constructed on $C(Q)$ by means of Theorem 2.2 can be restricted to a measure on $M(Q)$ without further complications, in particular without appealing to outer measures. According to (iii), there is a measure $P$ on $M(V)$ which can be regarded as equivalent to $P_D$, namely the image measure $P := \psi^{-1} P_D$ under the inverse of the restriction map $\psi$. This is of course the measure $P$ described in Theorem 1.1, though some details still remain to be established later on. Since $\psi$ is a Borel isomorphism, $P$ constitutes a measure with respect to the "natural" topology on $M(V)$.
Proof. Part (i). Let $x_D \in X_D$. The trivial partition $I_0 := (V)$ is in $D$, which implies $x_D(V) = f_{I_0} x_D = 1$ and $x_D(\emptyset) = 0$. To show finite additivity, let $A_1, A_2 \in Q(U)$ be disjoint sets and choose a partition $J \in D$ such that $A_1, A_2 \in J$. Let $I \preceq J$ be the coarsening of $J$ obtained by joining the two sets. As the elements of each space $\triangle_I$ are finitely additive,
$$x_D(A_1 \cup A_2) = (f_I x_D)(A_1 \cup A_2) = (f_{JI} f_J x_D)(A_1 \cup A_2) = (f_J x_D)(A_1) + (f_J x_D)(A_2) = x_D(A_1) + x_D(A_2).$$
Hence, $x_D$ is a charge. Conversely, assume that $x_D$ is a probability charge on $Q(U)$. The evaluation $f_I x_D$ of $x_D$ on a partition $I \in D$ defines a probability measure on the finite σ-algebra $\sigma(I)$, and thus $f_I x_D \in \triangle_I$. Since additionally $f_{JI}(f_J x_D) = f_I x_D$, the family $\langle f_I x_D \rangle_D$ forms a collection of points $f_I x_D \in \triangle_I$ satisfying (2.1), and hence $x_D \in X_D$.
Part (ii). Regard the restriction map $\psi$ as a mapping into $C(Q)$, with image $M(Q)$. By Carathéodory's extension theorem, $\psi$ is injective [16, Theorem 2.5]. If an injective mapping between Polish spaces is measurable, its inverse is measurable as well [16, Theorem A1.3]. Thus, if we can show $\psi$ to be measurable, its image $M(Q) = \psi(M(V))$ is a measurable subset of $C(Q)$. First observe that $\psi$ relates the evaluation functionals $f_I : C(Q) \to \triangle_I$ on probability charges to the evaluation functionals $\phi_I : M(V) \to \triangle_I$ on probability measures via the equations
$$\phi_I = f_I \circ \psi \qquad \text{for all } I \in D. \tag{3.5}$$
We will show that the mappings $\phi_I$ generate the σ-algebra $B_{w^*}$ on $M(V)$. Since the canonical mappings $f_I$ generate $B_D$ on $C(Q)$ by definition, (3.5) then implies $B_{w^*}$-$B_D$-measurability of $\psi$: Since $M(V)$ is separable, the Borel sets of the weak* topology coincide with those generated by the maps $\phi_A : x \mapsto x(A)$ [11, Theorem 2.3], thus
$$B_{w^*} = \sigma(\phi_A \mid A \in B_V).$$
Hence equivalently, $B_{w^*} = \sigma(\phi_{(A, A^c)} \mid A \in B_V)$, and with (3.5), each map $f_{(A, A^c)} \circ \psi = \phi_{(A, A^c)}$ with $A \in Q$ is $B_{w^*}$-measurable. Clearly, the maps $f_{(A, A^c)}$ for $A \in Q$ are sufficient to express all information expressible by the larger family of maps $f_I$, $I \in D$, and thus generate the projective limit σ-algebra,
$$B_D = \sigma(f_{(A, A^c)} \mid A \in Q).$$
In summary, $\psi$ is $B_{w^*}$-$B_D$-measurable, and we deduce $M(Q) = \psi(M(V)) \in B_D$. Part (iii).
As shown above, ψ is injective and measurable, and regarded as a mapping onto its image M(Q), it is trivially surjective. What remains to be shown is measurability of the inverse. By part (ii), the image ψ(M(V )) = M(Q) is a Borel subset of C(Q). As a countable projective limit of Polish spaces,

σ-additivity of random charges
The previous section provides the means to construct the distribution P D of a random charge X D : Ω → C(Q) as a projective limit measure. To obtain random measures rather than random charges in this manner, we need to additionally ensure that P D concentrates on the measurable subspace M(V ), or in other words, that X D is σ-additive P-almost surely. Consider a projective limit random charge X D , distributed according to a projective limit measure P D on C(Q). The following proposition gives a necessary and sufficient condition for almost sure σ-additivity of X D , formulated in terms of its expectation E PD [X D ]. It also shows that the expected values of P D and the projective family P I D are themselves projective, in the sense that f I E PD [X D ] = E PI [X I ], and accordingly f JI E PJ [X J ] = E PI [X I ] for any pair I J. The latter makes the criterion directly applicable to construction problems: If we initiate the construction by choosing an expected measure G 0 ∈ M(V ) for the prospective measure P D , and then choose the projective family such that E PI [X I ] = f I G 0 , random draws from P D will take values in M(V ) almost surely.
[...] $f_I E_{P_D}[X_D] = E_{P_I}[X_I]$ for any $I \in D$. (4.1)

The proof requires a criterion for σ-additivity of probability charges expressible in terms of a countable number of conditions. Assuming that $G_0$ is σ-additive, we will deduce from the projective limit construction that, if a fixed sequence of sets is given, the random content $X_D$ is countably additive along this sequence with probability one. This only implies almost sure σ-additivity of $X_D$ on $Q(U)$ if the condition for σ-additivity can be reduced to a countable subset of sequences in $Q(U)$ (cf. Appendix A.3). Such a reduction was derived by Harris [14, Lemma 6.1]. For our particular choice of $Q(U)$, his result can be stated as follows: [...]

Since clearly also $E_{P_D}[X_D](\emptyset) = 0$ and $E_{P_D}[X_D](V) = 1$, the expectation is an element of $X_D$. To verify (4.1), note that the mappings $f_{JI} : \triangle_J \to \triangle_I$ are affine and therefore commute with expectations, and hence

$$ f_{JI} E_{P_J}[X_J] = E_{P_J}[f_{JI} X_J] = E_{P_I}[X_I]. $$

Part (ii). First assume that $G_0$ is σ-additive. Let $(A^m_n)_n$ be any of the set sequences given by Lemma 4.2. As $n \to \infty$, the random sequence $(X_D(A^m_n))_n$ converges to 0 almost surely: σ-additivity of $G_0$ implies $E_{P_D}[X_D(A^m_n)] = G_0(A^m_n) \to 0$, that is, $X_D(A^m_n) \to 0$ in the mean. The sequence $(A^m_n)_n$ is decreasing and the random variable $X_D$ is charge-valued, which implies $X_D(A^m_{n+1}) \le X_D(A^m_n)$ a.s. In particular, the sequence $(X_D(A^m_n))_n$ forms a supermartingale when endowed with its canonical filtration. For supermartingales, convergence in the mean implies almost sure convergence [2, Theorem 19.3], and thus indeed $X_D(A^m_n) \to 0$ almost surely. Consequently, there is a $P$-null subset $N_m$ of the abstract probability space $\Omega$ such that

$$ (X_D(\omega))(A^m_n) \to 0 \quad \text{as } n \to \infty \qquad \text{for all } \omega \notin N_m. $$

The union $N := \bigcup_{m \in \mathbb{N}} N_m$ of these null sets, taken over all sequences $(A^m_n)_n$ required by Lemma 4.2, is again a $P$-null set. The charge $X_D(\omega)$ satisfies (4.2) for all $m$ whenever $\omega \notin N$. Therefore, $X_D$ is σ-additive $P$-a.s. by Lemma 4.2, and hence almost surely a probability measure.
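Conversely, let $X_D$ assume values in $M(V) \cong M(Q)$ almost surely. Since $A^m_n \searrow \emptyset$, the sequence of measurable functions $\omega \mapsto (X_D(\omega))(A^m_n)$ converges to 0 almost everywhere. By hypothesis, $C(Q) \setminus M(Q)$ is a null set, hence

$$ \lim_{n} E_{P_D}[X_D](A^m_n) = \lim_{n} \int_{M(Q)} x_D(A^m_n) \, P_D(dx_D) = \int_{M(Q)} \lim_{n} x_D(A^m_n) \, P_D(dx_D) = 0, $$

where the second identity holds by dominated convergence [16, Theorem 1.21]. Since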

A.1. Product spaces
The Kolmogorov extension theorem used in the construction is not well-adapted to the problem of constructing measures on measures, because the setting assumed by the theorem is that of a product space: A finite-dimensional marginal of a measure $P$ on $M(V)$ is a measure $P_I$ on the set of measures over a finite σ-algebra $C$ of events. Any such σ-algebra can be generated by a partition $I$ of events in $B_V$. The set consisting of the marginals on $I$ of all measures $x \in M(V)$ is necessarily isomorphic to the unit simplex in $|I|$-dimensional Euclidean space. Hence, the marginals of a measure $P$ defined on $M(V)$ always live on simplices of the form $\triangle_I$ as described in Sec. 1.1. In other words, when we set up a projective limit construction for measures on $M(V)$, the choice of possible finite-dimensional marginal spaces is limited: either the simplices are used directly, as in Sec. 1.1, or they are embedded into some other finite-dimensional space. If the projective limit result to be applied is the Kolmogorov extension theorem, the simplices must be embedded into Euclidean product spaces, as proposed in [8]. The problem here is that it is difficult to properly formalize marginalization to subspaces, as required by the theorem. For constructions on $[0, 1]^{B_V}$, the problem can be illustrated by the example in Fig. 1: For $J = (B_1, B_2, B_3)$, the simplex $\triangle_J$ is a subspace of $\mathbb{R}^J \cong \mathbb{R}^3$. Marginalization corresponds to merging two events, such as $B_1$ and $B_2$ in the example. The resulting simplex $\triangle_I$ for $I = (B_1 \cup B_2, B_3)$ is a subspace of $\mathbb{R}^I$. However, $\mathbb{R}^I$ is not a subspace of $\mathbb{R}^J$, nor is $\triangle_I$ a subspace of $\triangle_J$. Hence, in the product space setting of the Kolmogorov theorem, the natural way to formalize a reduction in dimension for measures on a finite number of events does not correspond to a projection onto a subspace.
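The distinction between merging events and projecting onto a coordinate subspace can be made concrete with a small sketch. The point `x_J` and the function names `merge` and `drop` are illustrative assumptions, not notation from the text. Merging two coordinates maps the simplex into a simplex, whereas deleting a coordinate, which is what a product-space projection does, generally leaves the simplex.

```python
def merge(x, i, j):
    """Coarsening: replace coordinates i and j of x by their sum."""
    rest = [v for k, v in enumerate(x) if k not in (i, j)]
    return [x[i] + x[j]] + rest

def drop(x, i):
    """Product-space projection: delete coordinate i of x."""
    return [v for k, v in enumerate(x) if k != i]

def mix(x, y, t):
    """Convex combination t*x + (1-t)*y, coordinate by coordinate."""
    return [t * a + (1.0 - t) * b for a, b in zip(x, y)]

# Hypothetical point of the simplex over J = (B1, B2, B3).
x_J = [0.2, 0.3, 0.5]
y_J = [0.6, 0.1, 0.3]

merged = merge(x_J, 0, 1)    # masses of (B1 ∪ B2, B3); still sums to one
dropped = drop(x_J, 0)       # masses of (B2, B3) only; no longer sums to one

# Merging is affine: it commutes with convex combinations.
lhs = merge(mix(x_J, y_J, 0.25), 0, 1)
rhs = mix(merge(x_J, 0, 1), merge(y_J, 0, 1), 0.25)

print(sum(merged), sum(dropped))
```

The merged vector is again a probability vector, while the projected one is not, which mirrors the observation that $\triangle_I$ is not obtained from $\triangle_J$ by projection onto a subspace.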

A.2. Measurability problems
A general property of projective limit constructions of stochastic processes is that the index set (intuitively, the set of axis labels of a product, or of dimensions in a more general setting) must be countable to obtain a useful probability measure. This is due to the fact that all projective limit theorems implicitly generate a σ-algebra on the infinite-dimensional space, namely the σ-algebra $B_D$ specified by (2.2), based on the σ-algebras on the marginal spaces used in the construction. The constructed measure lives on this σ-algebra.

Fig. 2. [...] $A_I$ of some event $A_I \subset \mathbb{R}^I$, that is, if the set $A_D$ is of "axis-parallel" shape in the direction of $X_3$. The event $A_I$ in the figure occurs if $(X_1, X_2) \in A_I$, or equivalently, if $(X_1, X_2, X_3) \in f_{JI}^{-1} A_I$.
If the dimension is uncountable, the resolution of the σ-algebra is too coarse to resolve most events of interest. In particular, it does not contain singletons. The problem is most readily illustrated in the product space setting: Suppose the Kolmogorov theorem is used to define a measure $P_D$ on an infinite-dimensional product space $X_D := \mathbb{R}^L$, where $L$ is some infinite set. The measure $P_D$ is constructed from given measures $P_I$ defined on the finite-dimensional sub-products $\mathbb{R}^I$, where $I \in D$ are finite subsets of $L$. The σ-algebra on $\mathbb{R}^L$ on which $P_D$ is defined is generated as follows: Denote by $f_I$ the product space projector $\mathbb{R}^L \to \mathbb{R}^I$. For any measurable set $A_I \subset \mathbb{R}^I$, the preimage $f_I^{-1} A_I$ is a subset of $\mathbb{R}^L$, which is of "axis-parallel" shape in the direction of all axes not contained in $I$. The finite-dimensional analogue of this situation is illustrated in Fig. 2, where $A_I$ is assumed to be an elliptically shaped set in the plane $\mathbb{R}^I$, and the overall space $\mathbb{R}^L$ is depicted as three-dimensional. Preimages $f_I^{-1} A_I$ of measurable sets are, for obvious reasons, called cylinder sets in the probability literature. The σ-algebra defined by the Kolmogorov theorem is the smallest σ-algebra containing all cylinder sets $f_I^{-1} A_I$, for all measurable sets $A_I \subset \mathbb{R}^I$ and all finite sub-products $\mathbb{R}^I$. Since σ-algebras are defined by closure under countable operations, the sets in this σ-algebra can be thought of as cylinder sets that are of axis-parallel shape along all but a countable number of dimensions. If the overall space is of countable dimension, any set of interest can be expressed in this form. If the dimension is uncountable, however, these events only specify the joint behavior of a countable subset of random variables; in Fig. 2, $\mathbb{R}^I$ would then represent a subspace of countable dimension of the uncountable-dimensional space $\mathbb{R}^L$.
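The defining property of a cylinder set, namely that membership depends only on the coordinates in $I$, can be illustrated in the three-dimensional setting of Fig. 2. The sketch below is illustrative only: the ellipse and its semi-axes are hypothetical, and `in_cylinder` is an assumed helper name.

```python
def in_ellipse(u, v):
    """Indicator of a hypothetical elliptical event A_I in the plane R^I."""
    return (u / 2.0) ** 2 + v ** 2 <= 1.0

def in_cylinder(x, index=(0, 1)):
    """Indicator of the preimage of A_I: project x onto the coordinates in
    `index`, then test membership in A_I. All other coordinates are ignored."""
    u, v = (x[k] for k in index)
    return in_ellipse(u, v)

# Two points that differ only in the third coordinate X_3.
p = (1.0, 0.5, -100.0)
q = (1.0, 0.5, 1.0e6)

# A point whose projection onto (X_1, X_2) lies outside the ellipse.
r = (3.0, 0.0, 0.0)

print(in_cylinder(p), in_cylinder(q), in_cylinder(r))
```

Because the third coordinate never enters the test, the set is unbounded and "axis-parallel" in the direction of $X_3$, exactly as described for preimages under the projector.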
For example, consider the set $\mathbb{R}^L := \mathbb{R}^{\mathbb{R}}$, regarded as the set of all functions $x_D : \mathbb{R} \to \mathbb{R}$, which arises in the construction of Gaussian processes. Although the constructed measure $P_D$ is a distribution on random functions $x_D$, this measure cannot assign a probability to events of the form $\{X_D = x_D\}$, i.e. to the event that the outcome of a random draw is a particular function $x_D$. Every measurable event is determined by the values of the function on a countable subset of points $s_1, s_2, \ldots \in \mathbb{R}$, as in events of the form $\{X_D(s_1) = t_1, X_D(s_2) = t_2, \ldots\}$.

A.3. σ-additivity
The marginal distributions used in the construction specify the joint behavior of the constructed measure $P_D$ on any finite subset of measurable sets. σ-additivity requires additivity along an infinite sequence, and cannot be deduced directly from additivity of the marginals. Suppose that some sequence $A_1, A_2, \ldots$ of measurable sets in $V$ is given, and that $x_D$ is a random set function drawn from $P_D$. Countable additivity of $x_D$ along the sequence can be shown to hold almost surely (with respect to $P_D$) by means of a simple convergence argument [8, Proposition 2]. However, as a σ-algebra, $B_V$ is either finite or uncountable. Hence, if $V$ is infinite, $B_V$ contains an uncountable number of such sequences. Even though $x_D$ is additive along any given sequence with probability one, the null sets of exceptions can aggregate into a non-null set over all sequences, so almost sure σ-additivity of $x_D$ does not follow. Substituting a countable generator $Q$ for $B_V$ does not resolve the problem, since the number of sequences in $Q$ remains uncountable.
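The role of countability in this argument can be spelled out in a short worked comparison. The example with Lebesgue measure $\lambda$ on $[0,1]$ is a standard illustration chosen here for concreteness and does not appear in the construction itself; $N_m$ and $N_t$ denote generic null sets.

```latex
% Countable family of null sets: the union is again a null set,
% by countable subadditivity of the probability measure P.
\[
  P\Bigl(\bigcup_{m \in \mathbb{N}} N_m\Bigr)
  \;\le\; \sum_{m \in \mathbb{N}} P(N_m) \;=\; 0 .
\]

% Uncountable family of null sets: no such bound is available.
% For Lebesgue measure \lambda on [0,1] and N_t := \{t\},
\[
  \lambda(N_t) = 0 \ \text{ for every } t \in [0,1],
  \qquad\text{but}\qquad
  \lambda\Bigl(\bigcup_{t \in [0,1]} N_t\Bigr) = \lambda([0,1]) = 1 .
\]
```

This is why a criterion for σ-additivity that involves only countably many sequences is needed before the sequence-wise almost sure statement can be turned into almost sure σ-additivity.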