Relative Set Theory: Some External Issues

The paper establishes equivalence of two axiomatizations of the relative set theory GRIST, examines dependence of nonstandard concepts on the choice of the level, and proves that several nonstandard definitions of Lebesgue measure are equivalent. The last section extends GRIST to a theory of external sets.


Introduction
Relative set theory is an axiomatic framework for nonstandard analysis distinguished by many "levels of standardness." It was proposed by Péraire in [22] and several subsequent papers and given the acronym RIST; for early mathematical applications see for example [23]. The present author extended RIST further, to FRIST [14,15] and then to GRIST [16], a theory that is, in a technical sense, complete over ZFC. In 2004, O'Donovan and the author realized that a simple fragment of relative set theory might serve as a vehicle for presentation of nonstandard methods at a very elementary level. In a joint paper with Lessmann [17] we propose such an elementary theory of "relative analysis" and show how to develop some basic calculus concepts in it. Calculus courses based on relative analysis have been implemented in two high schools in Geneva; for a report on the pedagogical aspects of the project see O'Donovan [21].
The purpose of this article is to bridge the gap between the technical papers [14,16], concerned with metamathematics of GRIST, and the mathematical and pedagogical applications of GRIST. The common theme is the employment of external sets and classes. GRIST, like Nelson's IST, is an internal set theory in the sense that objects of the theory are what nonstandard analysts think of as internal sets; we call them just sets and identify them with the "usual" sets of traditional mathematics. However, in relative set theory (as in IST) it is possible, and often useful, to describe and put to use also collections that are not internal. In the attempts to present a fragment of GRIST as simply as possible it was found that an axiomatization in terms of certain external classes ("levels of standardness") is more intuitive. Section 1 of this paper states the axioms of GRIST both in the original ∈-⊑-language and in the language of levels, and provides a translation between the two. Thus it formally establishes the consistency of the elementary system employed in [17].
External concepts, such as infinitesimals, S-continuity, S-integrability and many others, are the tool characteristic of every version of nonstandard analysis. In relative set theory with its many "levels of standardness," such concepts are always relative to some level. In Section 2 we showcase the techniques available in GRIST for the study of dependence of external concepts on the choice of level. We focus on S-continuity and prove that, for a fixed function, it can change only finitely many times as the level varies.
Another feature of GRIST which is not present in the usual treatments of nonstandard analysis is the possibility to define concepts in ways that involve quantification over levels. One example is the notion of superfine partition, suggested by the problem of integrating an arbitrary derivative. In the Appendix to Section 3 we show that superfine partitions in the strong sense do not exist. But there is a weaker sense of relative standardness, proposed by Benninghofen and Richter [6] and Gordon [9,10], which can be used to define superfine partitions and develop a theory of integration. As the focus of this paper is on the methods of relative set theory, we show in Section 3 only that the usual definition of Lebesgue measure is equivalent to two definitions using superfine partitions, and also to one based on the well-known Loeb measure approach; a more systematic development of integration in GRIST can be found in [18].
In order to enable construction of Loeb measures and many other important nonstandard concepts, GRIST has to be extended to a substantial formal theory of external sets. In Section 4 we propose such a theory, in four stages of increasing power, and prove its relative consistency with ZFC. The paper concludes with a discussion of why a more comprehensive theory would be desirable.

and
(also denoted st). x < y is shorthand for (x ⊑ y ∧ ¬ y ⊑ x) and x ≡ y for (x ⊑ y ∧ y ⊑ x).
If $P(x_1, \ldots, x_k)$ is a formula (with all its free variables among $x_1, \ldots, x_k$) and u is a variable, the formula $P^u(x_1, \ldots, x_k)$ is obtained by replacing each occurrence of ⊑ by $\sqsubseteq_u$, defined by $x \sqsubseteq_u y$ if and only if (x ⊑ y ∨ x ⊑ u).
Below, we state the axioms of GRIST as given in [16]. 0 is the empty set and $P^{\mathrm{fin}} A$ is the set of all finite subsets of A.
GRIST includes the axioms of ZFC and the following additional axioms.

(R) Relativization
The conjunction of:

(S) Standardization
For all u = 0 and all $A, x_1, \ldots, x_k$, there exist v < u and B ⊑ v such that, for every w with v ⊑ w < u,

(I) Idealization
For all A < v and all $x_1, \ldots, x_k$, (∀a

(G) Granularity
For all $x_1, \ldots, x_k$, if $(\exists u)\,P^u(x_1, \ldots, x_k)$, then

As a vehicle for elementary presentations of nonstandard methods, and the first step towards enriching it by external objects, it is convenient to formulate relative set theory as a second-order theory, or, equivalently [8], a first-order theory in a two-sorted language.
In this formulation (see [17]), there are individual variables, intended to range over sets, and 1-place predicate variables (class variables) denoted by $V_1, V_2, \ldots, U, V, W, \ldots$, intended to range over certain proper classes called levels. Equality, both between individual variables and between class variables, is denoted by =. We write
We say that a formula P(x 1 , . . ., ] if all quantifiers over levels are of the form (∀U ⊇ V) or (∃U ⊇ V), and we indicate it by a semicolon thus: P(x 1 , . . .,
The axioms of GRIST in this language are given below; the superscript ♥ is used to distinguish them from their first-order counterparts. P is an arbitrary V-formula, with at most one free variable over levels.
GRIST ♥ postulates ZFC and

(R ♥) Relativization
The conjunction of:

(T ♥) Transfer
For all U ⊆ V and all $x_1, \ldots, x_k \in U$,

(S ♥) Standardization
For all U and all $A, x_1, \ldots, x_k$, either (∀V)(U ⊆ V) or there exist V ⊂ U and B ∈ V such that, for every W with V ⊆ W ⊂ U,

(I ♥) Idealization
For all U, V, A such that A ∈ U ⊂ V, and all $x_1, \ldots, x_k$,

At the semantic level, every structure M for the first-order language can be made into a general pre-structure N for the second-order language [8, § 4.4], by letting predicate variables V range over subsets of |M| of the form $\{x \in |M| : x \sqsubseteq^M a\}$, where a ∈ |M| (and dropping $\sqsubseteq^M$). Conversely, every general pre-structure N for the second-order language can be made into a structure M for the first-order language via defining $\sqsubseteq^M$ by: $a \sqsubseteq^M b$ if and only if N ⊨ (∀V)(b ∈ V → a ∈ V) (and dropping the levels). Moreover, M ⊨ GRIST if and only if N ⊨ GRIST ♥. In the following we obtain stronger syntactic results that give an algorithm for translating statements from one language into the other in such a way that theorems get translated into theorems.
To facilitate the translations, we partition the variables of GRIST and the individual variables of GRIST ♥ into two infinite classes and use $x_1, x_2, \ldots, x, y, z, a, A, \ldots$ for variables of the first kind and $v_1, v_2, \ldots, u, v, w, \ldots$ for the second kind. We fix a one-one correspondence v → V between the GRIST variables of the second kind and the class variables of GRIST ♥. A GRIST formula $P(x_1, \ldots, x_k, v_1, \ldots, v_n)$ is suitable if the variables of the second kind do not occur in the scope of ∈ or =. Formally: $x_j$, $v_i \sqsubseteq v_j$ are suitable, and applications of logical connectives and quantifiers to suitable formulas yield suitable formulas. A
We define a syntactic translation Γ of the suitable formulas of the GRIST language into the language of GRIST ♥, and a syntactic translation Δ in the reverse direction. We use ≐ for metamathematical identity, and , as metaparentheses.
Let R denote the conjunction of R (i) and (ii), and R ♥ the conjunction of R ♥ (o), (i) and (ii).
, for all suitable P and Q (of the appropriate languages).
Proof By induction on the complexity of formulas. We verify the cases that are not entirely trivial.
The last formula implies y ⊑ y → x ⊑ y, and then x ⊑ y, using R(i); the converse direction follows from R(ii).
, as in the preceding case. This is further equivalent to v ⊑ x.
Proof Wlog we can assume that all formulas occurring in the proof of Q are suitable (if an individual variable of the second kind appears in the proof, replace it by some variable of the first kind that does not appear in the proof). It is trivial to verify that Δ preserves logical axioms and modus ponens. The axioms for equality between classes are translated into easy consequences of R(i, ii).
The analogous proposition for Γ is not so immediate.
Proposition 1.4 If R ⊢ P, then R ♥ ⊢ Γ P, for all suitable P.
The primary difficulty is that, while P is assumed to be a suitable formula, some other formulas that appear in the proof of P may not be suitable, and the translation Γ may be undefined for them. Renaming variables does not help here; if v occurs in P and say v ∈ x appears somewhere in the proof, replacing v with V does not make sense.
To circumvent this difficulty, we define another, more "literal" translation $\Gamma_0$ that does not suffer from this problem; on the other hand, $\Gamma_0$ does not satisfy Proposition 1.2.
In Proposition 1.6 we establish a relationship between the two translations from which Proposition 1.4 follows.
If P is any formula of GRIST, $\Gamma_0 P$ is the formula obtained from P by replacing each occurrence of ξ ⊑ η by (∀U)(η ∈ U → ξ ∈ U) [ξ, η are variables of either kind].
Proposition 1.5 If R ⊢ P, then R ♥ ⊢ $\Gamma_0 P$, for all P.
Proof Trivially, $\Gamma_0$ preserves logical axioms and modus ponens. $\Gamma_0$ u u and
The translation $\Gamma_0$, unlike Γ and Δ, does not take suitable formulas into suitable formulas, and Proposition 1.2 does not hold with $\Gamma_0$ in place of Γ: ), which does not even have the same free variables as Q. In general, if P is a suitable formula with free variables $x_1, \ldots, x_k, v_1, \ldots, v_n$, then $\Gamma_0 P$ has the same free variables, while Γ P has free variables x 1 , . . .,
We next establish the relationship between Γ and $\Gamma_0$.
Axioms (o) and (i) of R ♥ imply that for every x there is a unique coarsest level V such that x ∈ V. In R ♥ we can thus define a function x → V(x) with arguments of the individual sort and values of the class sort such that
Proof Proceed by induction on complexity of suitable formulas. Let x be shorthand for $x_1, \ldots, x_k$. The interesting cases are:
Before extending Propositions 1.3 and 1.4 to all of GRIST, we need a technical result. We use x as shorthand for $x_1, \ldots, x_k$.
Lemma 1.7 (a) If P(x) is a formula where only variables of the first kind occur, then for some formula P(x) where only variables of the first kind occur.
The rest of (a) follows trivially by induction on the complexity of P .
(b) We show by induction on the complexity of formulas that for every suitable V-formula $Q(x; V, V_1, \ldots, V_n)$ there is a suitable formula $P(x, v_1, \ldots, v_n)$, in which all variables except $v_1, \ldots, v_n$ are of the first kind, such that
In particular, letting n = 0 gives $R \vdash \Delta\, Q(x; V) \leftrightarrow P^v(x)$ and proves (b).
We focus on the nontrivial cases.
Replace the bound variable v 1 in the second conjunct by some variable of the first kind that does not occur in P , and the proof is complete.
Proof We can assume that all variables that occur in the axioms of GRIST (in particular, in the formula P ), except for u, v, w, are of the first kind.It is then easy to verify, with the help of Lemma 1.7(a), that each of the remaining axioms of GRIST: R(iii, iv, v), T, S, I, G, translates into a formula that is equivalent to an instance of the corresponding axiom of GRIST ♥ .This establishes the "only if" direction in (a).
For the converse, we can assume that all individual variables occurring in the axioms of GRIST ♥ are of the first kind. Then all of the remaining axioms of GRIST ♥ translate into equivalents of the corresponding axioms of GRIST, using Lemma 1.7(b). This proves the "only if" direction in (b).
The "if" directions then follow from Proposition 1.2.
Corollary 1.9 Corollaries 12.1-12.10 in [16] are valid for GRIST ♥ [in place of SST], with obvious adjustments. In particular, GRIST ♥ is a conservative extension of ZFC.
From now on, we do not formally distinguish between GRIST and GRIST ♥ .
A number of consequences of GRIST that are useful in relative analysis have been derived in [16]. Below we give a translation of these consequences into the language of levels. It follows easily from GRIST that for every $x_1, \ldots, x_k$ there is a coarsest level where $x_1, \ldots, x_k$ appear; see Axiom I below. We denote it $V(x_1, \ldots, x_k)$ and call it the level of $x_1, \ldots, x_k$. In particular, V(•) is the coarsest level [• is the empty list].
Proposition 1.11 (Saturation) If F ∈ U ⊂ V and F has the finite intersection property, then there exists y ∈ V such that y ∈ X , for all X ∈ F ∩ U.
Proof Let A := F , B := F , and let P(X, y; V) be the formula (y ∈ X) ∧ (y ∈ V); then apply FRIST Idealization.
As outlined in [17], an elementary exposition of calculus in relative analysis does not need the full strength (and complexity) of GRIST; a much weaker system suffices.
Here and in the next section we derive the axioms of [17] from GRIST.
Axiom I For every $x_1, \ldots, x_k$ there is a level V such that $x_1, \ldots, x_k \in V$ and, for all levels U, x 1 , . . .,
Remark A stronger result follows from Proposition 1.10(13): For every finite set $\{x_1, \ldots, x_k\}$ there is a coarsest level V such that $\{x_1, \ldots, x_k\} \subseteq V$.
The axioms II and VI (Stability) are R ♥ (iii) and T ♥ , respectively, and V is a consequence of VI.The statements and proofs of axioms III, IV (Neighbor Principle) and VIII (Density of Levels) are given in Section 2.
Let P(x; V) be the V-formula obtained from an internal P(x) by replacing each (∀U) with (∀U ⊇ V) and each (∃U) with (∃U ⊇ V).
Proof Let P(y, x; V) be the V-formula obtained from the internal P(y, x) as above, and let
By Transfer (Stability) applied to (∗), for any
Given an arbitrary y, let $V_1 := V(y, A, x)$; Lemma 1.12 gives y ∈ B ↔ y ∈ A ∧ P(y, x).
Mathematical practice enriches the set-theoretic language by new defined concepts. We conclude this section by proving that the axioms of GRIST remain valid for formulas in the language enriched by internal predicates, where a predicate R(x 1 , . . ., x k ) defined by R(x 1 , . . ., x k ) ↔ R(x 1 , . . ., x k ) is internal if its defining formula R is internal.
V-formulas and internal formulas of the language of GRIST ♥ with an additional predicate symbol R are defined in the same way as for the original language.
Proposition 1.13 (a) If P(x; V) is a V-formula in the language with an additional predicate R and P (x, V) is obtained from P by replacing each occurrence of R by its internal defining formula R, then P is equivalent to a V-formula in the original language.
(b) If P(x 1 , . . ., x n ) is an internal formula in the language with an additional predicate R and P (x) is obtained from P by replacing each occurrence of R by its internal defining formula R, then P is equivalent to an internal formula in the original language.
Proof (a) is obvious from Lemma 1.12 [replace R(x) by R(x; V)].

S-continuity in GRIST.
If V is a fixed level and st(x) [x is standard] is defined as x ∈ V, all axioms of BST become provable in GRIST [16, Corollary 12.18]. Therefore, nonstandard analysis in the style of Internal Set Theory of Nelson [20,7] can be practiced in GRIST, with the additional advantage that the notion of standardness is not fixed: Every set a can be considered as standard relative to any level with a ∈ V. The paper [17] outlines an elementary presentation of nonstandard methods in analysis in the framework of GRIST, and discusses its advantages.
The existence of many "levels of standardness" in GRIST raises the question of dependence of nonstandard concepts on the choice of the level. It also enables definitions of concepts that involve quantification over levels. In this and the next section we illustrate some of the techniques available in GRIST for handling of such issues.
Definition 2.1 Given a level V: (1) A real number ε is ultrasmall relative to V if |ε| < r for all r > 0, r ∈ V.
(2) A real number x is ultralarge relative to V if |x| > r for all r > 0, r ∈ V; x is limited relative to V if it is not ultralarge relative to V.
(3) Real numbers a and b are ultraclose relative to V, written $a \simeq_V b$, if a − b is ultrasmall relative to V.
Proof (a) For every W ⊂ V and every finite a ⊆ N, a ∈ W, there is k
Proposition 2.3 (Axiom III) For every level V there exist real numbers ε ≠ 0 ultrasmall relative to V.
Proposition 2.4 (Axiom VIII, Density of Levels) If a real number ε ≠ 0 is ultrasmall relative to V, then there is $V^+$ and a real number $\delta \in V^+$, δ ≠ 0, such that δ is ultrasmall relative to V and ε is ultrasmall relative to $V^+$.
Proof By Local Transfer there is
Proposition 2.5 (Axiom IV, Neighbor Principle) For every real number x limited relative to V there is a real number r ∈ V such that $x \simeq_V r$. The number r is uniquely determined; we call it the V-neighbor of x and denote it $n_V(x)$.
Proof By FRIST Standardization 1.10(1) there is
hence B has a least upper bound r, and r ∈ V, again by Transfer. One easily verifies that $r \simeq_V x$.
If x ∈ R, x ≠ 0, then x is not ultrasmall relative to V(x). Granularity implies that for every x ≠ 0 there is a coarsest level $V_0$ such that x is not ultrasmall relative to $V_0$. If $V_0$ = V(•), the coarsest level, then x is not ultrasmall relative to any level. Otherwise, x is ultrasmall relative to $V \subset V_0$ and is not ultrasmall relative to $V \supseteq V_0$.
More interesting behavior is exhibited by the various S-concepts that play a key role in nonstandard analysis: S-continuity, S-integrability, etc. Here we study the dependence of S-continuity on the choice of level.
Definition 2.6 Given a set A ⊆ R and a level V, there is a unique set
We call this B the V-shadow of A and denote it $\mathrm{sh}_V(A)$.

Proposition 2.7 (a) For
Proof (a) A well-known nonstandard characterization of closed subsets of R is: This is precisely the statement that A = sh V (A).
(b) Let $B := \mathrm{sh}_V(A)$; by Local Transfer 1.10(6), there is a level
Remark The usual proof of (b) uses ε-neighborhoods of r and Idealization; the argument given here is more "nonstandard."
[it is defined because ξ is limited relative to U, hence also relative to V]. We have $\xi \simeq_V y$, so $y \in \mathrm{sh}_V(A)$, and $x \simeq_U y$, so $x \in \mathrm{sh}_U(\mathrm{sh}_V(A))$.
For the converse, let $x \in \mathrm{sh}_U(\mathrm{sh}_V(A)) \cap U$. Let $B := \mathrm{sh}_V(A)$. We know that $x \simeq_U y$ for some y ∈ B. It suffices to prove Claim: $x \simeq_U y$ for some y ∈ B ∩ V because then $y \simeq_V \xi$ for some ξ ∈ A and so $x \simeq_U \xi$ and $x \in \mathrm{sh}_U(A)$.
Proof of Claim Let U vary over neighborhoods of x in U; then (∃y
The notion of shadow makes sense for subsets of R × R. We let $(x, y) \simeq_V (x', y')$ if and only if $x \simeq_V x' \wedge y \simeq_V y'$. Definition 2.6 and Propositions 2.7, 2.8 then have obvious analogs for A ⊆ R × R.
In the rest of this section we study real-valued functions. For simplicity, we consider only functions f :
Theorem 2.10 For every function f there is a finite set {v 0 , . . .,
Conversely, for every finite set $\{v_0, \ldots, v_n\}$ as above there is a function f with the above properties.
The first part of Theorem 2.10 is an immediate consequence of the Support Principle 1.10(5). Here we prove a stronger result [Theorem 2.13] showing that every function has only finitely many distinct shadows as V ranges over all levels. We recall that sh
holds for all r, s ∈ V.
The S-versions of the following two facts are well-known (e.g., see [7]).
, for some x, x′. It follows that $x \simeq_V x'$, so by V-continuity, $f(x) \simeq_V f(x')$, and finally $s_1 \simeq_V s_2$. As $s_1, s_2 \in V$, we get
Thus F is a function.
Proof By Local Transfer 1.10(6), there is
by V-continuity of f, and so $F(r) \simeq_V F(r')$. The statement just proved for V and a particular V′ ⊃ V:
is true for all V′ ⊃ V by Polytransfer 1.10(8). Hence $r \simeq_V r' \rightarrow F(r) \simeq_V F(r')$ holds for all r, r′ ∈ dom F, and F is (uniformly) continuous.
Theorem 2.13 For every function f there is a finite set {F 0 , . . .,
Proof We fix $\{v_0, \ldots, v_n\}$ and consider the following statement about V:
The statement is trivially true for $V \supseteq V(v_n)$. So by Granularity there is a coarsest level V for which it is true.
We can now prove the second part of Theorem 2.10.
For 1 ≤ i ≤ n let f i : [0, 1] → [0, 1] be defined by If n is even, take n i=2 i even instead; the pattern of continuity versus not-continuity is reversed.Both cases are easily modified to produce the opposite pattern.
The notion of V-continuity has a natural generalization.
We forgo the detailed study of (V 1 , V 2 )-continuity and prove only the basic result.
This section is concerned with some aspects of Lebesgue measure and integral. Our goal is to showcase the tools available in GRIST for dealing with such matters, not to give a systematic development of the theory of integration. For this reason, we limit ourselves to the representative simplest case, that of Lebesgue measure on [0, 1]. A more complete treatment of the theory of measure and integration in (a weak subsystem of) GRIST can be found in [18].
We give three nonstandard definitions of Lebesgue measure on [0, 1] and prove their equivalence to the usual one. Letters A, B denote subsets of [0, 1], and I, J are finite non-degenerate intervals; ℓ(I) is the length of I.
The most useful nonstandard approach to integration is due to Loeb (see e.g. [1]). Loeb showed that every finitely additive measure µ on an algebra of sets A gives rise to an external (countably additive) measure L(µ) on an external σ-algebra L(A) generated by A. This construction uses external sets in an essential way (see Section 4). Loeb
finite system of non-overlapping intervals
, there is a level V′ ⊃ V such that N ∉ V′. By Polytransfer 1.10(8), we can assume that the system $\{J_j\}_{j=0}^{m}$ with the above properties is in V′. Let $\{I_i\}_{i=0}^{n}$ be the collection of all intervals of the form $[t_k, t_{k+1})$ that have a nonempty intersection with some $J_j$. Then n −1
In order to motivate the next two definitions of Lebesgue measure (Definitions 3.9 and 3.10), we briefly summarize the nonstandard approach to the Riemann integral and the integrals of McShane and Henstock-Kurzweil. For the standard theory of these integrals see for example Bartle [4] and Pfeffer [24].
Definition 3.4 A tagged interval is a pair (I, t) where I is a closed interval and t ∈ R.
A tagged covering is a finite system
A tagged partition is a Riemann partition if $t_i \in I_i$ for all i = 1, . . ., n. A Riemann partition I is fine relative to V if all $\ell(I_i)$ are ultrasmall relative to V.
Definition 3.5 Let f : [0, 1] → R be a function. For all tagged coverings I, the Riemann sum Σ(f; I) is defined as $\sum_{i=1}^{n} f(t_i) \cdot \ell(I_i)$.
One of the inadequacies of Riemann integration is that it is not a true inverse to the operation of differentiation: If f′ is continuous, then f′ is Riemann integrable and the (indefinite) integral gives back f (up to a constant), but f′ need not be Riemann integrable in general. Relative analysis sheds some light on the reasons for this phenomenon. Let us consider a differentiable function f and a fine Riemann partition I of [0, 1], where
and x
Adding these equations then gives
where $\sum_i \varepsilon_i \cdot \ell(I_i) \simeq_{V(f)} 0$, and hence f′ is integrable and
However, if f is merely "pointwise" differentiable, (∗) is valid only under the assumption that $\ell(I_i)$ is ultrasmall relative to $V(f, t_i)$; the assumption that $\ell(I_i)$ is ultrasmall relative to V(f) is not sufficient. These considerations suggest that every derivative would become integrable if we used, in place of fine Riemann partitions, superfine partitions, defined as those tagged partitions where each $\ell(I_i)$ is ultrasmall relative to the level $V(f, t_i)$, dependent on $t_i$. It turns out that superfine partitions of [0, 1] in this strong sense do not exist; see Theorem 3.12 in the Appendix to this section. The idea does work if one employs instead a weaker notion of relative ultrasmallness due to Benninghofen and Richter [6] and Gordon [9,10].
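The telescoping argument sketched in the preceding paragraph can be written out as follows. This is a reconstruction under stated assumptions, not the original display: the partition points $x_0 < \dots < x_n$, the error terms $\varepsilon_i$, and the precise form of (∗) are our notation, and (∗) is assumed to hold with every $\varepsilon_i$ ultrasmall relative to the single level V(f).

```latex
% Assumptions (notation ours): I_i = [x_{i-1}, x_i], 0 = x_0 < ... < x_n = 1,
% t_i \in I_i, \ell(I_i) = x_i - x_{i-1}, and \simeq abbreviates \simeq_{V(f)}.
\begin{align*}
f(x_i) - f(x_{i-1}) &= f'(t_i)\,\ell(I_i) + \varepsilon_i\,\ell(I_i),
  \qquad \varepsilon_i \simeq 0, \tag{$*$}\\
f(1) - f(0) &= \sum_{i=1}^{n} f'(t_i)\,\ell(I_i)
  + \sum_{i=1}^{n} \varepsilon_i\,\ell(I_i),\\
\Bigl|\sum_{i=1}^{n} \varepsilon_i\,\ell(I_i)\Bigr|
  &\le \max_{1 \le i \le n} |\varepsilon_i| \cdot \sum_{i=1}^{n} \ell(I_i)
  = \max_{1 \le i \le n} |\varepsilon_i| \simeq 0 .
\end{align*}
```

The last line uses that the maximum is attained at some index, so it is itself one of the ultrasmall $\varepsilon_i$. This is the step that breaks down when each $\varepsilon_i$ is only known to be ultrasmall relative to a level depending on $t_i$.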
Definition 3.6 Given a ∈ R, we say that a real number ε is a-ultrasmall relative to V if |ε| < ϕ(a) for all positive functions ϕ ∈ V defined on R.
relative to V for all $x \in I_i$ and all i = 1, . . ., n. For Riemann partitions, this is equivalent to $\ell(I_i)$ being $t_i$-ultrasmall, for all i = 1, . . ., n.
Let ϕ be a positive function on [0, 1].We say that a tagged covering {(I i , t i )} n i=1 is subordinate to ϕ if (∀x ∈ I i )(|x − t i | < ϕ(t i )), for all i = 1, . . ., n.A well-known classical result (Cousin's Lemma) states that for each positive ϕ there exist Riemann partitions of [0, 1] subordinate to ϕ.The existence of Riemann partitions of [0, 1] superfine relative to V follows from this by Idealization.
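The notion of a Riemann partition subordinate to ϕ can be illustrated by a small computation. The sketch below is not the proof of Cousin's Lemma (which is a compactness argument); it is a bisection heuristic that tries only three candidate tags per interval, so it may fail for an irregular gauge even though a subordinate partition exists. All names (`cousin_partition`, `is_subordinate`, the sample gauge `phi`) are illustrative assumptions, not notation from the paper.

```python
from typing import Callable, List, Tuple

# (left endpoint, right endpoint, tag); this layout is an illustrative choice.
TaggedInterval = Tuple[float, float, float]


def is_subordinate(a: float, b: float, t: float,
                   phi: Callable[[float], float]) -> bool:
    """True if |x - t| < phi(t) for every x in the closed interval [a, b]."""
    return max(abs(a - t), abs(b - t)) < phi(t)


def cousin_partition(phi: Callable[[float], float],
                     a: float = 0.0, b: float = 1.0,
                     max_depth: int = 40) -> List[TaggedInterval]:
    """Bisect [a, b] until each piece has a tag making it subordinate to phi.

    Heuristic: only the endpoints and the midpoint are tried as tags, so the
    search can fail (RuntimeError) for gauges that are tiny at all such points.
    """
    mid = (a + b) / 2
    for t in (a, mid, b):
        if is_subordinate(a, b, t, phi):
            return [(a, b, t)]
    if max_depth == 0:
        raise RuntimeError("no subordinate tag found among the candidates")
    return (cousin_partition(phi, a, mid, max_depth - 1)
            + cousin_partition(phi, mid, b, max_depth - 1))


def phi(t: float) -> float:
    """Sample positive gauge (an assumption): small near t = 0.5."""
    return 0.01 + 0.1 * abs(t - 0.5)


partition = cousin_partition(phi)
```

Because the sample gauge is bounded below by 0.01, the bisection stops after a few levels; each returned triple satisfies both $t_i \in I_i$ and the subordination condition.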
If the word "fine" is replaced by "superfine" in Definition 3.5, one obtains a notion of integral that is equivalent to the one introduced by Henstock and Kurzweil. The standard definition is as follows.
Definition 3.8 A function f on [0, 1] is Henstock-Kurzweil integrable if there is a real number R such that for every ε > 0 there is a positive function ϕ such that |Σ(f; I) − R| < ε holds for all Riemann partitions I of [0, 1] subordinate to ϕ.
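The quantity compared with R in Definition 3.8 is the Riemann sum over a tagged partition. A minimal numerical sketch, assuming the usual form $\sum_i f(t_i)\,\ell(I_i)$ and using hypothetical helper names (`riemann_sum`, `midpoint_partition`) that do not come from the paper:

```python
from typing import Callable, List, Tuple

# (left endpoint, right endpoint, tag); this layout is an illustrative choice.
TaggedInterval = Tuple[float, float, float]


def riemann_sum(f: Callable[[float], float],
                partition: List[TaggedInterval]) -> float:
    """Sum of f(t_i) * length(I_i) over a tagged covering."""
    return sum(f(t) * (b - a) for a, b, t in partition)


def midpoint_partition(n: int) -> List[TaggedInterval]:
    """Uniform Riemann partition of [0, 1] into n intervals tagged at midpoints."""
    return [(i / n, (i + 1) / n, (2 * i + 1) / (2 * n)) for i in range(n)]


def square(x: float) -> float:
    return x * x


# For f(x) = x^2 the sums approach R = 1/3 as the partition is refined.
coarse = riemann_sum(square, midpoint_partition(4))
fine = riemann_sum(square, midpoint_partition(1000))
```

A uniform partition is subordinate to any constant gauge larger than its mesh; the point of Definition 3.8 is that ϕ may instead vary with the tag.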
Henstock-Kurzweil integral agrees with Lebesgue integral on nonnegative functions (more generally, on absolutely integrable functions), but there exist functions that are Henstock-Kurzweil integrable but not Lebesgue integrable; in particular, all derivatives are Henstock-Kurzweil integrable.The nonstandard theory of Henstock-Kurzweil integral is worked out in some detail in Benninghofen [5] and in [18].
For every tagged partition $I = \{(I_i, t_i)\}_{i=1}^{n}$ anchored in A we construct a Riemann partition $J = \{(J_j, s_j)\}_{j=1}^{m}$ anchored in A with $\bigcup_{i=1}^{n} I_i \subseteq \bigcup_{j=1}^{m} J_j$, hence $\sum_{i=1}^{n} \ell(I_i) \le \sum_{j=1}^{m} \ell(J_j)$, in such a way that if I is superfine relative to V, then J is also superfine relative to V. [More strongly, if I is subordinate to ϕ, then J is subordinate to ϕ, for any ϕ > 0.] From this result, $m_2(A) \le m_3(A)$ follows immediately.
Let $I_i = [a_i, b_i]$ and let $r_1, \ldots, r_p$ be a one-one enumeration of $\{t_1, \ldots, t_n\}$. If $r = t_{i_1} = \ldots = t_{i_k}$ and $r \ne t_i$ for $i \ne i_1, \ldots, i_k$, let
Note that > 0, and if
Claim: There is a Riemann partition J = {(J j , s j )} m j=1 such that p =1 C = m j=1 J j and for each j there is such that s j = r and J j ⊆ C .
Clearly, if C is superfine relative to V [subordinate to ϕ], the same holds for J, and this concludes the proof.
Proof of Claim. The construction is by recursion. Wlog we assume that 1 = max{ 1 , . . ., p }. By inductive assumption, for the covering {(C , r )} p =2 there is J = {(J j , s j )} m j=1 satisfying the Claim. First, we omit from J those (J j , s j ) where J j ⊆ C 1 . Let j − and j + be such that r 1 − 1 ∈ J j − and r 1 + 1 ∈ J j + , if they exist.

GRIST and external sets.
Relative set theory enriches the ∈-language of set theory by additional means: the binary relative standardness predicate ⊑, or equivalently, variables over levels. In this extended language it is possible to describe subcollections of sets that are not themselves sets. We use the term external sets in the inclusive sense, to refer to such collections as well as to the "usual" sets, sometimes called internal sets for emphasis.
There are two important reasons for expanding relative set theory to a theory of external sets. The first is foundational: the tendency to abstraction, so prominent in mathematics since at least the time of Cantor, makes us employ such collections almost automatically; "they are there." The second reason is pragmatic: while the work of Nelson and the IST school shows that much can be accomplished by purely internal means, and while the techniques based on GRIST further facilitate and extend the internal approach, most of the practitioners of nonstandard analysis use some version of the Robinsonian model-theoretic framework, grounded in superstructures and characterized by heavy use of non-internal sets. If relative set theory is to serve as a universal vehicle for nonstandard analysis, it has to accommodate non-internal sets and the model-theoretic framework.
Here we consider four increasingly powerful extensions of GRIST to a theory of external sets, motivated mostly by pragmatic considerations.We believe that all arguments of current nonstandard practice can be formalized in (the strongest of) these systems.The issues at play in this section are similar to those that arise from attempts to extend BST to a theory of external sets.We rely heavily on the monograph [19] of Kanovei and Reeken, which contains a systematic comparative study of such extensions.
The most elementary use of external sets in relative set theory is as extensions of formulas of the language of GRIST. For example, given r ∈ R, we can define
V-monad of r: $m_V(r) := \{x \in R : |x - r| \text{ is ultrasmall relative to } V\}$;
V-galaxy of r: $g_V(r) := \{x \in R : |x - r| \text{ is limited relative to } V\}$;
the monad of r ∈ R is then $m_{V(r)}(r)$, and the galaxy of r ∈ R is $g_{V(r)}(r)$;
in proving countable additivity of Loeb measures. Kanovei and Reeken [19, Section 9.5] observe that the essence of the construction of Loeb measures can be carried out already in much weaker systems, such as E 1 −GRIST [this is why we included External Saturation and Standard Size Choice in it], except that the Loeb algebra and the measure itself are not sets. E ω −GRIST adds the External Power Set axiom to remedy this last difficulty.
The universe of E ω −GRIST resembles a superstructure over I; with the addition of two more axioms we can obtain something like the familiar model-theoretic framework.
E Ω −GRIST
Following [19, Definition 8.1.3], a set x ∈ S is condensable if there is an external transitive set T ∋ x and a map $y \mapsto \hat{y}$ defined on T ∩ S and such that $\hat{y} = \{\hat{z} : z \in y \cap S\}$ holds for all y ∈ T ∩ S.
E Ω −GRIST is E ω −GRIST plus the axioms:
Let H be the class of all external sets and WF the subclass of all externally well-founded external sets. We let $W := \{\hat{x} : x \in S\}$ be the class of all feasible well-founded sets [W = WF feas in the notation of [19]]. It can be shown that either W = WF or $W = V_\Omega$ for some external ordinal Ω, where $V_\xi$, ξ ∈ Ord, is the external von Neumann cumulative hierarchy. It then follows [19, Exercise 8.2.6] that there is a uniquely determined ∈-isomorphism * : W → S ⊆ I. Hence W is an interpretation of ZFC (i.e., $P^W$ holds for all axioms P of ZFC) and * : W → I is an ∈-elementary embedding [i.e., $(\forall x \in W)(P^W(x) \leftrightarrow P^I(x))$ holds for all ∈-formulas P]. This is the scheme "WF feas * → I [ in H ]" in the terminology of [19]. The analogy with the model-theoretic framework is very close. In the model-theoretic terminology, elements of W are the standard sets, elements of S are the standard copies, I is the universe of internal sets, and the entire universe H of E Ω −GRIST is the "superstructure" of external sets. Most arguments of model-theoretic nonstandard analysis readily transfer into this setting; see Kanovei and Reeken [19, Chapter 2] for an exposition of nonstandard analysis in the closely related framework of "WF * → I [ in H ]."

E−GRIST
We extend the theory once more, again for both practical and fundamental reasons.The external universe of E Ω −GRIST satisfies all of ZC, but not necessarily Replacement.
From the point of view of applications there is also something missing.For example, the Loeb measure L(µ) is an external set; but we might like to have a measure with the properties of L(µ) in the standard universe S, or in its well-founded counterpart W.These shortcomings are overcome in the theory obtained from E Ω −GRIST by adding yet two more principles.
In the absence of Regularity, External Collection is stronger than External Replacement.E−GRIST implies that every external set is in one-one correspondence with some element of WF (in fact, with some external ordinal).Hence L(µ) is isomorphic to some measure m 0 ∈ WF, in any appropriate sense of "isomorphic."Next, by External Transfer, there is a measure m ∈ W that has "the same properties" as m 0 , at least as far as these properties can be expressed by ∈-formulas in WF.Finally, * m ∈ S is a standard measure isomorphic to m, via the isomorphism * .
One consequence of E−GRIST is perhaps unexpected and worth pointing out: the universes I, S ⊆ I and W are themselves external sets. This observation suggests that, in E−GRIST, the universe H of external sets, or perhaps its well-founded part WF, should be regarded as the "usual" universe of sets, while I, S, and W are just "models" of set theory, with * : W → S ⊆ I being close to the Robinsonian model-theoretic framework. Before discussing this matter further, we consider the simpler nonstandard set theory IST.
IST is a conservative extension of ZFC, and can thus be viewed as merely a formal tool for proving theorems of ZFC. However, most users of IST wish to identify the "usual" sets with some objects provided by IST. For this purpose there are two choices.
The "official" philosophy of Nelson, enshrined by the IST terminology, is to regard the internal sets as the "usual" sets. This is the view we follow in GRIST as well. It has significant pedagogical advantages, as discussed in detail in [17]. However, one can equally well take the view that standard sets of IST are the "usual" sets. This is the philosophy of the author's [13]; many working mathematicians seem to find it more palatable. The point we wish to stress is that there is no mathematical reason for preferring either alternative. The standard universe S and the internal universe I have equal claims to being the "usual" universe of set theory.
On the other hand, if we look at not just the universes themselves, but also at the way they are embedded in the wider "cosmos" of IST, we notice some essential differences; most important for mathematical applications, infinitesimals (ultrasmall numbers) exist relative to the standard universe (level) S, but not relative to the internal universe I. It is precisely this asymmetry that is remedied by relative set theory. In fact, the guiding principle behind the development of GRIST was the desire to make all levels have the same view of the surrounding "cosmos" (technically, to make Transfer hold for all ∈-formulas, a feature we refer to elsewhere as "Full Relativization").
Coming back to E−GRIST: We have already noted that there are several universes that with some justification can be regarded as the "usual" universe of sets: I, S, W, WF, H (and possibly any level V ⊆ I). If we follow the ideas that led from IST to GRIST, a picture of an extension of E−GRIST emerges wherein every universe can be regarded as the "usual" one, and all universes have the same view of the "cosmos"; in particular, every universe has a stratification into levels that satisfies GRIST, has its own external universe, and in this external universe it is isomorphic to a transitive universe, via its own *.
The "relativistic" perspective on axiomatic nonstandard set theory was advocated and developed by Ballard in [2]. In a later unpublished paper [3] Ballard realized that the foundational issues raised by many universes of axiomatic nonstandard analysis are similar to those raised by many universes obtainable by forcing in traditional set theory, and proposed a coherent, uncompromisingly relativistic theory of the mathematical "cosmos" that accommodates both nonstandard and forcing extensions. A truly universal and philosophically satisfying theory of the nonstandard appears to require

(b) Refer to the proof of Lemma 1.12 and the notation therein. We have R(y) ↔ R*(y) ↔ R*(y; V(y)) ↔ R*(y; V(y, x)) [by Transfer]. The last formula is equivalent to the formula R(y, x) obtained from R by replacing each occurrence of the universal quantifier (∀U)(y_1, ..., y_k ∈ U ...) by (∀U)(y_1, ..., y_k, x_1, ..., x_n ∈ U ...) [and similarly for ∃]. It is easy to see that replacing each occurrence of R(y) in P by R(y, x) yields an internal formula equivalent to P.
and r ∈ sh_V(A) = B. Hence B satisfies the characterization of closed sets from the proof of (a). (c) follows from (a) and (b), and (d) follows from the definition of sh by Transfer.

By [16, Proposition 12.33], in GRIST one can define a mapping of the (internal) set N of natural numbers onto the class of all standard ordinals. The composition of this mapping with the internal von Neumann cumulative hierarchy ξ → V_ξ maps N onto {V_ξ : ξ ∈ S}, and External Collection implies that the latter is an external set. Hence, by External Union, ⋃_{ξ∈S} V_ξ = I is an external set. We conclude that I, S ⊆ I and W [the inverse image of S by *] are external sets! [This conclusion does not hold in weaker theories, such as E Ω −GRIST; see the proof of Theorem 4.1.]
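The chain of maps used in this argument can be displayed as follows. This is a sketch for orientation only; the name F for the mapping supplied by [16, Proposition 12.33] is a label introduced here, not notation from the source.

```latex
% Sketch of the argument that the internal universe is an external set.
% F is a hypothetical name for the surjection from [16, Proposition 12.33].
% Requires amsmath and amssymb.
\[
  \mathbb{N}
  \xrightarrow{\ F\ \text{(onto)}\ }
  \{\xi \in \mathrm{Ord} : \xi \in \mathbb{S}\}
  \xrightarrow{\ \xi \,\mapsto\, V_\xi\ }
  \{V_\xi : \xi \in \mathbb{S}\},
  \qquad
  \mathbb{I} \;=\; \bigcup_{\xi \in \mathbb{S}} V_\xi .
\]
```

External Collection makes the rightmost collection an external set, and External Union then yields that I is one as well.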

Theorem 4.1 E−GRIST is a conservative extension of ZFC.

Proof and further discussion of Theorem 4.1
We rely heavily on the material in Kanovei-Reeken [19, Sections 8.1, 8.2]. The preceding remarks indicate that E−GRIST is similar to Kawaï's theory KST. The principal differences are: