An examination of the SEP candidate analogical inference rule within pure inductive logic

RationalityUncertain reasoning Within the framework of (Unary) Pure Inductive Logic we investigate four possible formulations of a probabilistic principle of analogy based on a template considered by Paul Bartha in the Stanford Encyclopedia of Philosophy [1] and give some characterizations of the probability functions which satisfy them. In addition we investigate an alternative interpretation of analogical support, also considered by Bartha, based not on the enhancement of probability but on the creation of possibility.©2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Paraphrasing his article in the Stanford Encyclopedia of Philosophy, SEP [1], Paul Bartha  plausible that Q * holds in the target domain because of certain known (or accepted) similarities with the source domain, despite certain known (or accepted) differences.
In turn he examines a corresponding candidate analogical inference rule, CAIR for short: Suppose S and T are the source and target domains. Suppose P 1 . . . , P n (with n ≥ 1) represents the positive analogy, A 1 , . . . , A r and ¬B 1 , . . . , ¬B s represent the (possibly vacuous) negative analogy, and Q represents the hypothetical analogy. In the absence of reasons for thinking otherwise, infer that Q * holds in the target domain with degree of support p > 0, where p is an increasing function of n and a decreasing function of r and s.
The primary intention of this paper is to formulate, as principles, mathematically more precise versions of CAIR within the framework of (Unary) Pure Inductive Logic, PIL for short, where 'degree of support' is identified with (subjective) probability, and to determine which probability functions satisfy these versions in the presence of certain other, widely accepted, symmetry requirements. We should point out that this differs somewhat from the 'Applied Inductive Logic' framework in which Bartha considers and dismisses CAIR, even as a non-starter. Following that we shall suggest and investigate within this formal framework an alternative interpretation of analogical support based on the creation of possibility, also considered by Bartha as what he terms 'the modal conception', see [1,Section 2.3].
This (Unary) Pure Inductive Logic framework is explained in, for example, [19,20]. In short we work in a first order predicate language L q with finitely many predicate, i.e. unary relation, symbols R 1 , R 2 , . . . , R q , countably many constants a 1 , a 2 , a 3 , . . . , which are intended to name all the elements of the universe, and no function symbols nor equality. Let SL q and QFSL q denote, respectively, the sentences and quantifier free sentences of this language.
A probability function on L q is a function w : SL q → [0, 1] such that for all θ, φ, ∃x ψ(x) ∈ SL q : (P1) If |= θ then w(θ) = 1 (P2) If |= ¬(θ ∧ φ) then w(θ ∨ φ) = w(θ) + w(φ) (P3) w(∃x ψ(x)) = lim m→∞ w( m i=1 ψ(a i )), this last condition reflecting the intention that the constants a i exhaust the universe. The primary goal of PIL, as we would present it, is to investigate which such probability functions are logical or rational in the sense of corresponding to the subjective probabilities assigned by a rational agent in the absence of any further knowledge or intended interpretation of the constant and predicate symbols.
Whilst we have no precise definition of what we mean by 'logical' or 'rational' here, indeed such a clarification is essentially equivalent to the above goal, we do at least seem to have some intuitions about what constitutes being rational, or perhaps more usually what constitutes being irrational. For example in the circumstances of such zero knowledge it would seem to be irrational to treat any one constant differently from any other. Precisely then a rational probability function w should satisfy:
Similarly there would seem to be no rational reason to give two predicates different properties, nor even between a predicate and its negation. This leads to imposing two further requirements on a rational probability function to satisfy:

The Principle of Predicate Exchangeability, Px
If R i , R j are predicate symbols of L q then for θ ∈ SL, w(θ) = w(θ ) where θ is the result of transposing R i , R j throughout θ.

The Strong Negation Principle, SN
For θ ∈ SL q , w(θ) = w(θ ) where θ is the result of replacing each occurrence of the predicate symbol R in θ by ¬R.
In what follows we shall restrict our attention to probability functions w satisfying these three principles Ex, Px, SN.
There is a further principle which we will need subsequently and whose rationality may be argued for as follows. Suppose that on the basis of some considerations we have made the probability function w L q our rational choice of probability function on L q and the probability function w L r our rational choice of probability function on L r where r ≥ q. Then since SL q ⊆ SL r it would seem perverse if w L r did not agree with w L q on SL q , since it would mean that what we considered a rationally justified value for the probability of θ ∈ SL q depended on the presence or absence of relation symbols in the language which were not even mentioned in θ.
Given our earlier argument for the rationality of Px + SN this leads to the following 'meta-rationality' principle which it is desirable for a probability function w L q on L q to satisfy, though unlike Ex, Px and SN we will not actually assume it as the default:

Unary Language Invariance with Strong Negation, 3 ULi + SN
A probability function w on L q satisfies Unary Language Invariance with SN if there is a family of probability functions w L r , one on each language L r where r ∈ N + = {1, 2, 3, . . .}, satisfying Ex + Px + SN and such that w = w L q and whenever r ≤ s then w L s agrees with w L r on SL r .
Whilst the rationality of observing symmetries and language invariance as expressed by the above principles seems to us hard to question, the rationality of arguments by analogy appears much less forceful. Nevertheless in the real world we often are somewhat influenced by analogies, for clear accounts of such within mathematics see [22,23], and there have been several attempts, starting with Rudolf Carnap, to capture facets of analogy as a rational or logical principle within the framework of Inductive Logic, see for example [3], Carnap and Stegmüller [4] and later Festa [5], Hesse [9,10], Maher [15,16], di Maio [17], Romeijn [24], Skyrms [25].
In each of the next four sections we will add to this list of 'Principles of Analogy' by proposing interpretations, or variants, of CAIR within the framework of PIL and, in Theorems 1, 2, 3, 4, investigating the probability functions that satisfy them (in the presence of our standing assumptions Ex, Px, SN). Subsequently we will broaden the remit by proposing a principle of analogy (Dolly's Principle) based on the idea of analogy as a source of possibility (the modal conception as Bartha terms it) rather than increase of probability. Since mathematical results in these sections contain on occasions technicalities that some readers may wish to simply accept we shall now spend a little time introducing the terms that appear in their statements in order that they become directly accessible. A general overview of these results will then be given in the final section.
An atom of L q is a formula of the form ULi alone, see [20], only requires each of the probability functions w L r to satisfy Ex + Px.
So there are 2 q atoms for L q , which we denote α 1 (x), α 2 (x), . . . , α 2 q (x), corresponding to the 2 q different choices for the . Notice that because we only have unary relation symbols in the language, knowing which atom a constant satisfies tells us all there is to know about that constant.
Similarly for (distinct) constants 4 b 1 , b 2 , . . . , b n the state description that holds for them, that is the sentence tells us all there is to know about these constants. By a theorem of Gaifman, see [6] or [20,Theorem 7.1], a probability function is uniquely determined by its values on state descriptions.
Let D 2 q be the set of vectors For c ∈ D 2 q the probability function w c on L q is defined by In other words w c treats the α h i (b i ) as stochastically independent with individual probabilities c h i , i = 1, . . . , n. This probability function satisfies Ex but not Px nor SN except under special circumstances. For future reference we recall (see [20,Chapter 8]) that the w c are characterized by satisfying the
The functions w c are fundamental since any probability function satisfying Ex can be expressed from them as an integral over D 2 q : De Finetti's Representation Theorem. Let w be a probability function on SL q satisfying Ex. Then there is a normalized and countably additive measure μ on the Borel subsets of D 2 q such that where for j = 1, 2, . . . , 2 q , m j = |{i | h i = j}|, the number of times that j occurs amongst h 1 , h 2 , . . . , h n .
De Finetti's Theorem finds numerous important, and slick, applications in PIL; for example Humburg's proof (see [14], or [20,Chapter 11]) of a result of Gaifman [7], that we shall need later, that Ex implies
A second important family of probability functions on L q are the c These c L q λ satisfy Ex + Px + SN and even ULi with SN, the corresponding language invariant family being obtained by fixing the λ and letting the q range over N + .
To simplify the notation in what follows we shall omit the superscript L q in c L q λ when q is clear from the context.
Notice that any permutation of predicates, or transposition of R j , ¬R j , generates a permutation of the atoms α i . We shall say that a permutation σ of atoms is licensed by Px + SN if it can be formed as a composition of such permutations. Notice that if w satisfies Px + SN then for such a σ, Furthermore, since by Ex the left (and right) hand side is the same for any choice of distinct constants we shall, to simplify the notation, sometimes omit the instantiating constants and denote it simply as where m j is the number of times that j appears in h 1 , h 2 , . . . , h n . For future reference note that Atom Exchangeability is the assertion that (4) holds for any permutation σ of the set of atoms, not just those licensed by Px + SN.

The general analogy principle
The first question we might feel obliged to address vis-a-vis CAIR is what exactly the forms of the Q, P, A, B are and what exactly is being treated analogously in the relationship between Q and Q * etc.what we shall refer to as the carrier of the analogy. The four versions we shall consider are really centered around possible answers to these questions within the framework of Unary 5 PIL. In our first attempt at a formulation the Q, P are just quantifier free sentences and it is the constants which are the carriers: The General Analogy Principle, GAP For a = a 3 , a 4 , . . . , a k and ψ(a 1 , a), φ(a 1 , a) ∈ QFSL, a)).
In this principle then ψ(a 1 , a) ∧ ψ(a 2 , a) provides 'evidence' that a 1 , a 2 are similar and hence should enhance (or at least not decrease) the probability that a 2 should again be similar to a 1 in satisfying φ(x, a) given that a 1 does.
A few comments are in order here. Firstly we shall identify (5) with a convenience which satisfactorily allows us to dispense with the problem of conditioning on sentences with probability zero. 6 Secondly, in this formulation we have taken as vacuous the negative analogies A 1 , . . . , A r and ¬B 1 , . . . , ¬B s . In particular then the monotonicity element of Bartha's representation has been reduced to a single inequality. 7 Thirdly, notice that by Ex the choice of constants a 1 , a 2 . . . , a k is not relevant since it implies the same principle for any distinct choice of constants. Finally, within this formulation we are restricting φ(a 1 , a), ψ(a 1 , a) to be quantifier free, again for reasons which will shortly become clear.
As we now show GAP fails to satisfactorily capture our (presumably viable) intuitions about analogy.
Theorem 1. Let w be a probability function on the unary language L q satisfying Px + SN. 8 Then Proof. (A): That c 0 (on L 1 ) satisfies GAP will be shown in part (B) below.
which provides the required counter-example. 5 Several of our results apply also to Polyadic Inductive Logic, see [19,20] for further details, but for simplicity we shall limit ourselves here to the purely unary. 6 More generally we shall identify (a/b) ≥ (c/d) with ad ≥ bc. 7 Subsequent results will somewhat vindicate this decision. 8 Recall the standing assumption that all probability functions considered satisfy Ex.
So now suppose that w is not of the form νc ∞ + (1 − ν)c 0 for any 0 ≤ ν ≤ 1. Then the de Finetti prior of w must have a support point c, 1 − c with 0 < c < 1/2. (In other words every open set containing this point has non-zero measure.) Let φ(a 1 ) = R 1 (a 1 ). Then by the Extended Principle of Instantial Relevance, (3), and SN, where as usual [mc] is the integer part of mc and for large m (see for example [20,Chapter 12]). Comparing with (6) we have the required counterexample.
(B): Here we shall show the result even without the a being present. We first need to introduce another probability function, , on L q . For α i an atom of L q let α c i be that atom of L q which disagrees with α i on every R j (x), in other words, for j = 1, 2, . . . , q, Now let e 1 , e 2 , . . . , e 2 q−1 run through all vectors in D 2 q which have zeros at all coordinates except for two, say the ith and jth, with α c i = α j , and in those places the entry is 1/2. Set We now show that in the case of a unary language L with q ≥ 2 predicates and a probability function w on L satisfying Ex, Px and SN and not of the form λ + (1 − λ)c 0 for some 0 < λ ≤ 1 there are φ(a 1 ), ψ(a 1 ) for which (5) fails. 9 To this end let G ⊂ {1, 2, . . . , 2 q }, |G| = 2 q−1 and let x = w(α i α i ) and y ij = w(α i α j ). Notice that x is independent of i since for any atoms α i , α j there is a permutation σ of atoms licensed by SN such that σ(α i ) = α j .
Since |G| = 2 q−1 and i∈G i =j∈G Let Notice that Then with the above abbreviations, and the inequality (5) becomes Multiplying out gives Canceling out terms now gives By (7) this right hand side is at least 0. We now show that for a suitable initial choice of G it must be strictly positive.
Since w is not of the form λ . Then there is a permutation σ licensed by SN such that σ(α i ) = α k , σ(α j ) differs from α k on r predicates 10 and by . Then because 1 ≤ r < q we can find a permutation τ licensed by SN + Px such that α k = τ (α k ) and τ σ(α j ) |= ¬R 1 (x), and of course In this case then we can take G to be the set of those atoms α i such that α i |= R 1 (x) and obtain the required contradiction to (9). So GAP does not hold.
Turning now to the case where and the inequality (5) becomes which fails since λ > 0, and gives the required counter-example.
To complete case (B) we now show that GAP holds for c 0 on L q for any q. Indeed we shall show that it holds even in the case where φ, ψ are sentences rather than just quantifier free. To this end notice (or see for example [8] where a similar result is derived) that for the unary language L q , any sentence mentioning constants a 1 , . . . , a n is logically equivalent to a sentence in the form where the i,m ∈ {0, 1} and as usual ψ is ψ if = 1 and ¬ψ if = 0. 10 The permutations licensed by Px + SN are precisely those that preserve Hamming distance, see [11,Theorem 2].
Noticing that we can see that for each of the disjuncts in (10) either It is perhaps worth remarking here that in the case of q = 1 the extra constants a 3 , a 4 , . . . , a m employed in forming the counter-example to GAP cannot be dispensed with. In fact when q = 1 and the extra constants are absent GAP trivially holds for any w (and even continues to hold when φ, ψ may contain quantifiers provided w not of the form λc 0 + (1 − λ)w , with 0 < λ < 1 and w = c 0 , see [18]).
As with GAP it is the constants which are the carriers of the analogy and presumably, judging from their similarity, EAP's justification is based on the same intuitions, so one might have expected that they would again have the same solutions, or more aptly lack of solutions. This is indeed almost the case provided we allow the additional constants a in EAP. However dropping these constants gives a somewhat different, but still very restricted, set of solutions, in contrast to any supposedly similar intuitions.
Since for θ, ξ ∈ SL q , w(θ ∧ ξ) = w(ξ) whenever w(θ) = 1, it follows that (11) holds with equality. 2 Note that the solution c L q 0 to (11) actually satisfies the stronger condition ULi + SN. It is clear that in the proof of Theorem 2 the additional constants a are playing an important role, and indeed that is the case. In particular, see [13], for q ≥ 2 the probability functions c L q λ satisfy the weaker version of EAP without the additional constants a, when, in fact exactly when, 11 equivalently when Consequently the only λ for which this principle can hold for all q is λ = 0. Condition (14) is interesting because, to our knowledge, there are currently no other 'rational principles' considered in Inductive Logic which differentiate between the λ in the open range (0, ∞). In this case the often preferred value for λ of 2 q (which corresponds to the uniform de Finetti prior for c L q λ ) lies above the bound given in (14) (for q ≥ 2) so that with that somewhat popular choice this weaker version of EAP fails.

The constant analogy principle
The previous two attempts to capture even a part 13 of Bartha's representation of analogy can at best be said to tell us what is not possible in the presence of Px + SN. Perhaps there may be more probability functions satisfying these analogy principles if we dropped Px and/or SN, but given the obvious strong attraction of Px and SN on grounds of symmetry compared with the apparently hazy intuitions which begat GAP and EAP this would hardly seem a worthwhile investigation in the context.
An alternative, which also seems closer to Bartha's intention, is to restrict the P, Q, A, B etc. to having the particularly simple form of just R(a), i.e. a unary relation applied to a constant. This yields two further principles depending on whether we take the carrier of the analogy to be the constants or the relations. 14 In this section we take the analogy to be between the properties of two constants, the known positive analogies being instances where a predicate agreed on these two constants and a negative analogy when it disagreed. Precisely, for we define the 'distance' between φ and ψ to be is a decreasing (not necessarily strictly) function of φ − ψ . 12 In more detail suppose that L is a polyadic language and w = c 0 is a probability function on L satisfying Px + SN. Then there must be some relation symbol P of L and a, b such that w(P ( a) ∧ ¬P ( b)) > 0. By considering w(P ( a) ↔ ¬P ( d)) where d are new constants (not occurring in a and b) and using Ex we can see that we must have w(P ( c 1 ) ∧ ¬P ( c 2 )) > 0 where the c 1 , c 2 are disjoint. Now define the probability function v on where c i are disjoint blocks of constants. Since w satisfies SN and Ex so does v. Let φ, ψ ∈ QFSL 1 provide counter-examples required for Theorems 1 and 2 for v and let φ * , ψ * be the result of replacing each R 1 (a i ) in φ, ψ respectively by P ( c i ). Then these provide the counter-examples required for Theorems 1 and 2 for w.
In the other direction to show that c L 0 is a solution notice that if L has q relation symbols P 1 , . . . , P q then for θ ∈ QFSL, c L 0 (θ) = c L q 0 (θ ) where θ is the result of replacing each P i (a j 1 , a j 2 , . . . , a j r ) from L by R i (a j 1 ). 13 Since negative analogies do not figure. 14 It is also possible to consider a formalization in which it applies to sentences as carriers, see the Counterpart Principle of [12], [20,Chapter 22].
Given that in this principle no particular emphasis is being placed on the number of predicates R 1 , R 2 , . . . , R n that we have at our disposal it seems natural to assume not just Px + SN but rather ULi + SN. In that case we have the following somewhat satisfying result.
Theorem 3. Let the probability function w on L q satisfy ULi + SN. Then w satisfies CAP.
Proof. Since w is part of a ULi family w L r for r ∈ N + we may take the union of all these probability functions to produce a probability function defined on sentences of the language L = ∞ r=1 L r and extending w. To avoid introducing any additional notation, and because it will not cause any confusion, we will also use w to denote this probability function.
Let β 1 , β 2 , . . . , β 2 n+1 denote the atoms of L n+1 , say where φ, ψ are as in the definition of CAP. Then By using the permutations of atoms licensed by Px + SN we may suppose here that where n − m = φ − ψ . From this it is clear that (17) is purely a function of φ − ψ (and n).
It only remains to show that it is a decreasing function of φ − ψ . Assume for the moment that the are all non-zero. Then by dividing top and bottom by its numerator we see that (17) will be a decreasing function of φ − ψ just if is an increasing function of φ − ψ .
To this end define a function u on the state descriptions of the language L 1 (whose atoms are just Notice that by Px the particular choice of distinct predicate symbols R 1 , . . . , R t here is irrelevant. Using the fact that w satisfies ULi + SN it can be checked that u is, or at least extends to, a probability function which satisfies Ex. With this definition the condition (19) becomes the requirement that is an increasing function of m. This will follow once we show that .
Since u satisfies Ex we know by de Finetti's Theorem that for some countably additive measure μ on the Borel subsets of [0, 1] that Using this and writing h( But multiplying out this reduces to which holds by Hölder's Inequality. Returning now to our earlier assumption that the w(β g (a 1 ) ∧ β h (a 2 )) are all non-zero, if this fails then defining u as above we should have for some j, k. However by considering again (22)  Not all probability functions satisfying just Px + SN satisfy CAP. For example when n + 1 = 3 and we order the β j as (in the obvious shorthand) b = 1/10, a = 1/5 and w is 1 w a,b,b,a,b,b,b,b + w a,b,b,b,b,a,b,b + w a,b,b,b,b,b,a,b   + w b,a,a,b,b,b,b,b + w b,a,b,b,a,b,b,b + w b,a,b,b,b,b,b,a   + w b,b,a,b,a,b,b,b + w b,b,a,b,b,b,b,a + w b,b,b,b,a,b,b,a   + w b,b,b,a,b,b,a,b + w b,b,b,a,b,a,b,b + w b,b,b,b,b,a,a,b then it can be checked that contrary to the requirement on (19), despite w satisfying Px + SN (but not ULi + SN of course). From Theorem 3 it follows that all the c L q λ satisfy CAP. Indeed it is quite straightforward to show, appealing to de Finetti's Theorem again, that any probability function satisfying Atom Exchangeability will satisfy CAP, whether or not it satisfies ULi + SN.
Satisfying as Theorem 3 may be however it does raise a slightly uncomfortable issue. Firstly the intuition behind it seems no different from that which initially prompted us to propose GAP. Given the failure of that principle can we really claim that Theorem 3 in some way justifies our intuition? Is it not more reasonable to conclude that this theorem is grounded not on 'analogy' but on some different basis? And indeed a study at the proof shows that the key step is an application of a provable version of the Strong Principle of Instantial Relevance, see [20,Chapter 21] or [21], in which case we could be said to be simply appealing to our intuitions about relevance. 15 Putting it another way then it could be said that the 'analogy' within CAP is really just reducible to 'relevance', raising for a moment the question whether, within the context of PIL, analogy is anything more than a special case of relevance.

The predicate analogy principle
In contrast to the three previous sections we now consider an interpretation of Bartha's representation in which we take the predicates of the language to be the carriers of the analogy. That is we take the analogy to be between the properties of two predicates, the known positive analogies being instances where these predicates agreed on a constant and a negative analogy when they disagreed. Precisely, for define the 'distance' between φ and ψ to be and set: is a decreasing function of φ − ψ (for fixed n).
Notice that since only two predicate symbols appear in (23) it is natural to first study this principle when q = 2. 16 Setting 15 Or ultimately symmetry since Ex implies SPIR for L 1 . 16 The characterization for q > 2 (with Px+SN) just requires the restriction to SL 2 to have the form we shall shortly be describing. ψ( a)) ,

this condition (23) is equivalent to
being an increasing function of φ − ψ , a fact that we shall use repeatedly in what follows. One family of probability functions on L 2 satisfying Px + SN and PAP are the where 0 ≤ b ≤ 1/2. Clearly the u (b) satisfy Px + SN. To see that they also satisfy PAP notice that for φ, ψ as in the statement of PAP and m = φ − ψ , and this right hand side, when defined, is increasing in m (for fixed n ≥ m).
A second family of probability functions on L 2 satisfying PAP + Px + SN, in this case rather trivially, are those of the form where 0 ≤ d ≤ 1. Trivially because any φ( a) ∧ ψ( a) containing atoms both from {α 1 , α 4 } and from {α 2 , α 3 } gets probability zero.
In fact the probability functions which satisfy PAP + Px + SN are precisely those whose restriction to SL 2 is a convex mixture of probability functions from these two families. Precisely: and either μ(A) = 1 where Proof. Suppose that w satisfies PAP + Px + SN. Let b = b 1 , b 2 , b 3 , b 4 be a support point of the de Finetti prior μ of w. We shall use [20,Lemma 12.1] which tells us that where as usual [rb 1 ] is the integer part of rb 1 etc. 17 Recall the convention introduced at footnote 6 concerning zero denominators. Let where the φ i , ψ i only mention the predicate R 1 , R 2 respectively. Note that First assume that b 1 = 0. Using (24) we can make by picking r suitably large. Similarly, since by Px + SN, b 1 , b 3 , b 2 , b 4 must also be a support point of μ, we can make arbitrarily close to b 3 /b 1 . But since PAP gives that these two values b 2 /b 1 , b 3 /b 1 must be equal. It follows that b 2 = b 3 . If b 2 = 0 then we already have that b 1 , b 2 , b 3 , b 4 ∈ B. If b 2 = 0 a similar argument using the support points From this it follows that μ(A ∪ B) = 1 for A, B as above.
We claim that it must be the case that μ(A) = 1 or μ(B) = 1. For otherwise we can pick support points of μ, b, 1/2 − b, 1/2 − b, b ∈ A − B and, without loss of generality, d, 0, 0, 1 − d ∈ B − A (with 0 < b < 1/2 and d = 0, 1/2). Then by PAP we must have equality between and for k, j ≤ n. Since for any x 1 , x 2 , x 3 , x 4 ∈ A we have x 1 = x 4 it must be the case that and by the existence of b, 1/2 − b, 1/2 − b, b these are non-zero. Furthermore, for n > 0, so it must be the case that Let Using which we have already met as (20) and shown to be increasing in m under the assumption of Ex. Finally in the case when μ(B) = 1 it is straightforward to check that PAP holds, trivially in fact. 2 Given Ax there are only two probability functions on L 2 satisfying this, c 0 and c ∞ . For suppose w satisfies PAP and Ax and as usual let Then by PAP But that gives that either w(α 2 (a 1 ) ∧ α 1 (a 2 )) = 0 or w(α 2 (a 1 ) ∧ α 2 (a 2 )) = w(α 2 (a 1 ) ∧ α 3 (a 2 )) and with Ax the only possibilities here are c 0 and c ∞ .

Analogy as possibility
In the previous four sections we have looked at formulations of analogical support as enhancement of probability. However as Bartha points out in [1] analogy can act to simply engender plausibility, or as we shall call it possibility. To give an example, the fact that the commonest bird in the United States in 1814 (the passenger pigeon) was extinct by 1914 may be used as an argument that 'by analogy', the monarcharguably the currently commonest butterfly in the United States, may equally regrettably be extinct a century from now. For here it seems that the argument is aimed not so much at raising the probability as creating the possibility which we will take to mean producing a non-zero probability.
One explanation why we might see this as in any sense a worthwhile argument to make in a discussion on the future of the monarch is that viewed from a certain angle monarchs and passenger pigeons may be thought of as the same thing, at least as regards the features that are actually relevant here. Thus the realization that it had happened once argues that it could happen again.
The example might seem to correspond to the Extended Principle of Instantial Relevance, (3). Here however we shall propose an alternative formulation which is about creating possibility, and also more obviously captures this idea of 'being thought of as the same thing as regards the relevant features'.
Notice that by repeated application (and Ex) it is enough that this principle holds for σ(2) = 1, σ(i) = i for i = 2.
Then K |= θ(σ(a 1 ) K , σ(a 2 ) K , . . . , σ(a m ) K ), where σ(a 1 ) K is the interpretation of the constant σ(a 1 ) in K etc. Clearly also where J is a structure for L extending K in which a J i = σ(a i ) K for i ≤ m. In other words (a 1 , . . . , a m ) and by logical equivalence, (a 1 , . . . , a m ).
Theorem 6. For the unary language L q , Ex implies DP.
Proof. Let θ(a 1 , . . . , a m ) ∈ SL q . Then as at (10) θ is logically equivalent to a disjunction of sentences of the form If w(θ(σ(a 1 ), σ(a 2 ), . . . , σ(a m )) > 0 then by Lemma 5 this must also hold for the image under σ of this representation of θ so for at least one such disjunct we must have The only way this is possible is if h i = h r whenever σ(a i ) = σ(a r ), otherwise the sentence in (26) would be inconsistent so have probability 0. So dropping repeated conjuncts (26) can equivalently be written as where g r = h i for i such that σ(a i ) = a r . From (27), de Finetti's Theorem and the Constant Irrelevance Principle give D 2 q a r ∈Rg(σ) So a r ∈Rg(σ) (∃x α j (x)) j ⎞ ⎠ must be non-zero for a non-null (with respect to μ) set of x ∈ D 2 q and as a result the same must hold for a r ∈Rg(σ) where s r is the number of a i mapped by σ to a r . Hence But the left hand side of (29) is just what we get if we apply de Finetti's Theorem to this conjunct in the representation of θ(a 1 , . . . , a m ), so the required result follows. 2 For unary languages then DP adds nothing new, it already follows from the standing assumption Ex. However this fact does not carry over to polyadic languages. For example if L is the language with a single binary relation symbol R and w is the obvious version of Carnap's m † (equivalently c ∞ ) on this language then w(∀x (R(a 1 , x) ↔ R(a 2 , x))) = 0 whilst w(∀x (R(a 1 , x) ↔ R(a 1 , x))) = 1.
Nevertheless there is still a wide class of polyadic probability functions which do satisfy DP, as we shall show in a forthcoming paper.

Conclusion
In short we have shown that in the presence of Ex+Px+SN the principles GAP, EAP and PAP place very severe demands on a probability function, and must now be considered dead ends, DP makes no demands at all whilst CAP is actually satisfied by a naturally attractive class of such probability functions, namely those that satisfy the somewhat stronger background condition of ULi + SN.
As far as Bartha's candidate representation is concerned then we can say that CAP seems to provide a viable formulation of it in the context of PIL whereas GAP, EAP and PAP do not. Still this raises the uncomfortable question of why they produce such different conclusions when they all appear to be based on similar intuitions about analogical support.
Given that CAP follows from ULi + SN it is an interesting question to ask from where in this background assumption the 'analogical support' originates. Inspecting the proof of Theorem 3 we see that the key inequality is (21) which derives from simply the assumption Ex via de Finetti's Theorem. This is exactly similar to the derivation from Ex of the Extended Principle of Instantial Relevance from which we might reasonably question whether 'analogy as enhancement of probability' is really anything more than 'relevance', an already quite well studied notion (see [20]).
Throughout this paper we have taken Ex + Px + SN, or ULi + SN, as our background assumptions. However the rather widespread acceptance of Johnson's Sufficientness Postulate (JSP) within Inductive Logic might on the contrary be used as an argument for strengthening these to Ax, or ULi + Ax, since these are consequences of JSP. Doing so would still give CAP (and DP) as a consequence but would, for q ≥ 2, restrict GAP, EAP and PAP down to the single probability function c L q 0 of Carnap's Continuum. Combined with previous results in [11,12] a pattern seems to be emerging with so called 'Analogy Principles', namely that they either hold, almost by chance, for some small family of otherwise (apparently) undistinguished probability functions, or they are actually consequences of some already established and acceptable principles such as ULi + SN or Ax. In other words, to our knowledge we do not currently have any analogy principles which genuinely introduce new concepts without also reducing the field of 'rational probability functions' down to almost a triviality (and leading to the conclusion that such a version of 'reasoning by analogy' is both very powerful and very dangerous). Of course there are numerous further formulations and variations that one might base on CAIR and perhaps some of those might yet endorse it in the context of PIL. On the basis of what we have here however the picture of analogical support as presented in CAIR seems not to have materialized, an outcome in parallel with Bartha's own criticisms of CAIR within what we would term Applied Inductive Logic.